A new look at the dynamics of scientific collaborations

The structure of scientific collaborations has been the subject of intense study both for its importance for innovation and scientific progress, and as a model system for the coordination and formation of social groups thanks to the availability of data on authorship.

In recent years, complex network approaches to this problem have yielded important insights and shaped our understanding of scientific communities. In our recently published article in EPJ Data Science, we propose to integrate the framework provided by network tools with that of topological data analysis, which has at its center the notion of multi-agent interactions.

This topological approach allows us to go beyond k-clique descriptions, as it can easily distinguish between sums of pairwise interactions and genuine higher-order interactions. Furthermore, without relying on local properties or global distributions, it allows us to uncover mesoscopic properties of the dataset through new tools such as homology, which encodes a notion of multidimensional shape.
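To make the distinction between sums of pairwise interactions and genuine higher-order interactions concrete, here is a minimal Python sketch. It is an illustration only, not the code used in the study: the function names (`coauthor_edges`, `group_interactions`) and the toy author sets are hypothetical. It shows two sets of papers that produce the same co-authorship graph but differ once groups of three are recorded.

```python
from itertools import combinations


def coauthor_edges(papers):
    """Pairwise projection: every pair of authors sharing at least one paper."""
    edges = set()
    for authors in papers:
        edges.update(combinations(sorted(set(authors)), 2))
    return edges


def group_interactions(papers, size=3):
    """Author groups of the given size that appear together on at least one paper."""
    groups = set()
    for authors in papers:
        groups.update(combinations(sorted(set(authors)), size))
    return groups


# Toy data (hypothetical): three two-author papers versus one three-author paper.
pairwise_only = [{"A", "B"}, {"B", "C"}, {"A", "C"}]
joint_paper = [{"A", "B", "C"}]

# Both datasets give the same triangle in the co-authorship graph...
same_graph = coauthor_edges(pairwise_only) == coauthor_edges(joint_paper)

# ...but only the second one contains a genuine three-author interaction.
triples_pairwise = group_interactions(pairwise_only)
triples_joint = group_interactions(joint_paper)
```

In a purely graph-based description both cases appear as a 3-clique; keeping track of the groups themselves is what separates an empty triangle from a filled one.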

We examine the differences between scientific fields, focusing on the properties of arXiv categories in terms of higher-order elements, especially the set of different collaborations to which the authors belong.

We classify each category, which serves as a proxy for the corresponding scientific community, into one of three groups based on the functional form of its collaboration size distribution. Our analyses thus highlight different organizational structures, probably due to the different topics (for example, group s1 consists mainly of theoretical work, while s3 consists mainly of experimental work).

Furthermore, our results reveal that while categories are characterized by organizational and cultural differences, individual ability to participate in collaborations is similar across categories.

These results suggest that authors in experimental categories tend to collaborate in larger, not completely overlapping groups. Given the dynamics of large experiments, this is reasonable: they gain new authors and lose others over time, leading to larger and slightly different collaborative groups. Conversely, in the more theoretical communities of the same disciplines, collaborative groups tend to have slower membership turnover and smaller, more frequently repeated collaborations within larger groups.

The topological framework allows us to introduce a higher-order version of the concept of triadic closure, quantifying the probability that a trio of authors who collaborated in pairs also collaborated as a group. We find very strong closure in all categories of our dataset, indicating the presence of higher-order clustering and consistent with what would be expected from the existing local growth models reported for co-authorship and social networks.
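The closure measure described above can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the estimator used in the paper: the function name `higher_order_closure` is hypothetical, and a trio is counted as "closed" if at least one paper lists all three authors (possibly alongside others), which is one plausible reading of "collaborated as a group".

```python
from itertools import combinations


def higher_order_closure(papers):
    """Fraction of co-authorship triangles whose three authors share a paper.

    Assumption: a triangle is 'closed' if some paper includes all three
    authors, possibly together with additional co-authors.
    Returns None when the co-authorship graph contains no triangle.
    """
    pairs, triples = set(), set()
    for authors in papers:
        members = sorted(set(authors))
        pairs.update(combinations(members, 2))
        triples.update(combinations(members, 3))

    # Adjacency sets of the pairwise co-authorship graph.
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Every trio of authors that is connected pairwise.
    triangles = set()
    for a, b in pairs:
        for c in neighbours[a] & neighbours[b]:
            triangles.add(tuple(sorted((a, b, c))))

    if not triangles:
        return None
    closed = sum(1 for t in triangles if t in triples)
    return closed / len(triangles)


# Toy data (hypothetical): A, B, C only ever collaborate in pairs,
# while C, D, E share a three-author paper.
toy_papers = [{"A", "B"}, {"B", "C"}, {"A", "C"}, {"C", "D", "E"}]
closure = higher_order_closure(toy_papers)
```

On the toy data one of the two triangles is filled, so the ratio is 0.5. Enumerating all triples of a paper grows cubically with the number of authors, so papers with very long author lists would need a more careful treatment than this sketch provides.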

Finally, we focus on the connection patterns between network communities and find them well correlated with the hole structure of an associated topological object, highlighting an unexpected separation between the local and long-range collaboration scales. Overall, our results suggest that several mechanisms may be at play in structuring the higher-order connectivity of scientific collaborations.

Read the full study here.