Big Data in Studies of Science:
New Research Field

Guba K.S.

Cand. Sci. (Sociol.), Head of the Center for Institutional Analysis of Science and Education, European University at Saint Petersburg, St. Petersburg, Russia

DOI: 10.31857/S013216250013878-8
This article was prepared with support from the Russian Science Foundation, project No. 21-18-00519.

For citation:

Guba K.S. Big Data in Studies of Science: New Research Field. Sotsiologicheskie issledovaniya [Sociological Studies]. 2021. No 6. P. 24-33


The article discusses the unprecedented opportunities that big data opens up for studies of science. A dramatic change in how quickly and in what volumes data can be extracted from open sources has driven the development of the science of science, a field that studies science using large-scale metadata. The scale of the data is especially valuable because science is highly stratified and segmented. In turn, network analysis and computational text analysis have influenced how research questions are posed. These new tools have far-reaching implications for the science of science: researchers can take a flexible approach and need not rely on pre-defined categories, as was common in earlier studies in the sociology of science. New opportunities in data collection and analysis have attracted researchers from diverse scientific fields. As a result, new conceptual models are being applied that are no longer limited to sociological conceptualizations.

sociology of science; big data; scientometrics; science


