13 Equity, diversity, and inclusion in research

Guest author: Poppy Riddle

13.1 Introduction

Latonia Harris provided these definitions for a 2021 workshop on diversity, equity, inclusion, and anti-racism in STEMM organizations (Scherer 2021):

Diversity: the numerical representation of groups of individuals based on their primary and secondary characteristics and identities.
Equity: the treatment of individuals in terms of access, opportunity, and advancement.
Inclusion: the ability to meaningfully participate and contribute, both for the benefit of the individual and the organization.
Racism: the devaluation and the denial of rights, dignity, and value of individuals due to their race or geographical origin.

In this chapter, we examine the ways in which diversity, equity, and inclusion are lacking, how racism is or may be present in research, and how we can address those issues. We first revisit the social stratification of science by looking at sociological theories as a means of understanding how citations are valued outside of the monetary compensation of labour. We then examine citational justice and epistemic justice as frameworks to understand how we create imbalances through our citational behaviours. Next, we examine some of the biases that have been investigated in bibliometrics and suggest what other biases may be present. We also look at a few data sources and their limitations and advantages, as an example of the critical perspective needed to understand the complexity of the citation reward system.

13.2 The social stratification of science (revisited)

Merton (1968) wrote of citations as the scientific community’s recognition of knowledge and its source, whether that knowledge has been accepted or rejected by the community (a citation is agnostic as to whether a work is cited to confirm or to reject its results). In his psychosociological analysis of science as a social institution, written from a sociological perspective, Merton (1968) finds correlations in the self-assurance of highly acknowledged researchers, a self-assurance that is at once inherent and socially constructed and supported, so that the pursuit of high-risk problems yields disproportionate recognition when the results are communicated. This process of social selection leads to a “concentration of science resources and talent” (Merton 1968). Merton’s original 1968 article pointed out two aspects of the Matthew effect: one in which a person receives over-recognition for their contributions, and another in which a person is credited with work that they did not do (Merton 1968). However, the term is most commonly used for the first condition, to describe the accumulation effect of ‘the rich get richer’, in which those at the centre, who have already been recognized, are attributed more than those at the margins. To put this another way, the prestige of an author can affect citation behaviours and subsequent patterns.

Bourdieu (1975) recognized this accumulation in the form of citation amounts as social capital, which draws those without towards those that have it. Similarly, those that have it are often awarded funding and resources which may support those around them, not only gaining more social capital for themselves in the process, but also benefiting those contributing to it. This dynamic process results in communities with centralized figures that are highly recognized and rewarded with other agents surrounding them, not only trying their best to be a central figure but also preventing their movement to the margins. The accumulation, then, is a dynamic driving force in a society, regardless of whether the accumulation is in citations, resources, attention, or otherwise that contributes to their social capital.

Citations differ from other forms of capital, however, in that while aggregate citation rates may follow a normally distributed curve over the course of an author’s career, the number of citations only continues to grow as a symbolic reward (Cole and Cole 1973). Citations can be thought of as a currency that can only accumulate, and the gaps between those who have a lot of it and those who do not keep getting larger (Kwon 2022).
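The accumulation dynamic described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that do not come from the sources cited here: a fixed pool of hypothetical authors, citations handed out one at a time, and a probability of being cited that is proportional to the citations an author already holds. The function names and parameter values are illustrative.

```python
import random


def simulate_citations(n_authors=100, n_citations=5000, preferential=True, seed=42):
    """Distribute citations one at a time among authors.

    With preferential=True, the chance of receiving the next citation is
    proportional to the citations an author already holds (cumulative
    advantage). With preferential=False, every author is equally likely
    to be cited.
    """
    rng = random.Random(seed)
    counts = [1] * n_authors  # assumption: every author starts with one citation
    authors = range(n_authors)
    for _ in range(n_citations):
        if preferential:
            cited = rng.choices(authors, weights=counts, k=1)[0]
        else:
            cited = rng.randrange(n_authors)
        counts[cited] += 1  # citations only ever accumulate
    return counts


def top_share(counts, fraction=0.10):
    """Share of all citations held by the most-cited `fraction` of authors."""
    k = max(1, int(len(counts) * fraction))
    top = sorted(counts, reverse=True)[:k]
    return sum(top) / sum(counts)


rich_get_richer = simulate_citations(preferential=True)
equal_chances = simulate_citations(preferential=False)
print(f"Top 10% share with cumulative advantage: {top_share(rich_get_richer):.2f}")
print(f"Top 10% share with equal chances: {top_share(equal_chances):.2f}")
```

Comparing the two printed shares shows how the same number of citations ends up far more concentrated when recognition feeds further recognition. The model ignores everything else that shapes citations (field, career length, quality, bias), so it illustrates the mechanism only and does not describe real citation data.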

Perhaps less central in the scholarship of Merton and Bourdieu is the recognition of the behaviours that result in oppression, marginalization, or erasure in order to maintain control of the centre. Margaret Rossiter names the “Matilda effect” (Rossiter 1993) after the suffragist and feminist Matilda Joslyn Gage, who criticized the phenomenon in which “women scientists […] have been ignored, denied credit, or otherwise dropped from sight” (Rossiter 1993). Rossiter seizes upon Merton’s original definition of the Matthew effect to explore the phenomenon of misattribution through purposeful or innocent ignorance of the contributions of women. She also points out the beautiful irony that the book of Matthew was not written until “two or three generations after his death”, further reinforcing her point about misattribution. Rossiter’s key contribution is her criticism of Merton’s functionalist approach to the effect: its discovery was followed by no emancipatory call, but instead offered a map for how new scientists could capitalize on it. The Matilda effect represents not just disproportionate attributions benefiting those who have already accumulated recognition but also the purposeful obfuscation or erasure of women in science whose accomplishments were attributed to men.

This obfuscation and erasure of information are explained as epistemic injustice by Fricker (2007) as both testimonial and hermeneutical injustices in the information world. Testimonial injustice is a form of social power in which identity is used as an oppressive instrument of power. It occurs when one receives or is denied credibility by another, such as when a listener dismisses what a speaker is saying. Hermeneutical injustice is another form of social power and occurs when the speaker is denied “interpretive resources” to understand the unfair disadvantages that they are experiencing. For example, when someone is denied education on domestic violence, they may lack the ability to interpret their situation as being abused.

Both testimonial and hermeneutical injustices exist within LIS (Patin et al. 2020), and without a critical examination of our own contributions, we will continue to unknowingly support systems that contribute to the eradication or devaluation of information. As librarians, we have been part of a history of “privileging certain knowledge systems and ways of knowing over others” (Patin et al. 2020). While academic institutions work with a very small amount of the knowledge that humanity produces, it is a critical part of the scientific contribution to society. As such, moving beyond individual awareness to collective action against epistemic injustices in our information systems is important for LIS to address all injustices, even the accidental omissions that have occurred historically. Within the context of scholarly communication, citations are one of the main ways we can see evidence of biased behaviours and injustices from the scientific community.

In contrast to the observed phenomena from Merton and Bourdieu, Sara Ahmed reframes, from a feminist lens, citations as reproductive technology or “a way of reproducing the world around certain bodies” (Ahmed 2013). “Who appears [and] who does not appear” (Ahmed 2013) is both a conscious and unconscious choice supported by assumptions and beliefs. Are we inserting ourselves into a viewpoint/perspective, or are we establishing our own perspective or that of our community? With Ahmed’s perspective in mind, perhaps there is an emancipatory approach to citations.

13.3 Measuring bias/disparities in research

From a sociological perspective, which acknowledges and seeks to understand cultural and structural context, biases can manifest in implicit or explicit ways that have long-term ramifications on perceptions of “confidence, capability, trustworthiness” among others (Scherer 2021). On an individual level, biases are learned behaviours and associations that happen quickly, and over time, unconsciously. Biases are also complex with two levels or layers, with the first based upon observable qualities or traits and the second including associations or connections with behaviours that are then compared with those of the observer. Contextual conditions, such as beliefs, can add validation and reinforce biased associations and permit assumptions. The goal of diversity science is to bring awareness of those biases to challenge assumptions so that acknowledgement of the “disparities in resources and opportunities across groups” can be addressed (Scherer 2021).

Studies examining gender bias in scholarly communication use algorithms to assign names to gender categories (typically binary) based on geographic and cultural inferences. NamSor, genderize.io, GenderAPI, and Wiki-Gendersort are the main ones found in bibliometric studies investigating gender bias. The algorithms work by harvesting names from openly available databases, along with other data such as country of origin, family name, and language, which serve as cultural context identifiers. Each name is assigned a gender with a probability calculated by the algorithm. These algorithms have some benefits: they are cheap, effective, and can be applied retrospectively to datasets. But they also have limitations: they rely on name-gender databases that may not reflect self-identification, and gender probabilities based on names and locations are imperfect and will attribute the wrong gender in some cases. That said, their accuracy remains acceptable at the aggregate level. While this approach still presents data along binary categories of gender (not to mention the common conflation of sex and gender as identity), the algorithms are often used to address and dismantle historical and current oppression, so rejecting their use would deprive us of valuable knowledge about the gender biases and disparities that exist at a large scale. Here are several kinds of gender biases or disparities that have been observed in bibliometric studies.

A citation disparity is observed by simply comparing citation indicators between groups (Traag and Waltman 2022). Studies have shown that works by women tend to be cited less than works by men (Larivière et al. 2013) and that women represent only 14% of the highly cited researchers (the group of researchers who publish highly cited publications) in the Web of Science (Meho 2022).
A citation bias is observed when there is a causal relationship between a variable (e.g., gender) and the act of citing a paper (Traag and Waltman 2022). Causation is, however, difficult to demonstrate in bibliometrics because experiments are extremely rare in the field and most studies are correlational. Because correlation does not imply causation, it is very difficult to demonstrate a bias (defined as a causal relationship) using bibliometric methods.
Citation homophily is observed when members of one group tend to cite members of the same group more than researchers from other groups. Ghiasi et al. (2018) found that citation homophily occurs in all fields of science but that it is stronger in the Social Sciences and Humanities.
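The difference between a disparity and homophily can be made concrete with a small sketch. The data below are invented, the group labels "W" and "M" are placeholders, and the homophily baseline (the group's share of all cited papers) is a simplifying assumption of this example. It is not the method used by Ghiasi et al. (2018) or by Traag and Waltman (2022).

```python
from statistics import mean


def citation_disparity(papers):
    """Mean citations per group; `papers` is a list of (group, citations) pairs."""
    by_group = {}
    for group, citations in papers:
        by_group.setdefault(group, []).append(citations)
    return {group: mean(values) for group, values in by_group.items()}


def homophily_ratio(citation_pairs, group):
    """Observed share of same-group citations made by `group`, divided by the
    share of all cited papers that belong to `group` (the share expected if
    citing were independent of group). Values above 1 suggest homophily.

    `citation_pairs` is a list of (citing_group, cited_group) pairs and is
    assumed to contain `group` on both the citing and the cited side.
    """
    cited_by_group = [cited for citing, cited in citation_pairs if citing == group]
    observed = cited_by_group.count(group) / len(cited_by_group)
    all_cited = [cited for _, cited in citation_pairs]
    expected = all_cited.count(group) / len(all_cited)
    return observed / expected


# Invented toy data: (author group, citations received)
papers = [("W", 4), ("W", 6), ("M", 8), ("M", 12)]
# Invented toy data: (group of citing author, group of cited author)
pairs = [
    ("W", "W"), ("W", "W"), ("W", "M"),
    ("M", "M"), ("M", "M"), ("M", "M"), ("M", "W"), ("M", "M"),
]

print(citation_disparity(papers))
print(homophily_ratio(pairs, "W"), homophily_ratio(pairs, "M"))
```

Both measures are descriptive. A gap in mean citations or a ratio above 1 shows a disparity or a pattern of homophily, but neither demonstrates a bias in the causal sense defined above.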

While the examples above refer to citation disparities, biases, and homophily, the same situations or mechanisms can be observed for other indicators such as research outputs, collaboration, funding, awards, hiring, promotions, etc. Understanding how biases and discriminatory practices exist in academia is important for closing the gender gap. Furthermore, disparities, biases, and homophily can be observed for other variables than gender:

Biases based on ethnicity or race impose disadvantages on persons based on their perceived identity. Ethnic biases can relate to race, ethnicity, and nationality. Secondary associations with race, ethnicity, nationality, and their intersections can further create or maintain harmful stereotypes when authors’ works are perceived as lesser than those of another group.
Biases in the perceived value of works from certain countries or regions. Examples seen previously include ignoring or devaluing works that are from other countries or regions, the assumption that issues in X country are not applicable to one’s own situation, or assumptions about research quality or rigour if the author has institutional affiliations outside of the perceived ‘norm’. The Global North produces far more publications and receives more citations than the Global South, which also produces more local and geographically contextualized work than other geographic regions (Mongeon et al. 2022). Reinforcing this geographic citation bias is the way works are evaluated for quality: Global North/Western authors possess the privilege of not citing authors from other regions without any deleterious effect on their perceived quality, whereas non-Global North/Western authors must cite references from the Global North as evidence of their research quality (Chakrabarty 2007). Other studies grouped countries by income level and found that research from low- and middle-income countries tends to be evaluated less favourably than research from high-income countries (Harris et al. 2017).
A devaluation or dismissal of work written in languages other than English exists in citation and peer review (Lee et al. 2013). There are differences in acceptance rates between manuscripts from authors in English-speaking countries and those from non-English-speaking countries, and sometimes language and writing style are given as reasons for rejection when there is no other problem with the manuscript. The Scopus and Web of Science databases have a disproportionate coverage of English articles, affecting fields such as the social sciences and humanities, where there are more books in languages other than English due to their subject matter and regional specificity (Mongeon and Paul-Hus 2015). Compounding this, the US and other English-speaking countries dominate web development, particularly academic web development, contributing to even more bias against non-English sources. As such, all indicators, including those that are web-based, are inherently biased toward English documents, whether they draw on database sources, social media outlets, or search tools (Mas-Bleda and Thelwall 2016).
In their study of peer-review biases, Lee et al. (2013) point to other forms of bias, including affiliation bias (evaluating more favourably work from prestigious institutions), content bias (favouring specific topics or methodologies), confirmation bias (favouring work that supports one’s views), and publication bias (favouring positive results). Double-blind peer review has been found to be an effective way to mitigate these biases. However, manuscripts contain many identifiable characteristics that can provide a reviewer with enough information to correctly identify an author (Baggs et al. 2008), and this may be easier in highly specialized fields such as bibliometrics.

13.4 How do we do better?

Ray et al. (2022) propose citation diversity statements as a reflexive tool to reinforce the commitment to your community of researchers. The following is an example citation diversity statement from Ray et al. (2022):

We are committed to promoting intellectual and social diversity in science and academic scholarship and took this commitment into consideration while researching and writing this article. We actively worked to promote diversity in our reference list while ensuring all the references cited were relevant and appropriate. We have included some references to enhance diversity but have not omitted any references for this purpose. To assess the diversity of our references, we obtained the predicted gender of the first and last author of each reference by using a database that stores the probability of a first name being carried by a woman (gender-api.com). Using this measure and removing self-citations, our references contain 30% woman(first)/woman(last), 11% man/woman, 15% woman/man, and 44% man/man. This method is limited in that a) names, pronouns, and social media profiles used to construct the database may not, in every case, be indicative of gender identity and b) it cannot account for intersex, non-binary, or transgender people. We look forward to future work that could help us to better understand how to support equitable practices in science.
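The percentages reported in a statement like the one above could be computed along the following lines. This is a minimal sketch, not the procedure used by Ray et al. (2022) or by gender-api.com: the name table, its probabilities, the 0.90 threshold, and all author names are invented for illustration, and a real analysis would query a name-gender service instead of the hard-coded dictionary.

```python
from collections import Counter

# Hypothetical table: probability that a first name is carried by a woman.
# The values are invented; real services derive them from large name databases.
NAME_PROB_WOMAN = {"maria": 0.98, "aisha": 0.97, "john": 0.01, "andrea": 0.55}


def predict_gender(full_name, threshold=0.90):
    """Return 'woman', 'man', or 'unknown' based on the first name only."""
    first_name = full_name.split()[0].lower()
    p_woman = NAME_PROB_WOMAN.get(first_name)
    if p_woman is None:
        return "unknown"  # name missing from the table
    if p_woman >= threshold:
        return "woman"
    if 1 - p_woman >= threshold:
        return "man"
    return "unknown"  # too ambiguous to assign


def reference_gender_shares(references, citing_authors):
    """Percentage of references per first-author/last-author category.

    `references` is a list of author-name lists, one list per cited work.
    References sharing an author with the citing paper are treated as
    self-citations and excluded.
    """
    citing = set(citing_authors)
    kept = [authors for authors in references if citing.isdisjoint(authors)]
    counts = Counter(
        f"{predict_gender(authors[0])}/{predict_gender(authors[-1])}"
        for authors in kept
    )
    return {category: 100 * n / len(kept) for category, n in counts.items()}


references = [
    ["Maria Silva", "Aisha Khan"],
    ["John Smith", "Maria Silva"],
    ["John Smith"],
    ["Andrea Rossi", "John Smith"],
    ["Sam Author", "John Smith"],  # self-citation, excluded below
]
print(reference_gender_shares(references, citing_authors=["Sam Author"]))
```

Keeping an explicit "unknown" category, instead of forcing every name into a binary, makes the limitations named in the statement visible in the output: ambiguous or unlisted names are reported as such, and no inferred label should be read as a person's gender identity.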

Because it is easy to imagine how citation diversity statements could lead to tokenism (diversifying citations artificially for the sole purpose of “looking good”), Ray et al. (2022) insist on the ethical importance of citing works that provide information relevant to a paper, and not simply because of some box on a manuscript submission form that needs to be checked. That said, unconscious biases in citing behaviours may not support the best interests of researchers and their research community. Investment in thoughtful, purposeful citations of works one is engaged with will not only strengthen communities but, when done with an awareness of having diverse voices as a strengthening practice, will also improve the overall quality of scholarly works.

Given that these disparities are historical, basing decisions on results such as these, or on stereotypes about the quality of all articles from the Global South (which are also problematic), contributes further to the disparity. From an emancipatory perspective, the path to fixing this is making time to explore, engage with, and understand scholarly production from geographic locations beyond the norm. This not only enriches one’s own writing through a more balanced view but also respects and recognizes advances by researchers.

Is citation technology compatible with “social equity, freedom, and cultural pluralism”, or does its existence require centralized control through ownership, market forces, and power concentrations (Winner 1980)? On the one hand, there is the rather functional view of the phenomenon of social capital, in which we see centres of power within scholarly communication and citations as part of the reward system of science, and in which, by citing, we associate our work with these centralized actors. On the other hand, there is an emancipatory view in which we see citations as a technology that enables us to redistribute recognition, to acknowledge those we have engaged with, and to proliferate ideas that are meaningful to us and to our part within a community.

13.4.1 Citational justice

Kumar and Karusala (2021) introduce Iris Marion Young’s faces of oppression as a framework for understanding and addressing citational (in)justice. They define justice as “a relational value of the actions, structures, and institutions in which persons stand to each other as social and political subjects, be they structures of the production and distribution of material goods or of the exercise of political power”, and view citations as “anti-racist, feminist technologies” (Kumar and Karusala 2021) with the potential to correct the imbalances that have occurred. The authors present some examples of ways in which injustices have shown up in their own work and reviews, which may provide an opportunity for self-reflection upon your own citation practices.

Exploitation – occurs when the balance of work and compensation is leveraged, creating inequality and power dynamics. This supports the rich-get-richer aspect of Merton’s Matthew effect by leveraging power. The authors identify several types of citation behaviours found in their own work.
- The Cite-Me Cite can occur when authors submit papers to journals and the editors pressure them to cite the editors’ own work in return for acceptance. This is of particular concern as a signal of predatory journal practices.
- The Name-Agnostic Cite occurs when hard-to-recognize/pronounce/read names are othered, as in “other authors have investigated…” whereas Western names are clearly cited.
- The In-the-Global-South and Unrelated-to-the-North Cite falls along similar lines as othering or even making certain work irrelevant. See Linxen et al. (2021) for a study exploring this issue.
- The Throwaway Cite occurs when citations are lumped together without individual attention or recognition, as in “studies in LIS have examined the effect of unicorns (Name, 1986; Name, 1993; Name, 2000; Name et al., 2002; Name et al., 2013).” While this practice may be an intent to be exhaustive yet concise, who is this benefiting and for what purpose?
- The No Cite occurs when a reference is omitted, whether consciously or unconsciously. While addressing this type of non-cite requires greater rigour, not doing so is a privilege that is being assumed.
Marginalization – when a category of persons is excluded and thereby deprived, not only at the individual level but also at the collective level. This is evident in conferences that privilege some populations, such as conferences that have never been held in the Global South. Some universities dominate some disciplines, which can lead to a misperception of enhanced value, affecting acceptance and possibly citation. Women, scholars of colour, and gender-diverse scholars also experience the effect of biases upon their communities, as evidenced by citation gaps.
Powerlessness – the condition of those in the community who “lack significant power”, a voice, or the opportunity to contribute to decision making. This occurs when assumptions are made, creating or reinforcing norms that we expect to be accepted. These assumptions, without critical inquiry, can shape not only our readers but ourselves. They can include the assumptions that work from certain groups lacks rigour, that work published at certain venues or in certain journals is not of high quality, that Wikipedia is not a valid source of knowledge, that papers written in other languages are not relevant, etc.
Cultural imperialism – an interpretation of normal within a society that reflects the dominant society’s cultural values, at once othering other groups within society and reinforcing stereotypes that maintain this power imbalance. Examples include the assumption that work in the Global South focuses on the poorest communities, a focus on novelty or differences within other cultures, the assumed universality of Western ethical standards, and the expectation that English written by those outside the Western North is of low quality. Cultural imperialism also occurs when the research of marginalized communities is interpreted within the frameworks of the dominant society for its own use.
Violence – here, it is important to bring in Young’s words from Kumar and Karusala (2021):

“While the frequency of physical attack on members of these and other racially or sexually marked groups is very disturbing, I also include in this category less severe incidents of harassment, intimidation, or ridicule simply for the purpose of degrading, humiliating, or stigmatizing group members.”

The authors go on to say that what matters is less the violent act itself than the social conditions that continue to permit it to happen. They illustrate this face of oppression in scholarly communication with evidence from reviews in which questions are raised about the relevance of a health topic if it only affects a small population, the research aims of Black scholars are criticized as falling outside of typical scholarship, disabled persons are singled out for research, or low citation counts are met with long-standing bias and assumptions about quality or relevancy.

These five faces of oppression attributed to Young by Kumar and Karusala (2021) provide a framework for understanding and deconstructing biases in our choices and assumptions affecting citational justice in scholarly communication. This is not the only framework, nor are these the only examples, that can be found. Unethical citation practices have been around for quite some time, so there are more resources for understanding how they operate.

13.4.1.1 Chicken and Egg

Kwon (2022) argues that citation-based evaluation of individual researchers in any context (e.g., funding, publishing, promotion, awards) needs to change or be reduced in importance and replaced by engaging, recognizing and valorizing ideas from diverse sources. Mott and Cockayne (2017) also recognize citations as a problematic technology but also suggest that citations can act as a feminist and anti-racist “technology of resistance” to correct the imbalance. There appears to be a tension between the need to reduce the influence of citations while at the same time exploiting that influence for the purpose of correcting the injustices perpetrated by them.

13.5 Conclusion

This chapter has examined the imbalances that occur in scholarly communication through Merton, Bourdieu, and Rossiter’s respective sociological theories and how our citation behaviours as authors can represent citational injustice or even epistemic injustice through our conscious and unconscious choices. We critically examined a few of the data sources and tools used for analysis within bibliometrics, such as gender-determining algorithms and global income categorization, for their limitations and advantages, as part of the ongoing attempts by the scientific community to address inequities within our scholarly communication system. I closed with thoughts on thinking about our citation system objectively as functionalists or as emancipatory activists.

In writing such a chapter, it must be acknowledged that it was written by a privileged white person who has settled in Canada. Some of the sources I draw from are by self-identified persons of colour, and these sources should be read fully so that my filtered version does not take away from the significance of their words and experiences. This filtration is typical in Western society and represents the endemic cultural imperialism in our education system. I appreciate the opportunity, as a queer, transgender woman, to provide my perspective. Still, I recognize it is a very narrow lens as a participant within a global community of knowledge producers.

13.6 References

Ahmed, Sara. 2013. “Making Feminist Points.” https://feministkilljoys.com/2013/09/11/making-feminist-points/.

Baggs, Judith Gedney, Marion E. Broome, Molly C. Dougherty, Margaret C. Freda, and Margaret H. Kearney. 2008. “Blinding in Peer Review: The Preferences of Reviewers for Nursing Journals.” Journal of Advanced Nursing 64 (2): 131–38. https://doi.org/10.1111/j.1365-2648.2008.04816.x.

Bourdieu, Pierre. 1975. “The Specificity of the Scientific Field and the Social Conditions of the Progress of Reason.” Social Science Information 14 (6): 19–47. https://doi.org/10.1177/053901847501400602.

Chakrabarty, Dipesh. 2007. Provincializing Europe: Postcolonial Thought and Historical Difference. Princeton Studies in Culture/Power/History. Princeton, NJ: Princeton University Press.

Cole, Jonathan R, and Stephen Cole. 1973. Social Stratification in Science. Chicago, IL: University of Chicago Press.

Fricker, Miranda. 2007. Epistemic Injustice. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198237907.001.0001.

Ghiasi, G., P. Mongeon, C. Sugimoto, and V. Larivière. 2018. “Gender Homophily in Citations.” In, 1519–25. Centre for Science and Technology Studies (CWTS). https://hdl.handle.net/1887/65291.

Harris, Matthew, James Macinko, Geronimo Jimenez, and Pricila Mullachery. 2017. “Measuring the Bias Against Low-Income Country Research: An Implicit Association Test.” Globalization and Health 13 (1). https://doi.org/10.1186/s12992-017-0304-y.

Kumar, Neha, and Naveena Karusala. 2021. “Braving Citational Justice in Human-Computer Interaction.” In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 1–9. Yokohama Japan: ACM. https://doi.org/10.1145/3411763.3450389.

Kwon, Diana. 2022. “The Rise of Citational Justice: How Scholars Are Making References Fairer.” Nature 603 (7902): 568–71. https://doi.org/10.1038/d41586-022-00793-1.

Larivière, Vincent, Chaoqun Ni, Yves Gingras, Blaise Cronin, and Cassidy R. Sugimoto. 2013. “Bibliometrics: Global Gender Disparities in Science.” Nature 504 (7479): 211–13. https://doi.org/10.1038/504211a.

Lee, Carole J., Cassidy R. Sugimoto, Guo Zhang, and Blaise Cronin. 2013. “Bias in Peer Review.” Journal of the American Society for Information Science and Technology 64 (1): 2–17. https://doi.org/10.1002/asi.22784.

Linxen, Sebastian, Christian Sturm, Florian Brühlmann, Vincent Cassau, Klaus Opwis, and Katharina Reinecke. 2021. “How WEIRD Is CHI?” In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–14. Yokohama Japan: ACM. https://doi.org/10.1145/3411764.3445488.

Mas-Bleda, Amalia, and Mike Thelwall. 2016. “Can Alternative Indicators Overcome Language Biases in Citation Counts? A Comparison of Spanish and UK Research.” Scientometrics 109 (3): 2007–30. https://doi.org/10.1007/s11192-016-2118-8.

Meho, Lokman I. 2022. “Gender Gap Among Highly Cited Researchers, 2014–2021.” Quantitative Science Studies, November, 1–21. https://doi.org/10.1162/qss_a_00218.

Merton, Robert K. 1968. Social Theory and Social Structure. New York: Free Press.

Mongeon, Philippe, and Adèle Paul-Hus. 2015. “The Journal Coverage of Web of Science and Scopus: A Comparative Analysis.” Scientometrics 106 (1): 213–28. https://doi.org/10.1007/s11192-015-1765-5.

Mongeon, Philippe, Adèle Paul-Hus, Maria Henkel, and Vincent Larivière. 2022. “On the Impact of Geo-Contextualized and Local Research in the Global North and South.” https://doi.org/10.5281/ZENODO.6956977.

Mott, Carrie, and Daniel Cockayne. 2017. “Citation Matters: Mobilizing the Politics of Citation Toward a Practice of ‘Conscientious Engagement’.” Gender, Place & Culture 24 (7): 954–73. https://doi.org/10.1080/0966369X.2017.1339022.

Patin, Beth, Melinda Sebastian, Jieun Yeon, and Danielle Bertolini. 2020. “Toward Epistemic Justice: An Approach for Conceptualizing Epistemicide in the Information Professions.” Proceedings of the Association for Information Science and Technology 57 (1). https://doi.org/10.1002/pra2.242.

Ray, Keisha S., Perry Zurn, Jordan D. Dworkin, Dani S. Bassett, and David B. Resnik. 2022. “Citation Bias, Diversity, and Ethics.” Accountability in Research 0 (0): 1–15. https://doi.org/10.1080/08989621.2022.2111257.

Rossiter, Margaret W. 1993. “The Matthew Matilda Effect in Science.” Social Studies of Science 23 (2): 325–41. https://doi.org/10.2307/285482.

Scherer, Layne, ed. 2021. Addressing Diversity, Equity, Inclusion, and Anti-Racism in 21st Century STEMM Organizations. National Academies Press. https://doi.org/10.17226/26294.

Traag, V. A., and L. Waltman. 2022. “Causal Foundations of Bias, Disparity and Fairness.” https://doi.org/10.48550/ARXIV.2207.13665.

Winner, Langdon. 1980. “Do Artifacts Have Politics?” Daedalus 109 (1): 121–36. http://www.jstor.org/stable/20024652.