Discipline: Computer Sciences and Information Management
Subcategory: Computer Science & Information Systems
Session: 1
Room: Park Tower 8219
Adrianne Wu - Mount Holyoke College
Co-Author(s): Elizabeth Sigworth, BA, Vanderbilt University, Nashville, TN; Xuanyi Li, BS, Vanderbilt School of Medicine, Nashville, TN; Samuel M. Rubinstein, MD, Vanderbilt University, Nashville, TN; Jeremy L. Warner, MD, MS, Vanderbilt University, Nashville, TN
Background: Progress in the field of hematology/oncology is mediated primarily through the publication of practice-changing clinical trials. Conventional wisdom holds that individual researchers typically specialize and publish in one sub-field. However, the vitality of a field is also driven by collaboration within and across sub-fields. We hypothesized that high-impact individuals could be distinguished from highly collaborative individuals using several metrics.
Methods: We established measures of authors’ impact as well as the diversity of their publications. Our data set draws from the prospective clinical trial literature cited on HemOnc.org, a wiki-based website primarily intended for hematology/oncology professionals. The analyzed data set spans ~4,000 publications comprising >20,000 authors. Diversity is based on the Gini Index, Gi, which classically measures income disparity in a country by evaluating the inequality of values among income levels. We re-purposed it to measure an author’s sub-field diversity using the distribution of their publications across 12 broad cancer sub-fields (e.g., thoracic oncology; breast cancer; lymphoma). A higher Gi indicates more inequality, or specialization. We assigned an impact score based on author position and the impact factor of the journal of publication, with first or last authors and higher-tier journals having more weight.
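As an illustration of how Gi behaves on sub-field counts, the following is a minimal sketch that assumes the standard rank-based Gini formula applied to an author's publication counts across the 12 sub-fields. The abstract does not state which estimator or normalization was used, so the function name `gini` and the two example authors are hypothetical. With this uncorrected formula, an author who publishes in a single sub-field reaches the maximum of 11/12, and an author spread evenly across all 12 scores 0.

```python
from typing import Sequence


def gini(counts: Sequence[float]) -> float:
    """Gini index of non-negative counts (0 = perfectly even distribution).

    Assumption: standard rank-based formula with no small-sample correction,
    so the maximum for n categories is (n - 1) / n.
    """
    n = len(counts)
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    if n == 0 or total <= 0:
        raise ValueError("need at least one publication")
    ordered = sorted(counts)
    # Weight each value by its 1-based rank in ascending order.
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical authors: publication counts in each of 12 sub-fields.
specialist = [0] * 11 + [10]                        # all papers in one sub-field
generalist = [2, 1, 1, 2, 1, 1, 1, 1, 0, 0, 0, 0]   # papers spread over eight

print(f"specialist Gi = {gini(specialist):.3f}")
print(f"generalist Gi = {gini(generalist):.3f}")
```

The specialist's Gi is higher than the generalist's, matching the reading above that a higher Gi indicates more specialization.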
Results: Across all authors, more collaboration across sub-fields (lower Gi) is significantly correlated with a higher median impact (correlation=-0.20, p-value<0.0001; 95% confidence interval -0.21, -0.19). Strikingly, the authors with lower Gi are responsible for the overall network structure; without them, the network would fall apart, leaving only a collection of siloed sub-graphs. However, with few exceptions, the authors with the highest impact also tended to have high Gi. Additionally, the gender distribution of the top 100 authors by each measure suggests a significant disparity. Of the top 100 authors by impact, a mere 9% are women, while women make up 32% of the top 100 by lowest Gi. The odds that a woman is in the top 100 authors by Gi (i.e., the lowest Gi) are 4.64 times the odds that she is in the top 100 by impact (p-value<0.0001).
Conclusions: Future work will focus on the ratio of in-links, or connections to authors who are classified as being in the same sub-field, to out-links, or connections to authors who are classified as being in another sub-field, while also considering the average Gi of the author’s collaborators. We further expect to look into the correlation of Gi with median publication to see if this aligns with the traditional narrative of cancer history, e.g., as chronicled in Siddhartha Mukherjee’s The Emperor of All Maladies, as well as evaluate temporal trends in network structure and examine additional attributes of authorship, e.g., years in the field and institutional affiliation.
Funder Acknowledgement(s): I wish to thank the Department of Biomedical Informatics at Vanderbilt University, in particular Rischelle Jenkins and Kim Unertl, PhD, MS, as well as the Warner lab. This project was funded by NSF Award #1757644.
Faculty Advisor: Jeremy Warner, jeremy.warner@vumc.org
Role: I worked with Dr. Warner and Dr. Rubinstein to establish measures of author impact and publication diversity, and then generated plots and relevant figures based on those metrics in RStudio. Additionally, I was responsible for author disambiguation and parser code maintenance. I independently created a PowerPoint I presented to the Department of Biomedical Informatics summarizing my project as well as a poster for the Vanderbilt Summer Science Academy poster symposium.