Cover Page

Related Titles

Hall, G.M. (ed.)

How to Write a Paper 5e

5th Edition

2012

Print ISBN: 978-0-470-67220-4; also available in electronic formats

 

Gladon, R.J., Graves, W.R., Kelly, J.

Getting Published in the Life Sciences

2011

Print ISBN: 978-1-118-01716-6; also available in electronic formats

 

Hunt, P.C.

Development for Academic Leaders

A Practical Guide for Fundraising Success

2012

Print ISBN: 978-1-118-27017-2; also available in electronic formats

 

Ebel, H.F., Bliefert, C., Russey, W.E.

The Art of Scientific Writing

From Student Reports to Professional Publications in Chemistry and Related Fields

2nd Edition

2004

Print ISBN: 978-3-527-29829-7; also available in electronic formats

 

Deutsche Forschungsgemeinschaft (DFG) (ed.)

Funding Ranking 2009

Institutions - Regions - Networks: Thematic Profiles of Higher Education Institutions and Non-University Research Institutions in Light of Publicly Funded Research

2010

Print ISBN: 978-3-527-32873-4; also available in electronic formats

Roberto Todeschini and Alberto Baccini

 

Handbook of Bibliometric Indicators

Quantitative Tools for Studying and Evaluating Research

 

 

 

Wiley Logo

This book is dedicated with love to

Milo and Giulio, my heroes

RT

To Dadi, Dede, and Edo,

in alphabetical order

AB

Preface

This book aims to collect bibliometric indicators in encyclopedic form and, we hope, to cover all the indicators proposed in the literature.

Knowledge organization provides not only a collection of facts – a store of information – but also a contribution to the growth of knowledge, since organizing knowledge is itself one way of doing research. This is the true purpose of an encyclopedic guide: the organization of knowledge is not separate from its production.

The book addresses the great interest in bibliometric indicators shared by all the actors of the scientific community, as these indicators are used for the analysis and description of research activities and for research evaluation.

The Handbook of Bibliometric Indicators collects definitions, formulas, algorithms, and short comments about the bibliometric indicators known in the literature, defined at different levels, for researchers, groups of authors, institutions, and countries.

All the definitions are organized in alphabetic order, and a great effort was made to unify the symbols in order to enhance the comparability and comprehension of each topic. Some topics related to the definition of indicators and/or to bibliometric methodologies are covered, such as similarity/diversity analysis, correlation measures, graphs, and multicriteria decision making. On the other hand, some topics are only marginally considered, such as content analysis, coword analysis, text analysis, and the visualization and network mapping of scientometric data.

The importance of each definition is not directly related to its length; the length reflects only the need for a clear explanation. An effort was made to collect appropriate bibliographic information under each definition, and care was taken to place first, near each definition, the reference(s) to the scientist(s) who first proposed the presented topic.

We apologize if any relevant indicator and/or work has been missed; although this was not done deliberately, we take full responsibility for any omission.

Milan, January 2016

Roberto Todeschini
Alberto Baccini

Acknowledgments

Special thanks go to Dr. Viviana Consonni, who brought her experience from the previous handbooks produced together with one of the authors (RT), made a fundamental contribution by revising, evaluating, and criticizing the original text, and suggested many valuable modifications.

The authors also gratefully acknowledge Dr. Francesca Grisoni for her valuable support in producing all the graphics and images collected in the book and Dr. Alberto Montesi for his invaluable help in finding many articles and books that otherwise would not have been listed among the references of this book.

One of the authors (AB) would also like to acknowledge his colleagues and friends on the editorial board of the blog www.roars.it, with whom over the years he has discussed at length bibliometric indicators and their application from a research policy perspective.

Notations and Symbols

The notations and symbols used in the book are listed below.

Basic symbols
A Authors
B Bibliometric indices
C Citations
F Research fields
G Groups, categories
J Journals
N Nations, countries
P Papers
R References
Sets
V Set of vertices (nodes) in a graph
E Set of edges (links) in a graph
S Generic set, source set
N Set of natural numbers
R Set of real numbers
Indices
i Index usually running on the papers of an author
j Index usually running on the journals where papers are published
a Index running on the authors
y Generic time index in years
g Index running on the countries, institutions, or research groups
k, m, s General indices
Counts
A_i Number of authors contributing to the ith paper
C_i Number of citations of the ith paper
C_T Total number of citations
C_H Total number of citations in the h-core
C_t Total number of citations in the h-tail
P_i Number of papers of the ith author
r_i Ranking of the ith item
R_i Number of references in the ith paper
T Time window
Y Interval in years
y_i Year of publication of the ith paper
n Total number of items
Characteristic symbols
IF Impact factor
p Probability
p Percentiles
w Statistical weights
Vectors and matrices
a, b, … Column vectors
a^T, b^T, … Row vectors
A, B, … Matrices

The Authors


Roberto Todeschini is a full professor of chemometrics at the Department of Earth and Environmental Sciences of the University of Milano-Bicocca (Milano, Italy), where he founded the Milano Chemometrics and QSAR Research Group. His main research activities concern chemometrics in all its aspects, QSAR, molecular descriptors, multicriteria decision making, and software development. President of the International Academy of Mathematical Chemistry from 2008 to 2014, president of the Italian Chemometric Society, and "ad honorem" professor of the University of Azuay (Cuenca, Ecuador), he is the author of more than 200 publications in international journals and of the books The Data Analysis Handbook (Elsevier, 1994, with I.E. Frank), Handbook of Molecular Descriptors (Wiley-VCH, 2000, with V. Consonni), and Molecular Descriptors for Chemoinformatics (Wiley-VCH, 2009, with V. Consonni).


Alberto Baccini is a full professor of economics at the Department of Economics and Statistics of the University of Siena (Italy). He started his research activity by working on applied economics and the history of economic thought. He focused his attention on the history of the theory of choice under uncertainty, becoming an expert on the works of the British economist Francis Y. Edgeworth. More recently, he has been working on the application of network analysis techniques to the study of scientometric data and on topics at the boundaries between scientometrics, statistics, and economics, such as the statistical properties of scientometric indicators, researchers' productivity, and research evaluation.

User's Guide

This book consists of definitions of technical terms in alphabetic order, each technical term being an entry of the book.

Each topic is organized in a hierarchical fashion. By following cross-references (→ typeset in italics), one can easily find all the entries pertaining to a topic even if they are not located together. Starting from the topic name itself, one is referred to more and more specific entries in a top-down manner.

Each entry begins with an entry line. There are three different kinds of entries: regular, referenced, and synonym.

A regular entry has its definition immediately after the entry line. A regular entry begins with the bookmark dash dot and can be followed by its acronym and/or symbol, if any, and by its ≡ synonyms, if any.

For example:

dash dot coauthorship-weighted indices

A wide set of bibliometric indices that take into account the number of authors of a paper and their relative position in the paper byline, with the aim of assigning each author some credit according to some weighting scheme.

In the text of a regular entry, one is referred to other relevant terms by words in italics indicated by →. To reach a complete view of the topic, we highly recommend also reading the definitions of these referenced terms in conjunction with the original entry.

For example:

dash dot academic age-based indices

These indices are → time-dependent bibliometric indices which, unlike the → paper age-based indices, take into account the whole career of an author, that is, the academic age of the author.

The academic age (Y_aa) is defined as …

A referenced entry has its definition in the text of another entry. Each referenced entry begins with the bookmark ➢, and the symbol → precedes the name of the regular entry where the definition of the referenced entry is located.

For example:

dash dot multidimensional H2-index → partition-based bibliometric indicators

A synonym entry is followed by the symbol “≡” and its synonym typeset in italics.

For example:

dash dot composite indices ≡ composite bibliometric indicators

To find the definition of a synonym entry, if the synonym is a regular entry, one goes directly to the text under the entry line of the synonym word; otherwise, if the synonym is a referenced entry, one goes to the text of the entry indicated by →.

For example:

dash dot linear counting ≡ proportional counting → coauthorship-weighted indices

In the former case, composite indices is the synonym of composite bibliometric indicators, which is a regular entry, while, in the latter, linear counting is the synonym of proportional counting, whose definition is under the entry "coauthorship-weighted indices."

The text of a regular entry may include the definition of one or more referenced entries highlighted in bold face. When there are many referenced entries under one regular entry, called the “main entry,” they are often organized in a hierarchical fashion, denoting them by the symbol •. The subentries can be in either alphabetic or logical order.

dash dot author

The person who originates or gives existence to a → research output. A research output can have multiple authors, usually called coauthors, whose number is the number of coauthors. The names of the authors usually appear in the byline…

Finally, words in italics not indicated by → in the text of a regular entry (or subentry) denote relevant terms for the topic that are not further explained or whose definition is reported in a later part of the same entry.

The symbol dash dot at the end of each entry denotes a list of further readings.

We have made a special effort to keep mathematical notation simple and uniform. A collection of the most often appearing symbols is provided in Notations and Symbols.

Introduction

To be effective, scientific research has to be communicated. Books, book chapters, and journal articles are some of the research outputs produced by scientists in their communication activities. Scientists use an apparatus of citations and references to establish their priority in the discovery of a new finding or to demonstrate the originality of their contribution [Merton, 1957]. Citations and references serve many functions: they acknowledge predecessors, trace back the origins of a new idea, differentiate new findings from received ones, and so on [Bornmann and Daniel, 2008b; Baccini, 2010]. Scientific outputs, references, and citations are the observable features, the raw facts, on which bibliometric indicators are built.

Bibliometric indicators were originally developed for the quantitative study of science [De Bellis, 2014]. They represented the frontier for scholars analyzing science from the point of view of sociology, as they made it possible to quantify Robert K. Merton's theoretical construction. The pioneering work of Eugene Garfield aimed at building efficient tools for transforming raw bibliographic data into useful information for scientists and librarians. The computer revolution, with the possibility of building complex bibliographic and citation databases, fostered the development of the new discipline called scientometrics.

During the last 20 years of the past century, the main traits of public policies were the so-called audit explosion, governance by indicators, and the consequent diffusion of policy evaluation [Dahler-Larsen, 2012]. In this context, research and education policy also needed sets of indicators to inform governments' decisions about research funding. The growing number of scholars working on the new scientometric machinery supplied many instruments able to satisfy governments demanding tools for monitoring science policies. Universities were induced to follow this trend by improving their accountability, and as a consequence they asked their professors and researchers to document the results they achieved and to prove that they produce value for money. In these years, the seed of the publish-or-perish age was planted. Bibliometric and citation databases ceased to be managed by scholars at the boundary between academia and the market and became for-profit products maintained and distributed by multinational publishers. Many firms started up in the fledgling market of research evaluation and consultancy, and newspapers and consultancy firms began to release the now ubiquitous university rankings.

The current explosion of bibliometric indicators is the latest step of a dynamic that started many years ago. It may be useful to disentangle three different perspectives in which bibliometric indicators are currently developed and used. A first perspective can be labeled positive bibliometrics. Its main aim is to describe and explain phenomena in science and scientific communication. In this perspective, bibliometric indicators represent phenomena or proxies of phenomena. For example, the diffusion of a new idea in the scientific community may be proxied, under suitable assumptions, by the number of citations received by the article in which the idea was first presented, that is, by the impact of the article. Since bibliometric indicators are here only tools functional to the study of science, this perspective is the nearest in spirit to the pioneers of bibliometrics. Citation analysis, cocitation analysis, citation networks, informetric laws, and productivity analysis are examples of this stream of literature.

A second perspective, probably the one growing at the fastest pace, is the so-called evaluative bibliometrics [Narin, 1971; Cronin, 2000], aimed at defining quantitative instruments for evaluating articles, scientists, journals, institutions, and so on. In the majority of applications, an ordered list, that is, a ranking, is produced as a result of the application of the chosen tools [Glänzel, 2009; Waltman, van Eck et al., 2013]. In this perspective, bibliometric indicators always depend, implicitly or explicitly, on a value judgement about the observed features. This stream of literature includes articles that propose indicators for ranking scholars, such as the h-index and its many variants, or for ranking journals, such as the impact factor or SNIP.
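
As a concrete illustration of the kind of tool this perspective produces, the following is a minimal Python sketch of the standard definition of the h-index: the largest h such that an author has h papers each cited at least h times. The function name and the sample citation counts are purely illustrative and are not taken from the handbook.

def h_index(citations):
    """Return the largest h such that at least h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)  # per-paper citation counts, decreasing
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank  # the first h papers form the h-core
        else:
            break
    return h

# Illustrative data: an author whose six papers were cited 10, 8, 5, 4, 3, and 0 times.
print(h_index([10, 8, 5, 4, 3, 0]))  # prints 4

In the notation listed earlier, the citations of the first h papers sum to C_H (the h-core), while those of the remaining papers sum to C_t (the h-tail).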

A third perspective, not yet well developed, can be labeled normative bibliometrics; it consists in the definition of indicators and of rules for their use in research evaluation and research policy [Elkana, Lederberg et al., 1978; Leydesdorff and van den Besselaar, 1997; Moed, Glänzel et al., 2004; Baccini, 2010; Herranz and Ruiz-Castillo, 2013]. In this perspective, the focus is primarily on the goals of research evaluation; the problem is to define an evaluative strategy and to design evaluative tools that are coherent with the chosen goals. In particular, bibliometric indicators should be defined and used so as to design a system of incentives for researchers and institutions that pushes them to adopt behaviors aimed at achieving the predefined goals. For example, if the goal of an institution is to climb a university ranking, a system of internal evaluation and rewards should be defined to induce scholars to behave coherently with that goal.

The growth of evaluative bibliometrics and the universal diffusion of citation data and bibliometric indicators about researchers, papers, journals, and institutions are shaping the contemporary landscape of science [Leydesdorff, 2001]. "Policymakers, scientists, scholars, publishers, research administrators, and funding bodies now have at their fingertips an unprecedented trove of data" [Cronin, 2014]. Whether or not indicators are formally embedded in research assessment, since they involve money or reputation, they tend to affect researchers' behavior in ways that are not yet sufficiently explored [Lawrence, 2003; Bornmann, 2011a; Wouters, 2014]. The strategic responses adopted by researchers may change research priorities [Butler, 2003], publication activities [Csada, James et al., 1996; Bissell, 2013], or research organization [Taylor, Perakakis et al., 2013] in ways that policymakers may judge positively. Strategic behaviors may also give rise to unintended and undesirable consequences for the long-run development of science [Gillies, 2008] and for research integrity, by favoring misconduct ranging from plagiarism to the fabrication of results [Brembs, Button et al., 2013] or improper self-citation practices.

It is natural that dangers emerge when indicators are poorly constructed and misused, but having well-constructed indicators is not in itself sufficient to be safe [Lewison, Thornicroft et al., 2007]. For this reason, many efforts have been devoted to discerning best practices and metrics [Rehn, Gornitzki et al., 2014]. Of these efforts, at least three need a short consideration here. The first two have a scholarly nature, while the third is more policy oriented. The first document is the San Francisco Declaration on Research Assessment (DORA), which originated in December 2012 at a meeting of the American Society for Cell Biology [San Francisco DORA, 2013] and has been signed, up to now, by about 600 organizations worldwide and by more than 12,000 individuals. The main aim of the declaration is to support the adoption of good practices in research assessment. The DORA recommendations suggest the adoption of good practices to funding agencies, institutions, publishers, organizations supplying metrics, and, finally, researchers. The central concern of DORA is the need to eliminate the use of journal-based metrics, such as journal impact factors, in funding, appointment, and promotion considerations. Indeed, the improper use of journal impact factors as the master key for every kind of assessment is probably the most widespread abuse in everyday practice. According to DORA, research should be evaluated on "its own merits rather than on the basis of the journal" in which it is published; indeed, the general recommendation reads: "Do not use journal-based metrics … as a surrogate measure of the quality of individual articles, to assess individual scientist's contributions, or in hiring, promotion, or funding decisions." It is worthwhile to note that DORA encourages publishers to make article-level metrics available so as to allow a shift toward assessment based on the scientific impact of individual published articles or research outputs. It also asks organizations that supply metrics to be clear about what constitutes inappropriate manipulation of metrics. DORA commits signatory researchers to adopt the aforementioned general recommendation when they are called to make decisions for assessments and to adopt deontological rules for citing: a researcher should give credit where credit is due, by avoiding, whenever possible, citing review articles instead of the primary literature.

The second document [IEEE, 2013], entitled "Appropriate use of bibliometric indicators for the assessment of journals, research proposals, and individuals," was issued in September 2013 by the Board of Directors of the Institute of Electrical and Electronics Engineers (IEEE), the world's largest professional association for engineering. The document stigmatizes as "misguided and biased" decisions taken by funding agencies and tenure committees in which only bibliometric data are used to assess the "scientific quality" of individual scientists and to evaluate research proposals for funding. IEEE endorses three main tenets for conducting proper assessment. The first consists in the use of multiple complementary bibliometric indicators at all levels; the second coincides with the DORA general recommendation; the third reads as follows: "while bibliometrics may be employed as a source of additional information for quality assessment within a specific area of research, the primary manner for assessment of either scientific quality of a research project or of an individual scientist should be peer review." The IEEE document ends by disapproving of "any practice aimed at influencing the number of citations to a specific journal with the sole purpose of artificially influencing the corresponding indices."

The third document is the "Leiden Manifesto for research metrics," presented at a conference on Science and Technology Indicators in September 2014 and signed by five scientometric scholars [Hicks, Wouters et al., 2015]. It contains 10 general principles which represent, according to the authors, the "distillation of best practice in metric-based research assessment." The Leiden Manifesto is clearly policy oriented, and its main aim appears to be a defense of "the important part" that "research evaluation can play … in the development of science and its interactions with society." The first principle states that "quantitative evaluation should support qualitative, expert assessment." According to the authors, indicators should represent a constraint for peer review. The second principle states that performance should be measured "against the research missions of the institution, group or researcher." It is a generic statement according to which no single evaluation model applies to all contexts, and indicators have to be considered not only in reference to the goals of the evaluation but also by taking into account "wider socio-economic and cultural contexts." Principle 4 refers to the quality of data and of the analytical processes, which should be "open, transparent and simple"; all these conditions, according to the authors, have to be considered "common practice among the academic and commercial groups that built bibliometric evaluation methodology over several decades." Principles 5–8 refer to evaluation procedures, stating that evaluated people and institutions should be allowed to "verify data and analysis," that variation by field in publication and citation practices must be accounted for, and, finally, that individual researchers should be assessed on the basis of "a qualitative judgement of their portfolio," because "reading and judging a researcher's work is much more appropriate than relying on one number." The eighth principle consists in avoiding "misplaced concreteness and false precision" by preferring multiple indicators to a single one. The 10th principle suggests that indicators should be regularly scrutinized and updated, given that assessments and the research system coevolve. Principle 9 is based on the observation that indicators change the system through the incentives they establish for researchers and institutions; the systemic effects of assessment and indicators should therefore be recognized. One of these systemic effects of the use of indicators is a negative bias toward social sciences and humanities research that is regionally and nationally engaged. For this reason, one principle of the Leiden Manifesto, the third, states that "excellence in locally relevant research should be protected," also by developing "metrics built on high-quality non-English literature."

This handbook wishes to contribute to this ongoing debate. Its first aim is to give readers a unified guide to the bibliometric indicators developed in both positive and normative bibliometrics. An effort was made not only to unify the notation of bibliometric indicators but also to group them according to their structure or function. We hope that this effort will make it easier to compare indicators and will inform the choices of scholars and practitioners about which indicators should be used for a given end. We have also tried to make explicit the reasonable interpretation and use of each indicator. Particular attention was devoted to the slippery distinction between research quality, that is, the intrinsic property of the research as judged by peers, and research impact, that is, the influence of a publication on the research activities of scholars as proxied by the number of received citations. Despite the fact that many impact indicators in the scientometric literature were proposed using the word "quality," an effort was made throughout the book to maintain a clear-cut separation between the two notions. We hope that this attention will be a valuable contribution to a better framing of the debate about indicators and their interpretation.

The authors of this handbook have built their work on the shoulders of other scholars. Indeed, more than 1900 bibliographic references were taken into account, covering more than 1370 authors and 360 journals. In the figure that follows, the frequency distribution of the references according to the year of publication is shown, together with an approximate indication of three important events in the definition of bibliometric indicators.

[Figure: frequency distribution of the references by year of publication, with an approximate indication of three important events in the definition of bibliometric indicators.]