This guide provides a brief overview of bibliometrics: methods of extracting measurable data on publication and citation activity, as one possible indicator of research quality, productivity or reach.
A citation is a reference provided by an author to direct a reader to a published or unpublished source or underpinning set of data, usually to acknowledge its relevance to the topic of discussion.
The number of citations an article receives is one indicator of the "academic impact" of the article, providing an indication of its popularity (or reach) in terms of how many people have read and then applied or referred to that research. A high citation count is not a direct indication of high quality, however. Read about the Limitations of some publication metrics on this guide (see Responsible Metrics).
It is possible to track when newly published research cites a published research output you are already interested in.
This could be useful to:
For further information, see our guide to Citation Searching.
In order to monitor citations, you need as comprehensive a citation dataset as possible to make the collection, counting and analysis in any way meaningful.
Below are five key sources of citation data available:
|Key Indicators / Uses
The Initiative for Open Citations (I4OC) is a collaboration between scholarly publishers, researchers, and other interested parties to promote the unrestricted availability of scholarly citation data.
It recognises that in order to best enable researchers, and the wider public, to keep up with new and significant developments in any field, it is "essential to have unrestricted access to bibliographic and citation data in machine-readable form" and that citation data are "not usually freely available to access, they are often subject to inconsistent, hard-to-parse licenses, and they are usually not machine-readable".
Further information about I4OC.
Open Citation data is provided by many academic publishers and may be accessed within a few days through the Crossref REST API (which is fed into our Library Discover service).
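As a minimal sketch of how open citation data can be read programmatically, the example below looks up a single DOI through the public Crossref REST API and reads the `is-referenced-by-count` field from the response. The function names and the sample DOI are illustrative assumptions, not part of any Library service, and the network call is kept inside a function so the example can be tried offline with a hand-made response.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS_URL = "https://api.crossref.org/works/"


def extract_citation_count(response: dict) -> int:
    """Read the citation count from a Crossref 'works' response."""
    # Crossref reports incoming citations as 'is-referenced-by-count'.
    return int(response["message"].get("is-referenced-by-count", 0))


def fetch_citation_count(doi: str) -> int:
    """Look up one DOI via the Crossref REST API (needs network access)."""
    url = CROSSREF_WORKS_URL + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as reply:
        return extract_citation_count(json.load(reply))


# Offline illustration: a made-up response with the same shape as a real one.
sample = {
    "status": "ok",
    "message": {"DOI": "10.1234/example", "is-referenced-by-count": 12},
}
print(extract_citation_count(sample))  # 12
```

Counts obtained this way reflect only the references that publishers have deposited openly with Crossref, so they may be lower than counts from subscription databases.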
Previously provided by Thomson Reuters, and now by Clarivate Analytics, the Web of Science is the original 'Citation Index' for published academic research, originating with the Science Citation Index (SCI) in 1964, and later followed by the 'Arts and Humanities Citation Index' and the 'Social Sciences Citation Index'.
Further information about Web of Science content coverage.
Citation data from the Web of Science is used to calculate the:
Citation data from Web of Science will be used in REF2021, and forms part of the calculations used by the:
Provided by Elsevier, Scopus launched in 2004 as a competitor to Web of Science.
Citation data from Scopus is used to calculate the:
Citation data from Scopus was used in REF2014, and forms part of the calculations used by the:
Dimensions is a linked data platform launched in January 2018 by Digital Science (who also provide services including Altmetric.com, Figshare and ReadCube).
It extracts references between publications either from existing sources (such as Crossref, PubMed Central, or Open Citations (I4OC)), or directly from the full text record provided by the content publisher. Reference extraction is not limited to articles published within journals: it also includes citations from (and to) monographs, text-books, conference proceedings, and pre-prints.
You can see some examples of how publishers and others have used Dimensions data here. You may also spot Dimensions Badges on some article, repository or author profile pages, similar to this one below:
Unlike Web of Science and Scopus (which require subscription access), this is a free to access service which provides citation data.
See our blog post on citation data in Google Scholar, and why citation counts are often higher in Google Scholar than anywhere else.
Many academics create a Google Citations Profile to track citations for their own publications, or use Publish or Perish (free software) to download and calculate various metrics from the data available.
The most basic metric which can be used as a measure of productivity is the number of publications produced by an individual, or group of individuals.
The total sum of citations received by an author's research outputs, or a group of researchers' outputs.
The mean citation rate of a group of research outputs.
Either a total number of publications which have received at least 1 citation, or a percentage of total publications which have received 1 or more citations.
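The four basic metrics described above can all be derived from a simple list of citation counts. The sketch below uses a made-up list for illustration (one number per publication); the variable names are our own labels rather than official metric names.

```python
from statistics import mean

# Hypothetical citation counts, one entry per publication.
citations = [25, 8, 5, 3, 3, 1, 0, 0]

# Number of publications produced.
publication_count = len(citations)

# Total sum of citations received.
citation_count = sum(citations)

# Mean citation rate of the group of outputs.
citations_per_publication = mean(citations)

# Publications with at least 1 citation, as a total and as a percentage.
cited_publications = sum(1 for c in citations if c >= 1)
cited_percentage = 100 * cited_publications / publication_count

print(publication_count)          # 8
print(citation_count)             # 45
print(citations_per_publication)  # 5.625
print(cited_publications)         # 6
print(cited_percentage)           # 75.0
```

Note how one publication with 25 citations accounts for more than half of the total, which is why the mean alone can mislead.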
A comparison of the actual number of citations received by a single output, or large group of outputs, with what might have been the expected number of citations they would receive, based upon the mean number of citations received by all other similar publications (e.g. normalised by output type, output age and field of study).
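As a rough sketch of this kind of normalised comparison, the example below divides the citations each output actually received by the number it was expected to receive, then averages those ratios for the group. The expected values are invented for illustration, and averaging the per-output ratios is an assumption: data providers differ in exactly how they aggregate a group.

```python
from statistics import mean


def normalised_impact(actual: int, expected: float) -> float:
    """Ratio of citations received to citations expected for one output."""
    if expected <= 0:
        raise ValueError("expected citations must be greater than zero")
    return actual / expected


def group_impact(actual: list, expected: list) -> float:
    """Mean of the per-output ratios (one possible aggregation method)."""
    if len(actual) != len(expected):
        raise ValueError("each output needs one actual and one expected value")
    return mean(normalised_impact(a, e) for a, e in zip(actual, expected))


# Hypothetical outputs: citations received, and the mean citations received
# by similar publications (same type, age and field of study).
actual = [10, 4, 0]
expected = [5.0, 8.0, 2.0]

print(normalised_impact(10, 5.0))                 # 2.0
print(round(group_impact(actual, expected), 2))   # 0.83
```

A value of 1.0 means an output was cited exactly as often as expected; values above or below 1.0 indicate more or fewer citations than similar publications.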
The % of a group of outputs which are in the global top 1/10/25% most cited outputs.
The % of a group of outputs which are published in journals in the global top 1/5/10/25%, when journals are ranked by an identified journal metric (e.g. by JIF, Citescore, SJR or SNIP).
Some metrics may also look at the Citation Impact of outputs within a group of outputs, which have a co-author with an affiliation which does not belong to the parent group.
For example, this might offer a comparison of the Citation Impact of a group of articles with international (e.g. where a co-author's affiliation does not belong to the author's institution and is outside that institution's country) or corporate co-authors, compared to the Citation Impact of the whole group of articles.
Calculated from the previous 2 years' worth of citation data found in the Web of Science (Clarivate Analytics) database. It gives an approximate measure of the average number of citations that articles published in that journal over the previous 2 years have received in that year (so a 2015 JIF is the average number of citations received in 2015 for articles published in 2013-14). Citations are not weighted, and you cannot draw any conclusions from comparing journals across subject boundaries, as the metric does not take into account differences in publication or citation culture.
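The arithmetic behind this kind of journal average can be sketched as follows. The journal and all of its figures are made up for illustration, and the function name is our own; the official calculation also applies rules about which items count as citable.

```python
def two_year_average(citations_in_year: int, items_published: int) -> float:
    """Citations received in one year, divided by the number of items
    published in the two preceding years."""
    if items_published <= 0:
        raise ValueError("the journal must have published at least one item")
    return citations_in_year / items_published


# Hypothetical journal: 150 articles published in 2013 and 170 in 2014.
# During 2015 those articles were cited 800 times in total.
articles_2013 = 150
articles_2014 = 170
citations_2015 = 800

jif_2015 = two_year_average(citations_2015, articles_2013 + articles_2014)
print(jif_2015)  # 2.5
```

The result is an average across the whole journal: it says nothing about how many citations any individual article received.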
Further information: https://clarivate.com/webofsciencegroup/essays/impact-factor/
JIF Scores: Available via Web of Science (Journal Citation Reports) - Library Subscription
Calculated from the previous 5 years of citation data as curated by the Journal Citation Reports (Web of Science (Clarivate Analytics) database). Citations are weighted based upon where they come from. Eigenfactor scores are scaled so that the sum of scores for all journals listed in the JCR totals 100, so that a journal with an Eigenfactor score of 1.0 has 1% of the total “influence” of all indexed publications. There are over 11,000 journals ranked, with PLoS One having the highest Eigenfactor Score as of 2019 (with a score of 1.70677, compared to Nature's 1.28501).
Further information: http://www.eigenfactor.org/index.php
Eigenfactor Scores: Available via Web of Science (Journal Citation Reports) - Library Subscription
Calculated from the previous 3 years' worth of citation data found in the Scopus (Elsevier) database. Launched in December 2016, 'Citescore' is similar to the JIF, but is updated monthly as well as annually. It gives an approximate measure of the average number of citations that articles published in that journal over the previous 3 years have received in that year (so a 2016 Citescore is the average number of citations received in 2016 for articles published in 2013-15). Citations are not weighted, and you cannot draw any conclusions from comparing journals across subject boundaries, as the metric does not take into account differences in publication or citation culture.
Further information on the (2020) updated methodology for Citescore: Scopus Blog June 2020
Citescore Rankings: Available via Scopus Journal Metrics
Calculated from the previous 3 years' worth of citation data found in the Scopus (Elsevier) database. Citations are weighted based upon where they come from (a journal with a higher or lower SJR), and normalised based upon the set of documents which cite its papers, thus providing a ‘classification free’ measure for comparison.
Further information: http://www.scimagojr.com/
SJR Scores: Available via Scopus Journal Metrics
Calculated from the previous 3 years of citation data found in the Scopus (Elsevier) database. A journal’s ‘subject field’ is taken into account, normalising for subject specific citation cultures (average number of citations, amount of indexed literature, speed of publication) to allow an easier comparison of scores for journals between different subject areas.
Further information: https://www.elsevier.com/solutions/scopus/how-scopus-works/metrics
SNIP Scores: Available via Scopus Journal Metrics
The Hirsch index (or Hirsch number) was first proposed in 2005 as a measure for the academic productivity and impact of a researcher's publications over their career. An author's h-index will increase over time, as they publish more papers and their published papers attract more citations.
"An author has an h-index of h, if a number h of their papers have h or more citations"
Example: An author has published 22 publications. Of these publications, at least 8 have received at least 8 citations each. The author does not have 9 publications which have received at least 9 citations. Therefore, that author has an h-index of 8.
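The definition and example above can be checked with a short calculation. This is a minimal sketch: the list of citation counts is invented to match the example (22 publications, 8 of which have at least 8 citations), and the function name is our own.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top 'rank' papers all have >= 'rank' citations
        else:
            break      # counts only fall from here, so no larger h exists
    return h


# Hypothetical author with 22 publications.
citations = [40, 30, 22, 15, 12, 10, 9, 8, 7, 5, 4,
             3, 3, 2, 2, 1, 1, 1, 0, 0, 0, 0]

print(len(citations))      # 22
print(h_index(citations))  # 8
```

The ninth most-cited paper has only 7 citations, so the author does not reach an h-index of 9.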
Limitations on use: see Responsible Metrics on this guide.
The h-index is not a useful metric for early career researchers, amongst other criticisms of its usefulness. Some alternative metrics you might want to consider include:
Think of what your h-index doesn't show:
Alternatively, there are several proposed variations on the h-index which are sometimes used or referred to.
The g-index, proposed by Leo Egghe in 2006, is similar to the h-index but aims to take some account of any highly-cited papers.
"[Where a given set of articles are] ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g² citations."
Example: An author has published 22 publications. Of these publications, the sum of the citations of the top 12 articles (by number of citations) is equal to or over 144 (12 squared) citations. The sum of the citations for their top 13 articles (by number of citations) is, however, less than 169 (13 squared) citations. Therefore their g-index is 12.
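The g-index can be sketched in the same way, using a running total of citations. The citation counts below are invented to match the example, and the function name is our own. This sketch assumes the g-index cannot exceed the number of publications; some variants allow it to.

```python
from itertools import accumulate


def g_index(citations):
    """Largest g such that the top g publications together have
    at least g squared citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    for rank, running_total in enumerate(accumulate(ranked), start=1):
        if running_total >= rank * rank:
            g = rank
    return g


# Hypothetical author with 22 publications.
citations = [40, 30, 22, 15, 12, 10, 9, 8, 7, 5, 4,
             3, 3, 2, 2, 1, 1, 1, 0, 0, 0, 0]

print(sum(citations[:12]))  # 165, which is at least 144 (12 squared)
print(sum(citations[:13]))  # 168, which is less than 169 (13 squared)
print(g_index(citations))   # 12
```

The same list of citation counts gives an h-index of 8, which shows how the g-index gives more credit for a few highly-cited papers.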
The M-index, or M-quotient, was also proposed by Hirsch in 2005. It aimed to allow a more fair comparison between academics of differing career lengths.
An author's m-value is found by dividing their h-index by the number of years the author has been actively publishing (measured as the number of years since their first published paper).
Example: An author with an h-index of 18 who has been actively publishing for 6 years will have an m-index of 3. An author with an h-index of 30 who has been actively publishing for 15 years will have an m-index of 2. If the two authors are publishing in the same field of study, this may give a fairer way of comparing the impact of the authors' publication output over the length of each of their publishing careers.
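The two cases in the example above can be reproduced with a one-line calculation. The function name is our own.

```python
def m_quotient(h_index: int, years_publishing: float) -> float:
    """h-index divided by the number of years since the first publication."""
    if years_publishing <= 0:
        raise ValueError("years_publishing must be greater than zero")
    return h_index / years_publishing


print(m_quotient(18, 6))   # 3.0
print(m_quotient(30, 15))  # 2.0
```

Although the second author has the higher h-index, the first author has the higher m-index because their citations have accrued over a much shorter career.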
Below we have highlighted some of the considerations you should bear in mind when making any judgement based on a citation-based metric. You can also print off this guide here:
"The popular view that citation rate is a measure of scientific quality is not supported by the bibliometric expert community. Bibliometricians generally see the citation rate as a proxy measure of scientific impact or of impact on the relevant scientific communities. This is one of the dimensions of scientific or scholarly quality."
If the rate of citation is to be seen as an indicator or proxy for impact or quality, then this assumes that a citation is made in recognition of the contribution that earlier research has made.
But there are recognised problems in when and how some citations are made:
When using citation-based metrics as part of research assessment, it should be recognised that citations can be, and are, used for differing purposes. Many citation-based metrics ignore this, treating each citation equally as a 'positive' indicator of a publication's 'impact' or 'value'.
When using any citation metrics, users should remain aware of what factors may affect citation rates.
Citation rates differ between types of publication (e.g. monographs and journal articles) and types of article (research papers and review articles). Inclusion or exclusion of some document types can also make comparison of some metrics problematic, or may obscure or give disproportionate prominence to some outputs.
Subject or Discipline:
Publication and citation rates vary across disciplines, and are not directly comparable. This can be illustrated if comparing the aggregate JIF or Citescore for different subject categories side by side.
Applied research has been observed to attract fewer citations on average than "basic" or "pure" research in some fields. In some disciplines, quantitative research has been shown to be cited less frequently than more qualitative research papers.
Studies using global analyses of data show the dominance of male-authored articles, and male first-authored articles. Female authors are also more likely to work part-time or take career breaks or mid-career changes.
At the author level, male authors tend to have more citations across their career (impacting on metrics such as citation count and the h-index) - likely due to a range of these and other factors, but at a citation impact level this advantage is less clear.
Research Career Stage:
The so-called Matthew effect in citation accrual describes how the more prestige (citations) an author has, the more likely they are to accrue further prestige (citations), due to their prominence within their field of research.
Some metrics, such as the h-index, are measures of both impact and productivity, and may not fully reflect the impact of an author with only a few publications to their name.
Time since publication:
Citations are accrued over time, and thus the date at which a metric is calculated, and the date range which citation is collected from, will affect the outcome.
Citation accrual may proceed at different rates in different disciplines, and it can be hard to assess the citation impact of recent publications which have not had time to be disseminated and assimilated into the research conversation.
Source of Citation Data:
Scopus, Web of Science, Dimensions and Google Scholar provide different coverage, both in terms of publications indexed, types of publication indexed and the date coverage of those publications.
This will affect any metrics calculated from these datasets: your h-index calculated using data from Scopus will differ from one calculated using citation data from Google Scholar.
Size of the dataset:
Most outputs do not attract large numbers of citations; a few attract many citations and thus "inflate" the average of the dataset as a whole. As no source of citation data is complete, all citation-based metrics are calculated from a sample of the complete data. The smaller the 'sample' dataset, the greater the impact extreme outliers are likely to have on any metrics which use the arithmetic mean of the dataset.
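The effect of a single highly-cited output on the mean can be illustrated with a small example. All of the citation counts below are invented; the point is only to compare how the mean and the median respond to one outlier in a small and a larger dataset.

```python
from statistics import mean, median

# Hypothetical small dataset of 8 outputs with typical citation counts.
typical = [0, 0, 1, 1, 2, 2, 3, 4]

# The same dataset plus one very highly cited output.
small_with_outlier = typical + [250]

# A larger dataset (80 typical outputs) plus the same outlier.
large_with_outlier = typical * 10 + [250]

print(mean(typical), median(typical))       # 1.625 1.5
print(round(mean(small_with_outlier), 2))   # 29.22
print(median(small_with_outlier))           # 2
print(round(mean(large_with_outlier), 2))   # 4.69
print(median(large_with_outlier))           # 2
```

In the small dataset one outlier raises the mean from under 2 to over 29, while the median barely moves. In the larger dataset the same outlier has a much smaller effect on the mean.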
There is some discussion across the academic community around the when, where and how of using journal level metrics as a basis for any evaluation of the research output of an individual author or group of authors.
However, it remains the case that in many situations a value is placed upon where an author has published, not just what they have published, and this may affect your career as a researcher.
Distribution of citations
The Journal Impact Factor (JIF), Citescore and other metrics present a measure of the 'average citations per article' a journal received over a set period. However, the distribution of citations to articles is often highly skewed, with some very highly cited articles and many articles which may not have received any citations at all.
Limitations of subject classifications
Journal Metrics try to account for the differences between disciplines by assigning journals to 'subject categories' to aid comparison.
As a rule, you should not try to compare journals across these subject categories (a JIF of 2 in one category may be very high, but very low in another).
Potential for Gaming
There is a recognised potential for ‘influencing’ the journal impact factor of a journal, which may be in the interests of editors, publishers or authors with a stake in the particular journal. Not all such activities are necessarily unethical, but they do have an impact on the citation rate to articles in the journal.
Even where addressed (e.g. a journal being excluded from a year's JCR publication, and so not being granted a JIF), this can be problematic where nuances in the discipline may not have been taken into account (e.g. rate of journal self-citation in a small and niche research field).
What is citeable?
One criticism of some journal level metrics is that some journal content (for example, letters, editorial or commentary material and other ‘front matter’) is not deemed ‘citeable’; these article types are not included in the ‘number of publications’ element of the metric – but any citations those articles do attract may still be counted in the ‘total citations’ to the journal.
This presents some difficulties in offering a clear comparison between journals within each of the ranking systems.
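The mismatch described above can be illustrated with some simple arithmetic. The journal and all of its figures are made up, and the variable names are our own; the sketch only shows how counting citations to 'non-citeable' items, without counting the items themselves, raises the resulting average.

```python
# Hypothetical journal over the counting period.
research_articles = 100        # counted as 'citeable' items
front_matter_items = 40        # letters, editorials etc., not counted
citations_to_articles = 300
citations_to_front_matter = 50

# All citations counted, but only 'citeable' items in the denominator.
mixed_basis = (citations_to_articles + citations_to_front_matter) / research_articles

# Consistent alternative: research articles only, on both sides.
articles_only = citations_to_articles / research_articles

# Consistent alternative: every item and every citation included.
all_items = (citations_to_articles + citations_to_front_matter) / (
    research_articles + front_matter_items
)

print(mixed_basis)    # 3.5
print(articles_only)  # 3.0
print(all_items)      # 2.5
```

The same journal can therefore score anywhere from 2.5 to 3.5 depending on how 'citeable' content is treated, which is why scores from different ranking systems are hard to compare directly.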
An article can attract a large number of citations from other authors disagreeing or finding fault with the findings, methodology and/or conclusions. This is all a valid and essential part of the scholarly discussion.
Unless that article is retracted (which may not be justified), those citations will still contribute to the aggregated total of citations used in the calculation of most journal level metrics.
Incentivising negative behaviours
One of the greatest concerns amongst the academic community about journal level metrics is not about the reliability of the metrics or the limitations of their use, but about incentivising some negative behaviours in authors and the communication of research.
This could include influencing the choice of where to publish based on recruitment or promotion criteria (rather than the best venue for the research to reach its intended audience or be best served by editorial or peer review), pressures on authors working in interdisciplinary collaborations, pressure on editors to accept articles based on likely citation rather than quality or novelty, and the impact on recruitment and promotion activity focusing on venue of publication over quality of research.
For a useful review and critique of the h-index, see Barnes, C (2017).
Limitations of the h-index
The below image offers an illustration (from Professor Stephen Curry, Imperial College London) of what the h-index obscures or ignores in simplifying citation and publication impact to a single metric.