[Paper review] The counting house: measuring those who count. Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in the Google Scholar Citations, ResearcherID, ResearchGate, Mendeley & Twitter

Alberto Martín‐Martín, Enrique Orduña‐Malea | arXiv (Cornell University) | Jan 19, 2016
Web visibility and informetrics | References: 51 | Cited by: 37
ONE-SENTENCE SUMMARY

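This study analyzes 814 bibliometrics researchers to assess the reliability and completeness of academic profile platforms (Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, and Twitter). It finds that GSC data can give an accurate picture of the bibliometrics community and that GSC indicators correlate highly with those of ResearchGate and with Mendeley readers, while warning that all platforms suffer from data quality problems, particularly in profile completeness and accuracy.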

ABSTRACT

Following in the footsteps of the model of scientific communication, which has recently gone through a metamorphosis (from the Gutenberg galaxy to the Web galaxy), a change in the model and methods of scientific evaluation is also taking place. A set of new scientific tools are now providing a variety of indicators which measure all actions and interactions among scientists in the digital space, making new aspects of scientific communication emerge. In this work we present a method for "capturing" the structure of an entire scientific community (the Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics community) and the main agents that are part of it (scientists, documents, and sources) through the lens of Google Scholar Citations (GSC). Additionally, we compare these author "portraits" to the ones offered by other profile or social platforms currently used by academics (ResearcherID, ResearchGate, Mendeley, and Twitter), in order to test their degree of use, completeness, reliability, and the validity of the information they provide. A sample of 814 authors (researchers in Bibliometrics with a public profile created in GSC) was subsequently searched in the other platforms, collecting the main indicators computed by each of them. The data collection was carried out in September 2015. The Spearman correlation (α = 0.05) was applied to these indicators (a total of 31), and a Principal Component Analysis was carried out in order to reveal the relationships among metrics and platforms as well as the possible existence of metric clusters. We found that it is feasible to depict an accurate representation of the current state of the Bibliometrics community using data from GSC (the most influential authors, documents, journals, and publishers). Regarding the number of authors found in each platform, GSC takes the first place (814 authors), followed at a distance by ResearchGate (543), which is currently growing at a vertiginous speed.
The number of Mendeley profiles is high, although 17.1% of them are basically empty. ResearcherID is also affected by this issue (34.45% of the profiles are empty), as is Twitter (47% of the Twitter accounts have published fewer than 100 tweets). Only 11% of our sample (93 authors) have created a profile in all the platforms analyzed in this study. From the PCA, we found two kinds of impact on the Web: first, all metrics related to academic impact. This first group can further be divided into usage metrics (views and downloads) and citation metrics. Second, all metrics related to connectivity and popularity (followers). ResearchGate indicators, as well as Mendeley readers, present a high correlation to all the indicators from GSC, but only a moderate correlation to the indicators in ResearcherID. Twitter indicators achieve only low correlations to the rest of the indicators, the highest of these being to GSC (0.42-0.46), and to Mendeley (0.41-0.46). Lastly, we present a taxonomy of all the errors that may affect the reliability of the data contained in each of these platforms, with a special emphasis on GSC, since it has been our main source of data. These errors alert us to the danger of blindly using any of these platforms for the assessment of individuals, without verifying the veracity and exhaustiveness of the data. In addition to this working paper, we have also made available a website where all the data obtained for each author and the results of the analysis of the most cited documents can be found: Scholar Mirrors.
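
The coverage figures quoted in the abstract can be checked with simple arithmetic. The snippet below is only an illustrative sketch: the counts (814, 543, 93) come from the abstract, while the dictionary layout and the `share` helper are assumptions made for this example.

```python
# Counts reported in the abstract; the GSC sample is the denominator.
TOTAL_AUTHORS = 814

profile_counts = {
    "Google Scholar Citations": 814,
    "ResearchGate": 543,
    "All five platforms": 93,
}


def share(count, total=TOTAL_AUTHORS):
    """Return the percentage of the sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


for name, count in profile_counts.items():
    print(f"{name}: {count} authors ({share(count)}%)")
```

Because the sample was built from public GSC profiles, GSC covers the whole sample by construction; the other shares are relative to that baseline.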

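MOTIVATION AND OBJECTIVES

  • Assess how complete, reliable, and valid academic profile platforms are as measures of scientific impact.
  • Compare the indicators offered by Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, and Twitter.
  • Identify the data quality problems that affect the accuracy of impact indicators on these platforms.
  • Build a taxonomy of the errors that undermine the credibility of profile-based assessment.
  • Show that GSC data can give an accurate picture of the bibliometrics community, and compare it with the author portraits offered by the other platforms.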

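PROPOSED METHOD

  • Collected data in September 2015 on 814 researchers with a public Google Scholar Citations (GSC) profile.
  • Searched for the same authors on ResearcherID, ResearchGate, Mendeley, and Twitter to compare profile completeness and indicator values.
  • Applied Spearman's rank correlation (α = 0.05) to the 31 indicators collected across the platforms.
  • Ran a Principal Component Analysis (PCA) to identify metric clusters and the relationships among metrics and platforms.
  • Identified and classified the data errors that affect reliability, with special emphasis on GSC as the main data source.
  • Published all data and analysis results on the Scholar Mirrors website.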
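
To make the correlation step concrete, here is a minimal sketch of Spearman's rank correlation in plain Python. It is not the authors' code: the function names and the toy values are hypothetical, and no p-value is computed, so the α = 0.05 significance check is left out (in practice a library routine such as `scipy.stats.spearmanr` returns both the coefficient and a p-value).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values for six authors (not data from the paper).
gsc_citations = [1200, 340, 95, 4100, 15, 780]
rg_score = [28.1, 17.4, 9.9, 39.0, 3.2, 17.4]
twitter_followers = [50, 900, 12, 300, 45, 700]

print(round(spearman(gsc_citations, rg_score), 3))
print(round(spearman(gsc_citations, twitter_followers), 3))
```

Ranking before correlating is what makes the measure robust to the heavy skew typical of citation and follower counts: only the ordering of authors matters, not the size of the gaps between them.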

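EXPERIMENTAL RESULTS

Research questions

  • RQ1: How complete and reliable are researchers' profiles on Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, and Twitter?
  • RQ2: How strongly do the indicators of the different platforms correlate with one another, and in particular with those of Google Scholar Citations?
  • RQ3: Which platform gives the most accurate and comprehensive picture of the bibliometrics community?
  • RQ4: Which types of data errors commonly affect the reliability of academic profile platforms?
  • RQ5: Do the indicators form clusters that reflect different dimensions of scientific impact (for example, citation impact versus social connectivity)?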

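Main findings

  • Google Scholar Citations (GSC) covers all 814 authors, the largest count of any platform (by construction, since the sample was drawn from public GSC profiles), and its data are sufficient to depict an accurate picture of the bibliometrics community.
  • ResearchGate holds profiles for 543 of these authors and is growing rapidly; its indicators correlate highly with all GSC indicators but only moderately with those in ResearcherID.
  • Mendeley has a large number of profiles, but 17.1% of them are basically empty; Mendeley readers correlate highly with the GSC indicators and only moderately with those in ResearcherID.
  • Many Twitter accounts are barely active: 47% have published fewer than 100 tweets. Twitter indicators correlate only weakly with the rest, the highest values being with GSC (0.42-0.46) and Mendeley (0.41-0.46).
  • Only 11% of the sample (93 authors) have a profile on all five platforms, which points to fragmented digital identities among researchers.
  • The PCA reveals two kinds of impact on the Web: academic impact, which splits further into usage metrics (views and downloads) and citation metrics, and connectivity/popularity (followers).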
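
The PCA step can be illustrated with a small, dependency-free sketch. The source does not describe the software or settings the authors used, so everything here is an assumption: the example runs PCA on a Pearson correlation matrix of standardized metrics, extracts components by power iteration with deflation, and uses invented toy values and hypothetical metric names.

```python
from math import sqrt


def standardize(column):
    """Center a metric on its mean and scale it to unit sample variance."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns):
    """Pearson correlation matrix of the metrics (one list per metric)."""
    z = [standardize(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def mat_vec(matrix, vec):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def leading_component(matrix, iterations=1000):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    # A non-uniform start vector makes an exactly orthogonal start unlikely.
    vec = [float(i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(a * b for a, b in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec


def deflate(matrix, eigenvalue, vec):
    """Remove a found component so the next power iteration finds the following one."""
    size = len(vec)
    return [[matrix[i][j] - eigenvalue * vec[i] * vec[j] for j in range(size)]
            for i in range(size)]


# Hypothetical toy values for six authors (not data from the paper).
metrics = {
    "citations": [1200, 340, 95, 4100, 15, 780],
    "h_index": [18, 9, 5, 30, 2, 14],
    "followers": [50, 900, 12, 300, 45, 700],
    "tweets": [80, 2100, 5, 400, 60, 1500],
}

corr = correlation_matrix(list(metrics.values()))
for label in ("PC1", "PC2"):
    eigenvalue, weights = leading_component(corr)
    # The trace of a correlation matrix equals the number of metrics.
    explained = eigenvalue / len(metrics)
    print(label, f"explained variance: {explained:.2f}")
    for name, weight in zip(metrics, weights):
        print(f"  {name}: {weight:+.2f}")
    corr = deflate(corr, eigenvalue, weights)
```

Metrics that move together receive weights of similar sign and size on the same component, which is how a PCA can separate a citation-related group from a follower-related group. With real data a tested library implementation would be the safer choice.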

This review was generated by AI and checked by a human editor.