Delineating the Citation Impact of Scientific Discoveries
Chaomei Chen¹, Jian Zhang¹, Weizhong Zhu¹, Michael Vogeley²
¹ College of Information Science and Technology, Drexel University
² Department of Physics, Drexel University
This work is supported by the National Science Foundation under Grant No. 0612129.
Thomson ISI provides the bibliographic data for the analysis.
There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers—conclusions which he cannot find time to grasp, much less to remember, as they appear. Yet specialization becomes increasingly necessary for progress, and the effort to bridge between disciplines is correspondingly superficial.
A growing trend in science is that massive datasets are collected by one group of scientists and analyzed by another (Gray & Szalay 2004). Two notable examples are the Sloan Digital Sky Survey (SDSS) in astrophysics and the Human Genome Project in biomedicine.
Sloan Survey Data
Figure 3. Prominent keywords assigned by authors and burst terms extracted from titles and abstracts (2002-2006).
Hc, Ht Split
As of June 18, 2007, 95 SDSS papers have 95 or more citations each. The corresponding number in January 2007 was 89.
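The count above has the form of an h-index style statistic: the largest h such that h papers have at least h citations each. The following is a minimal sketch under that reading; the function name `h_index` and the toy citation counts are hypothetical and are not taken from the SDSS data.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have h or more citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Made-up citation counts for illustration only (not SDSS data).
toy_counts = [10, 8, 5, 4, 3, 0]
print(h_index(toy_counts))
```

Sorting once and scanning from the most cited paper keeps the computation simple, so it can be rerun on each new snapshot of the citation data.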
Sc discounts citations accumulated over a long period of time.
St measures the recent impact:
Figure 4. An overview of a decision tree generated from 216 terms selected by log-likelihood ratio values (p<0.01) and a geometric mean split (74.44% classification accuracy). The tree should be read from the root downwards.
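Term selection by log-likelihood ratio can be sketched as follows. This is an illustration only: it assumes the G² statistic on a 2x2 table of term presence versus citation group, compared with the chi-square critical value for one degree of freedom at p = 0.01 (about 6.635). The slides do not state the exact formulation used, and the function names, term names, and counts below are hypothetical.

```python
import math

CHI2_CRIT_P01_DF1 = 6.635  # chi-square critical value, 1 degree of freedom, p = 0.01


def log_likelihood_ratio(k1, n1, k2, n2):
    """G^2 for a term present in k1 of n1 records (group 1) and k2 of n2 records (group 2)."""
    total = n1 + n2
    term_total = k1 + k2
    observed = [k1, n1 - k1, k2, n2 - k2]
    # Expected counts if term presence were independent of group membership.
    expected = [
        n1 * term_total / total,
        n1 * (total - term_total) / total,
        n2 * term_total / total,
        n2 * (total - term_total) / total,
    ]
    g2 = 0.0
    for o, e in zip(observed, expected):
        if o > 0:  # empty cells contribute nothing to the sum
            g2 += o * math.log(o / e)
    return 2.0 * g2


def select_terms(term_counts, n1, n2, threshold=CHI2_CRIT_P01_DF1):
    """Keep terms whose G^2 exceeds the threshold; term_counts maps term -> (k1, k2)."""
    return [
        term
        for term, (k1, k2) in term_counts.items()
        if log_likelihood_ratio(k1, n1, k2, n2) > threshold
    ]


# Hypothetical counts: 100 records in each citation group.
example = {"redshift": (40, 10), "survey": (12, 10)}
print(select_terms(example, 100, 100))
```

A term that appears at similar rates in both groups scores near zero and is dropped; a term concentrated in one group scores high and is kept as a candidate feature for the tree.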
Figure 5. A part of the tree shown in Figure 4. The presence (>0) or absence (<=0) of a term is associated with a citation status group, i.e., the highly and timely cited group.
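The presence/absence test at a tree node can be illustrated with a small sketch that scores one term by information gain. Information gain is a common split criterion and is assumed here; the slides do not say which criterion the tree learner used. The function names, labels, and data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(term_counts, labels):
    """Gain from splitting records on term presence (>0) versus absence (<=0)."""
    present = [y for x, y in zip(term_counts, labels) if x > 0]
    absent = [y for x, y in zip(term_counts, labels) if x <= 0]
    n = len(labels)
    remainder = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


# Hypothetical records: occurrences of one term per record and each record's citation group.
counts = [1, 2, 0, 0]
groups = ["high", "high", "low", "low"]
print(information_gain(counts, groups))
```

A term whose presence separates the groups cleanly yields a high gain and would sit near the root; a term spread evenly across groups yields a gain near zero.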
Figure 6. An alternating decision tree (ADTree) derived from data selected with the same criteria, with 70.55% classification accuracy.
Figure 7. A decision tree with 95.82% classification accuracy, derived from 721 terms and 1,267 records.
Figure 10. The citation history of papers ranked by timeliness shows that recently published papers move up in the rankings.