BIBLIOMETRICS

Tefko Saracevic

Rutgers University

http://www.scils.rutgers.edu/~tefko

© Tefko Saracevic

What is?
  • “… all studies which seek to quantify processes of written communication.” (Pritchard)
  • “… the quantitative treatment of the properties of recorded discourse and behavior pertaining to it.” (Fairthorne)
  • Recorded communication (‘literature’) -> quantitative methods

Alan Pritchard 1969
  • Coined the term "bibliometrics":
    • "the application of mathematics and statistical methods to books and other media of communication"
    • Journal of Documentation (1969) 25(4):348-349

and other related metrics …
  • Related metrics are used to study objects broader than books and articles:
    • Scientometrics
      • covering science in general, not just publications
    • Informetrics
      • all information objects
    • Webometrics or cybermetrics
      • web connections, manifestations
      • using bibliometric techniques to study the relationships or properties of different sites on the web

Concepts

Basic (primitive) concepts:

1. Subject

2. Recorded communication -> document, information object

3. Subject literature

  • Bibliometrics related to:
    • science of science
    • sociology of science - numerical methods

Literature studies
  • Qualitative
    • often in humanities, librarianship
  • Quantitative
    • bibliometrics
  • Mixed

Reasons for quantitative studies of literature
  • Analysis of structure and dynamics
    • search for regularities - predictions possible
  • Understanding of patterns
    • “order out of documentary chaos”
    • verification of models, assumptions
  • Rationale for policies & design

Why quantitative studies?
  • Qualitative methods often depend on assertions, ‘authoritative’ statements, anecdotal evidence
  • Science searches for regularities
  • Success of statistical methods in social sciences
  • Need for justification & basis for decisions
  • If something can be counted, counting it is irresistible

Application in ...
  • History of science
  • Sociology of science
  • Science policy; resource allocation
  • Library selection, weeding, policies
  • Information organization
  • Information management
    • utilization

Historical note
  • Bibliometrics long precedes information science
  • But found intellectual home in information science
    • study of a basic phenomenon - literature
  • It is not ‘hot’ lately, but still produces very interesting results
  • Branched out into web studies (web is a “literature” as well)

What is studied?
  • Governed by the data available in documents or information resources in general: that which can be counted
    • author(s)
    • origin
      • organization, country, language
    • source
      • journal, publisher, patent …

What is studied … more
  • contents
    • text, parts of text, subject, classes
  • representation
  • citations
    • to a document, in a document, co-citation
  • utilization
    • circulation, various uses
  • links
  • any other quantifiable attribute

Tools
  • Science Citation Index
  • Compilation of variables from journals in a subject
  • Use data
  • Publication counts from indexes, or other data bases
  • Web structures, links

Variable: authors
  • number in a subject, field, institution, country
  • growth
  • correlation with indicators like GNP, energy etc.
  • productivity e.g. Lotka’s law
  • collaboration - co-authorship, associated networks
  • dynamics - productive life, transience, epidemics
  • papers/author in a subject
  • mapping

Variable: origin
  • Rates of production, size, growth by
    • country, institution, language, subject
  • Comparison between these
  • Correlation with economic & other indicators

Variable: sources
  • Concentration most often on journals
  • Growth, dynamics, numbers
    • information explosion - exponential laws
    • time movements, life cycles
  • Scatter - quantity/yield distribution
    • Bradford’s law
  • Various distributions
    • by subject, language, country

Variable: contents
  • Analysis of texts
    • distribution of words – Zipf’s law
    • words, phrases in various parts
    • subject analysis, classification
    • co-word analysis

Variable: representation
  • frequency of use of index terms, classes
  • distribution laws - key terms where?
  • thesaurus structure

Variable: citations
  • Studied a lot; many pragmatic results
    • base for citation indexes, Web of Science, impact factors, co-citation studies etc.
  • Derived:
    • number of references in articles
    • number of citations to articles
      • research front; citation classics
    • bibliographic coupling

citations … more
  • co-citations
    • author connections, subject structure, networks, maps
  • centrality
    • of authors, papers
  • validation with qualitative methods
  • impact

Variable: utilization
  • frequency
  • distribution of requests for sources, titles
    • e.g. 20/80 law
  • relevance judgement distributions
  • circulation patterns
  • use patterns

Variable: links
  • Development of link-based metrics
    • in-links, out-links
  • Web structure
  • Web page depth; update
  • PageRank vs quality

Examples from classic studies
  • Comparative publications over centuries
  • Number of journals founded over time
  • Number of abstracts published over time
  • National share of abstracts in chemistry
  • National scientific size vs. economy size
  • Bibliographic coupling and co-citation
  • Web structures, links

Examples of laws & methods
  • Lotka’s law
  • Bradford’s law
  • Zipf’s law
  • Impact factor
  • Citation structures
  • Co-citation structures

Alfred J. Lotka 1926
  • Statistics: the frequency distribution of scientific productivity

Purpose: to "determine, if possible, the part which men of different calibre contribute to the progress of science"

    • Looked at Chemical Abstracts Index, then Geschichtstafeln der Physik
      • J. Washington Acad. Sci. 16:317-325

Lotka’s law: $x^n \cdot y = C$

The total number of authors y in a given subject, each producing x publications, is inversely proportional to the n-th power of x, that is, $y = C / x^n$.

  • Where:
    • x = number of publications
    • y = no. of authors credited with x publications
    • n = constant (about 2 for scientific subjects)
    • C = constant
  • inverse square law of scientific productivity
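
As a rough illustration of the inverse square form, the sketch below computes how many authors would be expected to have x publications. The function name `lotka_expected_authors` and the figure of 100 single-paper authors are assumptions made for this example; they are not data from Lotka's study.

```python
def lotka_expected_authors(c: float, x: int, n: float = 2.0) -> float:
    """Expected number of authors with x publications under x**n * y = C."""
    if x < 1:
        raise ValueError("x must be a positive number of publications")
    return c / (x ** n)


# Hypothetical field: 100 authors have exactly one publication, so C = 100.
C = 100.0
expected = {x: lotka_expected_authors(C, x) for x in range(1, 6)}
for x, y in expected.items():
    print(f"{x} publication(s): about {y:.1f} authors")
```

With n = 2, the count drops to a quarter for two publications and to a ninth for three, which is why a small group of authors accounts for a large share of a literature.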

Lotka's Law - scientific publications

[Figure: curve of $x^n \cdot y = C$; axis label: No. of authors]

Samuel Clement Bradford 1934, 1948
  • Distribution of quantity vs yield of sources of information on specific subjects
    • he studied journals as sources, but the law is applicable to other sources
    • Which journals produce how many articles in a subject, and how are they distributed? or
    • How are articles in a subject scattered across journals?
  • Purpose: to develop a method for identification of the most productive journals in a subject & deal with what he called “documentary chaos”

First published in: Engineering (1934) 137:85-86, then in his book Documentation (1948)

Bradford’s law

"If scientific journals are arranged in order of decreasing productivity of articles on a given subject, they may be divided into a nucleus of periodicals more particularly devoted to the subject and several groups or zones containing the same number of articles as the nucleus, when the numbers of periodicals in the nucleus and succeeding zones will be as $1 : n : n^2 : n^3$ …"

Bradford's Law of Scattering – an idealized example

No. of source journals   No. of articles per source   Total no. of articles
          1                         60                          60
          2                         35                          70
   Subtotal: 3 journals, 130 articles
          1                         30                          30
          2                         25                          50
          2                          9                          18
          4                          8                          32
   Subtotal: 9 journals, 130 articles
         10                          6                          60
          7                          5                          35
          5                          4                          20
          5                          3                          15
   Subtotal: 27 journals, 130 articles

Bradford's Law of Scattering – zones

  • Nucleus: 3 sources, 130 articles
  • Second zone: 9 sources, 130 articles
  • Third zone: 27 sources, 130 articles

Garfield hypothesis
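
The zone division can be checked with a short sketch. It takes the journal counts and articles per journal from the idealized example, ranks journals by productivity, and cuts the ranking into three zones of equal article yield. The function name `bradford_zones` and the simple greedy cut are assumptions made for illustration; real data rarely divides this cleanly.

```python
# (number of journals, articles per journal), from the idealized example.
TABLE = [(1, 60), (2, 35), (1, 30), (2, 25), (2, 9),
         (4, 8), (10, 6), (7, 5), (5, 4), (5, 3)]


def bradford_zones(table, n_zones=3):
    """Split journals, ranked by productivity, into zones of about equal yield."""
    # One entry per journal, most productive first.
    per_journal = sorted(
        (articles for count, articles in table for _ in range(count)),
        reverse=True,
    )
    target = sum(per_journal) / n_zones  # articles each zone should hold
    zones, current = [], []
    for articles in per_journal:
        current.append(articles)
        # Close a zone once it reaches the target; the last zone takes the rest.
        if sum(current) >= target and len(zones) < n_zones - 1:
            zones.append(current)
            current = []
    zones.append(current)
    return zones


zones = bradford_zones(TABLE)
sizes = [len(zone) for zone in zones]    # journals per zone
yields = [sum(zone) for zone in zones]   # articles per zone
print("journals per zone:", sizes)
print("articles per zone:", yields)
print("multiplier n between zones:", sizes[1] / sizes[0], sizes[2] / sizes[1])
```

For this example the zones hold 3, 9, and 27 journals with 130 articles each, so the multiplier n is 3.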

George Kingsley Zipf 1935, 1949
  • The psycho-biology of language: an introduction to dynamic philology (1935)
  • Human behavior and the principle of least effort: An introduction to human ecology (1949)
  • Looked, among other things, at frequency distributions of words in given texts
    • counted the distribution in James Joyce’s Ulysses
  • Provided an explanation as to why the observed distributions occur: the principle of least effort

Zipf’s law: r • f = c
  • Where:
    • r = rank (in terms of frequency)
    • f = frequency (no. of times the given word is used in the text)
    • c = constant for the given text
  • For a given text the rank of a word multiplied by the frequency is a constant
  • Works well for high frequency words, not so well for low frequency ones, hence a number of modifications
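
A minimal sketch of how the rank-frequency product can be tabulated for a text. The function name `rank_frequency`, the tokenizing rule, and the sample sentence are assumptions for illustration; a sentence this short is far too small to show the regularity, which only emerges in long texts.

```python
import re
from collections import Counter


def rank_frequency(text: str):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    # Simplistic tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    rows = []
    for rank, (word, freq) in enumerate(counts.most_common(), start=1):
        rows.append((rank, word, freq, rank * freq))
    return rows


sample = "the cat and the dog and the bird saw the cat"
rows = rank_frequency(sample)
for rank, word, freq, product in rows:
    print(rank, word, freq, product)
```

Running the same function over a full-length text and looking at the last column shows where the product r • f stays roughly constant and where it drifts.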

Charles F. Gosnell 1944 Obsolescence
  • He studied obsolescence of books in academic libraries via their use
    • College Res. Libr. (1944) 5:115-125
  • But this was extended to study of articles via citations, and other sources
  • Age of citations in articles in a subject:
    • half-life: half of the citations are x years old or less
      • different subjects have very different half-lives
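
One simple way to estimate a citation half-life is to take the median age of the cited items, sketched below. The function name `citation_half_life` and the publication years are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import median


def citation_half_life(citing_year: int, cited_years: list[int]) -> float:
    """Median age of the cited items: half of the citations are this old or younger."""
    ages = [citing_year - year for year in cited_years]
    if any(age < 0 for age in ages):
        raise ValueError("a cited item cannot be newer than the citing article")
    return median(ages)


# Hypothetical reference list of an article published in 2005.
cited_years = [2004, 2003, 2003, 2001, 1999, 1995, 1990, 1980]
half_life = citation_half_life(2005, cited_years)
print(f"Citation half-life: {half_life} years")
```

Here the ages are 1, 2, 2, 4, 6, 10, 15, and 25 years, so the half-life is 5 years. Pooling the reference lists of many articles in a subject gives the subject-level figure.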

Curve of obsolescence

[Figure: number of users plotted against age at time of use]

Eugene Garfield 1955
  • Focused on scientific & scholarly communication based on citations
    • Science (1955) 122:108-111
  • Founded Institute for Scientific Information (ISI)
    • major product now ISI Web of Knowledge
  • Impact factor for journals, based on how much a journal is cited
  • Mapping of a literature in a subject
  • Citation indexes/web of knowledge
    • MAJOR resources in bibliometric studies

Citation matrix

[Figure: citation matrix linking citing articles to cited articles]

Science Citation Index

Association-of-ideas index

[Figure: network of citing articles and cited articles]

Co-citation analysis

Articles that cite the same article are likely to both be of interest to the reader of the cited article

[Figure: citing articles and an article, with the annotation "These two articles are likely to be related"]
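
Two related pair counts can be derived from the same citation data, and the sketch below computes both under their standard definitions. Co-citation counts how often two cited articles appear together in one reference list; bibliographic coupling counts how many references two citing articles share. The article labels and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: citing article -> set of articles it cites.
REFERENCES = {
    "A": {"X", "Y"},
    "B": {"X", "Y", "Z"},
    "C": {"Y", "Z"},
}


def cocitation_counts(references):
    """Count how often each pair of cited articles is cited together."""
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


def coupling_strength(references):
    """Count shared references for each pair of citing articles."""
    strength = Counter()
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strength[(a, b)] = shared
    return strength


cocited = cocitation_counts(REFERENCES)
coupled = coupling_strength(REFERENCES)
print("co-citation:", dict(cocited))
print("bibliographic coupling:", dict(coupled))
```

Pairs with high counts in either table are candidates for being related in subject, which is the basis for the maps and networks mentioned earlier.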

Impact factor (IF)

number of citations received in the current year by papers published in the journal in the previous two years

divided by

number of papers published in the journal in the previous two years

  • IF has become over time a crucial indicator of journal quality and
    • given ISI a monopoly position in the evaluation of journal quality
  • Reported in Journal Citation Reports (1976-)
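
The ratio can be written out as a small calculation. The function name `impact_factor` and the journal figures below are hypothetical, not values from Journal Citation Reports.

```python
def impact_factor(citations_this_year: int, papers_prev_two_years: int) -> float:
    """Citations received this year to the previous two years' papers, per paper."""
    if papers_prev_two_years <= 0:
        raise ValueError("the journal must have published papers in the window")
    return citations_this_year / papers_prev_two_years


# Hypothetical journal: 120 papers in 2003 and 130 in 2004;
# in 2005 those papers were cited 500 times in total.
papers = 120 + 130
citations_2005 = 500
if_2005 = impact_factor(citations_2005, papers)
print(f"2005 impact factor: {if_2005:.2f}")
```

In this example each paper from the two-year window was cited twice on average during 2005, giving an impact factor of 2.00.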

Garfield’s HistCite
  • “Bibliographic Analysis and Visualization Software”
  • Provides citation statistics & graphs for people, journals, institutions …
    • various citations scores, no. of cited references in articles … various graphs with connections
  • Example: articles and authors for JASIST (and predecessor names) for 1956-2004
    • includes citations to authors

Conclusion
  • Bibliometrics and the related scientometrics, informetrics, and webometrics provide insight into a number of properties of information objects
    • some general, predictive “laws” formulated
    • structures have been exposed, graphed
    • myriad data collected & analyzed
  • A good area for research!

Sources used in making this presentation, among others
  • Ruth Palmquist Bibliometrics
  • Donna Bair-Mundy Boolean, bibliometrics, and beyond
  • Short set of bibliometric exercises by J. Downie

http://people.lis.uiuc.edu/~jdownie/biblio/
