
Evaluating (Scientific) Knowledge for people, documents, organizations/activities/communities






Presentation Transcript


  1. Evaluating (Scientific) Knowledge for people, documents, organizations/activities/communities
  ICiS Workshop: Integrating, Representing and Reasoning over Human Knowledge, Snowbird, August 9 2010
  Geoffrey Fox gcf@indiana.edu
  http://www.infomall.org http://www.futuregrid.org http://pti.iu.edu/
  Director, Digital Science Center, Pervasive Technology Institute
  Associate Dean for Research and Graduate Studies, School of Informatics and Computing, Indiana University Bloomington

  2. My Role
  • My research would be on building Cyberinfrastructure (MapReduce/Bigtable/Clouds/Information Visualization) for “Integrating, Representing and Reasoning over Human Knowledge”
  • Use FutureGrid to prototype Cloud/Grid environments
  • Here I talk in my role as a frustrated journal editor and School bureaucrat responsible for advising faculty on how to get NSF grants and tenure

  3. Knowledge Evaluation is Important?
  • Review of journal or conference papers
  • Several conference management systems exist, but they don't offer reviewing tools
  • Supporting the choice of panels reviewing proposals
  • And proposal review itself
  • Supporting the choice of the Program Committee for a conference
  • Supporting the promotion and tenure process
  • h-index appears in several referee reports
  • Supporting the ranking of organizations such as journals, universities and (Computer Science) departments
  • Deciding if some activity is useful, such as TeraGrid; a particular agency or agency program; a particular evaluation process (panel v. individual reviews)
  • Deciding if some concept is useful, such as multidisciplinary research, theory, computing …
  • Evaluation of Knowledge evaluation methodologies

  4. “Policy Informatics” aka “Command & Control” (military knowledge)
  • In the Data-Information-Knowledge-Wisdom-Decision (Evaluation) pipeline, some steps are “dynamic” (they can be redone if you save the raw data) but decisions are often “final” or “irreversible”
  • We could (and, as preprints, do) publish everything since “disks are free”, and change our evaluations later
  • But there is a finite amount of research funding and a finite number of tenure positions

  5. [Image-only slide; no text recovered]

  6. Citation Analysis
  • Use of Google Scholar (Publish or Perish) to analyze the contribution of individuals is well established
  • #papers, #citations, h-index, hc-index (contemporary), g-index (square), … (see the sketch below)
  • There is ambiguity as to the “best metric”, and whether such metrics are sound at all, but in some cases perhaps the most serious problem is calculating them in an unbiased fashion
  • One can probably find metrics for “Geoffrey Fox”, but it's hard for more common names, and for example most Asian names are hard
  • Google Scholar has a crude approach to refining results by including and excluding terms, e.g. include “Indiana University” or exclude “GQ Fox” (not clear where the words must appear?)
  • “Automating” this is hard unless the analysis for each name is done by hand
  • Even the name is nontrivial – need both “GC Fox” and “Geoffrey Fox”
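
The metrics named above have simple definitions once per-paper citation counts are in hand; the hard part the slide points to is getting those counts cleanly and unambiguously. As an illustration only, a minimal Python sketch of the h-index and g-index over a made-up list of citation counts:

```python
# Hypothetical sketch of two of the citation metrics mentioned above, computed
# from a plain list of per-paper citation counts. The input data would have to
# come from a service such as Google Scholar; gathering it cleanly is exactly
# the hard part the slide describes.

def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Largest g such that the top g papers together have >= g**2 citations."""
    counts = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(counts, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

if __name__ == "__main__":
    cites = [45, 30, 22, 17, 9, 9, 6, 4, 2, 1, 0]   # made-up citation counts
    print("h-index:", h_index(cites))               # -> 6
    print("g-index:", g_index(cites))               # -> 10
```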

  7. Evaluating Documents
  • As a journal editor, I find choosing referees (and persuading them to write a report) the hardest problem
  • Especially with an increasing number of non-traditional authors
  • Need to identify related work and find the authors, or previous referees, of these related papers
  • Currently ScholarOne uses a largely useless keyword system
  • Can also assess the originality of an article from the overlap in text between it and some corpus (typically a conference paper resubmitted unchanged); see the sketch below
  • If unfamiliar with the authors, need to identify which author of a multi-author paper is the appropriate choice, where they are now, and their contact information
  • Current services (DBLP, ACM Portal, LinkedIn, Facebook) don't tell you the necessary information
  • Need tools to quantify the reliability of referees
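
The text-overlap idea can be made concrete with standard similarity machinery. A minimal sketch, assuming a corpus of prior papers with a hypothetical {title, text, authors} schema and using TF-IDF cosine similarity from scikit-learn; the 0.8 near-duplicate threshold is an arbitrary choice, not anything from the talk:

```python
# Sketch: rank previously published papers by TF-IDF cosine similarity to a new
# submission, both to flag a likely unchanged resubmission and to surface their
# authors as candidate referees. Schema and threshold are assumptions.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_related(submission_text, corpus):
    """corpus: list of dicts with 'title', 'text' and 'authors' keys (hypothetical schema)."""
    docs = [submission_text] + [p["text"] for p in corpus]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
    sims = cosine_similarity(tfidf[0:1], tfidf[1:]).ravel()
    ranked = sorted(zip(sims, corpus), key=lambda pair: -pair[0])
    for score, paper in ranked:
        if score > 0.8:                        # arbitrary near-duplicate threshold
            print("possible resubmission of:", paper["title"])
        print(f"{score:.2f}  candidate referees: {paper['authors']}")
    return ranked
```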

  8. Is High Performance Computing Useful for Improving Knowledge?
  • Are papers that use TeraGrid “better” than those that don't?
  • Does TeraGrid help enhance Knowledge?
  • Correlate the quality and type of papers with “use of TeraGrid”
  • Possibly can be done by text analysis (does the paper acknowledge TeraGrid?); see the sketch below
  • Here use the indirect map TeraGrid → Projects/People → Papers
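
The direct text-analysis option could be as simple as pattern matching on acknowledgement sections. A rough sketch, where the phrase list and the toy corpus are my assumptions; the indirect TeraGrid → Projects/People → Papers mapping used in the following analyses is not shown:

```python
# Sketch: flag whether a paper's full text acknowledges TeraGrid, so that paper
# quality can later be correlated with "use of TeraGrid". Real papers would need
# PDF text extraction first, and the patterns below are guesses.

import re

ACK_PATTERNS = [
    r"\bTeraGrid\b",
    r"supported\s+(in\s+part\s+)?by\s+.*TeraGrid",
    r"TeraGrid\s+allocation",
]

def acknowledges_teragrid(paper_text):
    """Return True if any acknowledgement-style pattern matches the text."""
    return any(re.search(p, paper_text, flags=re.IGNORECASE) for p in ACK_PATTERNS)

# Toy corpus of {paper id: full text} to illustrate tagging a collection.
papers = {"paper-42": "... This work used an allocation on the TeraGrid ..."}
flags = {pid: acknowledges_teragrid(text) for pid, text in papers.items()}
print(flags)   # {'paper-42': True}
```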

  9. TeraGrid Analysis I: Bollen

  10. TeraGrid Analysis II: Bollen

  11. TeraGrid Web of Science

  12. Need a Freely Available Toolkit
  • Firstly, current tools such as Google Scholar and CiteSeer have insufficient scope
  • Google Scholar is stuck in an early stage of “perpetual beta (?alpha)” after killing Windows Academic Live
  • Secondly, need to enable customization so that one can explore evaluation choices
  • Current CS department rankings put Indiana in the dungeon – partly because Fox/Gannon papers are not counted, as they are not in approved journals
  • Don't want to let Thomson control Impact Factors (relevant for tenure, especially in Asia?) without scientific scrutiny (an open computation is sketched below)
  • As discussed, ScholarOne (also Thomson) is dreadful but seems to have growing adoption
  • Want to explore new ideas such as evaluating TeraGrid
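
As an example of the kind of scrutiny an open toolkit would allow, here is a minimal sketch of the standard two-year Journal Impact Factor computed from raw records rather than taken on trust; the record schema is my assumption about how such a toolkit might store its data:

```python
# Sketch of a two-year Journal Impact Factor: citations received in a given
# year to items published in the two preceding years, divided by the number of
# citable items published in those two years. Data layout is hypothetical.

def impact_factor(pub_years, citations, year):
    """
    pub_years: list of publication years of the journal's citable items.
    citations: list of (cited_item_year, citing_year) pairs.
    """
    window = {year - 1, year - 2}
    items = sum(1 for y in pub_years if y in window)
    cites = sum(1 for cited_y, citing_y in citations
                if cited_y in window and citing_y == year)
    return cites / items if items else 0.0

# Toy example: 3 items published in 2008-2009, 4 citations to them during 2010.
pubs = [2008, 2008, 2009, 2006]
cits = [(2008, 2010), (2008, 2010), (2009, 2010), (2009, 2010), (2006, 2010)]
print(round(impact_factor(pubs, cits, 2010), 2))   # -> 1.33
```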

  13. Tools Needed
  • More accurate scientific profiles; ACM Portal says I have 3 publications; DBLP 250; Google Scholar 1000
  • None of them tells you my contact and professional information (one possible merged profile record is sketched below)
  • An unbundled CiteSeer/Google Scholar allowing more accurate document analysis
  • e.g. analyze the document at hand (as in a conference submission)
  • Open decomposition into Authors, Title, Institution, emails, Abstract, Paper, citations
  • Analyzers of citations and/or text to suggest referees
  • Analysis of the novelty of a document
  • Tool to produce an accurate h-index (etc.)
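
One way such a toolkit could expose more accurate profiles is a single merged record per researcher that reconciles the inconsistent sources. A sketch of what that record might look like; the field names are illustrative, while the per-source publication counts and contact details come from the slide:

```python
# Hypothetical merged researcher profile record, combining name variants,
# contact/professional information, and the (inconsistent) publication counts
# reported by each source. Field names are illustrative only.

from dataclasses import dataclass, field

@dataclass
class ResearcherProfile:
    canonical_name: str
    name_variants: list                     # e.g. ["GC Fox", "Geoffrey Fox"]
    institution: str = ""
    email: str = ""
    source_counts: dict = field(default_factory=dict)   # counts per source
    paper_ids: list = field(default_factory=list)        # de-duplicated papers

profile = ResearcherProfile(
    canonical_name="Geoffrey C. Fox",
    name_variants=["GC Fox", "Geoffrey Fox"],
    institution="Indiana University Bloomington",
    email="gcf@indiana.edu",
    source_counts={"ACM Portal": 3, "DBLP": 250, "Google Scholar": 1000},
)
print(profile.source_counts)
```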

  14. Some Research Needed
  • Open analysis of concepts like Impact Factor, h-index, and indexing services
  • Look at their definitions and the possibility of making valid deductions from them
  • How do we evaluate “groups” (research groups, departments) as opposed to individuals? (one possible aggregation is sketched below)
  • Can one automate the currently time-consuming manual steps?
  • Identity confusion in Google Scholar
  • Research profiles
  • Compare the traditional ethnography approach to evaluation (do a bunch of interviews) versus a data-deluge-enabled version
  • Why are Web 2.0 tools like Delicious, Facebook etc. little used in science?
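
For the group-versus-individual question, one possible (purely illustrative) aggregation is to pool all members' papers, count jointly authored ones once, and compute an h-index over the pooled citation counts:

```python
# Illustrative "group h-index": pool every member's papers, de-duplicate by
# paper id so jointly authored work counts once, then compute an h-index over
# the pooled citation counts. One of many possible group metrics, not the
# talk's proposal.

def group_h_index(member_papers):
    """member_papers: dict mapping member name -> {paper_id: citation_count}."""
    pooled = {}
    for papers in member_papers.values():
        pooled.update(papers)                       # de-duplicate by paper_id
    counts = sorted(pooled.values(), reverse=True)
    return sum(1 for rank, c in enumerate(counts, 1) if c >= rank)

dept = {
    "A": {"p1": 40, "p2": 12},
    "B": {"p2": 12, "p3": 7, "p4": 3},              # shares p2 with A
}
print(group_h_index(dept))                          # -> 3
```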
