Information Seeking Behavior of Scientists Brad Hemminger firstname.lastname@example.org School of Information and Library Science University of North Carolina at Chapel Hill
Contributors • Assisting Researchers • Jackson Fox (web survey) • Steph Adams (participant recruiter) • Dihui Lu (initial descriptive statistical analysis) • Billy Saelim (continued statistical analysis) • Chris Weisen (Odum Institute, statistical consultant) • Feedback on Survey Design • UNC Libraries: Bill Burke (Botany), David Romito (Zoology), Jimmy Dickerson (Chemistry), Zari Kamarei (Math/Physics) • KT Vaughan (Health Sciences Library) • Cecy Brown (University of Oklahoma) • Supported by • UNC Libraries • Carolina Center for Genome Sciences • Basic Science Department chairs • RENCI P20 grant
Why Study Information Seeking Behavior of Scientists • Goal is to improve scholarly communications. Other areas of my research involve presentation aspects (visualization/computer human interaction) and the storage and communication of scholarly information (digital libraries, institutional repositories, virtual communities of practice). • To do this we need to understand how people search out and use information currently, and why. As part of investigating this we found that there has been a significant change in the last 5-10 years. • So we’re studying information seeking behavior (ISB) both to understand it and to look at recent changes.
How to Study the Information Seeking Behavior of Scientists? • Survey • Reach many people • Address common questions • Produce lots of feedback for libraries • Quantitative, models of variance (“positivist” approach) • Interviews • In depth coverage of selected groups (bioinformatics) • Use grounded theory and critical incident techniques to capture more qualitative, contextual experiences • Develop models of information processing and use
Survey--Long Term Plan • Conduct an initial survey study at UNC. Develop survey instrument and interview methodologies that work here, but could easily be applied on a larger scale. • From the results of the initial UNC study, draft national version (with feedback from national sites). • Run national study. Set it up so that other sites only have to recruit subjects; the entire survey runs off of the UNC website. Hopefully this results in a large number of sites and participants for minimal experimental costs.
Survey Sampling Technique • Census • Need to be able to reach all members • Best if we can get responses from a large segment of the population • Results in potentially more input from wider audiences, especially for the open comment questions. • Subject to bias (only computer users take it, etc.) • Random sample • Statistically, generally a better choice • Higher cost and significantly more work due to identifying and following up with individual subjects
Questions • Questions were based on • Prior studies with which we wished to correlate our results. This is facilitated by authors who have published their surveys (as paper appendices, e.g. Cecy Brown), and especially by those who have put their collections of surveys online (e.g. Carol Tenopir). • This allows us to compare results over time, as well as to clarify current practices (for instance whether print or electronic formats are used, breaking this out into two questions: retrieval versus reading) • Covering issues that our librarians were concerned about • Developed over several drafts that were reviewed by representatives from all libraries on campus.
Survey Instrument Choices • Paper • Phone • Email • Web-based. While these can require more effort than anticipated, if the number of survey respondents is over several hundred it is generally more cost effective*. This seemed the best choice since our pilot survey was of several thousand subjects, and our national survey was planned for tens of thousands. Since we have web and database expertise we were able to automate the process with minimal startup costs. *[Schonlau 2001, “Conducting Research Surveys via E-mail and the Web”].
Data Acquisition Details • PHP Surveyor used for web based survey. Another common choice at our school for simpler surveys is Survey Monkey. PHP Surveyor allowed us to ask multi-part questions, and to constrain answers to specific format responses. • PHP Surveyor dumps data directly into a MySQL database. • Data is cleaned up, then fed into SAS for analysis. (Data cleaning is still a significant manual effort! Examples were determining Dept/CB, and browsers that didn’t validate datatypes on forms properly.)
Subjects and Recruitment • Subjects are university faculty, grad students and research staff. • We approached all science department chairs to get support first. • Contact • Initial contact was by email giving motivation for study, indication of support by depts&campus, and link to web-based survey. • Follow-ups by letter, then two emails • Flyers in department, Pizza Party Rewards
Look at Survey • 902 participants from recruited departments, which were classified as either science or medicine. • Participation rate was 26%. • Figure: Participants by Department
Analysis • For the quantitative response variables standard descriptive statistics (mean, min, max, standard deviation) are computed, and histograms are used to visualize the distribution. • Categorical variables are reported as counts and percentages for each category, and displayed as frequency tables.
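As a rough illustration of the two kinds of summaries described on this slide, the sketch below uses only the Python standard library. The study itself used SAS; the function names `describe` and `frequency_table` and the sample values are hypothetical and are not survey data.

```python
from collections import Counter
from statistics import mean, stdev


def describe(values):
    """Return mean, min, max, and sample standard deviation for a quantitative variable."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": stdev(values),
    }


def frequency_table(responses):
    """Return (category, count, percent) rows, most frequent category first."""
    counts = Counter(responses)
    total = len(responses)
    return [(cat, n, 100.0 * n / total) for cat, n in counts.most_common()]


# Hypothetical responses, for illustration only (NOT data from the survey).
library_visits = [0, 1, 1, 2, 4, 12]
preferred_interface = ["Google", "Library", "Google", "Google"]

print(describe(library_visits))
for category, count, percent in frequency_table(preferred_interface):
    print(f"{category:10s} {count:3d} {percent:5.1f}%")
```

A histogram of the quantitative values would complement `describe`, since a mean and standard deviation alone can hide a skewed distribution.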
Analysis: Correlations • Categorical vs Categorical • Chi-square • Categorical vs Quantitative • Analysis of Variance • Quantitative vs Quantitative • Correlation • Examples are by dept analysis of other features; age vs preferred interface (Google or Library)
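For the categorical versus categorical case, a minimal sketch of the Pearson chi-square test is shown below in plain Python. It is not the SAS procedure used in the study. The counts are hypothetical, the helper names are invented for illustration, no continuity correction is applied, and the p-value helper is only valid for 2 x 2 tables (1 degree of freedom).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_one_df(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts (NOT survey data):
# rows = researcher group, columns = (prefers Google, prefers Library).
observed = [[60, 40],
            [50, 50]]
stat = chi_square_statistic(observed)
print(f"chi-square = {stat:.3f}, p = {p_value_one_df(stat):.3f}")
```

The same layout would apply to a department by preferred-interface table, although tables larger than 2 x 2 need the general chi-square distribution for the p-value. A comparison such as age versus preferred interface pairs a quantitative variable with a categorical one, so it falls under analysis of variance instead.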
Simple Questions • Ninety-one percent of the participants had access to the internet in their office or lab. • “Do you maintain a personal article collection?” Most participants (85.4%) responded that they did, while only 14.6% did not. • “Do you maintain a personal bibliographic database for print and/or electronic references?” 52.2% of the participants did maintain one, while 47.8% did not.
Google vs Library Search Page • “Which interface would you rather use to begin your search process?” with the possible responses “Google search page” and “Your library’s home page”. Overall, a slight majority of users preferred Google (53.3%) over the library page (46.7%); however, the difference was substantially larger for basic science researchers (Google 58.5% versus Library 41.5%, a 17.0-point gap) than for medical researchers (Google 52.2% versus Library 47.8%, a 4.4-point gap).
Google vs Library Search Page • This difference might have been larger had the question asked which style or type of interface the users preferred, as many of the comments in the survey indicated a strong preference for a single “meta” search tool where the user could enter a single search string and have all content in all resource collections searched (as opposed to manually identifying resource collections and searching them individually).
We never leave our chairs… • Almost all information seeking and use interactions occur on the researchers’ computers in their offices. • As a result, library visits have dramatically declined, and the reasons for visits to the library have changed. • Researchers read in both electronic and print form, but print (paper) is still the most preferred form.
Single Text Box + MetaSearch • Researchers prefer a single text box for initial searching that covers all resources. • This is shown most clearly by the preference for Google Scholar over library web page interfaces.
More than just text • Researchers are making increasing use of content contained in online databases like Genbank, or web pages of research labs. • For the scientists in our survey this type of access has surpassed personal communications and is close to journal articles in frequency of usage by researchers.
Transformative Changes • Transformative collaborative group communications have already taken place in the consumer marketplace, and are finding their way into scholarly communications. Examples include folksonomies supporting community tagging (Del.icio.us), comment and review systems like Amazon’s rankings, Flickr, etc. Similar changes are in their initial stages for scholarly communities, for instance Faculty of 1000 and the Connotea application for online sharing of bibliographic databases and annotations by scientists.
What might the future hold? • In the future, researchers may maintain all their scholarly knowledge online and make it accessible to others as they see fit. Having scholars’ descriptions and annotations of the digital scholarly materials, as well as the materials themselves, available on the web will allow online communities and community review systems to blossom, just as the availability of online journal articles has transformed basic information seeking of science scholars today.
Future Work • Upcoming papers from UNC survey • Correlations, information seeking behavior predictions from demographics • By department/research area comparisons • Review and reflection on major changes (with Cecy Brown, Don King, Carol Tenopir) • Textual analysis of library comments (Meredith Pulley, KT Vaughan) • ICIS tool for visualizing comments within schema • New work being proposed by other researchers using this data (if you think the data from this study might help you in your research come talk to me). • National Study….(Florida, Oklahoma, others to start soon)… • Interview Studies (labs, individuals) email@example.com