Managing sensitive data and authorship in Humanities and Social Sciences

Louise Corti, Collections Development and Producer Support. ODIN conference, Cologne, October 2013.

Presentation Transcript


  1. Managing sensitive data and authorship in Humanities and Social Sciences Louise Corti Collections Development and Producer Support ODIN conference, Cologne October 2013

  2. Overview • Introducing the UK Data Service • Our data portfolio and users • Citation, impact measurement and DOIs • Challenges for social science citation

  3. The UK Data Archive • Based at the University of Essex since 1967 • 45 years of selecting, ingesting, curating and providing access to social science data • Designated as a Place of Deposit by The National Archives • Data and data support services for higher and further education for research, teaching and learning • Recently attained the highest information security standard, ISO 27001

  4. University of Essex The Archive

  5. SISTER DATA ARCHIVES • Council of European Social Science Data Archives (CESSDA) • ICPSR (USA): Inter-University Consortium for Political and Social Research • ADA: Australian Social Science Data Archive

  6. What is the UK Data Service? • Comprehensive data resource funded by the UK Economic and Social Research Council • Single virtual point of access to a wide range of secondary data for social science research (Directed from Essex) • Offer promotion, support, training and guidance

  7. What does the UK Data Service do? • Put together a collection of the most valuable data • Preserve data for the long term for future research purposes • Make the data and documentation available for reuse • Provide data management advice for data creators • Provide training and support for users of the service • Bring together owners, producers and users • Demonstrate impact through evidence of usage • Easy access through website - ukdataservice.ac.uk

  8. Who is our service for? • Data for secondary analysis, research, policy making • Teaching and learning • Academic researchers and students • Government analysts • Charities and foundations • Business consultants • Independent research centres • Think tanks

  9. Our data portfolio • Over 6,000 datasets in the collection • 230 new datasets added each year • Official agencies - mainly central government • International statistical time series • Individual academics’ research grants • Market research agencies • Public records/historical sources • Access to international data via links with other data archives worldwide

  10. UK survey series • High quality repeated cross-sectional surveys • Individual or household level data • Cover many topics including health, work, crime, social attitudes, family expenditure, living costs, housing etc. • Labour Force Survey • British Crime Survey • Health Survey for England • British Social Attitudes • Annual Population Survey ….

  11. Cross-national surveys and macro databanks • Eurobarometers • European Social Survey • European Values Survey • International Social Survey Programme • Time series data aggregated to country/region • International governmental organisations (IMF, OECD, IEA, World Bank)

  12. Longitudinal studies • British Household Panel Survey and Understanding Society • Understanding Society (2009-) • English Longitudinal Study of Ageing • Families and Children Study • Growing Up in Scotland • Longitudinal Study of Young People in England

  13. UK census data • 1971-2011 census data • Baseline for other statistics • Detailed combinations of characteristics • Small geographies • Census outputs • Aggregate data • Boundary data • Flow data • Microdata

  14. Business data • Collected through a wide range of surveys, and administrative sources: • productivity, innovation, workforce skills, earnings • international trade, foreign direct investment • research and development • business demography • industrial relations

  15. Qualitative data • Interviews, focus groups • Essays, diaries, open-ended survey questions • Observations, case notes etc. • Family Life and Work Experience before 1918, Middle and Upper Class Families in the Early 20th Century, 1870-1977 • Gender Difference, Anxiety and the Fear of Crime, 1995 • Mothers Alone: Poverty and the Fatherless Family, 1955-1966

  16. Usage of data • Operate a spectrum of access • Web download under End User Licence • Permission only via Special Licence access • ‘Approved researcher’ access via remote secure access • End User Licence includes: • Appropriate data usage • Full citation of data and informing us of re-use • Have always provided a citation format • Over 22,000 registered users • Approximately 60,000 downloads worldwide p.a. • 3,000+ user support queries

  17. Evidence of access and re-use • User access information: • Collect user information and ‘projects’ upon registration • Collate data and documentation download statistics • Users can share project information for others to see • Report data access stats on demand • Usage information: • Email all users every 6 months after registration about activity • Manually add all research output references to the data record • Reporting rate of publications is poor! • Prior to DOIs, have scanned citation literature for dataset mentions – very manual and unreliable, and data are poorly cited

  18. Impactful case studies of use • Identify and seek out case studies of re-use: research or teaching. • Very successful! • 125 case studies in our database • Can help provide impact stories for data owners/producers and users • And can inspire others! • Some are harvested by ESRC for their website • Often include ongoing work – no need to wait for publications

  19. Our persistent identifiers approach • Our data collections are not digital objects • Need to capture changes made to data • Versioning data in a commonly understood manner • Needed rule-based definition of a ‘significant’ change • Integrate processes with digital preservation activities & workflows • In 2011 we assigned DataCite DOIs for all of our collections • Mint and update DOIs with our metadata management infrastructure

  20. Recording significant change • Approx. 15% of UKDA data collections are altered within the first year after first publication • We have distinguished between major and minor changes to a data collection = high impact vs. low impact • DOI allocated to a metadata instance of a data collection • DOIs resolve to a jump page pointing to all external instances • New DOI = high impact change, with explicit logging • Provide access only to the most up-to-date version of data

  21. Major changes – high impact • New variable added • New labels/value codes added • Weighting variables reconstructed • Wrong data supplied (e.g., March not April) • Mis-coded data (e.g., Don’t know/Refused confused) • Change in format (file migration) • Significant changes in documentation • Change in access conditions
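
The rule on slides 20 and 21 (a high impact change gets a new DOI, every change is logged) can be sketched roughly as follows. This is a minimal illustration only, not the Archive's actual system: the change-type labels, the `Collection` class, the `apply_change` function and the `10.9999/EXAMPLE-SN-...` DOI pattern are all hypothetical names introduced here.

```python
from dataclasses import dataclass

# Change types paraphrased from the "Major changes - high impact" slide.
# The string labels themselves are hypothetical.
HIGH_IMPACT = frozenset({
    "new_variable",
    "new_labels_or_value_codes",
    "weights_reconstructed",
    "wrong_data_supplied",
    "miscoded_data",
    "format_change",
    "significant_documentation_change",
    "access_conditions_change",
})


@dataclass(frozen=True)
class Collection:
    study_number: int
    version: int  # incremented only when a high impact change occurs

    @property
    def doi(self) -> str:
        # Hypothetical DOI pattern, for illustration only.
        return f"10.9999/EXAMPLE-SN-{self.study_number}-{self.version}"


def apply_change(collection: Collection, change_type: str, log: list) -> Collection:
    """Log the change; return a new version (new DOI) only if it is high impact."""
    impact = "high" if change_type in HIGH_IMPACT else "low"
    log.append((collection.study_number, change_type, impact))
    if impact == "high":
        return Collection(collection.study_number, collection.version + 1)
    return collection  # low impact: the DOI stays the same
```

Keeping the set of high impact change types as explicit data mirrors the "rule-based definition" mentioned on slide 19: the rule can be reviewed and extended without touching the versioning logic.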

  22. Raising awareness in the social sciences • ESRC funding for short-term project on citation • Advocacy for best practice in citing research data • Audiences • Professional organisations • Academic publishers and journal editors • Researchers and postgraduates • Key activities • Data citation principles for social sciences • Personal communications • Events with BL DataCite, JISC and wider PI community • Outreach through Doctoral Training Centres

  23. Making

  24. Demonstrating impact with citation • Assuming better use of DOIs… • Starting to search for use of our DOIs – Google • Automate this process and compile reports; promote • Gather data citation statistics from Thomson Reuters Data Citation Index. One of the early 20 feeder repositories, but our own access limited! • Work with BL DataCite and ODIN to gain connectivity between identifiers & outputs – early adopters
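
Automating the search for DOI use, as slide 24 proposes, might start from something like the sketch below, which scans document text for DOI strings and counts how many documents cite each one. It is an assumption-laden illustration: the regular expression is a common heuristic for DOIs rather than an official grammar, and the function names and the `10.9999` prefix in the usage note are hypothetical.

```python
import re
from collections import Counter

# Heuristic pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# running up to whitespace, a quote or an angle bracket.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_dois(text, prefix=None):
    """Return DOIs found in text, lower-cased, optionally limited to one prefix."""
    found = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;:)]").lower()  # drop trailing sentence punctuation
        if prefix is None or doi.startswith(prefix.lower() + "/"):
            found.append(doi)
    return found


def count_citations(documents, prefix=None):
    """Count, per DOI, how many documents mention it at least once."""
    counts = Counter()
    for text in documents:
        counts.update(set(find_dois(text, prefix)))
    return counts
```

For example, `count_citations(texts, prefix="10.9999")` would report only DOIs under that (made-up) prefix. DOIs are case-insensitive, which is why matches are normalised to lower case before counting.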

  25. CHALLENGES FOR THE FUTURE • Citing parts (fragments) of data collections • single files • subsets of quantitative data • extracts of textual data • ESRC project Digital Futures will enable extract level citation within a web-based browsing system • Using rich highly structured XML metadata • GUIDs for everything

  26. UK Quali Bank

  27. Resolving citation objects • Will enable extract level citation • Citation object and citation format created on the fly – using GUIDs and URI • URI resolves directly to the data extract • Some more sensitive collections will be closed, so cannot resolve to data • As yet uncertain of relationship to our collection-level DOIs
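
The idea on slide 27 can be illustrated with a small sketch: each extract carries a GUID, a URI is derived from it, a citation string is built on demand, and closed collections do not resolve to data. Everything named here is a hypothetical stand-in (the `Extract` class, the `https://example.org/extract/` base URI, the citation layout); the slides do not specify the actual design.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

BASE_URI = "https://example.org/extract/"  # hypothetical resolver base


@dataclass(frozen=True)
class Extract:
    guid: uuid.UUID
    collection_title: str
    creator: str
    year: int
    label: str            # e.g. "Interview 3, para. 12"
    closed: bool = False  # True for sensitive, closed collections


def new_extract(collection_title, creator, year, label, closed=False):
    """Create a citation object on the fly with a freshly generated GUID."""
    return Extract(uuid.uuid4(), collection_title, creator, year, label, closed)


def extract_uri(extract: Extract) -> str:
    """Build the identifier URI from the extract's GUID."""
    return f"{BASE_URI}{extract.guid}"


def resolve_to_data(extract: Extract) -> Optional[str]:
    """Return the URI of the data extract, or None if the collection is closed."""
    return None if extract.closed else extract_uri(extract)


def cite(extract: Extract) -> str:
    """Format a citation string (layout is illustrative, not a standard)."""
    return (f"{extract.creator} ({extract.year}). {extract.collection_title} "
            f"[{extract.label}]. {extract_uri(extract)}")
```

In this sketch a closed extract can still be cited, since `cite` only needs the identifier, while `resolve_to_data` declines to hand back a location for the data itself.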

  28. CONTACT UK Data Service, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ • T +44 (0)1206 872001 • E corti@essex.ac.uk
