Institute for Research on Innovation & Science (IRIS) - PowerPoint PPT Presentation

Presentation Transcript

  1. Institute for Research on Innovation & Science (IRIS) • Jason Owen-Smith, IRIS/University of Michigan • jdos@umich.edu • Iris.isr.umich.edu • @IRIS_UMETRICS

  2. Roadmap • Background on IRIS • What we currently do • Use cases for MPC/FHE/etc.

  3. The Challenge(s)

  4. In 2016, our society invested $220 in academic research for every man, woman, and child in the country. • For every $1: $0.55 from the federal government, $0.24 from universities, $0.06 each from states, industry, and non-profits, and $0.03 from all other sources • We make those investments to develop human knowledge and to improve quality of life and well-being. • How do we understand and improve those effects?
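The per-person split implied by these figures can be worked out directly. The following is a minimal sketch using only the numbers quoted on the slide; the variable and function names are illustrative.

```python
# Per-capita academic research investment in 2016, as quoted on the slide.
PER_CAPITA_TOTAL = 220  # dollars per person

# Share of each invested dollar by funding source, in cents (from the slide).
SHARE_CENTS = {
    "federal government": 55,
    "universities": 24,
    "states": 6,
    "industry": 6,
    "non-profits": 6,
    "all other sources": 3,
}


def per_capita_by_source(total_dollars, share_cents):
    """Split a per-capita total across sources using cents-per-dollar shares."""
    return {src: total_dollars * cents / 100 for src, cents in share_cents.items()}


breakdown = per_capita_by_source(PER_CAPITA_TOTAL, SHARE_CENTS)
for source, dollars in breakdown.items():
    print(f"{source}: ${dollars:.2f}")
```

The shares add up to a full dollar, so the per-source amounts add back up to the $220 total.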

  5. Competing Budgetary Priorities

  6. A Thought Experiment

  7. The Wisconsin Idea and a proposed revision • Proposed revision: “The mission of the [University of Wisconsin] system is to develop human resources to meet the state’s workforce needs, to discover and disseminate knowledge, and to develop in students heightened intellectual, cultural, and human sensitivities, scientific, professional, and technological expertise, and a sense of purpose.” • The Wisconsin Idea: “The mission of the [University of Wisconsin] system is to develop human resources to discover and disseminate knowledge, to extend knowledge and its application beyond the boundaries of its campuses and to serve and stimulate society by developing in students heightened intellectual, cultural, and human sensitivities, scientific, professional, and technological expertise, and a sense of purpose. Inherent in this broad mission are methods of instruction, research, extended training and public service designed to educate people and improve the human condition. Basic to every purpose of the system is the search for truth.”

  8. Our Response: IRIS • Data for research and reporting to understand, explain, and improve the public value of academic research

  9. Framework [diagram; labels only] • Discovery, Learning, Dissemination • Innovation, Entrepreneurship, Economic Growth • Public Health, Food Safety, Security, (More) Rational Policy, … • Propose • Knowledge, People, Skills • Fund • Science Investments • Universities • Hiring, Spending • Jobs • Stimulus

  10. Background • Founded in 2015 • Recession → STAR METRICS → UMETRICS → IRIS • Emerged from CIC/Big Ten • Transaction-level sponsored-project expenditures on employees, vendors, and sub-awards • 33 current member institutions (11 from the Big Ten) = ~30% of federal R&D spend • Members contribute to support infrastructure and receive reports and other data products • Goal is 150 members (~93% of federal R&D spend) • IRB-approved data repository: Virtual Data Enclave • ~60 current users with approved projects and signed DUAs • Disclosure proofing procedures • But basically a trust model

  11. Institute for Research on Innovation and Science (IRIS)

  12. Research and reporting to understand, explain and improve the public value of academic research • Key goal: long-term, near-comprehensive, longitudinal data about academic researchers • Key problems: no single data source; most extant data are about documents (grants, publications, patents), not people; no (public) persistent identifiers

  13. [Slide 12 text repeated] • Data sources: 1. University transaction data – Restricted

  14. [Slide 12 text repeated] • Data sources: 1. University transaction data – Restricted • 2. US Census outcome data – Restricted

  15. [Slide 12 text repeated] • Data sources: 1. University transaction data – Restricted • 2. US Census outcome data – Restricted • 3. Federal grant data – Public

  16. [Slide 12 text repeated] • Data sources: 1. University transaction data – Restricted • 2. US Census outcome data – Restricted • 3. Federal grant data – Public • 4. US Patent Office data – Public

  17. [Slide 12 text repeated] • Data sources: 1. University transaction data – Restricted • 2. US Census outcome data – Restricted • 3. Federal grant data – Public • 4. US Patent Office data – Public • 5. Publication data – Public & Restricted

  18. [Slide 12 text repeated] • Data sources: 1. University transaction data – Restricted • 2. US Census outcome data – Restricted • 3. Federal grant data – Public • 4. US Patent Office data – Public • 5. Publication data – Public & Restricted • 6. Dissertation data – Public & Restricted

  19. Submission Process • Common data structure • Upload through a secure portal • Coded quality assurance checks: immediate (e.g. value ranges, duplication, missing fields, record counts) and within 24 hours (e.g. normalization) • Data depositors generally don’t know what’s really in the data they submit
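The "immediate" checks named on this slide (value ranges, duplication, missing fields, record counts) might look something like the sketch below. This is an illustration only, not the IRIS pipeline: the field names, the assumption that amounts should be non-negative, and the use of ISO date strings are all hypothetical.

```python
from collections import Counter

# Hypothetical schema: these field names are illustrative, not the IRIS data structure.
REQUIRED_FIELDS = ("award_id", "employee_id", "period_start", "period_end", "amount")


def check_records(records, expected_count=None):
    """Return a list of (row_index, problem) tuples for a batch of submitted records."""
    problems = []

    # Record-count check against the count the depositor declared (if any).
    if expected_count is not None and len(records) != expected_count:
        problems.append((None, f"record count {len(records)} != expected {expected_count}"))

    # Count identical rows so that exact duplicates can be flagged.
    keys = [tuple(r.get(f) for f in REQUIRED_FIELDS) for r in records]
    counts = Counter(keys)

    for i, (rec, key) in enumerate(zip(records, keys)):
        missing = [f for f in REQUIRED_FIELDS if rec.get(f) in (None, "")]
        if missing:
            problems.append((i, "missing fields: " + ", ".join(missing)))
            continue  # skip the remaining checks for incomplete rows
        if counts[key] > 1:
            problems.append((i, "duplicate record"))
        if rec["amount"] < 0:  # assumed value range: amounts are non-negative
            problems.append((i, "negative amount"))
        if rec["period_start"] > rec["period_end"]:  # ISO dates compare correctly as strings
            problems.append((i, "period_start after period_end"))
    return problems


sample = [
    {"award_id": "A1", "employee_id": "E1", "period_start": "2016-01-01", "period_end": "2016-01-31", "amount": 1200.0},
    {"award_id": "A1", "employee_id": "E1", "period_start": "2016-01-01", "period_end": "2016-01-31", "amount": 1200.0},
    {"award_id": "A2", "employee_id": "", "period_start": "2016-02-01", "period_end": "2016-02-29", "amount": 50.0},
    {"award_id": "A3", "employee_id": "E3", "period_start": "2016-03-01", "period_end": "2016-02-01", "amount": -10.0},
]
issues = check_records(sample, expected_count=5)
for row, problem in issues:
    print(row, problem)
```

Returning every problem found, rather than stopping at the first, fits the point on the slide that depositors often do not know what is in their own data.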

  20. Reporting

  21. Making Data Available: IRIS

  22. Process Challenges • Universities vary dramatically in the quality of the data they produce • Normalization, identifiers that are not actually unique, garbage strings, duplicates, missing data, negative values, wonky date ranges . . . • Labor-intensive community and relationship building • Substantial work required to integrate multi-university data • No persistent individual or organizational identifiers, wildly different naming conventions • Disambiguation and unstructured linkage across universities and between integrated data and public sources (e.g. PubMed, ISI, patents, ProQuest) • Computationally intensive feature-based SVM approach in development • Formatting issues, e.g. Census works only in SAS and requires access to identified micro-data • Social science researchers are trained (and expect) to look closely at micro-data, construct unique variables, and integrate new data sources • Data access is time-consuming and may be a barrier: limitations in computing capacity, many “help desk” requests, disclosure review
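To make the "feature-based" part of the linkage problem concrete, the sketch below builds a small feature vector for a pair of person records with no shared identifier. It is only an illustration: the features, field names, and example records are hypothetical, and the classifier itself (an SVM, per the slide) is left out. In practice, vectors like these would be computed for many candidate pairs and passed to a trained model.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def pair_features(rec_a, rec_b):
    """Describe how similar two person records are (hypothetical feature set)."""
    name_a, name_b = normalize(rec_a["name"]), normalize(rec_b["name"])
    tokens_a, tokens_b = set(name_a.split()), set(name_b.split())
    return {
        # Character-level similarity of the normalized names, between 0 and 1.
        "name_similarity": SequenceMatcher(None, name_a, name_b).ratio(),
        # Number of name tokens the two records share, regardless of order.
        "shared_name_tokens": len(tokens_a & tokens_b),
        # 1 if both records come from the same institution, else 0.
        "same_institution": int(rec_a["institution"] == rec_b["institution"]),
    }


# Hypothetical records: one from payroll data, one from a publication database.
payroll = {"name": "Owen-Smith, Jason", "institution": "University of Michigan"}
author = {"name": "Jason Owen Smith", "institution": "University of Michigan"}
features = pair_features(payroll, author)
print(features)
```

Computing features for every candidate pair grows quadratically with the number of records, which is one reason this kind of approach is computationally intensive.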

  23. Use Cases for FHE • Social science research, standard statistical methods • Key challenge: data are supposed to be flexible and usable for research we cannot envision • Universities that want to benchmark but don’t want to be identified • Universities love to compare themselves to others but hate to be compared • Agencies, policy-makers, the public (?) who want to explore aggregate data • Issues of trust/oversight; generally need to see information across a portfolio of institutions • ???? “Living” data: multiple updates per year, two transfers to Census, annual documented data release through a virtual and a physical enclave.
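The benchmarking use case can be illustrated with additive secret sharing, one of the simplest MPC building blocks (it is not FHE). In the sketch below, each university splits its private figure into random shares, and only sums of shares are ever published. Everything runs in a single process for illustration, the spending figures are made up, and parties are assumed to follow the protocol honestly.

```python
import secrets

# All arithmetic is modulo this prime; the true total must stay below it.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split an integer into n additive shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate a secure sum: each party shares its value and adds the shares it receives."""
    n = len(private_values)
    # all_shares[i][j] is the share that party i sends to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Each party j publishes only the sum of the shares it received.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


# Hypothetical annual research spending, in whole dollars, for three universities.
spending = [412_000_000, 958_000_000, 127_500_000]
total = secure_sum(spending)
mean = total / len(spending)
print(f"total: {total}, mean: {mean:.0f}")
```

Each university can then compare its own figure with the group mean without seeing any peer's value. With only two participants the total would reveal the other party's number, so the approach needs at least three.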

  24. Thank You