Data-gov Wiki: Towards Linking Government Data

Li Ding, Dominic DiFranzo, Alvaro Graves, James R. Michaelis, Xian Li, Deborah L. McGuinness and Jim Hendler. Tetherless World Constellation, March 22, 2010.

Presentation Transcript


  1. Data-gov Wiki: Towards Linking Government Data. Li Ding, Dominic DiFranzo, Alvaro Graves, James R. Michaelis, Xian Li, Deborah L. McGuinness and Jim Hendler. Tetherless World Constellation, March 22, 2010

  2. Outline • Background • Open Government Data Initiative • data.gov • The Data-gov Wiki • Making Government Linkable • Linking and Using Government Data • Provenance Issues

  3. Background http://www.whitehouse.gov/open http://www.data.gov/

  4. Open Government Data Initiative • Transparency • Participation • Collaboration Open Government Directive (Dec 8, 2009) • Publish Government Information Online • Improve the Quality of Government Information • Create and Institutionalize a Culture of Open Government • Create an Enabling Policy Framework for Open Government

  5. data.gov, data.gov.uk and beyond • What’s next? • More datasets • More links • More provenance £30 million to fund "Institute of Web Science"

  6. Statistics about data.gov 50 participating agencies: USDA, DOC, DOD, ED, DOE, HHS, DHS, HUD, DOI, DOJ, DOL, STATE, DOT, TREAS, VA, EPA, GSA, NASA, NSF, NRC, OPM, SBA, SSA, USAID, BBG, CFTC, CNS, EXIM, EOP, FCC, FDIC, FEC, FRB, IMLS, MSPB, NARA, NEA, NEH, NLRB, NTSB, OSHRC, ONHIR, OPIC, PBGC, RRB, SEC, SSS, TVA, CPSC, EEOC. Source: http://www.data.gov/metric accessed March 21, 2010

  7. The Data-gov Wiki http://data-gov.tw.rpi.edu/

  8. About the data-gov Wiki Mission: The data-gov project investigates the role of semantic web technologies, especially linked data, in producing, processing and utilizing government data found in data.gov. Objectives • Support linked government data publishing, applications and provenance using semantic technologies • Educate potential developers and users • Enable social collaboration on linked government data This project is run by the Tetherless World Constellation at RPI, headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Other team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, and Peter Coons.

  9. Data-gov Wiki Architecture (diagram labels: Conversion, Enhancement, LGD in RDF, Linked Data, Data Web, Usage, Provenance, Knowledge) LGD: Linked government data

  10. Data-gov Cloud (Oct 2009) (diagram) Datasets: LABOR-STAT (19xx-Present), USAspending (2008-2010), EARTHQUAKE (Present), US-COMMUNITY (2005-2007), GOV-BUDGET (1962-2014), TOXIC-RELEASE (2005-2008), DATA-GOV-CATALOG (present), RECS (2005), CASTNET (1990-Present), MED-COST (1994-2009), PUBLIC-LIB (1992-2006), STATE-LIB (2006-2007). Other labels: GeoNames, LinkedData, Community, US agency, US location, Environment, Government, RECS code, CASTNET sites, Services

  11. http://data-gov.tw.rpi.edu/wiki/demos data.gov + uk gov data + NY times + DBpedia http://data-gov.tw.rpi.edu/wiki/Demo:_Comparing_US-USAID_and_UK-DFID_Global_Foreign_Aid

  12. From Open Government Data (OGD) to Linked Government Data (LGD)

  13. Make government data linkable Raw Data: http://www.whitehouse.gov/omb/budget/fy2010/assets/receipts.csv RDF Conversion • Minimal and extensible • Web accessible Raw RDF: http://data-gov.tw.rpi.edu/raw/403/data-403.rdf <rdf:Description rdf:about="#entry262"> <dgp401:account_name>Donations, Donations for the Official Residence of the Vice President</dgp401:account_name> … <dgp401:agency_name>Executive Office of the President</dgp401:agency_name> </rdf:Description>
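The "minimal and extensible" conversion above can be pictured as one `rdf:Description` per CSV row and one property per column. The following is only a sketch under stated assumptions, not the project's converter: the namespace URI bound to `dgp401`, the header-to-property naming rule, and the sample row are all hypothetical.

```python
import csv
import io
from xml.sax.saxutils import escape

# Hypothetical namespace: the slides do not give the URI behind the dgp401 prefix.
DGP = "http://example.org/vocab/dgp401/"

def csv_to_rdfxml(csv_text: str) -> str:
    """Turn each CSV row into one rdf:Description with one property per column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    out = [
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"',
        f'         xmlns:dgp401="{DGP}">',
    ]
    for i, row in enumerate(rows, start=1):
        out.append(f'  <rdf:Description rdf:about="#entry{i}">')
        for column, value in row.items():
            # Assumed naming rule: "Agency Name" becomes agency_name.
            prop = column.strip().lower().replace(" ", "_")
            # Escape &, < and > so the literal stays well-formed XML.
            out.append(f"    <dgp401:{prop}>{escape(value or '')}</dgp401:{prop}>")
        out.append("  </rdf:Description>")
    out.append("</rdf:RDF>")
    return "\n".join(out)

# Made-up sample row, used only to exercise the function.
sample = "Agency Name,Account Name\nExecutive Office of the President,Donations & Gifts\n"
print(csv_to_rdfxml(sample))
```

Because every column is carried over as a plain literal, nothing is lost at this stage, and links can be layered on later without re-running the conversion.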

  14. Linking at Conversion Time: Reuse Property Raw RDF: http://data-gov.tw.rpi.edu/raw/402/data-402.rdf <rdf:Description rdf:about="#entry840"> <dgp401:account_name>Defense Vessel Transfer Receipt Account</dgp401:account_name> … <dgp401:agency_name>Department of Defense--Military</dgp401:agency_name> </rdf:Description> Raw RDF: http://data-gov.tw.rpi.edu/raw/403/data-403.rdf <rdf:Description rdf:about="#entry262"> <dgp401:account_name>Donations, Donations for the Official Residence of the Vice President</dgp401:account_name> … <dgp401:agency_name>Executive Office of the President</dgp401:agency_name> </rdf:Description>

  15. Linking using Semantic Wiki: enrich ontology definition Property Definition: http://data-gov.tw.rpi.edu/wiki/Property:92/title [[rdfs:subPropertyOf::Property:rdfs:label]] Property Definition: http://data-gov.tw.rpi.edu/vocab.php?property=92/title <owl:DatatypeProperty rdf:about="http://data-gov.tw.rpi.edu/vocab/p/92/title"> <rdfs:label>92/title</rdfs:label> <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#label"/> <rdfs:subPropertyOf rdf:resource="http://xmlns.com/foaf/0.1/name"/> <rdfs:subClassOf rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/> … </owl:DatatypeProperty>

  16. Linking using Semantic Wiki: connect entities using owl:sameAs (figure labels: "Correct Wikipedia Name", "X Wrong Wikipedia Name")

  17. Incremental Data Enhancement Enhance raw RDF with links: http://data-gov.tw.rpi.edu/linked/403/agency_403.rdf <rdf:Description rdf:about="http://data-gov.tw.rpi.edu/raw/403/data-403.rdf#entry262"> ….. <agency_name_link rdf:resource="http://data-gov.tw.rpi.edu/vocab/Executive_Office_of_the_President"/> </rdf:Description> Link to DBpedia: http://data-gov.tw.rpi.edu/vocab/Executive_Office_of_the_President <swivt:Subject rdf:about="http://data-gov.tw.rpi.edu/vocab/Executive_Office_of_the_President"> <rdfs:label>Executive Office of the President</rdfs:label> <rdf:type rdf:resource="http://data-gov.tw.rpi.edu/vocab/c/Agencies_of_the_United_States_government"/> <owl:sameAs rdf:resource="http://dbpedia.org/resource/Executive_Office_of_the_President"/> …… </swivt:Subject>
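The enhancement step above adds link triples next to the raw literals rather than rewriting them. A rough sketch of that idea follows, using plain tuples for triples. The rule that turns an agency name into a vocab URI (spaces become underscores) is an assumption read off the single example on the slide, and `enhance` is a hypothetical helper, not part of the project's tooling.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
VOCAB = "http://data-gov.tw.rpi.edu/vocab/"

def enhance(raw_entries, same_as):
    """Return link triples for (entry URI, agency name) pairs.

    Assumption: a vocab URI is the agency name with spaces replaced by
    underscores; the slide shows this for one agency only.
    """
    triples = []
    for entry_uri, agency_name in raw_entries:
        agency_uri = VOCAB + agency_name.strip().replace(" ", "_")
        # Link the raw entry to the agency resource.
        triples.append((entry_uri, "agency_name_link", agency_uri))
        # Add the external link only when a mapping was curated (e.g. on the wiki).
        if agency_uri in same_as:
            triples.append((agency_uri, OWL_SAME_AS, same_as[agency_uri]))
    return triples

raw = [("http://data-gov.tw.rpi.edu/raw/403/data-403.rdf#entry262",
        "Executive Office of the President")]
same_as = {VOCAB + "Executive_Office_of_the_President":
           "http://dbpedia.org/resource/Executive_Office_of_the_President"}
links = enhance(raw, same_as)
for triple in links:
    print(triple)
```

Keeping the links in a separate set of triples means the raw RDF file never has to change when a mapping is corrected.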

  18. Runtime Linking in Applications • Link datasets by common literal value • Link datasets by overlapping time • Align multiple time series • Support users to comment on time series data
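Two of the runtime techniques listed above, linking by a common literal value and aligning time series on their overlapping period, can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and every record and number below is a made-up placeholder, not data from data.gov.

```python
def normalize(text):
    """Lower-case and collapse whitespace so near-identical literals match."""
    return " ".join(text.lower().split())

def link_by_literal(left, right, key):
    """Pair records from two datasets that share the same normalized literal."""
    index = {}
    for rec in right:
        index.setdefault(normalize(rec[key]), []).append(rec)
    return [(l, r) for l in left for r in index.get(normalize(l[key]), [])]

def align_series(a, b):
    """Keep only the years present in both series (dicts of year -> value)."""
    return [(year, a[year], b[year]) for year in sorted(a.keys() & b.keys())]

# Placeholder records and values, for illustration only.
budget = [{"agency": "Department of Energy", "receipts": 1}]
spending = [{"agency": "department of  energy", "awards": 2},
            {"agency": "Department of Labor", "awards": 3}]
pairs = link_by_literal(budget, spending, "agency")
print(pairs)

series_a = {2005: 1.0, 2006: 2.0, 2007: 3.0}
series_b = {2006: 10.0, 2007: 20.0, 2008: 30.0}
aligned = align_series(series_a, series_b)
print(aligned)
```

Literal matching of this kind is only as reliable as the shared context of the two datasets, which is why the deck pairs it with curated links.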

  19. Provenance Issues

  20. Provenance Annotation • Descriptions • Relations (diagram labels: Agency, Dataset, Demo)

  21. Provenance Events (diagram labels: CSV2RDF, Archive, SemDiff, Enhance; edge labels: create, revision, derive, visualize)

  22. Results from Revision Provenance • The number of datasets published at data.gov has tripled since July 2009. • Dataset updates on data.gov are not limited to additions.
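The observation that updates are not limited to additions comes from comparing archived revisions. A simple way to picture such a comparison is a set difference over triples. The sketch below is not the project's SemDiff tool; it assumes each revision is already available as a collection of (subject, predicate, object) tuples with stable identifiers and no blank nodes, and the catalog entries are invented.

```python
def triple_diff(old, new):
    """Compare two revisions given as iterables of (s, p, o) tuples."""
    old, new = set(old), set(new)
    return {"added": new - old, "removed": old - new, "unchanged": old & new}

# Invented catalog snapshots, for illustration only.
rev1 = [("dataset/1", "title", "Budget receipts"),
        ("dataset/2", "title", "Toxic releases")]
rev2 = [("dataset/1", "title", "Budget receipts"),
        ("dataset/3", "title", "Earthquakes")]

diff = triple_diff(rev1, rev2)
print(sorted(diff["added"]))
print(sorted(diff["removed"]))
```

A non-empty `removed` set is exactly the kind of evidence that shows a catalog shrinking or changing entries, not only growing.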

  23. Conclusion

  24. Conclusion - Observations • Minimal and extensible RDF conversion is useful for generating linked government data in a timely fashion • Literal names are still useful in linking data, especially if we know the context of the data • Social semantic web technologies can help distribute high-cost tasks, e.g. mapping entity names, to the crowd • Provenance is a growing requirement for the transparency of open data applications

  25. Conclusions – Ongoing Work: build hub datasets US Census State population DATA-GOV-CATALOG (present) IRS annual Tax report PUBLIC-LIB (1992-2006) USAspending (2008-2010) CASTNET sites GOV-BUDGET (1962-2014) US agency US location Employment statistics owl:sameAs skos:altLabel Medicare cost Blah, blah… … …..

  26. Conclusions – Ongoing Work: Making Sense of LGD. AI + CI! To appear in the Web Sci 2010 conference, co-located with WWW 2010

  27. Conclusions – Ongoing Work: incremental knowledge on social semantic web • A social semantic web website can substantially promote collaboration on knowledge accumulation (ontology as well as instance linkage) • We need a tradeoff between costly high-quality conversion and ugly minimal conversion Triples: #a foaf:name “my title” ?; #a dgp92:title “my title”; #a rdfs:label “my title”; #a skos:prefLabel “my title”; dgp92:title rdfs:subPropertyOf rdfs:label
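The last triple on this slide is what lets a minimal conversion be upgraded later: once `dgp92:title` is declared a sub-property of `rdfs:label`, a reasoner can derive the `rdfs:label` statement from the raw one. Below is a small sketch of that RDFS sub-property rule over prefixed-name tuples. It is a toy forward-chainer written for illustration, not the inference engine used by the project.

```python
SUBPROP = "rdfs:subPropertyOf"

def infer_subproperties(triples):
    """Add (s, q, o) for every (s, p, o) where p is a (transitive) sub-property of q."""
    triples = set(triples)
    parents = {}
    for s, p, o in triples:
        if p == SUBPROP:
            parents.setdefault(s, set()).add(o)

    def ancestors(prop):
        # Walk subPropertyOf edges upward; the seen set guards against cycles.
        seen, stack = set(), [prop]
        while stack:
            for sup in parents.get(stack.pop(), ()):
                if sup not in seen:
                    seen.add(sup)
                    stack.append(sup)
        return seen

    inferred = set(triples)
    for s, p, o in triples:
        if p != SUBPROP:
            inferred.update((s, sup, o) for sup in ancestors(p))
    return inferred

facts = [("#a", "dgp92:title", "my title"),
         ("dgp92:title", SUBPROP, "rdfs:label")]
closure = infer_subproperties(facts)
print(("#a", "rdfs:label", "my title") in closure)
```

The raw triple stays untouched, so the cheap conversion and the richer vocabulary mapping can evolve independently.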

  28. Conclusions – Ongoing Work: provenance is everywhere Evaluate issues in exposing provenance data and improve semantic-difference computation. • provenance vocabulary • provenance awareness • provenance reasoning • provenance mining • …

  29. Ok, it is really the final conclusion • The data-gov project does not use much AI for now (mostly on the representation side), but even a little semantics goes a long way • The massive knowledge accumulated in this project is now raising a number of challenges for AI (especially on the computation side) • Semantic technologies are not far from us: undergraduate students can build a demo quickly!

  30. BTW,…. Questions? Shameless self-promotions • Link: http://data-gov.tw.rpi.edu/ • “Browsing and Finding Linked Data” by Shangguan this afternoon • See us at demo/poster session, we have more exciting demos to show you • IPAW 2010 (June 2010, Troy, NY) will be looking for late breaking news from you!
