BioGeomancer: Semi-automated Georeferencing Engine
John Wieczorek, Aaron Steele, Dave Neufeld, P. Bryan Heidorn, Robert Guralnick, Reed Beaman, Chris Frazier, Paul Flemons, Nelson Rios, Greg Hill, Youjun Guo
Locality Interpretation Methods
All of these projects and institutions contributed to how BioGeomancer interprets localities (using regular expression analysis or machine learning/natural language processing):
• Tulane - GEOLocate
• Yale - BioGeomancer Classic
• U. Illinois, Urbana-Champaign
• Inxight Software, Inc.
37 Locality Types
• F – feature
• P – path
• FO – offset from a feature, sans heading
• FOH – offset from a feature at a heading
• FO+ – orthogonal offsets from a feature
• FPOH – offset at a heading from a feature along a path
• 31 other locality types known so far
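To make the type codes above concrete, here is a minimal Python sketch of how regular expressions might separate a few of them. Everything in it is an assumption for illustration: the patterns, the `classify` helper, and the sample place names are hypothetical and are not taken from BioGeomancer or any of the contributing projects. It covers only F, FO, FOH, and FO+; path-based types (P, FPOH) and the other 31 types are left out, and any non-empty string that matches no offset pattern is crudely treated as a plain feature.

```python
import re

# Hypothetical building blocks, for illustration only (not BioGeomancer's rules).
NUMBER = r"\d+(?:\.\d+)?"
UNIT = r"(?:km|mi|m|miles?|kilometers?|meters?)"
HEADING = r"(?:N|S|E|W|NE|NW|SE|SW|north|south|east|west)"
OFFSET_AT_HEADING = rf"{NUMBER}\s*{UNIT}\s+{HEADING}"

# Ordered from most to least specific; the first match wins.
PATTERNS = [
    # FO+: two orthogonal offsets, e.g. "2 km N and 3 km E of Bakersfield"
    ("FO+", re.compile(
        rf"^{OFFSET_AT_HEADING}(?:\s+and\s+|\s*,\s*){OFFSET_AT_HEADING}\s+of\s+.+$",
        re.IGNORECASE)),
    # FOH: one offset at a heading, e.g. "5 km N of Berkeley"
    ("FOH", re.compile(
        rf"^{OFFSET_AT_HEADING}\s+(?:of|from)\s+.+$", re.IGNORECASE)),
    # FO: an offset with no heading, e.g. "10 mi from Davis"
    ("FO", re.compile(
        rf"^{NUMBER}\s*{UNIT}\s+from\s+.+$", re.IGNORECASE)),
]


def classify(locality: str) -> str:
    """Return a locality type code for a single locality description."""
    text = locality.strip()
    for code, pattern in PATTERNS:
        if pattern.match(text):
            return code
    # Crude fallback: any remaining non-empty text is assumed to name a feature.
    return "F" if text else "unknown"


if __name__ == "__main__":
    for sample in ["Berkeley", "10 mi from Davis", "5 km N of Berkeley",
                   "2 km N and 3 km E of Bakersfield"]:
        print(f"{sample!r} -> {classify(sample)}")
```

Ordering the patterns from most to least specific matters here: an FO+ string also begins with something that looks like an FOH offset, so the more specific pattern has to be tried first.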
Five Most Common Locality Types*
• 51.0% - feature
• 21.4% - locality not recorded
• 17.6% - offset from feature at a heading
• 8.6% - path
• 5.8% - undefined
• types of localities BG recognizes
*Based on 500 records randomly selected from the 296k records georeferenced manually in the MaNIS Project.
Types of Data BG Uses and Georeferences
• The BG gazetteer has 11 million entries
• http://www.biogeomancer.org/metadata.html
• User-created places: 112,000
• 1.5 million localities were georeferenced, yielding 6.2 million georeferences (about 4 georeferences per locality on average)
• 500 users with logins; 6,000 projects completed
• ORNIS processed 189k localities with BG batch processing