LogicSQL-based Enterprise Archive and Search System • Li-Yan Yuan • How to organize the information and make it accessible and useful?
Projects • How to develop an enterprise search engine based on a database management system • Challenge: implementation of the inverted index
Projects • How to implement the top-K query • Ranking formula • Inverted indexes are created with respect to frequencies
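As a rough illustration of an inverted index created with respect to frequencies, the sketch below maps each term to a postings list of (document id, term frequency) pairs, ordered by descending frequency. The function name `build_inverted_index`, the whitespace tokenizer, and the toy documents are assumptions made for this example; they are not taken from LogicSQL.

```python
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each term to a postings list of (doc_id, term_frequency),
    sorted by descending frequency (ties broken by ascending doc_id)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Hypothetical tokenizer: lowercase and split on whitespace.
        for term, freq in Counter(text.lower().split()).items():
            index[term].append((doc_id, freq))
    for postings in index.values():
        postings.sort(key=lambda p: (-p[1], p[0]))
    return dict(index)

# Toy corpus, invented for illustration only.
docs = {
    1: "database search engine",
    2: "search search index",
    3: "enterprise database database database",
}
index = build_inverted_index(docs)
print(index["search"])    # [(2, 2), (1, 1)]
print(index["database"])  # [(3, 3), (1, 1)]
```

Keeping postings sorted by frequency means the most promising documents for a term appear first, which is what a top-K evaluation strategy can exploit to stop reading early.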
Internet search • Search for relevant web pages • Good answers: • Relevant • Popular • Public domain knowledge • Search engines are critical to Internet use • Internal workings are secret • Tremendous political, economic, and cultural power
Enterprise search • Search the enterprise information systems for the right information • Enterprise information • Internal web pages • Internal documentation systems • File systems • Databases • Email servers • The Internet and enterprise domains differ fundamentally • Contents • User behavior • Economic motivations
Top-K Query • Objective • How to determine the top K objects that are most likely (approximately) related to the given query • Applications • Information retrieval • Internet and enterprise searches • Multimedia similarity search • Scheduling large-scale on-demand data broadcast • ……
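The objective above can be sketched as a small exhaustive baseline: score every candidate object, then keep only the K best. The scoring rule used here (sum of term frequencies), the function name `top_k`, and the toy index are assumptions for illustration. This is an exact baseline, not the approximate evaluation or the ranking formula used in the system.

```python
import heapq
from collections import defaultdict

def top_k(index, query_terms, k):
    """Return the k (doc_id, score) pairs with the highest summed term frequency.

    `index` maps a term to a list of (doc_id, term_frequency) postings.
    """
    scores = defaultdict(int)
    for term in query_terms:
        for doc_id, freq in index.get(term, []):
            scores[doc_id] += freq
    # heapq.nlargest tracks only k candidates at a time instead of sorting
    # the full score table; ties go to the smaller doc_id.
    return heapq.nlargest(k, scores.items(), key=lambda item: (item[1], -item[0]))

# Toy index, invented for illustration only.
toy_index = {
    "search": [(2, 2), (1, 1)],
    "database": [(3, 3), (1, 1)],
}
print(top_k(toy_index, ["search", "database"], 2))  # [(3, 3), (1, 2)]
```

A production top-K algorithm would typically avoid scoring every object, for example by reading frequency-ordered postings and stopping once the remaining objects can no longer enter the top K.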
LogicSQL Enterprise Information Archive and Search System • LogicSQL: an object-relational database management system • New concurrency control algorithm • Staged database architecture • Developed at the University of Alberta • Commercialized by Shanghai Shifang Software Co.
Enterprise Archive and Search System • To archive all the enterprise information contents • File systems • Web pages • Emails • Internal documents • Database records? • To provide a web-style search engine • To support user-specified ranking algorithms • Focus on the platform for archive and search • Easy implementation and testing of various ranking algorithms
Enterprise Archive and Search System • Extend the database functionalities • Security model • Users, roles + security handle • Security primary key • New database objects • Inverted indexes • CREATE INVERTED INDEX • DROP INVERTED INDEX • Automatic population, similar to that of an index • ORDER BY clause • User-specified aggregate functions • CREATE AGGREGATE FUNCTION • Top-K query evaluation • Specified crawlers
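To make the idea of a user-specified aggregate function for ranking more concrete, the sketch below combines per-index scores for each object with a pluggable function and orders objects by the result. It models the concept in plain Python only. It does not show LogicSQL's CREATE AGGREGATE FUNCTION syntax, and the names `weighted_sum`, `rank`, the index names "title" and "body", and all scores are hypothetical.

```python
def weighted_sum(weights):
    """Build an aggregate function that combines per-index scores by weight."""
    def aggregate(scores_by_index):
        return sum(weights.get(name, 0.0) * score
                   for name, score in scores_by_index.items())
    return aggregate

def rank(per_index_scores, aggregate):
    """Order object ids by the aggregate of their per-index scores, best first.

    Ties are broken by object id so the ordering is deterministic.
    """
    combined = {obj: aggregate(scores) for obj, scores in per_index_scores.items()}
    return sorted(combined, key=lambda obj: (-combined[obj], obj))

# Hypothetical scores from two inverted indexes, one on titles and one on bodies.
per_index_scores = {
    "doc_a": {"title": 1.0, "body": 2.0},
    "doc_b": {"title": 0.0, "body": 5.0},
    "doc_c": {"title": 2.0, "body": 0.0},
}

title_heavy = weighted_sum({"title": 3.0, "body": 1.0})
body_heavy = weighted_sum({"title": 1.0, "body": 3.0})
print(rank(per_index_scores, title_heavy))  # ['doc_c', 'doc_a', 'doc_b']
print(rank(per_index_scores, body_heavy))   # ['doc_b', 'doc_a', 'doc_c']
```

Swapping the aggregate function changes the ranking without touching the indexes, which is the kind of easy experimentation with ranking algorithms the platform aims to support.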
Enterprise Archive and Search System • User configuration • Set up crawlers • Create a list of inverted indexes • Create one aggregate function for object ranking • Extend the query languages • Implement the top-K query algorithm • Web-based query pages