Parallel and Distributed IR. Papers on Parallel and Distributed IR. Agenda: Introduction; Paper A: Inverted file partitioning schemes in Multiple Disk Systems, by Byeong-Soo Jeong and Edward Omiecinski.
Parallel and Distributed IR
Inverted file partitioning schemes in Multiple Disk Systems
By Byeong-Soo Jeong and Edward Omiecinski 
Performance comparison under different parameters
Under the skewed query model, partition by document-ID performs better because the I/O load is more evenly balanced across disks. Under the uniform query model, partition by term-ID performs better.
Under the uniform query model, partition by term-ID is twice as fast for long queries and 5-10 times faster for short queries.
Adding disks improves the performance of the partition by document-ID scheme at a higher rate, since its I/O load is more evenly distributed.
Conclusion: Partition by term-ID performs better under the uniform query model, but its response time fluctuates widely depending on the terms in the query. Partition by document-ID shows little variation in response time in almost all cases.
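The two partitioning schemes compared above can be illustrated with a minimal Python sketch. It is not the authors' simulator: the modulo placement of documents and terms onto disks, the toy index, and the function names are all assumptions made for illustration. It only counts how many postings each disk would have to read for one query.

```python
from collections import defaultdict


def partition_by_doc_id(index, num_disks):
    """Each disk holds, for every term, only the postings of its own documents."""
    disks = [defaultdict(list) for _ in range(num_disks)]
    for term, postings in index.items():
        for doc_id in postings:
            # Assumed placement rule: document d goes to disk d mod num_disks.
            disks[doc_id % num_disks][term].append(doc_id)
    return disks


def partition_by_term_id(index, num_disks):
    """Each disk holds the complete posting lists of a subset of the terms."""
    disks = [defaultdict(list) for _ in range(num_disks)]
    for term_id, term in enumerate(sorted(index)):
        # Assumed placement rule: term t goes to disk t mod num_disks.
        disks[term_id % num_disks][term] = list(index[term])
    return disks


def disk_loads(disks, query_terms):
    """Number of postings each disk must read to answer the query."""
    return [sum(len(d.get(t, [])) for t in query_terms) for d in disks]


# Toy inverted index (made-up data): term -> document IDs.
index = {
    "parallel": list(range(8)),
    "disk": [0, 1],
    "query": [2, 5],
    "index": [3],
}

query = ["parallel", "disk"]
doc_loads = disk_loads(partition_by_doc_id(index, 2), query)
term_loads = disk_loads(partition_by_term_id(index, 2), query)
print("by document-ID:", doc_loads)
print("by term-ID:    ", term_loads)
```

In this toy case both query terms happen to land on the same disk under term-ID partitioning, so one disk reads every posting while the other is idle. Under document-ID partitioning the same total work is split evenly, which is the load-balancing effect the slides attribute to that scheme.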
Methodologies for Distributed Information Retrieval
By Alistair Moffat, Justin Zobel, Owen de Kretser, and Tim Shimmin
[We will see why it's not efficient in the coming slides]
The only global information maintained by the receptionist is a list of librarians.
The global information stored by the receptionist is the vocabularies of the sub-collections.
The receptionist has full access to the indexes of the sub-collections.
Global Information: List of librarians
Global Information: Vocabularies of all sub-collections.
Receptionist has full access to indexes of sub-collections.
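The three levels of global information listed above can be sketched as follows. This is a hypothetical illustration, not code from the paper: the class names, the scheme labels "list", "vocabulary", and "index", and the routing rule in `librarians_for` are all assumptions. In particular, treating the levels as cumulative and using the vocabularies to skip librarians is one plausible reading, not something the slides state.

```python
from dataclasses import dataclass


@dataclass
class Librarian:
    name: str
    index: dict  # term -> document IDs in this librarian's sub-collection

    def vocabulary(self):
        return set(self.index)


@dataclass
class Receptionist:
    librarians: list
    scheme: str  # hypothetical labels: "list", "vocabulary", or "index"

    def global_information(self):
        """What the receptionist holds centrally under each scheme."""
        info = {"librarians": [lib.name for lib in self.librarians]}
        if self.scheme in ("vocabulary", "index"):
            info["vocabularies"] = {
                lib.name: lib.vocabulary() for lib in self.librarians
            }
        if self.scheme == "index":
            info["indexes"] = {lib.name: lib.index for lib in self.librarians}
        return info

    def librarians_for(self, query_terms):
        """Assumed routing rule: with only a list of librarians, every
        librarian must be asked; with vocabularies or indexes, librarians
        that contain none of the query terms can be skipped."""
        if self.scheme == "list":
            return [lib.name for lib in self.librarians]
        wanted = set(query_terms)
        return [lib.name for lib in self.librarians if lib.vocabulary() & wanted]


# Made-up sub-collections for illustration.
lib_a = Librarian("A", {"disk": [1, 2], "query": [3]})
lib_b = Librarian("B", {"index": [4]})

list_only = Receptionist([lib_a, lib_b], "list")
with_vocab = Receptionist([lib_a, lib_b], "vocabulary")
with_index = Receptionist([lib_a, lib_b], "index")

print(list_only.librarians_for(["disk"]))
print(with_vocab.librarians_for(["disk"]))
```

With only the list of librarians, a query for "disk" is sent to both A and B. With the vocabularies available, the receptionist in this sketch contacts A alone.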