Midterm Exam Review for SIMS 202: Information Organization and Retrieval

This lecture provides an overview of the administrative details, rules, and sample questions for the midterm exam. It also includes a study guide and resources to help students prepare.

Presentation Transcript


  1. Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2004 http://www.sims.berkeley.edu/academics/courses/is202/f04/ Lecture 13: Midterm Review SIMS 202: Information Organization and Retrieval

  2. Lecture Overview • Midterm Review • The administrative details • The “Rules” for the exam • We will go through the sample questions and discuss them • Open question/answer period

  3. Lecture Overview • Midterm Review • The administrative details • The “Rules” for the exam • We will go through the sample questions and discuss them • Open question/answer period

  4. Midterm Exam Details • Date: 10/14/2004 Time: 10:30-12:00 • The exam is open-book, open-note, AND open-computer • There will be 8-10 questions on the exam • You may use your own laptop or one of the computers in the lab. The results of your work are to be printed • The exam can be hand-written if you wish; if so, be sure to bring: • Pens/Pencils • Calculator • (Paper will be provided on the exam itself, but you may want to bring scratch paper)

  5. Midterm Exam Details • The exam will cover the first half of the course; that is, it will primarily be on the topics covered concerning Information Retrieval • Questions will be worth a specific number of points, and these will be stated on the exam itself • Partial credit will be awarded for partial answers • In your answers, please balance conciseness with illustration of all of the requested information • In other words, don't write a lot of things that aren't asked for, but try to address all of what is asked for

  6. Lecture Overview • Midterm Review • The administrative details • The “Rules” for the exam • We will go through the sample questions and discuss them • Open question/answer period

  7. Rules • Do your own work • No discussion during the exam • Yes, IM counts as discussion! • Yes, email counts as discussion! • You are on your honor not to look at other students' work (you may want to review the University policies on academic dishonesty) • PROVIDE PROPER ATTRIBUTION for ideas taken from other sources (online or printed)

  8. Rules • Questions CAN and SHOULD be asked of me or the TAs • Issues/Corrections/Answers for details will be put up on the screens in 202 • We will also put these up on a web page for those in the Lab

  9. Lecture Overview • Midterm Review • The administrative details • The “Rules” for the exam • We will go through the sample questions and discuss them • Open question/answer period

  10. Study Guide • To study for the exam: • Be sure you understand the material that was covered in lectures and have read and absorbed the corresponding material in the readings • Be sure you can do activities similar to what was done in the homework assignments • We will have questions that require you to generalize from what you've learned and synthesize ideas • So be sure you have thought about the ideas covered in lecture, readings, and homework assignments

  11. Study Guide • Alison suggests that you might want to bookmark online or printed resources so that you can quickly find the topics that you need

  12. Example Questions • These are available on the Class Web site • Note that these examples are NOT the exact questions that will be on the exam but are similar to questions that have been used in the past • There will be questions that ask you to do something with supplied data • For example, given some data, design an ER diagram describing the data elements and their relationships

  13. Example Questions • The example questions on the web site are organized (approximately) in the order that the topics were presented during the course: • Information • The Search process • Documents and Statistics of Text • Queries, Ranking, and the Vector Space Model • IR Systems and Implementation • Relevance Feedback • Evaluation of IR Systems • Database Design

  14. (Approximate) Course Schedule • Organization • Phone Project Introduction • Categorization • Knowledge Representation • Lexical Relations and WordNet • Metadata Introduction • Controlled Vocabularies Introduction • Faceted Classification • Thesaurus Design and Construction • Semantic Web • Multimedia Information Organization and Retrieval • Metadata for Media • Phone Project Presentations • Retrieval • Overview • Introduction to the Search Process • Boolean Queries and Text Processing • Web Search Issues and Architecture • Statistical Properties of Text and Vector Representation • Probabilistic Ranking & Relevance Feedback • Evaluation • Interfaces for Information Retrieval • Database Design

  15. Review of Course Content • We can draw on: • 14 sets of Slides (including this one and the Math Review slides) • Handout papers • The Reader • Textbooks • Assignments • Discussion questions and issues

  16. Example Questions • Topic: Information • Example Questions: • What is the information life cycle? • What are different ways of measuring information? What are different ways of defining information?

  17. Example Questions • Topic: Document Representation and Statistical Properties of Text • Example Questions: • What is the significance of Zipf's law for weighting of terms in information retrieval? • What kinds of errors can a stemming algorithm produce?

  18. Example Questions • Topic: Queries, Ranking, and the Vector Space Model • Example Questions: • What is the difference between a search engine that uses the vector space ranking algorithm on natural language queries and a system that uses Boolean queries? • What is the role of coordination level ranking in a faceted Boolean system? • Describe the following information need in terms of a faceted Boolean query. What kinds of weighting algorithms can be applied to a faceted query like this? "I would like to find articles about the effects of the passage of the independent investigator statute by Congress on how the U.S. president chooses an attorney general." • Why do different web search engines return different sets of documents for the same query? • Redo the computations of Assignment 3 part 3 using different values for TF.
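
As a study aid for the vector space questions above, here is a minimal Python sketch of ranking by cosine similarity over tf-idf weights. It assumes raw term frequency multiplied by log(N/df) as the weighting, which may differ from the scheme used in Assignment 3; the function names and the three sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict) per document, plus the idf table.

    Assumed weighting: raw term frequency * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical mini-collection for illustration only.
docs = [
    "information retrieval systems",
    "database design and retrieval",
    "information organization",
]
vectors, idf = tfidf_vectors(docs)

# Weight each query term by its idf (query term frequency of 1).
query = {term: idf.get(term, 0.0) for term in "information retrieval".split()}

# Document indices ordered from most to least similar to the query.
ranking = sorted(range(len(docs)), key=lambda i: cosine(query, vectors[i]), reverse=True)
print(ranking)
```

Unlike a Boolean system, which only partitions documents into matching and non-matching sets, this kind of scoring orders every document that shares at least one term with the query. Changing the TF component (for example to a logarithmic or binary form) changes the weights and can change the ordering.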

  19. Example Questions • Topic: IR Systems and Implementation • Example Questions: • Draw and label a diagram that shows the major components of an IR system. • What are the special features of the Cheshire II information access system? • What is the purpose of an inverted index? How is it used to generate answers to Boolean queries? • Convert the contents of a set of documents (short texts) into an inverted index representation.
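
For practice with the inverted index questions above, the following Python sketch converts a few short texts into an inverted index and answers Boolean AND/OR queries from the postings lists. It is an illustration under simple assumptions (whitespace tokenization, lowercasing, no stemming or stop words), not the design of Cheshire II or any system discussed in class; all names and sample texts are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of ids of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def boolean_and(index, *terms):
    """Documents containing every term: intersect the postings lists."""
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []

def boolean_or(index, *terms):
    """Documents containing at least one term: union the postings lists."""
    result = set()
    for term in terms:
        result |= set(index.get(term, []))
    return sorted(result)

# Hypothetical short texts keyed by document id.
docs = {
    1: "information retrieval systems",
    2: "database systems design",
    3: "information organization",
}
index = build_inverted_index(docs)

print(index["systems"])                              # postings list for one term
print(boolean_and(index, "information", "systems"))  # documents with both terms
print(boolean_or(index, "retrieval", "design"))      # documents with either term
```

The point of the structure is that a query touches only the postings lists of its own terms, so the documents themselves never need to be scanned at query time.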

  20. Example Questions • Topic: Evaluation of IR Systems • Example Questions: • Define precision. Define recall. Define relevance. How are the three interrelated? • Under what circumstances is high recall desirable? Under what circumstances is high precision desirable? • What is the main purpose of TREC? How does it differ from earlier evaluation efforts?
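
To make the precision and recall definitions above concrete, here is a small Python sketch using the standard set-based definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The document ids and relevance judgments are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given sets of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical result list and relevance judgments.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here 2 of the 4 retrieved documents are relevant (precision 0.5), and 2 of the 5 relevant documents were found (recall 0.4). Retrieving more documents can only keep recall the same or raise it, but it usually lowers precision, which is why the two measures are reported together.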

  21. Example Questions • Topic: The Search Process • Example Questions: • Search and retrieval is part of a larger process. Name some other components of that process. • How/why doesn't the Bates berry-picking model fit with the standard information retrieval model? • How (fundamentally) does search on a directory system like Yahoo differ from search on AltaVista or Google?

  22. Example Questions • Topic: Relevance Feedback • Example Questions: • What is the main difference between relevance feedback as defined in the literature and the more current web-based notion of "more like this"? • Given a query, three documents marked as relevant, and the Rocchio formula for relevance feedback given in class, compute the vector for the new query that results. • The Koenemann & Belkin study found results in three conditions for relevance feedback: opaque, transparent, and penetrable. Consider the different ways people have implemented systems for predicting which web page to show the user next. How do the differences in these systems correspond to the different relevance feedback
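
The Rocchio computation asked for above can be practiced with a sketch like the following. It assumes the commonly cited textbook form, new query = alpha * query + beta * (centroid of relevant documents) - gamma * (centroid of non-relevant documents), with alpha = 1.0, beta = 0.75, gamma = 0.15 as illustrative defaults. The formula and constants given in class may differ, so substitute those when checking your own answers; the vectors below are invented.

```python
def rocchio(query, relevant, nonrelevant=(), alpha=1.0, beta=0.75, gamma=0.15):
    """Return a reweighted query vector using a textbook Rocchio update.

    All vectors are equal-length lists of term weights. The default
    alpha/beta/gamma values are illustrative assumptions, not the class values.
    """
    n_terms = len(query)

    def centroid(vectors):
        # Mean vector of a set of documents; zeros if the set is empty.
        if not vectors:
            return [0.0] * n_terms
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(n_terms)]

    rel_centroid = centroid(relevant)
    nonrel_centroid = centroid(nonrelevant)
    # Negative weights are clipped to zero, a common convention.
    return [
        max(0.0, alpha * q + beta * r - gamma * s)
        for q, r, s in zip(query, rel_centroid, nonrel_centroid)
    ]

# Hypothetical query and three documents marked relevant (four-term vocabulary).
query = [1.0, 0.0, 1.0, 0.0]
relevant = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
    [1.0, 2.0, 0.0, 1.0],
]

new_query = rocchio(query, relevant)
print(new_query)
```

With these inputs the relevant centroid is [1.0, 1.0, 0.0, 1.0], so the new query is [1.75, 0.75, 1.0, 0.75]: terms that occur in the relevant documents gain weight even when they were absent from the original query.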

  23. Example Questions • Topic: Database Design • Example Questions: • How is a database different from a file system? • What are the benefits of a database system? • What do we mean by data independence? • What are the benefits/drawbacks of the primary database models? • Entity-Relationship Diagrams -- what are they for, how do you create them? • How do you normalize a relational model database? • What is a join?

  24. Lecture Overview • Midterm Review • The administrative details • The “Rules” for the exam • We will go through the sample questions and discuss them • Open question/answer period

  25. Your Questions • What other topics would you like more explanation for?

  26. Be prepared, and good luck!