Course Overview: An Introduction to Information Retrieval and Applications

J. H. Wang

Feb. 17, 2014


Instructor & TA

  • Instructor

    • J. H. Wang (王正豪)

    • Associate Professor, CSIE, NTUT

    • Office: R1534, Technology Building

    • E-mail: [email protected]

    • Tel: ext. 4238

    • Office Hours: 9:00 am-12:00 pm, every Tuesday and Thursday

  • TA

    • Mr. Huang (R1424, Technology Building)

      • Available time: Mon. morning or Tue. afternoon

      • E-mail: jsn900211 @ gmail.com

NTUT CSIE


Course Description

  • Course Web Page: for the latest announcements and updates to the schedule, slides, and homework

    • http://www.ntut.edu.tw/~jhwang/IR/

  • Time: 9:10 am-12:00 pm, Fri.

  • Classroom: R334, Technology Building

  • Textbook:

    • Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press, 2008.

      • Available online

      • International Student Edition, imported by Kai-Fa (開發) Publishing

  • Prerequisites:

    • Basic knowledge of data structures and algorithms, linear algebra, and probability theory

    • Programming experience is *required* for homework and projects

Target Audience

  • Seniors

  • Graduate students

  • IGPEECS (International Graduate Program in Electrical Engineering and Computer Science)

Additional References

  • References:

    • Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, Addison-Wesley, 2011.

      • This is the second edition of their 1999 book Modern Information Retrieval. (華通)

    • Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010. (全華)

    • Stefan Buettcher, Charles L.A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.

More Books on IR

  • Gerard Salton, Automatic Information Organization and Retrieval, McGraw-Hill, 1968.

  • Gerard Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, 1983.

    • Two classics, but out of print.

  • C. J. van Rijsbergen, Information Retrieval, Butterworths, 1979.

    • The classic. More than 30 years old, but still worth reading.

  • K. Sparck Jones, P. Willett, Readings in Information Retrieval, Morgan Kaufmann, 1997.

    • A collection of classical IR papers. (out of print)

  • I.H. Witten, A. Moffat, T.C. Bell, Managing Gigabytes, 2nd edition, Morgan Kaufmann, 1999.

    • The authority on index construction and compression.

Grading Policy

  • Homework assignments and programming exercises: ~40%

  • Mid-term exam: ~25%

  • Term project: ~35%

    • Including proposal, presentation, and final report

Programming Exercises and Term Project

  • About 3 programming exercises

    • Team-based with maximum number of students per team:

      • 4 for undergraduates

      • 2 for graduate students

    • You can either write your own code or reuse existing open source code

  • The term project

    • Either team-based system development (the same as programming exercises)

    • Or academic paper presentation

      • Only one person per team allowed

    • A proposal is *required* before midterm (Apr. 11, 2014)

About the Term Project

  • The score you get depends on the functions, difficulty and quality of your project

    • For system development:

      • System functions and correctness

    • For academic paper presentation

      • Quality of the paper and of your presentation

      • Major methods/experimental results *must* be presented

      • Papers from top conferences are strongly suggested

        • E.g. SIGIR, WWW, CIKM, WSDM, JCDL, ICMR, …

      • Proposals are *required* for each team, and will be counted in the score

Online Submission

  • Submission instructions

    • Programs, project proposals, and project reports in electronic files must be submitted to the TA online at:

      • Submissions website: http://140.124.183.31/net2ftp

      • Submission instructions:

        • FTP server: localhost

        • User name & password: Your student ID

What this Course is NOT about

  • This course will NOT tell you

    • The tips and tricks of using search engines, although power users might have better ideas on how to improve them

      • There are plenty of books and websites on that…

    • How to find books in libraries, although it’s somewhat related to the basic IR concepts

    • How to make money on the Web, although the largest search engine today has done so

What’s Information Retrieval?

  • Things that you have been doing all day!

    • Searching for something interesting: Web, news, e-mail, image, video, …

    • Asking for advice

  • User interests are changing all the time…

    • 2011: New Zealand Earthquake

    • 2012: Jeremy Lin

    • 2013: Meteor Russia

    • 2014: ? (next slide)



What’s Information Retrieval? (image-only slide)

In Google News (image-only slide)

In Web Pages (image-only slide)

In Wikipedia (image-only slide)

In Google Images (image-only slide)

Different Keywords: Ukraine Riots (image-only slide)

More Related Keywords (image-only slide)

What if We Search in Chinese (image-only slide)


Related Keywords

  • Ukraine

  • Ukraine riots

  • Ukraine crisis

  • Kiev

  • Protest

  • Truce

  • 2014 Hrushevskoho Street riots

Related Keywords in Chinese

  • 烏克蘭 (Ukraine)

  • 基輔 (Kiev)

  • 示威 (protest)

  • 衝突 (conflict)

  • 危機 (crisis)

  • 鎮壓 (crackdown)

  • And this can go on:

    • for other languages…

    • and other search engines…

    • and social websites…



In Google Trends (image-only slide)

And Social Search… (image-only slide)

How Do I Know What People Care About? (image-only slide)

What Are People Searching for in Taiwan on That Day? (image-only slide)


What Is Information Retrieval?

  • “Information retrieval is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968)

Goal

  • Information retrieval (IR): a research field that aims to search for information in text and multimedia documents effectively and efficiently

  • In this course, we will introduce the basic text and query models in IR, retrieval evaluation, indexing and searching, and applications for IR

A Big Picture

[Diagram: the IR process. Components: User Interface, Text Operations, Query Expansion, Indexing, Inverted Index, Retrieval, Ranking, Document Collection. Flow labels: user need, text, doc representation, logical view, query, user feedback, inverted file, retrieved docs, ranked docs.]
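
To make the flow between the components in this picture more concrete, here is a minimal Python sketch. It is an illustration only, not code from the course: the function names (`tokenize`, `build_index`, `search`), the sample documents, and the scoring rule (summed query-term frequency) are all assumptions chosen for brevity. It covers text operations, indexing into an inverted index, retrieval, and ranking, and leaves out query expansion and user feedback.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Text operations: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexing: map each term to a {doc_id: term frequency} postings dict."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, query):
    """Retrieval and ranking: score each document by summed query-term frequency."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    # Highest score first; ties are broken by doc_id to keep the order stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical document collection, loosely based on the keywords in the slides.
docs = {
    1: "Ukraine riots in Kiev",
    2: "Protest and truce in Kiev",
    3: "Jeremy Lin news",
}
index = build_index(docs)
ranked = search(index, "Kiev riots")
```

With these sample documents, document 1 matches both query terms and document 2 matches one, so document 1 is ranked first and document 3 is not retrieved at all. Real systems replace the raw frequency sum with a weighting scheme such as those covered under Modeling.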

Topics

  • Text IR

    • Indexing and searching

    • Query languages and operations

  • Retrieval evaluation

  • Modeling

    • Boolean model

    • Vector space model

    • Probabilistic model

  • Applications for IR

    • Multimedia IR

    • Web search

    • Digital libraries
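
As a rough illustration of the vector space model listed under Modeling, the following Python sketch represents each document as a sparse tf-idf vector and compares vectors with cosine similarity. The weighting formula (raw term frequency times the natural log of N/df), the helper names, and the toy documents are assumptions made for this example; the textbook discusses several weighting variants.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already tokenized documents.
docs = [
    ["ukraine", "riots", "kiev"],
    ["ukraine", "crisis", "kiev", "protest"],
    ["jeremy", "lin", "basketball"],
]
vectors = tfidf_vectors(docs)
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
```

The first two documents share terms, so their similarity is positive, while the third shares none with the first and scores zero. A Boolean model would only report whether a document matches; the vector space model yields a graded score that can be used for ranking.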

Organization of the Textbook

  • Basics in IR (focus)

    • Inverted indexes for Boolean queries (Ch. 1-5)

    • Term weighting and vector space model (Ch. 6-7)

    • Evaluation in IR (Ch. 8)

  • Advanced Topics

    • Relevance feedback (Ch. 9)

    • XML retrieval (Ch. 10)

    • Probabilistic IR (Ch. 11)

    • Language models (Ch. 12)

  • Machine learning in IR (useful)

    • Text classification (Ch. 13-15)

    • Document clustering (Ch. 16-18)

  • Web Search

    • Web crawling and indexes (Ch. 19-20)

    • Link analysis (Ch. 21)
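
Evaluation in IR (Ch. 8) is usually introduced through set-based measures such as precision and recall. The sketch below, in Python, computes both along with the balanced F1 score for one query. The function name and the document IDs are hypothetical, and the relevance judgments are assumed to be given as a plain set.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute set-based precision, recall, and F1 for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical run: 4 documents retrieved, 3 judged relevant, 2 in common.
p, r, f1 = precision_recall_f1(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
```

Here two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall 2/3). These measures ignore rank order; evaluating ranked results needs additional measures.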

Some Overlap with Other Fields

  • Text mining, Information Extraction

  • Machine Learning

  • Natural Language Processing

  • Social Network Analysis

Pointers to Other Topics

  • Cross-language IR

  • Image, video, and multimedia IR

  • Speech retrieval

  • Music retrieval

  • User interfaces

  • Parallel, distributed, and P2P IR

  • Digital libraries

  • Information science perspective

  • Logic-based approaches to IR

  • Natural language processing techniques

Tentative Schedule

  • Before midterm

    • Boolean retrieval (1 wk)

    • Indexing (2 wks)

    • Vector space model and evaluation (2 wks)

    • Relevance feedback (1 wk)

    • Probabilistic IR (2 wks)

  • After midterm

    • Text classification (1-2 wks)

    • Document clustering (1-2 wks)

    • Web search (2 wks)

    • Advanced topics: CLIR, IE, … (2 wks)

    • Term Project Presentation (3 wks)

Generic Resources

  • Wikipedia page on Information Retrieval: http://en.wikipedia.org/wiki/Information_retrieval

  • Information Retrieval Resources: http://www-csli.stanford.edu/~hinrich/information-retrieval.html

Academic Resources

  • Journals

    • ACM TOIS: Transactions on Information Systems

    • JASIST: Journal of the American Society for Information Science and Technology

    • IP&M: Information Processing and Management

    • IEEE TKDE: Transactions on Knowledge and Data Engineering

  • Conferences

    • ACM SIGIR: International Conference on Research and Development in Information Retrieval

    • WWW: World Wide Web Conference

    • ACM CIKM: Conference on Information and Knowledge Management

    • JCDL: ACM/IEEE Joint Conference on Digital Libraries

    • ACM WSDM: International Conference on Web Search and Data Mining

    • TREC: Text Retrieval Conference

Teaching in English…

  • Slides and lectures will be offered mainly in English

  • To help domestic students follow the material, important concepts will be briefly summarized in Chinese

Thanks for Your Attention!

  • Any questions or comments? Please feel free to send e-mail to [email protected] or discuss them with me at my office
