Complexity and Information Retrieval: Exploring Theories and Applications

INFORMATION RETRIEVAL
Dr. N. Kalpana, Assistant Professor, Department of Computer Science and Engineering
PSNA COLLEGE OF INFORMATION AND TECHNOLOGY

Last time
• What is complexity
• Complex systems
• Measuring complexity
• Computational complexity – Big O
• Scaling
• Why do we care
  • Scaling is often what determines whether information technology works
  • Scaling basically means systems can handle a great deal of inputs, users, and growth
• Methodology – scientific method
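
To make the Big O and scaling points concrete, here is a minimal sketch that counts how many comparisons two search strategies need as the input grows. The function names and the input sizes are illustrative assumptions, not material from the original slides.

```python
def linear_search_steps(items, target):
    """Count the comparisons a left-to-right scan makes: O(n) in the worst case."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(items, target):
    """Count the probes binary search makes on a sorted list: O(log n)."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


# Illustrative sizes only: search for the last element, the worst case for the scan.
for n in (1_000, 1_000_000):
    data = list(range(n))
    target = n - 1
    print(n, linear_search_steps(data, target), binary_search_steps(data, target))
```

A thousandfold increase in input multiplies the work of the linear scan by a thousand, while the binary search needs only a handful of extra probes. That difference is what "scaling" refers to here.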

The Scientific Method
• Observe an event(s).
• Develop a model (or hypothesis) which makes a prediction to explain the event.
• Test the prediction with data.
• Observe the result.
• Revise the hypothesis.
• Repeat as needed.
• A successful hypothesis becomes a scientific theory.

Today
• What is Information Retrieval
• Definitions
• Theories/hypotheses
• Why do we care
• Impact on information science
• Great resource

Tomorrow
Topics used in machine learning
• Information retrieval and search
• Text
• Encryption
• Social networks
• Probabilistic reasoning
• Digital libraries
• Others?

Theories in Information Sciences
• Enumerate some of these theories in this course.
• Issues:
  • Unified theory?
  • Domain of applicability
  • Conflicts
• Theories here are mostly algorithmic
• Quality of theories
  • Occam’s razor
  • Subsumption of other theories
• If IR is really true, it would be a unified theory of most (all?) of information science

Commercial Retrieval

Retrieval in Real Life
A young science (≈ 50 years old)
• Exciting and dynamic field, lots of uncharted territory left
• Impressive success stories
• “Intelligent” in specialized domains
• Many application areas
  • Face detection
  • Formal verification

Why the interest in IR?
• Search engines
• Science
• Medicine/Diagnosis
• Labor
• Appliances
• What else?

One Working Definition of IR
• IR is the study of how to make computers do things that people are better at or would be better at if:
  • They could extend what they do to a World Wide Web-sized amount of data and
  • Not make mistakes.

IR Purposes
"AI can have two purposes. One is to use the power of computers to augment human thinking, just as we use motors to augment human or horse power. Robotics and expert systems are major branches of that. The other is to use a computer's artificial intelligence to understand how humans think. In a humanoid way. If you test your programs not merely by what they can accomplish, but how they accomplish it, then you're really doing cognitive science; you're using AI to understand the human mind."

Cognitive Science
• Cognitive science brings together theories and experimental evidence to model internal activities of the brain
• What level of abstraction? “Knowledge” or “Circuits”?
• How to validate models?
  • Predicting and testing behavior of human subjects (top-down)
  • Direct identification from neurological data (bottom-up)
  • Building computer/machine simulated models and reproducing results (simulation)

Strong IR
"I find it useful to distinguish what I will call "strong" AI from "weak" or "cautious" AI. According to weak AI, the principal value of the computer in the study of the mind is that it gives us a very powerful tool. For example, it enables us to formulate and test hypotheses in a more rigorous and precise fashion. But according to strong AI, the computer is not merely a tool in the study of the mind; rather, the appropriately programmed computer really is a mind, in the sense that computers given the right programs can be literally said to understand and have other cognitive states."

Weak and Strong IR Claims
• Weak AI:
  • Machines can be made to act as if they were intelligent.
• Strong AI:
  • Machines that act intelligently have real, conscious minds.

Eliza
• Named after Eliza Doolittle of Pygmalion fame.
• Developed in 1964-1966 by Joseph Weizenbaum at MIT
• Models (parodies) the rôle of a Rogerian psychotherapist engaged in an initial interview with a patient. Much of the technique of the Rogerian psychotherapist involves drawing the patient out by reflecting the patient’s statements back at him.
http://www-ai.ijs.si/eliza/eliza.html

Eliza
Men are all alike
IN WHAT WAY
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I am depressed much of the time.
I AM SORRY TO HEAR YOU ARE DEPRESSED
It’s true. I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
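
The reflection technique seen in the dialogue above can be sketched with a few pattern rules. This is a minimal illustration under stated assumptions, not Weizenbaum's original script: the rule list, the pronoun table, and the fallback reply `PLEASE GO ON` are hypothetical choices made for this example.

```python
import re

# Hypothetical pronoun swaps used to turn the speaker's words back on them.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "mine": "yours"}

# Hypothetical (pattern, reply template) rules, tried in order.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "I AM SORRY TO HEAR YOU ARE {0}"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "YOUR {0}"),
    (re.compile(r"\balike\b", re.IGNORECASE), "IN WHAT WAY"),
    (re.compile(r"\balways\b", re.IGNORECASE), "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())


def respond(statement):
    """Return the reply of the first rule whose pattern matches the statement."""
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            captured = match.group(1) if match.groups() else ""
            return template.format(reflect(captured).upper())
    return "PLEASE GO ON"  # assumed fallback when no rule applies


print(respond("Well, my boyfriend made me come here."))
```

The sketch has no model of meaning at all. It only matches surface patterns and echoes fragments back, which is why the slides call the program a parody of a therapist.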

What is a chatterbot?
• A chatter robot, chatterbot, chatbot, or chat bot is a computer program designed to simulate an intelligent conversation with one or more human users via auditory or textual methods, primarily for engaging in small talk.
• The primary aim of such simulation has been to fool the user into thinking that the program's output has been produced by a human (the Turing test).
• Programs playing this role are sometimes referred to as Artificial Conversational Entities, talk bots or chatterboxes.
• Uses:
  • Chatterbots are often integrated into dialog systems for various practical purposes such as online help, personalised service, or information acquisition.
  • Spam in chatrooms
• Some chatterbots use sophisticated natural language processing systems, but many simply scan for keywords within the input and pull a reply with the most matching keywords, or the most similar wording pattern, from a textual database.
• Collections: http://www.simonlaven.com/
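
The keyword-scanning approach described above can be sketched in a few lines. The reply database, its keyword sets, and the fallback text are all hypothetical; a real chatterbot would draw on a much larger textual database.

```python
import re

# Hypothetical reply database: each entry pairs a keyword set with a canned reply.
REPLY_DATABASE = [
    ({"password", "reset", "login"}, "You can reset your password from the login page."),
    ({"order", "shipping", "delivery"}, "Orders usually ship within a few days."),
    ({"hello", "hi", "hey"}, "Hello! How can I help?"),
]
FALLBACK_REPLY = "Sorry, I did not understand that."  # assumed default reply


def tokenize(text):
    """Lowercase the input and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def best_reply(user_input):
    """Pick the reply whose keyword set overlaps most with the input words."""
    words = tokenize(user_input)
    best_score, best_text = 0, FALLBACK_REPLY
    for keywords, reply in REPLY_DATABASE:
        score = len(words & keywords)
        if score > best_score:
            best_score, best_text = score, reply
    return best_text


print(best_reply("I forgot my password and cannot login"))
```

Ties go to the earlier database entry, and an input with no matching keyword falls through to the default reply. Both are design choices of this sketch rather than properties of any particular chatterbot.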

Types of Chatterbots
• Classic Chatterbots
• Complex Chatterbots
• Friendly Chatterbots
• Teachable Bots
• AIML Bots
• JFred Bots
• NativeMinds Bots
• Non-English Bots
• Alternative Bots
http://www.simonlaven.com/

Intelligence
• Turing Test: A human communicates with a computer via a teletype. If the human cannot tell whether he or she is talking to a computer or another human, the computer passes.
• Natural language processing
• Knowledge representation
• Automated reasoning
• Machine learning
• Add vision and robotics to get the total Turing test.

Branches of IR
• Logical AI
• Search
• Natural language processing
• Computer vision
• Pattern recognition
• Knowledge representation
• Inference: from some facts, others can be inferred.
• Reasoning
• Learning
• Planning: to generate a strategy for achieving some goal.
• Epistemology: the study of the kinds of knowledge that are required for solving problems in the world.
• Ontology: the study of the kinds of things that exist.
• Agents
• Games
• Artificial life / worlds?
• Emotions?
• Knowledge Management?
• Socialization/communication?
• …
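
The inference entry in the list above ("from some facts, others can be inferred") can be illustrated with forward chaining, one common way to derive new facts from rules. The facts and rules below are hypothetical examples, and this sketch is only one of many possible inference procedures.

```python
def forward_chain(facts, rules):
    """Repeatedly apply rules until no new fact can be derived.

    Each rule is a (premises, conclusion) pair: when every premise is
    already known, the conclusion is added to the known facts.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical knowledge base, for illustration only.
facts = {"socrates_is_human"}
rules = [
    (("socrates_is_human",), "socrates_is_mortal"),
    (("socrates_is_mortal", "mortals_need_food"), "socrates_needs_food"),
]

print(sorted(forward_chain(facts, rules)))
```

The second rule does not fire here because one of its premises, `mortals_need_food`, is never established. A fact is inferred only when everything it depends on is already known.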

Approaches to AI
• Searching
• Learning
• From Natural to Artificial Systems
• Knowledge Representation and Reasoning
• Expert Systems and Planning
• Communication, Perception, Action

Watson “The goal is to have computers start to interact in natural human terms across a range of applications and processes, understanding the questions that humans ask and providing answers that humans can understand and justify” - IBM

Watson
• IBM’s IR system
• Capable of answering questions in natural language
• Competed against champions on Jeopardy and won

THANK YOU


By M.S.Thanabal, Associate Professor, Department of Computer Science and Engineering, PSNACET.