Dr. John R. Jensen Department of Geography University of South Carolina Columbia, SC 29208

Thematic Information Extraction: Artificial Intelligence

Artificial Intelligence Artificial intelligence has been defined as “the study of how to make computers do things which, at the moment, people do better”. But how do we know when an artificially intelligent system has been created? We could use the Turing test, which suggests that if we are unable to distinguish a computer’s response to a problem of interest from a human’s response to the same problem, then the computer system is said to have intelligence. In the test, an artificial intelligence program holds a blind conversation with an interrogator for 5 minutes. The interrogator has to guess whether the conversation is with an artificial intelligence program or with a real person. The AI program passes the test if it fools the interrogator 30% of the time. Unfortunately, it is very difficult for most artificial intelligence systems to pass the Turing test. For this reason, “the field of AI as a whole has paid little attention to Turing tests,” preferring instead to forge ahead developing artificial intelligence applications that simply work. Jensen, 2005

Artificial Intelligence Artificial intelligence research was initiated in 1955 when Allen Newell and Herbert Simon at the RAND Corporation proved that computers could do more than calculate. “They demonstrated that computers were physical symbol systems whose symbols could be made to stand for anything, including features of the real world, and whose programs could be used as rules for relating these features. In this way computers could be used to simulate certain important aspects of intelligence. Thus, the information-processing model of the mind was born”. Jensen, 2005

Artificial Intelligence Unfortunately, artificial intelligence was oversold in the 1960s much like remote sensing was oversold in the 1970s. General artificial intelligence problem solving was found to be much more difficult than originally anticipated. Scientists could not get computers to solve problems that were routinely solved by human experts. Therefore, scientists instead started to investigate the development of artificial intelligence applications in “micro-worlds,” or very narrow topical areas. This led to the creation of the first useful artificial intelligence systems for select applications, e.g., games, disease diagnosis (MYCIN), spectrograph analysis (DENDRAL). NASA’s REMOTE AGENT program was the first on-board autonomous planning program to control the scheduling of operations for a spacecraft traveling a hundred million miles from Earth. Jensen, 2005

Expert Systems A knowledge-based expert system is defined as: “a system that uses human knowledge to solve problems that normally would require human intelligence”. It is the ability to “solve problems efficiently and effectively in a narrow problem area” and “to perform at the level of an expert”. Expert systems represent the expert’s domain (i.e., subject matter) knowledge base as data and rules within the computer. The rules and data can be called upon when needed to solve problems. A different problem within the domain of the knowledge base can be solved using the same program without reprogramming. Knowledge-based expert systems are used extensively in remote sensing research. Jensen, 2005

Components of a Typical Rule-based Expert System Domain (thematic) knowledge contained in an expert’s mind is extracted in the form of a knowledge base that consists of hypotheses, rules, and conditions that satisfy the rules. A user interface and an inference engine are used to encode the knowledge base rules, extract the required information from online databases, and solve problems. Hopefully, the information is of value to the user who queries the expert system. Jensen, 2005

Expert System User Interface The expert system user interface should be easy to use, interactive, and interesting. It should be intelligent and accumulate user preferences in an attempt to provide the most pleasing communication environment possible. The figure depicts a commercially available Knowledge Engineer interface that can be used to develop remote sensing–assisted expert systems. This expert system shell was built using object-oriented programming. All of the hypotheses, rules, and conditions for an entire expert system may be viewed and queried from the single user interface. Jensen, 2005

Creating the Knowledge Base Images, books, articles, manuals, and periodicals contain a tremendous amount of information. Practical experience in the field with vegetation, soils, rocks, water, atmosphere, and urban infrastructure is also important. However, a human must comprehend the information and experiences and turn them into knowledge for them to be useful. Many human beings have trouble interpreting and understanding the information in images, books, articles, manuals, and periodicals. Similarly, some do not obtain much knowledge from field work. Fortunately, some laypersons and scientists are particularly adept at processing their knowledge using three different problem-solving approaches: Jensen, 2005

Creating the Knowledge Base
• Algorithms using conventional computer programs
• Heuristic knowledge-based expert systems:
  - Human-derived rules
  - Machine-derived rules
• Artificial neural networks
Jensen, 2005

Creating the Knowledge Base Algorithmic Approaches to Problem Solving: Conventional algorithmic computer programs contain little knowledge other than the basic algorithm for solving a specific problem, the necessary boundary conditions, and data. The knowledge is usually embedded in the programming code. As new knowledge becomes available, the program has to be changed and recompiled. Jensen, 2005

Characteristics that Distinguish Knowledge-based Expert Systems from Conventional Algorithmic Problem-solving Systems

Creating the Knowledge Base Heuristic Knowledge-based Expert System Approaches to Problem Solving: Knowledge-based expert systems, on the other hand, collect many small fragments of human know-how for a specific application area (domain) and place them in a knowledge base that is used to reason through a problem, using the knowledge that is most appropriate. Characteristics that distinguish knowledge-based expert systems from conventional algorithmic systems are summarized in the table. Heuristic knowledge is defined as “involving or serving as an aid to learning, discovery, or problem solving by experimental and especially by trial-and-error methods. Heuristic computer programs often utilize exploratory problem-solving and self-educating techniques (as the evaluation of feedback) to improve performance”. Jensen, 2005

Creating the Knowledge Base The Problem with Experts Unfortunately, most experts really do not know exactly how they perform their expert work. Much of their expertise is derived from experiencing life and observing hundreds or even thousands of case studies. It is difficult for the experts to understand the intricate workings of complex systems much less be able to break them down into their constituent parts and then mimic the decision-making process of the human mind. Therefore, how does one get the knowledge embedded in the mind of an expert into formal rules and conditions necessary to create an expert system to solve relatively narrowly defined hypotheses (problems)? This is the responsibility of the knowledge engineer. Jensen, 2005

Creating the Knowledge Base The knowledge engineer interrogates the domain expert and extracts as many rules and conditions as possible that are relevant to the hypotheses (problems) being examined. Ideally, the knowledge engineer has unique capabilities that allow him or her to help build the most appropriate rules. This is not easy. The knowledge engineering process can be costly and time-consuming. Recently, it has become acceptable for a domain expert (e.g., biologist, geographer) to create his or her own knowledge-based expert system by questioning himself or herself and, ideally, accurately specifying the rules associated with the problem at hand, for example, using ERDAS Imagine’s expert system Knowledge Engineer. When this activity takes place, the expert must have a wealth of knowledge in a certain domain and the ability to formulate a hypothesis and parse the rules and conditions into understandable elements that are amenable to the “knowledge representation process.” Jensen, 2005

Knowledge Representation Process The knowledge representation process normally involves encoding information from verbal descriptions, rules of thumb, images, books, maps, charts, tables, graphs, equations, etc. Hopefully, the knowledge base contains sufficient high-quality rules to solve the problem under investigation. Rules are normally expressed in the form of one or more “IF condition THEN action” statements. The condition portion of a rule statement is usually a fact, e.g., the pixel under investigation must reflect > 45% of the incident near-infrared energy. When certain rules are applied, various operations may take place, such as adding a newly derived fact to the database or firing another rule. Rules can be implicit (slope is high) or explicit (e.g., slope > 70%). It is possible to chain together rules, e.g., IF c THEN d; IF d THEN e; therefore IF c THEN e. It is also possible to attach confidences (e.g., 80% confident) to facts and rules. Jensen, 2005
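
The chaining example above (IF c THEN d; IF d THEN e) can be sketched in a few lines of Python. This is only an illustrative sketch, not the mechanism of any particular expert system shell: the function name `forward_chain` and the representation of a rule as a `(condition, action)` pair are assumptions made for this example.

```python
def forward_chain(facts, rules):
    """Fire IF-THEN rules repeatedly until no new fact can be derived.

    facts: iterable of known facts (hypothetical string labels).
    rules: list of (condition, action) pairs, read as IF condition THEN action.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            if condition in known and action not in known:
                known.add(action)  # add the newly derived fact to the database
                changed = True     # a new fact may allow another rule to fire
    return known


# IF c THEN d; IF d THEN e
rules = [("c", "d"), ("d", "e")]
print(sorted(forward_chain({"c"}, rules)))  # ['c', 'd', 'e']
```

Starting from the single fact c, the second rule can fire only after the first has added d, which is how "IF c THEN e" emerges without ever being stated as a rule.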

Knowledge Representation Process For example, a typical rule used by the MYCIN expert system is:
IF the stain of the organism is gram-negative
AND the morphology of the organism is rod
AND the aerobicity of the organism is anaerobic
THEN there is strong suggestive evidence (0.8) that the class of the organism is Enterobacteriaceae.
Following the same format, a typical remote sensing rule, with one condition per reflectance band, might be:
IF blue reflectance is < 15%
AND green reflectance is < 25%
AND red reflectance is < 15%
AND near-infrared reflectance is > 45%
THEN there is strong suggestive evidence (0.8) that the pixel is vegetated.
Jensen, 2005
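
As a rough sketch, the remote sensing rule above could be encoded as follows. The function name `vegetation_evidence` and the choice to return 0.0 when the rule does not fire are assumptions for illustration; the thresholds and the 0.8 confidence come from the rule as stated.

```python
def vegetation_evidence(blue, green, red, nir):
    """Evaluate the example rule on one pixel.

    All arguments are percent reflectance (0-100).
    Returns the rule confidence (0.8) if every condition holds,
    otherwise 0.0 (an assumption: the rule simply contributes no evidence).
    """
    conditions = [
        blue < 15,   # blue reflectance < 15%
        green < 25,  # green reflectance < 25%
        red < 15,    # red reflectance < 15%
        nir > 45,    # near-infrared reflectance > 45%
    ]
    # Conditions within a rule are combined with Boolean AND.
    return 0.8 if all(conditions) else 0.0


print(vegetation_evidence(blue=5, green=10, red=6, nir=50))   # 0.8
print(vegetation_evidence(blue=5, green=10, red=20, nir=50))  # 0.0
```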

Knowledge Representation Process Decision Trees The best way to conceptualize an expert system is to use a decision-tree structure where rules and conditions are evaluated in order to test hypotheses. When decision trees are organized with hypotheses, rules, and conditions, each hypothesis may be thought of as the trunk of a tree, each rule a limb of a tree, and each condition a leaf. This is commonly referred to as a hierarchical decision-tree classifier (e.g., Swain and Hauska, 1977; Jensen, 1978; Kim and Landgrebe, 1991; DeFries and Chan, 2000; Stow et al., 2003; Zhang and Wang, 2003). The purpose of using a hierarchical structure for labeling objects is to gain a more comprehensive understanding of relationships among objects at different scales of observation or at different levels of detail. Jensen, 2005

Knowledge Representation Process Decision Trees A decision tree takes as input an object or situation described by a set of attributes and returns a decision. The input attributes can be discrete or continuous. The output value can also be discrete or continuous. Learning a discrete-valued function is called classification learning. Learning a continuous function is called regression. We will concentrate on Boolean classification wherein each example is classified as true (positive) or false (negative). A decision tree reaches its decision by performing a sequence of tests. Jensen, 2005
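
A minimal sketch of a Boolean decision tree that reaches its decision through a sequence of tests is given below. The `Node` class, the `decide` function, and the attribute names `nir` and `red` are hypothetical; the two thresholds are borrowed from the vegetation rule in these slides purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Node:
    test: Callable[[dict], bool]    # test applied to the attribute dictionary
    if_true: Union["Node", bool]    # subtree or final decision when the test passes
    if_false: Union["Node", bool]   # subtree or final decision when the test fails


def decide(tree, example):
    """Walk the tree, applying one test per level, until a Boolean leaf is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.test(example) else node.if_false
    return node


# Hypothetical two-test tree: near-infrared reflectance first, then red reflectance.
tree = Node(
    test=lambda px: px["nir"] > 45,
    if_true=Node(test=lambda px: px["red"] < 15, if_true=True, if_false=False),
    if_false=False,
)

print(decide(tree, {"nir": 50, "red": 10}))  # True
print(decide(tree, {"nir": 30, "red": 10}))  # False
```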

Knowledge Representation Process Hypothesis 1: the terrain (pixel) is suitable for residential development that makes maximum use of solar energy (i.e., I will be able to put solar panels on my roof). Jensen, 2005

Knowledge Representation Process Specify the expert system rules: Heuristic rules that the expert has learned over time are the heart and soul of an expert system. If the expert’s heuristic rules of thumb are indeed based on correct principles, then the expert system will most likely function properly. If the expert does not understand all the subtle nuances of the problem, has left out important variables or interaction among variables, or applied too much significance (weight) to certain variables, the expert system outcome may not be accurate. Therefore, the creation of accurate, definitive rules is extremely important. Each rule provides the specific conditions to accept the hypothesis to which it belongs. A single rule that might be associated with hypothesis 1 is: specific combinations of terrain slope, aspect, and proximity to shadows result in maximum exposure to sunlight. Jensen, 2005

Knowledge Representation Process Specify the rule conditions: The expert would then specify one or more conditions that must be met for each rule. For example, conditions for the rule stated above might include:
• slope > 0 degrees, AND
• slope < 10 degrees (i.e., the terrain should ideally have a slope of 1 to 9 degrees), AND
• aspect > 135 degrees, AND
• aspect < 220 degrees (i.e., in the Northern Hemisphere the terrain should ideally face south, between 136 and 219 degrees, to obtain maximum exposure to sunlight), AND
• the terrain is not intersected by shadows cast by neighboring terrain, trees, or other buildings (derived from a viewshed model).
Jensen, 2005
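
The conditions listed above can be sketched as a single Boolean test per pixel. The function name `suitable_for_solar` and the `in_shadow` flag (standing in for the output of a viewshed model) are assumptions made for this example.

```python
def suitable_for_solar(slope_deg, aspect_deg, in_shadow):
    """Test the Hypothesis 1 rule for one pixel.

    slope_deg:  terrain slope in degrees
    aspect_deg: terrain aspect in degrees (180 = due south)
    in_shadow:  True if the pixel is shadowed by neighboring terrain,
                trees, or buildings (assumed to come from a viewshed model)
    """
    gentle_slope = 0 < slope_deg < 10       # slope > 0 AND slope < 10
    faces_south = 135 < aspect_deg < 220    # aspect > 135 AND aspect < 220
    # Every condition must hold for the rule to be satisfied.
    return gentle_slope and faces_south and not in_shadow


print(suitable_for_solar(5, 180, in_shadow=False))   # True
print(suitable_for_solar(12, 180, in_shadow=False))  # False: too steep
```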

Knowledge Representation Process A human-derived decision-tree expert system with a rule and conditions to be investigated by an inference engine to test Hypothesis 1: the terrain (pixel) is suitable for residential development that makes maximum use of solar energy (i.e., I will be able to put solar panels on my roof). Jensen, 2005

Inference Engine The terms reasoning and inference are used to describe any process by which conclusions are reached. Thus, the hypotheses, rules, and conditions are passed to the inference engine where the expert system is implemented. One or more conditional statements within each rule are evaluated using the spatial data (e.g., 135 < aspect < 220). Multiple conditions within a rule are evaluated based on Boolean AND logic. While all of the conditions within a rule must be met to satisfy the rule, any single rule within a hypothesis can cause that hypothesis to be accepted or rejected. In some cases, rules within a hypothesis disagree on the outcome and a decision must be made using rule confidences (e.g., a confidence of 0.8 in a preferred rule and a confidence of 0.7 in another) or the order of the rules (e.g., preference given to the first) as the factor. The confidences and order associated with the rules are stipulated by the expert. Jensen, 2005
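
The two tie-breaking strategies described above (rule confidence or rule order) might be sketched as follows. The function name `resolve` and the representation of each fired rule as an `(outcome, confidence)` pair are hypothetical; the 0.8 and 0.7 confidences echo the example in the text.

```python
def resolve(fired_rules, by="confidence"):
    """Pick one outcome when rules within a hypothesis disagree.

    fired_rules: list of (outcome, confidence) pairs for the rules whose
                 conditions were all met, in the order set by the expert.
    by:          "confidence" prefers the most confident rule,
                 "order" prefers the first rule in the list.
    """
    if not fired_rules:
        return None  # no rule was satisfied, so there is nothing to decide
    if by == "order":
        return fired_rules[0][0]
    return max(fired_rules, key=lambda rule: rule[1])[0]


fired = [("reject", 0.7), ("accept", 0.8)]
print(resolve(fired))              # accept (confidence 0.8 beats 0.7)
print(resolve(fired, by="order"))  # reject (first rule wins)
```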

Inference Engine The inference engine interprets the rules in the knowledge base to draw conclusions. The inference engine may use backward- or forward-chaining strategies or both. Both backward and forward inference processes consist of a chain of steps that can be traced by the expert system. This enables expert systems to explain their reasoning processes, which is an important and positive characteristic of expert systems. You would expect a doctor to explain how he or she came to a certain diagnosis regarding your health. An expert system can provide explicit information about how a particular conclusion (diagnosis) was reached. Jensen, 2005
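
A small backward-chaining sketch shows how the chain of steps can be recorded and replayed as an explanation. Everything here is illustrative: the function name `backward_chain`, the fact labels, and the rule format `(conditions, conclusion)` are assumptions, and the sketch assumes the rules contain no cycles.

```python
def backward_chain(goal, rules, facts):
    """Try to prove goal by working backward from it.

    rules: list of (conditions, conclusion) pairs (assumed to be acyclic).
    facts: set of facts known to be true.
    Returns (proved, steps), where steps is the traceable chain of reasoning.
    """
    if goal in facts:
        return True, [f"{goal} is a known fact"]
    for conditions, conclusion in rules:
        if conclusion != goal:
            continue
        steps = []
        for condition in conditions:
            proved, sub_steps = backward_chain(condition, rules, facts)
            if not proved:
                break  # this rule cannot establish the goal; try the next one
            steps.extend(sub_steps)
        else:
            steps.append(f"{goal} because " + " AND ".join(conditions))
            return True, steps
    return False, []


rules = [
    (["south_facing", "gentle_slope"], "good_exposure"),
    (["good_exposure", "no_shadow"], "suitable"),
]
facts = {"south_facing", "gentle_slope", "no_shadow"}

proved, steps = backward_chain("suitable", rules, facts)
for step in steps:
    print(step)
```

The printed steps play the role of the explanation a user would expect from a doctor: each conclusion is listed together with the conditions that supported it.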

Inference Engine An expert system shell provides a customizable inference engine. Expert system shells come equipped with an inference mechanism (backward chaining, forward chaining, or both) and require knowledge to be entered according to a specified format. Expert system shells qualify as languages, although with a narrower range of application than most programming languages. Typical artificial intelligence programming languages include LISP, developed in the 1950s, PROLOG, developed in the 1970s, and now object-oriented languages such as C++. Jensen, 2005

Expert Systems Applied to Remote Sensor Data The use of expert systems in remote sensing research will be demonstrated using two different methodologies used to create the rules and conditions in the knowledge base. The first expert system classification is based on the use of formal rules developed by a human expert. The second example involves expert system rules derived automatically by an inductive machine-learning algorithm based on training data that are input by humans into the system. Both methods are used to identify white fir forest stands on Maple Mountain in Utah County, Utah, using Landsat Enhanced Thematic Mapper Plus (ETM+) imagery and topographic variables extracted from a digital elevation model of the area. Jensen, 2005

Expert System Applied to Remote Sensor Data A hypothesis (class), variables, and conditions necessary to extract white fir (Abies concolor) forest cover information from Maple Mountain, Utah, using remote sensing and digital elevation model data. The Boolean logic with which these variables and conditions are organized within a chain of inference may be controlled by the use of rules and sub-hypotheses. Jensen, 2005

Classification of White Fir on Maple Mountain, Utah County using Hierarchical Decision Tree Logic 1 × 1 m NAPP aerial photography (acquired 17 Aug 1994) is draped over a 10 × 10 m USGS DEM. Jensen, 2005

Figure panels: 30 × 30 USGS DEM; Shaded Relief; Contours; Slope; Aspect; ETM NDVI; ETM Panchromatic; ETM RGB = 5,4,2; ETM RGB = 4,3,2. Jensen, 2005

Figure panels: Terrestrial Photograph; ETM Panchromatic; ETM RGB = 5,4,2; Expert’s Classification of White Fir; ETM RGB = 4,3,2. Jensen, 2005

Hierarchical Decision Tree Classifier Figure panels: ETM Panchromatic; Expert’s Model; Predicted White Fir. Jensen, 2005

Rules and Conditions Based on Machine Learning The heart of an expert system is its knowledge base. The usual method of acquiring knowledge in a computer-usable format to build a knowledge base involves human domain experts and knowledge engineers, as previously discussed. The human domain expert explicitly expresses his or her knowledge about a subject in a language that can be understood by the knowledge engineer. The knowledge engineer translates the domain knowledge into a computer-usable format and stores it in the knowledge base. Jensen, 2005

Rules and Conditions Based on Machine Learning This process presents a well-known problem in creating expert systems that is often referred to as the knowledge acquisition bottleneck. The reasons are:
• the process requires the engagement of the domain expert and/or knowledge engineer over a long period of time, and
• although experts are capable of using their knowledge for decision making, they are often incapable of articulating their knowledge explicitly in a format that is sufficiently systematic, correct, and complete to be used in a computer application.
Jensen, 2005

Rules and Conditions Based on Machine Learning To solve such problems, much effort has been exerted in the artificial intelligence community to automate the building of expert system knowledge bases. Machine learning is defined as “the science of computer modeling of learning processes”. It enables a computer to acquire knowledge from existing data or theories using certain inference strategies such as induction or deduction. We will focus only on inductive learning and its application in building knowledge bases for image analysis expert systems. Jensen, 2005

Rules and Conditions Based on Machine Learning A human being has the ability to make accurate generalizations from a few scattered facts provided by a teacher or the environment using inductive inferences. This is called inductive learning (Huang and Jensen, 1997). In machine learning, the process of inductive learning can be viewed as a heuristic search through a space of symbolic descriptions for plausible general descriptions, or concepts, that explain the input training data and are useful for predicting new data. Inductive learning can be formulated using the following symbolic formulas. Jensen, 2005
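
The idea of searching a space of candidate descriptions for one that explains the training data can be illustrated with a deliberately tiny learner. This is not the C5.0 algorithm used later in these slides; it is a hypothetical one-attribute threshold search, and the function name `induce_threshold_rule` and the training values are invented for the example.

```python
def induce_threshold_rule(samples):
    """Induce a rule of the form 'IF value > threshold THEN True'.

    samples: list of (value, label) training pairs with Boolean labels.
    Candidate thresholds are the midpoints between neighboring distinct values.
    Returns (threshold, number_of_samples_explained) for the best candidate,
    or None when fewer than two distinct values are available.
    """
    values = sorted({value for value, _ in samples})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        # Count the training samples this candidate rule labels correctly.
        explained = sum((value > threshold) == label for value, label in samples)
        if best is None or explained > best[1]:
            best = (threshold, explained)
    return best


# Hypothetical training data: (near-infrared reflectance in percent, is vegetated)
training = [(20, False), (30, False), (40, False), (50, True), (60, True)]
print(induce_threshold_rule(training))  # (45.0, 5)
```

The induced rule, IF near-infrared reflectance > 45 THEN vegetated, generalizes from five examples to values the learner has never seen, which is the essence of inductive inference.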

Hierarchical Decision Tree Classifier Based on Inductive Machine Learning Production Rules Figure panels: ETM Panchromatic; C5.0 Model; Predicted White Fir. Jensen, 2005

Machine Learning-derived Classification Map Jensen, 2005

Rules and Conditions Based on Machine Learning
• The following topics are covered in Geography 751: Seminar in Remote Sensing:
  - Machine Learning
  - Neural Networks
Jensen, 2005
