The role of the knowledge engineer. Knowledge acquisition.
Software development: conventional systems and KBS • You are probably familiar with a standard model of the software development life cycle. It is likely to be something like this: Feasibility study → Analysis → Requirements definition → Design → Implementation → Testing → Maintenance & review
Software development: conventional systems and KBS • Knowledge-based systems require special approaches to systems analysis, especially to the collection of the data (or rather knowledge) on which the system is based. • We will discuss the ways in which this model needs to be modified to take account of these special features in lecture 8.
Knowledge Engineering • The term "knowledge engineering" is often used to mean the process of • designing • building • installing an expert system or other knowledge-based system. In other words, the whole process of making a KBS, from beginning to end.
Knowledge Engineering • Some authors use the term to mean just the phase in which the knowledge base is built.
Building the knowledge base • Five processes can be identified: • 1. Knowledge acquisition • 2. Knowledge analysis & representation • 3. Knowledge validation • 4. Inference design • 5. Explanation and justification • These are not stages that have to follow each other - some of them will run concurrently.
Knowledge Acquisition • Knowledge acquisition is: The process of gathering the knowledge to stock the expert system's knowledge base.
Knowledge Acquisition • This has proved to be the most difficult component of the knowledge engineering process. It has become known as the 'knowledge acquisition bottleneck', and expert system projects are more likely to fail at this stage than at any other. • This is the principal reason why expert systems have not become more widespread.
Knowledge Acquisition • Sources of knowledge: • Documents: textbooks, journal articles, technical reports, records containing case histories, etc. • Documents alone will almost never be sufficient to provide the knowledge base for a real-world expert system. • The range of problems which a textbook examines and solves is always smaller than the range of problems that a human expert is master of.
Knowledge Acquisition • Sources of knowledge: • Human experts
Knowledge Elicitation • The most important part of knowledge acquisition is knowledge elicitation - obtaining knowledge from a human expert (or human experts) for use in an expert system. • Knowledge elicitation is difficult. Hence the knowledge acquisition bottleneck mentioned above. • It is necessary to find out what the expert(s) know, and how they use their knowledge.
Knowledge Elicitation • Expert knowledge includes: • domain-related facts & principles; • problem-solving strategies; • meta-knowledge - for instance, knowledge about when to use a particular piece of knowledge; • explanations and justifications.
Knowledge Elicitation • The knowledge elicitation/analysis task involves • finding at least one expert in the domain who: • is willing to provide his/her knowledge; • has the time to provide his/her knowledge; • is able to provide his/her knowledge. - Any or all of these are liable to prove difficult.
Knowledge Elicitation • The knowledge elicitation/analysis task involves • repeated interviews with the expert(s), probably combined with other, non-interview, techniques.
Knowledge Elicitation - the compiled knowledge problem • One major obstacle to knowledge elicitation: experts cannot easily describe all they know about their subject. • They do not necessarily have much insight into the methods they use to solve problems. • Their knowledge is "compiled" (like a compiled computer program - fast & efficient, but unreadable).
Knowledge Elicitation - interview techniques • Some of the interview techniques used in knowledge elicitation: • Unstructured interview. A general discussion of the domain, designed to provide a list of topics and concepts. • Structured interview. Concerned with a particular concept within the domain - a particular problem-solving skill or small group of skills.
Knowledge Elicitation - interview techniques • Interview techniques: • Problem-solving interview. The domain expert (DE) is provided with a real-life problem, of a kind that they deal with during their working life, and asked to solve it. As they do so, they are required to describe each step, and their reasons for doing what they do. The transcript of their verbal account is called a protocol.
Knowledge Elicitation - interview techniques • Interview techniques: • Think-aloud interview. As above, but the DE merely imagines that they are solving the problem presented to them, rather than actually doing it. Once again, they describe the steps involved in solving the problem.
Knowledge Elicitation - interview techniques • Interview techniques: • Critical incident analysis. The DE is asked to provide details of cases which were particularly difficult, or of special interest for some other reason. He/she describes how they were solved, and the lessons that were learnt.
Knowledge Elicitation - interview techniques • Interview techniques: • Dialogue. The DE interacts with a client, as they would during their normal work routine.
Knowledge Elicitation - interview techniques • Interview techniques: • Review. The knowledge engineer (KE) and the DE examine the record of an interview session together.
Knowledge Elicitation - non-interview techniques • Some of the non-interview techniques used in knowledge elicitation: • Sample lecture preparation. The DE prepares a lecture, and the KE analyses its content.
Knowledge Elicitation - non-interview techniques • non-interview techniques: • Concept sorting ("card sort"). The DE is presented with a series of cards, with the names of domain concepts written on them, spread out on a table top, and asked to arrange them into clusters, in such a way that the cards in each cluster have something important in common. Then the DE is asked to name the principles that he/she has used to form these clusters. This process can be repeated to produce a hierarchy of concepts.
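As a rough illustration of how the outcome of repeated card sorts might be recorded, the sketch below stores the clusters as a nested dictionary and lists each concept together with its path through the hierarchy. The car-fault domain, the concept and cluster names, and the function name `concept_paths` are all invented for this example; they are not part of any standard tool.

```python
# Hypothetical result of two rounds of card sorting in an invented
# car-fault domain: the DE's cluster names are keys, the cards are leaves.
card_sort = {
    "electrical faults": {
        "starting problems": ["flat battery", "faulty starter motor"],
        "lighting problems": ["blown fuse", "broken bulb"],
    },
    "fuel faults": {
        "supply problems": ["empty tank", "blocked fuel line"],
    },
}


def concept_paths(tree, prefix=()):
    """Yield (concept, tuple_of_cluster_names) for every card in the hierarchy."""
    if isinstance(tree, dict):
        # An inner node: descend into each named cluster.
        for cluster, subtree in tree.items():
            yield from concept_paths(subtree, prefix + (cluster,))
    else:
        # A leaf list: every card inherits the path of clusters above it.
        for concept in tree:
            yield concept, prefix


paths = dict(concept_paths(card_sort))
for concept, path in paths.items():
    print(f"{concept}: {' > '.join(path)}")
```

A structure of this kind is one possible intermediate representation: it is simple enough to show back to the DE for correction.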
Knowledge Elicitation - non-interview techniques • non-interview techniques: • Repertory grid (particularly the "laddered grid" technique). • Questionnaires. Especially useful when the knowledge is to be elicited from several different experts.
Knowledge Elicitation - interview techniques • It is standard practice to tape-record knowledge elicitation sessions. • For something like a problem-solving interview, one would wish to videotape it as well. • However, knowledge engineers should be aware of the costs this involves, in time and money - it can take as much as 15 hours of secretarial time to transcribe and edit a one-hour interview.
Knowledge analysis & representation • Simultaneously with the knowledge acquisition process, a knowledge analysis process takes place. The KE uses the data - the transcripts and protocols, etc - from the knowledge acquisition sessions to build a good model of the expertise that the DE is using to solve problems in the domain.
Knowledge analysis & representation • The raw data (taken from the DE) is converted into intermediate representations. These are structured representations of the knowledge, but not yet the sort of coded knowledge that can be put into the knowledge base. • Producing an intermediate representation improves the knowledge engineer's understanding of the subject;
Knowledge analysis & representation • The intermediate representation will probably provide knowledge in a form that can be shown to the DE, for criticism and correction; • It also provides easily-accessible knowledge for future KEs to work from (knowledge archiving). • The intermediate representation is then converted into the knowledge representation formalism which is to be used in the KBS software.
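The final conversion step can be sketched as follows, assuming (purely for illustration) that the intermediate representation is a small decision table and that the target formalism is a set of IF-THEN production rules. The `Rule` class, the `table_to_rules` function and the car-fault rows are hypothetical; a real project would use whatever formalism its shell requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A production rule: IF all conditions hold THEN the conclusion holds."""
    name: str
    conditions: frozenset
    conclusion: str


# Hypothetical intermediate representation: a decision table that the DE
# can read and correct. Each row is (observations, conclusion).
decision_table = [
    ({"engine does not turn over", "lights are dim"}, "battery is flat"),
    ({"engine turns over", "engine does not fire"}, "fuel is not reaching engine"),
]


def table_to_rules(table):
    """Convert each decision-table row into a numbered production rule."""
    return [
        Rule(name=f"R{i}", conditions=frozenset(conds), conclusion=concl)
        for i, (conds, concl) in enumerate(table, start=1)
    ]


rules = table_to_rules(decision_table)
for rule in rules:
    print(f"{rule.name}: IF {' AND '.join(sorted(rule.conditions))} THEN {rule.conclusion}")
```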
Knowledge validation • It is necessary to verify the knowledge against the knowledge source (the expert or document). • It is also necessary to validate the knowledge against known outcomes. • The objective is to produce knowledge of high integrity.
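Validation against known outcomes can be pictured as running the rules over recorded cases and counting how often they reproduce the recorded result. The sketch below is a minimal version of that idea under simplifying assumptions: rules are (conditions, conclusion) pairs, the first matching rule wins, and the rules and case histories are invented.

```python
def classify(case, rules, default="unknown"):
    """Return the conclusion of the first rule whose conditions all hold."""
    for conditions, conclusion in rules:
        if conditions <= case:  # every condition is among the observations
            return conclusion
    return default


def accuracy(rules, known_cases):
    """Fraction of known cases for which the rules give the recorded outcome."""
    hits = sum(classify(obs, rules) == outcome for obs, outcome in known_cases)
    return hits / len(known_cases)


# Invented rules and case histories, for illustration only.
rules = [
    ({"lights are dim", "engine does not turn over"}, "battery is flat"),
    ({"engine turns over", "engine does not fire"}, "fuel fault"),
]
known_cases = [
    ({"lights are dim", "engine does not turn over"}, "battery is flat"),
    ({"engine turns over", "engine does not fire"}, "fuel fault"),
    ({"engine turns over", "engine does not fire", "smell of petrol"}, "ignition fault"),
    ({"engine does not turn over"}, "starter motor fault"),
]

score = accuracy(rules, known_cases)
print(f"{score:.0%} of known cases reproduced")
```

Cases that the rules get wrong, or cannot classify at all, are natural topics for the next elicitation session with the DE.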
Inference design • It may be necessary to design the software which will comprise the inference engine; or a particular shell may already have been specified.
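To give a feel for what a hand-built inference engine involves, here is a minimal sketch of forward chaining, one common inference strategy (the slides do not say which strategy a given project would use, so this choice is an assumption). The function name and the rules are invented for the example.

```python
def forward_chain(rules, facts):
    """Fire rules repeatedly until no rule can add a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all its conditions are known and its
            # conclusion has not already been derived.
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                changed = True
    return known


# Invented rules: (set of conditions, conclusion).
rules = [
    ({"battery is flat"}, "recharge or replace battery"),
    ({"lights are dim", "engine does not turn over"}, "battery is flat"),
]

derived = forward_chain(rules, {"lights are dim", "engine does not turn over"})
print(sorted(derived))
```

Note that the second rule must fire before the first can, which is why the loop keeps passing over the rule list until nothing changes.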
Explanation and justification • An explanation facility, capable of explaining/justifying any of the reasoning and conclusions that the system produces, needs to be designed and programmed.
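One simple way to support "how did you reach that conclusion?" questions is to record which rule produced each derived fact and then walk back through that record. The sketch below assumes a forward-chaining engine and invented rule names (`R1`, `R2`); it is an illustration of the idea, not a description of any particular shell's explanation facility.

```python
def forward_chain_with_trace(rules, facts):
    """Forward-chain, recording which rule produced each derived fact."""
    known = set(facts)
    trace = {}  # derived fact -> (rule name, conditions used)
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                trace[conclusion] = (name, conditions)
                changed = True
    return known, trace


def explain(fact, trace, depth=0):
    """Return lines justifying a fact by walking back through the trace.

    Facts absent from the trace are assumed to have been supplied as input.
    """
    indent = "  " * depth
    if fact not in trace:
        return [f"{indent}{fact} (given by the user)"]
    name, conditions = trace[fact]
    lines = [f"{indent}{fact} (concluded by rule {name})"]
    for condition in sorted(conditions):
        lines.extend(explain(condition, trace, depth + 1))
    return lines


# Invented rules: (name, conditions, conclusion).
rules = [
    ("R1", frozenset({"lights are dim", "engine does not turn over"}), "battery is flat"),
    ("R2", frozenset({"battery is flat"}), "recharge or replace battery"),
]

known, trace = forward_chain_with_trace(
    rules, {"lights are dim", "engine does not turn over"}
)
explanation = explain("recharge or replace battery", trace)
print("\n".join(explanation))
```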
Computer-assisted knowledge elicitation • Since knowledge engineering skills, and hence knowledge engineers, are rare (see appendix), it would be desirable to automate the job. • i.e. to write an expert system to do knowledge engineering.
Computer-assisted knowledge elicitation • The state of the art in AI (especially in natural language processing) is not sufficiently advanced to permit fully-automated knowledge elicitation.
Computer-assisted knowledge elicitation • However, 'knowledge elicitation workbenches', or 'knowledge engineering environments', are commercially available • e.g. KEE, KnAcqTools, ETS, KRITON, AQUINAS; • their principal use is to simplify the task of converting a protocol into frames, rules, etc., and inserting these structures into an expert system shell as soon as they are formulated.
Fully computerised knowledge acquisition • It might be thought that one could avoid using a domain expert altogether, by building a system that could extract knowledge, given facts about the domain. • This is the approach taken by machine learning systems: • "classic" machine learning systems such as ID3 (Quinlan, 1979) & AQ11 (Michalski & Chilausky, 1980);
Fully computerised knowledge acquisition • systems designed to provide knowledge for a particular system's knowledge base, e.g. META-DENDRAL, designed to discover rules for the rule-base in DENDRAL; • data mining systems; these do a similar job to classic machine learning systems, but work on a very large database of information. • sub-symbolic systems, i.e. neural nets and genetic algorithms. More about these in the last lecture in this course.
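As a small illustration of how a classic machine learning system extracts knowledge from examples, the sketch below computes information gain, the measure ID3 uses to decide which attribute to test at each node of its decision tree. It shows only that attribute-selection step, not the complete algorithm, and the plant-disease training examples are invented for the purpose.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(examples, attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([ex[target] for ex in examples])
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        # Weight each subset's entropy by the share of examples it contains.
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder


# Invented training examples, for illustration only.
examples = [
    {"leaf_spots": "yes", "season": "wet", "disease": "blight"},
    {"leaf_spots": "yes", "season": "dry", "disease": "blight"},
    {"leaf_spots": "no", "season": "wet", "disease": "healthy"},
    {"leaf_spots": "no", "season": "dry", "disease": "healthy"},
]

gains = {a: information_gain(examples, a, "disease") for a in ("leaf_spots", "season")}
best = max(gains, key=gains.get)
print(gains, "-> split first on", best)
```

In this toy data `leaf_spots` separates the classes perfectly while `season` tells us nothing, so the former would be chosen as the first test.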
Fully computerised knowledge acquisition • There are plenty of examples of machine learning systems producing formerly-unknown knowledge, and knowledge that was better than that of a domain expert.
Knowledge discovery • e.g.(1) META-DENDRAL • produced rules about the behaviour of molecules in a mass spectroscope that were published in a chemistry journal as original contributions to the field;
Knowledge discovery • e.g.(2) AQ11 • produced rules about how to diagnose diseases in soya bean plants.
AQ11’s rules were correct 97% of the time. The domain expert's rules were correct 83% of the time; he abandoned his rules, and adopted AQ11's rules instead.
Fully computerised knowledge acquisition • This approach is particularly fruitful in 'knowledge-poor' domains, i.e. domains where not much expert knowledge is available. • However, it is a mistake to believe that one can do machine learning without a domain expert - at the very least, you need an expert to select the training examples, and to explain the domain terminology. An expert will probably also be needed to identify the features of the examples which are likely to be relevant.