Analysis of the Motivation, Goals, and Approaches of the RIF Project for Unifying Rule-Based Languages
L.A. Kalinichenko, S.A. Stupnikov (IPI RAN)
Symposium "Ontological Modeling", KFU, 11-12 October 2010
Outline
• Language unification, integration of information resources, interoperability
• Goals of the RIF project
• Rule-based languages and their applications: a brief historical overview
• Logical models of reasoning
• Semantics of logic programs
• LP languages and systems that significantly influenced RIF
• RIF use cases and LP systems in the area of interest of the RIF WG
• Requirements for RIF
• Overview of the main RIF design decisions
• Conclusions
Language unification, integration of information resources, interoperability
Language unification, interoperability of information resources
• Languages that require unification: languages of relational and object databases; ontological modeling; representation of semi-structured, graph, and multimedia data; knowledge base representation; logic programming; deductive database query languages; workflow specification; interface specification for software information resources; languages with specialized semantics (for example, for expressing temporal or spatial models); languages for defining fuzzy and probabilistic representations; conceptual modeling languages and metamodels; and many others.
• An example of an extensible unifying language: SYNTHESIS (СИНТЕЗ)
• Rule-based programs are a new kind of information resource; unification of programming languages
• Goal of RIF: to create an extensible, unified family of rule-based languages
• The approach should be generally applicable, not limited to the Web
Why rules
• The use of rules for knowledge representation and intelligent information systems dates back over thirty years.
• By now it is a mature technology with decades of theoretical development and of practical and commercial use.
• The accumulated experience in this area exceeds the experience gathered with Description Logics, and the field is arguably more mature when it comes to rules.
• The mature technology for rule-based applications is based on logic programming and nonmonotonic reasoning (LPNMR).
• It is interesting that SQL, arguably the most important rule-based language, has LPNMR as its foundation.
• To exploit the full potential of rule-based approaches, the business rules and Semantic Web communities have started to develop solutions for reusing and integrating knowledge specified in different rule languages (such as those at http://www.w3.org/2005/rules/wg/wiki/List_of_Rule_Systems).
The RIF concept
• W3C working group RIF WG: active since 2005. Its task:
• Make it possible to exploit the potential accumulated in building and using various rule languages and rule systems
• Compatibility with RDF and OWL
• Goal: to create unified rule dialects into which existing rule languages can be mapped with their semantics preserved, so that rules created within one application can be published, used together with other rules, and reused in other applications and other rule engines
• The unified language is presented as a family of dialects that share a common core (the root dialect) plus a set of dialects extending it
• To bring a rule language fully into the set of interoperable languages, it is enough to equip its programming system with two semantics-preserving translators: from the native language into a dialect (the supplier role) and from the dialect into the native language (the consumer role)
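The two-translator scheme in the last bullet can be sketched with a toy example. Everything here is hypothetical: the `Rule` record merely stands in for a shared dialect, and the two textual rule syntaxes (a Prolog-like one and an IF/THEN one) are invented for illustration. Neither is real RIF syntax, and the sketch handles only propositional rules, so it shows the architecture rather than genuine semantics preservation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a shared dialect: a head atom plus body atoms."""
    head: str
    body: tuple


# Toy language 1 (assumed syntax): "flies :- bird, healthy."
def prolog_to_dialect(text):
    head, _, body = text.strip().rstrip(".").partition(":-")
    return Rule(head.strip(), tuple(a.strip() for a in body.split(",") if a.strip()))


def dialect_to_prolog(rule):
    if not rule.body:
        return f"{rule.head}."
    return f"{rule.head} :- {', '.join(rule.body)}."


# Toy language 2 (assumed syntax): "IF bird AND healthy THEN flies"
def ifthen_to_dialect(text):
    condition, _, head = text.strip().partition(" THEN ")
    condition = condition[len("IF "):]
    atoms = tuple(a.strip() for a in condition.split(" AND ") if a.strip() != "TRUE")
    return Rule(head.strip(), atoms)


def dialect_to_ifthen(rule):
    condition = " AND ".join(rule.body) if rule.body else "TRUE"
    return f"IF {condition} THEN {rule.head}"


def translators_needed(n_languages):
    """Translator count: two per language via a shared dialect vs. direct pairwise."""
    return {"via_dialect": 2 * n_languages,
            "pairwise": n_languages * (n_languages - 1)}


# Supplier role: native syntax -> dialect. Consumer role: dialect -> native syntax.
shared = prolog_to_dialect("flies :- bird, healthy.")
as_ifthen = dialect_to_ifthen(shared)
round_trip = dialect_to_prolog(ifthen_to_dialect(as_ifthen))
print(as_ifthen)
print(round_trip)
print(translators_needed(20))
```

With a shared dialect the number of translators grows linearly with the number of languages, whereas direct pairwise translation grows quadratically.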
Adoption of the standard
• On 22 June 2010 W3C published a new standard for building rule systems on the Web. Declarative rules allow integration and transformation of data from multiple sources in a distributed, transparent and scalable manner. The new standard, called Rule Interchange Format (RIF), was developed with participation from the Business Rules, Logic Programming, and Semantic Web communities to provide interoperability and portability between many different systems using declarative technologies. The six new standards are:
• RIF Core Dialect, which provides a standard, base level of functionality for interchange
• RIF Basic Logic Dialect and RIF Production Rule Dialect, which provide extended functionality matching two common classes of rule engines
• RIF Framework for Logic Dialects, which describes how to extend RIF for use with a large class of systems
• RIF Datatypes and Built-Ins 1.0, which borrows heavily from XQuery and XPath for a set of basic operations
• RIF RDF and OWL Compatibility, which specifies how RIF works with RDF data and OWL ontologies
Rule-based languages and their applications: a brief historical overview
Rule-based programs: classes of applications
• Logic programming has three main classes of application:
• a general-purpose programming language,
• a database language,
• a knowledge representation language.
• As a programming language, it can represent and compute any computable function.
• As a database language, it generalises relational databases to include general clauses in addition to facts.
• As a knowledge representation language, it is a non-monotonic logic, which can be used for default reasoning.
Applying logic to knowledge representation and problem solving
Basic landmarks:
• Advice Taker (McCarthy, 1958): KR and theorem proving
• Resolution (Robinson, 1965)
• Procedural versus logical approach: Planner (MIT), Winograd 1971
• Procedural interpretation of Horn clauses (Kowalski, 1974)
• Colmerauer's development of the programming language Prolog
• Non-monotonic logics: NAF (Clark, 1978), circumscription (McCarthy, 1980), default logic (Reiter, 1980), autoepistemic logic (Moore, 1985)
• Production systems (Newell, 1973)
• Agent LP logic (Kowalski and Sadri, 1999; Kowalski 2001, 2006)
Deductive databases
• Green and Raphael (1968): connection between theorem proving and deduction in databases; QA systems
• First description of bounded recursive queries (which can be replaced by nonrecursive equivalents): Minker, Nicolas
• Distinction between EDB and IDB first emphasized in DADM (1981)
• Reiter's paper on the closed world assumption (1978)
• Proof that the least fixpoint of a Horn-clause logic program coincides with its least Herbrand model: van Emden and Kowalski, 1976
• 1984-1995: active research and development under the influence of the Fifth Generation Computer Systems initiative
• LDL and LDL++ project at MCC (1984, 1990)
• ECRC deductive database project
• DOOD conferences: 1989 (Kyoto), 1991 (Munich), 1993 (Phoenix), 1995 (Singapore), 1997 (Montreux), 2000 (London).
Logical models of reasoning
Examples of different kinds (models) of nonmonotonic reasoning:
• Abductive reasoning: a kind of automated plausible reasoning
• Default reasoning: a default assumption can be withdrawn once it turns out to be wrong
• Negation as failure: based on the closed world assumption
• Circumscription: common-sense assumptions hold unless there are indications to the contrary
• Autoepistemic logic: representation of knowledge about knowledge, and reasoning in that setting
• Defeasible logic: conclusions reached during reasoning are not final; they can be revised or retracted
• Reasoning under uncertainty: for example, using the Bayesian formalism in rule systems
• Argumentation theories: based on the art of debate and persuasion through logical reasoning
NAF and reasoning about incomplete information
• Alongside negation as failure (not), logical formulas also need classical (strong) negation (¬)
    flies(X) :- bird(X), not excl(X).
    bird(X) :- penguin(X).
    excl(X) :- penguin(X).
    bird(tweety).
    penguin(sam).
    ¬bird(X) :- not bird(X).
    ¬penguin(X) :- not penguin(X).
    ¬excl(X) :- not excl(X).
    ¬flies(X) :- penguin(X).
    ¬flies(X) :- ¬bird(X).
• The CWA is stated explicitly for selected predicates, through rules of the form ¬p(X) :- not p(X). The remaining predicates are under the OWA.
Classes of Programs
• A rule has the form
    H ← B1 ∧ … ∧ Bm ∧ ~Bm+1 ∧ … ∧ ~Bn
  where H and the Bi are atoms, n ≥ m ≥ 0, and the ~Bj are called negation-as-failure literals (NAF literals). The rule reads "if [body] then [head]", with universal quantification at the outer level.
Specific classes of programs:
• Normal programs allow NAF literals in the body of a rule; programs that additionally allow classical negation in the head and in the body of a rule are called extended programs.
• Definite programs are normal programs where negated atoms and NAF literals are not allowed in the rules.
• Datalog programs are logic programs where function symbols other than constants are not allowed. Usually it is also required that the rules are safe.
• Conjunction of literals in the head: shorthand for a number of ordinary rules.
• Disjunction of literals in the head leads to disjunctive logic programming.
Definite Programs
Definite programs contain no NAF literals; each rule is a definite Horn clause (Horn rule):
    H ← B1 ∧ … ∧ Bk
The semantics of a definite logic program is its least Herbrand model: the smallest (w.r.t. set inclusion) set S of ground atoms such that for every ground instance H ← B1 ∧ … ∧ Bk of a rule, if B1, …, Bk ∈ S then H ∈ S.
• Backward reasoning: an atomic query is matched/unified with the head of a rule and replaced by the respective instance of the body
• Forward reasoning: the head of a ground instance of a rule is added to the conclusion set when all body atoms of this rule are already included in the conclusion set
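Forward reasoning as described above can be sketched as a naive fixpoint loop over ground rules. This is a minimal illustration, not an efficient evaluator: rules are assumed to be already ground, atoms are plain strings, and the `parent`/`ancestor` facts are invented for the example.

```python
def least_model(rules):
    """Least Herbrand model of a ground definite program.

    rules: list of (head, body) pairs, where body is a list of atoms.
    A fact is a rule with an empty body.
    """
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # Add the head once every body atom is already in the conclusion set.
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model


# Hypothetical ground program (names are illustrative only).
rules = [
    ("parent(ann,bob)", []),
    ("parent(bob,cid)", []),
    ("ancestor(ann,bob)", ["parent(ann,bob)"]),
    ("ancestor(bob,cid)", ["parent(bob,cid)"]),
    ("ancestor(ann,cid)", ["parent(ann,bob)", "ancestor(bob,cid)"]),
    ("ancestor(cid,ann)", ["parent(cid,ann)"]),  # body is never derivable
]

model = least_model(rules)
print(sorted(model))
```

The loop stops when a full pass adds nothing new, which is exactly the least fixpoint; atoms whose bodies can never be satisfied, such as `ancestor(cid,ann)` here, stay out of the model.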
Horn rules
• A Horn clause is a clause (a disjunction of literals, that is, atomic formulae (atoms) or their negations) with at most one positive literal. In rule form the positive literal is the head: H ← B1 ∧ … ∧ Bm. With NAF literals allowed in the body, the rule form becomes H ← B1 ∧ … ∧ Bm ∧ ~Bm+1 ∧ … ∧ ~Bn
• Horn clauses play a basic role in logic programming.
• A Horn clause with exactly one positive literal (the head) is a definite clause.
• A Horn clause with no positive literals is called a goal clause in LP.
• The resolution of a goal clause with a definite clause is the basis of SLD resolution (Selective Linear Definite clause resolution).
• NAF can be used in the body of clauses. In Prolog, NAF is known to be problematic: evaluation may fail to terminate because of infinite positive recursion or infinite recursion through negation, and it may repeatedly evaluate the same body literal, leading to unacceptable performance
• Modern logic programming languages use either the well-founded default negation or the one based on stable models
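The goal-reduction step behind SLD resolution, and the infinite positive recursion mentioned above, can be sketched for the propositional case. This is a simplified illustration under stated assumptions: there are no variables and therefore no unification, the leftmost goal is always selected, and the `depth` bound is an artificial device of this sketch (a real Prolog engine has no such bound and would simply not terminate on `loop`). Because of the bound, a `False` answer means "not proved within the bound".

```python
def solve(goals, program, depth=50):
    """Try to reduce a list of goal atoms to the empty goal.

    program: dict mapping an atom to the list of bodies of its clauses;
             each body is a list of atoms, and a fact has the body [].
    """
    if not goals:
        return True           # empty goal clause reached: the query succeeds
    if depth == 0:
        return False          # bound exhausted (assumption of this sketch)
    first, rest = goals[0], goals[1:]
    for body in program.get(first, []):
        # Resolve the selected goal with a clause: replace it by the clause body.
        if solve(body + rest, program, depth - 1):
            return True
    return False


# Hypothetical propositional program.
program = {
    "mortal": [["human"]],    # mortal :- human.
    "human": [[]],            # human.
    "loop": [["loop"]],       # loop :- loop.  (infinite positive recursion)
}

print(solve(["mortal"], program))
print(solve(["loop"], program))
print(solve(["immortal"], program))
```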
Two Philosophically Different Approaches for LP
• 1. Keep the idea of defining a single model for a program, possibly including problematic classes of programs with negation. This can be achieved by properly defining which single model should be selected among all classical models of a program. For general normal programs, the most popular semantics is perhaps the one based on the well-founded model.
• 2. Identify a collection of multiple models. This line of research abandons the "dogmatic" requirement of a single model and accepts the possibility of having multiple scenarios compatible with a given program. Stable models: the task is model generation (that is, the computation of the set of preferred models), which is more than query answering.
Stable Model
• The intuition behind stable model semantics is to treat negated atoms in a special way. Intuitively, such atoms are a source of "contradiction" or "instability". "Stability" can thus be seen as follows: if an interpretation M of P is not self-contradicting, then it is stable.
    man(petrov).
    single(X) ← man(X), not husband(X).
    husband(X) ← man(X), not single(X).
• Here single(petrov) and husband(petrov) are mutually dependent through negation. An SLD resolution algorithm would loop forever when trying to answer the query single(X).
• This program has two minimal Herbrand models that are stable: M1 = {man(petrov), single(petrov)} and M2 = {man(petrov), husband(petrov)}
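The stability check can be made concrete with the standard Gelfond-Lifschitz reduct: a candidate set M is stable when the least model of the program, simplified with respect to M, is M itself. The brute-force sketch below works on the ground instance of the petrov program; the rule encoding as `(head, positive_body, negative_body)` triples is an assumption of this sketch, and enumerating all subsets is only feasible for tiny programs.

```python
from itertools import combinations

# Ground instance of the program; each rule is (head, positive_body, negative_body).
program = [
    ("man(petrov)", [], []),
    ("single(petrov)", ["man(petrov)"], ["husband(petrov)"]),
    ("husband(petrov)", ["man(petrov)"], ["single(petrov)"]),
]


def least_model(definite_rules):
    """Least model of ground definite rules given as (head, positive_body)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def is_stable(program, candidate):
    # Reduct: drop every rule with a NAF literal 'not a' where a is in the
    # candidate, then delete the remaining NAF literals.
    reduct = [(head, pos) for head, pos, neg in program
              if not any(a in candidate for a in neg)]
    return least_model(reduct) == candidate


def stable_models(program):
    atoms = sorted({h for h, _, _ in program}
                   | {a for _, pos, neg in program for a in pos + neg})
    found = []
    for size in range(len(atoms) + 1):
        for subset in combinations(atoms, size):
            if is_stable(program, set(subset)):
                found.append(set(subset))
    return found


models = stable_models(program)
for m in models:
    print(sorted(m))
```

The two sets found correspond to M1 and M2 on the slide.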
Answer Set Programming (ASP) • ASP adds constraints, strong negation and disjunction • female(X) v male(X) ← person(X) • Useful application for strong negation (in combination with weak negation) is to express default rules. For example, we can express that “a bird flies by default” with the rule • flies(X) ← bird(X), not¬ flies(X). Here ¬ is a strong negation. • not p in the body of a rule can be read (V.Lifschitz) as "p is not believed" • broken(left hand, tom) v broken(right hand, tom) - disjunctive fact • ok(C) v ¬ ok(C) ← component(C) states that a component may be in a working condition or not working at all,
ASP Example
• In the graph 3-coloring problem, assume that G = (V, E) is stored using facts node(n) for each n ∈ V and edge(n, n') for each (n, n') ∈ E, which gives the data D. The generic specification of solutions PS can then be given by the following rules:
    b(X) ← node(X), not r(X), not g(X).
    r(X) ← node(X), not b(X), not g(X).
    g(X) ← node(X), not r(X), not b(X).
• and the constraints
    ← b(X), b(Y), edge(X, Y).
    ← r(X), r(Y), edge(X, Y).
    ← g(X), g(Y), edge(X, Y).
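The guess-and-check reading of this encoding can be mirrored in plain Python: the three rules guess exactly one color per node, and the constraints discard guesses in which an edge joins two nodes of the same color. This is only an analogy under stated assumptions, not how an ASP solver works internally; the four-node graph is invented for the example.

```python
from itertools import product

# Hypothetical data D: a triangle a-b-c plus a pendant node d attached to c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]


def colorings(nodes, edges, colors=("r", "g", "b")):
    """Enumerate all assignments of one color per node that violate no constraint."""
    solutions = []
    for assignment in product(colors, repeat=len(nodes)):   # the "guess" part
        color = dict(zip(nodes, assignment))
        # The "check" part: no edge may connect two nodes of the same color.
        if all(color[x] != color[y] for x, y in edges):
            solutions.append(color)
    return solutions


solutions = colorings(nodes, edges)
print(len(solutions))
print(solutions[0])
```

Each surviving assignment corresponds to one answer set of PS together with D.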
ASP Applications
• This method has been successfully applied to numerous problems in a range of areas; an incomplete list is
• diagnosis
• information integration
• constraint satisfaction
• reasoning about actions (including planning)
• routing and scheduling
• health care
• biomedicine and biology
• text mining and classification
• question answering
Well-founded semantics (WFS)
• The well-founded semantics (WFS) is based on a three-valued logic (true, false, unknown)
• If an atom is true in the well-founded model of P, then it belongs to every stable model of P. The converse does not hold in general.
• For instance, the program
    p ← not q
    q ← not p
    r ← p
    r ← q
  has two stable models, {p, r} and {q, r}. Even though r belongs to both of them, its value in the well-founded model is unknown.
• Unlike ASP, WFS yields a single result set as the outcome of inference
• WFS underlies the implementation of many systems, above all commercial ones (for example, XSB, Ontobroker, Intellidimension, SweetRules, SILK, FLORA)
• ASP has higher computational complexity. Examples of ASP implementations: Smodels, DLV, Clasp.
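One standard way to compute the well-founded model of a ground program is the alternating fixpoint construction, which alternates between an underestimate of the true atoms and an overestimate of the possibly-true atoms. The sketch below applies it to the four-rule program from the slide. The rule encoding as `(head, positive_body, negative_body)` triples is an assumption of this sketch, and this is not necessarily how the systems listed above are implemented.

```python
# The program from the slide; each rule is (head, positive_body, negative_body).
program = [
    ("p", [], ["q"]),
    ("q", [], ["p"]),
    ("r", ["p"], []),
    ("r", ["q"], []),
]


def least_model(definite_rules):
    """Least model of ground definite rules given as (head, positive_body)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def gamma(program, interpretation):
    """Least model of the reduct of the program w.r.t. the interpretation."""
    reduct = [(head, pos) for head, pos, neg in program
              if not any(a in interpretation for a in neg)]
    return least_model(reduct)


def well_founded(program):
    atoms = {h for h, _, _ in program} | {a for _, pos, neg in program for a in pos + neg}
    true = set()
    while True:
        possible = gamma(program, true)       # overestimate: atoms not known false
        new_true = gamma(program, possible)   # underestimate: atoms known true
        if new_true == true:
            break
        true = new_true
    return {a: "true" if a in true else "unknown" if a in possible else "false"
            for a in atoms}


model = well_founded(program)
print(sorted(model.items()))
```

For this program nothing can be established as true or false, so p, q and r all come out unknown, in line with the slide's remark about r.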
Well-founded and stable semantics
• One of the frequently cited differences is that ASP supports reasoning by cases, which is not possible using WFS. Due to the differences in their expressive power, the two paradigms are typically used for different purposes. ASP is ideally suited for solving hard combinatorial problems, and such systems usually appear as embedded knowledge base components of imperative programming languages. In contrast, WFS-based systems are typically Turing-complete and can be used as programming languages in their own right.
• At the same time, it was shown that the WFS can be viewed as a three-valued version of the stable model semantics.
LP languages and systems that significantly influenced RIF (F-Logic)
Frame logic and a metaprogramming language (F-Logic)
• F-logic combines the advantages of conceptual modeling with object-oriented, frame-based languages and offers a declarative semantics
• F-logic provides a logical foundation for frame-based and object-oriented languages for data and knowledge representation
• HiLog is a logical formalism extending Prolog that provides higher-order and meta-programming features in a computationally tractable first-order setting
• FLORA-2 is an advanced object-oriented knowledge base language
• The language of FLORA-2 is a dialect of F-logic with extensions: HiLog and logical updates in the style of Transaction Logic
• Its semantics is non-first-order and nonmonotonic: the negation operator is interpreted as negation-as-failure, and multiple inheritance with overriding is given its own semantics
Structure of F-Logic Programs
F-programs specify what each method is supposed to do, define method signatures, and organize objects along class hierarchies.
• Object definitions may be explicit, that is, given as facts, or implicit, that is, specified via deductive rules.
• Class-hierarchy declarations, as their name suggests, organize objects and classes into IS-A hierarchies.
• Signature declarations specify the types of the arguments for each method and the type of the output they produce.
• In FLORA-2, transactions are expressed as object methods that are prefixed with the special symbol "%". Atomicity of transactions is provided.
Schemas in F-Logic
• Classes are treated as objects, and the same object can play the role of a class in one formula and of an object in another.
• Schema information is given through signature formulas:
    paper[authors => person, title => string].
    journal_p :: paper[in_vol => volume].
    conf_p :: paper[at_conf => conf_proc].
    journal_vol[of => journal, volume => integer, number => integer, year => integer].
    journal[name => string, publisher => string, editors => person].
    conf_proc[of_conf => conf_series, year => integer, editors => person].
    conf_series[name => string].
    publisher[name => string].
    person[name => string, affil(integer) => institution].
    institution[name => string, address => string].
Objects in F-Logic
    o_j1 : journal_p[title -> 'Records, Relations, Sets, Entities, and Things', authors -> {o_mes}, in_vol -> o_i11].
    o_di : conf_p[title -> 'DIAM II and Levels of Abstraction', authors -> {o_mes, o_eba}, at_conf -> o_v76].
    o_i11 : journal_vol[of -> o_is, number -> 1, volume -> 1, year -> 1975].
    o_is : journal[name -> 'Information Systems', editors -> {o_mj}].
    o_v76 : conf_proc[of -> vldb, year -> 1976, editors -> {o_pcl, o_ejn}].
    o_vldb : conf_series[name -> 'Very Large Databases'].
    o_mes : person[name -> 'Michael E. Senko'].
    o_mj : person[name -> 'Matthias Jarke', affil(1976) -> o_rwt].
    o_rwt : institution[name -> 'RWTH Aachen'].
Example queries
• ?- student[?M => person].
  finds the set-valued methods that are defined in the schema of class student and return objects of type person
• ?- student::?C and student[name => ?T].
  returns all the superclasses of class student and the types of the attribute name
• ?- person[?M(?Arg) => integer].
  ?M would be bound to names of functions (and predicates). The semantics of this second-order syntax is first order, however: roughly, variables get bound not to the extensional values of the symbols but to the symbols themselves.
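The idea that a variable can range over method names, with symbols treated as ordinary values, can be imitated in Python by storing signatures as data. The sketch below is only an analogy: it encodes part of the bibliography schema shown earlier as tuples, and the helper names `superclasses` and `query_methods` are invented for illustration. It is not FLORA-2 and does not reproduce F-logic's semantics beyond simple signature inheritance along `::`.

```python
# Signature formulas cls[method => type], encoded as (cls, method, type).
signatures = {
    ("paper", "authors", "person"),
    ("paper", "title", "string"),
    ("journal_p", "in_vol", "volume"),
    ("conf_p", "at_conf", "conf_proc"),
    ("journal_vol", "of", "journal"),
    ("journal_vol", "volume", "integer"),
    ("journal_vol", "number", "integer"),
    ("journal_vol", "year", "integer"),
}

# Subclass declarations sub :: sup, encoded as (sub, sup).
subclass_of = {("journal_p", "paper"), ("conf_p", "paper")}


def superclasses(cls):
    """All direct and indirect superclasses of cls (analogue of ?- cls::?C)."""
    result, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result


def query_methods(cls, result_type):
    """Method names ?M such that cls[?M => result_type], including inherited ones."""
    scope = {cls} | superclasses(cls)
    return {m for c, m, t in signatures if c in scope and t == result_type}


print(superclasses("journal_p"))
print(query_methods("journal_p", "person"))
print(sorted(query_methods("journal_vol", "integer")))
```

The query result is a set of method names, which is the first-order reading described above: the variable is bound to the symbol itself, not to the set of values the method takes.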
Reification
    p(a).
    q(p(a)).
    ?- q(?X), ?X.
Here the proposition p(a) is reified (made into an object), so it can be bound to the variable ?X.
The statement "John believes that Mary likes Sally" is written as follows:
    John[believes -> ${Mary[likes -> Sally]}].
${...} is the syntax that FLORA-2 uses to denote reified statements.
    John[believes -> ${Bob[likes -> ?X] :- Mary[likes -> ?X]}].
This sentence reifies a rule (not just a fact) and states that John also believes that Bob likes anybody who is liked by Mary.
F-Logic examples
• X : merchandise ← X : Y ∧ Y[price => ( )] defines a derived class, merchandise, that consists of all objects to which the attribute price applies.
• Lowest superclass of X and Y:
    lsup(X, Y) :: C ← X :: C ∧ Y :: C
    X :: lsup(X, Y)
    Y :: lsup(X, Y)
• Greatest subclass of X and Y:
    C :: gsub(X, Y) ← C :: X ∧ C :: Y
    gsub(X, Y) :: X
    gsub(X, Y) :: Y
Methods in F-Logic
    empl[salary @ year => integer]
    person[birthdate => year]
The first clause says that salary is a function that, when invoked on an empl-object with an argument of class year, returns an object of class integer. The second clause says that birthdate is an attribute that returns a year for any person-object.
F-Logic vs. DLs
• F-logic is computationally complete. On the other hand, expressive F-logic knowledge bases provide no computational guarantees.
Compared with DLs:
• The exponential complexity of many problems in DLs creates difficulties in reasoning with large ontologies
• Decidable computations in F-logic with polynomial complexity include queries without function symbols that are beyond the expressive power of DLs. Furthermore, research in logic programming and deductive databases has identified large classes of knowledge bases with function symbols where query answering is decidable
There are two aspects where DLs provide more flexibility:
• DLs allow the user to represent existential information. For instance, one can say that there is a person with certain properties without specifying any concrete instance of such a person
• DLs admit disjunctive information into the knowledge base. For instance, one can say that John has a book or a bicycle. The corresponding statement in F-logic is only an approximation: john[has -> #:(book or bicycle)] (# is a Skolem constant)
RIF use cases and LP systems in the area of interest of the RIF WG
Rule systems in the area of interest of the RIF WG
• From the very start of the RIF project, it was necessary to establish interaction with the groups that create and maintain specific heterogeneous rule languages and rule systems
• The RIF WG determined from the outset that a rule system is of interest to the RIF WG if it is liable to enter in applications where it will be required to interchange rules (with other rule systems of interest) and to implement a use case.
• The RIF WG received more than 50 use case proposals in 2005. These use cases represented more than 20 different rule systems.
Rule systems selected for analysis and for deriving the requirements for RIF (1)
Rule systems selected for analysis and for deriving the requirements for RIF (2)