Knowledge Representation and Reasoning (1) – History of Knowledge Information Processing, Languages for Representing Knowledge and Prolog
The history of knowledge information processing technology, which is the core of artificial intelligence technology other than machine learning, and knowledge representation languages and Prolog are described.
In the 1960s, AI research was devoted entirely to search problems. In particular, much of the research was on theorem proving. That is, describing a problem as a small set of axioms and searching to prove the problem. Therefore, it was an implicit assumption that the inference mechanism had the capability to do so. If the right search technique could be discovered, it would mean that all the problems could be solved and all the theorems could be proved.
In the 1970s, the idea of theorem proving began to show that there were challenges to this approach, as it did not live up to expectations. what AI researchers gradually began to realize was that no matter how clever the inference algorithms could be devised, they would not be able to solve NP-hard problems. The general inference mechanism worked well for toy problems, but could not cope when the size of the problem increased.
The alternative to this would be the idea of expert systems. According to Feigenbaum, the lesson learned from expert systems such as MYCIN is that the choice of inference mechanism is not as important as having the right knowledge representation. According to Feigenbaum, the lesson learned from expert systems like MYCIN is that the choice of inference mechanism is not as important as having the right knowledge representation. In this view, it does not matter much whether MYCIN uses forward or backward reasoning.
It also does not matter whether it uses confidence, probability, or fuzzy set theory. What matters is the knowledge that “Pseudomonas aeruginosa can infect immunocompromised patients, has a negative Gram stain, and is neck-shaped. In other words, what is important is the acquisition and expression of knowledge.
While the idea of expert systems was successful in some areas, it also had some failures. Researchers were concerned with identifying the limitations of the new techniques and understanding how they worked. Many of the researchers had serious reservations about the knowledge used in the system, which was not clearly defined. For example, does the assertion (color apple red) mean that a particular apple is red, that all apples are red, that some or most apples are red, or that all apples are red?
In the area of knowledge representation, we focused on providing an explicit backmatter for knowledge representation as well as algorithms for manipulating knowledge. A lot of attention was paid to finding a good trade-off between expressiveness and efficiency. An efficient language would be one in which all questions (at least the average ones) are answered immediately. If we need to guarantee that a question will be answered immediately, we need to limit what can be expressed in that language.
At the end of the 1980s, a series of results raised questions about the desire to find an efficient language with reasonable expressive power. Using mathematical techniques based on worst-case analysis, it was shown that even seemingly insignificant languages are difficult to solve. The computational complexity of answering a simple question is of the order of exponentiation in the worst case.
Interest in the 1990s shifted to knowledge representation and reasoning. This area faces both expressiveness and efficiency of language, but the average case is acknowledged to be more important than the worst case.
The amount of knowledge does not help to solve difficult problems in the worst case, but in reality the worst case is rare.
AI researchers have examined hundreds of knowledge representation languages in an attempt to find a language that is useful, expressive, and efficient. Languages can be categorized into four groups, depending on what their basic unit is
- Logical expressions (e.g. Prolog)
- Networks (semantic networks, conceptual graphs)
- Objects (scripts, frames)
- Procedures (Lisp, production systems)
Logic-based languages, such as Prolog, have also been discussed in this blog.
Network-based languages can be seen as a syntactic variant of logic-based languages. The link L between node A and this node becomes another way to express the logical relation L(A,B). The difference is that the network-based language treats its links more importantly, meaning that the links are directly implemented by pointers inside the computer, and inference is performed by following the links. Thus, putting a link L between A and B is not only stating that L(A,B) is true, but also stating something about how the knowledge base is to be explored.
Object-oriented languages can be seen as a syntactic variant of predicate logic. The following would be a statement in a typical slot-embedding frame language.
(a person
(name = Jan)
(age = 32))
This is equivalent to the following logical formula
\[\exists p:\ person(p)\land name(p,Jan)\land age(p,32)\]
Some people argue that it is easier to read if it is expressed in frames, but the expressive power of frames is small; there is no way to state that the name of a person is either Jan or John, or that the age of a person is not 34. Of course, such sentences are easy to make in predicate logic.
Procedural languages are contrasted with expressive languages. Procedural languages are contrasted with expressive languages, i.e., procedural languages compute answers without an explicit representation of knowledge.
Hybrid expressive languages use different methods together to represent different kinds of knowledge. For example, the KL-ONE family uses both logical expressions and objects organized into networks. Many of the frame languages allow for procedural attachment. This is a technique that uses procedures to compute values for expressions that cannot or are difficult to express in the frame language itself.
Many of the expressions we have discussed so far are based on predicate logic. Predicate logic has a special status among the methods of representation. In other words, it has value as a highly universal standard for defining and evaluating other representation methods. In the previous section, we showed an example of expression using a frame language. In terms of the ease of using the syntax and the efficiency of the internal representation of data, the frame language has many advantages. However, in order to understand the meaning of what is expressed in the language, a clear definition is needed. Predicate logic can be used to give such a definition in most cases.
Knowledge representation by predicate logic uses individuals to which relations and functions are applied, and sentences formed by joining the relations by logical connectives and, or, not. Philosophers and psychologists will argue about the suitability of predicate logic as a model for human thought. However, the following point becomes clear. Predicate logic is sufficient to represent anything that can be represented by a digital computer. It will be easy to show this. If we assume that the computer’s memory consists of n bits, and bi= means that bit i is on, then the entire state of the computer can be represented by a dexterous conjunction.
\[(b_0=0)\land (b_1=0)\land(b_2=1)\land\dots\land(b_n=0)\]
Once we express one state of a computer, we can express any computer program in predicate logic, as a set of axioms that map one state to another. In this way, it can be shown that predicate logic is a sufficient language to represent arbitrary events occurring inside a computer. It can also be used as a tool to analyze arbitrary programs from the outside.
This is not to say that predicate logic has proven to be a suitable language for all applications. In some cases there are good reasons to use another language. There are times when we should express knowledge in a form quite different from predicate logic, and investigate expressions in a form quite different from logical reasoning. However, it is necessary to describe the system using the axioms of predicate logic and to value the theorems in proofs. If you try to save your hands, it will be crude. For example, rather than manipulating the axioms of predicate logic, you would rather use the arithmetic instructions built into the CPU to manipulate the numbers inside the computer.
However, when writing a square root routine, it is better to satisfy the following axioms.
\[\sqrt{x}=y\Rightarrow y\times y=x\]
Postoperative logic is also used for other purposes. It is useful as a tool that can be used by programs, rather than as a program. All programs need to manipulate data. Some programs also manipulate data that is considered to be written in the representation of predicate logic. There is interest in this usage.
Using predicate logic, it is easy to start writing facts in a domain. However, even when predicate logic is used in a straightforward manner, it suffers from serious limitations, as shown below.
Decidability: Given a set of axioms and a goal, neither the goal nor its negation may be derivable from the axioms.
- tractability: Even if the goal is provable, it may be too time-consuming to discover its proof using the available reasoning mechanisms.
- Uncertainty: It may be inconvenient to deal with relations that have been proven to some extent, but cannot be known with certainty to be true or false.
- Monotonicity: In pure predicate logic, once a theorem is proved, it is true forever. In pure predicate logic, once a theorem is proved, it is true forever. However, we want to have a way to derive a hypothetical theorem based on a hypothesis, and then withdraw the hypothesis when it is proven false.
- Consistency: In pure predicate logic, no contradiction is allowed. If, by chance, both P and ¬P can be derived, then any theorem can be proved. In short, even the slightest contradiction will contaminate the entire database.
- Omniscience: It may be difficult to distinguish between what is provable and what needs to be proven, and this leads to unfounded hypotheses that agents believe all consequences from known facts.
- Expressiveness: First-order predicate logic can be awkward when talking about things like relations and propositions.
From the point of view that prevails today, the best strategy is to address these issues from both inside and outside of predicate logic. For the sake of convenience and to take advantage of dedicated reasoning mechanisms that are more efficient than general-purpose theorem provers, it would be a good idea to draft a new notation.
However, it is also important to closely define the meaning of the new notation using the well-known notation of predicate logic. As Drew McDermott (1978) put it, “No notation without denotation.
Next, we show how new notations (and their corresponding meanings) can be used to extend existing representations and reasoning systems. We choose Prolog as the language to extend, not because we recommend it as the ultimate knowledge representation language, but simply because it is clear and familiar with the basics.
Prolog has been proposed as an answer to the problem of programming in logic; why is Prolog not accepted as a universal knowledge representation language? Prolog was proposed as an answer to the problem of programming in logic. Given two logically equivalent uses, one will result in an efficient Prolog program, but the other will not.
Kowalsji’s famous equation “Algorithm = Logic + Control” represents the limit of logic alone. In other words, “logic = algorithm – control”. Many problems (especially in AI) have large or infinite search spaces, and Prolog cannot find the answer in a reasonable time without advice on how to search.
The problems with Prolog can be divided into three categories. First, in order to make the language more efficient, its expressive power is limited: Prolog cannot express that a person’s name is Jan or John (although it can ask whether a person’s name is either Jan or John). Prolog does not distinguish between false and unknown.
Second, Prolog’s inference mechanism is neither correct nor complete. Second, Prolog’s inference mechanism is neither correct nor complete: it does not check for circular unification, so it may give the wrong answer. It also does depth-first search, so it may not find the right answer.
Thirdly, Prolog does not have a good way to add control information to the underlying logic. However, adding control information may make it less efficient for certain problems.
Even if Prolog is a language that programs in logic, it is not a complete predicate logic. An important problem is that Prolog cannot express certain unambiguous facts. Prolog can express definite facts, i.e., “The capital of Rhode Island is Providence” can be expressed. It can also express a series of facts. In other words, “The capital of Rhode Island is Providence, and the capital of California is Sacramento.
However, it cannot express a choice or negation. In other words, you cannot express “The capital of California is not Los Angels” or “The capital of New York is either New York City or Albany”. Let’s try the following
(<- (not (capital LA CA)))
(<- (or (capital Albany NY) (capital NYC NY)
These two facts are not concerned with the relation CAPITAL, but with the relations NOT and OR. Therefore, they are not taken into account when asking about CAPITAL. Fortunately, the assertion “Either MYC or Albany is the capital of the state of NY” can be rephrased into two assertions. That is, it can be rephrased as “If MYC is not the capital of NY State, then Albany is the capital of NY State” and “If Albany is not the capital of NY State, then NYC is the capital of NY State.
(<- (capital Albany NY) (not (capital NYC NY)))
(<- (capital NYC NY) (not (capital Albany NY)))
Unfortunately, Prolog’s not is not the same as logic’s not. When Prolog answers a question with “no”, it means that the question cannot be proven from known facts. If every thing is a city, then the question must be false. But if there are unknown facts, then the question may actually be true. This is not surprising. It is not right to expect a program to find the answer by using knowledge it does not have.
Moreover, in the above case, it causes a problem. Given the previous two clauses and the question (capital ? c NY), prolog enters an infinite loop. If we remove the first clause, prolog fails to prove that Albany is the state capital, and thus NYC is proven to be the state capital. if we remove the second clause, the opposite result is drawn.
The problem is that Prolog equates “unproven” with “false”. Prolog creates a condition called the closed world assumption. Prolog creates a situation called the closed world assumption, which assumes that everything that is true is known. The closed world assumption is reasonable for most programs. The closed world assumption is reasonable for most programs because the programmer knows all relevant information. However, for general knowledge representation, we need a system that does not create the state of the closed world hypothesis and has three answers to questions: yes, no, and unknown.
In this example, we cannot conclude whether the capital of the state of NY is or is not NYC, so we cannot conclude anything about ALbany.
Let’s consider one more example.
(<- (damned) (do))
(<- (damned) (not (do)))
With these rules, the question (? – (damned)), we should logically be able to answer “yes” to it. Furthermore, it should be possible to conclude (damned) without checking whether (do) is proved or not. In any case, Prolog will try to prove (do) again, and if the proof fails, (damned) will be proved. In this way, Prolog does the same proof twice when the proof is completely unnecessary. When negation is introduced, the simple evaluation mechanism of Prolog does not hold. It is no longer sufficient to consider only a single clause at a time. If we want to derive the correct answer, we need to consider multiple clauses together.
Robert Moore (1982) provides a good example of the power of selective reasoning. Although his problem concerns three color-coded blocks, we will change the blocks to countries. An Eastern European country g wants to decide whether to remain a communist state or become a democracy, but the outcome is not yet known. g is surrounded by a democracy D and a socialist state c. The problem is “a communist state.
The question is: “Is a communist state adjacent to a democratic state?” Moore points out that the answer is “yes,” but it takes case-by-case reasoning to find the answer. If E is a democracy, it is adjacent to C, so the answer is “yes”; if E is a communist state, it is adjacent to D, so the answer is still “yes. Since there are only two possibilities, the answer must be “yes” in all cases.
Logical reasoning can tell us the correct answer, but Prolog cannot. The following seven assertions and one question describe the problem. However, Prolog cannot handle or, the last assertion.
(<- (next-to D E))(<- (next-to E D))
(<- (next-to E C))(<- (next-to C E))
(<- (democracy D))(<- (communist C))
(<- (or (democracy E)(communist E)))
(?- (next-to ?A ?B)(democracy ?A)(communist ?B))
Prolog confirmed that it is not very good at selection and negation. It is also difficult to express existence. Consider the following example, written in English, logic, and Prolog.
Jan likes everyone.
∀x person(x)⇒likes(Jan,x)
(<- (likes Jan ?x)(person ?x))
In this example, the translation of prolog is faithful, but when it comes to “Jan likes someone”, I cannot find a good translation. The one that seems to be most faithful is the following.
Jan likes someone.
∃x person(x)⇒likes(Jan,x)
(<- (likes Jan p1))
(<- (person p1))
In this example, we introduce a new symbol p1 to represent the unknown that Jan prefers. p1 is not a variable, but a constant. The constant that represents an identified but unknown entity is called the Skolem constant, after Thoralf Skolem. The purpose is to express that p1 may be equal to some other person whose existence is known or unknown.
If we know that Jan’s preferred person is Adrian, we can add the assertion p1=Adrian. However, this does not work well in Prolog. This is because Prolog implicitly uses the unique name asssumption that different atoms refer to different individuals.
In fact, the Skolem constant is a special case of the Skolem function, which represents an unknown entity determined by one or more variables. For example, to sum up “someone prefers someone else,” we write
Everyone likes someone.
∀y∃x person(x)⇒likes(y,x)
(<- (likes ?y (p2 ?y)))
(<- (person (p2 ?y)))
In this example, p2 is a Skolem function determined by the variable?y. In other words, “everyone prefers a certain person who is not necessarily the same person”.
In the previous section, I mentioned that Prolog sacrifices expressive power in order to increase efficiency. In this article, I will discuss the limitations of expressiveness in predicate logic.
It is necessary to honor the fact that lions, tigers, and bears are species of animals. In predicate logic and Prolog, the following implications are applied to each case.
(<- (animal ?x)(lion ?x))
(<- (animal ?x)(tiger ?x))
(<- (animal ?x)(bear ?x))
Using these implications, we can prove that any known lion, tiger, or bear is an animal. However, it does not answer the question, “What kinds of animals are out there? cannot be answered. It is not difficult to think of extensions to Prolog that would answer the following questions.
(?- (<- (animal ?x) ?proposition))
However, this extension is not legitimate for Prolog, or even for first-order predicate logic (FOPC): in FOPC, variables must represent constants in the language and must not represent relations or propositions. Higher-order predicate logic does not have this restriction, but it complicates the proof argument.
In the above question, it is not obvious what the value of ?proposition should be. It is certain that (lion ?x) is a valid answer, but there are an infinite number of other propositions such as (animal ?x), (or (tiger ?x)(bear ?x)), etc. Perhaps we should ask two types of questions: questions about “kinds” and questions about propositions.
It is also possible that we need to ask questions about relations; just as it is useful to declare the type of a parameter in a Lisp function, it would be useful to declare the type of a parameter in a relation and ask about its type later. For example, we can mention that the relation likes holds the relation between person and object.
In general, sentences in predicate logic that use relations or sentences as terms are called high-order sentences. If we allow the use of higher-order expressions, a subtle problem surfaces. In other words, is the sentence “This minute is false” true or false? is true or false?
Predicate logic is defined in a world composed of solids, using the properties of individuals and the relations between individuals. Therefore, predicate logic is well suited to the model of finding individuals and classifying them, for example, a person here, a building there, and a sidewalk between them. However, it is not clear how to define them as individuals when one considers the continuous real world of water, which is composed of indefinite components, such as water as a whole, but some of which evaporates and mixes with the air, and some of which rises and forms clouds. On the other hand, Patrick Hayes has shown that predicate logic can successfully describe this kind of situation if the right choices are made.
There is a further difficulty when defining categories. Predicate logic works well for categories that are mathematically distinct. For example, “x is a polygon with three sides if and only if x is a polygon with three sides. Unfortunately, most of the categories that humans use in their daily lives are not so illusory to define. The category “friend” usually refers to someone we can have a favorable feeling for, such as trustworthy. This “definition” is necessary, but not sufficient.
Rather, it is a list of characteristics that are highly correlated with the category “friend”, but whose scope is not clear. A prototype of an ideal friend can be described, but a clear boundary distinguishing “friend” from “acquaintance” cannot be described. Besides, the boundary changes from situation to situation. In other words, a person who can be described as a good friend at work may be a mere acquaintance in family life.
In addition to “for all” and “there exist”, there are predicate logics that allow qualifiers such as “most”. There is also an attempt to define a prototype and measure the distance from it.
コメント