Knowledge Engineering 06th November 2006 Dr Bogdan L. Vrusias b.vrusias@surrey.ac.uk
Contents
Definitions
Basic Process of Knowledge Engineering
Case Studies
06th November 2006 Bogdan L. Vrusias © 2006
Definition
Davis’ law: “For every tool there is a task perfectly suited to it”. But… It would be too optimistic to assume that for every task there is a tool perfectly suited to it.
The process of building intelligent knowledge-based systems is called knowledge engineering.
Process of Knowledge Engineering
Phase 1: Problem assessment
Determine the problem’s characteristics.
Identify the main participants in the project.
Specify the project’s objectives.
Determine the resources needed for building the system.
Phase 2: Data and Knowledge Acquisition
Collect and analyse data and knowledge.
Make key concepts of the system design more explicit.
Deal with the issues of incompatible data, inconsistent data and missing data.
Phase 3: Development of a Prototype System
Choose a tool for building an intelligent system.
Transform data and represent knowledge.
Design and implement a prototype system.
Test the prototype with test cases.
What is a prototype?
A prototype system is defined as a small version of the final system. It is designed to test how well we understand the problem – to make sure that the problem-solving strategy, the tool selected for building a system, and techniques for representing acquired data and knowledge are adequate to the task.
It also provides us with an opportunity to persuade the sceptics and, in many cases, to actively engage the domain expert in the system’s development.
What is a test case?
A test case is a problem successfully solved in the past for which input data and an output solution are known. During testing, the system is presented with the same input data and its solution is compared with the original solution.
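The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name run_test_cases, the toy prototype and the three cases are hypothetical and do not come from the lecture.

```python
from typing import Callable, Dict, List, Tuple

# A test case pairs known input data with the solution recorded in the past.
TestCase = Tuple[Dict[str, str], str]


def run_test_cases(system: Callable[[Dict[str, str]], str],
                   cases: List[TestCase]) -> float:
    """Return the fraction of cases where the system reproduces the known solution."""
    if not cases:
        raise ValueError("at least one test case is required")
    passed = sum(1 for inputs, expected in cases if system(inputs) == expected)
    return passed / len(cases)


# Hypothetical prototype, invented purely for illustration.
def toy_prototype(inputs: Dict[str, str]) -> str:
    return "replace fuse" if inputs.get("power_light") == "off" else "call service"


# Hypothetical test cases: (input data, solution known from the past).
toy_cases: List[TestCase] = [
    ({"power_light": "off"}, "replace fuse"),
    ({"power_light": "on"}, "call service"),
    ({"power_light": "on"}, "replace fuse"),  # a case the prototype gets wrong
]

success_rate = run_test_cases(toy_prototype, toy_cases)
```

Here the prototype reproduces two of the three recorded solutions, so success_rate is about 0.67. A real project would compare such a figure with the agreed performance criteria.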
Phase 4: Development of a Complete System
Prepare a detailed design for a full-scale system.
Collect additional data and knowledge.
Develop the user interface.
Implement the complete system.
Phase 5: Evaluation and Revision of the System
Evaluate the system against the performance criteria.
Revise the system as necessary.
To evaluate an intelligent system is, in fact, to assure that the system performs the intended task to the user’s satisfaction. A formal evaluation of the system is normally accomplished with the test cases. The system’s performance is compared against the performance criteria that were agreed upon at the end of the prototyping phase.
Phase 6: Integration and Maintenance
Make arrangements for technology transfer.
Establish an effective maintenance program.
Will an Expert System Work for my Problem?
The Phone Call Rule: “Any problem that can be solved by your in-house expert in a 10-30 minute phone call can be developed as an expert system”.
Case Study 1: Diagnostic Expert System
Diagnostic expert systems are relatively easy to develop. Most diagnostic problems:
have a finite list of possible solutions,
involve a rather limited amount of well-formalised knowledge, and
often take a human expert a short time (say, an hour) to solve.
Choosing an Expert System Development Tool
Tools range from high-level programming languages such as LISP, PROLOG, OPS, C and Java, to expert system shells.
High-level programming languages offer greater flexibility, but they require high-level programming skills.
Shells provide us with the built-in inference engine, explanation facilities and the user interface. We do not need any programming skills to use a shell – we enter rules in English in the shell’s knowledge base.
Choosing an Expert System Shell
When selecting an expert system shell, we consider:
how the shell represents knowledge (rules or frames);
what inference mechanism it uses (forward or backward chaining);
whether the shell supports inexact reasoning and if so what technique it uses (Bayesian reasoning, certainty factors or fuzzy logic);
whether the shell has an “open” architecture allowing access to external data files and programs;
how the user will interact with the expert system (graphical user interface, hypertext).
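To make one of the criteria above more concrete, here is a minimal sketch of forward chaining over IF-THEN rules. It is not the mechanism of any particular shell. The rule format, the function name forward_chain and the example rules and facts are assumptions made for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (conditions, conclusion): IF all conditions hold THEN conclusion.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(rules: List[Rule], facts: Set[str]) -> Set[str]:
    """Fire rules repeatedly, starting from the known facts, until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical diagnostic rules, invented for illustration only.
example_rules: List[Rule] = [
    (frozenset({"engine_cranks", "no_spark"}), "ignition_fault"),
    (frozenset({"ignition_fault"}), "check_spark_plugs"),
]

derived = forward_chain(example_rules, {"engine_cranks", "no_spark"})
```

Forward chaining is data-driven: it starts from the facts and derives whatever follows. Backward chaining would instead start from a goal such as "check_spark_plugs" and look for rules that support it.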
Case study 2: Classification Expert System
Classification problems can be handled well by both expert systems and neural networks.
As an example, we will build an expert system to identify different classes of sail boats. We start with collecting some information about mast structures and sail plans of different sailing vessels. Each boat can be uniquely identified by its sail plans.
Classification and Certainty Factors
Although solving real-world classification problems often involves inexact and incomplete data, we can still use the expert system approach. However, we need to deal with uncertainties.
The certainty factors theory can manage incrementally acquired evidence, as well as information with different degrees of belief.
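The slides do not give a combination formula. As an assumption, the sketch below uses the combination rule commonly associated with MYCIN-style certainty factors, which shows how evidence acquired incrementally can be folded into a single degree of belief. The function name combine_cf is hypothetical.

```python
def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two certainty factors in [-1, 1] that bear on the same hypothesis."""
    if not (-1.0 <= cf1 <= 1.0 and -1.0 <= cf2 <= 1.0):
        raise ValueError("certainty factors must lie in [-1, 1]")
    if cf1 >= 0 and cf2 >= 0:
        # Two pieces of supporting evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two pieces of opposing evidence reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence partly cancels out.
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        raise ValueError("cannot combine total belief with total disbelief")
    return (cf1 + cf2) / denominator


supporting = combine_cf(0.8, 0.6)    # both rules support the hypothesis
conflicting = combine_cf(0.8, -0.6)  # one rule supports it, one opposes it
```

Under this rule, two supporting factors of 0.8 and 0.6 combine to 0.92, while 0.8 and -0.6 combine to 0.5. Evidence can therefore be added one piece at a time, in any order.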
Will a Fuzzy Expert System Work for my Problem?
If you cannot define a set of exact rules for each possible situation, then use fuzzy logic.
While certainty factors and Bayesian probabilities are concerned with the imprecision associated with the outcome of a well-defined event, fuzzy logic concentrates on the imprecision of the event itself.
Inherently imprecise properties of the problem make it a good candidate for fuzzy technology.
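As a rough illustration of an imprecise property, the sketch below contrasts a crisp (yes/no) set with a fuzzy set defined by a triangular membership function. The triangular shape, the function names and the temperature values are assumptions chosen for the example and are not taken from the lecture.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership (0 to 1) of x in a triangular fuzzy set."""
    if not left < peak < right:
        raise ValueError("expected left < peak < right")
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def crisp(x: float, left: float, right: float) -> bool:
    """Crisp counterpart: x either belongs to the interval or it does not."""
    return left < x < right


# Hypothetical fuzzy set "warm", in degrees Celsius.
fully_warm = triangular(20.0, 15.0, 20.0, 25.0)
partly_warm = triangular(17.5, 15.0, 20.0, 25.0)
not_warm = triangular(30.0, 15.0, 20.0, 25.0)
```

With these values, 20.0 belongs to "warm" with degree 1.0, 17.5 with degree 0.5 and 30.0 with degree 0.0. The crisp test can only answer True or False.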
Case study 3: Decision-support Fuzzy Systems
Although most fuzzy technology applications are still reported in control and engineering, an even larger potential exists in business and finance. Decisions in these areas are often based on human intuition, common sense and experience, rather than on the availability and precision of data.
Fuzzy technology provides us with a means of coping with the “soft criteria” and “fuzzy data” that are often used in business and finance.
Case study 3: Decision-support Fuzzy Systems
Mortgage application assessment is a typical problem to which decision-support fuzzy systems can be successfully applied.
Assessment of a mortgage application is normally based on evaluating the market value and location of the house, the applicant’s assets and income, and the repayment plan, which is decided by the applicant’s income and the bank’s interest charges.
Will a Neural Network Work for my Problem?
Neural networks represent a class of very powerful, general-purpose tools that have been successfully applied to prediction, classification and clustering problems.
They are used in a variety of areas, from speech and character recognition to detecting fraudulent transactions, from medical diagnosis of heart attacks to process control and robotics, from predicting foreign exchange rates to detecting and identifying radar targets.
Case study 4: Character Recognition Neural Networks
Recognition of both printed and handwritten characters is a typical domain where neural networks have been successfully applied. Optical character recognition systems were among the first commercial applications of neural networks.
A multilayer feedforward network could work well for a character recognition system.
For simplicity, we can limit our task to the recognition of digits from 0 to 9. Each digit is represented by a 5 x 9 bit map. In commercial applications, where a better resolution is required, at least 16 x 16 bit maps are used.
How do we choose the architecture of a neural network?
The number of neurons in the input layer is decided by the number of pixels in the bit map. The bit map in our example consists of 45 pixels, and thus we need 45 input neurons.
The output layer has 10 neurons – one neuron for each digit to be recognised.
Complex patterns cannot be detected by a small number of hidden neurons; however, too many of them can dramatically increase the computational burden.
Another problem is overfitting. The greater the number of hidden neurons, the greater the ability of the network to recognise existing patterns. However, if the number of hidden neurons is too big, the network might simply memorise all training examples.
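The layer sizes above can be sketched as a small feedforward pass in plain Python. Only the 45 inputs and 10 outputs come from the slides. The number of hidden neurons, the sigmoid activation, the initial weight range and all function names are assumptions made for illustration, and the network here is untrained.

```python
import math
import random
from typing import List

N_INPUTS = 45   # one input neuron per pixel of the 5 x 9 bit map
N_OUTPUTS = 10  # one output neuron per digit, 0 to 9

Layer = List[List[float]]  # one row per neuron: its input weights, then a bias


def make_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create a layer with small random weights (the range is an assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_output(layer: Layer, inputs: List[float]) -> List[float]:
    """Weighted sum plus bias for each neuron, passed through a sigmoid."""
    outputs = []
    for neuron in layer:
        total = neuron[-1] + sum(w * x for w, x in zip(neuron[:-1], inputs))
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def forward(hidden: Layer, output: Layer, pixels: List[float]) -> List[float]:
    """Propagate one bit map through the hidden layer and then the output layer."""
    if len(pixels) != N_INPUTS:
        raise ValueError("expected one value per pixel")
    return layer_output(output, layer_output(hidden, pixels))


def count_parameters(n_hidden: int) -> int:
    """Weights and biases in a 45-n_hidden-10 network."""
    return (N_INPUTS + 1) * n_hidden + (n_hidden + 1) * N_OUTPUTS


rng = random.Random(0)
n_hidden = 5  # hypothetical choice; the slides do not fix this number
hidden_layer = make_layer(N_INPUTS, n_hidden, rng)
output_layer = make_layer(n_hidden, N_OUTPUTS, rng)
activations = forward(hidden_layer, output_layer, [0.0] * N_INPUTS)
```

count_parameters gives a feel for the computational burden: 5 hidden neurons mean 290 weights and biases, while 20 hidden neurons mean 1130.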
Character Recognition Neural Network
Testing the character recognition system
A test set has to be strictly independent from the training examples.
To test the character recognition network, we present it with examples that include “noise” – the distortion of the input patterns.
We evaluate the performance of the printed digit recognition networks with 1000 test examples (100 for each digit to be recognised).
Can we improve the performance of the character recognition neural network?
A neural network is as good as the examples used to train it.
Therefore, we can attempt to improve digit recognition by feeding the network with “noisy” examples of digits from 0 to 9.
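One simple way to produce noisy examples is to flip each pixel of a bit map independently with some probability. The slides do not specify the noise model, so this flipping scheme, the function name add_noise and the sample bit map are assumptions.

```python
import random
from typing import List


def add_noise(bitmap: List[int], flip_probability: float,
              rng: random.Random) -> List[int]:
    """Return a copy of a 0/1 bit map with each pixel flipped with the given probability."""
    if not 0.0 <= flip_probability <= 1.0:
        raise ValueError("flip_probability must be between 0 and 1")
    return [1 - pixel if rng.random() < flip_probability else pixel
            for pixel in bitmap]


# Hypothetical 45-pixel bit map (5 x 9), not an actual digit from the lecture.
clean_bitmap = [1, 0, 1, 0, 1] * 9
rng = random.Random(42)
noisy_bitmap = add_noise(clean_bitmap, 0.1, rng)
```

Several noisy copies of each digit could then be added to the training set, while the test set is still generated separately.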
Case study 5: Prediction Neural Networks
As an example, we consider a problem of predicting the market value of a given house based on the knowledge of the sales prices of similar houses.
In this problem, the inputs (the house location, living area, number of bedrooms, number of bathrooms, land size, type of heating system, etc.) are well-defined, and even standardised for sharing the housing market information between different real estate agencies.
The output is also well-defined – we know what we are trying to predict.
The features of recently sold houses and their sales prices are examples, which we use for training the neural network.
Validating the Results
To validate results, we use a set of examples never seen by the network.
Before training, all the available data are randomly divided into a training set and a test set.
Once the training phase is complete, the network’s ability to generalise is tested against examples of the test set.
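The random division into training and test sets can be sketched as follows. The function name split_data, the test fraction and the placeholder examples are assumptions. Any proportion agreed for the project could be used instead.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_data(examples: Sequence[T], test_fraction: float,
               rng: random.Random) -> Tuple[List[T], List[T]]:
    """Shuffle the examples and return (training set, test set)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder examples: the integers 0 to 9 stand in for real records.
training_set, test_set = split_data(list(range(10)), 0.2, random.Random(7))
```

The split is made before training, so no example in test_set is ever seen by the network while it learns.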
Case study 6: Classification Neural Networks with Competitive Learning
As an example, we will consider an iris plant classification problem.
Suppose we are given a data set with several variables but we have no idea how to separate it into different classes because we cannot find any unique or distinctive features in the data.
Case study 6: Classification Neural Networks with Competitive Learning
Neural networks can discover significant features in input patterns and learn how to separate input data into different classes. A neural network with competitive learning is a suitable tool to accomplish this task.
The competitive learning rule enables a single-layer neural network to combine similar input data into groups or clusters. This process is called clustering. Each cluster is represented by a single output.
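A minimal sketch of the competitive learning rule is given below, assuming the common winner-takes-all form: the output neuron whose weight vector is closest to the input wins, and only its weights move towards that input. The Euclidean distance, the learning rate, the initial weights and the toy two-dimensional data are all assumptions made for illustration.

```python
from typing import List, Sequence


def winner(weights: List[List[float]], x: Sequence[float]) -> int:
    """Index of the output neuron whose weight vector is closest to x."""
    distances = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
    return distances.index(min(distances))


def train_competitive(data: List[Sequence[float]],
                      initial_weights: List[List[float]],
                      learning_rate: float, epochs: int) -> List[List[float]]:
    """Winner-takes-all training: only the winning neuron's weights are updated."""
    weights = [list(w) for w in initial_weights]
    for _ in range(epochs):
        for x in data:
            j = winner(weights, x)
            # Move the winner a fraction of the way towards the input.
            weights[j] = [w_i + learning_rate * (x_i - w_i)
                          for w_i, x_i in zip(weights[j], x)]
    return weights


# Toy data with two obvious groups (hypothetical, not the iris data).
toy_data = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
trained = train_competitive(toy_data, [[0.0, 0.0], [1.0, 1.0]], 0.1, 20)
cluster_low = winner(trained, (0.15, 0.15))
cluster_high = winner(trained, (0.85, 0.85))
```

After training, each output neuron stands for one cluster. In the toy data, points near (0.1, 0.1) activate one neuron and points near (0.9, 0.9) activate the other.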
Case study 6: Classification Neural Networks with Competitive Learning
For this case study, we will use a data set of 150 elements that contains three classes of iris plants – setosa, versicolor and virginica.
Each plant in the data set is represented by four variables: sepal length, sepal width, petal length and petal width. The sepal length ranges between 4.3 and 7.9 cm, sepal width between 2.0 and 4.4 cm, petal length between 1.0 and 6.9 cm, and petal width between 0.1 and 2.5 cm.
Case study 6: Classification Neural Networks with Competitive Learning
The next step is to generate training and test sets from the available data. The 150-element Iris data is randomly divided into a training set of 100 elements and a test set of 50 elements.
Now we can train the competitive neural network to divide input vectors into three classes.
Closing
Questions??? Remarks??? Comments!!! Evaluation!