DENDRAL: A Case Study
Lecture 25
DENDRAL: Introduction
DENDRAL (for DENDRitic ALgorithm) was an early intelligent system for science informatics. DENDRAL’s primary task was to determine a chemical’s structure from its mass spectrum. A mass spectrometer uses electrons to break a chemical into fragments. Given a histogram of fragment abundance, DENDRAL identified the original structure. Although never widely used, DENDRAL did lead to results publishable in the chemistry literature.
Using DENDRAL
Users interacted with DENDRAL through a teletype, which limited the interface and required them to learn a formal language. The ability to view chemical structures on a teletype was unique. Users entered problem-specific information, and DENDRAL returned an array of potential structures.
Source: Lindsay et al. (1993), AI Journal.
Knowledge in DENDRAL
At its outset, DENDRAL was unique in incorporating domain-specific knowledge. Researchers involved with DENDRAL espoused the knowledge principle: a system exhibits a high level of intelligence primarily because of the knowledge that it can bring to bear. Following this principle, the system was given knowledge about chemical structures and mass spectrometry. This knowledge constrained the structures that DENDRAL would consider, eliminating many that were implausible. Domain knowledge also informed the evaluation of the plausible structures that remained.
Plan – Generate – Test
The generate-and-test approach is a common strategy in discovery systems. A system following this strategy has a generator that suggests solutions and a tester that evaluates them. DENDRAL extended this approach by automatically adapting the generator to the particular problem at hand. The plan-generate-test strategy includes an initial problem assessment that produces situational constraints. These constraints limit the generator and substantially reduce the number of structures considered. This approach effectively creates a situation-specific solution generator for each problem it encounters.
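The control flow of plan-generate-test can be sketched with a deliberately non-chemical toy: find ordered triples of "parts" that sum to an observed total. This is only an illustration of the strategy, not DENDRAL's code; the function names (plan, generate, passes_test, solve) and the planning rule are assumptions made for the example.

```python
from itertools import product

def plan(target):
    # Hypothetical planning rule: no single part may exceed the observed total.
    return lambda part: part <= target

def generate(parts, size, allowed):
    # The generator only combines parts that satisfy the planned constraint.
    usable = [p for p in parts if allowed(p)]
    return product(usable, repeat=size)

def passes_test(candidate, target):
    # The tester checks a candidate against the observation.
    return sum(candidate) == target

def solve(parts, size, target, use_plan=True):
    """Return (solutions, number of candidates the generator produced)."""
    allowed = plan(target) if use_plan else (lambda part: True)
    considered = 0
    solutions = []
    for candidate in generate(parts, size, allowed):
        considered += 1
        if passes_test(candidate, target):
            solutions.append(candidate)
    return solutions, considered

planned, n_planned = solve([1, 2, 5, 10, 20], 3, 8)
blind, n_blind = solve([1, 2, 5, 10, 20], 3, 8, use_plan=False)
print(n_planned, n_blind, planned == blind)
```

In this toy, planning shrinks the search from 125 candidates to 27 without losing any solution, which is the effect the slide attributes to situational constraints.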
CONGEN
CONGEN, for CONstrained GENeration, produced the candidate chemical structures. To generate these structures, CONGEN required a chemical formula, such as C12H14O; a set of superatoms that define partial molecular structures, such as the methyl group CH3; and constraints on how the atoms and superatoms may be assembled into a structure. CONGEN produced an exhaustive and nonredundant set of chemical structures. DENDRAL’s developers saw this as a necessary condition for chemists to trust the system’s output. Of course, humans are not so systematic.
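To make "exhaustive and nonredundant" concrete, here is a minimal sketch that lists every multiset of superatoms using exactly the atoms of a formula. It is not CONGEN's algorithm: CONGEN assembled bonded structures, while this toy only decides which building blocks are used and ignores bonding entirely. The function name, the dictionary encoding, the superatom set, and the small formula C2H6O are all assumptions chosen for illustration.

```python
def enumerate_compositions(formula, superatoms):
    """Yield every multiset of superatoms that consumes the formula exactly.

    Blocks are always picked in non-decreasing name order, so each multiset
    is produced once (nonredundant), and every feasible pick is explored
    (exhaustive).
    """
    names = sorted(superatoms)

    def extend(start, remaining, chosen):
        if all(count == 0 for count in remaining.values()):
            yield tuple(chosen)
            return
        for i in range(start, len(names)):
            atoms = superatoms[names[i]]
            # Use a block only if enough atoms of each element remain.
            if all(remaining.get(el, 0) >= k for el, k in atoms.items()):
                left = dict(remaining)
                for el, k in atoms.items():
                    left[el] -= k
                yield from extend(i, left, chosen + [names[i]])

    yield from extend(0, dict(formula), [])

# Hypothetical inputs: the formula C2H6O and four small building blocks.
formula = {"C": 2, "H": 6, "O": 1}
superatoms = {
    "CH3": {"C": 1, "H": 3},
    "CH2": {"C": 1, "H": 2},
    "OH": {"O": 1, "H": 1},
    "O": {"O": 1},
}
for composition in enumerate_compositions(formula, superatoms):
    print(composition)
```

The two compositions found correspond to the building blocks of ethanol and of dimethyl ether. Ordering the choices is one simple way to avoid duplicates; a real structure generator would also have to recognize bonded structures that differ only by symmetry.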
PLANNER
PLANNER automated the specification of constraints for working with a particular mass spectrum. To produce the constraints, PLANNER required a structure common to a class of chemical compounds; descriptions of potential fragmentations, including the bonds they break and any side effects they have; and a mass spectrum. PLANNER’s constraints stated the atoms contained in the substructures attached to the class-level structure. Plausible substructures were placed on a GOODLIST, and implausible ones were placed on a BADLIST to be avoided.
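The GOODLIST and BADLIST idea can be illustrated with a small sketch. The plausibility rule here is an assumption invented for the example, not PLANNER's actual inference: a substructure counts as plausible if losing it from the whole molecule would leave a fragment whose mass appears in the spectrum with enough abundance. The function name, the threshold, and the spectrum values are likewise made up.

```python
def plan_substructures(molecular_mass, candidates, spectrum, min_abundance=0.1):
    """Split candidate substructures into a GOODLIST and a BADLIST.

    candidates maps a substructure name to its nominal mass;
    spectrum maps a fragment mass to a relative abundance.
    """
    goodlist, badlist = [], []
    for name, mass in candidates.items():
        # Hypothetical rule: look for the peak left after losing this piece.
        abundance = spectrum.get(molecular_mass - mass, 0.0)
        if abundance >= min_abundance:
            goodlist.append(name)
        else:
            badlist.append(name)
    return goodlist, badlist

# Invented data for a molecule of nominal mass 46.
spectrum = {29: 0.2, 31: 1.0, 45: 0.5, 46: 0.2}
candidates = {"CH3": 15, "OH": 17, "C2H5": 29}
goodlist, badlist = plan_substructures(46, candidates, spectrum)
print("GOODLIST:", goodlist)
print("BADLIST:", badlist)
```

A generator could then be restricted to structures that contain GOODLIST entries and avoid BADLIST entries, which is how planning output feeds the generation step.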
PREDICTOR
PREDICTOR tested CONGEN’s structures by comparing simulated fragmentations to the mass spectrum. To evaluate a candidate structure, PREDICTOR required a structure in the form used by CONGEN; a set of production rules that simulate the fragmentation processes that occur within a mass spectrometer; and a mass spectrum. When multiple structures could explain the data, they were ranked according to a user-provided scoring measure.
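The test step can be sketched as follows, under strong simplifying assumptions. Candidates are linear chains of groups, a single toy production rule breaks each bond in the chain, and the scoring measure (a Jaccard overlap between predicted and observed peak masses) is just one example of what a user might supply. None of these names or rules come from PREDICTOR itself.

```python
# Nominal masses of the groups used in this toy example.
MASS = {"CH3": 15, "CH2": 14, "OH": 17, "O": 16}

def predict_peaks(chain):
    """Toy production rule: break each bond of a linear chain in turn."""
    peaks = set()
    for cut in range(1, len(chain)):
        peaks.add(sum(MASS[g] for g in chain[:cut]))
        peaks.add(sum(MASS[g] for g in chain[cut:]))
    return peaks

def jaccard_score(chain, observed):
    """Example scoring measure: overlap of predicted and observed peaks."""
    predicted = predict_peaks(chain)
    union = predicted | observed
    return len(predicted & observed) / len(union) if union else 0.0

def rank(candidates, observed, score=jaccard_score):
    """Order candidates from best to worst under the supplied measure."""
    return sorted(candidates, key=lambda c: score(c, observed), reverse=True)

ethanol = ("CH3", "CH2", "OH")
dimethyl_ether = ("CH3", "O", "CH3")
observed = {15, 29, 31}  # invented peak masses
for chain in rank([dimethyl_ether, ethanol], observed):
    print(chain, round(jaccard_score(chain, observed), 2))
```

Passing the scoring function as a parameter mirrors the slide's point that the ranking measure is supplied by the user rather than fixed by the system.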
Why Wasn’t DENDRAL Commonly Used?
Chemists were unaware of the program.
Chemists didn’t want to invest the time to learn it.
Exhaustive generation was not seen as essential to the structure elucidation problem.
The niche that DENDRAL filled wasn’t considered important enough to warrant use of the system.
DENDRAL was not cost-effective for single individuals.
Some chemists held attitudes such as “Machines can’t think; that’s my job.”
DENDRAL’s pieces were easier to market than the whole system.
Edited and abridged conjectures from the Lindsay et al. (1993) article in AI Journal.
Science Informatics Lessons from DENDRAL
An interactive user interface is not merely a nicety but is essential.
Providing assistance to problem solvers is a more realistic goal than doing their jobs for them.
Computer assistants should maintain records just as a human assistant would.
A uniform knowledge representation eases user interaction and program development.
Users must understand the scope of problems a system can solve and the limitations in its abilities.
Making a problem’s assumptions and initial conditions explicit helps users understand the results.
Edited and abridged lessons from the Lindsay et al. (1993) article in AI Journal.
DENDRAL: Summary
DENDRAL used heuristic search to address a challenging task of scientific discovery. The system relied heavily on knowledge, and its underlying formalism affected its development and usability. As an informatics tool, DENDRAL resembles BLAST in two ways: its developers made it publicly available to researchers, and it addressed a specific need of its users, in this case chemists. DENDRAL’s case differs from BLAST in three ways: the speed and accessibility of the internet were limited; the system was difficult to port to other computing environments; and it was not integrated with other useful tools.