Published by Esa-Pekka Sipilä. Modified over 5 years ago.
1
System Model Acquisition from Requirements Text
SMART: System Model Acquisition from Requirements Text
Technion – Israel Institute of Technology
2
System Model Acquisition from Requirements Text
Operates on free-text documentation, such as business process specifications or user requirements
Results depend critically on the quality of the processed documentation
Based on Object-Process Methodology (OPM), which has two semantically equivalent modalities:
  Textual – Object-Process Language (OPL)
  Graphic – Object-Process Diagram (OPD)
3
System Model Acquisition from Requirements Text
Significantly reduces the quantity of material that needs to be processed manually
Reduces the initial level of conceptual complexity
Graphic manipulation (OPD) is much easier than text editing
Quality, accuracy, and conciseness of the system architecture are higher due to the discipline OPM introduces
Capable of automatic generation of UML diagrams
4
SMART – System Diagram
[Diagram (OPD) of the System Model Acquisition process. Labels: SMART, OPCAT, Categorization Engine, OPL Generator, System Requirements Unstructured Text, System Architecting Team, System Model.]
5
System Model Acquisition In-zoomed
[Diagram (OPD), System Model Acquisition in-zoomed. Labels in order of appearance: SMART, System Requirements Unstructured Text, Category Extraction, Categorization Engine, Category List (states: raw, edited), System Architecting Team, List Editing, Relation Set, Relation Formulating, OPL Generator, OPL Sentence Generating, OPL Sentence Set, OPCAT, OPD Constructing, System Model.]
6
SMART – Procedural Steps
Automatic Extraction of Categories from Unstructured Text
Manual Editing of Categories
Automatic Search of OPM Relations
Automatic Generation of OPL Sentences
Manual Editing of the Results
7
Automatic Extraction of Categories from Unstructured Text
Categorization engine implemented in Common LISP
Categories = idiomatic phrases (word sequences) reflecting the underlying topics in a given corpus of documents
Based on heuristics
Could combine external ontologies, taxonomies, or thesauri
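As a rough illustration of heuristic category extraction, the sketch below collects word sequences that recur across a corpus and that neither start nor end with a stopword. It is written in Python rather than Common LISP, and the stopword list, the frequency threshold, and the function name `extract_categories` are assumptions made for this example, not the heuristics of the actual engine.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this illustration.
STOPWORDS = {"a", "an", "the", "of", "in", "into", "to", "is", "are", "and", "by", "for"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def extract_categories(documents, max_len=3, min_count=2):
    """Return word sequences of 2..max_len words that occur at least min_count times.

    Heuristic (assumed): a candidate must not start or end with a stopword.
    """
    counts = Counter()
    for doc in documents:
        # Work sentence by sentence so phrases never span a sentence boundary.
        for sentence in re.split(r"[.!?]+", doc):
            tokens = tokenize(sentence)
            for n in range(2, max_len + 1):
                for i in range(len(tokens) - n + 1):
                    gram = tokens[i:i + n]
                    if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                        continue
                    counts[" ".join(gram)] += 1
    return sorted(p for p, c in counts.items() if c >= min_count)

corpus = [
    "Actual documents are cached into document repositories.",
    "Document repositories store actual documents for retrieval.",
]
categories = extract_categories(corpus)
print(categories)
```

On this two-sentence corpus only "actual documents" and "document repositories" occur twice, so they are the only candidates returned. The result corresponds to the raw category list, which still needs manual editing.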
8
Manual Editing of Categories
Selection of categories that can serve as things in the OPM model, and classifying them as either objects or processes
Clustering of alternative formulations for the selected OPM things based on their semantic similarity
Optionally adding OPM things that did not show up among the extracted categories
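One possible way to record the outcome of this editing step is a small data model in which each OPM thing carries its kind and the alternative formulations clustered under it. The class, field, and function names below are hypothetical, and the grouping is supplied by the editor rather than computed.

```python
from dataclasses import dataclass, field
from enum import Enum

class Kind(Enum):
    OBJECT = "object"
    PROCESS = "process"

@dataclass
class Thing:
    """An OPM thing selected by the architect (hypothetical structure)."""
    name: str
    kind: Kind
    aliases: set = field(default_factory=set)  # alternative formulations

def build_alias_index(things):
    """Map every formulation (canonical name or alias) to its canonical thing name."""
    index = {}
    for thing in things:
        for form in {thing.name, *thing.aliases}:
            index[form.lower()] = thing.name
    return index

edited_list = [
    Thing("Document Repository", Kind.OBJECT, {"document repositories", "repository"}),
    Thing("Actual Document", Kind.OBJECT, {"actual documents"}),
    # A thing added manually because it was not among the extracted categories.
    Thing("Caching", Kind.PROCESS),
]
alias_index = build_alias_index(edited_list)
print(alias_index["actual documents"])
```

Keeping the aliases next to the canonical name lets a later relation search treat every formulation of a thing as the same thing.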
9
Automatic Search of OPM Relations
Utilizes a set of configurable, predefined templates:
  A template consists of two things and the relation between them, expressed in alternative ways
Utilizes second-order regular expressions defined on any lexical or grammatical attribute (part-of-speech, capitalization, punctuation)
Finite-state automaton that operates on a suffix-tree index consisting of tokens
Compares word sequences instead of character strings
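The idea of matching templates against word sequences rather than character strings can be sketched as follows. This is a simplified linear scan over word tokens. It does not reproduce the suffix-tree index, the finite-state automaton, or matching on grammatical attributes, and the function names and the template layout (a relation name mapped to its alternative expressions) are assumptions for illustration.

```python
def tokenize(text):
    """Split into lowercase word tokens; surrounding punctuation is dropped."""
    return [w.strip(".,;:()").lower() for w in text.split() if w.strip(".,;:()")]

def match_at(tokens, pos, phrases):
    """Return (phrase, end) for the longest phrase whose words start at pos, else None."""
    best = None
    for phrase in phrases:
        words = phrase.lower().split()
        if tokens[pos:pos + len(words)] == words:
            if best is None or len(words) > best[1] - pos:
                best = (phrase, pos + len(words))
    return best

def find_relations(text, things, templates):
    """Find (thing, relation, thing) triples in text.

    templates maps a relation name to the alternative word sequences
    that may express it between two things.
    """
    tokens = tokenize(text)
    found = []
    for i in range(len(tokens)):
        left = match_at(tokens, i, things)
        if left is None:
            continue
        for relation, expressions in templates.items():
            middle = match_at(tokens, left[1], expressions)
            if middle is None:
                continue
            right = match_at(tokens, middle[1], things)
            if right is not None:
                found.append((left[0], relation, right[0]))
    return found

things = ["Actual Documents", "Document Repositories"]
templates = {"cached into": ["are cached into", "cached into", "get cached in"]}
sentence = "Actual documents are cached into document repositories."
relations = find_relations(sentence, things, templates)
print(relations)
```

Because the comparison is made token by token, differences in capitalization or spacing inside a phrase do not affect the match.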
10
Automatic Generation of OPL Sentences
Every extracted natural language sentence is translated straightforwardly into OPL
Reformulation of the outcome to better reflect the underlying relations:
  Custom relations are transformed into processes (cached into => Caching)
  Complex relations are transformed into two equivalent simple sentences (Actual Documents Cached into Document Repositories => (1) Caching requires Actual Documents, (2) Caching yields Document Repositories)
Transformations do not modify the underlying semantics of the NL sentences
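The reformulation in the Caching example can be sketched in a few lines. The gerund rule below is a naive heuristic chosen only to reproduce that example, and the function names are hypothetical. The actual OPL Generator may work quite differently.

```python
def to_process_name(relation):
    """Naively turn a past-participle relation such as 'cached into' into a
    gerund process name such as 'Caching' (illustrative heuristic only)."""
    verb = relation.split()[0].lower()
    stem = verb[:-2] if verb.endswith("ed") else verb
    return (stem + "ing").capitalize()

def to_opl(source, relation, target):
    """Split one custom relation into two simple OPL-style sentences."""
    process = to_process_name(relation)
    return [f"{process} requires {source}.", f"{process} yields {target}."]

sentences = to_opl("Actual Documents", "cached into", "Document Repositories")
for s in sentences:
    print(s)
```

The two generated sentences express the single original relation through an explicit process, with one sentence for its input and one for its output.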
11
Manual Editing of the Results
Non-semantic corrections – extraction did not depict all of the existing or implied relations
Additions and eliminations – semantically modify the original output
Scaling applied to simplify results without losing details
12
Benefits
Significant cut-down in time and resources
Minimizes efforts
Focus on the system overview ("big picture")
High-quality results
Minimizes time-to-market
13
Future Research Directions
Tested on EEC IST GRACE (Grid Retrieval and Categorization Engine)
To be utilized for system design in EEC IST COCOON (Building Knowledge-driven and Dynamically Networked Communities within European Healthcare Systems)
Looking for a commercial pilot application