Enhanced Verbalization of ORM Models

1 Enhanced Verbalization of ORM Models
Matthew Curland (ORM Solutions, USA) and Terry Halpin (INTI International University, Malaysia, and LogicBlox, Australia)

2 Contents
Introduction
Avoiding Ambiguity
Improved Verbalization Patterns
Implementing Verbalizations Involving One Join Path
Future Plans

3 Introduction In an ORM model, facts, constraints, and derivation rules may be expressed naturally in a controlled natural language. This facilitates validation by verbalization, enabling non-technical domain experts to understand the model without understanding the ORM diagram. Supplementing this with population by concrete examples provides an effective way to validate how well the model captures the business domain.

4 Model Validation via Verbalization and Population: a simple example
Uniqueness constraint pattern
+ve form: Each Moon orbits at most one Planet. It is possible that some Planet is orbited by more than one Moon. Illustrated by a satisfying fact population.
-ve form: It is impossible that some Moon orbits more than one Planet. Test with a counter-example.
[Figure: sample population of the fact type Moon orbits Planet, with the entries Phobos, Mars, Deimos, Io, Jupiter; counter-example row: Io, Mars]
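Validation by population can be sketched in a few lines of Python. This is only an illustration, not part of NORMA: the function name `moons_violating_uniqueness` is hypothetical, and the pairing of Deimos with Mars is an assumption, since the transcript lost the layout of the sample population.

```python
from collections import defaultdict


def moons_violating_uniqueness(population):
    """Return the moons that orbit more than one planet in the population."""
    planets_by_moon = defaultdict(set)
    for moon, planet in population:
        planets_by_moon[moon].add(planet)
    return {moon for moon, planets in planets_by_moon.items() if len(planets) > 1}


# Satisfying population (the Deimos/Mars pairing is assumed, see above).
satisfying = [("Phobos", "Mars"), ("Deimos", "Mars"), ("Io", "Jupiter")]

# Adding the counter-example row makes Io orbit two planets.
counter_example = satisfying + [("Io", "Mars")]

print(moons_violating_uniqueness(satisfying))       # set()
print(moons_violating_uniqueness(counter_example))  # {'Io'}
```

The satisfying population also shows that Mars is orbited by more than one Moon, which the +ve form explicitly allows.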

5 Main Features of the NORMA Tool
[Figure: overview of the NORMA tool, with FORML input and FORML output labeled]

6 We now consider some of our recent work to enhance NORMA’s verbalization of ORM models using the output version of FORML. The verbalization patterns are first captured in formal logic to ensure an unambiguous interpretation. The transforms to linguistic form are now more tightly bound to the underlying logical form.

7 Avoiding Ambiguity
Simple UC: ∀p:Person ∃^{0..1}y:Year p was born in y
Each Person was born in at most one Year.
We map ∀ to “each” because it is always distributive, and it never requires its following term to be pluralized. In contrast, “all” and “every” may be used both distributively and collectively, and “all” requires pluralization. e.g. All baby whales weigh more than all baby humans. (True distributively, False collectively)

8 Mandatory role constraint: ∀p:Person ∃l:Language p speaks l
Each Person speaks some Language.
We map ∀ to “each” and ∃ to “some”. Linguists regard the pattern “every-some” as ambiguous, so we avoid it. e.g. “Every Person speaks some Language.” could mean “For each person, that person speaks some language” or “There is some language that every person speaks”.
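The quantifier-to-word mapping used in these two patterns can be sketched as a small lookup table. This is a hedged illustration only: the names `QUANTIFIER_WORDS` and `verbalize_binary`, and the string keys for the quantifiers, are hypothetical and do not reflect how NORMA stores its patterns.

```python
# Hypothetical keys for the quantifiers; the words are those used in the slides.
QUANTIFIER_WORDS = {
    "forall": "each",              # always distributive, no pluralization needed
    "exists": "some",
    "at_most_one": "at most one",
}


def verbalize_binary(subject_type, subject_q, predicate, object_type, object_q):
    """Verbalize a constraint on a binary fact type whose reading starts at the subject."""
    subject_word = QUANTIFIER_WORDS[subject_q].capitalize()
    object_word = QUANTIFIER_WORDS[object_q]
    return f"{subject_word} {subject_type} {predicate} {object_word} {object_type}."


print(verbalize_binary("Person", "forall", "was born in", "Year", "at_most_one"))
print(verbalize_binary("Person", "forall", "speaks", "Language", "exists"))
```

Keeping “every” and “all” out of the table is what rules out the ambiguous “every-some” pattern.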

9 ORM allows mixfix predicates, with front text or trailing text
The presence of front text often requires a different verbalization pattern to avoid ambiguity. e.g. the following verbalization of the simple UC above could be misunderstood to mean that everyone was born in the same year:
The birth of each Person occurred in at most one Year.
Instead, we use a verbalization closer to the logical formalization. Unlike most other controlled natural languages (e.g. CLCE), we verbalize correlations using pronouns, demonstratives, or typed variables rather than untyped variables.
∀p:Person ∃^{0..1}y:Year the birth of p occurred in y
For each Person, the birth of that Person occurred in at most one Year.

10 Our verbalization patterns often depend on whether there is a predicate reading that starts from the constrained role. e.g. the following simple verbalization of the mandatory role constraint could be misunderstood to mean that there is some language that everyone speaks:
Some Language is spoken by each Person.
Instead, we use a verbalization closer to the logical formalization:
∀p:Person ∃l:Language l is spoken by p
For each Person, some Language is spoken by that Person.
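A minimal sketch of this choice between patterns follows. It assumes, purely for illustration, that the available predicate readings are given as a dictionary keyed by the object type that starts each reading; the function name `verbalize_mandatory` is hypothetical.

```python
def verbalize_mandatory(constrained_type, other_type, readings):
    """Verbalize a mandatory role constraint on a binary fact type.

    `readings` maps the object type that starts a reading to its predicate text.
    """
    if constrained_type in readings:
        # A reading starts from the constrained role: use the compact pattern.
        return f"Each {constrained_type} {readings[constrained_type]} some {other_type}."
    # Otherwise quantify the constrained type first, then refer back to it
    # with a demonstrative, mirroring the order of the logical formalization.
    predicate = readings[other_type]
    return (f"For each {constrained_type}, some {other_type} "
            f"{predicate} that {constrained_type}.")


both = {"Person": "speaks", "Language": "is spoken by"}
reverse_only = {"Language": "is spoken by"}

print(verbalize_mandatory("Person", "Language", both))
print(verbalize_mandatory("Person", "Language", reverse_only))
```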

11 Previously, we often used “the same” to verbalize some leading existential quantifiers, but this can sometimes result in ambiguous reference. e.g. consider the following verbalization of the m:n nature of the UC:
It is possible that the same Person speaks more than one Language and that more than one Person speaks the same Language.
Our new verbalization matches the logical formalization more closely:
◇(∃p:Person ∃^{1..}l:Language p speaks l & ∃l:Language ∃^{1..}p:Person p speaks l)
It is possible that some Person speaks more than one Language and that for some Language more than one Person speaks that Language.

12 Adverbs (e.g. “rarely”) in predicate readings sometimes need special care to avoid ambiguous verbalizations of constraints, even when the verbalization of fact instances is unambiguous. The Person named ‘Ann Jones’ rarely drives the Car that has CarRegNr ‘ABC123’. This fact instance verbalization is unambiguous. But the UC verbalization Each Person rarely drives at most one Car. could mean For each person, there is at most one car that he/she rarely drives or For each person, it is rare that he/she drives at most one car.

13 Every ORM constraint has an unambiguous logical formalization. In this case, the UC formalizes as
∀p:Person ∃^{0..1}c:Car p rarely drives c
This may now be rendered by the following verbalization in logical form:
For each Person, there is at most one Car such that that Person rarely drives that Car.
Although clear, such logical form verbalizations are typically lengthy and awkward compared with our linguistic form verbalizations. Moreover, the cases where our linguistic form verbalizations are ambiguous are rare in practice. We display our linguistic form by default, but plan to offer the logical form on request (e.g. when a user is unclear about the verbalization’s meaning).
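The default-plus-on-request behaviour described here could look roughly like the following sketch. The function name `verbalize_uc` and the `logical_form` flag are hypothetical; logical form output is described above as planned, not as an existing NORMA option.

```python
def verbalize_uc(subject_type, predicate, object_type, logical_form=False):
    """Verbalize a simple uniqueness constraint in linguistic or logical form."""
    if logical_form:
        # Longer, but the scope of any adverb in the predicate is unambiguous.
        return (f"For each {subject_type}, there is at most one {object_type} "
                f"such that that {subject_type} {predicate} that {object_type}.")
    # Shorter linguistic form, used by default.
    return f"Each {subject_type} {predicate} at most one {object_type}."


print(verbalize_uc("Person", "rarely drives", "Car"))
print(verbalize_uc("Person", "rarely drives", "Car", logical_form=True))
```

In the logical form the adverb “rarely” stays attached to the predicate, so it cannot be read as modifying “at most one Car”.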

14 Improved Verbalization Patterns
ORM’s rich constraint notation and use of unrestricted mixfix predicates lead to a large number of constraint verbalization patterns. We have space here to discuss just two cases:
External uniqueness constraints on simple join paths
Inclusive-or constraints where each role starts a predicate reading with no front text
Other verbalization patterns may be experienced by invoking NORMA’s Verbalization Browser.

15 External UC: Basic alethic pattern
For deontic constraints, prepend “It is obligatory that” (+ve form) or replace “impossible” by “forbidden” (-ve form). For n-ary predicates (n ≥ 3), insert “some” before each object type variable for the extra role(s).
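The alethic-to-deontic rewrite stated above is mechanical enough to sketch as a string transformation. This is an assumption-laden illustration: the function name `to_deontic` is hypothetical, and lowercasing the first letter after the prefix is a choice made here for readability.

```python
def to_deontic(alethic_verbalization):
    """Turn an alethic constraint verbalization into its deontic counterpart."""
    negative_prefix = "It is impossible that "
    if alethic_verbalization.startswith(negative_prefix):
        # -ve form: replace "impossible" by "forbidden".
        return "It is forbidden that " + alethic_verbalization[len(negative_prefix):]
    # +ve form: prepend the obligation operator and lowercase the old first letter.
    first, rest = alethic_verbalization[0], alethic_verbalization[1:]
    return "It is obligatory that " + first.lower() + rest


print(to_deontic("Each Person was born in at most one Year."))
print(to_deontic("It is impossible that some Moon orbits more than one Planet."))
```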

16 E.g. Internal UCs on ternary: For each Concert and Position, that Concert features at most one Performer in that Position. For each Concert and Performer, that Concert features that Performer in at most one Position. External UC: For each Performer and Date, there is at most one Concert such that that Concert features that Performer in some Position and is on that Date.

17 Inclusive-or Constraint: Basic +ve alethic pattern
For deontic constraints, prepend “It is obligatory that”.

18 E.g. Inclusive-or constraint verbalizations: (a) Each Vehicle was purchased from some AutoRetailer or is rented. (b) It is obligatory that each Vehicle was purchased from some AutoRetailer

19 Implementing Uniqueness Verbalizations involving one Join Path
Joined uniqueness constraints needed a better verbalization. Former verbalizations using ‘context’ were clumsy at best, and potentially ambiguous with non-trivial join paths.
Approach:
Critique previous verbalization patterns
Extract known information from the join path to improve verbalization
Algorithmically extract additional information from the path to obtain the cleanest form

20 Previous: Establish Context
This original approach establishes a context by listing fact types associated with constrained roles, then references this context to indicate uniqueness.
Context: Room is in Building; Room has RoomNr. In this context, each Building, RoomNr combination occurs at most once.
Critique:
Role players are only related if there are no ring fact types
Does not extend to join paths
Clearly indicates which role players are unique, but does not indicate which role player(s) this combination identifies.

21 Previous: Establish Context with join Path
This approach modifies the previous context approach by verbalizing the join path instead of listing the raw fact types. Context: some Room is in some Building and has some RoomNr. In this context, each Building, RoomNr combination occurs at most once. Critique: The quantifier ‘some’ is ambiguous. It is intended to be an existential quantifier, but can easily be interpreted as ‘some (but not all)’. This is the raw role path verbalization when no head (aka free) variables are registered with the system. The assumption is then that no variables are in scope and must be existentially quantified.

22 Approach 1: Pre-declare Projected Variables
This approach provides the projected variables as free for the join path verbalization. The term ‘combination’ is added to the projection list so that it can be referenced later, and the verbalization does not start with ‘context’.
For each RoomNr and Building combination, if some Room has that RoomNr and is in that Building then that combination occurs exactly once in this context.
We can use ‘exactly once’ instead of ‘at most once’ because existence (at least once) is already implied by the context.
Critique: Still uses unnatural helper text like ‘combination’ and ‘context’ to produce the verbalization, and variables identified by the uniqueness are not emphasized. This form is correct, and we are using this form with appropriate numeric quantification for frequency constraints.

23 Approach 2: Just Operators
This approach eliminates the extra ‘context’ and ‘combination’ words and is an expanded logical form.
For each Building and RoomNr, there is at most one Room such that that Room is in that Building and has that RoomNr.
No ‘context’ or ‘combination’ keywords. Uses ∀ and ∃^{0..1} operators only.
The implementation challenge here is that the list of numerically quantified items (after ‘at most one’) must be derived from the model. At first glance, this is just the variables not in the projection list. However, this does not hold for all role paths.
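Once the uniquely identified variable is known, the operators-only form can be assembled from the projected types and the path conditions. The sketch below is hypothetical (the function name `verbalize_external_uc` and the list-of-pairs input are assumptions) and handles only a single join through one shared object type.

```python
def verbalize_external_uc(projected_types, unique_type, conditions):
    """Build the 'For each ..., there is at most one ...' form.

    `conditions` is a list of (predicate, projected_type) pairs, each read
    from the uniquely identified type to one projected type.
    """
    head = " and ".join(projected_types)
    body = " and ".join(f"{predicate} that {target}" for predicate, target in conditions)
    return (f"For each {head}, there is at most one {unique_type} "
            f"such that that {unique_type} {body}.")


print(verbalize_external_uc(
    ["Building", "RoomNr"],
    "Room",
    [("is in", "Building"), ("has", "RoomNr")],
))
```

Here `unique_type` is passed in by hand; deriving it from the role path is the harder part noted above.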

24 Extracting Unique Variables
Role path analysis steps:
Walk up the path from nodes projected on constraint roles and remember all path nodes.
Also walk up the path from nodes correlated with passed path nodes.
Traverse back down the path to find the natural order for remembered nodes.
Theory: every path node not covered by this traversal forms a condition for the nodes in this list and does not need to be numerically limited, but can be existentially quantified as it occurs in the role path verbalization.
Example: For each Building and RoomNr, there is at most one Room such that that Room is in that Building that has been certified by some Inspector and that Room has that RoomNr.
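One possible reading of these steps is sketched below. It is an interpretation, not NORMA's implementation: the role path is simplified to a tree given as a child-to-parent dictionary listed in top-down order, correlations are an optional node-to-nodes mapping, and the names `classify_path_nodes`, `parent`, and `correlated` are all hypothetical.

```python
def classify_path_nodes(parent, projected, correlated=None):
    """Split path nodes into numerically limited and existentially quantified.

    parent:     dict mapping each node to its parent (root maps to None),
                inserted in the natural top-down order of the path.
    projected:  nodes projected on the constrained roles.
    correlated: optional dict mapping a node to nodes correlated with it.
    """
    correlated = correlated or {}
    remembered = set()
    pending = list(projected)
    while pending:
        node = pending.pop()
        # Walk up the path, remembering every node passed on the way.
        while node is not None and node not in remembered:
            remembered.add(node)
            # Nodes correlated with a passed node are walked up as well.
            pending.extend(correlated.get(node, ()))
            node = parent[node]
    # Traverse back down (dict order) to list remembered nodes naturally.
    ordered = [node for node in parent if node in remembered]
    unique = [node for node in ordered if node not in projected]
    existential = [node for node in parent if node not in remembered]
    return unique, existential


# Path for the example: Room joins Building and RoomNr; Inspector hangs off Building.
path = {"Room": None, "Building": "Room", "Inspector": "Building", "RoomNr": "Room"}
print(classify_path_nodes(path, ["Building", "RoomNr"]))  # (['Room'], ['Inspector'])
```

For the example, Room is the only remembered node outside the projection, so it receives ‘at most one’, while Inspector is never reached and is verbalized with ‘some’.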

25 Future Plans
Implement logical form verbalizations on request.
Refine the verbalization of other constraints.
Refine the verbalization of derivation rules.
Refine verbalizations with conditional join paths.
Implement verbalizations in languages other than English.
Complete the implementation of FORML 2 as a full input language.
Implement FORML 2 also as a query language.

26 Screenshot of automated verbalization in Bahasa Melayu (Malay) under development by PhD student Lim Shin Huei (Mandarin verbalization planned next)

