How to Avoid Redundant Object-References

1 How to Avoid Redundant Object-References
by Andy Carver © 2008 Andrew H. Carver

2 What’s the Problem?
Like most of you (I hope), I believe that ORM is the foremost information-modeling approach. ORM practitioners generally claim that if the ORM schema is semantically accurate, with elementary fact types, it guarantees a REDUNDANCY-FREE relational schema. There’s only one problem with this: It DOESN’T!!

3 Example from “Reduction Transforms” paper:

4 The solution which the paper offered:

5 The question which this phenomenon raises:
HOW can we achieve or guarantee that a given ORM schema is free, not only of fact-redundancy, but also of any object-referential redundancy?

6 The two possible approaches to doing this:
1. Have a system to fix all schemas after creating them.
   This would require having, and knowing we have, an exhaustive set of reduction transforms.
   There is an indication that we don’t have this, and no indication that we do or will have it.
2. Have a design method that creates only conceptual schemas that will never need a reduction transform.
   That is the type of solution I’m going to present; however, first we really need to ask this...

7 What is this thing that we call “redundancy”?
When we look at a table with data in it, and say it contains “redundancy”, what in the world do we actually mean? Don’t we really mean that it is not actually necessary to store the data, since the computer could be programmed to tell us what the omitted data is, given its context?

8 Then, what modeling rule am I suggesting?
Well, first note that all the reduction transforms we have detected work by correcting a too-generous attribution of “role-playing” to some object or other... Our hypothesis: The proper modeling rule is, basically: attribute roles not to objects referenced by non-information-bearing references, but only to objects referenced by information-bearing ones.

9 What do we mean by “information-bearing”?
When we speak here of “information-bearing” data, we are borrowing a notion, and a basic principle, from “information theory” (a.k.a. “communication theory”): In information theory, the “information”-content of a particular unit of expression is defined as a function of the unit’s probability of occurring. A special case of probability is certainty: a unit which cannot but occur in a given context, has there a probability of 1.

10 The basic principle we borrow is the following:
the information-content of an expression-unit varies inversely with its probability of occurring (in the given context): the more certain the unit, the less information it carries, and a unit with probability 1 carries none. For example, if we disregard certain borrowed words and proper names, the probability of ‘u’ following the letter ‘q’ in written English is 1. If it were decided to omit the u (in queen, quaint, inquest, etc.), no information would be lost.
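As a rough illustration of this principle, the sketch below uses Shannon's standard measure of self-information, log2(1/p) bits, which the slides do not spell out but which is the usual formalization in information theory. The function name is ours, and the probabilities other than 1 are arbitrary sample values.

```python
import math


def self_information(probability: float) -> float:
    """Return the self-information, in bits, of a unit with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return math.log2(1.0 / probability)


# A unit that is certain to occur (like 'u' after 'q' in the slide's example)
# carries no information; less probable units carry more.
certain = self_information(1.0)
coin_flip = self_information(0.5)
rare = self_information(0.125)
print(certain, coin_flip, rare)
```

The relation is logarithmic rather than a strict inverse proportion, but the point the slides rely on holds either way: at probability 1 the information content is zero.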

11 What would be a modeling example of this?
The simplest example of non-information-bearing object reference in conceptual modeling is a fact’s mention of a scope-defining object: e.g., in a model of Neumont University, any fact’s mention of that school itself: “The class CS140 at Neumont University is offered in quarters 1, 2, 3, and 4.”

12 If the second object mentioned be considered a role-player in this fact, the fact type has this as a possible fact-population: The second role here is non-information-bearing, because it can never be played by any object except Neumont University (lest we record a fact that is out of scope).

13 Thus, for such simple cases of object reference, the criterion of “information-bearing” is as follows: An object-reference is information-bearing iff it could have been to a different object of the same type, without changing the predicate text or any of the object types in the fact statement, and the fact still be in scope – a typical fact “of interest”.
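A minimal sketch of how one might screen a sample population for such roles: it flags every role position that is played by one and the same object in all recorded facts. The function name and the sample rows (including the class CS210) are hypothetical. A sample can only suggest candidates, since the criterion is about which facts could be in scope, not merely which ones happen to be recorded, so the modeler still has to confirm each one.

```python
from typing import Sequence


def constant_roles(population: Sequence[tuple]) -> list[int]:
    """Return the positions of roles played by the same object in every fact."""
    if not population:
        return []
    arity = len(population[0])
    return [
        position
        for position in range(arity)
        if len({fact[position] for fact in population}) == 1
    ]


# Hypothetical population of "Class at School is offered in Quarter".
offerings = [
    ("CS140", "Neumont University", 1),
    ("CS140", "Neumont University", 2),
    ("CS210", "Neumont University", 1),
]

# Only the school role (position 1) never varies, so it is a candidate
# non-information-bearing role.
suspects = constant_roles(offerings)
print(suspects)
```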

14 Are there some less-“simple” cases?
Yes indeed - there are much more interesting, subtle, and/or tricky cases of non-information-bearing object terms. For example, natural language supplies many examples of anaphoric object references.

15 Huh? What the heck is an “anaphoric” term?
anaphoric term: one that references an object indirectly, by alluding to some other reference-occurrence in the linguistic (i.e., “discourse-”) context. The most common anaphoric terms, perhaps, are third-person pronouns (e.g. “she”, “he”, “it”, “her”, “his”, “its”; the latter two, of course, are possessive terms, meaning “of him”, “of it”). Others use a demonstrative: “that car”.

16 So, what would be a sample occurrence of a non-information-bearing “anaphoric” term?
Here, the “previous reference” is in another fact-statement recorded in the system. It could, however, have been in the “domain info”.

17 The criterion of “information-bearing” is therefore more complicated for anaphoric terms:
If changing a particular anaphoric term (in a sentence) to refer to a different object would produce a fact of interest, without also redirecting to that object the previous term to which the original anaphoric term was (intermediately) pointing, and without changing any of the object types in the sentence, then the original anaphoric term is information-bearing; otherwise, it is non-information-bearing.

18 Geez -- Are there any other complications??
Yes indeed : there is the complicating factor that intended, yet non-info-bearing references may be implicit (i.e. unstated), and thus left for the reader / hearer to supply. (This whole subject is explored in the field called linguistic pragmatics...)

19 For example, let’s consider a statement of some facts likely intended for our first example’s domain: “The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’.”

20 Note the anomalous, non-unique reference to the “legislative district” (where the scope is the U.S.): does the lack of number-uniqueness keep us from understanding which district is being referenced? “The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’.”

21 No, it doesn’t; here’s how it works (from linguistic pragmatics):
the reader takes the word “the” to indicate an intent to uniquely reference an object; “The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’.”

22 No, it doesn’t; here’s how it works (from linguistic pragmatics):
the reader takes the word “the” to indicate an intent to uniquely reference an object; by “conversational implicature”, the reader thus infers the intended but unstated information: “The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’ (and which is in that address’s state).”

23 Note two important points:
the unstated object-reference here is anaphoric; “The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’ (and which is in that address’s state).”

24 Note two important points:
the unstated object-reference here is anaphoric; object terms intended but unspoken (i.e. “implicit”) are always non-information-bearing – which is why it was safe to leave them unstated. “The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’ (and which is in that address’s state).”
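The inference described above can be sketched as a lookup. In this hypothetical data, a district number is unique only within a state, so the district's state is supplied from the address's own state rather than being stated again. All names and rows here are illustrative assumptions, not taken from the slides.

```python
# Hypothetical reference data: district numbers repeat across states,
# so a district is identified by (state code, district number).
districts = {
    ("AK", 2): "Alaska legislative district 2",
    ("AL", 2): "Alabama legislative district 2",
}

# Facts as stated in the sample sentence: the address's state and the
# district number only. The district's state is never recorded separately.
address_state = {1: "AK"}
address_district_number = {1: 2}


def resolve_district(address_id: int) -> str:
    """Resolve the district by supplying the unstated state from the address."""
    state = address_state[address_id]              # the anaphoric "that address's state"
    number = address_district_number[address_id]   # the stated district number
    return districts[(state, number)]


resolved = resolve_district(1)
print(resolved)
```

Because the state in the district reference can always be recomputed this way, storing it a second time would add no information.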

25 What are the implications for “role-playing”?
In contrast to that last example, we have seen a couple of previous examples in which the non-information-bearing object term is not an element within another one. As we might glean from those previous examples, an object referenced by such a maximal (i.e., non-included), non-information-bearing object term should not be considered a role-player in that fact.

26 But what about that last example,
where the non-information-bearing object term is an element within another object term? Should we, in that sort of situation, allow the object that is referenced by the containing object term, to be considered a role-player in the fact?

27 Isn’t that exactly where the modeling went bad with the first example we looked at?
“The address with the Id ‘1’ is in the state ‘AK’, in the legislative district with the number ‘2’ (and which is in that address’s state).”

28 “The address with the Id ‘1’ is in the state ‘AK’,
in the legislative district with the number ‘2’ (and which is in that address’s state).”

29 So, the moral of this example:
Role-players’ object terms should not only be information-bearing; they should be completely information-bearing.

30 So: How to avoid non-informative roles
1. Expose the referential-containment hierarchy for every object reference in the sample fact.
2. Within this hierarchy, identify all completely information-bearing object terms.
   completely information-bearing object term: an object term which is information-bearing and which, in the object-term hierarchy implicit in its fact-statement, is either a leaf node, or else contains only nodes each of which is a completely information-bearing object term
3. Each maximal such object-term should be considered a reference to a role-playing object.

31 Here’s the same example, doing the steps suggested:
“The address with the id ‘1’ is in the state ‘AK’, in the district with the number 2”
(here, we’ve underlined all the object terms)
“the address with the id ‘1’”
    “the id ‘1’”
“the state with the code ‘AK’”
    “the state-code ‘AK’”
“the district (in that address’s state) with the number ‘2’”
    “the state of that address”
        “the address mentioned already/elsewhere”
    “the district-number ‘2’”
(Now, mark the completely info-bearing terms, with * ; it is necessary to start with the leaf-nodes.)

32 Here’s the same example, doing the steps suggested:
“The address with the id ‘1’ is in the state ‘AK’, in the district with the number 2”
(here, we’ve underlined all the object terms)
* “the address with the id ‘1’”
    * “the id ‘1’”
* “the state with the code ‘AK’”
    * “the state-code ‘AK’”
  “the district (in that address’s state) with the number ‘2’”
      “the state of that address”
          “the address mentioned already/elsewhere”
    * “the district-number ‘2’”
(Now, mark the completely info-bearing terms, with * ; it is necessary to start with the leaf-nodes.)
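The procedure from the previous slide can be sketched on this example as follows. The class and function names are ours, and the nesting and the `info_bearing` flags reflect our reading of the slides (anaphoric terms are non-information-bearing; the district term is information-bearing but contains an anaphoric term), so treat this as an illustration under those assumptions rather than the author's tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ObjectTerm:
    text: str
    info_bearing: bool
    children: list[ObjectTerm] = field(default_factory=list)


def completely_info_bearing(term: ObjectTerm) -> bool:
    """Information-bearing, and every contained term is completely so as well."""
    return term.info_bearing and all(
        completely_info_bearing(child) for child in term.children
    )


def all_terms(term: ObjectTerm) -> list[ObjectTerm]:
    """Return the term and everything it contains, in pre-order."""
    return [term] + [t for child in term.children for t in all_terms(child)]


def role_player_terms(term: ObjectTerm) -> list[ObjectTerm]:
    """Return the maximal completely information-bearing terms under `term`."""
    if completely_info_bearing(term):
        return [term]
    return [t for child in term.children for t in role_player_terms(child)]


top_level_terms = [
    ObjectTerm("the address with the id '1'", True, [ObjectTerm("the id '1'", True)]),
    ObjectTerm("the state with the code 'AK'", True, [ObjectTerm("the state-code 'AK'", True)]),
    ObjectTerm(
        "the district (in that address's state) with the number '2'",
        True,
        [
            ObjectTerm(
                "the state of that address",
                False,
                [ObjectTerm("the address mentioned already/elsewhere", False)],
            ),
            ObjectTerm("the district-number '2'", True),
        ],
    ),
]

starred = [
    t.text
    for top in top_level_terms
    for t in all_terms(top)
    if completely_info_bearing(t)
]
role_players = [t.text for top in top_level_terms for t in role_player_terms(top)]
print(len(starred), role_players)
```

Under these assumptions, five terms come out completely information-bearing, which agrees with the five asterisks on the slide, and the role-players are the address, the state, and the district number rather than the district itself.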

