1 Peter P. Chen, Information Modeling
Past, Present, Future of Data/Information Modeling
Foster Distinguished Chair Professor, Computer Science Dept., Louisiana State University, Baton Rouge, LA 70803, USA
pchen@lsu.edu
http://www.csc.lsu.edu/~chen

2 Overview
Historical Background
  – How the Entity-Relationship Model (ERM) was Developed
Last Twenty-Five Years
  – ER Conferences, IDEF, ANSI/IRDS, CASE Methodologies/Tools
The Present
  – OO, Data Mining, UML
The Future
  – Discovering “Links/Relationships”
  – Validity/Credibility of Data, Machine Learning/Reasoning
  – Natural Languages and Data/Information Modeling
  – Modeling for XML/Semantic Web
Conclusions

3 The Needs of the DB Community in the Early 70's
For Software/Hardware Vendors
  – Integration of Various File and DB Formats
  – Incorporating More “Data Semantics”
For User Organizations
  – A Unified Methodology for File and DB Design for Various File Systems and DBMSs
  – Incorporating More Business Rules

4 How the ERM was Developed: Right Place at the Right Time (I)?
I Got My Ph.D. From Harvard in 1973
  – Thesis Title: Optimal File Allocation
Worked for Honeywell from 1973 to 1974
  – In a 10-person Architecture Team for a Next-Generation Distributed System
  – Many Team Members Were DB Experts & 20 Years Older: Charles Bachman, Henry Lefkovits, John Lyon

5 Right Place at the Right Time (II)?
Joined the MIT Management School Faculty in 1974
  – Interacted with User Organizations
  – They wanted a unified modeling and design methodology
  – Completed the ERM Paper
  – Most other faculty members were busy implementing DBMS prototypes

6 Concepts of Entity and Relationship [Figure]

7 An Example of an ER Diagram [Figure]

8 Theoretical Foundations of ER Model
Set Theory
Modern Algebra
Logic
Lattice Theory

9 Defining ER Concepts Using Sets [Figure]

10 Relationship as an Ordered Tuple [Figure]
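
The figures for the two slides above are not preserved in this transcript. As a rough sketch of the set-based view their titles describe, the usual formulation can be written as follows. The symbols E_i, R, and e_i are chosen here for illustration and are not taken from the slides.

```latex
% Entity sets (not necessarily distinct):
E_1, E_2, \dots, E_n

% A relationship set is a mathematical relation over those entity sets:
R \subseteq E_1 \times E_2 \times \cdots \times E_n

% Each individual relationship is an ordered tuple of entities:
r = [e_1, e_2, \dots, e_n] \in R, \qquad e_i \in E_i \ \text{for } i = 1, \dots, n
```

For a binary relationship such as an employee working on a project, n = 2 and each relationship is a pair [e_1, e_2].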

11 First ER Paper & Initial Reactions (I)
Published in ACM Transactions on Database Systems, Vol. 1, No. 1, pp. 9-36, March 1976

12 First ER Paper & Initial Reactions (II)
The Situation Then
  – Most People Were in a Religious War
  – I Was a “New Kid on the Block”
The Advice I Got:
  – Drop the ER Model
  – Join One of the Religious Camps

13 The First Five Years (I)
Persistence with the ER Model
  – Continued to Write Papers on ERM
  – Organized First ER Conference in 1979 at UCLA
  – 2nd ER Conference Two Years Later
  – Now, an Annual Conference
      – November 2001 in Japan
      – 2002 in Finland
      – 2003 in Chicago
      – 2004 in Shanghai, China

14 The First Five Years (II)
Some Academic People Started to Develop Semantically Richer Data Models
  – Mike Hammer of EECS, MIT, now a guru in “re-engineering”
US Air Force ICAM/IDEF Project
  – Served as a consultant
  – The ERM Became the Basis of IDEF
More Companies Started to Experiment with ERM

15 Related Developments in Next 20 Years
Codd's RM/T Model Added ER Concepts
Bachman's Partnership Model, too
ANSI/IRDS: Standard for Information Resource Dictionary Systems (IRDS)
  – Adopted ERM
CASE (Computer-Aided Software Engineering)
  – First Major CASE Symposium in Atlanta, 1987 (Keynote Speaker)
  – IBM AD/Cycle: Based on ERM
  – IBM DB2 Repository Manager: ERM
  – Oracle Designer/2000: ERM

16 The Present Status of Data/Information Modeling (I)
ER Modeling is the “most widely used methodology” in the business DB application development world
  – more than 85% of the FORTUNE 3,000 companies and major organizations are using it
More advanced ER concepts are proposed and used
UML, which is a specific language syntax, reinforces the ER concepts

17 The Present Status of Information Modeling (II)
OO Modeling incorporates many concepts of ERM
  – Object is an implementation concept
  – Current OO methodologies need more general concepts of relationship
What is “Data Mining”?
  – Discover hidden “relationships”
  – Discover the embedded ER Models

18 The Future (I)
Discovering Links/Relationships from Data in Various Sources (such as DARPA's EELD Program)
Validity/Credibility Analysis and Integration of Data; Machine Learning/Reasoning
Natural Languages vs. Data/Info Modeling
ER Modeling Concepts are Similar to
  – Chinese Character Composition Methods
  – Ancient Egyptian Hieroglyphs
  – English Sentence Grammar Structure

19 The Future (II)
ERM and the Fundamental Principles of Systems Architecture
ER Model is closely related to
  – XML
  – Semantic Web

20 Information Validity/Credibility Analysis
A paper was published in InfoFusion 2001, Montreal
An algorithm was developed
A prototype was developed
Also, a machine learning algorithm was developed

21 ER Modeling & English
First presented the ideas (abstract of a paper) at the 2nd ER Conference in Washington, D.C., 1981
Paper was published in Information Sciences, 1983
Adopted as a standard systems analysis technique by some large consulting firms
Recently, OO Analysis re-discovered some of the basic concepts
Also, the research community started to use, modify, or extend the concepts

25 Chinese Characters as Models of Real World Entities [Figure]

26 Ancient Egyptian Hieroglyphs [Figure]

27 Various Components of XML
XML has many components:
  – XML (language part)
  – XSL
  – DOM
  – DTD
  – XLink and XPointer
  – RDF
  – XML Schema
  – etc.
Not all are compatible with each other

28 What is RDF?
Acronym for “Resource Description Framework”
A way to specify metadata
Two parts:
  – Model and Syntax
  – Schema
The “RDF Schema” is not a W3C recommended specification, yet
http://www.w3.org/TR/REC-rdf-syntax

29 W3C Pays Attention to ERM
The Cambridge Communiqué (http://www.w3.org/TR/schema-arch) states:
  – RDF can be viewed as a member of the Entity-Relationship Model Family
In several articles, Tim Berners-Lee discusses the similarities and differences between ERM and RDF.

30 RDF vs. ER Model
RDF can be viewed as a version of the binary ER model (but at a lower and more detailed level)
RDF’s dependence on sentence analysis is similar to a series of works on the correspondence between the ER model and English (and several other natural languages).
  – Reference: Chen, P.P., “Entity-Relationship Diagram and English Sentence Structure,” Information Sciences, 1983, Academic Press.
  – Major concepts:
      – Noun --> Entity, Verb --> Relationship
      – Adjective --> Attribute of Entity, Adverb --> Attribute of Relationship
      – Gerund --> Relationship-converted Entity
      – Etc.
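
The correspondences on the slide above can be illustrated with a small Python sketch. It assumes the sentence has already been tagged by hand with parts of speech; no real natural-language parser is involved. The names POS_TO_ER, classify, to_binary_relationship, and the example sentence are all hypothetical and chosen only for illustration. The last function shows how a simple noun-verb-noun sentence reduces to one binary relationship, written in the same (subject, predicate, object) shape as an RDF statement.

```python
# Part-of-speech -> ER construct, following the correspondences on the slide.
POS_TO_ER = {
    "noun": "entity",
    "verb": "relationship",
    "adjective": "attribute of entity",
    "adverb": "attribute of relationship",
    "gerund": "relationship-converted entity",
}


def classify(tagged_words):
    """Map hand-tagged (word, part_of_speech) pairs to ER constructs."""
    return [(word, POS_TO_ER[pos]) for word, pos in tagged_words]


def to_binary_relationship(tagged_words):
    """Reduce a two-noun, one-verb sentence to (entity, relationship, entity).

    This mirrors the subject/predicate/object shape of an RDF statement.
    It is a deliberate simplification: anything more complex is rejected.
    """
    nouns = [word for word, pos in tagged_words if pos == "noun"]
    verbs = [word for word, pos in tagged_words if pos == "verb"]
    if len(nouns) != 2 or len(verbs) != 1:
        raise ValueError("this sketch handles exactly two nouns and one verb")
    return (nouns[0], verbs[0], nouns[1])


# Hypothetical example sentence: "(the) senior engineer currently manages (the) project"
EXAMPLE = [
    ("senior", "adjective"),
    ("engineer", "noun"),
    ("currently", "adverb"),
    ("manages", "verb"),
    ("project", "noun"),
]

if __name__ == "__main__":
    for word, construct in classify(EXAMPLE):
        print(f"{word}: {construct}")
    print(to_binary_relationship(EXAMPLE))
```

Here "engineer" and "project" become entities, "manages" becomes the relationship between them, "senior" is an attribute of the engineer entity, and "currently" is an attribute of the relationship itself.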

31 Real World Modeling & Fundamental Principles of Systems Architecture
Entity Lattice and Other Mathematical Structures and Operations
Fundamental Principles of Systems Architecture
  – Starting from Info System Architecture, and then extending to all kinds of systems
  – Fundamental Questions on
      – Representation/Understanding
      – Operations
      – Costs/Benefits/Optimization

33 Conclusions (I)
ER Modeling was triggered by the needs
  – Unifying data views from top-down and bottom-up perspectives
  – For vendors & user organizations
  – Incorporating more semantics
Entity and relationship are fundamental concepts for
  – Data/Knowledge Representation
  – Database design
  – Software engineering
  – Information system development
  – And others (data mining, ...)

34 Conclusions (II)
Future:
  – Discovering Missing/Intended/Un-intended Relationships from data
  – Prediction of the Validity of data and data model; Machine Learning/Reasoning
  – Natural Languages and ERM
  – Multi-language Information Extraction and Understanding
  – Culture-based Modeling Methodology
  – Modeling & Design of Web --> Theory of Web, Semantic Web
  – Fundamental Principles of Systems Architecture

35 References
ER and other Conferences
  – ER2002 (Finland), ER2003 (Chicago), ER2004 (Shanghai)
  – http://www.conceptualmodeling.org
Chen’s papers online:
  – http://www.csc.lsu.edu/~chen
XML Schema:
  – Primer: http://www.w3c.org/TR/xmlschema-0/
  – Structure: http://www.w3c.org/TR/xmlschema-1/
  – Data Types: http://www.w3c.org/TR/xmlschema-2/
XML XLink & XPointer:
  – http://www.w3c.org/XML/Linking
RDF
  – http://www.w3c.org/RDF/

