A presentation by W H Inmon BRIDGING THE GAP BETWEEN UNSTRUCTURED DATA AND STRUCTURED DATA.


1 A presentation by W H Inmon BRIDGING THE GAP BETWEEN UNSTRUCTURED DATA AND STRUCTURED DATA

2 The informal systems of the corporation (Email, .txt, .doc):
- unstructured data
- .doc files
- .txt files
- .xls files
- email
- transcribed telephone conversations

The formal systems of a corporation (Program):
- structured systems
- structured data
- corporate transactions
- corporate reports
- corporate databases
- customer files
- audit reports

3 It is estimated that less than 20% of corporate systems are structured. [Diagram: 80% Email, .txt, .doc; 20% Program]

4 Unstructured side (Email, .txt, .doc): search engines, legal discovery, email archive, taxonomy, ontology, document mgmt, web content.
Structured side (Program): dbms, business intelligence, applications, transactions, OLTP, ERP, compliance.
Imagine what would happen if the two worlds could be integrated: the world of dbms, analytics, and other processing opens up.

5 Unstructured side (Email, .txt, .doc): search engines, legal discovery, email archive, taxonomy, ontology, document mgmt, web content.
Structured side (Program): dbms, business intelligence, applications, transactions, OLTP, ERP, compliance.
Tight integration between the two types of data.

6 There is a gulf between the two worlds (Email, .txt, .doc and Program):
- technology
- business practice
- organizational
- historical

7 Think of the possibilities! [Diagram: Email, .txt, .doc and Program]

8 Imagine this: reports and visualization show a lot. Have you ever wondered why you can’t hook up your Business Objects to email? Or telephone conversations?

9 There is a fundamental disconnect between unstructured data (Email, .txt, .doc: text) and business intelligence (numbers). So what would happen if we had powerful visualization for text?

10

11 [Visualization of terms: liver cancer, skin cancer, thirst, diabetes, blood pressure] Correlative information becomes very easy to spot.

12 [Panels: for the general population; for women; for women who smoke over the age of 50] Doing analysis on subpopulations of women.

13 [Panels: for the general population; for women who smoke over the age of 50] The contrast between the different correlations of different populations leads to great insight.
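One way to picture the comparison on slides 11 to 13 is as co-occurrence counts computed once for everyone and once for a subpopulation. The sketch below is only an illustration: the records, field names, and the `cooccurrence` helper are hypothetical and are not taken from the presentation, and the data carries no medical meaning.

```python
from collections import Counter
from itertools import combinations

# Made-up records for illustration only; terms are assumed to have been
# extracted from text already.
RECORDS = [
    {"sex": "F", "smoker": True, "age": 62, "terms": {"thirst", "diabetes"}},
    {"sex": "F", "smoker": True, "age": 55,
     "terms": {"thirst", "diabetes", "blood pressure"}},
    {"sex": "F", "smoker": False, "age": 34, "terms": {"skin cancer"}},
    {"sex": "M", "smoker": True, "age": 70,
     "terms": {"liver cancer", "blood pressure"}},
]


def cooccurrence(records):
    """Count how often each pair of terms appears in the same record."""
    counts = Counter()
    for rec in records:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(rec["terms"]), 2):
            counts[pair] += 1
    return counts


general = cooccurrence(RECORDS)
women_smokers_over_50 = cooccurrence(
    r for r in RECORDS if r["sex"] == "F" and r["smoker"] and r["age"] > 50
)

for pair, n in general.most_common():
    print(pair, "general:", n, "subpopulation:", women_smokers_over_50[pair])
```

Comparing the two counters side by side is the textual analogue of contrasting the panels on the slides.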

14 What about looking at customer feedback, such as complaints? [Visualization of terms: service, delivery, late, broken, installation, salesman, attitude, wait too long, did not fit] Now you can see the broader picture of what is happening.

15 But there are plenty of other places where the technology applies:
- manufacturing warranties (what patterns of defects are there?)
- weblogs (marketing: who is saying what?)
- customer complaints (what are the problem products?)
- general email (what’s the buzz? what is on people’s minds?)
- insurance claims (what are the circumstances of accidents?)

16 Another possibility is the monitoring of email and the transport of email to the structured environment.

17 Monitoring emails and other corporate conversations (Sarbanes-Oxley, HIPAA, Basel II):
- compliance: making sure that email is being used properly
- corporate standard for language

18 A bunch of emails and conversations: what do you do with them?
- Jan 3 – vp to vp: “This is going to be a real barn burner of a quarter….”
- Jan 5 – finance to vp: “It looks like we are going to do $9,000,000 this quarter…”
- Jan 5 – president to analyst: “This quarter looks like we are going to break new records…”
- Feb 1 – employee to employee: “Did you see the stock market? Everything is going down…”
- Feb 3 – president to vp: “What is happening to sales in the Midwest? We didn’t expect this…”
- Feb 3 – vp to vp: “It looks like we are going to be a little short this quarter…”
- Feb 4 – sales manager to vp: “The sales cycle looks like it is extending. The economy is tanking…”
- Feb 6 – president to vp: “What are we going to do to get sales up? Do we need to do some discounting?”
- Mar 2 – sales person to vp: “Demand has dried up. We aren’t going to close as many sales this quarter as we thought…”

19 [The emails and conversations from slide 18 are shown again.] Examining emails (“combing” them) for important corporate information. External categories (Sarbanes-Oxley): quarter, stock, sales, discount, demand, sales cycle.

20 The “combed” information is brought over to the structured environment:
- sales: email – Feb 2; email – Mar 5; phone – Mar 8; …
- quarter: email – Jan 2; email – Jan 4; email – Feb 5; …
- discount: phone conversation – Jan 6; email – Jan 12; email – Jan 14; …
- sales cycle: email – Feb 24; phone conversation – Mar 14; meeting notes – Mar 18; …
Now you can use standard tools, such as Cognos, Business Objects, Crystal Reports, and MicroStrategy, to do analysis.
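A minimal sketch of the "combing" step, assuming each message is a small record with a date label, a channel, and its text. The category terms come from slide 19 and the message texts are abbreviated from slide 18; the record layout, the `comb` function, and the row format are hypothetical and are not the author's implementation.

```python
import re

# External category terms listed on slide 19.
CATEGORY_TERMS = ["quarter", "stock", "sales cycle", "sales", "discount", "demand"]

# A few messages abbreviated from slide 18 (record layout is an assumption).
MESSAGES = [
    {"when": "Jan 5", "channel": "email",
     "text": "It looks like we are going to do $9,000,000 this quarter"},
    {"when": "Feb 3", "channel": "email",
     "text": "What is happening to sales in the midwest?"},
    {"when": "Feb 6", "channel": "email",
     "text": "What are we going to do to get sales up? "
             "Do we need to do some discounting?"},
    {"when": "Mar 2", "channel": "email",
     "text": "Demand has dried up. We aren't going to close as many "
             "sales this quarter as we thought"},
]


def comb(messages, terms):
    """Return one (term, channel, when) row for each term found in a message."""
    rows = []
    for msg in messages:
        text = msg["text"].lower()
        for term in terms:
            # Anchor only the start of the term at a word boundary,
            # so "discount" also matches "discounting".
            if re.search(r"\b" + re.escape(term), text):
                rows.append((term, msg["channel"], msg["when"]))
    return rows


rows = comb(MESSAGES, CATEGORY_TERMS)
for row in rows:
    print(row)
```

Each row is plain structured data, so it could be loaded into a relational table and queried with ordinary reporting tools, which is the point the slide makes.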

21 Emails and telephone conversations can be linked to CDI/CRM data (customer data) through a probabilistic match. But there are other ways that communications can be used.
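The slides do not say how the probabilistic match is computed. As one rough illustration, a message's sender details can be scored against each customer record and linked only when the score clears a threshold. Everything below is an assumption made for the sketch: the customer records, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.8/0.2 weights, and the 0.8 threshold.

```python
from difflib import SequenceMatcher

# Hypothetical customer master records (the CDI/CRM side).
CUSTOMERS = [
    {"id": 101, "name": "Mary Jones", "city": "Denver"},
    {"id": 102, "name": "Mark Johnson", "city": "Austin"},
]


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_customer(sender_name, sender_city, customers, threshold=0.8):
    """Return (best_customer, score), or (None, score) below the threshold."""
    best, best_score = None, 0.0
    for cust in customers:
        # Weights are arbitrary illustration values: name counts most.
        score = (0.8 * similarity(sender_name, cust["name"])
                 + 0.2 * similarity(sender_city, cust["city"]))
        if score > best_score:
            best, best_score = cust, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


customer, score = match_customer("Mrs. Mary Jones", "Denver", CUSTOMERS)
print(customer, round(score, 2))
```

Returning the score alongside the record lets a reviewer inspect borderline links instead of trusting them blindly.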

22 A true 360 degree view of the customer can be formed. “I placed an order last week and when it arrived it was the wrong size. And then your company would not take it back. I’m mad.” How easy is it going to be to engage Mrs Jones until she has satisfaction about her order?

23 A true 360 degree view of the customer can be formed. [Diagram: communications, demographics] Delivering on the promise of CDI.

24 Can’t I just use a search engine to link the two worlds? Search engines do not integrate textual information. [Diagram: Email, .txt, .doc; integration; Program]

25 Text doesn’t need to be searched; it needs to be integrated. [Diagram: Email, .txt, .doc; integration; Program]

26 Integration example: “ha” may mean “headache”, “heart attack”, or “Hepatitis A”. [Diagram: Email, .txt, .doc; integration; Program]
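The ambiguity of "ha" is the kind of problem that homograph resolution addresses. A minimal sketch, assuming the surrounding words give enough context to pick an expansion: the cue-word sets and the `resolve_homograph` function are hypothetical, chosen only to show the idea.

```python
import re

# Hypothetical context cues for each possible expansion of "ha".
HA_SENSES = {
    "heart attack": {"cardiologist", "chest", "cardiac"},
    "headache": {"migraine", "aspirin", "neurologist"},
    "hepatitis a": {"liver", "vaccine", "hepatologist"},
}


def resolve_homograph(note, senses, token="ha"):
    """Expand `token` to the sense whose cue words appear most often in the note."""
    words = set(re.findall(r"[a-z]+", note.lower()))
    if token not in words:
        return note
    scores = {sense: len(cues & words) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        # No supporting context: leave the abbreviation unresolved.
        return note
    pattern = r"\b" + re.escape(token) + r"\b"
    return re.sub(pattern, best, note, flags=re.IGNORECASE)


print(resolve_homograph(
    "Patient reports ha and chest pain, referred to cardiologist", HA_SENSES))
```

Leaving the token untouched when no cue is present avoids guessing, which matters when the result is loaded into a database and trusted downstream.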

27 Integration example: “oblique fractured ulna”, “oblique fractured tibia”, and “obliq fractured tarsi” are each a “broken bone”. [Diagram: Email, .txt, .doc; integration; Program]
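The fracture phrases suggest two steps working together: fixing an alternate spelling and attaching the general term a non-specialist would search for. The sketch below assumes that "obliq" is an alternate spelling of "oblique"; the lookup tables and the `integrate_fracture_terms` function are hypothetical.

```python
# Assumed alternate spellings and bone names, for illustration only.
ALTERNATE_SPELLINGS = {"obliq": "oblique"}
BONES = {"ulna", "tibia", "tarsi"}


def integrate_fracture_terms(phrase):
    """Return (normalized phrase, general term or None)."""
    words = [ALTERNATE_SPELLINGS.get(w, w) for w in phrase.lower().split()]
    normalized = " ".join(words)
    # Attach the general term when the phrase describes a fractured bone.
    is_fracture = "fractured" in words and bool(BONES & set(words))
    return normalized, ("broken bone" if is_fracture else None)


for phrase in ("oblique fractured ulna", "obliq fractured tarsi"):
    print(integrate_fracture_terms(phrase))
```

Keeping both the specific phrase and the general term means a query for "broken bone" finds all three records without losing the clinical detail.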

28 What is meant by editing, integrating text?
1 – stop word editing
2 – stemming
3 – synonym replacement
4 – synonym concatenation
5 – homograph resolution
6 – alternate spelling resolution
7 – external category classification
8 – theming
9 – probabilistic matching
10 – negation exclusion
11 – concept clustering
12 – mid process editing
13 – change sensitivity
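A few of these steps can be sketched as a small pipeline. The example covers only stop word editing, stemming, synonym replacement, and negation exclusion, applied in the order the slide lists them. The word lists are hypothetical, and the `stem` function is a crude suffix stripper standing in for a real stemming algorithm, not the method the presentation uses.

```python
import re

# Hypothetical word lists for illustration.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "it"}
SYNONYMS = {"bust": "broken"}  # keyed on stems, since stemming runs first
NEGATIONS = {"no", "not", "never"}


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def integrate(text):
    """Return the list of edited terms for one piece of raw text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    terms = []
    negated = False
    for tok in tokens:
        if tok in NEGATIONS:
            negated = True  # negation exclusion: drop the next content word
            continue
        if tok in STOP_WORDS:
            continue  # stop word editing
        if negated:
            negated = False
            continue
        stemmed = stem(tok)  # stemming
        terms.append(SYNONYMS.get(stemmed, stemmed))  # synonym replacement
    return terms


print(integrate("The installation was busted and the delivery was not late"))
```

Stop words are checked before the negation flag is consumed, so a phrase such as "not the late delivery" still excludes "late" rather than wasting the negation on "the".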

29 For a detailed description of how the unstructured environment should be linked to the structured environment, go to www.inmoncif.com and look for DW 2.0™, or go to www.inmondatasystems.com

30 [Diagram labels: Unstructured Data; probabilistic match; Structured Environment (DB2); Query; visualization; Business Objects, Cognos, MicroStrategy, Crystal Reports]

