
1 Usability and Integration H. V. Jagadish

2 Many Sources of Data Text, XML/semi-structured data, experimental measurements, public databases. Some data may have time/space variation. Need to make sense of this big mess.

3 Find Patterns in Data Conventional data mining seeks patterns that can be mathematically specified over (usually) global extents. Typically assume simple data structure. Need new approaches to find patterns in messy data.

4 Human in the Loop Hard for a machine to tell an interesting pattern apart from one that is not. Problem exacerbated when we seek smaller/localized patterns, or work with large vocabularies of possible patterns. Need human in the loop to make this judgment.

5 Computer-Assisted (Human) Analytics Patterns found by human and not by computer. Job of computer is to make patterns easy to find. So computer system must effectively support queries and display results. E.g., visual analytics.

6 Organize Data for Analysis Join multiple complex temporal data streams into a “windowed” model suitable for efficient analysis. [Manish Singh] Permit organic change to schema as information needs evolve. [Eric Qian] Provide a spreadsheet interface for direct manipulation of complex and large data. Choose small sets of representatives effectively. [Ben Liu]
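The "windowed" model on this slide can be illustrated with a small sketch. The slides give no algorithm, so this is not the authors' method. It is a minimal tumbling-window inner join, assuming each stream is a list of (timestamp, value) pairs. The function name window_join, the width parameter, and the sample streams are all hypothetical.

```python
from collections import defaultdict


def window_join(streams, width):
    """Group readings from several timestamped streams into fixed-width windows.

    streams: dict mapping a stream name to a list of (timestamp, value) pairs.
    width:   window width, in the same units as the timestamps.
    Returns a dict mapping window start -> {stream name -> list of values},
    keeping only windows in which every stream has at least one reading.
    """
    windows = defaultdict(lambda: defaultdict(list))
    for name, readings in streams.items():
        for ts, value in readings:
            start = (ts // width) * width  # start of the window containing ts
            windows[start][name].append(value)

    # Inner join: drop windows that are missing any of the streams.
    return {
        start: {name: per_stream[name] for name in streams}
        for start, per_stream in sorted(windows.items())
        if all(name in per_stream for name in streams)
    }


# Hypothetical sample data: two streams sampled at irregular times.
streams = {
    "heart_rate": [(1, 72), (4, 75), (11, 80), (23, 78)],
    "temperature": [(2, 36.6), (12, 36.9), (15, 37.0)],
}
joined = window_join(streams, width=10)
for start, values in joined.items():
    print(start, values)
```

Each resulting window is one row that an analyst can query or plot. The window starting at 20 is dropped in this example because only one stream has a reading there; an outer-join variant would keep it with an empty list instead.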

7 Access Data for Analysis Under-specified queries, particularly keyword queries. Derive “qunit” as response unit, mined from observed query logs. [Arnab Nandi] Visual manipulation algebra for analyzing large time-varying graphs with data on nodes and edges. [Anna Shaverdian]

8 Scientific Data Analysis Explain analysis results in terms of source data, even when the source may have been updated since. [Jing Zhang] Analyze gene expression microarray data, and electronic health record data, in light of known biomedical knowledge. [Fernando Farfan]

