Research and Practice at University of Queensland Wei Lu ( 卢卫 ) 2/19/2009.


1 Research and Practice at University of Queensland Wei Lu ( 卢卫 ) 2/19/2009

2 Seminar Plan Tour in Australia Join UQ How to write a good paper (by Xiaofang and Xuemin) Research Interests

3 Seminar Plan Tour in Australia Join UQ How to write a good paper (by Xiaofang and Xuemin) Research Interests

4 Tour in Australia

5

6 Seminar Plan Tour in Australia Join UQ How to write a good paper (by Xiaofang and Xuemin) Research Interests

7 Join UQ UQ

8

9 Introduction to UQ: founded in 1910. Outstanding majors: –Business –Biology –Medical. School of ITEE: –Biomedical Engineering –Cognitive Systems Engineering –Complex & Intelligent Systems –Data & Knowledge Engineering (DKE) –eResearch –Microwave & Optical Communications –Power & Energy Systems –Security & Surveillance –Systems & Software Engineering –Ubiquitous Computing

10 Academic Staff in DKE Group Prof. Xiaofang Zhou He is the Head of the Data and Knowledge Engineering Research Group (DKE). He is also the Convenor of the ARC Research Network in Enterprise Information Infrastructure (EII), and a Chief Investigator of the ARC Centre of Excellence in Bioinformatics. Professor Zhou received his BSc and MSc degrees in Computer Science from Nanjing University in 1984 and 1987 respectively, and his PhD in Computer Science from the University of Queensland in 1994. From 1994 to 1999, he worked as a Senior Research Scientist at CSIRO, leading its Spatial Information Systems group. His research focuses on finding effective and efficient solutions for managing, integrating and analysing very large amounts of complex data for business and scientific applications. His research interests include spatial and multimedia databases, data quality, high performance query processing, Web information systems and bioinformatics.

11 Associate Professors Dr. Shazia Sadiq: Her research interests are innovative solutions for Business Information Systems that span several areas including business process management, governance, risk and compliance, data quality management, workflow systems, and service oriented computing. Dr. Xue Li: His research interests and expertise include Data Mining, Multimedia Data Security, Database Systems, and Intelligent Web Information Systems.

12 Senior Lecturer Dr. Hengtao Shen His research interests: –Media (Video)/Web Search –Multimedia/Web/Spatial/Genome Database Management –Nonlinear/Local Dimensionality Reduction –Indexing and Query Processing –P2P Database Management

13 Research Staff Ken Deng: 1. Data Quality 2. Spatial Database Helen Huang: 1. Video Retrieval 2. Knowledge Discovery Gabriel: 1. Data Mining and Knowledge Discovery 2. Time Series Mining and Forecasting 3. Skyline Query Processing Stella: 1. Video Search & Retrieval 2. Web Data Extraction & Analysis 3. Recipe Data Modeling

14 Seminar Plan Tour in Australia Join UQ How to write a good paper (by Xiaofang and Xuemin) Research Interests

15 Seminar Plan Guide to writing a good paper (by Xiaofang and Xuemin) Research Interests: –Skyline Query Processing (From) –Data Quality: Record Linkage (To)

16 Guide to writing a good paper (by Xiaofang and Xuemin) Motivation –Interesting –Reasonable –Pure Solution –Smart –Sharp Conclusion –Experiment: time and space complexity

17 Cont. Tools –Word? –LaTeX: WinEdt/Eclipse + gnuplot + Illustrator/SmartDraw Format: Jian Pei’s papers –Ranking Queries on Uncertain Data: A Probabilistic Threshold Approach –Efficiently Answering Top-k Typicality Queries on Large Databases –……

18 Seminar Plan Tour in Australia Join UQ How to write a good paper Research Interests

19 Skyline Query Processing (2007.12 ~ 2008.7) Data Quality—Record Linkage (2008.7~)

20 Skyline Query Processing

21 Motivations How to do experiments

22 Motivations: skyline Given a dataset of d-dimensional points [Figure: points plotted on height and appearance axes]

23 Motivations: skyline Given a dataset of d-dimensional points –a dominates b iff a is better than b in at least one dimension and not worse in any other dimension [Figure: points a and b on height and appearance axes]

24 Motivations: skyline Given a dataset of d-dimensional points –a dominates b iff a is better than b in at least one dimension and not worse in any other dimension –The skyline S contains the points not dominated by any other point Example –Dataset of girls –Prefer good-looking and tall [Figure: points a and b on height and appearance axes]

25 Motivations: skyline Given a dataset of d-dimensional points –a dominates b iff a is better than b in at least one dimension and not worse in any other dimension –The skyline S contains the points not dominated by any other point Example –Dataset of girls –Prefer good-looking and tall [Figure: skyline points highlighted on height and appearance axes]
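
The dominance test and the skyline defined on this slide can be sketched in a few lines of Python. This is a naive quadratic scan for illustration only, not an algorithm from the talk; the function names and the sample (height, appearance) values are made up, and larger values are assumed to be better on every dimension.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is not worse than b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(o, p) for o in points)]


# Hypothetical (height in cm, appearance score) pairs; larger is better.
people = [(170, 6), (165, 9), (160, 5), (175, 4), (168, 8)]
print(skyline(people))  # (160, 5) is dropped because (170, 6) dominates it
```

Only (160, 5) is removed here: every other point is best on some trade-off between the two dimensions.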

26 Dynamic Skyline Extension of skyline queries –Given a query point q –a dominates b iff a is better than b in at least one dimension and not worse in any other dimension –S contains the points not dominated by any other point Example –User defines an “ideal” girlfriend [Figure: query point q on height and appearance axes]

27 Dynamic Skyline Extension of skyline queries –Given a query point q –a dominates b iff a is better than b in at least one dimension and not worse in any other dimension –S contains the points not dominated by any other point Example –User defines an “ideal” girlfriend [Figure: query point q on height and appearance axes]

28 Dynamic Skyline Extension of skyline queries –Given a query point q –a dominates b iff a is better than b in at least one dimension and not worse in any other dimension –S contains the points not dominated by any other point Example –User defines an “ideal” girlfriend [Figure: query point q on height and appearance axes]
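
A small sketch of the dynamic skyline, assuming the usual definition from the skyline literature: "better" is measured relative to the query point q, so a point is better on a dimension when it is closer to q on that dimension. The slides do not spell this out, so treat the distance-based dominance below, the function names, and the sample data as assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dyn_dominates(a: Sequence[float], b: Sequence[float], q: Sequence[float]) -> bool:
    """a dominates b w.r.t. q: a is at least as close to q on every dimension
    and strictly closer on at least one (assumed definition)."""
    da = [abs(x - c) for x, c in zip(a, q)]
    db = [abs(x - c) for x, c in zip(b, q)]
    return all(x <= y for x, y in zip(da, db)) and any(x < y for x, y in zip(da, db))


def dynamic_skyline(points: List[Point], q: Point) -> List[Point]:
    """Points not dynamically dominated by any other point, relative to q."""
    return [p for p in points if not any(dyn_dominates(o, p, q) for o in points)]


# Hypothetical (height, appearance) data and a hypothetical "ideal" query point.
people = [(170, 6), (165, 9), (160, 5), (175, 4), (168, 8)]
ideal = (169, 7)
print(dynamic_skyline(people, ideal))
```

With this query point, (170, 6) and (168, 8) are both one unit away from the ideal on each dimension, so neither dominates the other and both dominate the remaining points.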

29 Window Skyline Extension of skyline queries –Given an area and a query point q Example –User defines an “ideal” girlfriend [Figure: query window on height and appearance axes]

30 Reverse Skyline Extension of skyline queries –A set of query points (red) –Which points (white) have q as one of their skyline points? [Figure: query point q on height and appearance axes]
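
The reverse skyline can be sketched by reusing distance-based dominance: a data point p belongs to the reverse skyline of q when no other data point is closer to p than q is on every dimension, that is, when q would appear in the dynamic skyline computed around p. This follows the common definition in the literature rather than anything stated on the slide; the function names and points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dyn_dominates(a: Sequence[float], b: Sequence[float], center: Sequence[float]) -> bool:
    """a dominates b relative to center (per-dimension absolute distance)."""
    da = [abs(x - c) for x, c in zip(a, center)]
    db = [abs(x - c) for x, c in zip(b, center)]
    return all(x <= y for x, y in zip(da, db)) and any(x < y for x, y in zip(da, db))


def reverse_skyline(points: List[Point], q: Point) -> List[Point]:
    """Points p for which q is not dominated, relative to p, by any other point."""
    result = []
    for p in points:
        others = [o for o in points if o != p]
        if not any(dyn_dominates(o, q, p) for o in others):
            result.append(p)
    return result


# Hypothetical 2-D data and query point.
data = [(1, 1), (2, 5), (6, 2), (9, 9), (8, 8)]
print(reverse_skyline(data, (3, 3)))
```

In this example (9, 9) and (8, 8) drop out because each has the other as a much closer neighbour than q on both dimensions.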

31 Skyline Cube A real estate example
Properties and Values:
      price (100K)   dist   age   …
P1    3              3      5     …
P2    5              1      1     …
P3    1              4      4     …
P4    4              5      2     …
P5    2              2      3     …
[Figures: skyline on price & dist; skyline on price & age]
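
A skyline cube stores the skyline of every non-empty subset of dimensions. The sketch below computes it by brute force for the five properties on this slide, assuming that smaller is better for price, dist and age (the slide does not state the preference direction). The function names are illustrative.

```python
from itertools import combinations
from typing import Dict, List, Tuple

ATTRS = ("price", "dist", "age")
# Values as read from the slide's table: (price in 100K, dist, age).
PROPERTIES: Dict[str, Tuple[int, int, int]] = {
    "P1": (3, 3, 5),
    "P2": (5, 1, 1),
    "P3": (1, 4, 4),
    "P4": (4, 5, 2),
    "P5": (2, 2, 3),
}


def dominates(a, b, dims) -> bool:
    """a dominates b on the given dimensions (assumption: smaller is better)."""
    return all(a[i] <= b[i] for i in dims) and any(a[i] < b[i] for i in dims)


def subspace_skyline(data, dims) -> List[str]:
    """Names of the points not dominated on the chosen dimensions."""
    return sorted(
        name for name, p in data.items()
        if not any(dominates(o, p, dims) for o in data.values())
    )


def skyline_cube(data, n_dims: int) -> Dict[Tuple[int, ...], List[str]]:
    """Skyline of every non-empty subspace, keyed by a tuple of dimension indices."""
    return {
        dims: subspace_skyline(data, dims)
        for size in range(1, n_dims + 1)
        for dims in combinations(range(n_dims), size)
    }


cube = skyline_cube(PROPERTIES, len(ATTRS))
for dims, names in cube.items():
    print([ATTRS[i] for i in dims], names)
```

Under the smaller-is-better assumption, the skyline on price and dist is P2, P3, P5, and the skyline on price and age is P2, P3, P4, P5. The cube has 2^d − 1 entries, which is why sharing computation across subspaces matters in practice.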

32 Variations of Skyline Queries Multi-Sources –Multiple query points P2P Skyline Computation –Each peer runs skyline computation ……

33 An Open Question Group Skyline –Given 10 NBA players (score, rebound, assist), choose 3 of them as a team to maximize the team's scores, rebounds, and assists

34 Fatal Defects of Skyline Queries The cardinality of the result is huge –Given a set of n d-dimensional points whose dimensions are all independent, the expected number of skyline points is $O(\ln^{d-1} n / (d-1)!)$ Impractical How can this be improved?
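
To get a feel for how quickly this bound grows with dimensionality, the snippet below evaluates the leading term ln^(d−1) n / (d−1)! for a fixed n. It is only the asymptotic expression quoted on the slide with the hidden constant taken as 1, so the numbers are order-of-magnitude indications, not exact expected counts; the function name is made up.

```python
import math


def skyline_size_estimate(n: int, d: int) -> float:
    """Leading term ln^(d-1)(n) / (d-1)! for n points with d independent dimensions."""
    return math.log(n) ** (d - 1) / math.factorial(d - 1)


n = 1_000_000
for d in (2, 3, 5, 8):
    print(f"d={d}: about {skyline_size_estimate(n, d):,.0f} skyline points")
```

For one million points the estimate is about 14 at d = 2 but already around 1,500 at d = 5, which illustrates why a raw skyline is hard for a user to inspect.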

35 Reduce the cardinality of skyline Selecting Stars K-Dominant Skyline Core Skyline

36 Selecting Stars Skyline {p2, p4, p6} 1 representative skyline point –p6 2 representative skyline points –{p2, p6}

37 Selecting Stars
Skyline point    Dominant set
p2               p1
p4               p3
p6               p3, p5, p7
Skyline {p2, p4, p6} 1 representative skyline point –p6

38 Selecting Stars
Skyline points   Dominant set
p2, p4           p1, p3
p2, p6           p1, p3, p5, p7
p4, p6           p3, p5, p7
Skyline {p2, p4, p6} 2 representative skyline points –{p2, p6} Challenge: NP-Complete when d > 2
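
Choosing k representative skyline points so that together they dominate as many points as possible is a maximum-coverage problem. The sketch below uses the standard greedy heuristic on the dominant sets listed on these slides. The greedy strategy is a substitute chosen for illustration, not necessarily the method the talk refers to, and it is not guaranteed to be optimal in general; the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Dominant sets as listed on the slides.
DOMINATED: Dict[str, Set[str]] = {
    "p2": {"p1"},
    "p4": {"p3"},
    "p6": {"p3", "p5", "p7"},
}


def greedy_representatives(dominated: Dict[str, Set[str]], k: int) -> Tuple[List[str], Set[str]]:
    """Repeatedly pick the skyline point that adds the most newly dominated points."""
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(min(k, len(dominated))):
        best = max(
            (s for s in dominated if s not in chosen),
            key=lambda s: len(dominated[s] - covered),
        )
        chosen.append(best)
        covered |= dominated[best]
    return chosen, covered


print(greedy_representatives(DOMINATED, 1))
print(greedy_representatives(DOMINATED, 2))
```

On this data the greedy choice agrees with the slides: p6 alone for k = 1, and {p2, p6} for k = 2, since p4 adds nothing once p3 is already covered by p6.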

39 K-Dominant Skyline

40 k-Dominant Skyline (cont.) k-Dominate –If there are k dimensions on which A is not worse than B, and A is better than B on at least one of those k dimensions, we say that A k-dominates B.

41 k-Dominant Skyline (cont.) k-Dominant Skyline –The k-dominant skyline contains all the points that are not k-dominated by any other point Problems: –The result can be empty –Some good points may be pruned
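
The k-dominance test and the empty-result problem can both be seen in a short sketch. It assumes larger values are better; the three sample points are invented so that they k-dominate each other in a cycle when k = 2.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(a: Sequence[float], b: Sequence[float], k: int) -> bool:
    """a k-dominates b if a is not worse on at least k dimensions and strictly
    better on at least one of them (a strictly better dimension is always
    among the not-worse ones, so counting is enough)."""
    not_worse = sum(1 for x, y in zip(a, b) if x >= y)
    better = sum(1 for x, y in zip(a, b) if x > y)
    return not_worse >= k and better >= 1


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Points that no other point k-dominates."""
    return [p for p in points if not any(k_dominates(o, p, k) for o in points)]


# Hypothetical 3-D points forming a cycle under 2-dominance.
pts = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]
print(k_dominant_skyline(pts, 3))  # k = d is the ordinary skyline: all three points
print(k_dominant_skyline(pts, 2))  # each point is 2-dominated by another: empty list
```

Because k-dominance is not transitive for k < d, points can eliminate each other in a cycle, which is exactly how the result becomes empty.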

42 Core Skyline

43 Skylines on Uncertain Data Consider game-by-game statistics Conventional methods compute the skyline on –Separate game records –Aggregate: mean or median Limitations –Biased by outliers –Lose data distributions Probabilistic skylines –An instance has a probability to represent the object –An object has a probability to be in the skyline The 33rd International Conference on Very Large Data Bases (VLDB), Vienna, Austria, September 23-28 2007

44 Possible worlds
World   A    B    C
1       a1   b1   c1
2       a1   b1   c2
3       a1   b2   c1
4       a1   b2   c2
5       a2   b1   c1
6       a2   b1   c2
7       a2   b2   c1
8       a2   b2   c2
P(A) = 1 P(B) = 6/8 P(C) = 0
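
The skyline probability of an uncertain object can be computed by brute force: enumerate every possible world (one instance per object), compute the ordinary skyline in each, and count how often the object appears. The instance coordinates below are invented so that the result matches the probabilities on this slide; the slide itself does not give coordinates. Equal instance probabilities and larger-is-better are also assumptions.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, List, Tuple

Point = Tuple[float, ...]

# Hypothetical instances: two per object, each equally likely.
OBJECTS: Dict[str, List[Point]] = {
    "A": [(1, 10), (5, 5)],   # a1, a2
    "B": [(10, 1), (4, 4)],   # b1, b2
    "C": [(1, 1), (0, 0)],    # c1, c2
}


def dominates(a: Point, b: Point) -> bool:
    """Assumption: larger is better on every dimension."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_probabilities(objects: Dict[str, List[Point]]) -> Dict[str, Fraction]:
    """Fraction of possible worlds in which each object is a skyline point."""
    names = list(objects)
    worlds = list(product(*(objects[n] for n in names)))
    counts = {n: 0 for n in names}
    for world in worlds:
        for i, name in enumerate(names):
            if not any(dominates(world[j], world[i]) for j in range(len(names)) if j != i):
                counts[name] += 1
    return {n: Fraction(counts[n], len(worlds)) for n in names}


print(skyline_probabilities(OBJECTS))
```

Here B misses the skyline only in the two worlds that pair a2 with b2, giving 6/8. Enumeration is exponential in the number of objects, so it serves only to make the definition concrete.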

45 Top-K query –Given a threshold p, identify all the players whose skyline probability is >= p
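
Once skyline probabilities are known, the threshold query itself is a simple filter. The sketch below uses the three probabilities from the possible-worlds slide as input; the function name and the threshold value are illustrative, and efficient algorithms would avoid computing every probability exactly.

```python
from typing import Dict, List


def threshold_skyline(probabilities: Dict[str, float], p: float) -> List[str]:
    """Objects whose skyline probability is at least p."""
    return [name for name, prob in probabilities.items() if prob >= p]


# Probabilities taken from the possible-worlds example.
skyline_prob = {"A": 1.0, "B": 6 / 8, "C": 0.0}
print(threshold_skyline(skyline_prob, 0.5))
```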

46 KNN probabilistic skyline Given an object O, find the K objects whose skyline probabilities are nearest to that of O. Applications: –Given an NBA player (or a singer), find the K NBA players (or singers) whose performance is most similar to his or hers.
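
Read literally, this query ranks the other objects by how close their skyline probability is to that of O. A minimal sketch under that reading follows; the player names and probabilities are invented for illustration.

```python
from typing import Dict, List


def knn_by_skyline_probability(probabilities: Dict[str, float], target: str, k: int) -> List[str]:
    """The k objects (excluding target) with skyline probability closest to target's."""
    base = probabilities[target]
    others = [name for name in probabilities if name != target]
    return sorted(others, key=lambda name: abs(probabilities[name] - base))[:k]


# Hypothetical skyline probabilities for five players.
players = {"O": 0.60, "P1": 0.58, "P2": 0.90, "P3": 0.65, "P4": 0.10}
print(knn_by_skyline_probability(players, "O", 2))
```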

47 Experiment Dataset –Real dataset: NBA –Synthetic datasets: anti-correlated, correlated, independent Parameters –Dimension –Cardinality of dataset Efficiency –Time –Memory
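
The three synthetic distributions can be generated with a few lines of standard-library Python. This is a simplified 2-D sketch, not the generator used in the talk's experiments: the noise width, the seed, and the function names are arbitrary choices. It also counts skyline points to show why anti-correlated data is the hard case.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def generate(kind: str, n: int, seed: int = 0) -> List[Point]:
    """2-D synthetic data: 'independent', 'correlated' or 'anti-correlated'."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        if kind == "independent":
            x, y = rng.random(), rng.random()
        else:
            base = rng.random()
            nx, ny = rng.uniform(-0.05, 0.05), rng.uniform(-0.05, 0.05)
            if kind == "correlated":
                x, y = base + nx, base + ny        # near the diagonal y = x
            elif kind == "anti-correlated":
                x, y = base + nx, 1.0 - base + ny  # near the line x + y = 1
            else:
                raise ValueError(f"unknown kind: {kind}")
        points.append((x, y))
    return points


def skyline_size(points: List[Point]) -> int:
    """Number of non-dominated points (assumption: larger is better)."""
    def dominates(a: Point, b: Point) -> bool:
        return a[0] >= b[0] and a[1] >= b[1] and a != b
    return sum(1 for p in points if not any(dominates(o, p) for o in points))


for kind in ("correlated", "independent", "anti-correlated"):
    print(kind, skyline_size(generate(kind, 500)))
```

Correlated data tends to produce a very small skyline, while anti-correlated data produces a much larger one, so running time and memory are usually reported separately for each distribution.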

48 Thanks!

