Published by Stephen Kennedy. Modified over 9 years ago.
1
Database and Data-Intensive Systems
2
Data-Intensive Systems
- From monolithic architectures to diverse systems
  - Dedicated/specialized systems, column stores
  - Data centers, web architectures, distributed architectures
- From business data to all data
  - Streaming and sensor data, semi-structured and unstructured data
  - Multidimensional data, temporal data, spatio-temporal data
- Examples
  - Clustering of high-dimensional data
  - Tracking and continuous queries for moving objects
  - Mobile service infrastructure
  - Location privacy
  - Spatio-textual search/hyper-local web search
  - Multimedia similarity search
- This is where much of our research “lives.”
3
Staff
- Ira Assent, associate professor
- Christian S. Jensen, professor
- Vaida Ceikute, Ph.D. student
- Xiaohui Li, visiting Ph.D. student
- NN, Ph.D. student: GEOCROWD – indoor positioning and services infrastructure
- NN, Ph.D. student: GEOCROWD – spatial web objects
- NN, Ph.D. student: eData – Anomaly Detection in e-Science
- NN, Ph.D. student: Streamspin
- NN, Ph.D. student: WallViz
- NN, Ph.D. student: REDUCTION
- NN, Ph.D. student: REDUCTION
4
Graduate Course Portfolio: dDO
Data management for moving objects (Q3)
- The course covers selected research advances in indexing, update processing, and query processing for moving objects.
- Moving object tracking
- Specific indexing techniques
  - R-tree based indexing
  - B-tree based indexing
- Techniques for the efficient handling of frequent updates
- Techniques for range and k nearest neighbor query processing, including one-time as well as continuous queries
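To make the moving-object queries concrete, here is a minimal Python sketch. It assumes a linear motion model (each object reports a position and a velocity, so the server can extrapolate between updates), which is one common way to reduce update frequency but is not stated on the slide. The names `MovingObject`, `range_query`, and `knn_query` are hypothetical, and the queries scan all objects instead of using an R-tree or B-tree based index.

```python
from dataclasses import dataclass
import heapq
import math


@dataclass
class MovingObject:
    oid: str
    x: float
    y: float
    vx: float
    vy: float
    t_ref: float  # time of the last reported update

    def position_at(self, t):
        # Assumption: the object keeps its last reported velocity.
        dt = t - self.t_ref
        return (self.x + self.vx * dt, self.y + self.vy * dt)


def range_query(objects, t, x_min, y_min, x_max, y_max):
    """Return ids of objects predicted to be inside the rectangle at time t."""
    result = []
    for o in objects:
        px, py = o.position_at(t)
        if x_min <= px <= x_max and y_min <= py <= y_max:
            result.append(o.oid)
    return result


def knn_query(objects, t, qx, qy, k):
    """Return ids of the k objects predicted to be nearest to (qx, qy) at time t."""
    def dist(o):
        px, py = o.position_at(t)
        return math.hypot(px - qx, py - qy)
    return [o.oid for o in heapq.nsmallest(k, objects, key=dist)]


fleet = [
    MovingObject("a", 0.0, 0.0, 1.0, 0.0, t_ref=0.0),
    MovingObject("b", 10.0, 0.0, -1.0, 0.0, t_ref=0.0),
    MovingObject("c", 0.0, 10.0, 0.0, 0.0, t_ref=0.0),
]
print(range_query(fleet, 4.0, 3.0, -1.0, 7.0, 1.0))
print(knn_query(fleet, 4.0, 0.0, 0.0, k=1))
```

A continuous query would re-evaluate the same predicate as time advances or as updates arrive; an index replaces the linear scan when the number of objects is large.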
5
Graduate Course Portfolio: MDDB
Multidimensional databases (Q4)
- Selected techniques for the management of multidimensionally represented data
- Multidimensional data and applications
  - Data warehouses and data mining
  - Similarity search and query processing
- Efficient handling: indexing and associated query processing
  - Multistep similarity search
  - Indexing multidimensional data
  - Skyline query processing
- Data mining techniques
  - Subspace clustering
  - Classification
  - Outlier detection
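As an illustration of skyline query processing, the following Python sketch computes the skyline of a small point set with a simple nested-loop comparison. It assumes that smaller values are better in every dimension; the function names and the hotel data are made up for the example, and index-based skyline algorithms are not shown.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that are not dominated by any other point."""
    result = []
    for p in points:
        if any(dominates(s, p) for s in result):
            continue  # p is dominated by a current skyline point
        # p survives: drop skyline points that p dominates, then add p
        result = [s for s in result if not dominates(p, s)]
        result.append(p)
    return result


# Hypothetical data: (price, distance to the beach); lower is better for both.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2)]
print(skyline(hotels))
```

Here `(70, 5)` is dropped because `(60, 3)` is both cheaper and closer, and `(90, 2)` is dropped because of `(80, 1)`.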
6
Graduate Course Portfolio: Index
Indexing of disk-based data (Q1)
- Indexing techniques for different types of disk-based data, as well as their support for queries and updates
- General overview of indexes and query processing
- Spatial indexing structures
- Space partitioning indexing structures
- Indexes for high-dimensional data
- Metric approaches
- Special techniques for complex data types
- Coming up for the first time this fall
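A uniform grid is perhaps the simplest space-partitioning index. The Python sketch below keeps the grid in memory for brevity; in a disk-based setting each cell would roughly correspond to a page or bucket. The class name `GridIndex` and its methods are hypothetical and are not taken from the course material.

```python
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2D points; each cell stores the points that fall into it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, oid, x, y):
        self.cells[self._cell(x, y)].append((oid, x, y))

    def range_query(self, x_min, y_min, x_max, y_max):
        # Visit only the cells that overlap the query rectangle.
        cx_min, cy_min = self._cell(x_min, y_min)
        cx_max, cy_max = self._cell(x_max, y_max)
        hits = []
        for cx in range(cx_min, cx_max + 1):
            for cy in range(cy_min, cy_max + 1):
                for oid, x, y in self.cells.get((cx, cy), ()):
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append(oid)
        return hits


index = GridIndex(cell_size=10)
index.insert("p1", 1, 1)
index.insert("p2", 12, 3)
index.insert("p3", 25, 25)
print(sorted(index.range_query(0, 0, 15, 5)))
```

The query touches only two of the three occupied cells, which is the point of partitioning space: most of the data is never inspected.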
7
Graduate Course Portfolio: dDB2
Database management systems (Q2)
- The course aims to give the participants a solid conceptual foundation for making competent use of a database management system.
- Logical and physical query optimization and query processing
- Concurrency control techniques
- Database tuning
- Central concepts and techniques in relation to supporting temporal and multi-dimensional data
- Coming up for the first time this fall
8
Projects
- Streamspin
  - Enable sites that are for mobile services what YouTube is for video
  - Easy mobile service creation and sharing
  - Advanced spatial and social context functionality
  - Be an open, extensible, and scalable service delivery infrastructure
- MOVE
  - Knowledge extraction from massive data about moving objects
  - Cross-cutting activities, showcases, and evaluation
  - Representation of movement data and spatio-temporal databases
  - Analysis of movement and spatio-temporal data mining
- WallViz
  - Collaborative analysis, joint decision making on wall-sized displays
  - Scale to massive data collections
  - Support ad-hoc queries
  - Automatically provide entry points for analysis
9
Projects (2)
- GEOCROWD
  - Creating a Geospatial Knowledge World: advance the state of the art in collecting, storing, analyzing, processing, reconciling, and publishing user-generated geospatial information on the Web
- REDUCTION
  - Reducing the environmental footprint of fleets of vehicles
  - Optimizing the behavior of drivers
  - Supporting eco-routing of vehicles
  - Enabling transparency in multi-modal transportation
- eData
  - Robust analysis in the context of imperfect data in e-Science
  - Detect and correct anomalies effectively
  - On-line, interactive, lineage-preserving, and semi-automatic
  - Scalable algorithms
10
How We Typically Work
- We target some real problem that we find interesting.
- We define the problem precisely.
- We develop a solution that is typically a data structure or an algorithm, i.e., a concrete technique.
- To evaluate, we build prototypes.
  - These are built for the purpose of studying the properties of our solutions.
  - We are often interested in performance, e.g., runtime, space usage, communication cost.
- For some solutions we state formal properties that we then prove, e.g., the correctness of a particular technique.
- In brief: isolate and define the problem, construct a solution, then evaluate it.
11
Example 1: Spatial Web Querying
- Setting
  - Google: ~90 billion queries/month, ~20 billion with local intent.
  - We want to integrate exact locations of websites (for shops, bars, etc.) and users into web querying.
- Queries
  - Results must match the query text and must be near the user.
  - Results of continuous queries must be updated as the user moves.
- Challenges
  - Support such queries with low computation cost on the server and with little communication between server and client.
- Solution
  - Invent an index that supports both text and location
  - Use a safe zone to reduce the communication between user and server for continuous queries
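The safe-zone idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the technique from the talk: the server scans all places instead of using a combined text-and-location index, the query asks only for the single nearest match, and the safe zone is a conservative circle of radius (d2 - d1) / 2, where d1 and d2 are the distances to the nearest and second-nearest matching places. Inside that circle the nearest match cannot change, by the triangle inequality. All names and data are hypothetical.

```python
import math

# Hypothetical web objects: (name, location, keywords)
PLACES = [
    ("cafe_a", (0.0, 0.0), {"coffee", "cake"}),
    ("cafe_b", (10.0, 0.0), {"coffee"}),
    ("bar_c", (1.0, 1.0), {"beer"}),
]


def nearest_with_safe_zone(places, user, keywords):
    """Server side: nearest place containing all keywords, plus a safe radius."""
    matches = sorted(
        (math.dist(user, loc), name)
        for name, loc, words in places
        if keywords <= words
    )
    if not matches:
        return None, 0.0
    d1, best = matches[0]
    if len(matches) == 1:
        return best, math.inf  # no competitor, the result never changes
    d2 = matches[1][0]
    return best, (d2 - d1) / 2.0


class Client:
    """Client side: contacts the server only after leaving the safe zone."""

    def __init__(self, places, keywords):
        self.places = places
        self.keywords = keywords
        self.anchor = None
        self.radius = 0.0
        self.result = None
        self.server_calls = 0

    def move_to(self, pos):
        if self.anchor is None or math.dist(pos, self.anchor) > self.radius:
            self.result, self.radius = nearest_with_safe_zone(
                self.places, pos, self.keywords
            )
            self.anchor = pos
            self.server_calls += 1
        return self.result


client = Client(PLACES, {"coffee"})
for pos in [(2.0, 0.0), (4.0, 0.0), (6.0, 0.0)]:
    print(pos, client.move_to(pos), client.server_calls)
```

In this run the second position lies inside the safe zone computed at the first position, so the client answers locally; only the third position triggers a new request.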
12
Example 2: Fraud detection
- There are billions of financial transactions per minute
- How do we uncover fraud?
  - Scalability
  - Results in time for a reaction
  - Manageable results
- Possible solution sketch
  - Identify attributes of suspicious transactions
  - Sort incoming transactions into a tree structure of historic data
  - When processing time is up, output a degree of suspicion based on similarity to valid or fraudulent historic data
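One way to read the solution sketch is as an "anytime" classifier: a transaction descends a tree built over historic data, and whenever the time budget runs out, the node reached so far yields a suspicion score. The Python sketch below is a rough illustration under assumptions that are not in the slides: the tree is a simple median-split tree, the time budget is modelled as a maximum number of descent steps, the score is the fraction of fraudulent historic transactions under the current node, and the features and data are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    fraud: int  # fraudulent historic transactions in this subtree
    valid: int  # valid historic transactions in this subtree
    dim: int = 0
    split: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(rows, depth=0, leaf_size=2):
    """rows: non-empty list of (features, is_fraud); split at the median, cycling dimensions."""
    fraud = sum(1 for _, is_fraud in rows if is_fraud)
    node = Node(fraud=fraud, valid=len(rows) - fraud)
    if len(rows) <= leaf_size:
        return node
    dim = depth % len(rows[0][0])
    rows = sorted(rows, key=lambda r: r[0][dim])
    mid = len(rows) // 2
    node.dim, node.split = dim, rows[mid][0][dim]
    node.left = build(rows[:mid], depth + 1, leaf_size)
    node.right = build(rows[mid:], depth + 1, leaf_size)
    return node


def suspicion(root, features, max_steps):
    """Descend for at most max_steps; score with the node reached when time is up."""
    node = root
    for _ in range(max_steps):
        child = node.left if features[node.dim] < node.split else node.right
        if child is None:  # reached a leaf
            break
        node = child
    return node.fraud / (node.fraud + node.valid)


# Hypothetical history: ((amount, hour of day), is_fraud)
HISTORY = [
    ((10, 10), False), ((20, 14), False), ((30, 9), False), ((40, 16), False),
    ((800, 2), True), ((900, 3), True), ((1000, 13), False), ((1100, 4), True),
]
root = build(HISTORY)
for steps in (0, 1, 2):
    print(steps, suspicion(root, (950, 2.5), steps))
```

With more steps the score is based on an ever smaller and more similar set of historic transactions, so the estimate sharpens as more processing time is available.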
13
Interested? Come talk to us! We currently have M.Sc. and Ph.D. thesis openings.