Database and Data-Intensive Systems. Data-Intensive Systems From monolithic architectures to diverse systems Dedicated/specialized systems, column stores.


Database and Data-Intensive Systems

Data-Intensive Systems
From monolithic architectures to diverse systems
- Dedicated/specialized systems, column stores
- Data centers, web architectures, distributed architectures
From business data to all data
- Streaming and sensor data, semi-structured and unstructured data
- Multidimensional data, temporal data, spatio-temporal data
Examples
- Clustering of high-dimensional data
- Tracking and continuous queries for moving objects
- Mobile service infrastructure
- Location privacy
- Spatio-textual search/hyper-local web search
- Multimedia similarity search
This is where much of our research “lives.”

Staff
- Ira Assent, associate professor
- Christian S. Jensen, professor
- Vaida Ceikute, Ph.D. student
- Xiaohui Li, visiting Ph.D. student
- NN, Ph.D. student, GEOCROWD – indoor positioning and services infrastructure
- NN, Ph.D. student, GEOCROWD – spatial web objects
- NN, Ph.D. student, eData – Anomaly Detection in e-Science
- NN, Ph.D. student, Streamspin
- NN, Ph.D. student, WallViz
- NN, Ph.D. student, REDUCTION
- NN, Ph.D. student, REDUCTION

Graduate Course Portfolio: dDO
Data management for moving objects (Q3)
The course covers selected research advances in the general area of indexing, update processing, and query processing for moving objects.
- Moving object tracking
- Specific indexing techniques
  - R-tree based indexing
  - B-tree based indexing
- Techniques for the efficient handling of frequent updates
- Techniques for range and k nearest neighbor query processing, including one-time as well as continuous queries
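To make the k nearest neighbor query over moving objects concrete, here is a minimal Python sketch. It is not taken from the course material: it assumes a simple linear motion model (each object reports a position and a velocity at its last update) and answers a one-time query by brute-force scan, where a real system would use one of the index structures named above. The names `MovingObject` and `knn_at` are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass
class MovingObject:
    """Assumed linear motion model: position and velocity at the last update."""
    oid: str
    x: float
    y: float
    vx: float
    vy: float
    t_ref: float  # time of the last update

    def position_at(self, t):
        # Extrapolate the position from the last reported state.
        dt = t - self.t_ref
        return (self.x + self.vx * dt, self.y + self.vy * dt)


def knn_at(objects, qx, qy, t, k):
    """Return the ids of the k objects predicted to be nearest to (qx, qy) at time t.

    Brute-force scan for illustration only; an index would avoid touching every object.
    """
    def dist(obj):
        px, py = obj.position_at(t)
        return math.hypot(px - qx, py - qy)

    return [obj.oid for obj in heapq.nsmallest(k, objects, key=dist)]


objects = [
    MovingObject("a", 0.0, 0.0, 1.0, 0.0, 0.0),    # moving right
    MovingObject("b", 10.0, 0.0, -1.0, 0.0, 0.0),  # moving left
    MovingObject("c", 0.0, 5.0, 0.0, 0.0, 0.0),    # stationary
]

print(knn_at(objects, 0.0, 0.0, t=0.0, k=2))
print(knn_at(objects, 0.0, 0.0, t=8.0, k=2))
```

The same query point yields different answers at different times, which is why continuous queries need to be re-evaluated or maintained incrementally as objects move.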

Graduate Course Portfolio: MDDB
Multidimensional databases (Q4)
Selected techniques for the management of multidimensionally represented data
- Multidimensional data and applications
  - Data warehouses and data mining
  - Similarity search and query processing
- Efficient handling: indexing and associated query processing
  - Multistep similarity search
  - Indexing multidimensional data
  - Skyline query processing
- Data mining techniques
  - Subspace clustering
  - Classification
  - Outlier detection
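As an illustration of skyline query processing, the sketch below computes the skyline of a small point set with a simple nested-loop scan. It assumes that smaller values are better in every dimension; the hotel data and the function names are made up for the example, and the course may well cover more efficient, index-based algorithms.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that are not dominated by any other point."""
    result = []
    for p in points:
        if any(dominates(s, p) for s in result):
            continue  # p is dominated by a current skyline point
        # p survives: drop any skyline points that p dominates, then keep p.
        result = [s for s in result if not dominates(p, s)]
        result.append(p)
    return result


# Hypothetical hotels as (price, distance to beach); lower is better in both.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2)]
print(skyline(hotels))
```

Here (70, 5) is dominated by (60, 3) and (90, 2) by (80, 1), so only the three remaining hotels represent genuinely different price/distance trade-offs.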

Graduate Course Portfolio: Index
Indexing of disk-based data (Q1)
Indexing techniques for different types of disk-based data, as well as their support for queries and updates
- General overview of indexes and query processing
- Spatial indexing structures
- Space partitioning indexing structures
- Indexes for high-dimensional data
- Metric approaches
- Special techniques for complex data types
Coming up for the first time this fall

Graduate Course Portfolio: dDB2
Database management systems (Q2)
The course aims to give the participants a solid conceptual foundation for making competent use of a database management system.
- Logical and physical query optimization and query processing
- Concurrency control techniques
- Database tuning
- Central concepts and techniques in relation to supporting temporal and multidimensional data
Coming up for the first time this fall

Projects
Streamspin: Enable sites that are for mobile services what YouTube is for video
- Easy mobile service creation and sharing
- Advanced spatial and social context functionality
- Be an open, extensible, and scalable service delivery infrastructure
MOVE: Knowledge extraction from massive data about moving objects
- Cross-cutting activities, showcases, and evaluation
- Representation of movement data and spatio-temporal databases
- Analysis of movement and spatio-temporal data mining
WallViz: Collaborative analysis, joint decision making on wall-sized displays
- Scale to massive data collections
- Support ad-hoc queries
- Automatically provide entry points for analysis

Projects (2)
GEOCROWD: Creating a Geospatial Knowledge World
- Advance the state of the art in collecting, storing, analyzing, processing, reconciling, and publishing user-generated geospatial information on the Web
REDUCTION: Reducing the environmental footprint of fleets of vehicles
- Optimizing the behavior of drivers
- Supporting eco-routing of vehicles
- Enabling transparency in multi-modal transportation
eData: Robust analysis in the context of imperfect data in e-Science
- Detect and correct anomalies effectively
- On-line, interactive, lineage-preserving, and semi-automatic
- Scalable algorithms

How We Typically Work
- We target some real problem that we find interesting.
- We define the problem precisely.
- We develop a solution that is typically a data structure or an algorithm, i.e., a concrete technique.
- To evaluate, we build prototypes. These are built for the purpose of studying the properties of our solutions.
- We are often interested in performance, e.g., runtime, space usage, communication cost.
- For some solutions we state formal properties that we then prove, e.g., the correctness of a particular technique.
In brief: isolate and define the problem, construct a solution, then evaluate it.

Example 1: Spatial Web Querying
Setting
- Google: ~90 billion queries/month, ~20 billion with local intent.
- We want to integrate exact locations of websites (for shops, bars, etc.) and users into web querying.
Queries
- Results must match the query text and must be near the user.
- Results of continuous queries must be updated as the user moves.
Challenges?
- Support such queries with low computation cost on the server and with little communication between server and client.
Solution
- Invent an index that supports both text and location.
- Use a safe zone to reduce the communication between user and server for continuous queries.
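The safe-zone idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the group's actual technique: it assumes a nearest-neighbor query with a Boolean keyword filter, scans all places linearly instead of using a combined text-and-location index, and uses a circular safe zone of radius (d2 - d1)/2, where d1 and d2 are the distances to the nearest and second-nearest matching places. Inside that circle the nearest match cannot change, so the client does not need to contact the server. All names and data are hypothetical.

```python
import math


def nearest_with_safe_zone(places, user, query_terms):
    """Return (best matching place, safe-zone radius around the user's location).

    places: list of (name, (x, y), set of terms). Linear scan for illustration only.
    """
    candidates = sorted(
        (math.dist(user, loc), name)
        for name, loc, terms in places
        if query_terms <= terms  # every query term must appear in the place's text
    )
    if not candidates:
        return None, math.inf  # no match anywhere, so moving cannot change the result
    d1, best = candidates[0]
    if len(candidates) == 1:
        return best, math.inf  # a single match stays the nearest everywhere
    d2 = candidates[1][0]
    # Within this radius, the nearest match is still at least as close as the runner-up.
    return best, (d2 - d1) / 2


def needs_refresh(anchor, radius, current):
    """Client-side check: has the user left the safe zone computed at `anchor`?"""
    return math.dist(anchor, current) > radius


places = [
    ("Cafe A", (0.0, 0.0), {"coffee", "wifi"}),
    ("Cafe B", (10.0, 0.0), {"coffee"}),
    ("Bar C", (1.0, 0.0), {"beer"}),
]

anchor = (2.0, 0.0)
best, radius = nearest_with_safe_zone(places, anchor, {"coffee"})
print(best, radius)
print(needs_refresh(anchor, radius, (4.0, 0.0)))  # still inside the safe zone
print(needs_refresh(anchor, radius, (6.0, 0.0)))  # outside: ask the server again
```

The client only sends a new request when `needs_refresh` returns true, which is how a safe zone trades a little extra server-side computation for fewer messages.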

Example 2: Fraud Detection
There are billions of financial transactions per minute. How do we uncover fraud?
- Scalability
- Timely results, so that there is time to react
- Manageable results
Possible solution sketch
- Identify attributes of suspicious transactions.
- Sort incoming transactions into a tree structure of historic data.
- When processing time is up, output a degree of suspicion based on similarity to valid or fraudulent historic data.
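One way to read this solution sketch is as an anytime classifier: the deeper a transaction descends into a tree over historic data, the more specific the answer, and whatever node it has reached when time runs out supplies the result. The Python sketch below is an assumption-laden illustration of that reading, not the method from the slides: it builds a k-d-style tree with median splits, uses the fraction of fraudulent records below a node as the degree of suspicion, and replaces the wall-clock budget with a step budget `max_steps`. The data and all names are invented, and `history` must be non-empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    fraud_fraction: float           # share of fraudulent records below this node
    dim: int = 0                    # attribute used for the split
    split: float = 0.0              # split value on that attribute
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(history, depth=0, leaf_size=2):
    """Build a k-d-style tree over (features, is_fraud) records using median splits."""
    fraction = sum(1 for _, is_fraud in history if is_fraud) / len(history)
    node = Node(fraud_fraction=fraction)
    if len(history) <= leaf_size:
        return node
    dim = depth % len(history[0][0])            # cycle through the attributes
    ordered = sorted(history, key=lambda rec: rec[0][dim])
    mid = len(ordered) // 2
    node.dim, node.split = dim, ordered[mid][0][dim]
    node.left = build(ordered[:mid], depth + 1, leaf_size)
    node.right = build(ordered[mid:], depth + 1, leaf_size)
    return node


def suspicion(root, tx, max_steps):
    """Descend at most max_steps levels, then report the fraud fraction reached."""
    node = root
    for _ in range(max_steps):
        if node.left is None:
            break  # reached a leaf before the budget ran out
        node = node.left if tx[node.dim] < node.split else node.right
    return node.fraud_fraction


# Hypothetical historic transactions: ((amount, hour of day), is_fraud).
history = [
    ((10, 12), False), ((20, 13), False), ((15, 11), False), ((30, 14), False),
    ((900, 3), True), ((950, 2), True), ((800, 4), True), ((25, 3), False),
]
root = build(history)
incoming = (920, 2)
for budget in (0, 1, 2):
    print(budget, suspicion(root, incoming, budget))
```

With no budget the answer is just the overall fraud rate; each extra step refines it using more similar historic transactions, so a usable result is available whenever processing has to stop.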

Interested? Come talk to us! We currently have M.Sc. and Ph.D. thesis openings.