Plan for Final Lecture What you may expect to be asked in the Exam?

Slides:



Advertisements
Similar presentations
Supervisor : Prof . Abbdolahzadeh
Advertisements

An overview of Data Warehousing and OLAP Technology Presented By Manish Desai.
OLAP Tuning. Outline OLAP 101 – Data warehouse architecture – ROLAP, MOLAP and HOLAP Data Cube – Star Schema and operations – The CUBE operator – Tuning.
Data Warehousing CPS216 Notes 13 Shivnath Babu. 2 Warehousing l Growing industry: $8 billion way back in 1998 l Range from desktop to huge: u Walmart:
NoSQL Databases: MongoDB vs Cassandra
CSE6011 Warehouse Models & Operators  Data Models  relations  stars & snowflakes  cubes  Operators  slice & dice  roll-up, drill down  pivoting.
CS346: Advanced Databases
8/20/ Data Warehousing and OLAP. 2 Data Warehousing & OLAP Defined in many different ways, but not rigorously. Defined in many different ways, but.
XML, distributed databases, and OLAP/warehousing The semantic web and a lot more.
Analysis Services 101 Dave Fackler, MCDBA, MCSE, MCT Director, Business Intelligence Practice Intellinet Corporation.
1 CUBE: A Relational Aggregate Operator Generalizing Group By By Ata İsmet Özçelik.
HBase A column-centered database 1. Overview An Apache project Influenced by Google’s BigTable Built on Hadoop ▫A distributed file system ▫Supports Map-Reduce.
Cassandra - A Decentralized Structured Storage System
1 Data Warehouses BUAD/American University Data Warehouses.
Data Warehousing.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Exam and Lecture Overview.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Architecture.
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
NOSQL DATABASE Not Only SQL DATABASE
HDB++: High Availability with
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Advanced Database Design.
SQL Server Analysis Services Understanding Unified Dimension Model (UDM)
1 Database Systems, 8 th Edition Star Schema Data modeling technique –Maps multidimensional decision support data into relational database Creates.
Department of Computer Science, Johns Hopkins University EN Instructor: Randal Burns 24 September 2013 NoSQL Data Models and Systems.
Data Warehousing and OLAP Outline u Models & operations u Implementing a warehouse u Future directions.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Cassandra Architecture.
2 Copyright © 2008, Oracle. All rights reserved. Building the Physical Layer of a Repository.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Cloud Data Models Lecturer.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Cassandra Tools and.
Cassandra The Fortune Teller
Plan for Cloud Data Models
CSE-291 (Distributed Systems) Winter 2017 Gregory Kesden
Design and Implementation
CS 405G: Introduction to Database Systems
NO SQL for SQL DBA Dilip Nayak & Dan Hess.
and Big Data Storage Systems
Cloud Computing and Architecuture
Cassandra - A Decentralized Structured Storage System
Cassandra Storage Engine
Introduction to Cassandra
Cassandra Tools and Config Files
Creating Repositories from Multidimensional Data Sources
Data Warehousing CIS 4301 Lecture Notes 4/20/2006.
Chapter 13 Business Intelligence and Data Warehouses
MongoDB Er. Shiva K. Shrestha ME Computer, NCIT
Open Source distributed document DB for an enterprise
On-Line Analytic Processing
MongoDB Distributed Write and Read
Learning MongoDB ZhangGang
Dynamo: Amazon’s Highly Available Key-value Store
Cassandra Transaction Processing
NOSQL.
The NoSQL Column Store used by Facebook
SQL: Advanced Options, Updates and Views Lecturer: Dr Pavle Mogin
IBM DATASTAGE online Training at GoLogica
NOSQL databases and Big Data Storage Systems
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Data Warehouse.
CSE-291 (Cloud Computing) Fall 2016 Gregory Kesden
Apache Cassandra for the SQLServer DBA
Lecture 17: Distributed Transactions
Implementing Data Models & Reports with Microsoft SQL Server
Data Warehouse and OLAP
Enhance BI Applications and Simplify Development
DataMart (Data Warehouse) Tool:
Adding Multiple Logical Table Sources
Contents Preface I Introduction Lesson Objectives I-2
Cloud Computing for Data Analysis Pig|Hive|Hbase|Zookeeper
Data Warehouse and OLAP
Presentation transcript:

Exam and Lecture Overview Lecturer : Dr. Pavle Mogin

Plan for Final Lecture What you may expect to be asked in the Exam? The answer: All what we have learned is examinable, except the topic of the essay Exam questions will cover all the important topics we learned in lectures and applied in assignments 120 marks, 120 minutes Since we covered many topics, quiet a number of questions will be asking for a rather short answer Use Lecture Notes and Assignments as your main source for making the preparation

Exam – Structure and Content Exam structure will follow the structure of lectures: Cloud Databases General features of Cloud Databases (~23%) Cassandra (~44%) MongoDB (~16%) Data Warehousing and OLAP (~17%)

Cloud Databases – General Features Basic features Scalability Availability, Shared nothing, Data partitioning and replication, Consistent Hashing, Master – Slave, Membership changes Data versioning, Gossip protocol Trade-offs in cloud databases CAP theorem BASE Range of consistency levels Availability and consistency trade-offs

Cassandra (1) Data model: CQL: Column families with a CQL abstraction layer CQL: Keyspace (column families having the same replication factor), Tables (abstracting column families) (have non default features), Query syntax: very close to the relational SQL, but not that powerful and with many restrictions not present in SQL

Cassandra (2) Data modeling Storage Engine Consistency levels: Table Primary Key and Partitioning Storage Engine Rows Log Structured Merge Trees (mem table, log file, SSTables) Write Paths for Insert and Update About Reads (use of Bloom filters) About Deletes (use of tombstones) Compaction (deleting obsolete and deleted column values, uniting row fragments) Consistency levels: Spread from strict via strong to eventual Adjustable per a DML statement

Cassandra (3) Architecture Light weight transactions ( CAS): Internode Communication Gossip Protocol Seed Nodes Partitioning Consistent hashing A partitioner maps a partition key value into a token Snitches (files containing topology information) Keyspace and the replication strategy cassasndra.yaml cassandra-topology.properties Light weight transactions ( CAS): To avoid overwrite of other people’s work

Cassandra (4) Repair mechanisms Read repair Hinted Handoff Writes Extreme write availability Anti-entropy Node Repair (Merkle Trees)

MongoDB (1) Data Model: Data representation by documents Document implementation by: Embedding or Referencing A rich query language: Shell methods: find(), insert(), update() Simple aggregation (count(), distinct()), Pipelined Aggregation with several stages, each having several expressions, Powerful when combined with java scripts

MongoDB (2) Architecture Sharding: Sharding architecture Replication: Shard Key, Balancing Data Distribution, Sharding architecture Drivers (accept client’s requests), Routers (mogos processes - rout user requests to appropriate replica set using meta data), Configuration servers (contain routing information) Replica set (contain shard data) Replication: Master - Slave Mode, Replica Set Operation, Failover - Election of a new Master (HA Cluster)

Data Warehousing and OLAP (1) OLAP Database Structures Star Schema Facts, Dimensions, Attributes and attribute hierarchies, Snowflakes, and Constellations Hyper Cube (Multidimensional Cube) Basic OLAP Queries Roll-up (hierachical and dimensional) Drill-down, Slice and Dice, and Pivoting.

Data Warehousing (2) OLAP Specific Extensions of SQL CUBE, ROLLUP, WINDOW, and RANK Materialized Views Dimension Hierarchy A syntax for defining attribute hierarchies and functional dependencies between attributes Query Rewriting Text match, and General query rewrite with: Join compatibility check, Data sufficiency check, Grouping compatibility check, and Aggregate computability check

Data Warehousing (3) Aggregate Functions OLAP Architectures Distributive (SUM, COUNT, MIN, MAX), Algebraic (AVG, VAR), and Holistic (Median, RANK, TopN) OLAP Architectures Virtual Datawarehouse, Data Mart in a Box, Stovepipe Data Mart, Architected Data Mart, Enterprise Data Warehouse with HOLAP