Course notes CS2606: Data Structures and Object-Oriented Development Ryan Richardson John Paul Vergara Department of Computer Science Virginia Tech Spring.


Course notes CS2606: Data Structures and Object-Oriented Development Ryan Richardson John Paul Vergara Department of Computer Science Virginia Tech Spring 2008

Goals of this Course
1. Reinforce the concept that costs and benefits exist for every data structure.
2. Learn the commonly used data structures.
   – These form a programmer's basic data structure "toolkit."
3. Understand how to measure the cost of a data structure or program.
   – These techniques also allow you to judge the merits of new data structures that you or others might invent.

The Need for Data Structures
Data structures organize data ⇒ more efficient programs.
More powerful computers ⇒ more complex applications.
More complex applications demand more calculations.
Complex computing tasks are unlike our everyday experience.

Organizing Data
Any organization for a collection of records can be searched, processed in any order, or modified.
The choice of data structure and algorithm can make the difference between a program running in a few seconds or many days.
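To make the seconds-versus-days claim concrete, here is a minimal sketch that counts the work done by two ways of searching the same records. The function names `linearSearchSteps` and `binarySearchSteps` are hypothetical and not part of the course notes; the sketch assumes integer keys, stored in sorted order for the binary search.

```cpp
#include <cstddef>
#include <vector>

// Number of elements a linear scan examines before it finds target
// (or v.size() if target is absent).
std::size_t linearSearchSteps(const std::vector<int>& v, int target) {
    std::size_t steps = 0;
    for (int x : v) {
        ++steps;
        if (x == target) break;
    }
    return steps;
}

// Number of probes binary search makes on a vector sorted in
// ascending order. Each probe halves the remaining range.
std::size_t binarySearchSteps(const std::vector<int>& sorted, int target) {
    std::size_t lo = 0, hi = sorted.size(), steps = 0;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++steps;
        if (sorted[mid] == target) break;
        if (sorted[mid] < target) lo = mid + 1;
        else hi = mid;
    }
    return steps;
}
```

For one million sorted keys, a linear scan for the last key examines every element, while binary search needs at most 20 probes. The organization of the data (sorted or not) is what makes the faster algorithm possible.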

Efficiency
A solution is said to be efficient if it solves the problem within its resource constraints.
– Space
– Time
The cost of a solution is the amount of resources that the solution consumes.

Selecting a Data Structure
Select a data structure as follows:
1. Analyze the problem to determine the basic operations that must be supported.
2. Quantify the resource constraints for each operation.
3. Select the data structure that best meets these requirements.

Some Questions to Ask
Are all data inserted into the data structure at the beginning, or are insertions interspersed with other operations?
Can data be deleted?
Are all data processed in some well-defined order, or is random access allowed?

Costs and Benefits
Each data structure has costs and benefits.
Rarely is one data structure better than another in all situations.
Any data structure requires:
– space for each data item it stores,
– time to perform each basic operation,
– programming effort.

Costs and Benefits (cont)
Each problem has constraints on available space and time.
Only after a careful analysis of problem characteristics can we know the best data structure for the task.
Bank example:
– Start account: a few minutes
– Transactions: a few seconds
– Close account: overnight

Example 1.2
Problem: Create a database containing information about cities and towns.
Tasks: Find by name or attribute or location
– Exact match, range query, spatial query
Resource requirements: Times can be from a few seconds for simple queries to a minute or two for complex queries.

Abstract Data Types
Abstract Data Type (ADT): a definition for a data type solely in terms of a set of values and a set of operations on that data type.
Each ADT operation is defined by its inputs and outputs.
Encapsulation: Hide implementation details.

Data Structure
A data structure is the physical implementation of an ADT.
– Each operation associated with the ADT is implemented by one or more subroutines in the implementation.
Data structure usually refers to an organization for data in main memory.
File structure: an organization for data on peripheral storage, such as a disk drive.
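The split between an ADT and a data structure can be sketched in C++ as an abstract class (operations only) and a concrete class (one physical implementation). The names `IntStack` and `ArrayStack` are hypothetical, chosen only for illustration; a stack is used because it is small, not because the notes single it out.

```cpp
#include <cstddef>
#include <vector>

// ADT: a stack of integers, defined only by its operations,
// their inputs, and their outputs.
class IntStack {
public:
    virtual ~IntStack() = default;
    virtual void push(int value) = 0;      // add a value on top
    virtual bool pop(int& out) = 0;        // remove the top value; false if empty
    virtual std::size_t size() const = 0;  // number of stored values
};

// Data structure: one physical implementation of the ADT, here
// backed by a contiguous array (std::vector) in main memory.
class ArrayStack : public IntStack {
public:
    void push(int value) override { items_.push_back(value); }

    bool pop(int& out) override {
        if (items_.empty()) return false;
        out = items_.back();
        items_.pop_back();
        return true;
    }

    std::size_t size() const override { return items_.size(); }

private:
    std::vector<int> items_;  // hidden from users of the ADT (encapsulation)
};
```

Code written against `IntStack` keeps working if `ArrayStack` is later replaced by, say, a linked implementation, which is the practical payoff of hiding implementation details.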

Metaphors
An ADT manages complexity through abstraction: metaphor.
– Hierarchies of labels
Ex: transistors → gates → CPU.
In a program, implement an ADT, then think only about the ADT, not its implementation.

Logical vs. Physical Form
Data items have both a logical and a physical form.
Logical form: definition of the data item within an ADT.
– Ex: Integers in mathematical sense: +, -
Physical form: implementation of the data item within a data structure.
– Ex: 16/32 bit integers, overflow.
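The gap between the two forms shows up as soon as a result no longer fits the physical representation. The following sketch uses a hypothetical helper, `addU16`, and an unsigned 16-bit type because unsigned wraparound is well defined in C++ (signed overflow is not).

```cpp
#include <cstdint>

// Logical form: in mathematics, 65535 + 1 is 65536.
// Physical form: a 16-bit unsigned integer holds only 0..65535,
// so the stored result wraps around modulo 2^16.
inline std::uint16_t addU16(std::uint16_t a, std::uint16_t b) {
    // The operands are widened for the addition on typical platforms;
    // converting the sum back to 16 bits discards the high bits.
    return static_cast<std::uint16_t>(a + b);
}
```

With this helper, `addU16(65535, 1)` yields 0 rather than 65536: the physical form cannot represent every value of the logical form.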

[Figure: Data Type. Labels: ADT (Type, Operations); Data Items: Logical Form; Data Structure (Storage Space, Subroutines); Data Items: Physical Form.]

Example 1.8
A typical database-style project will have many interacting parts.

Binary Search Trees
What is good and bad about BSTs?
Space requirements?
Time requirements?
Average vs. Worst Case?
What drives worst-case behavior?
What problems can a BST solve?
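One way to explore the worst-case question is to build an unbalanced BST from the same keys in two different orders and compare heights. This is a minimal sketch with hypothetical names (`Node`, `insert`, `height`, `heightAfterInserting`); it is not the course's BST template and stores plain `int` keys.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Standard unbalanced BST insertion, written iteratively.
// Duplicate keys are ignored.
void insert(std::unique_ptr<Node>& root, int key) {
    std::unique_ptr<Node>* cur = &root;
    while (*cur) {
        if (key == (*cur)->key) return;
        cur = (key < (*cur)->key) ? &(*cur)->left : &(*cur)->right;
    }
    *cur = std::make_unique<Node>(key);
}

// Height = number of nodes on the longest root-to-leaf path.
int height(const Node* n) {
    if (!n) return 0;
    return 1 + std::max(height(n->left.get()), height(n->right.get()));
}

// Build a BST by inserting keys in the given order; report its height.
int heightAfterInserting(const std::vector<int>& keys) {
    std::unique_ptr<Node> root;
    for (int k : keys) insert(root, k);
    return height(root.get());
}
```

Inserting 1 through 7 in sorted order gives a height of 7 (the tree degenerates into a list), while the order 4, 2, 6, 1, 3, 5, 7 gives a height of 3. The keys are identical; only the insertion order differs.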

Example: BST Template Interface
template <typename T>
class BST {
private:
    BinNodeT<T>* Root;
    // additional private members not shown
public:
    BST();                                  // create empty BST
    BST(const T& D);                        // root holds D
    BST(const BST<T>& D);                   // deep copy support
    BST<T>& operator=(const BST<T>& D);     // deep copy support
    bool Insert(const T& D);                // insert element
    bool Delete(const T& D);                // delete element
    T* const Find(const T& D);              // return access to D
    const T* const Find(const T& D) const;  // return access to D
    void Clear();                           // restore to empty state
    ~BST();
};

Spatial Data Structures
What if we want to search for points in 2D or 3D space?
How do we store the data?
Could store separate data structures organizing by x- and y-dimensions (list, BST, etc.)
This is OK for exact-match queries, but doesn't handle range queries well
Could combine
We need a spatial data structure to handle spatial queries.

Spatial Data Structures
Spatial data records include a sense of location as an attribute.
Typically location is represented by coordinate data (in 2D or 3D).
If we are to search spatial data using the locations as key values, we need data structures that efficiently represent selecting among more than two alternatives during a search.

Spatial Data Structures
One approach for 2D data is to employ quadtrees, in which each internal node can have up to 4 children, each representing a different region obtained by decomposing the coordinate space.
There are a variety of such quadtrees, many of which are described in:
– Hanan Samet, "The Quadtree and Related Hierarchical Data Structures," ACM Computing Surveys, June 1984

Spatial Decomposition
In binary search trees, the structure of the tree depends not only upon what data values are inserted, but also in what order they are inserted.
In contrast, the structure of a Point-Region Quadtree is determined entirely by the data values it contains, and is independent of the order of their insertion.
In effect, each node of a PR Quadtree represents a particular region in a 2D coordinate space.

Spatial Decomposition
Internal nodes have exactly 4 children (some may be empty), each representing a different, congruent quadrant of the region represented by their parent node.
Internal nodes do not store data.
Leaf nodes hold a single data value.
Therefore, the coordinate space is partitioned as insertions are performed so that no region contains more than a single point.
PR quadtrees represent points in a finitely-bounded coordinate space.

PR Quadtree Insertion
Insertion proceeds recursively, descending until the appropriate leaf node is found, and then partitioning and descending until there is no more than one point within the region represented by each leaf.
It is possible for a single insertion to add many levels to the relevant subtree, if points lie close enough together.
Of course, it is also possible for an insertion to require no splitting whatsoever.
The shape of the tree is entirely independent of the order in which the data elements are added to it.
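The insertion procedure described above can be sketched as follows. This is an illustrative sketch under several assumptions, not the course's implementation: all names (`Point`, `QNode`, `quadrant`, `descend`, `insert`, `shape`) are hypothetical, regions are squares given by a centre and half-width, a single node type with an `isLeaf` flag stands in for separate leaf and internal types, points on a dividing line go to the north/east side, and the caller is assumed to pass only points that lie inside the root region.

```cpp
#include <array>
#include <memory>
#include <string>

struct Point { double x, y; };

// A subtree is empty (nullptr), a leaf holding exactly one point, or an
// internal node with four quadrant children and no data of its own.
struct QNode {
    bool isLeaf;
    Point pt;                                   // meaningful only when isLeaf
    std::array<std::unique_ptr<QNode>, 4> kid;  // used only when !isLeaf
};

// Quadrant of p relative to centre (cx, cy).
// Assumed convention: 0 = NE, 1 = NW, 2 = SW, 3 = SE.
int quadrant(const Point& p, double cx, double cy) {
    if (p.x >= cx) return (p.y >= cy) ? 0 : 3;
    return (p.y >= cy) ? 1 : 2;
}

bool insert(std::unique_ptr<QNode>& node, const Point& p,
            double cx, double cy, double h);

// Insert p into the child quadrant of internal node n that contains it.
bool descend(QNode& n, const Point& p, double cx, double cy, double h) {
    int q = quadrant(p, cx, cy);
    double half = h / 2;
    double nx = cx + ((q == 0 || q == 3) ? half : -half);  // east or west
    double ny = cy + ((q == 0 || q == 1) ? half : -half);  // north or south
    return insert(n.kid[q], p, nx, ny, half);
}

// Insert p into the subtree covering the square with centre (cx, cy)
// and half-width h. Returns false if an equal point is already stored.
bool insert(std::unique_ptr<QNode>& node, const Point& p,
            double cx, double cy, double h) {
    if (!node) {  // empty region: it becomes a leaf
        node = std::make_unique<QNode>();
        node->isLeaf = true;
        node->pt = p;
        return true;
    }
    if (node->isLeaf) {
        if (node->pt.x == p.x && node->pt.y == p.y) return false;
        // Partition: the leaf becomes internal and its point moves down.
        const Point old = node->pt;
        node->isLeaf = false;
        descend(*node, old, cx, cy, h);
    }
    return descend(*node, p, cx, cy, h);  // may split repeatedly
}

// Text form of the tree's shape: "." empty, "L" leaf, "(....)" internal.
std::string shape(const QNode* n) {
    if (!n) return ".";
    if (n->isLeaf) return "L";
    std::string s = "(";
    for (const auto& k : n->kid) s += shape(k.get());
    return s + ")";
}
```

Inserting the points (2,2), (3,3), and (12,12) into the region with centre (8,8) and half-width 8 produces the same `shape` string in any insertion order. The two nearby points force several levels of splitting from a single insertion. The sketch sets no depth limit, so extremely close points produce very deep subtrees.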

PR Quadtree Deletion
Deletion of elements is more complex (naturally) and may involve collapsing of nodes.
Since deletion is not required for the first project, a discussion of the details will be deferred until a later time.

PR Quadtree Node Implementation
Of course, the PR Quadtree will be implemented as a C++ template.
However, it may be somewhat less generic than the general BST discussed earlier.
During insertion and search, it is necessary to determine whether one point lies NW, NE, SE or SW of another point.
Clearly this cannot be accomplished by using the usual relational operators to compare points.

PR Quadtree Node Implementation
Two possible approaches:
1) have the data type provide accessors for the x- and y-coordinates
2) have the type provide a comparator that returns NW, NE, SE or SW
Either is feasible. It is possible to argue either is better, depending upon the value placed upon various design goals.
It is also possible to deal with the issue in other ways.
In any case, the PR Quadtree implementation will impose fairly strict requirements on any data type that is to be stored in it.
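Both approaches can be sketched side by side. Everything here is hypothetical: the `Direction` enum, the `City` class, the accessor names `getX` and `getY`, and the member `directionFrom` are illustrative choices, as are the assumptions that y increases toward the north and that ties on a dividing line go to the north/east side.

```cpp
enum class Direction { NW, NE, SE, SW };

// Approach 1: the tree computes the direction itself, requiring only
// that the stored type T provide getX() and getY().
template <typename T>
Direction directionOf(const T& p, const T& origin) {
    if (p.getX() >= origin.getX())
        return (p.getY() >= origin.getY()) ? Direction::NE : Direction::SE;
    return (p.getY() >= origin.getY()) ? Direction::NW : Direction::SW;
}

// An example stored type that satisfies both approaches.
class City {
public:
    City(int x, int y) : x_(x), y_(y) {}

    // Accessors needed by approach 1.
    int getX() const { return x_; }
    int getY() const { return y_; }

    // Approach 2: the type itself reports where it lies relative to
    // another object, so the tree never looks at raw coordinates.
    Direction directionFrom(const City& origin) const {
        return directionOf(*this, origin);
    }

private:
    int x_, y_;
};
```

Approach 1 keeps the tie-breaking rule in one place inside the tree, while approach 2 hides the coordinates entirely but trusts every stored type to apply the same rule consistently.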

PR Quadtree Implementation
Here's a possible PR Quadtree interface:
template <typename T>
class prQuadTree {
public:
    prQuadTree(int xMinimum, int xMaximum, int yMinimum, int yMaximum);
    bool Insert(const T& Elem);
    bool Delete(const T& Elem);
    T* const Find(const T& Elem);
    const T* const Find(const T& Elem) const;
    void Display(std::ostream& Out) const;
private:
    prQuadNode<T>* Root;
    int xMin, xMax, yMin, yMax;
    //...
};

PR Quadtree Implementation
Some comments:
– the tree must be created to organize data elements that lie within a particular, bounded region for the partitioning logic to be correct
– the question of how to manage different types for internal and leaf nodes raises some fascinating design and coding issues…
– there will, of course, be a number of private, recursive helper functions
– how to display the tree also raises some fascinating issues…