CS 315 Data Structures Fall 2011 Instructor: B. (Ravi) Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 Course Web site:



Textbook: Data Structures and Algorithm Analysis in C++, 3rd edition, by Mark Allen Weiss
Course Schedule
- Lecture: M W 10:45 – 12, Salazar 2024
- Lab: F 9 to 11:50 AM, Darwin 28

Data Structures – what is the main focus?
- (more) programming: larger problems (than what was studied in CS 215); new data types (images, audio, text)
- (systematic) algorithm design
- performance issues: comparison of algorithms; time, storage requirements
- applications: image processing, compression, web search, board games

Data Structure – what is the main focus? How should data be organized in memory so that we can solve the problem most efficiently? (Figure: hard disk, RAM, CPU.)

Data Structure – what is the main focus? Topics of concern:
- software design, problem solving and applications
- online vs. offline phase of a solution
- preprocessing vs. updating
- trade-offs between operations
- efficiency

Course Goals
- Learn to use fundamental data structures: arrays, linked lists, stacks and queues; hash table; priority queue; binary search tree; etc.
- Improve programming skill: recursion, classes, algorithm design, implementation; build projects using different data structures
- Analytical and experimental analysis: quantitative reasoning about the performance of algorithms (time, storage, etc.); comparing different data structures

Course Goals – Applications:
- image storage, manipulation
- compression (text, image, video, etc.)
- audio processing
- web search
- algorithms for networking and communication: routing protocols, computer interconnections

Data Structures – key to software design
Data structures play a key role in every type of software. A data structure determines how the data is stored internally while solving a problem, in order to:
- Optimize: the overall running time of a program; the response time (for queries); the memory requirements; other resources (e.g. bandwidth of a network)
- Simplify software design: make the solution extensible and more robust

Abstract vs. concrete data structures
- An abstract data structure (sometimes called an ADT, Abstract Data Type) is a collection of data together with a set of operations supported to manipulate the structure.
- Examples:
  stack, queue: insert, delete
  priority queue: insert, deleteMin
  dictionary: insert, search, delete
- Concrete data structures are the implementations of abstract data structures: arrays, linked lists, trees, heaps, hash tables.
- A recurring theme: find the best mapping between abstract and concrete data structures.
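The abstract/concrete distinction can be sketched in C++ as one interface with two implementations. This is only an illustration: the names IntStack, ArrayStack and ListStack are hypothetical and do not come from the course material.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Abstract data type: only the supported operations are specified.
class IntStack {
public:
    virtual ~IntStack() = default;
    virtual void push(int x) = 0;        // insert
    virtual int pop() = 0;               // delete: removes and returns the newest item
    virtual bool empty() const = 0;
};

// Concrete data structure 1: a dynamic array, growing at the back.
class ArrayStack : public IntStack {
    std::vector<int> items_;
public:
    void push(int x) override { items_.push_back(x); }
    int pop() override {
        assert(!items_.empty());
        int x = items_.back();
        items_.pop_back();
        return x;
    }
    bool empty() const override { return items_.empty(); }
};

// Concrete data structure 2: a linked list, growing at the front.
class ListStack : public IntStack {
    std::list<int> items_;
public:
    void push(int x) override { items_.push_front(x); }
    int pop() override {
        assert(!items_.empty());
        int x = items_.front();
        items_.pop_front();
        return x;
    }
    bool empty() const override { return items_.empty(); }
};
```

Code written against IntStack behaves the same with either implementation; only the running time and memory use differ.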

Abstract Data Structure (ADT): a container supporting operations
- Dictionary
  primary operations: search, insert, delete
  secondary operations: deleteMin, range search, successor, merge
- Priority queue
  primary operations: insert, deleteMin
  secondary operations: merge, split, etc.

Linear data structures
Key properties of the (1-dim.) array:
- A sequence of items is stored in consecutive physical memory locations.
- Main advantage: an array provides constant-time access to the k-th element for any k (access the element by Element[k]).
- Other operations are expensive: search, insert, delete.
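A minimal sketch of the cost difference between these array operations, assuming an unsorted std::vector<int>. The helper names kth, linearSearch and insertAt are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Constant-time access: one address computation, regardless of k.
int kth(const std::vector<int>& a, std::size_t k) {
    assert(k < a.size());
    return a[k];
}

// Linear-time search in an unsorted array: may inspect every element.
// Returns the index of key, or a.size() if the key is absent.
std::size_t linearSearch(const std::vector<int>& a, int key) {
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] == key) return i;
    return a.size();
}

// Linear-time insertion at position pos: every later element shifts right.
void insertAt(std::vector<int>& a, std::size_t pos, int value) {
    assert(pos <= a.size());
    a.push_back(value);                        // make room at the end
    for (std::size_t i = a.size() - 1; i > pos; --i)
        a[i] = a[i - 1];                       // shift one slot to the right
    a[pos] = value;
}
```

Deletion is symmetric to insertAt: the later elements shift one slot to the left.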

2-dim. arrays
- Used to store images, tables, etc.
- Given row number r and column number s, the element A[r, s] can be accessed in constant time.
- (Usually row-major or column-major order is used.)
- Other operations are expensive.
- Sparse array representation: used to compress images; trade-offs between storage and time.
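Row-major order can be illustrated with a flat block of memory and an index formula. The Grid class below is a hypothetical sketch, not code from the course.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D array of ints stored row by row in one flat block.
class Grid {
    std::size_t rows_, cols_;
    std::vector<int> cells_;
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    // Row-major mapping: element (r, s) lives at offset r * cols + s.
    std::size_t offset(std::size_t r, std::size_t s) const {
        assert(r < rows_ && s < cols_);
        return r * cols_ + s;
    }

    // Access costs one multiplication and one addition, whatever r and s are.
    int& at(std::size_t r, std::size_t s) { return cells_[offset(r, s)]; }
};
```

Column-major order would use s * rows + r instead; the access cost is the same.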

Linked lists
- A linked list stores a sequence of items in non-consecutive locations of the memory.
- Not easy to search for a key (even if sorted).
- Inserting next to a given item is easy.
- Array vs. linked list: with a linked list, we don't need to know the number of items in advance (dynamic memory allocation).
- Disadvantages: order is important.
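A minimal singly linked list sketch showing why insertion next to a given item is cheap while search is not. Node, insertAfter and findKey are illustrative names, and std::unique_ptr is used here only to keep memory management simple.

```cpp
#include <cassert>
#include <memory>

struct Node {
    int key;
    std::unique_ptr<Node> next;   // owns the rest of the list
    explicit Node(int k) : key(k) {}
};

// Insert a new key right after `pos` in constant time: two pointer updates.
Node* insertAfter(Node* pos, int key) {
    assert(pos != nullptr);
    auto fresh = std::make_unique<Node>(key);
    fresh->next = std::move(pos->next);
    pos->next = std::move(fresh);
    return pos->next.get();
}

// Search must walk the chain node by node, even when the keys are sorted,
// because there is no way to jump to the middle of the list.
const Node* findKey(const Node* head, int key) {
    for (const Node* p = head; p != nullptr; p = p->next.get())
        if (p->key == key) return p;
    return nullptr;
}
```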

Stacks and queues
- Stacks: insert and delete at the same end. Equivalently, the last element inserted will be the first one to be deleted.
- Very useful for solving many problems, e.g. processing arithmetic expressions.
- Queues: insert at one end, delete at the other end. Equivalently, the first element inserted is the first one to be deleted.
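As one example of processing arithmetic expressions with a stack, the sketch below evaluates a postfix expression. It assumes well-formed, space-separated input with non-negative integer operands; evalPostfix is a name chosen for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <stack>
#include <string>

// Evaluate a postfix expression such as "3 4 + 2 *".
// Assumes well-formed input: non-negative integer operands and + - * /.
int evalPostfix(const std::string& expr) {
    std::stack<int> operands;
    std::istringstream in(expr);
    std::string tok;
    while (in >> tok) {
        if (std::isdigit(static_cast<unsigned char>(tok[0]))) {
            operands.push(std::stoi(tok));
        } else {
            assert(operands.size() >= 2);
            // The most recently pushed value is the right-hand operand.
            int right = operands.top(); operands.pop();
            int left = operands.top(); operands.pop();
            switch (tok[0]) {
                case '+': operands.push(left + right); break;
                case '-': operands.push(left - right); break;
                case '*': operands.push(left * right); break;
                case '/': operands.push(left / right); break;
                default: assert(false && "unknown operator");
            }
        }
    }
    assert(operands.size() == 1);
    return operands.top();
}
```

The last-in, first-out order is what pairs each operator with the two most recently computed values.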

Non-linear data structures
- Various versions of trees: binary search trees, height-balanced trees, etc.
- (Figure: a tree node with fields Lptr, key, Rptr; example key 15.)
- Main purpose of a binary search tree: support dictionary operations efficiently.
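A minimal, unbalanced binary search tree sketch supporting two of the dictionary operations. TreeNode, insert and contains are illustrative names; the left and right members play the role of the Lptr and Rptr fields.

```cpp
#include <memory>

struct TreeNode {
    int key;
    std::unique_ptr<TreeNode> left, right;   // Lptr and Rptr
    explicit TreeNode(int k) : key(k) {}
};

// Insert key into the subtree rooted at `root`; duplicates are ignored.
void insert(std::unique_ptr<TreeNode>& root, int key) {
    if (!root) { root = std::make_unique<TreeNode>(key); return; }
    if (key < root->key) insert(root->left, key);
    else if (key > root->key) insert(root->right, key);
}

// Search follows a single root-to-leaf path, so its cost is proportional
// to the height of the tree rather than to the number of keys.
bool contains(const TreeNode* root, int key) {
    while (root) {
        if (key == root->key) return true;
        root = (key < root->key) ? root->left.get() : root->right.get();
    }
    return false;
}
```

Without balancing, the height can grow to the number of keys (for example when keys arrive in sorted order), which is why height-balanced variants exist.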

Priority queue
- The key with the highest priority (here, the smallest key) is the one that gets deleted next. Equivalently, it supports the following operations: insert, deleteMin.
- Useful in solving many problems: fast sorting (heap sort), shortest path, minimum spanning tree, scheduling, etc.
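The sorting use can be sketched with the standard library's std::priority_queue: insert every key, then call deleteMin repeatedly. This is in the spirit of heap sort rather than the in-place algorithm, and sortWithPriorityQueue is a name invented for the example.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Sort by repeated deleteMin on a min-priority queue.
std::vector<int> sortWithPriorityQueue(const std::vector<int>& input) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq;
    for (int x : input) pq.push(x);           // insert
    std::vector<int> sorted;
    sorted.reserve(input.size());
    while (!pq.empty()) {
        sorted.push_back(pq.top());           // smallest remaining key
        pq.pop();                             // deleteMin
    }
    return sorted;
}
```

With a binary heap underneath, each insert and deleteMin takes logarithmic time, so n keys are sorted in O(n log n).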

Hashing
- Supports dictionary operations very efficiently (most of the time).
- Main advantages: simple to design and implement; very fast on average.
- Disadvantage: not good in the worst case.

Applications
- arithmetic expression evaluation
- data compression (Huffman coding, LZW algorithm)
- image segmentation, image compression
- backtrack searching
- finding the best path to route in a network
- geometric problems (e.g. rectangle area)

What data structure to use?
Example 1: There are more than 1 billion web pages. When you type a query into the Google search page, you get an instantaneous response. What kind of data structure is used here? The details are quite complicated, but the main data structure used is quite simple.

Data structure used – inverted index
Array of lists: each array entry contains a word and a pointer to the list of all the web pages that contain that word. This list is kept sorted.
Question: How do we access the array index from the key word? Hashing is used.
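A toy inverted index along these lines, using std::unordered_map for the hashing step. Page ids, the addPage and lookup helpers, and whitespace tokenization are all assumptions made for this sketch; a real search engine is far more elaborate.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// word -> sorted list of ids of the pages that contain it.
using InvertedIndex = std::unordered_map<std::string, std::vector<int>>;

// Pages must be added in increasing id order so that each list stays sorted.
void addPage(InvertedIndex& index, int pageId, const std::string& text) {
    std::istringstream words(text);
    std::string w;
    while (words >> w) {
        std::vector<int>& pages = index[w];
        // Skip the id if this page already recorded the word.
        if (pages.empty() || pages.back() != pageId) pages.push_back(pageId);
    }
}

// Hashing the word locates its list without scanning the other words.
const std::vector<int>& lookup(const InvertedIndex& index, const std::string& word) {
    static const std::vector<int> none;
    auto it = index.find(word);
    return it == index.end() ? none : it->second;
}
```

Keeping each list sorted makes it possible to answer multi-word queries by merging lists, in the same way two sorted arrays are intersected.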

Example 2: The entire landscape of the world is being digitized (there is a whole new branch that combines information technology and geography, called GIS – Geographic Information System). What kind of data structure should be used to store all this information? (Figure: snapshot from Google Earth.)

Some general issues related to GIS
- How much memory do we need? Can this be stored in one computer?
- Building the database is done in the background (off-line processing).
- How fast can the queries be answered? Responding to a query is called on-line processing.
- Suppose each square mile is represented by a 1024 by 1024 pixel image. How much storage do we need to store the map of the United States?

Calculate the memory needed
Very rough estimate of the memory needed:
- Area of USA is 4 x 10^6 sq miles (roughly).
- Each square mile needs 10^6 pixels (roughly).
- Each pixel usually requires 32 bits.
- Thus the total memory needed = 4 x 10^6 x 32 x 10^6 = 128 x 10^12 bits = 128,000 gigabits.
- (A standard desktop has ~200 gigabits of memory.) Need about 640 such computers to store the data.

What data structure to store the images?
- Each 1024 x 1024 image can be stored in a two-dimensional array (the standard way to store all kinds of images: bmp, jpg, png, etc.).
- The actual images are stored in secondary memory (hard disks on several servers, either in a central location or distributed).
- The number of images would be roughly 4 x 10^6, one per square mile. A set of pointers to these images can be stored in a 1 (or 2) dimensional array.
- When you click on a point on the map, its index in the array is calculated. From that index, the image is accessed and sent over a network to the requesting client.
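The index calculation for a clicked point might look like the sketch below. The layout is an assumption made for illustration: a rectangular grid of one-square-mile tiles with the origin at the top-left corner. TileGrid and tileIndex are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical map layout: a grid of one-square-mile tiles, `tilesPerRow`
// tiles wide, with the origin at the top-left corner of the map.
struct TileGrid {
    std::size_t tilesPerRow;
    std::size_t tilesPerColumn;

    // Convert a clicked point, given in miles from the origin, to the
    // position of its tile in a 1-dimensional array of image pointers
    // (row-major order).
    std::size_t tileIndex(double xMiles, double yMiles) const {
        assert(xMiles >= 0 && yMiles >= 0);
        std::size_t col = static_cast<std::size_t>(xMiles);  // truncates
        std::size_t row = static_cast<std::size_t>(yMiles);
        assert(col < tilesPerRow && row < tilesPerColumn);
        return row * tilesPerRow + col;
    }
};
```

The lookup itself is constant time; the expensive part of answering a query is fetching the image from disk and sending it over the network.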

Some projects from past semesters
Generate all the poker hands. More generally, given a set of N items and a number k <= N, generate all possible combinations or permutations of k items. (Concepts: recursion, arrays, lists.)
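One possible recursive sketch for generating combinations, using integers as stand-ins for the items. The function names combine and combinations are chosen for this example and are not taken from the project handout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Recursively extend `current` with items at positions >= start until it
// holds k items; each completed choice is appended to `out`.
void combine(const std::vector<int>& items, std::size_t k, std::size_t start,
             std::vector<int>& current, std::vector<std::vector<int>>& out) {
    if (current.size() == k) { out.push_back(current); return; }
    for (std::size_t i = start; i < items.size(); ++i) {
        current.push_back(items[i]);
        combine(items, k, i + 1, current, out);
        current.pop_back();                    // undo the choice (backtrack)
    }
}

// All k-item combinations of `items`, in the order the items are listed.
std::vector<std::vector<int>> combinations(const std::vector<int>& items,
                                           std::size_t k) {
    assert(k <= items.size());
    std::vector<std::vector<int>> out;
    std::vector<int> current;
    combine(items, k, 0, current, out);
    return out;
}
```

For poker hands, the items would be the 52 cards and k = 5, giving 2,598,960 combinations, so a real program would likely process each hand as it is generated rather than store them all.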

Image manipulation (concepts: arrays, library, analysis of algorithms)

Bounding box construction: OCR is one of the early success stories in software applications. Scan a printed page and recognize the characters in it. First step: bounding box construction.

Final step:
Input: [image of the scanned page]
Output: “In 1830 there were but twenty-three miles of railroad in operation in the United States, and in that year Kentucky took …”

Spelling checker: Given a text file T, identify all the misspelled words in T. Idea: build a hash table H of all the words in a dictionary, and search for each word of the text T in the table H. For each misspelled word, suggest the correct spelling. (hashing, strings, vectors)
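The lookup half of this idea can be sketched with std::unordered_set as the hash table H. Suggesting corrections is not covered here, and the sketch assumes lowercase, whitespace-separated words with no punctuation; misspelled is a name invented for the example.

```cpp
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Return the words of `text` that are absent from `dictionary`.
// Assumes lowercase, whitespace-separated words with no punctuation.
std::vector<std::string> misspelled(
        const std::unordered_set<std::string>& dictionary,
        const std::string& text) {
    std::vector<std::string> result;
    std::istringstream words(text);
    std::string w;
    while (words >> w)
        if (dictionary.find(w) == dictionary.end())  // expected O(1) lookup
            result.push_back(w);
    return result;
}
```

Because each lookup takes constant expected time, checking the whole text costs time roughly proportional to its length, independent of the dictionary size.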

Peg solitaire (backtracking, recursion, hash table) Find a sequence of moves that leaves exactly one peg on the board. (starting position can be specified. In some cases, there may be no solution.)

Geometric computation problem – given a set of rectangles, determine the total area covered by them. Trace the contour, report all intersections etc. Data structure: binary search tree.

Given two photographs of the same scene taken from two different positions, combine them into a single image.

Image compression (quadtree data structure). (Figure: original image; compressed x10; compressed x50.)

Index generation for a document
The index contains the list of all the words appearing in a document, with the line numbers on which they appear. (Figure: a typical index for a book.)
Data structure: binary search tree, hash table.
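A small sketch of index generation using std::map, which is typically implemented as a balanced binary search tree and therefore keeps the words in sorted order. The function name buildIndex and the whitespace tokenization are assumptions made for this example.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// word -> set of line numbers (1-based) on which the word appears.
// std::map keeps the words in sorted order, as a printed index requires.
std::map<std::string, std::set<int>> buildIndex(
        const std::vector<std::string>& lines) {
    std::map<std::string, std::set<int>> index;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        std::istringstream words(lines[i]);
        std::string w;
        while (words >> w) index[w].insert(static_cast<int>(i) + 1);
    }
    return index;
}
```

A hash table would make each lookup faster on average, but the words would then need a separate sorting pass before the index is printed.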