DATA STRUCTURES INTRODUCTION CSC 172 SPRING 2004
COURSE GOALS Write (lots of) great code Understand the use of abstraction in computer science Design efficient data structures and algorithms Understand the use of mathematical tools in analysis for computer science Develop general problem solving skills
What this course is about Abstraction – Specifically, abstraction in computer programming through the use of abstract data types (ADTs) – Abstraction is powerful Complexity of detail is encapsulated and hidden Allowing operations at a higher level – Your ability to abstract directly affects your productivity (as do a lot of things) Analysis – What makes a good abstraction?
Abstraction “The acts of the mind, wherein it exerts its power over simple ideas, are chiefly these three: 1. Combining several simple ideas into one compound one, and thus all complex ideas are made. 2. The second is bringing two ideas, whether simple or complex, together, and setting them by one another so as to take a view of them at once, without uniting them into one, by which it gets all its ideas of relations. 3. The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction, and thus all its general ideas are made” - John Locke, An Essay Concerning Human Understanding (1690)
Abstraction “Leaving out of consideration one or more qualities of a complex object so as to attend to others.” - Merriam-Webster’s 9th Collegiate
Computer Science The Mechanization of Abstraction “Computer Science is a science of abstraction – creating the right model for thinking about a problem and devising the appropriate mechanizable techniques to solve it” - Alfred Aho, 1995
Creativity as Science “Every other science deals with the universe as it is. The physicist’s job, for example, is to understand how the world works, not to invent a world in which physical laws would be simpler or more pleasant to follow. Computer Scientists, on the other hand, must create abstractions of real-world problems that can be understood by computer users and, at the same time, that can be represented and manipulated inside a computer” - Jeffrey Ullman, 1995
Example: Dictionary ADT A dictionary is an abstract model of a database In dictionaries, we look up definitions using words – Words : “keys” – Definitions : “elements” The main operation supported by a dictionary is searching by key
Dictionary ADT Simple container methods – size() – isEmpty() Query methods – findElement(k) Update methods – insertItem(k,e) – remove(k) Special object – NO_SUCH_KEY, returned by an unsuccessful search
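A minimal Java sketch of how this ADT could be written down, using the method names from the slide. The interface name Dictionary, the class name ListDictionary, the Object-typed signatures, and the unsorted-list representation are all assumptions made for illustration; the slides do not prescribe an implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; method names follow the slide, types are assumed.
interface Dictionary {
    // Sentinel returned by an unsuccessful search (implicitly public static final).
    Object NO_SUCH_KEY = new Object();

    int size();
    boolean isEmpty();
    Object findElement(Object k);
    void insertItem(Object k, Object e);
    Object remove(Object k);
}

// One possible implementation: two parallel unsorted lists.
// Insertion is cheap, but findElement and remove scan the list (linear time).
class ListDictionary implements Dictionary {
    private final List<Object> keys = new ArrayList<>();
    private final List<Object> elements = new ArrayList<>();

    public int size() { return keys.size(); }

    public boolean isEmpty() { return keys.isEmpty(); }

    public Object findElement(Object k) {
        int i = keys.indexOf(k);              // uses equals() to compare keys
        return i < 0 ? NO_SUCH_KEY : elements.get(i);
    }

    public void insertItem(Object k, Object e) {
        keys.add(k);                          // duplicates allowed; first match wins on lookup
        elements.add(e);
    }

    public Object remove(Object k) {
        int i = keys.indexOf(k);
        if (i < 0) return NO_SUCH_KEY;
        keys.remove(i);                       // remove by index
        return elements.remove(i);
    }
}
```

A caller only needs the interface: code written against Dictionary keeps working if ListDictionary is later swapped for a faster structure.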
Example: Dictionary ADT As “users”, we can think about dictionaries in terms of such functionality without worrying about how they are implemented – Abstraction, problem conceptualization As “programmers” we can think about how dictionaries are implemented without worrying about how they are used – Analysis, what is fast & efficient
Which Abstractions/Implementations? Lists Stacks Queues Trees Sets Graphs
What analysis? “Anyone” can write a computer program. What is the difference between a good solution and a not-so-good solution? – Running time – Memory requirements Mathematical tools – Proof by induction – Combinatorics – Probability – Asymptotic analysis (Big-Oh)
Example: One-dimensional pattern recognition Input: a vector x of n floating-point numbers Output: the maximum sum found in any contiguous subvector of the input. For the sample vector on the slide (figure not reproduced), the answer is the sum of x[2..6], or 187. How would you solve this?
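Obvious solution Check all pairs
    double sum;                  // x holds floating-point numbers
    double maxsofar = 0;         // the empty subvector has sum 0
    for (int i = 0; i < x.length; i++)
        for (int j = i; j < x.length; j++) {
            sum = 0;
            for (int k = i; k <= j; k++)   // add up x[i..j]
                sum += x[k];
            maxsofar = Math.max(sum, maxsofar);
        }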
How long does the obvious solution take? We could “measure” it – benchmarking – What is a “good size” of input to measure? – What machine do we measure it on?
How long does the obvious solution take? We could “analyse” it – Multiply the “cost” (time required) to do something by the number of times you have to do it. – If n is the length of the array – Outer loop runs exactly n times – Inner loop runs at most n times – Innermost loop runs no more than n times – Let’s say the “+=” and “max” take unit time
How long does the obvious solution take?
Innermost cost = n * 1 // worst case
Inner loop cost = n * (Innermost cost) + 1
Outer loop cost = n * (Inner loop cost)
Outer loop cost = n * (n * (Innermost cost) + 1)
Outer loop cost = n * (n * n + 1)
Outer loop cost = n * (n^2 + 1)
Outer loop cost = n^3 + n
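The worst-case estimate above can be checked against an actual count. The sketch below (class and method names are hypothetical) counts how many times the innermost statement of the all-pairs solution runs for an input of length n. The exact count is n(n+1)(n+2)/6, roughly n^3/6, which is below the worst-case bound but still cubic.

```java
class CubicCount {
    // Counts executions of the innermost statement (sum += x[k])
    // in the all-pairs solution for an input of length n.
    static long innermostSteps(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++)
            for (int j = i; j < n; j++)
                for (int k = i; k <= j; k++)
                    steps++;
        return steps;
    }

    public static void main(String[] args) {
        // Doubling n multiplies the count by close to 8, the signature of cubic growth.
        for (int n = 10; n <= 160; n *= 2) {
            System.out.println(n + " -> " + innermostSteps(n));
        }
    }
}
```

For n = 10 the count is 220, and for n = 20 it is 1540.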
How long does the obvious solution take? We call this an “n^3” solution Can you think of a better (faster) way? Can you do an analysis that will prove it better? That is what we do in CSC 172 – For some very common tasks
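One well-known faster approach, not given in these slides, is a single left-to-right scan that tracks the best sum of a subvector ending at the current position. The sketch below is one way to write it; the class and method names are hypothetical, and the sample vector in main is an assumption (the classic example from Bentley's Programming Pearls), chosen because its answer, x[2..6] = 187, agrees with the example stated earlier.

```java
class MaxSubvector {
    // Linear scan: the loop body runs once per element, so the cost is
    // proportional to n rather than n^3.
    static double maxSum(double[] x) {
        double maxSoFar = 0;        // the empty subvector has sum 0
        double maxEndingHere = 0;   // best sum of a subvector ending at the current index
        for (double v : x) {
            // Either extend the previous subvector or start over with an empty one.
            maxEndingHere = Math.max(maxEndingHere + v, 0);
            maxSoFar = Math.max(maxSoFar, maxEndingHere);
        }
        return maxSoFar;
    }

    public static void main(String[] args) {
        // Assumed sample input, not taken from the slides.
        double[] x = {31, -41, 59, 26, -53, 58, 97, -93, -23, 84};
        System.out.println(maxSum(x)); // prints 187.0
    }
}
```

Proving that this scan always returns the same answer as the all-pairs version is a natural exercise in induction on the length of the prefix scanned so far.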
Instructor Prof. Ted Pawlicki, CSB 722, ext Office Hours: TR 2PM-3PM -lunch meetings also available Lecture: T,R 2:00PM-3:15PM AM Dewey 1-101
Text Data Structures & Problem Solving using Java 2nd Ed. By Mark Allen Weiss – Class, Lab, & Workshops
Grad TAs (Mostly Projects)
Lead UG TA (Mostly grades)
Labs: Taylor 30 MW 4:50-6:05 MW 6:15-7:30 * MW 2:00-3:15 TR 3:25-4:40 * Register as a class
WORKSHOPS Sign up Tuesday
EXAMS
Workshops Required - attendance 10% of Grade Positive effect on grades – Independent of extra credit Group skills are valued by employers – Industrial & academic
Extra Credit Reports 10% boost for page analysis of the UG CS curriculum at a selected institution – Instructor assigns institution – Overview of curricular requirements – Syllabus of 1st-year courses (CS1 & CS2) – Compare & contrast to UR CS curriculum – Revision expected
Is this a hard course? Yes One course has to be the hardest All majors have hard courses “No pain, no gain” Start early
Q&A Do you understand what is expected of you?