Java AP Computer Science AB: Tradeoffs, Intuition,

Presentation transcript:

Java AP Computer Science AB: Tradeoffs, Intuition, Object-Oriented Stuff
Owen Astrachan, Department of Computer Science, Duke University
Workshop sponsored by NCSSM/Project NOW and IBM Eclipse

Why Java and C++ aren't so different James Gosling Bjarne Stroustrup

Efficient Programming Designing and building efficient programs efficiently requires knowledge and practice Hopefully the programming language helps, it’s not intended to get in the way Object-oriented concepts, and more general programming concepts help in developing programs Knowledge of data structures and algorithms helps Tools of the engineer/scientist/programmer/designer A library or toolkit is essential, API or wheel re-invention? Programming: art, science, engineering? None or All? Mathematics is a tool Design Patterns are a tool

David Parnas "For much of my life, I have been a software voyeur, peeking furtively at other people's dirty code. Occasionally, I find a real jewel, a well-structured program written in a consistent style, free of kludges, developed so that each component is simple and organized, and designed so that the product is easy to change. " "We must not forget that the wheel is reinvented so often because it is a very good idea; I've learned to worry more about the soundness of ideas that were invented only once. "

Questions If you gotta ask, you’ll never know Louis Armstrong: “What’s Jazz?” If you gotta ask, you ain’t got it Fats Waller: “What’s rhythm?” What questions did you ask today? Arno Penzias

Tradeoffs The AB course is about all kinds of tradeoffs: programming, structural, algorithmic Programming: simple, elegant, quick to run/to program Tension between simplicity and elegance? Structural: how to structure data for efficiency What issues in efficiency? Time, space, programmer-time Algorithmic: similar to structural issues How do we decide which choice to make, what tradeoffs are important?

See WordReader.java
This reads words; how can we count unique words?
    WordReader reader = new WordReader();
    ArrayList list = new ArrayList();
    while (reader.hasNext()) {
        list.add(reader.next());
    }
    System.out.println("# words = " + list.size());
What do we know about WordReader? Interfaces implemented? How does it work?

Tracking different/unique words We want to know how many times ‘the’ occurs Do search engines do this? Does the number of occurrences of “basketball” on a page raise the priority of a webpage in some search engines? Downside of this approach for search engines? Constraints on solving this problem We must read every word in the file (or web page) We must search to see if the word has been read before We must process the word (bump a count, store the word) Are there fundamental limits on any of these operations? Where should we look for data structure and algorithmic improvements?

Search: measuring performance
How fast is fast enough?
    // post: returns true if and only if key is found in list
    boolean search(String[] list, String key) {
        for (int k = 0; k < list.length; k++) {
            if (list[k].equals(key)) return true;
        }
        return false;
    }
Java details: parameters? equals? array code?
How do we measure performance of code? Of algorithm?
Does processor make a difference? PIII, PIV, G4, G5, ??

Tradeoffs in reading and counting Read words, then sort, determine # unique words? frog, frog, frog, rat, tiger, tiger, tiger, tiger If we look up words as we're reading them and bump a counter if we find the word, is this slower than previous idea? How do we look up word, how do we add word Are there kinds of data that make one approach preferable? What is best case, worst case, average case? How do we reason about tradeoffs, what tools needed?
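The two approaches on this slide can be sketched side by side. This is a minimal illustration, not the workshop's code: the class name UniqueWords and both method names are assumptions, the words come from an array rather than a WordReader, and generics are used although the slides predate them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class UniqueWords {
    // Approach 1: sort a copy, then scan looking for changes between neighbors.
    // The sort dominates: O(N log N) comparisons, followed by an O(N) scan.
    static int countBySorting(String[] words) {
        String[] sorted = words.clone();
        Arrays.sort(sorted);
        int unique = 0;
        for (int k = 0; k < sorted.length; k++) {
            if (k == 0 || !sorted[k].equals(sorted[k - 1])) {
                unique++;
            }
        }
        return unique;
    }

    // Approach 2: look up each word as it is read and bump its counter.
    // The cost depends on how lookup is done; a HashMap is one choice.
    static Map<String, Integer> countByLookup(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            Integer old = counts.get(w);
            counts.put(w, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"frog", "frog", "frog", "rat",
                          "tiger", "tiger", "tiger", "tiger"};
        System.out.println("unique (sort, scan): " + countBySorting(words));
        System.out.println("counts (lookup): " + countByLookup(words));
    }
}
```

Replacing the HashMap with an unsorted list and a sequential search would make each lookup O(N), which is the kind of tradeoff the questions above are probing.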

What is big-Oh about?
Intuition: avoid details when they don't matter, and they don't matter when input size (N) is big enough
For polynomials, use only leading term, ignore coefficients
    y = 3x      y = 6x - 2          y = 15x + 44
    y = x^2     y = x^2 - 6x + 9    y = 3x^2 + 4x
The first family is O(n), the second is O(n^2)
Intuition: family of curves, generally the same shape
More formally: O(f(n)) is an upper bound; when n is large enough, the expression c f(n) is larger
Intuition: linear function: double input, double time; quadratic function: double input, quadruple the time

More on O-notation, big-Oh
Big-Oh hides/obscures some empirical analysis, but is good for general description of algorithm
Allows us to compare algorithms in the limit
20N hours vs N^2 microseconds: which is better?
O-notation is an upper bound; this means that N is O(N), but it is also O(N^2); we try to provide tight bounds.
Formally: a function g(N) is O(f(N)) if there exist constants c and n such that g(N) < c f(N) for all N > n
[Figure: the curve c f(N) lies above g(N) to the right of x = n]
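Two short worked instances of the statements above. The constants chosen (c = 4, n = 4) are just one valid pair, and the unit conversion assumes 3600 seconds per hour and 10^6 microseconds per second.

```latex
% 3N^2 + 4N is O(N^2): take f(N) = N^2, c = 4, n = 4.
\[
3N^2 + 4N < 4N^2 \iff 4N < N^2 \iff N > 4 .
\]

% 20N hours versus N^2 microseconds, both expressed in microseconds.
\[
20N \text{ hours} = 20N \cdot 3600 \cdot 10^{6}\ \mu\text{s} = 7.2 \times 10^{10}\, N\ \mu\text{s},
\qquad
N^2 < 7.2 \times 10^{10}\, N \iff N < 7.2 \times 10^{10} .
\]
```

So the quadratic algorithm is faster for every input smaller than about 72 billion elements, even though the linear one wins in the limit. This is the sense in which big-Oh hides empirical detail.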

Big-Oh calculations from code
Search for element in ArrayList: what is complexity of code (using O-notation)?
What if array doubles, what happens to time?
    for (int k = 0; k < a.size(); k++) {
        if (a.get(k).equals(target)) return true;
    }
    return false;
Complexity if we call N times on M-element vector?
What about best case? Average case? Worst case?

Big-Oh calculations again
Consider coding problem 2: first element to occur 3 times
What is complexity of code (using O-notation)?
    for (int k = 0; k < a.length; k++) {
        int count = 1;
        for (int j = 0; j < k; j++) {
            if (a[j].equals(a[k])) count++;
        }
        if (count >= 3) return a[k];
    }
    return ""; // no one on probation
What if we initialize counter to 0, loop to <= k?
What is invariant describing value stored in count?
What happens to time if array doubles in size?

Big-Oh calculations again
Add a new element at front of ArrayList, complexity?
Only count element assignments, not list growth
    a.add(0, newElement);          // one statement!
    // the same operation written out by hand:
    a.add(newElement);             // make room for it
    for (int k = a.size() - 2; k >= 0; k--) {
        a.set(k + 1, a.get(k));    // shift right
    }
    a.set(0, newElement);
If we call the code above N times on an initially empty vector, what's the complexity using big-Oh?
What about growing the ArrayList? How does this work?
If vector doubles in size? If vector increases by one?
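One way to explore the N-insertions question is to count the assignments directly. The sketch below wraps the hand-written shift from the slide in a counter; the class and method names are assumptions made for illustration, and growth of the underlying array is deliberately not counted, as the slide specifies.

```java
import java.util.ArrayList;

class FrontInsertCount {
    // Inserts value at index 0 by shifting, and returns the number of
    // element assignments (calls to set) performed. List growth is ignored.
    static long addFront(ArrayList<Integer> a, int value) {
        long assignments = 0;
        a.add(value);                              // make room for it
        for (int k = a.size() - 2; k >= 0; k--) {
            a.set(k + 1, a.get(k));                // shift right
            assignments++;
        }
        a.set(0, value);
        assignments++;
        return assignments;
    }

    // Total assignments for n front insertions into an initially empty list.
    static long totalAssignments(int n) {
        ArrayList<Integer> a = new ArrayList<>();
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += addFront(a, i);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 8000; n *= 2) {
            System.out.println(n + " insertions: "
                + totalAssignments(n) + " assignments");
        }
    }
}
```

An insertion into a list that already holds s elements costs s + 1 assignments, so the total is the arithmetic series 1 + 2 + … + N, and doubling N roughly quadruples the count.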

Some helpful mathematics
1 + 2 + 3 + 4 + … + N = N(N+1)/2, exactly = N^2/2 + N/2, which is O(N^2). Why?
N + N + N + … + N (total of N times) = N*N = N^2, which is O(N^2)
N + N + N + … + N + … + N + … + N (total of 3N times) = 3N*N = 3N^2, which is O(N^2)
1 + 2 + 4 + … + 2^N = 2^{N+1} - 1 = 2 x 2^N - 1, which is O(2^N)
Impact of last statement on adding 2^N + 1 elements to a vector:
1 + 2 + … + 2^N + 2^{N+1} = 2^{N+2} - 1 = 4 x 2^N - 1, which is O(2^N)
log(1) + log(2) + log(3) + … + log(N) = log(N!), roughly N log(N)
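The geometric sum is what makes doubling cheap. The simulation below counts how many elements are copied while appending to an array-backed vector under two growth policies. It is a sketch under stated assumptions: the starting capacity of 1, the class name, and both method names are illustrative, and the growth policy of the real java.util.ArrayList is an implementation detail that this code does not model.

```java
class GrowthCost {
    // Counts element copies while appending n elements to a vector that
    // starts with capacity 1 and doubles its capacity whenever it is full.
    static long copiesWhenDoubling(int n) {
        long copies = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {    // full: copy everything to a new array
                copies += size;
                capacity *= 2;
            }
            size++;
        }
        return copies;
    }

    // Same count when the capacity grows by one slot each time it is full.
    static long copiesWhenGrowingByOne(int n) {
        long copies = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;
                capacity += 1;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n = 1025; n <= 16385; n = (n - 1) * 2 + 1) {
            System.out.println(n + " appends: doubling copies "
                + copiesWhenDoubling(n) + ", +1 growth copies "
                + copiesWhenGrowingByOne(n));
        }
    }
}
```

With doubling, the copies form the series 1 + 2 + 4 + …, so the total stays within a constant factor of the number of elements. Growing by one slot copies 1 + 2 + … + (n-1) elements, which is quadratic.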

Running times @ 10^6 instructions/sec (in seconds unless noted)
N                O(log N)    O(N)        O(N log N)    O(N^2)
10               0.000003    0.00001     0.000033      0.0001
100              0.000007    0.00010     0.000664      0.0100
1,000            0.000010    0.00100     0.010000      1.0
10,000           0.000013    0.01000     0.132900      1.7 min
100,000          0.000017    0.10000     1.661000      2.78 hr
1,000,000        0.000020    1.0         19.9          11.6 day
1,000,000,000    0.000030    16.7 min    8.3 hr        318 centuries

Sequential search revisited
What is postcondition of code below? How would it be called initially?
Another overloaded function search with 2 parameters?
    boolean search(String[] v, int index, String target) {
        if (index >= v.length) return false;
        else if (v[index].equals(target)) return true;
        else return search(v, index + 1, target);
    }
What is complexity (big-Oh) of this function?
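One possible answer to the overloading question, as a sketch: a two-parameter search that hides the starting index from callers. The three-parameter method is the slide's code made static so it can run standalone; the wrapper and the class name are assumptions.

```java
class SequentialSearch {
    // From the slide: returns true if and only if target occurs
    // somewhere in v[index], v[index+1], ..., v[v.length-1].
    static boolean search(String[] v, int index, String target) {
        if (index >= v.length) return false;
        else if (v[index].equals(target)) return true;
        else return search(v, index + 1, target);
    }

    // Hypothetical overload: clients call this and never see the index.
    static boolean search(String[] v, String target) {
        return search(v, 0, target);
    }

    public static void main(String[] args) {
        String[] words = {"frog", "rat", "tiger"};
        System.out.println(search(words, "rat"));
        System.out.println(search(words, "zebra"));
    }
}
```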

Recurrences
What is complexity? Justification?
    boolean search(String[] v, int index, String target) {
        if (index >= v.length) return false;
        else if (v[index].equals(target)) return true;
        else return search(v, index + 1, target);
    }
T(n) = time to search an n-element array
    T(n) = T(n-1) + 1
    T(0) = 1
Instead of 1, use O(1) for constant time independent of n, the measure of problem size

Solving recurrence relations
Plug, simplify, reduce, guess, verify?
    T(n) = T(n-1) + 1,  T(0) = 1
    T(n-1) = T(n-1-1) + 1, so T(n) = [T(n-2) + 1] + 1 = T(n-2) + 2
    T(n-2) = T(n-2-1) + 1, so T(n) = [(T(n-3) + 1) + 1] + 1 = T(n-3) + 3
    T(n) = T(n-k) + k    find the pattern!
Now, let k = n, then T(n) = T(0) + n = 1 + n
Get to base case, solve the recurrence: O(n)

Consider merge sort for linked lists Given a linked list, we want to sort it Divide the list into two equal halves Sort the halves Merge the sorted halves together What’s complexity of dividing an n-node list in half? How do we do this? What’s complexity of merging (zipping) two sorted lists? T(n) = time to sort n-node list = 2 T(n/2) + O(n) why?
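The three steps (divide, sort the halves, merge) can be sketched as follows. The Node class is a hypothetical stand-in for the AP ListNode class and stores ints to keep the example short; fromArray and render are helpers added only for demonstration.

```java
class ListMergeSort {
    // Minimal singly linked node; a stand-in for the AP ListNode class.
    static class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    // Sorts the list and returns its new head: T(n) = 2T(n/2) + O(n).
    static Node sort(Node head) {
        if (head == null || head.next == null) return head;
        // Divide: an O(n) walk; fast moves two nodes for each one slow moves.
        Node slow = head;
        Node fast = head.next;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
        }
        Node second = slow.next;
        slow.next = null;              // cut the list into two halves
        return merge(sort(head), sort(second));
    }

    // Merges (zips) two sorted lists in O(n) by relinking existing nodes.
    static Node merge(Node a, Node b) {
        Node dummy = new Node(0, null);
        Node tail = dummy;
        while (a != null && b != null) {
            if (a.value <= b.value) { tail.next = a; a = a.next; }
            else                    { tail.next = b; b = b.next; }
            tail = tail.next;
        }
        tail.next = (a != null) ? a : b;
        return dummy.next;
    }

    // Demonstration helper: builds a list holding values in array order.
    static Node fromArray(int[] values) {
        Node head = null;
        for (int k = values.length - 1; k >= 0; k--) {
            head = new Node(values[k], head);
        }
        return head;
    }

    // Demonstration helper: space-separated values of the list.
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node list = fromArray(new int[] {5, 2, 9, 1, 5, 6});
        System.out.println(render(sort(list)));
    }
}
```

Both the divide step and the merge step touch each node a constant number of times, which is where the O(n) term in the recurrence comes from; the two recursive calls supply the 2T(n/2).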

Sidebar: solving recurrence
    T(n) = 2T(n/2) + n,  T(1) = 1
    (pattern: T(XXX) = 2T(XXX/2) + XXX; note: n/2/2 = n/4)
    T(n) = 2[2T(n/4) + n/2] + n
         = 4T(n/4) + n + n
         = 4[2T(n/8) + n/4] + 2n
         = 8T(n/8) + 3n
         = ... eureka!
         = 2^k T(n/2^k) + kn
Let 2^k = n, so k = log n; this yields 2^{log n} T(n/2^{log n}) + n(log n) = n T(1) + n(log n), which is O(n log n)
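The "guess, verify" step can also be done numerically. This small check evaluates the recurrence directly and compares it with the closed form n T(1) + n log n for powers of two. The class and method names are assumptions, and log is taken base 2.

```java
class RecurrenceCheck {
    // Evaluates T(n) = 2T(n/2) + n with T(1) = 1, for n a power of two.
    static long t(long n) {
        return (n == 1) ? 1 : 2 * t(n / 2) + n;
    }

    // Closed form from the derivation: n*T(1) + n*log2(n), n a power of two.
    static long closedForm(long n) {
        int log2 = 63 - Long.numberOfLeadingZeros(n);
        return n + n * log2;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println("n=" + n + "  T(n)=" + t(n)
                + "  closed form=" + closedForm(n));
        }
    }
}
```

Agreement on a range of inputs is evidence for the guess, not a proof; the proof is the substitution argument above or an induction on k.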

Complexity Practice
What is complexity of Build? (what does it do?)
    ListNode Build(int n) {
        if (0 == n) return null;
        ListNode first = new ListNode(n, Build(n-1));
        for (int k = 0; k < n-1; k++) {
            first = new ListNode(n, first);
        }
        return first;
    }
Write an expression for T(n) and for T(0), solve.

Recognizing Recurrences
Solve once, re-use in new contexts
T must be explicitly identified
n must be some measure of size of input/parameter
T(n) is the time for quicksort to run on an n-element vector
    T(n) = T(n/2) + O(1)     binary search        O(log n)
    T(n) = T(n-1) + O(1)     sequential search    O(n)
    T(n) = 2T(n/2) + O(1)    tree traversal       O(n)
    T(n) = 2T(n/2) + O(n)    quicksort            O(n log n)
    T(n) = T(n-1) + O(n)     selection sort       O(n^2)
Remember the algorithm, re-derive complexity

Nancy Leveson: Software Safety Founded the field Mathematical and engineering aspects Air traffic control Microsoft word "C++ is not state-of-the-art, it's only state-of-the-practice, which in recent years has been going backwards" Software and steam engines: once extremely dangerous? http://sunnyday.mit.edu/steam.pdf THERAC 25: Radiation machine that killed many people http://sunnyday.mit.edu/papers/therac.pdf

Java Object Basics Class is an object factory Class can have (static) methods Defines interface programmers depend on Part of inheritance hierarchy (every class is-an Object) May define getter and setter methods for accessing state Usually defines methods for behavior An object is an instance of a class Attributes aka state, usually private, sometimes accessible Behavior invoked by calling methods Allocated by calling new

java.util.Collection
Interface for collections of objects
See API for details; AP subset requires only a few of the methods for subinterfaces List and Set, and for separate interface Map
Which interface and class should be used? Tradeoffs.
Why choose ArrayList vs LinkedList? Why choose HashSet vs TreeSet?
What about classes, interfaces, and methods not in the subset?
Collection interface compared to Collections class
    collection.addAll(otherCollection);    // object (instance) method of Collection
    Collections.sort(list);                // static method of Collections; takes a List

Collection hierarchy, tradeoffs?
[Diagram: interfaces List and Set; classes ArrayList and LinkedList implement List, HashSet and TreeSet implement Set]
Methods shown: add, size, iterator, listIterator, contains*, remove**, get(int), set(int,o), add(int,o), remove(int), getFirst, getLast, addFirst, addLast, removeFirst, removeLast
*Collection method  **Collection/optional

Arrays and the AP subset
One- and two-dimensional arrays in subset
Two-dimensional arrays are AB only topic
    int[][] grid = new int[6][10];
    int rows = grid.length;
    int cols = grid[0].length;
Initialization in subset, e.g., int[] list = {1,2,3,4,5};
No java.util.Arrays methods in subset: sort, binarySearch, fill, asList, …

ArrayList and the AP subset
Inheritance hierarchy (List in java.util) is AB only
Iterator and ListIterator are AB only
Downcast from Object to expected type is in subset
    list.add(new String("hello"));
    String s = (String) list.get(0);
Required methods: size(), get(int), set(int, Object), add(Object), add(int, Object), remove(int)
NOT required, but very useful: remove(Object), addAll(Collection), clear()

What is an Iterator?
What problems do Iterators address?
Access elements independently of implementation
Client programs written in terms of generic component
    public void print(Collection c) {
        Iterator it = c.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
How do you add all elements of Set to a List?
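Two possible answers to the closing question, sketched below: copy with an explicit Iterator, or let addAll do the same traversal. The class and method names are assumptions, and the code uses generics, which are newer than the raw-type style on the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetToList {
    // Copies with an explicit Iterator; works for any Collection,
    // regardless of how the collection stores its elements.
    static List<String> copyWithIterator(Collection<String> source) {
        List<String> result = new ArrayList<>();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }

    // Same result using the Collection method addAll.
    static List<String> copyWithAddAll(Collection<String> source) {
        List<String> result = new ArrayList<>();
        result.addAll(source);
        return result;
    }

    public static void main(String[] args) {
        Set<String> words =
            new TreeSet<>(Arrays.asList("tiger", "frog", "rat", "frog"));
        System.out.println(copyWithIterator(words));
        System.out.println(copyWithAddAll(words));
    }
}
```

Because both methods take a Collection, the same code works unchanged for a HashSet, a TreeSet, or another List. With a TreeSet the elements arrive in sorted order; a HashSet makes no ordering promise.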

What is an interface? Indication to programmers and code that a class implements some specified functions, most likely with requirements on behavior Iterator is an interface: what do we expect from classes that implement the Iterator interface? Comparable: are some objects incomparable? Why? Why isn’t Equatable an interface? Where is .equals() ? A class can implement multiple interfaces Comparable, Cloneable, Tickable, … Think twice before developing inheritance hierarchy Single inheritance, problems with protected/private data

What is Comparable?
    String a = "hello";
    String b = "zebra";
    int x = a.compareTo(b);    // what values assigned?
    int y = b.compareTo(a);
    int z = a.compareTo(a);
Contract: compareTo() should be consistent with equals()
What's the simple way to write equals for Comparable?
See also java.util.Comparator
Not in subset, useful for sorting on other criteria
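A sketch of both ideas: a class whose equals is written in terms of compareTo, which keeps the two consistent, and a Comparator that sorts the same objects on another criterion. The Word class, its fields, and the comparator are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ComparableDemo {
    // Hypothetical class: a word and how often it occurred.
    // Natural ordering is alphabetical by text.
    static class Word implements Comparable<Word> {
        final String text;
        final int count;
        Word(String text, int count) { this.text = text; this.count = count; }

        public int compareTo(Word other) {
            return text.compareTo(other.text);
        }

        // The simple way: equal exactly when compareTo returns 0.
        @Override public boolean equals(Object o) {
            return o instanceof Word && compareTo((Word) o) == 0;
        }

        @Override public int hashCode() { return text.hashCode(); }

        @Override public String toString() { return text + ":" + count; }
    }

    // A separate ordering: most frequent word first.
    static class ByCountDescending implements Comparator<Word> {
        public int compare(Word a, Word b) {
            return Integer.compare(b.count, a.count);
        }
    }

    static List<Word> sortedNaturally(List<Word> words) {
        List<Word> copy = new ArrayList<>(words);
        Collections.sort(copy);                       // uses compareTo
        return copy;
    }

    static List<Word> sortedByCount(List<Word> words) {
        List<Word> copy = new ArrayList<>(words);
        Collections.sort(copy, new ByCountDescending());
        return copy;
    }

    public static void main(String[] args) {
        List<Word> words = new ArrayList<>();
        words.add(new Word("tiger", 4));
        words.add(new Word("frog", 3));
        words.add(new Word("rat", 1));
        System.out.println(sortedNaturally(words));
        System.out.println(sortedByCount(words));
    }
}
```

For the strings on the slide, x is negative, y is positive, and z is zero; only the sign is specified by the Comparable contract.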

TreeNode and Comparable
APCS AB TreeNode class stores Objects
Why not Comparable objects rather than generic ones?
When building search trees, what do we do?
    public TreeNode add(TreeNode root, Object obj) {
        if (root == null) return new TreeNode(obj, null, null);
        Comparable rootValue = (Comparable) root.getValue();
        if (rootValue.compareTo(obj) < 0) root.setRight(add(root.getRight(), obj));
        else root.setLeft(add(root.getLeft(), obj));
        return root;
    }

Collections and Comparable
The TreeSet class only stores Comparable objects
Function to add requires Object, why not Comparable?
    public boolean add(Object o) {
        Comparable comp = (Comparable) o;
        // more code here…
The AP PriorityQueue class has Object parameters
Priority requires a Comparable object
    public Object peekMin()
    // value returned will implement Comparable

Who is Alan Perlis? It is easier to write an incorrect program than to understand a correct one Simplicity does not precede complexity, but follows it If you have a procedure with ten parameters you probably missed some If a listener nods his head when you're explaining your program, wake him up Programming is an unnatural act Won first Turing award http://www.cs.yale.edu/homes/perlis-alan/quotes.html