CompSci 101 Introduction to Computer Science


CompSci 101 Introduction to Computer Science March 30, 2017 Prof. Rodger compsci 101 spring 2017

Announcements
Assign 6 due; Assign 7 out, due April 6
APT 8 due Tuesday
APT Quiz 2 Sunday-Tuesday, available 6pm or earlier on Sunday
Exam 2 Tuesday, April 11
Today:
  Why are dictionaries so fast?
  More problem solving with dictionaries
  Finish problem from last time
No practice quiz; do APT 8 for practice and try to complete it before doing the quiz.

Be in the know: ACM, compsci mailing lists
Association for Computing Machinery (ACM)
  Professional organization for computer science
  Duke Student ACM Chapter: join for free
Join Duke email lists to find out info on jobs and events for compsci students
  lists.duke.edu, join lists:
    compsci: info from compsci dept
    dukeacm: info from student chapter

From Last Time - Dictionary
Consider the Python dictionary below, which maps schools to the number of students in the ACM Club at their school:
    d = {'duke':30, 'unc':50, 'ncsu':40, 'wfu':50, 'ecu': 80,
         'meridith':30, 'clemson':80, 'gatech':50, 'uva':120, 'vtech':110}
Use a dictionary to answer: which schools have X students?
… which schools have groups of students 1-49, 50-99, etc.?

Dictionary of schools to number of students
[Figure: each school (key) points to its number of students (value).]
Note: we do not know the ordering of either the keys or the values.

Dictionary of number of students to schools
[Figure: the inverted mapping; each number of students (key) points to the schools with that number.]

Inverted Dictionary bit.ly/101s17-0330-1
Start with dictionary of keys to values
  Schools to number of students
Use it to build an inverted dictionary of values to keys (actually list of keys)
  Number of students to list of schools
Let's look at the code
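The inversion described on this slide can be sketched as follows. This is a minimal sketch, not the code behind the bit.ly link; the function name invert is an assumption made for illustration, and the data is the dictionary d from the earlier slide.

```python
# Data copied from the "From Last time" slide.
d = {'duke': 30, 'unc': 50, 'ncsu': 40, 'wfu': 50, 'ecu': 80,
     'meridith': 30, 'clemson': 80, 'gatech': 50, 'uva': 120, 'vtech': 110}

def invert(school_to_count):
    """Return a dictionary mapping each value to the list of keys that had it."""
    inverted = {}
    for school, count in school_to_count.items():
        if count not in inverted:
            inverted[count] = []          # first school seen with this count
        inverted[count].append(school)    # several schools can share a count
    return inverted

count_to_schools = invert(d)
print(sorted(count_to_schools[50]))       # ['gatech', 'unc', 'wfu']
```

The values must be lists because several schools can have the same number of students.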

Dictionary of number groups to list of schools
0-49: duke, meridith, ncsu
50-99: wfu, gatech, unc, ecu, clemson
100-150: vtech, uva
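One way to build this grouped dictionary is sketched below. The helper name group_label and the exact range boundaries are assumptions read off the slide's labels, not code from the lecture.

```python
# Data copied from the "From Last time" slide.
d = {'duke': 30, 'unc': 50, 'ncsu': 40, 'wfu': 50, 'ecu': 80,
     'meridith': 30, 'clemson': 80, 'gatech': 50, 'uva': 120, 'vtech': 110}

def group_label(count):
    """Hypothetical helper: map a count to one of the ranges on the slide."""
    if count < 50:
        return '0-49'
    elif count < 100:
        return '50-99'
    return '100-150'

def group_schools(school_to_count):
    """Return a dictionary mapping each range label to a list of schools."""
    groups = {}
    for school, count in school_to_count.items():
        label = group_label(count)
        if label not in groups:
            groups[label] = []
        groups[label].append(school)
    return groups

groups = group_schools(d)
for label in ['0-49', '50-99', '100-150']:
    print(label, sorted(groups[label]))
```

The pattern is the same as inverting a dictionary; only the key changes from the exact count to the range that contains it.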

APT EmailsCourse bit.ly/101s17-0330-2

Dictionary Song problem bit.ly/101s17-0330-3
    songs = ["Hey Jude:Let it be:Day Tripper",
             "Let it be:Drive my car:Hey Jude",
             "I want to hold your hand:Day Tripper:Help!",
             "Born to run:Thunder road:She's the one",
             "Hungry heart:The river:Born to run",
             "The river:Thunder road:Drive my car",
             "Angie:Start me up:Ruby Tuesday",
             "Born to run:Angie:Drive my car"]

Assignment 7 - Demo
Snarky, Evil, Frustrating Hangman
Computer changes secret word every time player guesses, to make it "hard" to guess
  Must be consistent with all previous guesses
  Idea: the more words there are, the harder it is
    Not always true!
Example of greedy algorithm
  Locally optimal decision leads to best solution
  More words to choose from means more likely to be hung

Canonical Greedy Algorithm
How do you give change with the fewest number of coins?
  Pay $1.00 for something that costs $0.43
  Pick the largest coin you need, repeat
If you have a 50-cent piece, it also works
50 Cent is the guy on the right, a rapper.
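The "pick the largest coin, repeat" rule can be sketched as below. The function name greedy_change and the coin lists (US denominations in cents) are assumptions for illustration.

```python
def greedy_change(amount, coins):
    """Give change for amount (in cents), always taking the largest coin that fits."""
    result = []
    for coin in sorted(coins, reverse=True):   # try the biggest coins first
        while amount >= coin:
            result.append(coin)
            amount -= coin
    return result

change = 100 - 43                                  # pay $1.00 for a $0.43 item
print(greedy_change(change, [25, 10, 5, 1]))       # [25, 25, 5, 1, 1]
print(greedy_change(change, [50, 25, 10, 5, 1]))   # [50, 5, 1, 1]
```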

Greedy not always optimal
What if you have no nickels?
  Give $0.31 in change
Algorithms exist for this problem too, not greedy!
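To see the failure concretely, the sketch below compares the greedy rule with one non-greedy alternative. Dynamic programming is an assumption here: the slide only says that non-greedy algorithms exist, not which one is meant. Both function names are hypothetical.

```python
def greedy_change(amount, coins):
    """Always take the largest coin that still fits."""
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            result.append(coin)
            amount -= coin
    return result

def fewest_coins(amount, coins):
    """Dynamic programming: best[a] is a shortest list of coins summing to a."""
    best = [[]] + [None] * amount
    for a in range(1, amount + 1):
        for coin in coins:
            if coin <= a and best[a - coin] is not None:
                candidate = best[a - coin] + [coin]
                if best[a] is None or len(candidate) < len(best[a]):
                    best[a] = candidate
    return best[amount]

no_nickels = [25, 10, 1]
print(len(greedy_change(31, no_nickels)))   # 7 coins: 25 plus six pennies
print(len(fewest_coins(31, no_nickels)))    # 4 coins: 10 + 10 + 10 + 1
```

Greedy commits to the quarter and is then stuck with pennies, while the exhaustive table finds three dimes and a penny.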

Snarky Hangman
When you guess a letter, you're really guessing a category (secret word "salty")
_ _ _ _ _ and user guesses 'a'
  "gates", "cakes", "false" all have 'a' in the same place, the 2nd position
  "flats", "aorta", "straw", "spoon" all differ in where 'a' appears ("spoon" has none)
How can we help ensure the player always has many words to distinguish between?
Should we tell the user she has guessed an 'a' correctly? Or should we switch to a new secret word? IT DEPENDS!
  If we switch to "aorta", the user might get the word quickly: how many words fit A _ _ _ A?
  What if we switch to SPOON?
Run it with DEBUG OFF
Run it with DEBUG ON

Debugging Output
    number of misses left: 8
    secret so far: _ _ _ _ _ _ _ _
    (word is catalyst )
    # possible words: 7070
    guess a letter: a
    a__a___a 1
    …
    _a______ 587
    __aa____ 1
    __a_____ 498
    ________ 3475
    ___a____ 406
    ____a___ 396
    # keys = 48
    number of misses left: 7
    letters guessed: a
    …
    (word is designed )
    # possible words: 3475
    guess a letter:
This is debugging output; it shows some of the 48 different categories when an 'a' is guessed for an 8-letter word. Notice that after one guess the computer switches to "designed", which has 3,475 possibilities to distinguish between, whereas switching to _ _ A A _ _ _ _ wouldn't give so many words. It turns out that's "isaacson", not really a word anyway, but the secret word ANACONDA would be just as bad, with only one word fitting that pattern too.

Debugging Output and Game Play
Sometimes we want to see debugging output, and sometimes we don't
  While using Microsoft Word, you don't want to see the programmer's debugging statements
  Release code and development code
You'll approximate release/development using a global variable DEBUG
  Initialize to False, set to True when debugging
  Ship with DEBUG = False
The program is in the code directory for assignment6 on the course site; I tested it there and it runs.
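A global DEBUG flag of the kind described above might be used as in this minimal sketch. The helper names debug_print and report_categories are hypothetical and are not taken from the assignment code.

```python
DEBUG = False   # ship with False; set to True while debugging

def debug_print(message):
    """Hypothetical helper: print only when the global DEBUG flag is on."""
    if DEBUG:
        print(message)

def report_categories(categories):
    """Show category sizes in development mode; return the number of keys."""
    for template, words in categories.items():
        debug_print(template + " " + str(len(words)))
    debug_print("# keys = " + str(len(categories)))
    return len(categories)

# With DEBUG = False this prints nothing; the game logic is unaffected.
num_keys = report_categories({"_a__": ["cart", "bake"], "____": ["bird"]})
```

Routing every debugging statement through one flag means release and development behavior differ by a single line.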

Look at howto and categorizing words
Play a game with a list of possible words
  Initially this is all words
  List of possible words changes after each guess
Given template "_ _ _ _", list of all words, and a letter, choose a secret word
  Choose all equivalent secret words, not just one
  Greedy algorithm, choose largest category
    words = categorize(words, guess, letter)
This slide is all about explaining the code at the end of the slide, which is from the howto.

Computing the Categories
Loop over every string in words, each of which is consistent with guess (template)
  This is important; also, letter cannot be in guess
  Put letter in template according to word
  _ _ _ a _ t might become _ _ _ a n t
Build a dictionary of templates with that letter to all words that fit in that template
How to create key in dictionary?

Every time you guess a letter, build a dictionary based on that letter
Example: four-letter word, guess o
Key is a string, value is the list of strings that fit

Keys can't be lists
    ["O","_","O","_"]
Need to convert to a string to be the key representing this list: "O_O_"
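Putting the last few slides together, the categorizing step might look like the sketch below. It is not the howto's code: the body of categorize, the helper make_key, and the choice to write templates without spaces are all assumptions made for illustration.

```python
def make_key(word, guess, letter):
    """Build the template for word: copy guess, revealing letter where word has it."""
    chars = list(guess)                 # a list, so positions can be changed
    for i, ch in enumerate(word):
        if ch == letter:
            chars[i] = letter
    return "".join(chars)               # lists can't be keys, strings can

def categorize(words, guess, letter):
    """Return the largest group of words sharing one template (greedy choice).

    Assumes words is not empty.
    """
    categories = {}
    for word in words:
        key = make_key(word, guess, letter)
        if key not in categories:
            categories[key] = []
        categories[key].append(word)
    return max(categories.values(), key=len)

words = ["gates", "cakes", "false", "flats", "aorta", "straw", "spoon"]
print(make_key("gates", "_____", "a"))   # _a___
print(categorize(words, "_____", "a"))   # ['gates', 'cakes', 'false']
```

Each word lands in exactly one category, so the returned list is still consistent with every earlier guess.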

DifferentTimings.py
Problem:
  Start with a large file, a book, hawthorne.txt
  For each word, count how many times the word appears in the file
  Create a list of tuples; for each word, create a tuple (word, count of word)
We will look at several different solutions
If time, look at the different timings in this snarf file from Tuesday, also in today's snarf. The linear they've seen before, the dictionary they've seen before; this also shows binary search, just so they at least know there's a library for it. Why sorting data makes things fast.

DifferentTimings.py
Problem: (word, count of word)
Updating (key, value) pairs in structures
Three different ways:
  Search through unordered list
  Search through ordered list
  Use dictionary
Why is searching through ordered list fast?
  Guess a number from 1 to 1000, first guess?
  What is 2^10? Why is this relevant? 2^20?
Dictionary is faster! But not ordered

Linear search through list o' lists
Maintain list of [string,count] pairs
  List of lists, why can't we have list of tuples?
    [ ['dog', 2], ['cat', 1], ['bug', 4], ['ant', 5] ]
If we read string 'cat', search and update
    [ ['dog', 2], ['cat', 2], ['bug', 4], ['ant', 5] ]
If we read string 'frog', search and update
    [ ['dog', 2], ['cat', 2], ['bug', 4], ['ant', 5], ['frog', 1] ]
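The question about tuples comes down to mutability: the count inside a pair has to be updated in place. A short illustration, with variable names chosen only for this example:

```python
pair_as_list = ['cat', 1]
pair_as_list[1] += 1          # fine: lists are mutable
print(pair_as_list)           # ['cat', 2]

pair_as_tuple = ('cat', 1)
try:
    pair_as_tuple[1] += 1     # tuples cannot be changed in place
except TypeError as err:
    print("TypeError:", err)
```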

See DifferentTimings.py
    def linear(words):
        data = []
        for w in words:
            found = False
            for elt in data:
                if elt[0] == w:
                    elt[1] += 1
                    found = True
                    break
            if not found:
                data.append([w,1])
        return data
N new words? By rows

Binary Search
Find Narten
Anderson, Applegate, Bethune, Brooks, Carter, Douglas, Edwards, Franklin, Griffin, Holhouser, Jefferson, Klatchy, Morgan, Munson, Narten, Oliver, Parker, Rivers, Roberts, Stevenson, Thomas, Wilson, Woodrow, Yarbrow
FOUND! How many times divide in half?
The magenta/purple shows where Narten could be. After each guess, roughly in the middle, we eliminate half of the names from the range of names in which Narten could be. So how many times can we cut the range in half? log2(N) for an N-element list.
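The halving process on this slide can be sketched as a hand-written binary search that also counts its guesses. The function name binary_search and the guess counter are assumptions for illustration; the names are the ones listed on the slide.

```python
names = ["Anderson", "Applegate", "Bethune", "Brooks", "Carter", "Douglas",
         "Edwards", "Franklin", "Griffin", "Holhouser", "Jefferson", "Klatchy",
         "Morgan", "Munson", "Narten", "Oliver", "Parker", "Rivers", "Roberts",
         "Stevenson", "Thomas", "Wilson", "Woodrow", "Yarbrow"]

def binary_search(sorted_items, target):
    """Return (index, guesses); index is -1 if target is absent."""
    low, high = 0, len(sorted_items) - 1
    guesses = 0
    while low <= high:
        mid = (low + high) // 2          # guess roughly in the middle
        guesses += 1
        if sorted_items[mid] == target:
            return mid, guesses
        elif sorted_items[mid] < target:
            low = mid + 1                # target can only be in the upper half
        else:
            high = mid - 1               # target can only be in the lower half
    return -1, guesses

index, guesses = binary_search(names, "Narten")
print(index, guesses)                    # 14 3
```

With 24 names, no search needs more than 5 guesses, since the range can only be halved about log2(24) times before it is empty.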

Binary search through list o' lists
Maintain list of [string,count] pairs in order
    [ ['ant', 4], ['frog', 2] ]
If we read string 'cat', search and update
    [ ['ant', 4], ['cat', 1], ['frog', 2] ]
If we read string 'dog' twice, search and update
    [ ['ant', 4], ['cat', 1], ['dog', 1], ['frog', 2] ]
    [ ['ant', 4], ['cat', 1], ['dog', 2], ['frog', 2] ]

See DifferentTimings.py bit.ly/101s17-0330-4
    import bisect

    def binary(words):
        data = []
        for w in words:
            elt = [w,1]
            index = bisect.bisect_left(data, elt)
            if index == len(data):
                data.append(elt)
            elif data[index][0] != w:
                data.insert(index,elt)
            else:
                data[index][1] += 1
        return data
bisect.bisect_left(data, elt) finds the index where elt belongs in the sorted list. data.insert shifts every later element over by one.

Search via Dictionary
In linear search we looked through all pairs
In binary search we looked at log(N) pairs
  But have to shift lots if new element!!
In dictionary search we look at one pair
  Compare: one billion, 30, 1, for example
  Note that 2^10 = 1024, 2^20 ≈ one million, 2^30 ≈ one billion
Dictionary converts key to number, finds it
  Need far more locations than keys
  Lots of details to get good performance
One billion for linear, 30 for binary, 1 for dictionary. Quite a difference!
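The idea of converting a key to a number and jumping straight to its location can be illustrated with a toy table. This is only a teaching sketch with invented names (ToyTable, add_one, get); it is not how Python's built-in dict is actually implemented.

```python
class ToyTable:
    """Toy hash table with chaining; an illustration, not Python's real dict."""

    def __init__(self, num_slots=64):
        # Far more locations than keys keeps each slot nearly empty.
        self.slots = [[] for _ in range(num_slots)]

    def _slot(self, key):
        # hash() converts the key to a number; % picks a location.
        return self.slots[hash(key) % len(self.slots)]

    def add_one(self, key):
        """Increase the count for key, starting at 1 for a new key."""
        slot = self._slot(key)
        for pair in slot:
            if pair[0] == key:
                pair[1] += 1
                return
        slot.append([key, 1])

    def get(self, key):
        """Return the count for key, or 0 if it was never added."""
        for k, count in self._slot(key):
            if k == key:
                return count
        return 0

table = ToyTable()
for w in ["dog", "cat", "dog", "ant"]:
    table.add_one(w)
print(table.get("dog"))   # 2
```

Only the pairs in one slot are examined per lookup, which is why the number of slots should stay well above the number of keys.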

See DifferentTimings.py
    def dictionary(words):
        d = {}
        for w in words:
            if w not in d:
                d[w] = 1
            else:
                d[w] += 1
        return [[w,d[w]] for w in d]

Running times @ 10^9 instructions/sec (values in seconds unless a unit is given)

N        O(log N)    O(N)         O(N log N)    O(N^2)
10^2     0.0         0.0          0.0           0.00001
10^3     0.0         0.0000001    0.00001       0.001
10^6     0.0         0.001        0.02          16.7 min
10^9     0.0         1.0          29.9          31.7 years
10^12    9.9 secs    16.7 min     11.07 hr      31.7 million years

List unordered: O(N^2). List sorted: O(N log N). Dictionary: O(N).
This is a real focus in Compsci 201: linear is N^2, binary search is N log N, dictionary is N.
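A rough way to see these growth rates is to time the three functions from the slides on the same input. The sketch below is an assumption-laden stand-in for DifferentTimings.py: the functions are copied from the slides, but the time_it helper and the synthetic word list (used instead of hawthorne.txt) are invented here, and the measured times will vary by machine.

```python
import bisect
import random
import time

def linear(words):
    data = []
    for w in words:
        found = False
        for elt in data:
            if elt[0] == w:
                elt[1] += 1
                found = True
                break
        if not found:
            data.append([w, 1])
    return data

def binary(words):
    data = []
    for w in words:
        elt = [w, 1]
        index = bisect.bisect_left(data, elt)
        if index == len(data):
            data.append(elt)
        elif data[index][0] != w:
            data.insert(index, elt)
        else:
            data[index][1] += 1
    return data

def dictionary(words):
    d = {}
    for w in words:
        if w not in d:
            d[w] = 1
        else:
            d[w] += 1
    return [[w, d[w]] for w in d]

def time_it(func, words):
    """Hypothetical helper: return (seconds elapsed, result) for func(words)."""
    start = time.perf_counter()
    result = func(words)
    return time.perf_counter() - start, result

random.seed(101)
# Synthetic stand-in for the book: 5000 "words" drawn from 500 distinct strings.
words = [str(random.randrange(500)) for _ in range(5000)]
for func in (linear, binary, dictionary):
    elapsed, result = time_it(func, words)
    print(func.__name__, len(result), "distinct words,", round(elapsed, 4), "sec")
```

All three return the same counts; only the time spent finding each word differs.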


What's the best and worst case? Bit.ly/101s17-0330-5
If every word is the same ...
  Does linear differ from dictionary? Why?
If every word is different and in alphabetical order ...
  Does binary differ from linear? Why?
When would dictionary be bad?
In practice, dictionary is never bad.