Design Exercise UW CSE 160 Spring 2018.

Exercise
Given a problem description, design a module to solve the problem.
Specify a set of functions. For each function, provide:
  - the name of the function
  - a doc string for the function

Problem: Text analysis
Design a module for basic text analysis with the following capabilities:
  - Compute the total number of words in a file.
  - Find the 10 most frequent words in a file.
  - Find the number of times a given word appears in the file.
Also show how to use the interface by computing the top 10 most frequent words in the file testfile.txt.

Compare a Few Potential Designs
Consider the 3 designs.
  - For each design, state positives and negatives.
  - Which one do you think is best, and why?

Text Analysis Module, Version 1

    def word_count(filename, word):
        """Given a filename and a word, return the count of the given
        word in the given file."""

    def top10(filename):
        """Given a filename, return a list of the top 10 most frequent
        words in the given file, from most frequent to least frequent."""

    def total_words(filename):
        """Given a filename, return the total number of words in the file."""

    # client program to compute top 10:
    result = top10("somedocument.txt")

Pros: Cons:

Text Analysis Module, Version 2

    def read_words(filename):
        """Given a filename, return a list of words in the file."""

    def word_count(wordlist, word):
        """Given a list of words and a word, returns a pair
        (count, allcounts_dict). count is the number of occurrences of
        the given word in the list, allcounts_dict is a dictionary
        mapping words to counts."""

    def top10(wordcounts_dict):
        """Given a dictionary mapping words to counts, return a list of
        the top 10 most frequent words in the dictionary, from most to
        least frequent."""

    def total_words(wordlist):
        """Return total number of words in the given list."""

    # client program to compute top 10:
    word_list = read_words("somedocument.txt")
    (count, word_dict) = word_count(word_list, "anyword")
    result = top10(word_dict)

Pros: Cons:

Text Analysis Module, Version 3

    def read_words(filename):
        """Given a filename, return a dictionary mapping each word in
        filename to its frequency in the file."""

    def word_count(word_counts_dict, word):
        """Given a dictionary mapping words to counts, return the count
        of the given word in the dictionary."""

    def top10(word_counts_dict):
        """Given a dictionary mapping words to counts, return a list of
        the top 10 most frequent words in the dictionary, from most to
        least frequent."""

    def total_words(word_counts_dict):
        """Given a dictionary mapping words to counts, return the total
        number of words used to create the dictionary."""

    # client program to compute top 10:
    word_dict = read_words("somedocument.txt")
    result = top10(word_dict)
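One possible way to fill in the dictionary-based functions of Version 3 is sketched below. This is not the course's reference solution. Two behaviors are assumptions that the doc strings leave open: word_count returns 0 for a word that is not in the dictionary, and top10 breaks ties alphabetically so the result is deterministic. The sample dictionary is made up for illustration.

```python
# Hypothetical sketch of the Version 3 functions that take a dictionary.

def word_count(word_counts_dict, word):
    """Given a dictionary mapping words to counts, return the count of
    the given word (assumption: 0 if the word is absent)."""
    return word_counts_dict.get(word, 0)


def top10(word_counts_dict):
    """Return up to 10 words, from most to least frequent."""
    # Sort by descending count; ties are broken alphabetically (assumption).
    ranked = sorted(word_counts_dict, key=lambda w: (-word_counts_dict[w], w))
    return ranked[:10]


def total_words(word_counts_dict):
    """Return the total number of words used to create the dictionary."""
    return sum(word_counts_dict.values())


# Made-up data standing in for the result of read_words(...)
counts = {"the": 3, "whale": 2, "sea": 2, "ship": 1}
print(word_count(counts, "whale"))  # 2
print(top10(counts))                # ['the', 'sea', 'whale', 'ship']
print(total_words(counts))          # 8
```

Because every function takes the dictionary rather than a filename, each one can be tested with a small hand-written dictionary, and the file is read only once by the client.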

Pros: Cons:

Changes to text analysis problem
The users have requested some changes:
  - Ignore stopwords (common words such as “the”). A list of stopwords is provided in a file, one per line.
  - Show the top k words rather than the top 10.
How would the three designs handle these two changes?
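As a rough illustration of how a dictionary-based design might absorb both changes, the sketch below adds two helpers. The names remove_stopwords and top_k are hypothetical and do not appear in the slides, and the stopwords are parsed from an in-memory string here rather than read from the provided file.

```python
# Hypothetical helpers; neither name comes from the original designs.

def remove_stopwords(word_counts_dict, stopwords):
    """Return a new dictionary without any word that is in stopwords."""
    return {w: c for w, c in word_counts_dict.items() if w not in stopwords}


def top_k(word_counts_dict, k):
    """Return up to k words, from most to least frequent."""
    # Ties are broken alphabetically (an assumption, for determinism).
    ranked = sorted(word_counts_dict, key=lambda w: (-word_counts_dict[w], w))
    return ranked[:k]


# Made-up counts standing in for the result of read_words(...)
counts = {"the": 5, "a": 4, "whale": 3, "sea": 2, "ship": 1}

# Stand-in for the contents of the stopword file (one word per line).
stopwords = set("the\na\n".split())

filtered = remove_stopwords(counts, stopwords)
print(top_k(filtered, 2))  # ['whale', 'sea']
```

In this sketch the existing functions stay unchanged and the original dictionary is not modified, so a client can still ask for counts that include stopwords.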

Design criteria
  - Ease of use vs. ease of implementation
      - Module may be written once but re-used many times
  - Generality: Can it be used in a new situation?
  - Decomposability: Can parts of it be reused?
  - Testability: Can parts of it be tested?
  - Documentability: Can you write a coherent description?
  - Extensibility: Can it be easily changed?

From Word Counts Exercise:

    def read_words(filename):
        """Given a filename, return a dictionary mapping each word
        in filename to its frequency in the file"""
        wordfile = open(filename)
        worddata = wordfile.read()
        word_list = worddata.split()
        wordfile.close()
        wordcounts_dict = {}
        for word in word_list:
            if word in wordcounts_dict:
                wordcounts_dict[word] = wordcounts_dict[word] + 1
            else:
                wordcounts_dict[word] = 1
        return wordcounts_dict

This “default” pattern is so common, there is a special method for it.

setdefault

    def read_words(filename):
        """Given a filename, return a dictionary mapping each word
        in filename to its frequency in the file"""
        wordfile = open(filename)
        worddata = wordfile.read()
        word_list = worddata.split()
        wordfile.close()
        wordcounts_dict = {}
        for word in word_list:
            count = wordcounts_dict.setdefault(word, 0)
            wordcounts_dict[word] = count + 1
        return wordcounts_dict

setdefault (will NOT be on final exam)

    for word in word_list:
        if word in wordcounts_dict:
            wordcounts_dict[word] = wordcounts_dict[word] + 1
        else:
            wordcounts_dict[word] = 1

VS:

    for word in word_list:
        count = wordcounts_dict.setdefault(word, 0)
        wordcounts_dict[word] = count + 1

setdefault(key[, default])
    If key is in the dictionary, return its value. If key is NOT present, insert key with a value of default, and return default. If default is not specified, the value None is used.

get(key[, default])
    Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
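The difference between the two methods is easiest to see side by side. The short sketch below uses made-up data: get leaves the dictionary untouched, while setdefault inserts the missing key. The last loop shows the same counting pattern written with get, as one alternative to the if/else version.

```python
counts = {}

# get returns the default but does not change the dictionary.
print(counts.get("whale", 0))         # 0
print(counts)                         # {}

# setdefault returns the default and also inserts the key.
print(counts.setdefault("whale", 0))  # 0
print(counts)                         # {'whale': 0}

# Counting with get: no if/else needed.
tally = {}
for word in ["a", "b", "a"]:
    tally[word] = tally.get(word, 0) + 1
print(tally)                          # {'a': 2, 'b': 1}
```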