Design Exercise UW CSE 140 Winter 2013

Exercise
Given a problem description, design a module to solve the problem.
1) Specify a set of functions
   – For each function, provide the name of the function and a doc string for the function
2) Sketch an implementation of each function
   – In English, describe what the implementation needs to do
   – This will typically be no more than about 4-5 lines per function

Example of high-level “pseudocode”

    def read_scores(filename):
        """Return a dictionary mapping words to scores"""
        open the file
        For each line in the file, insert the word and its score into a dictionary called scores
        return the scores dictionary

    def compute_total_sentiment(searchterm):
        """Return the total sentiment for all words in all tweets in the first
        page of results returned for the search term"""
        Construct the twitter search url for searchterm
        Fetch the twitter search results using the url
        For each tweet in the response,
            extract the text
            add up the scores for each word in the text
            add the score to the total
        return the total
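The pseudocode for read_scores translates almost line for line into Python. The following is a minimal sketch, not the course's reference solution. It assumes each line of the score file holds a word (or phrase) and an integer score separated by a tab, which the slides do not specify. The helper total_sentiment is hypothetical: it covers only the scoring loop of compute_total_sentiment and takes the tweet texts as a list, leaving out the URL construction and fetching steps.

```python
def read_scores(filename):
    """Return a dictionary mapping words to scores."""
    scores = {}
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumed format: "<word>\t<score>"; the slides do not specify it.
            word, score = line.rsplit("\t", 1)
            scores[word] = int(score)
    return scores


def total_sentiment(tweet_texts, scores):
    """Hypothetical helper: return the summed score of every word in every text.

    Words that are missing from the scores dictionary count as 0.
    """
    total = 0
    for text in tweet_texts:
        for word in text.lower().split():
            total += scores.get(word, 0)
    return total
```

Keeping the scoring loop separate from the network access means it can be tested with a hand-written list of strings.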

Exercise 1: Text analysis
Design a module for basic text analysis with the following capabilities:
Compute the number of words in a file.
Find the 10 most frequent words in a file.
Find the number of times a given word appears in the file.
Also show how to use the interface by computing the top 10 most frequent words in the file testfile.txt

Text Analysis, Version 1

    def wordcount(filename, word):
        """Return the count of the given word in the file"""

    def top10(filename):
        """Return a list of the top 10 most frequent words in the file"""

    def totalwords(filename):
        """Return the total number of words in the file"""

    # program to compute top 10:
    result = top10("somedocument.txt")
    print(result)
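One possible implementation of the Version 1 interface is sketched below. It assumes that a "word" is any whitespace-separated token and that ties in frequency are broken alphabetically; the slides specify neither. The helper _words is hypothetical and not part of the interface.

```python
def _words(filename):
    """Hypothetical helper: return the whitespace-separated words in the file."""
    with open(filename) as f:
        return f.read().split()


def wordcount(filename, word):
    """Return the count of the given word in the file."""
    return _words(filename).count(word)


def top10(filename):
    """Return a list of the top 10 most frequent words in the file."""
    counts = {}
    for w in _words(filename):
        counts[w] = counts.get(w, 0) + 1
    # Highest count first; ties broken alphabetically (an assumption).
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:10]


def totalwords(filename):
    """Return the total number of words in the file."""
    return len(_words(filename))
```

Because every function takes a filename, each call re-reads the file. That keeps the client code short but repeats work when several questions are asked about the same file.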

Text Analysis, Version 2

    def read_words(filename):
        """Return a list of words in the file"""

    def wordcount(wordlist, word):
        """Returns a pair (count, allcounts).
        count is the number of occurrences of the given word
        allcounts is a dictionary from words to counts."""

    def top10(wordcounts):
        """Return a list of the top 10 most frequent words in the dictionary, in order."""

    def totalwords(wordlist):
        """Return the total number of words in the list"""

    # program to compute top 10:
    words = read_words(filename)
    (cnt, allcounts) = wordcount(words, "anyword")
    result = top10(allcounts)

Text Analysis, Version 3

    def read_words(filename):
        """Return a dictionary mapping each word in filename to its frequency in the file"""

    def wordcount(wordcounts, word):
        """Return the count of the given word in the dictionary."""

    def top10(wordcounts):
        """Return a list of the top 10 most frequent words in the dictionary, in order"""

    def totalwords(wordcounts):
        """Return the total number of words used to create the dictionary"""

    # program to compute top 10:
    wordcounts = read_words(filename)
    result = top10(wordcounts)
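The Version 3 docstrings can be satisfied with a few lines each. This sketch uses collections.Counter, a dict subclass from the standard library, as the word-to-frequency dictionary. Splitting on whitespace and breaking ties alphabetically are assumptions, not requirements from the slides.

```python
from collections import Counter


def read_words(filename):
    """Return a dictionary mapping each word in filename to its frequency in the file."""
    with open(filename) as f:
        return Counter(f.read().split())


def wordcount(wordcounts, word):
    """Return the count of the given word in the dictionary (0 if absent)."""
    return wordcounts.get(word, 0)


def top10(wordcounts):
    """Return a list of the top 10 most frequent words in the dictionary, in order."""
    # Highest count first; ties broken alphabetically (an assumption).
    ranked = sorted(wordcounts, key=lambda w: (-wordcounts[w], w))
    return ranked[:10]


def totalwords(wordcounts):
    """Return the total number of words used to create the dictionary."""
    return sum(wordcounts.values())
```

Only read_words touches the file, so the other three functions can be tested with a hand-built dictionary such as {"x": 1, "y": 5}.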

Analysis
Consider the 3 designs.
For each design, state positives and negatives.
Which one do you think is best, and why?

Changes to text analysis problem
Ignore stopwords (common words such as “the”)
   – A list of stopwords is provided in a file, one per line.
Show the top k words rather than the top 10.
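A design that passes around a word-count dictionary, like the one Version 3's read_words returns, can absorb both changes with small additions. The function names below (read_stopwords, remove_stopwords, topk) are hypothetical, and the sketch assumes stopwords are matched exactly, including case.

```python
def read_stopwords(filename):
    """Return the set of stopwords listed in the file, one per line."""
    with open(filename) as f:
        return {line.strip() for line in f if line.strip()}


def remove_stopwords(wordcounts, stopwords):
    """Return a new word-count dictionary without the given stopwords."""
    return {w: c for w, c in wordcounts.items() if w not in stopwords}


def topk(wordcounts, k):
    """Return a list of the k most frequent words in the dictionary, in order."""
    # Highest count first; ties broken alphabetically (an assumption).
    ranked = sorted(wordcounts, key=lambda w: (-wordcounts[w], w))
    return ranked[:k]
```

With these in place, the old top10(wordcounts) is just topk(wordcounts, 10), and filtering is a separate step that callers can skip.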

Design criteria
Ease of implementation
   – More important for client than for library
Generality
   – Can it be used in a new situation?
   – Decomposability: Can parts of it be reused?
   – Testability: Can parts of it be tested?
Documentability
   – Can you write a coherent description?
Extensibility: Can it be easily changed?

Exercise 2: Quantitative Analysis
Design a module for basic statistical analysis of files in UWFORMAT with the following capabilities:
Create an S-T plot: the salinity plotted against the temperature.
Compute the minimum o2 in a file.
UWFORMAT:
    line 0: site, temp, salt, o2
    line N: <site>, <temp>, <salt>, <o2>

Quantitative Analysis, Version 1

    import matplotlib.pyplot as plt

    def read_measurements(filename):
        """Return a list of 4-tuples, each one of the form (site, temp, salt, oxygen)"""

    def STplot(measurements):
        """Given a list of 4-tuples, generate a scatter plot comparing salinity and temperature"""

    def minimumO2(measurements):
        """Given a list of 4-tuples, return the minimum value of the oxygen measurement"""
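A possible implementation of this interface is sketched below. It assumes the file is comma-separated with a single header line, and that temp, salt, and o2 parse as floats; the slides give only the column names. matplotlib is imported inside STplot so that the other two functions work without it installed, which is a choice made for this sketch rather than something the slide prescribes.

```python
def read_measurements(filename):
    """Return a list of 4-tuples, each one of the form (site, temp, salt, oxygen)."""
    measurements = []
    with open(filename) as f:
        next(f, None)  # skip the header line (line 0)
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            site, temp, salt, o2 = [field.strip() for field in line.split(",")]
            measurements.append((site, float(temp), float(salt), float(o2)))
    return measurements


def STplot(measurements):
    """Given a list of 4-tuples, generate a scatter plot of salinity against temperature."""
    import matplotlib.pyplot as plt  # imported here so the rest works without it

    temps = [m[1] for m in measurements]
    salts = [m[2] for m in measurements]
    plt.scatter(temps, salts)  # temperature on x, salinity on y
    plt.xlabel("temperature")
    plt.ylabel("salinity")
    plt.show()


def minimumO2(measurements):
    """Given a list of 4-tuples, return the minimum value of the oxygen measurement."""
    return min(o2 for (_site, _temp, _salt, o2) in measurements)
```

Note that both STplot and minimumO2 depend on the position of each field in the tuple, so any change to the column order has to be applied in every function.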

Changes
UWFORMAT has changed:
UWFORMAT2:
    line 0: site, date, chl, salt, temp, o2
    line N: <site>, <date>, <chl>, <salt>, <temp>, <o2>
Find the average temperature for site “X”
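One way to make the module robust to this kind of format change is to key each measurement by the column names in the header line instead of by position. The sketch below is an alternative design, not the one on the slides. It uses csv.DictReader from the standard library and assumes the header names site, temp, and o2 are present; average_temp is a hypothetical name for the new capability.

```python
import csv


def read_measurements(filename):
    """Return a list of dictionaries, one per data line, keyed by the header's column names."""
    with open(filename, newline="") as f:
        # skipinitialspace drops the blank after each comma ("site, temp, ...").
        reader = csv.DictReader(f, skipinitialspace=True)
        return [dict(row) for row in reader]


def minimumO2(measurements):
    """Return the minimum value of the oxygen measurement."""
    return min(float(m["o2"]) for m in measurements)


def average_temp(measurements, site):
    """Return the average temperature for the given site, or None if it has no measurements."""
    temps = [float(m["temp"]) for m in measurements if m["site"] == site]
    if not temps:
        return None
    return sum(temps) / len(temps)
```

Because fields are looked up by name, the same functions read both UWFORMAT and UWFORMAT2 files, and added columns such as date and chl are simply carried along.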