Data Abstraction UW CSE 160 Spring 2015 1. What is a program? – A sequence of instructions to achieve some particular purpose What is a library? – A collection.

Slides:



Advertisements
Similar presentations
Data Abstraction UW CSE 190p Summer Recap of the Design Exercise You were asked to design a module – a set of related functions. Some of these functions.
Advertisements

1 Chapter 1 Object-Oriented Programming. 2 OO programming and design Object-oriented programming and design can be contrasted with alternative programming.
Module: Definition ● A logical collection of related program entities ● Not necessarily a physical concept, e.g., file, function, class, package ● Often.
1 ES 314 Advanced Programming Lec 2 Sept 3 Goals: Complete the discussion of problem Review of C++ Object-oriented design Arrays and pointers.
Graph Implementations Chapter 31 Slides by Steve Armstrong LeTourneau University Longview, TX  2007,  Prentice Hall.
CS 106 Introduction to Computer Science I 03 / 17 / 2008 Instructor: Michael Eckmann.
Computer Science 111 Fundamentals of Programming I Introduction to Programmer-Defined Classes.
University of Limerick1 Work with API’s. University of Limerick2 Learning OO programming u Learning a programming language can be broadly split into two.
CSM-Java Programming-I Spring,2005 Introduction to Objects and Classes Lesson - 1.
Computer Science Standard Level Mastery Aspects. Mastery Item Claimed JustificationWhere Listed Arrays Used to store the student data Lines P.
Chapter 11 Introduction to Classes Intro to Computer Science CS1510, Section 2 Dr. Sarah Diesburg.
CSC1018F: Object Orientation, Exceptions and File Handling Diving into Python Ch. 5&6 Think Like a Comp. Scientist Ch
1 CSC 221: Introduction to Programming Fall 2012 Functions & Modules  standard modules: math, random  Python documentation, help  user-defined functions,
Programming in Java Unit 2. Class and variable declaration A class is best thought of as a template from which objects are created. You can create many.
More Data Abstraction UW CSE 190p Summer Recap of the Design Exercise You were asked to design a module – a set of related functions. Some of these.
Data Abstraction UW CSE 140 Winter What is a program? – A sequence of instructions to achieve some particular purpose What is a library? – A collection.
CS 106 Introduction to Computer Science I 03 / 19 / 2007 Instructor: Michael Eckmann.
CS200 Algorithms and Data StructuresColorado State University Part 4. Advanced Java Topics Instructor: Sangmi Pallickara
Design Exercise UW CSE 140 Winter Exercise Given a problem description, design a module to solve the problem 1) Specify a set of functions – For.
JAVA 0. HAFTA Algorithms FOURTH EDITION Robert Sedgewick and Kevin Wayne Princeton University.
Review of ICS 102. Lecture Objectives To review the major topics covered in ICS 102 course Refresh the memory and get ready for the new adventure of ICS.
Cohesion and Coupling CS 4311
CSC 508 – Theory of Programming Languages, Spring, 2009 Week 3: Data Abstraction and Object Orientation.
Software Design Design is the process of applying various techniques and principles for the purpose of defining a device, a process,, or a system in sufficient.
More On Classes UW CSE 160 Spring Classes define objects What are objects we've seen? 2.
Computer Science 112 Fundamentals of Programming II Interfaces and Implementations.
Documentation / References Python Full Documentation – Python Quick Reference –
Introduction to Computing Using Python Data Storage and Processing  How many of you have taken IT 240?  Databases and Structured Query Language  Python.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
Topic 1 Object Oriented Programming. 1-2 Objectives To review the concepts and terminology of object-oriented programming To discuss some features of.
Chapter 10 Defining Classes. The Internal Structure of Classes and Objects Object – collection of data and operations, in which the data can be accessed.
Chapter 6 Introduction to Defining Classes. Objectives: Design and implement a simple class from user requirements. Organize a program in terms of a view.
Chapter 9 Dictionaries and Sets.
Python Primer 1: Types and Operators © 2013 Goodrich, Tamassia, Goldwasser1Python Primer.
Chapter 10, Slide 1 ABSTRACT DATA TYPES Based on the fundamental concept of ABSTRACTION:  process abstraction  data abstraction Both provide:  information.
Views of Data Data – nouns of programming world the objects that are manipulated information that is processed Humans like to group information Classes,
Design Exercise UW CSE 190p Summer 2012 download examples from the calendar.
Design Exercise UW CSE 160 Spring Exercise Given a problem description, design a module to solve the problem 1) Specify a set of functions – For.
AP Computer Science A – Healdsburg High School 1 Unit 9 - Why Use Classes - Anatomy of a Class.
CS 106 Introduction to Computer Science I 03 / 22 / 2010 Instructor: Michael Eckmann.
Introduction to Classes Intro to Computer Science CS1510, Section 2 Dr. Sarah Diesburg.
Chapter 9: Coupling & Cohesion Omar Meqdadi SE 273 Lecture 9 Department of Computer Science and Software Engineering University of Wisconsin-Platteville.
CSC 243 – Java Programming, Spring, 2014 Week 4, Interfaces, Derived Classes, and Abstract Classes.
David Evans CS201J: Engineering Software University of Virginia Computer Science Lecture 5: Implementing Data Abstractions.
Coupling and Cohesion Pfleeger, S., Software Engineering Theory and Practice. Prentice Hall, 2001.
CSC 243 – Java Programming, Fall, 2008 Tuesday, September 30, end of week 5, Interfaces, Derived Classes, and Abstract Classes.
COMPSCI 107 Computer Science Fundamentals
Classes and Objects: Encapsulation
COMPSCI 107 Computer Science Fundamentals
Software Development Java Classes and Methods
Department of Computer Science,
Computer Science 111 Fundamentals of Programming I
Design Exercise UW CSE 140 Winter 2014.
Design Exercise UW CSE 160 Winter 2017.
Creating and Using Classes
Data Abstraction UW CSE 160 Winter 2017.
SQL – Python and Databases (Continued)
Chapter 3 Introduction to Classes, Objects Methods and Strings
Data Abstraction UW CSE 160 Winter 2016.
Design Exercise UW CSE 160 Spring 2018.
Data Abstraction UW CSE 160 Spring 2018.
Computer Science 111 Fundamentals of Programming I
Python Primer 1: Types and Operators
Chapter 8 Advanced SQL.
Design Exercise UW CSE 160 Winter 2016.
Defining Classes and Methods
A Level Computer Science Topic 6: Introducing OOP
Namespaces – Local and Global
Data Abstraction UW CSE 160.
Presentation transcript:

Data Abstraction UW CSE 160 Spring

What is a program? – A sequence of instructions to achieve some particular purpose What is a library? – A collection of functions that are helpful in multiple programs What is a data structure? – A representation of data, and – Routines to manipulate the data Create, query, modify 2

Why break a program into parts? Easier to understand each part – Abstraction: When using a part, understand only its specification (documentation string); ignore its implementation Easier to test each part Reuse parts 3

Breaking a program into parts: the parts, and how to express them Organizing the program & algorithm: Function (procedure) Library (collection of useful functions) Data structure (representation + methods) Organizing the code (related but not the same!): Files Modules Namespaces 4

Namespace Disambiguates duplicate variable names Examples: – math.sin – File system directories 5

Review: Accessing variables in a namespace import math... math.sin... import networkx as nx g = nx.Graph() module name alias from networkx import Graph, DiGraph g = Graph() Graph and DiGraph are now available in the global namespace 6

Recall the design exercise We created a module or library: a set of related functions The functions operated on the same data structure – a dictionary associating words with a frequency count – a list of tuples of measurements Each module contained: – A function to create the data structure – Functions to query the data structure – We could have added functions to modify the data structure 7

Two types of abstraction Abstraction: Ignoring/hiding some aspects of a thing In programming, ignore everything except the specification or interface The program designer decides which details to hide and to expose Procedural abstraction: Define a procedure/function specification Hide implementation details Data abstraction: Define what the datatype represents Define how to create, query, and modify Hide implementation details of representation and of operations – Also called “encapsulation” or “information hiding” 8

Data abstraction Describing field measurements: – “A dictionary mapping strings to lists, where the strings are sites and each list has the same length and its elements corresponds to the fields in the data file.” – “FieldMeasurements” Which do you prefer? Why? (This must appear in the documentation string of every function related to field measurements!) 9

Representing a graph A graph consists of: – nodes/vertices – edges among the nodes Representations: – Set of edge pairs (a, a), (a, b), (a, c), (b, c), (c, b) – For each node, a list of neighbors { a: [a, b, c], b: [c], c: [b] } – Matrix with boolean for each entry a b c abc a ✓✓✓ b ✓ c ✓ 10

def read_words(filename): """Return dictionary mapping each word in filename to its frequency.""" wordfile = open(filename) word_list = wordfile.read().split() wordfile.close() wordcounts_dict = {} for word in word_list: count = wordcounts_dict.setdefault(word, 0) wordcounts_dict[word] = count + 1 return wordcounts_dict def word_count(wordcounts_dict, word): """Return count of the word in the dictionary. """ if wordcounts_dict.has_key(word): return wordcounts_dict[word] else: return 0 def topk(wordcounts_dict, k=10): """Return list of (count, word) tuples of the top k most frequent words.""" counts_with_words = [(c, w) for (w, c) in wordcounts_dict.items()] counts_with_words.sort(reverse=True) return counts_with_words[0:k] def total_words(wordcounts_dict): """Return the total number of words.""" return sum(wordcounts_dict.values()) Text analysis module (group of related functions) representation = dictionary # program to compute top 5: wordcounts = read_words(filename) result = topk(wordcounts, 5) 11

Problems with the implementation The wordcounts dictionary is exposed to the client: the user might corrupt or misuse it. If we change our implementation (say, to use a list), it may break the client program. We prefer to – Hide the implementation details from the client – Collect the data and functions together into one unit # program to compute top 5: wordcounts = read_words(filename) result = topk(wordcounts, 5) 12

Datatypes and classes A class creates a namespace for: – Variables to hold the data – Functions to create, query, and modify Each function defined in the class is called a method – Takes “ self ” (a value of the class type) as the first argument A class defines a datatype – An object is a value of that type – Comparison to other types: Type is int, value is 22 Type is the class, value is an object also known as an instantiation or instance of that type 13

def read_words(filename): """Return dictionary mapping each word in filename to its frequency.""" wordfile = open(filename) word_list = wordfile.read().split() wordfile.close() wordcounts_dict = {} for word in word_list: count = wordcounts_dict.setdefault(word, 0) wordcounts_dict[word] = count + 1 return wordcounts_dict def word_count(wordcounts_dict, word): """Return count of the word in the dictionary. """ if wordcounts_dict.has_key(word): return wordcounts_dict[word] else: return 0 def topk(wordcounts_dict, k=10): """Return list of (count, word) tuples of the top k most frequent words.""" counts_with_words = [(c, w) for (w, c) in wordcounts_dict.items()] counts_with_words.sort(reverse=True) return counts_with_words[0:k] def total_words(wordcounts_dict): """Return the total number of words.""" return sum(wordcounts_dict.values()) Text analysis module (group of related functions) representation = dictionary # program to compute top 5: wordcounts = read_words(filename) result = topk(wordcounts, 5) 14

class WordCounts: """Represents the words in a file.""" # Internal representation: # variable wordcounts is a dictionary mapping words to their frequency def read_words(self, filename): """Populate a WordCounts object from the given file""" word_list = open(filename).read().split() self.wordcounts = {} for w in word_list: self.wordcounts.setdefault(w, 0) self.wordcounts[w] += 1 def word_count(self, word): """Return the count of the given word""" return self.wordcounts[word] def topk(self, k=10): """Return a list of the top k most frequent words in order""" scores_with_words = [(c,w) for (w,c) in self.wordcounts.items()] scores_with_words.sort(reverse=True) return scores_with_words[0:k] def total_words(self): """Return the total number of words in the file""" return sum(self.wordcounts.values()) Each function in a class is called a method. Its first argument is of the type of the class. Text analysis, as a class Defines a class (a datatype) named WordCounts Modifies a WordCounts object Queries a WordCounts object read_words does not return a value; it mutates self The type of self is WordCounts wordcounts read_words word_count topk total_words The namespace of a WordCounts object: dict fn # program to compute top 5: wc = WordCounts() wc.read_words(filename) result = wc.topk(5) topk takes 2 arguments The type of wc is WordCounts 15

# program to compute top 5: wc = WordCounts() wc.read_words(filename) result = wc.topk(5) result = WordCounts.topk(wc, 5) A namespace, like a module A function that takes two arguments A value of type WordCounts Two equivalent calls Weird constructor: it does not do any work You have to call a mutator immediately afterward 16

Class with constructor class WordCounts: """Represents the words in a file.""" # Internal representation: # variable wordcounts is a dictionary mapping words to their frequency def __init__(self, filename): """Create a WordCounts object from the given file""" words = open(filename).read().split() self.wordcounts = {} for w in words: self.wordcounts.setdefault(w, 0) self.wordcounts[w] += 1 def word_count(self, word): """Return the count of the given word""" return self.wordcounts[word] def topk(self, k=10): """Return a list of the top k most frequent words in order""" scores_with_words = [(c,w) for (w,c) in self.wordcounts.items()] scores_with_words.sort(reverse=True) return scores_with_words[0:k] def total_words(self): """Return the total number of words in the file""" return sum([c for (w,c) in self.wordcounts]) # program to compute top 5: wc = WordCounts(filename) result = wc.topk(5) 17

Alternate implementation class WordCounts: """Represents the words in a file.""" # Internal representation: # variable words is a list of the words in the file def __init__(self, filename): """Create a WordCounts object from the given file""" self.words = open(filename).read().split() def word_count(self, word): """Return the count of the given word""" return self.words.count(word) def topk(self, k=10): """Return a list of the top k most frequent words in order""" scores_with_words = [(self.wordcount(w),w) for w in set(self.words)] scores_with_words.sort(reverse=True) return scores_with_words[0:k] def total_words(self): """Return the total number of words in the file""" return len(self.words) # program to compute top 5: wc = WordCounts(filename) result = wc.topk(5) Exact same program! words __init__ word_count topk total_words The namespace of a WordCounts object: fn list 18

Quantitative analysis def read_measurements(filename): """Return a dictionary mapping column names to data. Assumes the first line of the file is column names.""" datafile = open(filename) rawcolumns = zip(*[row.split() for row in datafile]) columns = dict([(col[0], col[1:]) for col in rawcolumn]) return columns def tofloat(measurements, columnname): """Convert each value in the given iterable to a float""" return [float(x) for x in measurements[columnname]] def STplot(measurements): """Generate a scatter plot comparing salinity and temperature""" xs = tofloat(measurements, "salt") ys = tofloat(measurements, "temp") plt.plot(xs, ys) plt.show() def minimumO2(measurements): """Return the minimum value of the oxygen measurement""" return min(tofloat(measurements, "o2")) # Program to plot mydict = read_measurements(filename) result = mydict.Stplot() 19

Quantitative analysis, as a class class Measurements: """Represents a set of measurements in UWFORMAT.""“ def read_measurements(self, filename): """Populate a Measurements object from the given file. Assumes the first line of the file is column names.""" datafile = open(filename) rawcolumns = zip(*[row.split() for row in datafile]) self.columns = dict([(col[0], col[1:]) for col in rawcolumn]) return columns def tofloat(self, columnname): """Convert each value in the given iterable to a float""" return [float(x) for x in self.columns[columnname]] def STplot(self): """Generate a scatter plot comparing salinity and temperature""" xs = tofloat(self.columns, "salt") ys = tofloat(self.columns, "temp") plt.plot(xs, ys) plt.show() def minimumO2(self): """Return the minimum value of the oxygen measurement""" return min(tofloat(self.columns, "o2")) # Program to plot mm = Measurements() mm.read_measurements(filename) result = mm.Stplot() 20

Quantitative analysis, with a constructor class Measurements: """Represents a set of measurements in UWFORMAT.""“ def __init__(self, filename): """Create a Measurements object from the given file. Assumes the first line of the file is column names.""" datafile = open(filename) rawcolumns = zip(*[row.split() for row in datafile]) self.columns = dict([(col[0], col[1:]) for col in rawcolumn]) def tofloat(self, columnname): """Convert each value in the given iterable to a float""" return [float(x) for x in self.columns[columnname]] def STplot(self): """Generate a scatter plot comparing salinity and temperature""" xs = tofloat(self.columns, "salt") ys = tofloat(self.columns, "temp") plt.plot(xs, ys) plt.show() def minimumO2(self): """Return the minimum value of the oxygen measurement""" return min(tofloat(self.columns, "o2")) # Program to plot mm = Measurements(filename) result = mm.Stplot() 21