Python dicts and sets Some material adapted from Upenn cis391 slides and other sources.

Slides:



Advertisements
Similar presentations
Dictionaries: Keeping track of pairs
Advertisements

Python 2 Some material adapted from Upenn cis391 slides and other sources.
Python Dictionary.
Lecture 05 – Dictionaries.  Consider the following methods of sorting information in a sequence. What is the output produced by each program? 2COMPSCI.
COMPSCI 101 Principles of Programming Lecture 24 – Using dictionaries to manage a small file of information.
1 Basic Text Processing and Indexing. 2 Document Processing Steps Lexical analysis (tokenizing) Stopwords removal Stemming Selection of indexing terms.
Introduction to Python II CIS 391: Artificial Intelligence Fall, 2008.
Introduction to Python II CSE-391: Artificial Intelligence University of Pennsylvania Matt Huenerfauth January 2005.
A Brief Introduction to Python
Python Programming Chapter 10: Dictionaries Saad Bani Mohammad Department of Computer Science Al al-Bayt University 1 st 2011/2012.
Built-in Data Structures in Python An Introduction.
Introduction to Computing Using Python Dictionaries: Keeping track of pairs  Class dict  Class tuple.
LECTURE 3 Python Basics Part 2. FUNCTIONAL PROGRAMMING TOOLS Last time, we covered function concepts in depth. We also mentioned that Python allows for.
COSC 1306—COMPUTER SCIENCE AND PROGRAMMING PYTHON TUPLES, SETS AND DICTIONARIES Jehan-François Pâris
Dictionaries Copyright © Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See
Introduction to Python Aug 22, 2013 Hee-gook Jun.
Quiz 3 Topics Functions – using and writing. Lists: –operators used with lists. –keywords used with lists. –BIF’s used with lists. –list methods. Loops.
1 Python Data Structures LING 5200 Computational Corpus Linguistics Martha Palmer.
Copyright © 2013 by John Wiley & Sons. All rights reserved. SETS AND DICTIONARIES CHAPTER Slides by James Tam Department of Computer Science, University.
More Python Data Structures  Classes ◦ Should have learned in Simpson’s OOP ◦ If not, read chapters in Downey’s Think Python: Think like a Computer Scientist.
CSc 110, Autumn 2016 Lecture 26: Sets and Dictionaries
COMPSCI 107 Computer Science Fundamentals
10 - Python Dictionary John R. Woodward.
Lesson 10: Dictionaries Topic: Introduction to Programming, Zybook Ch 9, P4E Ch 9. Slides on website.
Topics Dictionaries Sets Serializing Objects. Topics Dictionaries Sets Serializing Objects.
Python – May 18 Quiz Relatives of the list: Tuple Dictionary Set
CSc 110, Spring 2017 Lecture 29: Sets and Dictionaries
CMSC201 Computer Science I for Majors Lecture 21 – Dictionaries
When to use Tuples instead of Lists
Containers and Lists CIS 40 – Introduction to Programming in Python
CSc 120 Introduction to Computer Programing II
CSc 110, Autumn 2016 Lecture 27: Sets and Dictionaries
Department of Computer Science,
Data types: Complex types (Dictionaries)
Python - Dictionaries.
Week 5: Dictionaries and tuples
Lecture 10 Data Collections
CSc 110, Spring 2017 Lecture 28: Sets and Dictionaries
CSc 110, Spring 2018 Lecture 32: Sets and Dictionaries
CSc 110, Autumn 2017 Lecture 31: Dictionaries
CSc 110, Autumn 2017 Lecture 30: Sets and Dictionaries
Introduction to Python
Dictionaries GENOME 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.
Introduction to Python
CSc 110, Spring 2018 Lecture 33: Dictionaries
6. Lists Let's Learn Python and Pygame
Python 2 Some material adapted from Upenn cis391 slides and other sources.
COSC 1306 COMPUTER SCIENCE AND PROGRAMMING
Introduction to Dictionaries
IST256 : Applications Programming for Information Systems
Topics Dictionaries Sets Serializing Objects. Topics Dictionaries Sets Serializing Objects.
CS 1111 Introduction to Programming Fall 2018
Dictionaries Dictionary: object that stores a collection of data
Python I Some material adapted from Upenn cmpe391 slides and other sources.
Nate Brunelle Today: Dictionaries
COSC 1306 COMPUTER SCIENCE AND PROGRAMMING
Lesson 10: Dictionaries Class Chat: Attendance: Participation
6. Dictionaries and sets Rocky K. C. Chang 18 October 2018
Reference semantics, variables and names
COSC 1306 COMPUTER SCIENCE AND PROGRAMMING
Dictionaries: Keeping track of pairs
Python for Informatics: Exploring Information
“Everything Else”.
Nate Brunelle Today: Dictionaries
CSE 231 Lab 8.
Winter 2019 CISC101 5/26/2019 CISC101 Reminders
CSE 231 Lab 9.
CMSC201 Computer Science I for Majors Lecture 19 – Dictionaries
Sample lecture slides.
Dictionary.
Presentation transcript:

Python dicts and sets Some material adapted from Upenn cis391 slides and other sources

Overview Python doesn’t have traditional vectors and arrays! Instead, Python makes heavy use of the dict datatype (a hashtable) which can serve as a sparse array Efficient traditional arrays are available as modules that interface to C A Python set is derived from a dict

Dictionaries: A Mapping type Dictionaries store a mapping between a set of keys and a set of values Keys can be any immutable type. Values can be any type A single dictionary can store values of different types You can define, modify, view, lookup or delete the key-value pairs in the dictionary Python’s dictionaries are also known as hash tables and associative arrays

Creating & accessing dictionaries >>> d = {‘user’:‘bozo’, ‘pswd’:1234} >>> d[‘user’] ‘bozo’ >>> d[‘pswd’] 1234 >>> d[‘bozo’] Traceback (innermost last): File ‘<interactive input>’ line 1, in ? KeyError: bozo

Updating Dictionaries >>> d = {‘user’:‘bozo’, ‘pswd’:1234} >>> d[‘user’] = ‘clown’ >>> d {‘user’:‘clown’, ‘pswd’:1234} Keys must be unique Assigning to an existing key replaces its value >>> d[‘id’] = 45 {‘user’:‘clown’, ‘id’:45, ‘pswd’:1234} Dictionaries are unordered New entries can appear anywhere in output Dictionaries work by hashing

Removing dictionary entries >>> d = {‘user’:‘bozo’, ‘p’:1234, ‘i’:34} >>> del d[‘user’] # Remove one. >>> d {‘p’:1234, ‘i’:34} >>> d.clear() # Remove all. {} >>> a=[1,2] >>> del a[1] # del works on lists, too >>> a [1]

Useful Accessor Methods >>> d = {‘user’:‘bozo’, ‘p’:1234, ‘i’:34} >>> d.keys() # List of keys, VERY useful [‘user’, ‘p’, ‘i’] >>> d.values() # List of values [‘bozo’, 1234, 34] >>> d.items() # List of item tuples [(‘user’,‘bozo’), (‘p’,1234), (‘i’,34)]

A Dictionary Example Problem: count the frequency of each word in text read from the standard input, print results Six versions of increasing complexity wf1.py is a simple start wf2.py uses a common idiom for default values wf3.py sorts the output alphabetically wf4.py downcase and strip punctuation from words and ignore stop words wf5.py sort output by frequency wf6.py add command line options: -n, -t, -h

Dictionary example: wf1.py #!/usr/bin/python import sys freq = {} # frequency of words in text for line in sys.stdin: for word in line.split(): if word in freq: freq[word] = 1 + freq[word] else: freq[word] = 1 print freq

Dictionary example wf1.py #!/usr/bin/python import sys freq = {} # frequency of words in text for line in sys.stdin: for word in line.split(): if word in freq: freq[word] = 1 + freq[word] else: freq[word] = 1 print freq This is a common pattern

Dictionary example wf2.py #!/usr/bin/python import sys freq = {} # frequency of words in text for line in sys.stdin: for word in line.split(): freq[word] = 1 + freq.get(word, 0) print freq key Default value if not found

Dictionary example wf3.py #!/usr/bin/python import sys freq = {} # frequency of words in text for line in sys.stdin: for word in line.split(): freq[word] = freq.get(word,0) for w in sorted(freq.keys()): print w, freq[w]

Dictionary example wf4.py #!/usr/bin/python import sys punctuation = """'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'""" freq = {} # frequency of words in text stop_words = set() for line in open("stop_words.txt"): stop_words.add(line.strip())

Dictionary example wf4.py for line in sys.stdin: for word in line.split(): word = word.strip(punct).lower() if word not in stop_words: freq[word] = freq.get(word,0)+1 # print sorted words and their frequencies for w in sorted(freq.keys()): print w, freq[w]

Dictionary example wf5.py #!/usr/bin/python import sys from operator import itemgetter … words = sorted(freq.items(), key=itemgetter(1), reverse=True) for (w,f) in words: print w, f

Dictionary example wf6.py from optparse import OptionParser # read command line arguments and process parser = OptionParser() parser.add_option('-n', '--number', type="int", default=-1, help='number of words to report') parser.add_option("-t", "--threshold", type="int", default=0, help=”print if frequency > threshold") (options, args) = parser.parse_args() ... # print the top option.number words but only those # with freq>option.threshold for (word, freq) in words[:options.number]: if freq > options.threshold: print freq, word

Why must keys be immutable? The keys used in a dictionary must be immutable objects? >>> name1, name2 = 'john', ['bob', 'marley'] >>> fav = name2 >>> d = {name1: 'alive', name2: 'dead'} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: list objects are unhashable Why is this? Suppose we could index a value for name2 and then did fav[0] = “Bobby” Could we find d[name2] or d[fav] or …?

defaultdict >>> from collections import defaultdict >>> kids = defaultdict(list, {'alice': ['mary', 'nick'], 'bob': ['oscar', 'peggy']}) >>> kids['bob'] ['oscar', 'peggy'] >>> kids['carol'] [] >>> age = defaultdict(int) >>> age['alice'] = 30 >>> age['bob'] 0 >>> age['bob'] += 1 >>> age defaultdict(<type 'int'>, {'bob': 1, 'alice': 30})