Dictionaries CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.

Slides:



Advertisements
Similar presentations
Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all.
Advertisements

In Pictures The Gettysburg Address Photo by Tim EvansonTim Evanson.
What should be done with 7,000+ deceased soldiers after a battle?
American History Museum Walkthrough. Bombing of Fort McHenry during the War of 1812.
1861 – 1865 Timeline & Photo Presentation
Gettysburg Address Four score and seven years ago, our fathers brought forth upon this continent a new nation: conceived in liberty, and dedicated to the.
A new way of looking at texts
Microsoft PowerPoint The Bells and Whistles.
The Gettysburg Address Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated.
Richardson 3040 PowerPoint Rules Rule 1 Everything should enhance the content of the presentation Regions of Tennessee.
Files CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
Hashes a “hash” is another fundamental data structure, like scalars and arrays. Hashes are sometimes called “associative arrays”. Basically, a hash associates.
Reading Pointers: Chapter 4, pages Strings as Arrays of Characters string Greeting = "hello"; vs. char *Greeting = "hello";
Plagiarism: what it means to you Ms. Allen JTA Library Media Specialist.
Basics: Text boxes Backgrounds Shapes Fonts Transitions Animations Spell check Clip Organizer Advanced: Music Video / URL Merge Presentations Narration/voice.
By Karissa Lynn Montag The Lincoln Museum is Located in Springfield Illinois. In one room there is Lincoln’s house and you can dress up as Lincoln or.
Improving Your Communication Skills & Speaking in Turbulent Situations.
CS 177 Week 11 Recitation Slides 1 1 Dictionaries, Tuples.
A First Program CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington Credits: a significant part of.
LINCOLN’S GETTYSBURG ADDRESS Lincoln gave the battle a higher meaning. The war has a purpose. These men died to make Americans live up to their own beliefs-
LINCOLN’S GETTYSBURG ADDRESS November 19, To understand what Abraham Lincoln was stating in the Gettysburg Address.
© Copyright 2012 by Pearson Education, Inc. All Rights Reserved. Chapter 14 Tuples, Sets, and Dictionaries 1.
Strings CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
The Call For Change Supplemental Information 20. MCS Intervention Strategy Repeated Reading Readers’ Theater 1. Choose a script. Choose a prepared script,
Visual Aids Communication delivered over multiple channels is more efficient than communication over a single channel –More likely the whole message.
15,000 spectators were in attendance The Gettysburg Address.
Lists CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
Built-in Data Structures in Python An Introduction.
*701 *702 Graduate Seminar Useful Tips Host: JJ Hoyt.
World Affairs 9/7/11 Legacy of Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and.
15,000 spectators were in attendance The Gettysburg Address.
Lincoln’s Gettysburg Address Given November 19, 1863 on the battlefield near Gettysburg, Pennsylvania, U.S.A.
Battle Hymn for Gettysburg Music adapted/arr. by Teresa Jennings Music K-8, Vol.19, Num.3 © 2009 Plank Road Publishing, Inc. All Rights Reserved- used.
Civil War, pt3. Andersonville Prison Libby Prison.
President for a day Can you handle it???. Your Task… You are being asked to dedicate a cemetery for fallen soldiers. The cemetery is on the site where.
Abraham Lincoln He was born on February 12, 1809 in Hodgenville Kentucky. He is the 16 th President of the United States of America He was in office from.
Last lecture: Point Estimation A point estimator is function of the observations in a random sample which is used to estimate an unknown parameter. A point.
 2008 Pearson Education, Inc. All rights reserved JavaScript: Introduction to Scripting.
ENGLISH LANGUAGE ARTS GRADE 9 Mrs. Jeffries. Parallelism Unit 1 PARALLEL STRUCTURE means using the same pattern of words to show that two or more ideas.
A First Program CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington Credits: a significant part of.
“EVERY SPEECH IS A RHYMELESS, METERLESS VERSE.” -WINSTON CHURCHILL Power Poetry.
1/9/14 O CO: Evaluate Lincoln’s efforts to abolish slavery and to end the Civil War. O QW: O Read and analyze the quotes from Lincoln’s letters.
The Civil War Antietam Gettysburg. What does Secession mean? What was Fort Sumter? Who took control of it? Who was the confederate commander at the Battle.
The Gettysburg Address By Zoe and Bryony. Information Abraham Lincoln wrote and read the famous speech It was spoken at the dedication of the soldiers'
Gettysburg Picture Analysis- Gallery Walk Civil War Picture Analysis- With a partner- Use post-it notes to analyze and annotate the photos. Put the post-its.
People Cannot Choose a Representative Sample Carla L. Hill Marist College.
% The percent sign is computer language for: Get ready, here comes something you want or OK that is all you needed A “%” should be at the beginning and.
Computer Skills and Applications 8th Grade
Dictionaries CSE 1310 – Introduction to Computers and Programming
Presentation Purpose 6.01 Understand business uses of presentation software and methods of distribution Presentation Purpose.
Did Lincoln free the slaves? Or did the slaves free themselves?
warm-up: Complete on your own sheet of paper.
Raise your hand if… you have ever read an entire paragraph, passage, or page only to realize that you have absolutely no clue what you just read.
Do Now What things do you think finally pushed the United States into civil war?
Raise your hand if… you have ever read an entire paragraph, passage, or page only to realize that you have absolutely no clue what you just read.
Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all.
Gettysburg Picture Analysis- Gallery Walk
The Gettysburg Address
Style Analysis: SYNTAX
Raise your hand if… you have ever read an entire paragraph, passage, or page only to realize that you have absolutely NO clue what you just read.
Presentation Purpose 6.01 Understand business uses of presentation software and methods of distribution Presentation Purpose.
The Gettysburg Address
Raise your hand if… you have ever read an entire paragraph, passage, or page only to realize that you have absolutely no clue what you just read.
SOAPSTone is a reading and writing strategy that helps us recognize the structure of a text and aides student writing from planning through to revision.
7X Monday The Tide of War Turns
Bell Ringer Write a short response to the following quote. I am looking for about 4 sentences total. Tell me what you think the quote means and how it.
Rhetorical Devices…SPEECHES!
Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all.
Emancipation Proclamation
Presentation transcript:

Dictionaries CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1

Keys and Values Oftentimes we need lists of pairs, that associate values to specific keys. – E.g.: product codes are keys, and prices are values. – E.g.: names are keys, and phone numbers are values. phone_list = [['mary', 2341], ['joe', 5423]] 2

Life Without Dictionaries Oftentimes we need lists of pairs, that associate values to specific keys. – E.g.: product codes are keys, and prices are values. – E.g.: names are keys, and phone numbers are values. phone_list = [['mary', 2341], ['joe', 5423]] Then, finding the value for a specific key is doable, but a bit of a pain. 3

Life Without Dictionaries Oftentimes we need lists of pairs, that associate values to specific keys. – E.g.: product codes are keys, and prices are values. – E.g.: names are keys, and phone numbers are values. phone_list = [['mary', 2341], ['joe', 5423]] Then, finding the value for a specific key is doable, but a bit of a pain. for item in phone_list: if (item[0] == 'joe'): print(item[1]) 4

Life With Dictionaries With dictionaries, dealing with key-value pairs becomes much easier. phonebook = {'mary' : 2341, 'joe' : 5423} Then, finding the value for a specific key is very simple: print(phonebook['joe']) 5

The in Operator key in dictionary returns true if the specified key is a key in the dictionary. IMPORTANT: for dictionaries, the in operator only looks at keys, not values. 6 >>> phonebook {'mary': 59013, 'joe': 23432} >>> 'joe' in phonebook True >>> in phonebook False >>> 'bill' in phonebook False

The del Function You can delete dictionary entries using the del function. If your dictionary is stored in a variable called dictionary_name, and you want to delete the entry associated with the specific key, use this syntax: del(dictionary_name[key]) 7 >>> phonebook {'mary': 59013, 'joe': 23432} >>> del(phonebook['mary']) >>> phonebook {'joe': 23432}

The len Function As in lists and strings, the len operator gives you the number of elements (i.e., the number of key-value pairs) in the dictionary. 8 >>> phonebook {'mary': 59013, 'joe': 23432} >>> len(phonebook) 2

The items Method dictionary.items() returns the set of key-value pairs in dictionary. 9 >>> phonebook {'mary': 59013, 'joe': 23432} >>> entries = phonebook.items() >>> for entry in entries: print(entry) ('mary', 59013) ('joe', 23432)

The keys Method dictionary.keys() returns the set of keys in dictionary. 10 >>> elements = phonebook.keys() >>> for element in elements: print(element) mary joe

The values Method dictionary.values() returns the set of values in dictionary. 11 >>> elements = phonebook.values() >>> for element in elements: print(element)

Example: Frequencies of Words in Text Suppose that we have a text file, and we want to: – count how many unique words appear in the text. – count how many times each of those words appears. 12

Step 1: Count Frequencies def count_words(filename): in_file = open(filename, "r") # initialize the dictionary to empty result = {} for line in in_file: words = line.split() for word in words: if (word in result): result[word] += 1 else: result[word] = 1 return result 13 dictionary operations shown in red

Step 2: Printing the Result def main(): filename = "file1.txt" dictionary = count_words(filename) print(dictionary) {'final': 1, 'men,': 1, 'brought': 1, 'met': 1, 'and': 5, 'here,': 2, 'Four': 1, 'years': 1, 'nor': 1, 'any': 1, 'not': 5, 'it,': 1, 'nation,': 3, 'say': 1, 'God,': 1, 'unfinished': 1, 'have': 5, 'battlefield': 1, 'nation': 2, 'or': 2, 'come': 1, 'nobly': 1, 'vain-that': 1, 'proposition' … 14 This printout is hard to read

Step 2 Revised def print_word_frequencies(dictionary): print() for word in dictionary: frequency = dictionary[word] print(word + ":", frequency) print() 15 OUTPUT: nor: 1 fought: 1 last: 1 hallow: 1 endure.: 1 can: 5 highly: 1 rather: 1 of: 5 men,: 1 in: 4 here,: 2 brought: 1 here.: 1 The: 2 on: 2 our: 2 or: 2 … We have defined a function that prints the dictionary in a nicer format. Remaining problems?

Step 2 Revised def print_word_frequencies(dictionary): print() for word in dictionary: frequency = dictionary[word] print(word + ":", frequency) print() 16 OUTPUT: nor: 1 fought: 1 last: 1 hallow: 1 endure.: 1 can: 5 highly: 1 rather: 1 of: 5 men,: 1 in: 4 here,: 2 brought: 1 here.: 1 The: 2 on: 2 our: 2 or: 2 … the: 9 … Remaining problems? – Result must be case-insensitive. E.g., "The" and "the" should not be counted as separate words.

Step 2 Revised def print_word_frequencies(dictionary): print() for word in dictionary: frequency = dictionary[word] print(word + ":", frequency) print() 17 OUTPUT: nor: 1 fought: 1 last: 1 hallow: 1 endure.: 1 can: 5 highly: 1 rather: 1 of: 5 men,: 1 in: 4 here,: 2 brought: 1 here.: 1 The: 2 on: 2 our: 2 or: 2 … Remaining problems? – We should ignore punctuation. E.g., "here" and "here." should not be counted as separate words.

Step 2 Revised def print_word_frequencies(dictionary): print() for word in dictionary: frequency = dictionary[word] print(word + ":", frequency) print() 18 OUTPUT: nor: 1 fought: 1 last: 1 hallow: 1 endure.: 1 can: 5 highly: 1 rather: 1 of: 5 men,: 1 in: 4 here,: 2 brought: 1 here.: 1 The: 2 on: 2 our: 2 or: 2 … Remaining problems? – Would be nice to sort, either alphabetically or by frequency.

Making Result Case Insensitive 19 To make the results case-insensitive, we can simply convert all text to lower case as soon as we read it from the file.

Making Result Case Insensitive def count_words(filename): in_file = open(filename, "r") # initialize the dictionary to empty result = {} for line in in_file: line = line.lower() words = line.split() for word in words: if (word in result): result[word] += 1 else: result[word] = 1 return result 20 PREVIOUS OUTPUT: in: 4 here,: 2 brought: 1 here.: 1 The: 2 on: 2 our: 2 or: 2 … the: 9 … 156 words found

Making Result Case Insensitive def count_words(filename): in_file = open(filename, "r") # initialize the dictionary to empty result = {} for line in in_file: line = line.lower() words = line.split() for word in words: if (word in result): result[word] += 1 else: result[word] = 1 return result 21 NEW OUTPUT: … devotion-that: 1 field,: 1 honored: 1 testing: 1 far: 2 the: 11 from: 2 advanced.: 1 full: 1 above: 1 … 153 words found

22 To ignore punctuation, we will: – delete from the text that we read all occurrences of punctuation characters (periods, commas, parentheses, exclamation marks, quotes). – We will replace dashes with spaces (since dashes are used to separate individual words). To isolate this processing step, we make a separate function for it, that we call process_line. Ignoring Punctuation

def process_line(line): line = line.lower() new_line = "" for letter in line: if letter in """,.!"'()""":: continue elif letter == '-': letter = ' ' new_line = new_line + letter words = new_line.split() return words 23

Ignoring Punctuation def count_words(filename): in_file = open(filename, "r") # initialize the dictionary to empty result = {} for line in in_file: words = process_line(line) for word in words: if (word in result): result[word] += 1 else: result[word] = 1 return result 24 PREVIOUS OUTPUT: people,: 3 under: 1 those: 1 to: 8 men,: 1 full: 1 are: 3 it,: 1 … for: 5 whether: 1 men: 1 sense,: 1 … 153 words found

Ignoring Punctuation def count_words(filename): in_file = open(filename, "r") # initialize the dictionary to empty result = {} for line in in_file: words = process_line(line) for word in words: if (word in result): result[word] += 1 else: result[word] = 1 return result 25 NEW OUTPUT: fathers: 1 people: 3 forth: 1 for: 5 men: 2 ago: 1 field: 1 increased: 1 … 138 words found

26 We want to sort the results by frequency. Frequencies are values in the dictionary. So, our problem is the more general problem of sorting dictionary entries by value. To do that, we will create an inverse dictionary, called inverse, where: – inverse[frequency] is a list of all words in the original dictionary having that frequency. To isolate this processing step, we make a separate function for it, that we call inverse_dictionary. We use inverse_dictionary in print_word_frequencies. Sorting by Frequency

def inverse_dictionary(in_dictionary): out_dictionary = {} for key in in_dictionary: value = in_dictionary[key] if (value in out_dictionary): list_of_keys = out_dictionary[value] list_of_keys.append(key) else: out_dictionary[value] = [key] return out_dictionary 27 Sorting by Frequency

def print_word_frequencies(dictionary): print() inverse = inverse_dictionary(dictionary) frequencies = inverse.keys() frequencies = list(frequencies) frequencies.sort() frequencies.reverse() for frequency in frequencies: list_of_words = inverse[frequency] list_of_words.sort() for word in list_of_words: print(word + ":", frequency) 28 Sorting by Frequency

def print_word_frequencies(dictionary): print() inverse = inverse_dictionary(dictionary) frequencies = inverse.keys() frequencies = list(frequencies) frequencies.sort() frequencies.reverse() for frequency in frequencies: list_of_words = inverse[frequency] list_of_words.sort() for word in list_of_words: print(word + ":", frequency) 29 Sorting by Frequency PREVIOUS OUTPUT: live: 1 above: 1 but: 2 government: 1 gave: 2 note: 1 remember: 1 advanced: 1 world: 1 whether: 1 equal: 1 seven: 1 task: 1 they: 3 … 153 words found

def print_word_frequencies(dictionary): print() inverse = inverse_dictionary(dictionary) frequencies = inverse.keys() frequencies = list(frequencies) frequencies.sort() frequencies.reverse() for frequency in frequencies: list_of_words = inverse[frequency] list_of_words.sort() for word in list_of_words: print(word + ":", frequency) 30 Sorting by Frequency NEW OUTPUT: that: 13 the: 11 we: 10 here: 8 to: 8 a: 7 and: 6 can: 5 for: 5 have: 5 it: 5 nation: 5 not: 5 of: 5 … 153 words found