LING 388: Computers and Language Lecture 7
Today's Topics
Programming exercises (to be done in class)
Homework
Python
https://docs.python.org/3/tutorial/introduction.html
We have covered:
Numbers
Strings (indexing, slices)
Lists (stacks, queues – see deques)
Sets
Dictionaries
Ranges
Exercise 1
Enter this (1st paragraph in "Alice's Adventures in Wonderland" by Lewis Carroll) into Python:
paragraph1 = 'Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do. Once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"'
len(paragraph1) counts what?
What does paragraph1.split() do?
Store the result of split into a variable paragraph2
len(paragraph2) counts what?
Calculate the average number of characters per word
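The steps in Exercise 1 can be sketched on a smaller input. This is only an illustration: the sentence and the variable names (`sentence`, `words`, `avg_len`) are made up here and are not part of the exercise.

```python
# A short made-up sentence stands in for paragraph1 (assumption for illustration).
sentence = 'The cat sat on the mat.'

# len() on a string counts characters, including spaces and punctuation.
num_chars = len(sentence)

# split() with no arguments breaks the string at runs of whitespace
# and returns a list of words; punctuation stays attached to the words.
words = sentence.split()

# len() on a list counts its items, here the number of words.
num_words = len(words)

# Average characters per word, counting only characters inside the words
# (spaces are excluded; attached punctuation is still counted).
avg_len = sum(len(w) for w in words) / num_words

print(num_chars, num_words, avg_len)
```

Dividing `len(sentence)` by `len(words)` instead would give a slightly larger figure, because the spaces between words would be counted too. Either reading of "average" may be acceptable; the point is to be clear about which one is being computed.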
Exercise 2
Write a for loop to print each word on a separate line
paragraph2
['Alice', 'was', 'beginning', 'to', 'get', 'very', 'tired', 'of', 'sitting', 'by', 'her', 'sister', 'on', 'the', 'bank,', 'and', 'of', 'having', 'nothing', 'to', 'do.', 'Once', 'or', 'twice', 'she', 'had', 'peeped', 'into', 'the', 'book', 'her', 'sister', 'was', 'reading,', 'but', 'it', 'had', 'no', 'pictures', 'or', 'conversations', 'in', 'it,', '"and', 'what', 'is', 'the', 'use', 'of', 'a', 'book,"', 'thought', 'Alice,', '"without', 'pictures', 'or', 'conversations?"']
Notice some words end with punctuation, e.g. bank, or do. or book,"
Notice some words begin with punctuation, e.g. "and or "without
Try the following:
import re
word = 'conversations?"'
re.sub("[?\"'.]","",word)
Note: this pattern does not include the comma, so add , inside the brackets to clean words like bank,
Write a for loop to print each word of paragraph2 (with punctuation deleted) on a separate line
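A minimal sketch of the loop-and-strip idea, using a made-up word list rather than paragraph2. The names `words`, `punctuation`, and `cleaned` are illustrative, and the comma is added to the character class as an assumption about which punctuation should be deleted.

```python
import re

# A short made-up word list stands in for paragraph2 (assumption for illustration).
words = ['"Hello,', 'said', 'Alice.', '"Who', 'are', 'you?"']

# Character class listing the punctuation to delete; the comma is included
# here so that words like 'Hello,' are cleaned as well.
punctuation = "[,.?'\"]"

cleaned = []
for word in words:
    # re.sub replaces every match of the pattern with the empty string.
    stripped = re.sub(punctuation, "", word)
    cleaned.append(stripped)
    print(stripped)
```

Note that `re.sub` deletes these characters wherever they occur in a word, not only at the edges, so a word such as `don't` would become `dont` with this pattern.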
Exercise 2
Enter this:
[x.lower() for x in paragraph2]
[re.sub("[?\"'.]","",x) for x in paragraph2]
What does this kind of for loop do? (It's called a list comprehension.)
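One way to read a list comprehension is as a compact form of a for loop that appends to a list. The sketch below compares the two on a made-up word list; the names `lowered_loop` and `lowered_comp` are illustrative.

```python
# A short made-up word list stands in for paragraph2 (assumption for illustration).
words = ['Alice', 'Was', 'TIRED']

# Explicit for loop: start with an empty list and append one result per item.
lowered_loop = []
for x in words:
    lowered_loop.append(x.lower())

# List comprehension: the same computation written as a single expression.
lowered_comp = [x.lower() for x in words]

print(lowered_loop == lowered_comp)
```

In both versions the original list `words` is left unchanged; a new list is built.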
Exercise 2: Examples
>>> [x.lower() for x in paragraph2]
['alice', 'was', 'beginning', 'to', 'get', 'very', 'tired', 'of', 'sitting', 'by', 'her', 'sister', 'on', 'the', 'bank,', 'and', 'of', 'having', 'nothing', 'to', 'do.', 'once', 'or', 'twice', 'she', 'had', 'peeped', 'into', 'the', 'book', 'her', 'sister', 'was', 'reading,', 'but', 'it', 'had', 'no', 'pictures', 'or', 'conversations', 'in', 'it,', '"and', 'what', 'is', 'the', 'use', 'of', 'a', 'book,"', 'thought', 'alice,', '"without', 'pictures', 'or', 'conversations?"']
>>> paragraph3 = [x.lower() for x in paragraph2]
>>> paragraph3
>>> [re.sub("[,.?'\"]", "", x) for x in paragraph3]
['alice', 'was', 'beginning', 'to', 'get', 'very', 'tired', 'of', 'sitting', 'by', 'her', 'sister', 'on', 'the', 'bank', 'and', 'of', 'having', 'nothing', 'to', 'do', 'once', 'or', 'twice', 'she', 'had', 'peeped', 'into', 'the', 'book', 'her', 'sister', 'was', 'reading', 'but', 'it', 'had', 'no', 'pictures', 'or', 'conversations', 'in', 'it', 'and', 'what', 'is', 'the', 'use', 'of', 'a', 'book', 'thought', 'alice', 'without', 'pictures', 'or', 'conversations']
>>> [re.sub("[.,?'\"]", "",x.lower()) for x in paragraph2]
>>> [re.sub("[.,?'\"]", "",x).lower() for x in paragraph2]
>>> [re.sub("[.,?'\"]", "",x) for x.lower() in paragraph2]
  File "<stdin>", line 1
SyntaxError: can't assign to function call
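The last three comprehensions differ only in where `lower()` is applied. The sketch below, on a made-up word list, suggests why the first two agree and why the third is rejected. The names `words`, `pattern`, `a`, `b`, and `bad_source` are illustrative, and `compile()` is used here only to check the syntax without stopping the script.

```python
import re

# A short made-up word list stands in for paragraph2 (assumption for illustration).
words = ['Bank,', 'Do.', '"And']
pattern = "[.,?'\"]"

# Lowercase first, then strip punctuation.
a = [re.sub(pattern, "", x.lower()) for x in words]

# Strip punctuation first, then lowercase the result.
b = [re.sub(pattern, "", x).lower() for x in words]

# Both orders give the same list here because the pattern matches only
# punctuation, which lower() leaves unchanged.
print(a == b)

# The third form fails before it runs: the part between 'for' and 'in'
# must be something that can be assigned to (such as a name), not a call.
bad_source = "[x for x.lower() in words]"
try:
    compile(bad_source, "<example>", "eval")
    compiled_ok = True
except SyntaxError:
    compiled_ok = False
print(compiled_ok)
```

The exact wording of the error message varies between Python versions, but in each case it is a `SyntaxError` raised before any of the list is processed.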
Exercise 3
Enter this:
from collections import Counter
c = Counter()
c['alice'] += 1
c['a'] += 1
c
c.most_common()
See also https://docs.python.org/3/library/collections.html#collections.Counter
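As a small sketch of how `Counter` behaves when counting in a loop, the example below uses a made-up word list (an assumption for illustration, not the Alice paragraph).

```python
from collections import Counter

# A short made-up word list (assumption for illustration).
words = ['the', 'cat', 'and', 'the', 'hat', 'and', 'the', 'bat']

c = Counter()
for w in words:
    # A missing key counts as 0, so += 1 works without initializing it first.
    c[w] += 1

# most_common() returns (item, count) pairs sorted from most to least frequent;
# most_common(n) returns only the top n pairs.
print(c.most_common(2))

# Looking up a word that was never counted returns 0 rather than raising KeyError.
print(c['dog'])
```

This differs from a plain dictionary, where `d[w] += 1` raises a `KeyError` for a key that has not been set yet.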
Homework 4
Starting with:
paragraph1 = 'Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do. Once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, "and what is the use of a book," thought Alice, "without pictures or conversations?"'
and using what you've learnt in Exercises 1–3, write Python code that builds a frequency table for the words in paragraph1
Use the most_common() method to print your table
Submit your Python console session (cut and paste or screen snapshot) showing your work
Homework 4 Submit to TA Patricia Lee by next Wednesday (by midnight) One PDF file only!