CSc 110, Autumn 2016 Lecture 20: Lists for Tallying; Text Processing

Slides:



Advertisements
Similar presentations
1 Various Methods of Populating Arrays Randomly generated integers.
Advertisements

Building Java Programs Chapter 7 Arrays. 2 Can we solve this problem? Consider the following program (input underlined): How many days' temperatures?
Copyright 2008 by Pearson Education Building Java Programs Chapter 7 Lecture 7-2: Tallying and Traversing Arrays reading: 7.1 self-checks: #1-9 videos:
Multi-Dimensional Arrays Rectangular & Jagged Plus: More 1D traversal.
Copyright 2008 by Pearson Education Building Java Programs Chapter 7 Lecture 7-2: Tallying and Traversing Arrays reading: 7.1 self-checks: #1-9 videos:
Copyright 2010 by Pearson Education Building Java Programs Chapter 7 Lecture 7-3: Arrays for Tallying; Text Processing reading: 4.3, 7.6.
1 Reference semantics. 2 A swap method? Does the following swap method work? Why or why not? public static void main(String[] args) { int a = 7; int b.
Building Java Programs Chapter 7 Arrays Copyright (c) Pearson All rights reserved.
Building Java Programs Chapter 7 Arrays Copyright (c) Pearson All rights reserved.
1 BUILDING JAVA PROGRAMS CHAPTER 7 LECTURE 7-3: ARRAYS AS PARAMETERS; FILE OUTPUT.
Copyright 2008 by Pearson Education Building Java Programs Chapter 7 Lecture 7-3: Arrays as Parameters; File Output reading: 7.1, 4.3, 3.3 self-checks:
Building Java Programs Chapter 7 Lecture 7-3: Arrays for Tallying; Text Processing reading: 7.6, 4.3.
1 Building Java Programs Chapter 7: Arrays These lecture notes are copyright (C) Marty Stepp and Stuart Reges, They may not be rehosted, sold, or.
Building Java Programs Chapter 7
Copyright 2010 by Pearson Education Building Java Programs Chapter 7 Lecture 7-3: Arrays for Tallying; Text Processing reading: 7.1, 4.4 self-checks: #1-9.
Copyright 2010 by Pearson Education Building Java Programs Chapter 7 Lecture 17: Arrays for Tallying; Text Processing reading: 4.3, 7.6 (Slides adapted.
Topic 23 arrays - part 3 (tallying, text processing) Copyright Pearson Education, 2010 Based on slides bu Marty Stepp and Stuart Reges from
Copyright 2008 by Pearson Education Building Java Programs Chapter 7 Lecture 7-3: Arrays as Parameters; File Output reading: 7.1, 4.3, 3.3 self-checks:
Building Java Programs Chapter 7 Arrays Copyright (c) Pearson All rights reserved.
CSc 110, Autumn 2016 Lecture 15: lists Adapted from slides by Marty Stepp and Stuart Reges.
CSc 110, Autumn 2016 Lecture 26: Sets and Dictionaries
Adapted from slides by Marty Stepp and Stuart Reges
Building Java Programs
CSc 110, Autumn 2016 Lecture 27: Sets and Dictionaries
CSc 110, Autumn 2017 Lecture 24: Lists for Tallying; Text Processing
CSc 110, Autumn 2016 Lecture 13: Random Numbers
CSc 110, Spring 2018 Lecture 32: Sets and Dictionaries
CSc 110, Autumn 2017 Lecture 31: Dictionaries
CSc 110, Autumn 2017 Lecture 30: Sets and Dictionaries
2008/11/24: Lecture 19 CMSC 104, Section 0101 John Y. Park
Building Java Programs
Building Java Programs
Adapted from slides by Marty Stepp and Stuart Reges
2008/11/24: Lecture 19 CMSC 104, Section 0101 John Y. Park
CSc 110, Spring 2018 Lecture 33: Dictionaries
Building Java Programs
Building Java Programs
CSc 110, Spring 2018 Lecture 9: Parameters, Graphics and Random
CSc 110, Autumn 2016 Lecture 19: lists as Parameters
Lecture 15: lists Adapted from slides by Marty Stepp and Stuart Reges
CSc 110, Autumn 2016 Lecture 9: input; if/else
CSc 110, Autumn 2017 Lecture 9: Graphics and Nested Loops
CSc 110, Spring 2017 Lecture 17: Line-Based File Input
Lecture 7-3: Arrays for Tallying; Text Processing
Topic 23 arrays - part 3 (tallying, text processing)
Building Java Programs
Why did the programmer quit his job?
Adapted from slides by Marty Stepp and Stuart Reges
Adapted from slides by Marty Stepp and Stuart Reges
Building Java Programs Chapter 7
Adapted from slides by Marty Stepp and Stuart Reges
Building Java Programs
Adapted from slides by Marty Stepp and Stuart Reges
CSc 110, Spring 2017 Lecture 20: Lists for Tallying; Text Processing
CSc 110, Spring 2018 Lecture 5: Loop Figures and Constants
Building Java Programs
Building Java Programs
Adapted from slides by Marty Stepp and Stuart Reges
Adapted from slides by Marty Stepp and Stuart Reges
Building Java Programs
Building Java Programs
Building Java Programs
Building Java Programs
Building Java Programs
Building Java Programs
Building Java Programs
CSc 110, Autumn 2016 Lecture 20: Lists for Tallying; Text Processing
Arrays, Part 2 of 2 Topics Array Names Hold Address How Indexing Works
A counting problem Problem: Write a method mostFrequentDigit that returns the digit value that occurs most frequently in a number. Example: The number.
Building Java Programs
Presentation transcript:

CSc 110, Autumn 2016 Lecture 20: Lists for Tallying; Text Processing Adapted from slides by Marty Stepp and Stuart Reges

Value/Reference Semantics Variables of type int, float, boolean, store values directly: Values are copied from one variable to another: cats = age Variables of object types store references to memory: References are copied from one variable to another: scores = grades age 20 cats 3 age 20 cats 20 index 1 2 value 89 78 93 grades scores

A multi-counter problem Problem: Write a function most_frequent_digit that returns the digit value that occurs most frequently in a number. Example: The number 669260267 contains: one 0, two 2s, four 6es, one 7, and one 9. most_frequent_digit(669260267) returns 6. If there is a tie, return the digit with the lower value. most_frequent_digit(57135203) returns 3.

A multi-counter problem We could declare 10 counter variables ... counter0, counter1, counter2, counter3, counter4, counter5, counter6, counter7, counter8, counter9 But a better solution is to use a list of size 10. The element at index i will store the counter for digit value i. Example for 669260267: How do we build such an list? And how does it help? index 1 2 3 4 5 6 7 8 9 value

Tally solution # Returns the digit value that occurs most frequently in n. # Breaks ties by choosing the smaller value. def most_frequent_digit(n): counts = [0] * 10 while (n > 0): digit = n % 10 # pluck off a digit and tally it counts[digit] += 1 n = n / 10 # find the most frequently occurring digit best_index = 0 for i in range(1, len(counts)): if (counts[i] > counts[best_index]): best_index = i return best_index

Section attendance question Read a file of section attendance (see next slide): yynyyynayayynyyyayanyyyaynayyayyanayyyanyayna ayyanyyyyayanaayyanayyyananayayaynyayayynynya yyayaynyyayyanynnyyyayyanayaynannnyyayyayayny And produce the following output: Section 1 Student points: [20, 16, 17, 14, 11] Student grades: [100.0, 80.0, 85.0, 70.0, 55.0] Section 2 Student points: [16, 19, 14, 14, 8] Student grades: [80.0, 95.0, 70.0, 70.0, 40.0] Section 3 Student points: [16, 15, 16, 18, 14] Student grades: [80.0, 75.0, 80.0, 90.0, 70.0] Students earn 3 points for each section attended up to 20.

Section input file Each line represents a section. student 123451234512345123451234512345123451234512345 Each line represents a section. A line consists of 9 weeks' worth of data. Each week has 5 characters because there are 5 students. Within each week, each character represents one student. a means the student was absent (+0 points) n means they attended but didn't do the problems (+1 points) y means they attended and did the problems (+3 points) week 1 2 3 4 5 6 7 8 9 section 1 section 2 section 3 yynyyynayayynyyyayanyyyaynayyayyanayyyanyayna ayyanyyyyayanaayyanayyyananayayaynyayayynynya yyayaynyyayyanynnyyyayyanayaynannnyyayyayayny

Section attendance answer def main(): file = open("sections.txt") lines = file.readlines() section = 1 for line in lines: points = [0] * 5 for i in range(0, len(line)): student = i % 5 earned = 0 if (line[i] == 'y'): # c == 'y' or 'n' or 'a' earned = 3 elif (line[i] == 'n'): earned = 1 points[student] = min(20, points[student] + earned) grades = [0] * 5 for i in range(0, len(points)): grades[i] = 100.0 * points[i] / 20 print("Section " + str(section)) print("Student points: " + str(points)) print("Student grades: " + str(grades)) print() section += 1

Data transformations In many problems we transform data between forms. Example: digits  count of each digit  most frequent digit Often each transformation is computed/stored as an list. For structure, a transformation is often put in its own function. Sometimes we map between data and list indexes. by position (store the i th value we read at index i ) tally (if input value is i, store it at array index i ) explicit mapping (count 'J' at index 0, count 'X' at index 1) Exercise: Modify our Sections program to use functions that use lists as parameters and returns.

List param/return answer # This program reads a file representing which students attended # which discussion sections and produces output of the students' # section attendance and scores. def main(): file = open("sections.txt") lines = file.readlines() section = 1 for line in lines: # process one section points = count_points(line) grades = compute_grades(points) results(section, points, grades) section += 1 # Produces all output about a particular section. def results(section, points, grades): print("Section " + str(section)) print("Student scores: " + str(points)) print("Student grades: " + str(grades)) print() ...

List param/return answer ... # Computes the points earned for each student for a particular section. def count_points(line): points = [0] * 5 for i in range(0, len(line)): student = i % 5 earned = 0 if (line[i] == 'y'): # c == 'y' or c == 'n' earned = 3 elif (line[i] == 'n'): earned = 2 points[student] = min(20, points[student] + earned) return points # Computes the percentage for each student for a particular section. def compute_grades(points): grades = [0] * 5 for i in range(0, len(points)): grades[i] = 100.0 * points[i] / 20 return grades