CSCE 590 Web Scraping: Lecture 2


CSCE 590 Web Scraping: Lecture 2
Topics: Dictionaries, Object Oriented Python
Readings: Python tutorial
January 10, 2017

Overview
Last Time – Lec01 all slides covered
Today
    Web applications overview
    Python intro: dynamic typing, strings, slices, lists, lists of lists, if, while, for w in wordlist:, def functions
Today
    Class exercise – wordlists
    Data Dictionaries
    Libraries
    Object Oriented Python
    Beautiful Soup Documentation
    Testing & Test First Design

More Resources Textbook: Website Code examples: Python: YouTube: Videos on python lists, dictionaries, classes etc. https://www.youtube.com/playlist?list=PL8830E081324343F1

Python on YouTube

Homework: Due Sunday Midnight

List Methods
list.append(x) - Add an item to the end of the list. Equivalent to a[len(a):] = [x].
list.extend(L) - Extend the list by appending all the items in the given list. Equivalent to a[len(a):] = L.
list.insert(i, x) - Insert an item x at position i.
list.remove(x) - Remove the first item from the list whose value is x. It is an error (ValueError) if there is no such item.
list.pop([i]) - Remove the item at the given position in the list, and return it. If no index is specified, a.pop() removes and returns the last item in the list. (The square brackets around the i in the method signature denote that the parameter is optional.)
list.clear() - Remove all items from the list. Equivalent to del a[:].
Reference: section 5.1, Python Tutorial

list.index(x[, start[, end]]) - Return the zero-based index in the list of the first item whose value is x. Raises a ValueError if there is no such item. ...
list.count(x) - Return the number of times x appears in the list.
list.sort(key=None, reverse=False) - Sort the items of the list in place ...
list.reverse() - Reverse the elements of the list in place.
list.copy() - Return a shallow copy of the list. Equivalent to a[:].
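The list methods above can be tried out on a small list. This is only a sketch with made-up values; the variable names are arbitrary and not from the lecture.

```python
# Walk through the list methods on a small, made-up list.
a = [66, 333, 1, 333]

a.append(5)           # add one item at the end
a.extend([7, 8])      # add every item of another list at the end
a.insert(1, 99)       # 99 now sits at index 1
a.remove(333)         # removes only the first 333
last = a.pop()        # no index: removes and returns the last item
first = a.pop(0)      # with an index: removes and returns that item

count_333 = a.count(333)   # how many times 333 still appears
where_1 = a.index(1)       # zero-based position of the value 1

b = a.copy()          # shallow copy, so sorting b leaves a untouched
b.sort(reverse=True)  # sorts b in place, largest first
a.reverse()           # reverses a in place

# remove() raises ValueError when the value is not present.
try:
    a.remove(1000)
    removed_missing = True
except ValueError:
    removed_missing = False

print(a, b, last, first)
```

Note that sort(), reverse(), append() and the other in-place methods return None, so writing a = a.sort() would throw the list away.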

Lists as Stacks and Queues
5.1.1 Using Lists as Stacks
    stck = [23, 14, 55]    # 55 is on top
    stck.pop()             # returns 55; stck is now [23, 14]
    stck.append(43)        # stck is now [23, 14, 43]
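As one possible use of a list as a stack, the sketch below checks whether brackets in a string are balanced. The function name is_balanced is invented for this illustration; only append() and pop() come from the slide.

```python
def is_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    closers = {')': '(', ']': '[', '}': '{'}   # closing bracket -> its opener
    stack = []                                 # end of the list is the top
    for ch in text:
        if ch in '([{':
            stack.append(ch)                   # push an opener
        elif ch in closers:
            # A closer must match the most recently pushed opener.
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack                           # leftovers mean unclosed openers


print(is_balanced("a[0] = f(x, {1: 2})"))
print(is_balanced("(]"))
```

Both append() and pop() work at the end of the list, which is why a plain list is a reasonable stack.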

Lists as Queues
5.1.2 Using Lists as Queues
Bad idea; not efficient: inserting or popping at the front of a list means every other element has to be shifted by one.
Use collections.deque (“deal from both ends”), which is optimized for inserting and removing from either end.
    >>> from collections import deque
    >>> queue = deque(["Eric", "John", "Michael"])
    >>> queue.append("Terry")
    >>> queue.append("Graham")
    >>> queue.popleft()
    'Eric'
    >>> queue
    deque(['John', 'Michael', 'Terry', 'Graham'])
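A minimal sketch of first-in, first-out processing with collections.deque, reusing the names from the slide. The served list is just an illustrative way to record the order in which items leave the queue.

```python
from collections import deque

queue = deque(["Eric", "John", "Michael"])
queue.append("Terry")            # new arrivals join at the right end

served = []
while queue:                     # an empty deque is falsy
    served.append(queue.popleft())   # oldest item leaves from the left end

print(served)

# The same thing with a plain list works, but pop(0) has to shift
# every remaining element, so it gets slow for long queues.
slow_queue = ["Eric", "John", "Michael"]
first_out = slow_queue.pop(0)
```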

5.1.3. List Comprehensions
List comprehensions provide a concise way to create lists.
    squares = list(map(lambda x: x**2, range(10)))
    squares = [x**2 for x in range(10)]
    [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
    from math import pi
    [str(round(pi, i)) for i in range(1, 6)]
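It can help to see each comprehension next to the explicit loop it replaces. The sketch below does that for two of the examples above; the words list at the end is made up for illustration.

```python
# A comprehension and the loop it abbreviates build the same list.
squares = [x**2 for x in range(10)]

squares_loop = []
for x in range(10):
    squares_loop.append(x**2)

# Two for clauses nest left to right; the if filters each (x, y) pair.
pairs = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]

pairs_loop = []
for x in [1, 2, 3]:
    for y in [3, 1, 4]:
        if x != y:
            pairs_loop.append((x, y))

# Filtering and transforming a wordlist in one expression.
words = ["Web", "scraping", "with", "Python"]
lengths = [len(w) for w in words if w.islower()]

print(squares == squares_loop, pairs == pairs_loop, lengths)
```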

Tuples

Sets

Dictionaries
Dictionaries are sometimes found in other languages as “associative memories” or “associative arrays”.
Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys.
The keys can be any immutable type; strings and numbers can always be keys.
Tuples can be used as keys if they contain only strings, numbers, or tuples.
You can’t use lists as keys, since lists can be modified in place.
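The rules about which objects can be keys can be checked directly. In this sketch the dictionary name grid and its values are invented; the point is only which assignments succeed and which raise TypeError.

```python
grid = {}
grid[(0, 0)] = "origin"          # tuple of numbers: allowed
grid[("row", 2)] = "mixed"       # tuple of a string and a number: allowed

# A list is mutable, so Python refuses to hash it.
try:
    grid[[0, 1]] = "list key"
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple that contains a list is rejected for the same reason.
try:
    grid[(0, [1, 2])] = "tuple holding a list"
    nested_ok = True
except TypeError:
    nested_ok = False

print(list_key_ok, nested_ok, len(grid))
```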

Sample dictionary operations
    tel = {'jack': 4098, 'sape': 4139}
    tel['guido'] = 4127
    tel['jack']              # 4098
    del tel['sape']
    tel['irv'] = 4127
    list(tel.keys())
    sorted(tel.keys())
    'guido' in tel           # test returns True
    'jack' not in tel        # test returns False
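A common use of these operations is counting words, in the spirit of the wordlist class exercise. The sentence below is made up, and this is only one of several ways to write the count.

```python
wordlist = "the quick brown fox jumps over the lazy dog the end".split()

counts = {}
for w in wordlist:
    # get() returns the default (0) when the key is not there yet.
    counts[w] = counts.get(w, 0) + 1

# Sort the keys by their count, largest first, and take the winner.
most_common = sorted(counts, key=counts.get, reverse=True)[0]

# Indexing a missing key raises KeyError; get() returns None instead.
try:
    counts['cat']
    has_cat = True
except KeyError:
    has_cat = False

print(most_common, counts[most_common], counts.get('cat'))
```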

dict() constructor
The dict() constructor builds dictionaries directly from sequences of key-value pairs:
    >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
    {'sape': 4139, 'guido': 4127, 'jack': 4098}
dict comprehensions
    >>> {x: x**2 for x in (2, 4, 6)}
    {2: 4, 4: 16, 6: 36}
When the keys are simple strings, it is sometimes easier to specify pairs using keyword arguments:
    >>> dict(sape=4139, guido=4127, jack=4098)
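Since dict() accepts any sequence of key-value pairs, it combines naturally with zip() when the keys and values start out in two parallel lists. The sketch below assumes such lists (names and numbers are illustrative) and also inverts the mapping with a dict comprehension.

```python
names = ['sape', 'guido', 'jack']
numbers = [4139, 4127, 4098]

# zip() pairs the items up; dict() turns the pairs into a dictionary.
tel = dict(zip(names, numbers))

# A dict comprehension over items() swaps keys and values.
by_number = {number: name for name, number in tel.items()}

# All three construction styles produce equal dictionaries.
same = tel == dict(sape=4139, guido=4127, jack=4098)

print(tel, by_number[4127], same)
```

Inverting only works cleanly when the values are unique; with repeated values, later pairs overwrite earlier ones.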

Looping through Dictionaries
    knights = {'gallahad': 'the pure', 'robin': 'the brave'}
    for k, v in knights.items():
        print(k, v)

Coding Style
Use 4-space indentation, and no tabs.
Wrap lines so that they don’t exceed 79 characters.
Use blank lines to separate functions and classes, and larger blocks of code inside functions.
When possible, put comments on a line of their own.
Use docstrings.
Use spaces around operators and after commas, but not directly inside bracketing constructs: a = f(1, 2) + g(3, 4).

Coding Style (continued)
Name your classes and functions consistently; the convention is to use CamelCase for classes and lower_case_with_underscores (snake case) for functions and methods.
Always use self as the name for the first method argument.
Don’t use fancy encodings if your code is meant to be used in international environments. Python’s default, UTF-8, or even plain ASCII work best in any case.
Likewise, don’t use non-ASCII characters in identifiers if there is only the slightest chance people speaking a different language will read or maintain the code.
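A short sketch of what these conventions look like when applied together. WordCounter and most_frequent are hypothetical names chosen for the illustration, not code from the course.

```python
class WordCounter:
    """Count how often each word appears."""

    def __init__(self):
        self.counts = {}

    def add_words(self, words):
        """Add every word in the iterable to the tally."""
        for word in words:
            self.counts[word] = self.counts.get(word, 0) + 1


def most_frequent(counter, n=3):
    """Return the n most frequent words, most frequent first."""
    # Comments sit on a line of their own, above the code they describe.
    ranked = sorted(counter.counts, key=counter.counts.get, reverse=True)
    return ranked[:n]


counter = WordCounter()
counter.add_words("a b a c a b".split())
print(most_frequent(counter, 2))
```

The class name is CamelCase, functions and methods are snake case, every method takes self first, and blank lines separate the class from the function.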

Classes – Python Tutorial Sec 9
9.3.1. Class Definition Syntax
The simplest form of class definition looks like this:
    class ClassName:
        <statement-1>
        .
        <statement-N>

__init__() method (constructor)
    class Complex:
        def __init__(self, realpart, imagpart):
            self.r = realpart
            self.i = imagpart

    x = Complex(3.0, -4.5)
    x.r, x.i                 # (3.0, -4.5)

Instance Objects
    x.counter = 1
    while x.counter < 10:
        x.counter = x.counter * 2
    print(x.counter)         # prints 16
    del x.counter

9.3.5. Class and Instance Variables
    class Dog:
        kind = 'canine'      # class variable shared by all instances

        def __init__(self, name):
            self.name = name

    d = Dog('Fido')

Surprising effects involving mutable objects
Shared data can have possibly surprising effects when it involves mutable objects such as lists and dictionaries.
    class Dog:
        tricks = []          # mistaken use of a class variable

        def __init__(self, name):
            self.name = name

        def add_trick(self, trick):
            self.tricks.append(trick)

    >>> d = Dog('Fido')
    >>> e = Dog('Buddy')
    >>> d.add_trick('roll over')
    >>> e.add_trick('play dead')
    >>> d.tricks             # unexpectedly shared by all dogs
    ['roll over', 'play dead']
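The usual fix, which is also the one the Python tutorial gives, is to create the list in __init__ so that each instance gets its own. A sketch of the corrected class:

```python
class Dog:
    kind = 'canine'              # class variable: shared on purpose

    def __init__(self, name):
        self.name = name
        self.tricks = []         # instance variable: a new list per dog

    def add_trick(self, trick):
        self.tricks.append(trick)


d = Dog('Fido')
e = Dog('Buddy')
d.add_trick('roll over')
e.add_trick('play dead')

print(d.tricks, e.tricks)
```

An immutable class variable such as kind is safe to share; the trouble only arises when a shared object is modified in place.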

Additional uses of self
Methods may call other methods by using method attributes of the self argument:
    class Bag:
        def __init__(self):
            self.data = []

        def add(self, x):
            self.data.append(x)

        def addtwice(self, x):
            self.add(x)
            self.add(x)

Inheritance Yes of course

Private Variables
Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This is called name mangling.
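The renaming is easy to observe. In this sketch the Account class and its attribute are hypothetical; the behaviour shown (the mangled name exists, the unmangled one does not) is standard Python.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance      # stored as _Account__balance

    def balance(self):
        return self.__balance         # mangled the same way inside the class


acct = Account(100)

# Outside the class body no mangling happens, so this name does not exist.
try:
    acct.__balance
    direct_access = True
except AttributeError:
    direct_access = False

# The mangled name is still reachable: this is a convention, not security.
mangled = acct._Account__balance

print(direct_access, mangled, sorted(vars(acct)))
```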

Iterators -- iter(element)
Implicitly used in for w in […]
    >>> s = 'abc'
    >>> it = iter(s)
    >>> it
    <iterator object at 0x00A1DB50>
    >>> next(it)
    'a'
    >>> next(it)
    'b'
    >>> next(it)
    'c'
    >>> next(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
        next(it)
    StopIteration
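To make the implicit use concrete, the sketch below spells out roughly what a for loop does with iter() and next(), and then defines a small class that supports the same protocol. The names manual_for and Countdown are invented for this illustration.

```python
def manual_for(iterable):
    """Collect items the way a for loop fetches them behind the scenes."""
    it = iter(iterable)          # ask the object for an iterator
    seen = []
    while True:
        try:
            item = next(it)      # fetch the next item
        except StopIteration:    # raised when the iterator is exhausted
            break
        seen.append(item)
    return seen


class Countdown:
    """Iterator that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # an iterator returns itself from iter()

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


print(manual_for('abc'))
print(list(Countdown(3)))
```

Any object with __iter__() and __next__() methods like these can be used directly in a for statement.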