Published by Gertrude Bond. Modified over 6 years ago.
1
Programming for Geographical Information Analysis: Core Skills
Modules and Packages
2
Review
We've seen that a module is a file that can contain classes as well as its own variables. We've seen that you need to import it to access the code, and then use the module name to refer to it.
    import module1
    a = module1.ClassName()
3
This lecture Import. Modules. Packages. Useful standard library packages. Useful external packages. This lecture, we'll look in more depth at packages and modules, and look at some useful modules in the standard library.
4
Packages
modules: usually single files to do some set of jobs.
packages: modules with a namespace, that is, a unique way of referring to them.
libraries: a generic name for a collection of code you can use to get specific types of job done.
5
Packages The Standard Python Library comes with a number of other packages which are not imported automatically. We need to import them to use them.
6
Import
    import agentframework
    point_1 = agentframework.Agent()
This is a very explicit style. There is little ambiguity about which Agent we are after (if other imported modules have Agent classes). This is safest as you have to be explicit about the module. Provided there aren't two modules with the same name and class, you are fine.
If you're sure there are no other Agent classes, you can:
    from agentframework import Agent
    point_1 = Agent()
This just imports this one class.
7
NB
You will often see imports of everything in a module:
    from agentframework import *
This is easy, because it saves you having to import multiple classes, but it is dangerous: you have no idea what other classes are in there that might replace classes you have imported elsewhere. In other languages, with, frankly, better documentation and file structures, it is easy to find out which classes are in libraries, so you see this a lot. In Python, it is strongly recommended you don't do this. If you get code from elsewhere, change these to explicit imports.
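As a rough illustration of the danger, the sketch below uses two standard library modules, math and cmath, which both define sqrt. The star imports are simulated inside a scratch dictionary (the name namespace is just for this example) so the replacement is easy to see.

```python
import cmath
import math

# Simulate two star imports in a scratch namespace so the clash is visible.
namespace = {}
exec("from math import *", namespace)
print(namespace["sqrt"] is math.sqrt)    # True: sqrt came from math

exec("from cmath import *", namespace)
print(namespace["sqrt"] is cmath.sqrt)   # True: silently replaced by cmath's sqrt

# With explicit module names there is no ambiguity.
print(math.sqrt(4))      # 2.0
print(cmath.sqrt(-4))    # 2j
```

No warning is raised when the second import overwrites the first, which is why explicit imports are the safer habit.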
8
As
If the module name is very long (it shouldn't be), you can do this:
    import agentbasedmodellingframework as abm
    agent_1 = abm.Agent()
If the class name is very long, you can:
    from abm import AgentsRepresentingPeople as Ag
    agent_1 = Ag()
Some people like this, but it does make the code harder to understand.
9
When importing, Python will import parent packages (but not other subpackages). If the module hasn't been imported before, Python will search the import path, which is usually (but not exclusively) the system path, sys.path. If you're importing a package, don't have files with the same name (i.e. package_name.py) in the directory you're in, or they'll be imported rather than the package (even if you're inside them).
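If you are unsure which file an import picked up, a quick check along these lines may help. It is only a sketch: it prints the search path and then asks one standard library module (statistics, chosen arbitrarily) where it was loaded from.

```python
import statistics
import sys

# The directories (and archives) Python searches, in order.
for entry in sys.path:
    print(entry)

# __file__ shows which file an imported module was actually loaded from.
print(statistics.__file__)
```

If the printed file is in your working directory rather than the Python installation, a local file is shadowing the module you meant to import.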
10
Interpreter
To reload a module:
    import importlib
    importlib.reload(modulename)
In Spyder, just re-run the module file. Remember to do this if you update it.
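The effect of a reload can be seen with a throwaway module. The sketch below writes a small file into a temporary directory, imports it, edits it, and reloads it. The module name reload_demo and its VALUE variable are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True          # avoid stale cached bytecode in this demo

# Write a tiny throwaway module (the name reload_demo is hypothetical).
folder = Path(tempfile.mkdtemp())
module_file = folder / "reload_demo.py"
module_file.write_text("VALUE = 1\n")
sys.path.insert(0, str(folder))
importlib.invalidate_caches()

import reload_demo
print(reload_demo.VALUE)                # 1

# Edit the file: the already-imported module does not change by itself.
module_file.write_text("VALUE = 22\n")
print(reload_demo.VALUE)                # still 1

importlib.reload(reload_demo)
print(reload_demo.VALUE)                # 22
```

Note that objects created from the old version of a class are not updated by a reload, so it is usually simplest to recreate them afterwards.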
11
This lecture Import. Modules. Packages. Useful standard library packages. Useful external packages. Next up: Modules.
12
Modules and Packages
Modules are single files that can contain multiple classes, variables, and functions. The main difference between modules and scripts is that the former are generally imported, and the latter are generally run directly. Packages are collections of modules structured using a directory tree.
13
Running module code
Although we've concentrated on classes, you can import and run module-level functions, and access variables.
    import module1
    print(module1.module_variable)
    module1.module_function()
    a = module1.ClassName()
14
Importing modules
Indeed, you have to be slightly careful when importing modules. Modules and the classes in them will run to a degree on import.
    # module
    print("module loading")                    # Runs on import
    def m1():
        print("method loading")                # Runs only when m1() is called
    class cls:
        print("class loading")                 # Runs on import
        def m2(self):
            print("instance method loading")   # Runs only when m2() is called
Modules run in case there's anything that needs setting up (variables etc.) prior to functions or classes.
15
Modules that run
If you're going to use this to run code, note that in general, code accessing a class or method has to come after it is defined:
    c = A()
    c.b()
    class A:
        def b(self):
            print("hello world")
Doesn't work, but:
    class A:
        def b(self):
            print("hello world")
    c = A()
    c.b()
Does.
16
Modules that run
This doesn't count for imported code, as the whole module has run by the time the import completes. The following also works fine because the file has been run down to c = A() before __init__ is called, so all the methods are recognised.
    class A:
        def __init__(self):
            self.b()
        def b(self):
            print("hello world")
    c = A()
17
Modules that run
However, generally having large chunks of unnecessary code running is bad. Setting up variables is usually ok, as assignment generally doesn't cause issues. Under the philosophy of encapsulation, however, we don't really want code lying around outside of methods/functions. The core encapsulation levels in Python are the function and the object (with self; not the class). It is therefore generally worth minimising this code.
18
Running a module
The best option is to have a 'double headed' file that runs as a script with isolated code, but can also be imported as a module. As scripts run with a global __name__ variable in the runtime set to "__main__", the following code in a module will allow it to run either way without contamination.
    if __name__ == "__main__":
        # Imports needed for running.
        function_name()
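A complete, minimal version of such a file might look like the sketch below. The function body and its return value are placeholders invented for the example; only the guard at the bottom is the pattern described above.

```python
def function_name():
    """Work that should only start when the file is run as a script."""
    return "model finished"


if __name__ == "__main__":
    # Runs for `python thisfile.py`, but not for `import thisfile`.
    print(function_name())
```

When the file is imported, __name__ holds the module's own name instead of "__main__", so nothing below the guard runs and the importer can call function_name() when it chooses.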
19
This lecture Import. Modules. Packages. Useful standard library packages. Useful external packages. Next up: Packages.
20
Packages
Structure that constructs a dot-delimited namespace based around a directory structure.
    /abm
        __init__.py
        /general
            agentframework.py
        /models
            model.py
The __init__.py can be empty. Such files allow Python to recognise that directories and subdirectories are packages and sub-packages. You can now:
    import abm.general.agentframework
and refer to abm.general.agentframework.Agent etc. (import statements name packages and modules; to import the class itself, use from abm.general.agentframework import Agent).
The base __init__.py can also include, e.g.
    __all__ = ["models", "general"]
Which means that this will work:
    from abm import *
If you want it to.
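To try the layout without touching your own project, the sketch below builds a throwaway copy of the tree in a temporary directory and imports from it. The empty __init__.py files in the sub-directories and the empty Agent class body are assumptions made for the example, not part of the slide.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the package tree in a temporary directory.
root = Path(tempfile.mkdtemp())
(root / "abm" / "general").mkdir(parents=True)
(root / "abm" / "models").mkdir()
(root / "abm" / "__init__.py").write_text('__all__ = ["models", "general"]\n')
(root / "abm" / "general" / "__init__.py").write_text("")   # assumed, empty
(root / "abm" / "models" / "__init__.py").write_text("")    # assumed, empty
(root / "abm" / "general" / "agentframework.py").write_text(
    "class Agent:\n    pass\n"                               # placeholder class
)
(root / "abm" / "models" / "model.py").write_text("")       # placeholder file

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Import the module by its dotted name, then reach the class through it.
import abm.general.agentframework
agent_1 = abm.general.agentframework.Agent()

# Or import the class directly.
from abm.general.agentframework import Agent
agent_2 = Agent()

print(type(agent_1) is type(agent_2))   # True
```

The dotted name mirrors the directory path, which is what keeps Agent here distinct from an Agent class in anyone else's package.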
21
Running a package
Packages can be run by placing the startup code in a file called __main__.py. This could, for example, use command line args to determine which model to run. This will run if the package is run in this form:
    python -m packagename
It is relatively trivial to include a bat or sh file to run this.
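One possible shape for such a __main__.py is sketched below, using argparse from the standard library. The --model option, its default and the returned message are all hypothetical; a real package would import and start the chosen model instead of returning a string.

```python
# Sketch of the contents of a hypothetical packagename/__main__.py
import argparse


def main(argv=None):
    """Parse the command line and report which model would be run."""
    parser = argparse.ArgumentParser(prog="packagename")
    parser.add_argument("--model", default="default",
                        help="name of the model to run (hypothetical option)")
    args = parser.parse_args(argv)
    return f"Running model: {args.model}"


# In a real __main__.py you would end with a bare main() call, which reads
# sys.argv. A fixed argument list is passed here so the sketch runs anywhere.
message = main(["--model", "traffic"])
print(message)   # Running model: traffic
```

With this in place, python -m packagename --model traffic would select the model from the command line.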
22
Package Advantages Structured approach, rather than having everything in one file. Allows files to import each other without being limited to same directory. Can set up the package to work together as an application. The more detailed the namespace (e.g. including unique identifiers) the less likely your identifiers (classnames; function names; variables) are to clash with someone else's.
23
This lecture Import. Modules. Packages. Useful standard library packages. Useful external packages. Next up: Useful standard library packages.
24
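Core libraries
Python loads sys (various system services/functions) and builtins (built-in functions, exceptions and special objects like None and False) at startup. The names in builtins can be used everywhere without an import. sys still has to be imported by name (import sys) before you can use it, in scripts and in the Python shell alike. builtins is also available as __builtins__.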
25
Built in functions https://docs.python.org/3/library/functions.html
abs() dict() help() min() setattr() all() dir() hex() next() slice() any() divmod() id() object() sorted() ascii() enumerate() input() oct() staticmethod() bin() eval() int() open() str() bool() exec() isinstance() ord() sum() bytearray() filter() issubclass() pow() super() bytes() float() iter() print() tuple() callable() format() len() property() type() chr() frozenset() list() range() vars() classmethod() getattr() locals() repr() zip() compile() globals() map() reversed() __import__() complex() hasattr() max() round() delattr() hash() memoryview() set()
26
Python Standard Library
Most pages in the standard library documentation give useful recipes for how to do major jobs you're likely to want to do.
27
Useful libraries: text
difflib – for comparing text documents; can, for example, generate a webpage detailing the differences.
unicodedata – for dealing with complex character sets. See also "Fluent Python".
re – regular expressions (regex).
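As a small sketch of the difflib usage mentioned above, the code below compares two made-up versions of a two-line text, first as a plain-text diff and then as an HTML page. The sample sentences are invented for the example.

```python
import difflib

old = ["The quick brown fox", "jumps over the lazy dog"]
new = ["The quick brown fox", "leaps over the lazy dog"]

# Plain-text unified diff: removed lines start with "-", added lines with "+".
for line in difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm=""):
    print(line)

# A complete HTML page containing a side-by-side comparison table.
html = difflib.HtmlDiff().make_file(old, new, fromdesc="old", todesc="new")
print(len(html) > 0)   # True
```

The html string can be written to a file and opened in a browser.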
28
Collections https://docs.python.org/3/library/collections.html
    # Tally occurrences of words in a list
    from collections import Counter
    c = Counter()
    for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
        c[word] += 1
    print(c)
    # Counter({'blue': 3, 'red': 2, 'green': 1})
29
Collections https://docs.python.org/3/library/collections.html
    # Find the five most common words in Hamlet
    import re
    from collections import Counter
    words = re.findall(r'\w+', open('hamlet.txt').read().lower())
    Counter(words).most_common(5)
    # [('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631)]
30
Useful libraries: binary data
See especially struct:
31
Useful libraries: maths
decimal: does for floating point numbers what ints do; makes them exact.
fractions: rational numbers (for dealing with numbers as fractions).
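A short sketch of why these two modules matter: binary floating point cannot represent 0.1 exactly, while Decimal and Fraction keep such values exact. The numbers used are arbitrary.

```python
from decimal import Decimal
from fractions import Fraction

print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
print(Fraction(1, 3) + Fraction(1, 6))                    # 1/2
```

Note that Decimal values are built from strings here; Decimal(0.1) would inherit the float's rounding error.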
32
Statistics https://docs.python.org/3/library/statistics.html
mean(): Arithmetic mean ("average") of data.
harmonic_mean(): Harmonic mean of data.
median(): Median (middle value) of data.
median_low(): Low median of data.
median_high(): High median of data.
median_grouped(): Median, or 50th percentile, of grouped data.
mode(): Mode (most common value) of discrete data.
pstdev(): Population standard deviation of data.
pvariance(): Population variance of data.
stdev(): Sample standard deviation of data.
variance(): Sample variance of data.
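A few of these functions in use, as a minimal sketch on a small made-up dataset:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))      # 5
print(statistics.median(data))    # 4.5
print(statistics.mode(data))      # 4
print(statistics.pstdev(data))    # 2.0
print(statistics.stdev(data))     # sample version, slightly larger than pstdev
```

The population and sample functions differ only in dividing by n or n - 1, so choose according to whether the data is the whole population or a sample of it.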
33
Random selection
The random library includes functions for:
Selecting a random choice.
Shuffling lists.
Sampling a list randomly.
Generating different probability distributions for sampling.
34
Auditing random numbers
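Often we want to generate a repeatable sequence of random numbers so we can rerun models or analyses that use random numbers and get the same results. The generator is normally seeded from the operating system's randomness source or the system time, but can be forced to use a given seed with random.seed().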
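The sketch below pulls the selection functions and seeding together. It uses a separate random.Random instance per run (an alternative to calling random.seed() on the shared generator), and the colours list and seed values are arbitrary.

```python
import random

colours = ["red", "green", "blue", "yellow"]


def run(seed):
    """Return one choice, one shuffle, one sample and one Gaussian draw."""
    rng = random.Random(seed)          # independent generator with a fixed seed
    pick = rng.choice(colours)         # select a random item
    shuffled = colours.copy()
    rng.shuffle(shuffled)              # shuffle a list in place
    sample = rng.sample(colours, 2)    # sample without replacement
    draw = rng.gauss(0, 1)             # draw from a normal distribution
    return pick, shuffled, sample, draw


print(run(42) == run(42))   # True: same seed, same sequence
```

Recording the seed alongside a model's results is usually enough to let someone rerun it exactly.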
35
Useful libraries: lists/arrays
bisect: array bisection algorithm (efficient searching of, and insertion into, large sorted lists).
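A minimal sketch of bisect on a small sorted list (the values are arbitrary):

```python
import bisect

heights = [10, 20, 30, 40]

# Find where 30 sits in the sorted list without scanning every item.
print(bisect.bisect_left(heights, 30))   # 2

# Insert a new value while keeping the list sorted.
bisect.insort(heights, 25)
print(heights)                           # [10, 20, 25, 30, 40]
```

The list must already be sorted; bisect does not check this.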
36
Useful libraries: TkInter
Used for Graphical User Interfaces (windows etc.). Wrapper for a library called Tk (GUI components) and its manipulation language, Tcl. See also wxPython, for native-looking applications (not in Anaconda).
37
Turtle
For drawing shapes. TkInter will allow you to load and display images, but there are additional external libraries better set up for this, including Pillow:
38
Useful libraries: talking to the outside world
Serial ports rs232-port
argparse: parser for command-line options, arguments and sub-commands.
datetime
39
Databases
DB-API
dbm: interfaces to Unix "databases". Simple database.
sqlite3: DB-API 2.0 interface for SQLite databases. Used as small databases inside, for example, Firefox.
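A minimal sqlite3 sketch, using an in-memory database so no file is created. The agents table and its rows are invented for the example.

```python
import sqlite3

# ":memory:" gives a temporary database that disappears when closed.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE agents (name TEXT, x REAL, y REAL)")
connection.executemany(
    "INSERT INTO agents VALUES (?, ?, ?)",
    [("a", 1.0, 2.0), ("b", 3.0, 4.0)],
)
connection.commit()

# Use ? placeholders rather than building SQL strings by hand.
rows = connection.execute(
    "SELECT name, x FROM agents WHERE y > ? ORDER BY name", (1.5,)
).fetchall()
print(rows)   # [('a', 1.0), ('b', 3.0)]
connection.close()
```

Passing a filename instead of ":memory:" stores the same database on disk.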
40
This lecture Import. Modules. Packages. Useful standard library packages. Useful external packages. Next up: Useful external packages.
41
External libraries
A very complete list can be found at PyPI, the Python Package Index:
To install, use pip, which comes with Python:
    pip install package
or download, unzip, and run the installer directly from the directory:
    python setup.py install
If you have Python 2 and Python 3 installed, use pip3 (though not with Anaconda) or make sure the right version is first in your PATH.
42
Numpy Mathematics and statistics, especially multi-dimensional array manipulation for data processing. Good introductory tutorials by Software Carpentry:
43
Numpy data
Perhaps the nicest thing about numpy is its handling of complicated 2D datasets. It has its own array types which overload the indexing operators. Note the difference in the below from the standard [1d][2d] notation:
    import numpy
    data = numpy.array([
        [1, 2, 3, 4, 5],
        [10, 20, 30, 40, 50],
        [100, 200, 300, 400, 500]
    ])
    print(data[0, 0])       # 1
    print(data[1:3, 1:3])   # [[20 30] [200 300]]
On a standard list, data[1:3][1:3] wouldn't work (it slices the outer list twice); at best data[1:3][0][1:3] would give you [20, 30].
44
Numpy operations
You can additionally do maths on the arrays, including matrix manipulation.
    import numpy
    data = numpy.array([
        [1, 2, 3, 4, 5],
        [10, 20, 30, 40, 50],
        [100, 200, 300, 400, 500]
    ])
    print(data[1:3, 1:3] - 10)                  # [[10 20] [190 290]]
    print(numpy.transpose(data[1:3, 1:3]))      # [[20 200] [30 300]]
There's a nice numpy cheatsheet from datacamp at:
45
Pandas Data analysis. Based on Numpy, but adds more sophistication.
46
Pandas data
Pandas data focuses around DataFrames, 2D arrays with additional abilities to name and use rows and columns.
    import pandas
    df = pandas.DataFrame(
        data,    # numpy array from before.
        index=['i', 'ii', 'iii'],
        columns=['A', 'B', 'C', 'D', 'E']
    )
    print(df['A'])
    print(df.mean(0)['A'])
    print(df.mean(1)['i'])
Prints:
    i        1
    ii      10
    iii    100
    Name: A, dtype: int32
    37.0
    3.0
47
scikit-learn Scientific analysis and machine learning. Founded on Numpy data formats.
48
Beautiful Soup Web analysis. You need another package, such as the requests library, to actually download pages. BeautifulSoup navigates the Document Object Model: Not a library, but a nice intro to web programming with Python.
49
Tweepy Downloading Tweets for analysis. You'll also need a developer key: access-key-for-twitter-oauth/994/ Most social media sites have equivalent APIs (functions to access them) and modules to use those.
50
NLTK Natural Language Toolkit. Parse text and analyse everything from Parts Of Speech to positivity or negativity of statements (sentiment analysis).
51
Celery Concurrent computing / parallelisation. For splitting up programs and running them on multiple computers e.g. to remove memory limits. See also:
52
Review
    import geostuff
    point_1 = geostuff.GeoPoint()

    from geostuff import GeoPoint    # Don't use *
    point_1 = GeoPoint()

    import geostuffthatisuseful as geo
    point_1 = geo.GeoPoint()
53
Review
Generally, on import, loose code in modules and classes will run. Avoid this by placing all code in functions and use the following to isolate code to run if you want the module to also run as a script:
    if __name__ == "__main__":
        # Imports needed for running.
        function_name()
54
Review
In general, in scripts and modules, code has to be defined before it can be used within the same module.
    class A:
        def b(self):
            print("hello world")
    c = A()
    c.b()
55
Review
Key standard libraries to study: builtins/pathlib/os, math/statistics, decimal/fractions, re (regex), datetime, tkinter.
Key external libraries to study: matplotlib, numpy, pandas, beautifulsoup.