3. Numerical Methods II JHU Physics & Astronomy Python Workshop 2018

Slides:



Advertisements
Similar presentations
PHP (2) – Functions, Arrays, Databases, and sessions.
Advertisements

Advanced File Processing
Naming and Directories. Recall from the last time… File system components Disk management organizes disk blocks into files. Many disk blocks management.
Lists in Python.
Programming in HTML.  Programming Language  Used to design/create web pages  Hyper Text Markup Language  Markup Language  Series of Markup tags 
Pandas: Python Programming for Spreadsheets Pamela Wu Sept. 17 th 2015.
1 Functions 1 Parameter, 1 Return-Value 1. The problem 2. Recall the layout 3. Create the definition 4. "Flow" of data 5. Testing 6. Projects 1 and 2.
Advanced File Processing. 2 Objectives Use the pipe operator to redirect the output of one command to another command Use the grep command to search for.
Chapter Five Advanced File Processing Guide To UNIX Using Linux Fourth Edition Chapter 5 Unix (34 slides)1 CTEC 110.
File Systems (1). Readings r Reading: Disks, disk scheduling (3.7 of textbook; “How Stuff Works”) r Reading: File System Implementation ( of textbook)
Hands-on Introduction to R. We live in oceans of data. Computers are essential to record and help analyse it. Competent scientists speak C/C++, Java,
Python Functions.
Chapter 2: Variables, Functions, Objects, and Events JavaScript - Introductory.
Comparison of different output options from Stata
Files Tutor: You will need ….
HTML Structure & syntax
Advanced HTML Tags:.
AHG Advanced Techniques for PDF Accessibility
Tiny http client and server
Fundamentals of Python: First Programs
INLS 623– Database Systems II– File Structures, Indexing, and Hashing
May 3rd – Hashing & Graphs
CHP - 9 File Structures.
LINUX WINDOWS Vs..
ITM 352 HTML Forms, Basic Form Processing
CMSC201 Computer Science I for Majors Lecture 22 – Binary (and More)
SQL Implementation & Administration
© 2016 Pearson Education, Ltd. All rights reserved.
Containers and Lists CIS 40 – Introduction to Programming in Python
Department of Computer Science,
What is FITS? FITS = Flexible Image Transport System
Other Kinds of Arrays Chapter 11
Other Kinds of Arrays Chapter 11
Arrays and files BIS1523 – Lecture 15.
Statement atoms The 'atomic' components of a statement are: delimiters (indents, semicolons, etc.); keywords (built into the language); identifiers (names.
Binary Files.
Introduction to Lime Survey
User Defined Functions
CIS 336 strCompetitive Success/tutorialrank.com
Naming and Directories
Introduction to Python programming
Lab 1 Introductions to R Sean Potter.
Prof: Dr. Shu-Ching Chen TA: Yimin Yang
Naming and Directories
Prof: Dr. Shu-Ching Chen TA: Hsin-Yu Ha
Hash Tables.
Guide To UNIX Using Linux Third Edition
Naming and Directories
Variables ICS2O.
CISC101 Reminders Slides have changed from those posted last night…
Matlab tutorial course
Coding Concepts (Data Structures)
PHP.
Prof: Dr. Shu-Ching Chen TA: Haiman Tian
Fundamentals of Data Structures
Numpy (Numerical Python)
Coding Concepts (Data- Types)
Python programming exercise
Fundamentals of Python: First Programs
Intro to PHP.
7. Power in Packages JHU Physics & Astronomy Python Workshop 2018
Dr. Sampath Jayarathna Cal Poly Pomona
Dr. Sampath Jayarathna Old Dominion University
PHP an introduction.
CS210- Lecture 16 July 11, 2005 Agenda Maps and Dictionaries Map ADT
Hash Maps Implementation and Applications
Naming and Directories
Unit J: Creating a Database
Excel Tips & Tricks July 18, 2019.
Getting Started with Data
Presentation transcript:

3. Numerical Methods II JHU Physics & Astronomy Python Workshop 2018 Instructors: Eric Winter (STScI) Kate Rowlands (JHU Astro) Jenn Wojno (JHU Astro)

Data Files: Where the Party’s AT At some point, you need to do more than just play with numbers that you’ve generated in your python code, and getting that data into and out of your python environment is important.

Data File Types Human Readable Binary-based Two main categories of data files: Human Readable Binary-based Pro: Can see data in your text editor Con: Looks like a bunch of 1s and 0s Con: No standardization, at the mercy of whoever formatted it (usually) Pro: (Usually) easy to load and get access to data Con: Difficult* to store metadata Pro: Made to store metadata Con: Slow, large (usually) Pro: Fast, Small Con: Whole file must be read in (usually) Pro: Can read only the stuff you want *For various values of "difficult" – see XML, JSON, etc.

Data File Types Human Readable Binary-based PRO TIP: Two main categories of data files: Human Readable Binary-based Pro: Can see data in your text editor Con: Looks like a bunch of 1s and 0s Con: No standardization, at the mercy of whoever formatted it (usually) Pro: (Usually) easy to load and get access to data Con: Difficult* to store metadata Pro: Made to store metadata Con: Slow, large (usually) Pro: Fast, Small Con: Whole file must be read in (usually) Pro: Can read only the stuff you want PRO TIP: There are many binary-based formats, including HDF5 and NetCDF, all of which have python interfaces. In this workshop, we’ll go in depth with FITS files *For various values of "difficult" – see XML, JSON, etc.

Introducing Astropy! Contains lots ‘o useful functionality for astronomy & beyond. The Docs: http://astropy.readthedocs.org/en/stable/

Introducing Astropy! PRO TIP: astropy is installed by default in the Anaconda installation. Contains lots ‘o useful functionality for astronomy & beyond. The Docs: http://astropy.readthedocs.org/en/stable/

FITS (Flexible image transport System) Files! astropy A useful (binary) format commonly used in astronomy to store image or tabulated data. astropy.table (from astropy.table import Table) astropy.io.fits (from astropy.io import fits) We’ll use this for FITS (and other) tables We’ll use this for FITS images

FITS (Flexible image transport System) Files! astropy A useful (binary) format commonly used in astronomy to store image or tabulated data. astropy.table (from astropy.table import Table) PRO TIP: (Usually) when you import something with a capital letter first (i.e., from astropy.table import Table), you’re importing a class. These are special types of variables with useful methods astropy.io.fits (from astropy.io import fits) We’ll use this for FITS (and other) tables We’ll use this for FITS images

FITS (Flexible image transport System) Files! astropy A useful (binary) format commonly used in astronomy to store image or tabulated data. PRO TIP 2: You can also deal with tables through the normal astropy.io.fits interface. The “table” interface is quite slick, however and makes life easier (especially when making new tables). astropy.table (from astropy.table import Table) astropy.io.fits (from astropy.io import fits) We’ll use this for FITS (and other) tables We’ll use this for FITS images

FITS Images FITS files can store multidimensional data (commonly 2 or 3 dimensions). Any given FITS file can contain multiple images (or tables) called extensions Every FITS extension contains a header and data. FITS headers can contain World Coordinate System (wcs) information that indicates where a given pixel is on the sky A FITS file open in DS9 (a common viewer)

FITS Images FITS files can store multidimensional data (commonly 2 or 3 dimensions). Any given FITS file can contain multiple images (or tables) called extensions Every FITS extension contains a header and data. FITS headers can contain World Coordinate System (wcs) information that indicates where a given pixel is on the sky PRO TIP: Unlike python, FITS convention has indexing starting at 1. Generally astropy covers this up – but you should be aware of this. A FITS file open in DS9 (a common viewer)

Reading In FITS Images Convenience functions make reading FITS images easy: from astropy.io import fits img1 = fits.getdata('Fluxes.fits') # Getting the image head1 = fits.getheader('Fluxes.fits') # and the Header This opens the image as a Numpy array, and the header as a “dictionary-like” object (i.e., you can access the individual header keywords through “head1[‘key’]” ). To open other extensions in the fits file: img1 = fits.getdata('Fluxes.fits', 0) # Primary Ext img2 = fits.getdata('Fluxes.fits', 1) # Second Ext img2 = fits.getdata('Fluxes.fits', ext=1) # Equivalent

Reading In FITS Images PRO TIP: Convenience functions make reading FITS images easy: from astropy.io import fits img1 = fits.getdata('Fluxes.fits') # Getting the image head1 = fits.getheader('Fluxes.fits') # and the Header This opens the image as a Numpy array, and the header as a “dictionary-like” object (i.e., you can access the individual header keywords through “head1[‘key’]” ). To open other extensions in the fits file: PRO TIP: In addition to local files, you can open FITS files on the internet by using the url as opposed to the file name. img1 = fits.getdata('Fluxes.fits', 0) # Primary Ext img2 = fits.getdata('Fluxes.fits', 1) # Second Ext img2 = fits.getdata('Fluxes.fits', ext=1) # Equivalent

Reading In FITS Images PRO TIP 2: Convenience functions make reading FITS images easy: from astropy.io import fits img1 = fits.getdata('Fluxes.fits') # Getting the image head1 = fits.getheader('Fluxes.fits') # and the Header This opens the image as a Numpy array, and the header as a “dictionary-like” object (i.e., you can access the individual header keywords through “head1[‘key’]” ). To open other extensions in the fits file: PRO TIP 2: This is not the most efficient way to open a FITS file, especially larger ones. If you want to manipulate large data sets multiple times, there’s a faster way. img1 = fits.getdata('Fluxes.fits', 0) # Primary Ext img2 = fits.getdata('Fluxes.fits', 1) # Second Ext img2 = fits.getdata('Fluxes.fits', ext=1) # Equivalent

Fits Files: A More Technical Review Basic structure of a FITS file: Header Data Unit List (HDU List) Header Data Unit (HDU) Header Data Unit (HDU) Header Data Unit (HDU) Header Header … Header Data Data Data Primary Extension (0) Secondary Extension (1) Secondary Extension (N-1)

Fits Files: A More Technical Review Basic structure of a FITS file: Header Data Unit List (HDU List) Header Data Unit (HDU) Header Data Unit (HDU) Header Data Unit (HDU) Header Header … Header Data Data Data PRO TIP: FITS tables cannot be in the primary extension. Primary Extension (0) Secondary Extension (1) Secondary Extension (N-1)

Reading In a FITS File (Expanded) Reading a file, now knowing what a FITS file consists of: hdulist = fits.open('Fluxes.fits') # Getting the HDUlist hdulist.info() # The composition of the file Now getting the header and/or data: head0 = hdulist[0].header # Primary Ext Header data1 = hdulist[1].data # Second Ext Data Writing to a new file and closing: hdulist.writeto('test.fits') hdulist.close() # Closing the FITS file

Reading In a FITS File (Expanded) Reading a file, now knowing what a FITS file consists of: hdulist = fits.open('Fluxes.fits') # Getting the HDUlist hdulist.info() # The composition of the file Now getting the header and/or data: PRO TIP: FITS files are read in such that the first axis (often the RA for astronomical images) is read in as the last axis in the numpy array. Be sure to double check that you have the axis you need. head0 = hdulist[0].header # Primary Ext Header data1 = hdulist[1].data # Second Ext Data Writing to a new file and closing: hdulist.writeto('test.fits') hdulist.close() # Closing the FITS file

Reading In a FITS File (Expanded) Reading a file, now knowing what a FITS file consists of: hdulist = fits.open('Fluxes.fits') # Getting the HDUlist hdulist.info() # The composition of the file PRO TIP 2: writeto will, by default, fail if you try to overwrite an existing file. To force an overwrite, pass the overwrite argument: overwrite = True Now getting the header and/or data: head0 = hdulist[0].header # Primary Ext Header data1 = hdulist[1].data # Second Ext Data Writing to a new file and closing: hdulist.writeto('test.fits') hdulist.close() # Closing the FITS file

Writing Out A FITS Image # Make arrays of data to write out. import numpy as np data1 = np.random.uniform(100,100) data2 = np.random.uniform(200,200) # Create the HDU. hdu1 = fits.PrimaryHDU(data1) # Makes a header hdu2 = fits.ImageHDU(data2) # Making a new HDU List: hdulist1 = fits.HDUList([hdu1, hdu2]) # Writing the file: hdulist1.writeto('test.fits', overwrite=True) hdulist1.close() # MUST do this.

Short Detour: Glob Module In one of the many useful python packages, glob lets you get lists of files using wildcards: import glob # Getting list of all files in current directory: filelist1 = glob.glob(‘*’) # or filelist1 = glob.glob(‘./*’) # Getting list of all files in all directories two levels down with the extension ‘.fits’: filelist2 = glob.glob(‘../*.fits’)

Short Detour: OS Module Additionally, the os module provides a large number of useful filesystem functions: import os # Basic File Operations (with tab autocomplete): !touch junk1.dat junk2.dat os.remove('junk1.dat') # Delete file os.rename('junk2.dat', 'junk2b.dat') # Rename file os.mkdir('mydata') # Making new directory # Path functions: os.path.exists('/tmp') # Checks if loc exists # Splits loc into directory and file os.path.split('/usr/local') # Splits loc into path+file and extension os.path.splitext('/var/log/messages.gz')

Short Detour: Lambda Functions Sometimes you want to define a simple function without the full function syntax. Lambda functions exist for this exact reason: # Defining the Function: funct1 = lambda x: x**2 # Returns the square of x # Using the Function: tmpvar1 = funct1(5) # Can use multiple variables: funct2 = lambda x,y: x + y tmpvar2 = funct2(5, 6)

Tables (& FITS Tables) While you can use the FITS interface to open tables, Astropy makes it very easy and convienient with the astropy.table interface: from astropy.table import Table # Getting the first table t1 = Table.read('Fluxes.fits') # Getting the second table t2 = Table.read('Fluxes.fits', hdu=2) This provides a really flexible Table object that is a pleasure to deal with. It is easy to access different types of data, and read in and output to a wide variety of formats (not just FITS)

Table Formats Pretty much anything you’d ever want!

Playing with Table Data A table is both a dictionary-like and numpy array-like data type that can either be accessed by key (for columns) or index (for rows): # Getting column names, number of rows: t1.colnames, len(t1) # Getting specific columns: t1[‘id’], t1[[‘ra’, ‘dec’]] # Getting specific rows (all normal indexing works): t1[0], t1[:3], t1[::-1] # Where searching also works: inds = np.where(t1[‘ra’] > 180) subtable = t1[inds] # Gets all columns

Playing with Table Data A table is both a dictionary-like and numpy array-like data type that can either be accessed by key (for columns) or index (for rows): # Getting column names, number of rows: t1.colnames, len(t1) # Getting specific columns: t1[‘id’], t1[[‘ra’, ‘dec’]] # Getting specific rows (all normal indexing works): t1[0], t1[:3], t1[::-1] # Where searching also works: inds = np.where(t1[‘ra’] > 180) subtable = t1[inds] # Gets all columns PRO TIP: Extracting a single column will give you a Numpy array-like variable with all your favourite methods attached.

Making a Table To make a table manually is easy with Numpy arrays: # Given two columns (1D) arr1 and arr2: arr1 = np.arange(10) arr2 = np.arange(10)**2 arr3 = np.arange(10)**3 t1 = Table([arr1, arr2], names=(“a”, “b”)) # The columns are named “a” and “b”. # Adding an additional column: col3 = Table.Column(name=“c”, data=arr3) t1.add_column(col3) # Adding an additional row: row = np.array([10, 100, 1000]) t1.add_row(row)

Writing Out a Table Writing out a table is also quite simple: # Writing out FITS table: t1.write('tabledata.fits') # Writing out specific text type: t1.write('tabledata.txt', format=‘ascii.tab’) # Can even write out to LaTeX: t1.write('tabledata.tex', format=‘ascii.latex’)

Writing Out a Table PRO TIP: Writing out a table is also quite simple: # Writing out FITS table: t1.write('tabledata.fits') # Writing out specific text type: t1.write('tabledata.txt', format=‘ascii.tab’) # Can even write out to LaTeX: t1.write('tabledata.tex', format=‘ascii.latex’) PRO TIP: To quickly investigate a table in a nicely formatted manner, you can do: t1.show_in_browser()

Exercise Time!