Data visualization in Python


Data visualization in Python
Martijn Tennekes, Ali Hürriyetoglu
The contractor is acting under a framework contract concluded with the Commission.

Outline
Overview of data visualization in Python
ggplot
Folium
Conclusion

Which packages/functions
Standard charts (e.g. line chart, bar chart, scatter plot): Matplotlib, Pandas, Seaborn, ggplot, Altair, ...
Thematic maps: Folium, Basemap, Cartopy, Iris, ...
Other visualisations: Bokeh (interactive plots), plotly, ...

ggplot
Based on one of the most popular R packages (ggplot2)
Based on the Grammar of Graphics (Wilkinson, 2005)
Charts are built up according to this grammar: data, mapping / aesthetics, geoms, stats, scales, coord, facets
Pandas DataFrames are used natively in ggplot.

ggplot and qplot
Stacking of layers and transformations with +
Data: DataFrame
Aesthetics: x, y, color, fill, shape
Geometry: points
    ggplot(mpg, aes(x='displ', y='cty')) + geom_point()
Shortcut function: qplot (quick plot)
    qplot(diamonds.carat, diamonds.price)
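
The stacking with + can be illustrated with a toy sketch in plain Python. The Plot and Layer classes below are hypothetical and are not part of the ggplot package; they only show how overloading + lets components accumulate on a plot object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """A hypothetical plot component, e.g. a geom or a scale."""
    kind: str   # "geom", "stat", "scale", ...
    name: str   # e.g. "point", "line"


@dataclass(frozen=True)
class Plot:
    """A hypothetical plot: data, aesthetic mapping, and stacked layers."""
    data: dict
    mapping: dict
    layers: tuple = ()

    def __add__(self, layer):
        # Adding a layer returns a new plot; the original stays unchanged.
        return Plot(self.data, self.mapping, self.layers + (layer,))


mpg = {"displ": [1.8, 2.0, 2.8], "cty": [18, 20, 16]}   # made-up sample values
base = Plot(mpg, {"x": "displ", "y": "cty"})
plot = base + Layer("geom", "point") + Layer("geom", "line")

print([layer.name for layer in plot.layers])  # ['point', 'line']
```

Returning a new object from + means a base plot can be reused with different layers, which is presumably why the real library follows the same pattern.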

Aesthetics
Mapping of data to visual attributes of geometric objects:
Position: x, y
Color: color
Shape: shape
    ggplot(aes(x='carat', y='price', color='clarity'), diamonds) + geom_point()

Aesthetics
The same mapping, using shape instead of color:
    ggplot(aes(x='carat', y='price', shape="cut"), diamonds) + geom_point()
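
What an aesthetic mapping does for a categorical variable can be sketched in a few lines of plain Python. The function map_to_colors, the sample values, and the palette are all made up for illustration; the library's own colour assignment may work differently.

```python
def map_to_colors(values, palette):
    """Assign one palette colour to each distinct value, in order of first appearance."""
    assignment = {}
    for value in values:
        if value not in assignment:
            # Cycle through the palette if there are more categories than colours.
            assignment[value] = palette[len(assignment) % len(palette)]
    return [assignment[value] for value in values], assignment


clarity = ["SI1", "VS2", "SI1", "IF", "VS2"]   # made-up sample values
palette = ["red", "green", "blue"]             # arbitrary colour names

colors, legend = map_to_colors(clarity, palette)
print(colors)   # ['red', 'green', 'red', 'blue', 'green']
print(legend)   # {'SI1': 'red', 'VS2': 'green', 'IF': 'blue'}
```

The second return value plays the role of the legend: it records which data value each visual attribute stands for.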

Geom
Geometric objects: points, lines, polygons, ...
Functions start with "geom_"
Also margins: geom_errorbar(), geom_pointrange(), geom_linerange(). Note: they require the aesthetics ymin and ymax.
    ggplot(mpg, aes(x='displ', y='cty')) + geom_point() + geom_line()
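
Because the margin geoms need ymin and ymax, these columns usually have to be computed before plotting. A minimal sketch, assuming the margins should show the mean plus or minus one standard error per group; the function name summarise and the data are made up.

```python
from math import sqrt
from statistics import mean, stdev


def summarise(groups):
    """Return one row per group with y (mean) and ymin/ymax (mean -/+ one standard error)."""
    rows = []
    for label, values in groups.items():
        centre = mean(values)
        stderr = stdev(values) / sqrt(len(values))
        rows.append({"x": label, "y": centre,
                     "ymin": centre - stderr, "ymax": centre + stderr})
    return rows


measurements = {"a": [4.0, 6.0, 5.0], "b": [10.0, 14.0, 12.0]}  # made-up data
for row in summarise(measurements):
    print(row)
```

The resulting rows carry exactly the aesthetics x, y, ymin and ymax that an error-bar style geom expects.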

Stat
stat_smooth() and stat_density() enable statistical transformations
Most geoms have a default stat (and the other way round)
A geom and a stat form a layer
One or more layers form a plot
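
The pairing of a stat with a geom can be illustrated with a histogram: a binning stat turns raw values into counts, and a bar geom draws them. The sketch below is plain Python with made-up data; the function stat_bin here is a toy stand-in, not the library's implementation.

```python
def stat_bin(values, bin_width):
    """Count how many values fall into each bin [k * w, (k + 1) * w)."""
    counts = {}
    for value in values:
        left_edge = (value // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return sorted(counts.items())


prices = [326, 334, 410, 455, 520, 980, 1010]   # made-up sample

# The "geom" step: draw one text bar per bin from the stat's output.
for left_edge, count in stat_bin(prices, 250):
    print(f"[{left_edge}, {left_edge + 250}): {'#' * count}")
```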

stat_smooth
    ggplot(aes(x='date', y='beef'), data=meat) + geom_point() + \
        stat_smooth(method='loess')
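
To give an idea of what a loess smoother computes, here is a simplified sketch in plain Python: a locally weighted straight-line fit with tricube weights and no robustness iterations. It is an assumption-laden illustration, and the smoother used by the ggplot package may differ in its details; the function name, the span default, and the data are made up.

```python
def loess(xs, ys, span=0.5):
    """Smooth ys at each x with a locally weighted straight-line fit."""
    n = len(xs)
    k = max(2, int(round(span * n)))          # neighbours used per local fit
    smoothed = []
    for x0 in xs:
        # Distance to the k-th nearest point sets the local bandwidth.
        distances = sorted(abs(x - x0) for x in xs)
        bandwidth = distances[k - 1] or 1.0
        # Tricube weights: near points count more, points beyond the bandwidth count zero.
        weights = [max(0.0, 1 - (abs(x - x0) / bandwidth) ** 3) ** 3 for x in xs]
        # Weighted least squares for a straight line through the local points.
        sw = sum(weights)
        mx = sum(w * x for w, x in zip(weights, xs)) / sw
        my = sum(w * y for w, y in zip(weights, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[4] += 3                                    # one noisy observation
print([round(v, 2) for v in loess(xs, ys, span=0.5)])
```

A larger span uses more neighbours per fit and gives a smoother curve; a smaller span follows the data more closely.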

stat_density
    ggplot(aes(x='price', color='clarity'), data=diamonds) + stat_density()
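
A density stat replaces the raw values by an estimated density curve. A minimal sketch of a Gaussian kernel density estimate in plain Python follows; the function name kernel_density, the bandwidth, and the sample are assumptions for illustration, and the library may choose its kernel and bandwidth differently.

```python
from math import exp, pi, sqrt


def kernel_density(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point."""
    norm = 1.0 / (len(samples) * bandwidth * sqrt(2 * pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each sample.
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


prices = [300, 320, 350, 900, 950]   # made-up sample
density = kernel_density(prices, bandwidth=50)
for x in range(200, 1101, 150):
    print(x, f"{density(x):.5f}")
```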

Scales (and axes)
A scale indicates how the value of a variable scales with an aesthetic. Therefore:
A scale belongs to one aesthetic (x, y, color, fill, etc.)
The axis is an essential part of a scale
With scale_XXX, the scales and axes can be adjusted (XXX stands for a combination of aesthetic and type of scale, e.g. scale_fill_gradient)

scale_x_log
    ggplot(diamonds, aes(x='price')) + geom_histogram() + scale_x_log(base=100)
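
What a logarithmic scale does can be sketched in plain Python: it maps data values to axis positions and picks the tick marks. Both functions below are hypothetical helpers, not part of the ggplot package, and they assume a domain that starts at 1 or above.

```python
from math import log


def log_scale(value, domain, base=10):
    """Map a positive value to a position in [0, 1] on a logarithmic axis."""
    low, high = domain
    return (log(value, base) - log(low, base)) / (log(high, base) - log(low, base))


def log_breaks(domain, base=10):
    """Powers of `base` inside the domain, used as axis ticks."""
    low, high = domain
    breaks, tick = [], 1
    while tick <= high:
        if tick >= low:
            breaks.append(tick)
        tick *= base
    return breaks


print(round(log_scale(100, (1, 10000)), 3))    # 0.5
print(log_breaks((1, 10000), base=100))        # [1, 100, 10000]
```

In this sketch the base does not change where a value lands on the axis, only which values get a tick, so a base of 100 simply gives fewer breaks than a base of 10.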

Coord
A chart is drawn in a coordinate system. This can be transformed. A pie chart has a polar coordinate system.
    import numpy as np
    import pandas as pd
    from ggplot import *

    df = pd.DataFrame({"x": np.arange(100)})
    df['y'] = df.x * 10
    # polar coords
    p = ggplot(df, aes(x='x', y='y')) + geom_point() + coord_polar()
    print(p)
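
The transformation behind a polar coordinate system can be sketched as follows: x is read as an angle and y as a radius. This is a simplified illustration with a hypothetical helper; it measures the angle counter-clockwise from the positive horizontal axis, which may not match the start angle and direction a plotting library uses.

```python
from math import cos, pi, sin


def to_polar_canvas(x, y, x_range):
    """Interpret x as an angle and y as a radius; return Cartesian canvas coordinates."""
    x_min, x_max = x_range
    theta = 2 * pi * (x - x_min) / (x_max - x_min)   # the full x range is one full turn
    return y * cos(theta), y * sin(theta)


for x in (0, 25, 50, 75):
    cx, cy = to_polar_canvas(x, 10, (0, 100))
    print(x, round(cx, 3), round(cy, 3))
```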

Facets
With facets, small multiples are created. Each facet shows a subset of the data.
    ggplot(diamonds, aes(x='price')) + \
        geom_histogram() + \
        facet_grid("cut")
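
The subsetting behind facets amounts to grouping the rows by the facet variable, with one panel per group. A minimal sketch in plain Python, with a hypothetical facet function and made-up rows rather than the real diamonds data:

```python
from collections import defaultdict


def facet(rows, by):
    """Split rows into one subset per distinct value of column `by`."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[by]].append(row)
    return dict(panels)


diamonds = [  # made-up rows, not the real diamonds dataset
    {"cut": "Ideal", "price": 326},
    {"cut": "Good", "price": 327},
    {"cut": "Ideal", "price": 340},
]
for cut, subset in facet(diamonds, "cut").items():
    print(cut, [row["price"] for row in subset])
```

Each subset would then be drawn with the same geoms and scales, which is what makes the panels comparable.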

Facets example
    ggplot(chopsticks, aes(x='chopstick_length', y='food_pinching_effeciency')) + \
        geom_point() + \
        geom_line() + \
        scale_x_continuous(breaks=[150, 250, 350]) + \
        facet_wrap("individual")

Facets example 2
    ggplot(diamonds, aes(x="carat", y="price", color="color", shape="cut")) + geom_point() + facet_wrap("clarity")

ggplot tips
You can annotate plots:
    ggplot(mtcars, aes(x='mpg')) + geom_histogram() + \
        xlab("Miles per Gallon") + ylab("# of Cars")
Assign a plot to a variable, for instance g:
    g = ggplot(mpg, aes(x='displ', y='cty')) + geom_point()
The function save saves the plot to the desired format:
    g.save("myimage.png")

Folium: Thematic maps
A thematic map is a visualization where statistical information with a spatial component is shown.
Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on a Leaflet map via Folium.
Other libraries are: Basemap, Cartopy, Iris

Folium features
Built-in tilesets from OpenStreetMap, MapQuest Open, MapQuest Open Aerial, Mapbox, and Stamen
Supports custom tilesets with Mapbox or Cloudmade API keys
Supports GeoJSON and TopoJSON overlays, as well as the binding of data to those overlays to create choropleth maps with color-brewer color schemes

Basic maps
    folium.Map(location=[50.89, 5.99], zoom_start=14)

Basic maps
    folium.Map(location=[50.89, 5.99], zoom_start=14, tiles='Stamen Toner')

GeoJSON/TopoJSON Overlays
    ice_map = folium.Map(location=[-59, -11], tiles='Mapbox Bright', zoom_start=2)
    ice_map.geo_json(geo_path=geo_path)
    ice_map.geo_json(geo_path=topo_path, topojson='objects.antarctic_ice_shelf')
    ice_map.create_map(path='ice_map.html')
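
For readers who have not seen the overlay format, a minimal GeoJSON document can be built with the standard json module. The feature id, the property, and the coordinates below are made up for illustration. Note that GeoJSON stores positions as [longitude, latitude], whereas the location argument of folium.Map is given as [latitude, longitude].

```python
import json

feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "example-area",                     # hypothetical identifier
            "properties": {"name": "Example area"},   # made-up attribute
            "geometry": {
                "type": "Polygon",
                # One closed ring of [longitude, latitude] pairs;
                # the first and last position are identical.
                "coordinates": [[[5.98, 50.88], [6.00, 50.88], [6.00, 50.90],
                                 [5.98, 50.90], [5.98, 50.88]]],
            },
        }
    ],
}

geojson_text = json.dumps(feature_collection, indent=2)
print(geojson_text)
```

Writing geojson_text to a file would give a path that can be passed to an overlay call such as the ones on this slide.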

Choropleth maps
    map = folium.Map(location=[48, -102], zoom_start=3)
    map.choropleth(geo_path=state_geo, data=state_data,
                   columns=['State', 'Unemployment'],
                   key_on='feature.id',
                   fill_color='YlGn', fill_opacity=0.7, line_opacity=0.2,
                   legend_name='Unemployment Rate (%)')
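
The data binding behind a choropleth can be sketched in plain Python: each feature is matched to a data value through a shared key, and the value is assigned to a colour class. The function, the rates, the class breaks, and the colour names are all made up for illustration; Folium's own classification may differ.

```python
def choropleth_colors(features, data, key, palette, thresholds):
    """Look up each feature's value by `key` and pick a colour by threshold class."""
    colors = {}
    for feature in features:
        value = data.get(feature[key])
        if value is None:
            continue                      # no data bound to this feature
        # The class index is the number of thresholds the value reaches.
        klass = sum(value >= t for t in thresholds)
        colors[feature[key]] = palette[klass]
    return colors


features = [{"id": "AL"}, {"id": "AK"}, {"id": "AZ"}]   # stand-ins for GeoJSON features
unemployment = {"AL": 7.1, "AK": 6.8, "AZ": 8.1}        # made-up rates
palette = ["light", "medium", "dark"]                   # placeholder colour names
thresholds = [7.0, 8.0]                                 # arbitrary class breaks

print(choropleth_colors(features, unemployment, "id", palette, thresholds))
# {'AL': 'medium', 'AK': 'light', 'AZ': 'dark'}
```

The key argument plays the role of key_on on the slide: it names the feature field that must match the first data column.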

Summary
Python has many options for data visualization
Each visualisation library has a particular audience
A JavaScript backend is mostly used to extend the power of the visualisation
Python's extensive data processing tools integrate well with visualisation requirements

References
http://yhat.github.io/ggplot/
https://folium.readthedocs.io/en/latest/