Visualizing Economic Data Using Perl and HTML5's Canvas A. Sinan Unur

1 Visualizing Economic Data Using Perl and HTML5's Canvas A. Sinan Unur

2 Government agencies provide a lot of economic data
- U.S. Census Bureau
  - Income, poverty, health insurance, housing, population, etc.
- U.S. Bureau of Economic Analysis
  - National accounts and related macroeconomic data, etc.
- U.S. Bureau of Labor Statistics
  - Employment, price indexes, etc.
- U.S. Bureau of Transportation Statistics
  - Transportation-sector-specific economic indicators, accidents, air fares, etc.
- Centers for Medicare and Medicaid Services
  - Medicare/Medicaid and other health-care-related data

3 Utility of data provided by government agencies
- The detailed, raw or close-to-raw data provided by these agencies are invaluable to researchers.
- They are not easily accessible to the general public, who lack the advanced statistical and econometric tools and background to analyze them.
- Agencies also publish summary tables and graphs.
- Those are not very accessible either.

4 Bad apples (BTS) … Uninformative

5 Bad apples (Census) …
- Years in descending order
  - Cannot easily sort because some years have footnote text, e.g. 2004 (35)
- Multiple tables embedded in a single sheet
- Cannot compare across tables without jumping through a bunch of hoops
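The footnote problem above can be worked around once the labels are in code. The following is a minimal sketch in JavaScript, assuming year cells look like "2004" or "2004 (35)"; the helper names `parseYearLabel` and `sortByYear` and the `label` field are hypothetical and not part of the talk.

```javascript
// Hypothetical helper: split a year label such as "2004 (35)" into a
// numeric year and an optional footnote number. Returns null for cells
// that are not year labels (headers, blanks, note text).
function parseYearLabel(label) {
  const match = /^\s*(\d{4})\s*(?:\((\d+)\))?\s*$/.exec(String(label));
  if (!match) return null;
  return {
    year: Number(match[1]),
    footnote: match[2] === undefined ? null : Number(match[2]),
  };
}

// Sort rows oldest-first by the parsed year, dropping rows whose label
// is not a year. Each row is assumed to carry its raw text in `label`.
function sortByYear(rows) {
  return rows
    .map((row) => ({ row, parsed: parseYearLabel(row.label) }))
    .filter((entry) => entry.parsed !== null)
    .sort((a, b) => a.parsed.year - b.parsed.year)
    .map((entry) => ({ ...entry.row, ...entry.parsed }));
}
```

Keeping the footnote number as a separate field means the year column sorts numerically while the annotation is still available for display.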

6 What if you want to do something with the data?
- Perl to the rescue
  - Combine information from various tables spread over a number of files
  - Put data in proper database tables
  - Issue whatever queries you want
- For data in Excel files, use Spreadsheet::ParseExcel
- For simple ad hoc databases, use SQLite in conjunction with DBI and DBD::SQLite
- Create accessible, structured HTML tables as output
- Turn HTML tables into charts using JavaScript and Canvas
- Going to use some income data from the Census Bureau as a concrete example

7 Data source
- Historical income data from the Census Bureau
  - Households
  - Quintiles of the income distribution
  - Number of households in income brackets
  - All pre-tax, pre-transfer

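8 Spreadsheet::ParseExcel
- Reduce memory footprint and processing overhead using cell callbacks
    use Spreadsheet::ParseExcel;
    # Called from inside a method, so $self is in scope for the closure.
    my $parser = Spreadsheet::ParseExcel->new(
        CellHandler => sub { $self->_cell_handler(@_) },
        NotSetCell  => 1,    # do not store parsed cells in the workbook object
    );
    $parser->parse($file);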

9 Spreadsheet::ParseExcel
- Cell handler must detect
  - Sub-tables
  - Rows within sub-tables
- Cell handler creates a record for each row, identifying main table (race, units), sub-table, etc., so all data can be put into one table
- Parser is given a callback. Every time it has a complete record, the cell handler invokes the callback with the record.
- Sheet contents are therefore not duplicated or even triplicated(?) in memory.
- Once all related data are in a database table, we can do things like compare the second quintile of the income distribution across sub-groups, etc.
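The record-per-row idea can be sketched as a small state machine. The talk's actual handler is a Perl method that the slides do not show; the version below is written in JavaScript purely as an illustration. The sheet layout is an assumption: column 0 is taken to hold either a sub-table title or a year label, and the remaining columns hold that year's values. The names `makeCellHandler`, `handleCell`, `finish` and `onRecord` are hypothetical.

```javascript
// Sketch of a streaming cell handler. Cells are assumed to arrive in
// row-major order as (row, col, value). Nothing but the row currently
// being assembled is kept in memory.
function makeCellHandler(tableInfo, onRecord) {
  let subTable = null; // title of the sub-table currently being read
  let current = null;  // record being assembled for the current row
  let currentRow = -1;

  // Hand a finished record to the caller, then forget it.
  function flush() {
    if (current && current.values.length > 0) onRecord(current);
    current = null;
  }

  function handleCell(row, col, value) {
    if (row !== currentRow) { // first cell of a new row: emit the previous one
      flush();
      currentRow = row;
    }
    if (col === 0) {
      const year = /^\s*(\d{4})/.exec(String(value));
      if (year) {
        // A year label starts a data row within the current sub-table.
        current = { ...tableInfo, subTable, year: Number(year[1]), values: [] };
      } else if (String(value).trim() !== "") {
        // Any other text in column 0 is treated as a new sub-table title.
        subTable = String(value).trim();
      }
    } else if (current) {
      current.values.push(Number(value));
    }
  }

  return { handleCell, finish: flush };
}
```

A caller would pass `handleCell` to whatever parser delivers cells one at a time and call `finish()` at the end of the sheet so the last row is emitted. Each record carries the main-table attributes and the sub-table title, so every row can go into a single database table.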

10 Sharing with others
- Perl Dancer makes it easy to put together small, dedicated web apps
- Main interface: just a form.
- Output: nicely formatted HTML table + JavaScript that uses the contents of the table to create a plot on a canvas.
- IDEALLY:
  - No more generating bitmap images on the server side and serving them.
  - No need to depend on Flash, SVG.
  - Copy & paste, print.
- Of course, canvas is not fully and consistently supported yet:
  - E.g. Chrome on Windows does not let you right-click and copy canvas.
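The table-to-plot step might look like the sketch below. It is not the talk's code: it assumes a two-column table (year, value) with its data rows in the first `tbody`, and the function names `toCanvasPoints` and `plotTable` and the padding of 20 pixels are hypothetical.

```javascript
// Map (x, y) data points onto canvas pixel coordinates.
// Canvas y grows downward, so larger values get a smaller pixel y.
function toCanvasPoints(points, width, height, padding) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const xMin = Math.min(...xs);
  const yMin = Math.min(...ys);
  const xSpan = Math.max(...xs) - xMin || 1; // avoid division by zero
  const ySpan = Math.max(...ys) - yMin || 1;
  return points.map((p) => ({
    px: padding + ((p.x - xMin) / xSpan) * (width - 2 * padding),
    py: height - padding - ((p.y - yMin) / ySpan) * (height - 2 * padding),
  }));
}

// Browser-only: read the rows of a two-column HTML table and draw them
// as a line on the given canvas element.
function plotTable(table, canvas) {
  const points = Array.from(table.tBodies[0].rows, (tr) => ({
    x: parseFloat(tr.cells[0].textContent),
    y: parseFloat(tr.cells[1].textContent.replace(/,/g, "")), // strip thousands separators
  }));
  const ctx = canvas.getContext("2d");
  ctx.beginPath();
  toCanvasPoints(points, canvas.width, canvas.height, 20).forEach((p, i) => {
    if (i === 0) ctx.moveTo(p.px, p.py);
    else ctx.lineTo(p.px, p.py);
  });
  ctx.stroke();
}
```

Because the chart is derived from the table in the page, the table stays the accessible, copyable source of the numbers and the server never has to render an image.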

11 Canvas headaches
- Need text height to be able to figure out where to plot
    var metrics = ctx.measureText(string);
- metrics only has a width property, no height!

12 Canvas headaches
- How do others deal with the lack of a way to measure height of a string?
  - Flot, jQuery Visualize: Use absolutely positioned HTML elements over canvas
    - Disadvantage: Chart is no longer a single entity you can copy & paste, save to a file, etc.
  - Gnuplot, possibly others: Use manually specified outlines for ASCII and specific symbol characters
    - Lose Unicode text drawing support

13 Canvas: Height of a string in current font
- Draw string, black on white background
- Find first scanline with a non-white pixel
- Find first subsequent scanline with all white pixels
  - Waste memory
  - Repeatedly draw on and clear canvas
  - Inelegant, cumbersome
  - Seems to be the only way to do it if you want arbitrary fonts, character sets, and treat chart as a single entity
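The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the talk's implementation: `inkBounds` and `measureTextHeight` are hypothetical names, and the 400 by 200 scratch canvas size is arbitrary. The scanning function takes any object with the same layout as the canvas `ImageData` (RGBA bytes, row by row), so it can be exercised without a browser.

```javascript
// Find the band of scanlines containing ink: the first row with a
// non-white pixel, then the first later row that is entirely white.
function inkBounds(image) {
  const { data, width, height } = image;
  const rowHasInk = (y) => {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      if (data[i] !== 255 || data[i + 1] !== 255 || data[i + 2] !== 255) return true;
    }
    return false;
  };
  let top = 0;
  while (top < height && !rowHasInk(top)) top++;
  if (top === height) return null; // nothing was drawn
  let bottom = top;
  while (bottom < height && rowHasInk(bottom)) bottom++;
  return { top, bottom, height: bottom - top };
}

// Browser-only: draw the string black on white on a scratch canvas and
// measure the inked band. The scratch size must be large enough for the text.
function measureTextHeight(text, font) {
  const canvas = document.createElement("canvas");
  canvas.width = 400;
  canvas.height = 200;
  const ctx = canvas.getContext("2d");
  ctx.fillStyle = "white";
  ctx.fillRect(0, 0, canvas.width, canvas.height);
  ctx.font = font;
  ctx.fillStyle = "black";
  ctx.textBaseline = "middle";
  ctx.fillText(text, 10, canvas.height / 2);
  return inkBounds(ctx.getImageData(0, 0, canvas.width, canvas.height));
}
```

One limitation of stopping at the first all-white scanline is that text with a vertical gap, such as the dot of an "i" when no taller glyph is present, is measured only down to the gap. Browsers released after this talk also expose `actualBoundingBoxAscent` and `actualBoundingBoxDescent` on the object returned by `measureText`, which may make the pixel scan unnecessary where those are available.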

14 Code, sample app & pretty pictures coming soon
- … before my presentation ;-)
