Robert Currier, Mote Marine Laboratory
Dr. Barbara Kirkpatrick, TAMU/GCOOS

Overview
FL Department of Health (DOH) monitors 34 coastal counties
E. coli/Enterococcus samples taken weekly
DOH data publicly available, but no API
Original DOH website used standard HTML/CSS
Python “web scraping” app developed to harvest data
DOH outsourced website to commercial provider

We had no access to DOH staff or an API for the data
In today’s “Big Data” world this is becoming typical
What we built broke when the data format changed
This is the story of how we fixed the harvester

Original Data Harvester
Written in Python
Used the ‘urllib’ library for web scraping
Data stored in MySQL database
Harvester ran nightly out of cron
App walked through list of counties and built a URL for each
Data returned as Python text object
Text object fed to regular expression for matching
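The fetch-and-match loop described on this slide can be sketched roughly as follows. This is a minimal illustration, not the original harvester: the endpoint `BASE_URL`, the `county` query parameter, and the table layout that `ROW_PATTERN` matches are all hypothetical, since the slides do not show the real URL or markup. It also uses Python 3's `urllib.request`, whereas the original code may have used the older Python 2 `urllib`.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint and markup: the real DOH URL and page layout
# are not given in the slides.
BASE_URL = "https://example.org/beach-water"
ROW_PATTERN = re.compile(
    r"<td>(?P<site>[^<]+)</td>\s*"
    r"<td>(?P<date>\d{2}/\d{2}/\d{4})</td>\s*"
    r"<td>(?P<result>[^<]+)</td>"
)


def build_url(county):
    """Build the per-county page URL from a county name."""
    return BASE_URL + "?" + urlencode({"county": county})


def parse_samples(html):
    """Return (site, date, result) tuples for every row the pattern matches."""
    return [m.group("site", "date", "result") for m in ROW_PATTERN.finditer(html)]


def harvest(counties):
    """Fetch each county page and yield (county, site, date, result) rows."""
    for county in counties:
        with urlopen(build_url(county)) as response:
            html = response.read().decode("utf-8", errors="replace")
        for row in parse_samples(html):
            yield (county, *row)
```

A regular expression tied this closely to the markup is what makes such a harvester fragile: any change to the page layout means the pattern silently matches nothing.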

Original Data Format

And Then It Stopped Working…
FL DOH suddenly (to us) outsourced its website in early 2013
New website used proprietary JavaScript and Maps
Plain HTML no longer sent to the browser
Instead, custom JavaScript was loaded
The JavaScript used AJAX and DOM manipulation

New Data Format

The Solution
Emulating a browser with Selenium
Portable software test framework for web applications
Can act like Firefox, Chrome and IE
Typically used for building automated tests
We repurposed it as a virtual browser
As a browser, Selenium can execute JavaScript
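Using Selenium as a "virtual browser" can be sketched as below. This is an assumed shape, not the authors' code: the function name `fetch_rendered_html` is hypothetical, and Firefox is chosen only because the slides mention it. The calls used (`webdriver.Firefox()`, `driver.get()`, `driver.page_source`, `driver.quit()`) are standard Selenium WebDriver API.

```python
def fetch_rendered_html(url, driver=None):
    """Load url in a real browser and return the DOM as HTML after scripts run.

    A driver can be passed in; otherwise a Firefox driver is created.
    """
    if driver is None:
        # Imported lazily so the function can be exercised with a stand-in driver.
        from selenium import webdriver  # third-party: selenium

        driver = webdriver.Firefox()
    try:
        driver.get(url)  # the browser executes the page's JavaScript
        return driver.page_source  # serialized DOM, not the raw HTTP response
    finally:
        driver.quit()  # always shut the browser down, even on errors
```

Pages that fill in their data through AJAX after the initial load may need an explicit wait (for example Selenium's `WebDriverWait`) before `page_source` is read.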

Soup’s On!
Selenium worked and we now had data available
But data was very unstructured and massively ugly
BeautifulSoup4 to the rescue…

And The Soup Was Tasty!
BeautifulSoup4 gave us back our “structured” data
Some modification needed to data parsing code, as…
Locations, variables and dates were not on the same line
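One way to handle values that arrive on separate lines is to flatten the rendered page into text fragments and then regroup them into records. The sketch below assumes, purely for illustration, that each record is exactly three consecutive fragments in the order location, variable, date; the real page layout is not shown in the slides, and both function names are hypothetical. `BeautifulSoup(html, "html.parser")` and `soup.stripped_strings` are standard BeautifulSoup4 API.

```python
def extract_strings(html):
    """Flatten rendered HTML into its non-empty text fragments, in document order."""
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    return list(soup.stripped_strings)


def group_records(strings, fields=("location", "variable", "date")):
    """Fold a flat list of fragments into dicts, one fragment per field.

    Assumes (hypothetically) that fragments repeat in a fixed order.
    """
    size = len(fields)
    usable = len(strings) - len(strings) % size  # ignore a trailing partial record
    return [
        dict(zip(fields, strings[i:i + size]))
        for i in range(0, usable, size)
    ]
```

Keeping the regrouping separate from the HTML extraction means only `group_records` has to change if the fragment order on the page changes.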

The New Code Worked Perfectly In Our Development Environment

But Failed Spectacularly When We Deployed

What Happened?
Amazon EC2 instances are “headless” servers
No display hardware
No graphics libraries (GTK+)
Since no graphics libraries, no browsers
Without a browser, we crash and burn

Adding A Virtual Head
We were given a script that pulled the source and built GTK+ on our cloud server in under two hours. Thanks, Joe Lawson!
Unfortunately, the script bombed and didn’t build Firefox. We had to download the source and build by hand.
Now we had a working browser, but no monitor on which to display our output…

Getting A Head with XVFB
XVFB: The X virtual frame buffer
Performs all graphical operations in memory
Doesn’t show output
Primarily used for testing, but…
We repurposed it, just like Selenium
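Starting an in-memory display before launching the browser might look like the sketch below. The display number `:99`, the screen geometry, and the fixed startup delay are assumptions chosen for illustration, and the helper names are hypothetical; `Xvfb :99 -screen 0 1024x768x24` is the standard Xvfb command-line form.

```python
import os
import subprocess
import time
from contextlib import contextmanager


def xvfb_command(display=":99", width=1024, height=768, depth=24):
    """Return the argv for an in-memory X server on the given display."""
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


@contextmanager
def virtual_display(display=":99", startup_delay=1.0):
    """Run Xvfb for the duration of the block and point DISPLAY at it."""
    server = subprocess.Popen(xvfb_command(display))
    previous = os.environ.get("DISPLAY")
    os.environ["DISPLAY"] = display
    try:
        time.sleep(startup_delay)  # crude wait for the server to accept clients
        yield display
    finally:
        server.terminate()
        server.wait()
        # Restore whatever DISPLAY was set to before the block.
        if previous is None:
            os.environ.pop("DISPLAY", None)
        else:
            os.environ["DISPLAY"] = previous
```

Any browser launched inside a `with virtual_display():` block, including one started by Selenium, inherits `DISPLAY` and renders into memory instead of onto a monitor.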

Automating The Process

Conclusions
Don’t be afraid to use untraditional data sources
But be prepared for your code to break
We live in a data rich environment
But most of the data is very messy/unstructured
So tread lightly, and don’t lose your head!

Thanks To:
Mote Marine Laboratory
Gulf Coast Ocean Observing Systems
Texas A&M Department of Oceanography
All the Free and Open Source Software developers

In Remembrance Of
Seth Vidal, creator of ‘yum’, friend and FOSS guru
Killed while biking on July 8th, 2013 in Durham, NC