Using the Purdue DB Technology to build simple on-demand data exploration tools Michael Grobe Pervasive Technology Institute Indiana University Hubbub.

Slides:



Advertisements
Similar presentations
WEB DESIGN TABLES, PAGE LAYOUT AND FORMS. Page Layout Page Layout is an important part of web design Why do you think your page layout is important?
Advertisements

Programming Paradigms and languages
TCP/IP Protocol Suite 1 Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 22 World Wide Web and HTTP.
MAST-VizieR/NED cross correlation tutorial 1. Introduction Figure 1: Screenshot of the MAST VizieR Catalog Search Form. or enter here as object class:
Stat-JR: eBooks Richard Parker. Quick overview To recap… Stat-JR uses templates to perform specific functions on datasets, e.g.: – 1LevelMod fits 1-level.
DATABASES AT THE HUB NOW YOU CAN CREATE THEM YOURSELF! Ann Christine Catlin HUBbub 2013.
DATABASES AT THE HUB NOW YOU CAN CREATE THEM YOURSELF! Ann Christine Catlin Senior Research Scientist Rosen Center for Advanced Computing HUBbub 2013.
Exploring Microsoft Excel 2002 Chapter 7 Chapter 7 List and Data Management: Converting Data to Information By Robert T. Grauer Maryann Barber Exploring.
Ann Arbor ASA ‘Up and Running’ Series: SPSS Prepared by volunteers of the Ann Arbor Chapter of the American Statistical Association, in cooperation with.
Chapter Day 5. © 2007 Pearson Addison-Wesley. All rights reserved2-2 Agenda Day 5 Questions from last Class?? Problem set 1 Posted  Introduction on developing.
Web applications. Javascript. Web 2.0: The dynamic, read-write web UC Santa Cruz CMPS 10 – Introduction to Computer Science
A Simple Guide to Using SPSS© for Windows
Interpret Application Specifications
Chapter 7 Managing Data Sources. ASP.NET 2.0, Third Edition2.
Computer Science 101 Web Access to Databases Overview of Web Access to Databases.
1 Chapter 20 — Creating Web Projects Microsoft Visual Basic.NET, Introduction to Programming.
Presented by Mina Haratiannezhadi 1.  publishing, editing and modifying content  maintenance  central interface  manage workflows 2.
WEB DESIGNING Prof. Jesse A. Role Ph. D TM UEAB 2010.
XP New Perspectives on Microsoft Access 2002 Tutorial 71 Microsoft Access 2002 Tutorial 7 – Integrating Access With the Web and With Other Programs.
Introduction To Databases IDIA 618 Fall 2014 Bridget M. Blodgett.
Section 2.1 Compare the Internet and the Web Identify Web browser components Compare Web sites and Web pages Describe types of Web sites Section 2.2 Identify.
Chapter Seven Advanced Shell Programming. 2 Lesson A Developing a Fully Featured Program.
11 Games and Content Session 4.1. Session Overview  Show how games are made up of program code and content  Find out about the content management system.
Lab Assignment 7 | Web Forms and Manipulating Strings Interactive Features Added In this assignment you will continue the design and implementation of.
Systems Analysis – Analyzing Requirements.  Analyzing requirement stage identifies user information needs and new systems requirements  IS dev team.
Chapter 33 CGI Technology for Dynamic Web Documents There are two alternative forms of retrieving web documents. Instead of retrieving static HTML documents,
1 Web Basics Section 1.1 Compare the Internet and the Web Compare Web sites and Web pages Identify Web browser components Describe types of Web sites Section.
© Trustees of Indiana University Released under Creative Commons 3.0 unported license; license terms on last slide. Using the Purdue DB Technology to build.
RUP Implementation and Testing
Introduction to SPSS Edward A. Greenberg, PhD
GDT V5 Web Services. GDT V5 Web Services Doug Evans and Detlef Lexut GDT 2008 International User Conference August 10 – 13  Lake Las Vegas, Nevada GDT.
SPSS Presented by Chabalala Chabalala Lebohang Kompi Balone Ndaba.
1 In the good old days... Years ago… the WWW was made up of (mostly) static documents. –Each URL corresponded to a single file stored on some hard disk.
Section 17.1 Add an audio file using HTML Create a form using HTML Add text boxes using HTML Add radio buttons and check boxes using HTML Add a pull-down.
Nobody’s Unpredictable Ipsos Portals. © 2009 Ipsos Agenda 2 Knowledge Manager Archway Summary Portal Definition & Benefits.
CPSC 203 Introduction to Computers Lab 23 By Jie Gao.
1 Data Bound Controls II Chapter Objectives You will be able to Use a Data Source control to get data from a SQL database and make it available.
26 June 2008 DG REGIO Evaluation Network Meeting Ex-post Evaluation of Cohesion Policy Programmes co-financed by the European Fund for Regional.
NMED 3850 A Advanced Online Design January 12, 2010 V. Mahadevan.
Chapter 3 Servlet Basics. 1.Recall the Servlet Role 2.Basic Servlet Structure 3.A simple servlet that generates plain text 4.A servlet that generates.
11 3 / 12 CHAPTER Databases MIS105 Lec15 Irfan Ahmed Ilyas.
September 6, 2013 A HUBzero Extension for Automated Tagging Jim Mullen Advanced Biomedical IT Core Indiana University.
CSC 2720 Building Web Applications Server-side Scripting with PHP.
What is SPSS  SPSS is a program software used for statistical analysis.  Statistical Package for Social Sciences.
Larry Theller, Bernie Engel, Youn Shik Park for Purdue University ABE and Office of Indiana State Chemist September 27-28, 2012 “Empowering Water Quality.
3-Tier Client/Server Internet Example. TIER 1 - User interface and navigation Labeled Tier 1 in the following graphic, this layer comprises the entire.
Access 2007 ® Use Databases How can Microsoft Access 2007 help you structure your database?
240-Current Research Easily Extensible Systems, Octave, Input Formats, SOA.
Rails & Ajax Module 5. Introduction to Rails Overview of Rails Rails is Ruby based “A development framework for Web-based applications” Rails uses the.
EML Analysis Tools Introduction Ecoinformatics Working Group Taiwan Forestry Research Institute (TFRI)
1 WWW. 2 World Wide Web Major application protocol used on the Internet Simple interface Two concepts –Point –Click.
Eurostat 4. SDMX: Main objects for data exchange 1 Raynald Palmieri Eurostat Unit B5: “Central data and metadata services” SDMX Basics course, October.
Web Design and Development. World Wide Web  World Wide Web (WWW or W3), collection of globally distributed text and multimedia documents and files 
DTC Quantitative Methods Summary of some SPSS commands Weeks 1 & 2, January 2012.
8 Chapter Eight Server-side Scripts. 8 Chapter Objectives Create dynamic Web pages that retrieve and display database data using Active Server Pages Process.
HUBzero® Platform for Scientific Collaboration Copyright © 2012 HUBzero Foundation, LLC Collaboration and Contribution Emily Kayser Hub Liaison, HUBzero®
: Information Retrieval อาจารย์ ธีภากรณ์ นฤมาณนลิณี
Navigation Framework using CF Architecture for a Client-Server Application using the open standards of the Web presented by Kedar Desai Differential Technologies,
MESA A Simple Microarray Data Management Server. General MESA is a prototype web-based database solution for the massive amounts of initial data generated.
Report on parallel session 5A, a.k.a. the DoSSiER session: Database of Scientific Simulation and Experimental Results 9/14/2016 Hans Wenzel 21st Geant4.
Developing Online Tools To Support The Visualization Of Ocean Data For Educational Applications Poster #1767 Michael Mills, S. Lichtenwalner,
IST 220 – Intro to Databases
Using E-Business Suite Attachments
Data Virtualization Tutorial… CORS and CIS
Databases at the Hub Now you can Create them yourself!
A QUICK START TO OPL IBM ILOG OPL V6.3 > Starting Kit >
Spreadsheets, Modelling & Databases
Tutorial 7 – Integrating Access With the Web and With Other Programs
Amos Introduction In this tutorial, you will be briefly introduced to the student version of the SEM software known as Amos. You should download the current.
Presentation transcript:

Using the Purdue DB Technology to build simple on-demand data exploration tools Michael Grobe Pervasive Technology Institute Indiana University Hubbub 2012 September 24, 20121

The general idea This presentation describes some of the work that came out of our collaboration with Ann-Christine Catlin’s group at Purdue. The primary goal involved making data from Connie Weaver’s Nutrional Science Lab at Purdue available on the Indiana CTSI Hub using the tools developed by Ann- Christine’s group, which we took to calling the “Purdue database technology toolset.” At the time we were working on this project, the Purdue database technology was focused around several tools that are very useful for collecting and displaying data, and therefore for building integrated data base interfaces: Dbparse: allows users to upload CSV files to backend database tables DataView: displays contents of backend database tables Photo_Gallery and the Collections manager: maintains and displays collections of images which may be displayed on-their-own or from within DataView views com_form: a component for building web-based forms for data collection and display 2

The general idea (continued) To use the DataViewer for data display, programmers simply prepare configuration files that identify database tables to be displayed, and define display appearance, including suitable plots. However, for non-programming users, it seemed likely that a small set of ad hoc data exploration tools might be useful for getting a quick look at the data. In particular, the following capabilities seem to constitute a basic package of capabilities: - frequency tables, - cross tabulations, - simple correlations/regression plots, or - simple correlation matrices. 3

The prototype and a question The com_form and DataViewer components were used to provide a PROTOTYPE version of a new HubZero component that performs these functions (com_datashape). com_form was used to build forms requesting and validating user input, and com_dataview was used to display frequency charts and regression plots as directed at run-time. (Something of a mutation of the DataViewer model.) So today’s presentation will: 1)illustrate how these components can be used together, and 2)help gauge user-interest in having such capabilities proceed past the prototype phase. (It will not however, provide a tutorial on actual coding techniques.) 4

A starting form to create queries on the fly. ( 5

Request a frequency table/chart 6

The field name pulldown menu 7

The resulting frequency chart 8

The cumulative frequency chart 9

Request a chart of partitioned field values 10

Resulting chart showing partition frequencies 11

Cross tabulation on 10 partitions of 2 fields 12

Cross tabulation (second half) 13

Resulting tabulation (10 partitions of each variable) 14

User specified partitions Users may explicitly specify partitions. In the next example, FecCa partitions were specified as: 700 | 800 | 900 | 1000 | 1100 | 1200 and the Retention partitions were specified as: -200 | -100 | 0 | 100 | 200 | 300 | 400 | 500 Real partition values may be specified using either standard decimal notation or scientific notation; they must of course be in ascending order. These are strict "partitions" rather than category "boundaries", so the first and last categories extend to negative or positive infinity, respectively; they are "unbounded", though that may not always be explicitly represented in the headers. 15

Resulting tabulation (user-defined partitions of each variable) 16

Request a simple correlation/regression 17

Resulting regression info 18

A correlation_matrix 19

Request user-defined displays via URLs You can invoke these applications via statically coded URLs. These examples will only work (at present) if you are logged in to the target CTSI hub instance in the same browser. Cross tablulate Retention with "Age category" (Teen or Adult) (1996) (show only the HTML table to avoid formatting imposed by the DataViewer display) Cross tablulate Retention with "Age category" (Teen or Adult) (1996) (show only the HTML table to avoid formatting imposed by the DataViewer display) Cross tabulate Retention with "Age category" (Teen or Adult) (1996) (show BOTH the HTML and DataView tables) Cross tabulate Retention with "Age category" (Teen or Adult) (1996) (show BOTH the HTML and DataView tables) Cross tabulate 10 Alk Phos categories with 10 Retention categories (HTML table only) Cross tabulate 10 Alk Phos categories with 10 Retention categories Cross tabulate unique Retention values with unique Retention values (both cases interpreted as integers) (HTML table only) Cross tabulate unique Retention values with unique Retention values (both cases interpreted as integers) Retention cross-tabulated with FecCa (Fecal Calcium) Age by Pre-specified Age Month categories (There are some missing Age values.) Age by Pre-specified Age Month categories Correlate and regress Retention as a function of Post Menarcheal age (1996) Correlation matrix composed of all numeric fields in the table cca_weaver_db1_1_1996 showing only values greater than.5 Correlation matrix composed of all numeric fields in the table cca_weaver_db1_1_1996 showing only values greater than.5 20

Summary Prototypes for 4 tools for (very) simple data exploration were built over 2 existing Purdue DB technology tools. (Correlation matrix tool can now run independently of the DataViewer.) These provide on-demand frequency charts, cross- tabulations, and simple correlation/regression output. These tools can help users get a “feel” for the data, but are not suitable for serious statistical analysis. Can these tools be helpful in the Hub context? 21

Additional information – Tutorial: Setting up a spreadsheet for use with the cceHub DataViewer Tutorial: Setting up a spreadsheet for use with the cceHub DataViewer – Tutorial: Using the Web form component to build scripts to modify databases Tutorial: Using the Web form component to build scripts to modify databases – Tutorial: Using the Web form and DataViewer components for to request user-specified frequency tables/charts, cross-tabs, and regressions Tutorial: Using the Web form and DataViewer components for to request user-specified frequency tables/charts, cross-tabs, and regressions – Putting a Joomla! component into SVN for the development cycle Putting a Joomla! component into SVN for the development cycle 22