Prof. Jason Hong, Carnegie Mellon University
Rapid End-User Programming and Visualization for the Web
IDA Session 5, 2007 CS Study Panel
24 April 2008

Presentation transcript:

Principal Investigator
Jason Hong
Assistant Professor, Human-Computer Interaction Institute, Carnegie Mellon University
PhD: University of California, Berkeley
Research Areas
End-User Programming
–Extracting and visualizing data from web
Usable Privacy and Security
–Anti-phishing (training, detection)
–Managing privacy and security policies
Mobile Computing
–Location-based services
–Context-aware computing
Potential Military Applications
–Tools for rapidly integrating data and web services
–Better visualizations of large data sets
–Effective training for security
–Automated algorithms for detecting phishing scams
–Better interfaces for managing security
Contact Information
School of Computer Science, Carnegie Mellon University
2504D Newell-Simon Hall, 5000 Forbes Ave
Tel: (412) Fax: (412) Web:

30,000-Foot View
High-level problems observed:
–Stovepipes - Data and services spread over multiple systems
–Agility - Integration takes months or years
–Overload - Too much information to easily process
Goal: Make it easy for people to visualize and process data gathered from a variety of sources
–Information extraction + visualization + machine learning
–No PhD required
Analogies:
–Spreadsheets
–Visual Basic

Mashups as Key Focus Area
More specifically, provide an end-user programming tool that makes it easy to create mashups
–Mashups are applications that combine content and services from multiple web sites
–Ex. Craigslist.com + Google Maps = Housingmaps.com

Other Example Mashups
–Ex. MySpace child predators
–Ex. Locations of friends on MySpace or Facebook
Common themes
–Aggregating multiple sources (web pages, databases, etc.)
–Handling multiple data formats (not designed to be shared)
–Processing the data (filtering, summarizing, etc.)
–Supporting multiple forms of output (graphs, maps, lists)
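The common themes above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the two "sources" are made-up inline strings (one CSV, one JSON), and the names `load_listings`, `load_geocodes`, and `mashup` are hypothetical. It shows aggregation of two sources, handling of two formats, a filtering step, and a simple list output.

```python
import csv
import io
import json

# Made-up sample data standing in for two web sources that use different formats.
LISTINGS_CSV = "address,rent\n1 Main St,1200\n2 Oak Ave,950\n3 Elm Rd,700\n"
GEOCODES_JSON = '{"1 Main St": [40.0, -80.0], "2 Oak Ave": [40.1, -80.1]}'


def load_listings(text):
    """Parse CSV rows into dicts with a numeric rent."""
    return [
        {"address": row["address"], "rent": int(row["rent"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def load_geocodes(text):
    """Parse a JSON object mapping address -> [lat, lon]."""
    return {addr: tuple(pair) for addr, pair in json.loads(text).items()}


def mashup(listings, geocodes, max_rent):
    """Filter listings by rent and attach coordinates where they are known."""
    points = []
    for item in listings:
        if item["rent"] > max_rent:
            continue  # processing step: filter out expensive listings
        coords = geocodes.get(item["address"])
        if coords is None:
            continue  # no known location, so it cannot be placed on a map
        points.append({**item, "lat": coords[0], "lon": coords[1]})
    return points


points = mashup(load_listings(LISTINGS_CSV), load_geocodes(GEOCODES_JSON), max_rent=1000)
for p in points:
    print(f'{p["address"]}: ${p["rent"]} at ({p["lat"]}, {p["lon"]})')
```

In a real mashup the two loaders would fetch and parse live web pages or services, which is where most of the effort described on the next slide goes.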

Creating Mashups is Difficult
Requires lots of skill to create a mashup
–Ex. Housingmaps creator has PhD in computer science
–Ex. MySpace predator list took months of custom coding
Requires programming expertise in many areas
–Web crawling
–Text parsing and pattern matching
–Web services (WSDL and REST)
–Databases
–HTML
Can we accelerate this process to a matter of days or hours for non-experts?

End-User Programming
Haggis, an end-user programming tool
1. Rapidly extract and combine data from multiple sources
2. Quickly create high-quality interfaces and visualizations
3. Use programming-by-example techniques to specify what is normal and what is anomalous

1. Extract data from multiple sources
Improved wizards for extracting data from web pages
–Can specify example of desired links, system generalizes

1. Extract data from multiple sources
Improved wizards for extracting data from web pages
–Can specify example of desired links, system generalizes
–Better support for other patterns on web
  Tables, street addresses, etc.
Support for real-time data
–Weather, traffic, stocks, any web page periodically updated
–Sensor Andrew, sensor network being deployed at CMU
  Electrical usage, water usage, etc.
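One way to picture "specify example of desired links, system generalizes" is the sketch below. It is an assumption about how such generalization could work, not a description of Haggis: the user supplies two example hrefs, and the hypothetical `generalize` function keeps their shared prefix and suffix and lets the middle vary. The page HTML is made up.

```python
import os
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def generalize(examples):
    """Build a regex from the common prefix and suffix of the example links."""
    prefix = os.path.commonprefix(examples)
    suffix = os.path.commonprefix([e[::-1] for e in examples])[::-1]
    # Trim the suffix if it overlaps the prefix in the shortest example.
    overlap = len(prefix) + len(suffix) - min(len(e) for e in examples)
    suffix = suffix[max(0, overlap):]
    return re.compile(re.escape(prefix) + ".*" + re.escape(suffix))


# Made-up page with listing links mixed in among navigation links.
PAGE = """
<a href="/listing/101.html">Apartment A</a>
<a href="/about.html">About</a>
<a href="/listing/205.html">Apartment B</a>
<a href="/listing/317.html">Apartment C</a>
<a href="/help/faq.html">FAQ</a>
"""

collector = LinkCollector()
collector.feed(PAGE)

# The user points at two links they want; the rest are inferred.
pattern = generalize(["/listing/101.html", "/listing/205.html"])
wanted = [href for href in collector.links if pattern.fullmatch(href)]
print(wanted)
```

A prefix/suffix rule is deliberately crude; it over-generalizes when the examples share little text, which is one reason a wizard would let the user confirm or correct the inferred set.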

2. Interfaces and Visualizations
Wizards for supporting common UI patterns
–Table views, maps, graph views, alerts, etc.
Programming-by-example techniques

2. Interfaces and Visualizations
Output as a web page or desktop widget
–Yahoo Widgets, Google Desktop, Windows Sidebar

3. Normal versus Anomalous
Problem: Too much data, gets dropped on floor
Solution: “Teach” the system what patterns to look for
–Analyst-in-the-loop: infoviz + machine learning
–Long-term goal
Example:
–eBay “penny sellers”, could create custom software, but slow
–Analyst uses visualization to find some examples of penny sellers and gives hints to system as to why
–System finds more suspects, analyst gives relevance feedback
–As new data streams in, system can flag suspects
Can help address high turnover rate at intelligence agencies, loss of organizational memory
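The analyst-in-the-loop cycle above can be sketched as follows. The slides do not say which learning method is intended, so this uses a nearest-centroid rule purely as a stand-in. The class name `SuspectFinder`, the two numeric features per seller (labelled item price and shipping fee only for illustration), and all values are made up.

```python
from statistics import mean


def centroid(rows):
    """Average each feature column across the labelled examples."""
    return [mean(column) for column in zip(*rows)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class SuspectFinder:
    """Learns from analyst feedback and flags records that resemble suspects."""

    def __init__(self):
        self.suspect = []
        self.normal = []

    def feedback(self, features, is_suspect):
        """Record an analyst judgement about one example."""
        (self.suspect if is_suspect else self.normal).append(features)

    def flag(self, features):
        """Flag a record if it is closer to the suspect examples than to the normal ones."""
        if not self.suspect or not self.normal:
            return False  # not enough feedback yet to decide
        return distance(features, centroid(self.suspect)) < distance(
            features, centroid(self.normal)
        )


finder = SuspectFinder()
# Analyst marks a few examples found through the visualization (made-up values).
finder.feedback((0.01, 25.0), is_suspect=True)
finder.feedback((0.05, 30.0), is_suspect=True)
finder.feedback((20.0, 5.0), is_suspect=False)
finder.feedback((15.0, 4.0), is_suspect=False)

# As new data streams in, the system flags likely suspects.
stream = [(0.02, 28.0), (18.0, 6.0)]
flags = [finder.flag(record) for record in stream]
print(flags)
```

Each further call to `feedback` shifts the centroids, which is the relevance-feedback step: the analyst corrects a flag and later records are scored against the updated examples.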

Current Progress
First round of interviews completed
–Sensor Andrew team (Civil and Electrical Engineers)
–Mashup Camp
–Programmers around CMU
Initial prototype of “plumbing” in progress
–An Integrated Development Environment (IDE) for programmers, to facilitate extraction and visualization of data
–Low-level support for extracting data from tables, basic visualizations, etc.
–Higher-level tools later to be built on top
First round of user tests planned for August

Past Work with Marmite
–Wizard for extracting data from arbitrary web pages
–Combine operators together in a dataflow (Unix)
–View the data in multiple ways (table, map)

How Marmite Works
–Wizard for getting data from web pages
–Combine operators together in a dataflow (Unix)
–View the data in multiple ways (table, map)

How Marmite Works
Operators let you know what operations can be done
–Input, processing, output

How Marmite Works
Operators are chained together in a dataflow (Unix)
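The Unix-pipe analogy can be sketched with Python generators. This is not Marmite's implementation. The operator names `where`, `add_field`, and `select`, the `pipeline` helper, and the sample rows are all hypothetical. Each operator takes a stream of rows and returns a new stream, so operators chain the way commands do in a shell pipeline.

```python
def where(predicate):
    """Processing operator: keep only rows that satisfy the predicate."""
    return lambda rows: (row for row in rows if predicate(row))


def add_field(name, compute):
    """Processing operator: add a derived column to every row."""
    return lambda rows: ({**row, name: compute(row)} for row in rows)


def select(*fields):
    """Output operator: keep only the named columns."""
    return lambda rows: ({key: row[key] for key in fields} for row in rows)


def pipeline(rows, *operators):
    """Feed the rows through each operator in order, like cmd1 | cmd2 | cmd3."""
    for operator in operators:
        rows = operator(rows)
    return rows


# Made-up input rows standing in for data extracted from a web page.
events = [
    {"title": "Concert", "city": "Pittsburgh", "price": 30},
    {"title": "Lecture", "city": "Pittsburgh", "price": 0},
    {"title": "Play", "city": "Boston", "price": 20},
]

result = list(
    pipeline(
        events,
        where(lambda row: row["city"] == "Pittsburgh"),
        add_field("free", lambda row: row["price"] == 0),
        select("title", "free"),
    )
)
print(result)
```

Because every stage is a generator, the intermediate data after any operator can be materialized and inspected on its own, which mirrors showing the current data at each step of the flow.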

How Marmite Works
Current data is shown

How Marmite Works
And multiple views too

How Marmite Works
A wizard UI for helping people get the data they want

Some High-Level Design Issues
Centralized model
–Clean data model: well-managed, well-formatted, common representations, well-known databases, etc.
Decentralized model
–“Anarchic”, multiple data formats in multiple places
–Hard to get lots of people to agree on data format and representation
–More likely scenario (look at how databases are used today)
–Haggis is being designed for this model, assuming that a person may have to clean up the data and resolve formats

Other High-Level Design Issues
Discovery
–What data sources are available?
–May need some kind of centralized store that describes these (sort of like DNS for Internet)
Security
–Access control, who can access what data sources?
–This is a general problem with sensor data
Privacy
–What kinds of queries / apps should people be able to do?
–Unclear how to restrict those in practice