1
Visual analytic tools for monitoring and understanding the emergence and evolution of innovations in science & technology
Links from this talk: bit.ly/stmwant
Cody Dunne
Dept. of Computer Science and Human-Computer Interaction Lab, University of Maryland
OECD KNOWINNO Workshop, November 14-15, 2011, Alexandria, VA, USA
2
Outline
1. Academic literature exploration
2. Case study: Tree visualization techniques
3. Case study: Business intelligence news
4. Case study: Pennsylvania innovations
5. STICK approach
3
1. Academic literature exploration
Users are looking for:
- Foundations
- Emerging research topics
- State of the art/open problems
- Collaborations & relationships between communities
- Field evolution
- Easily understandable surveys
4
Action Science Explorer
5
User requirements
- Control over the paper collection: choose a custom subset via query, then iteratively drill down, filter, & refine
- Overview either as visualization or text statistics: orient within the subset
- Easy-to-understand metrics for identifying interesting papers: ranking & filtering
- Create groups & annotate with findings: organize the discovery process, share results
6
Action Science Explorer
- Bibliometric lexical link mining to create a citation network and citation context
- Network clustering and multi-document summarization to extract key points
- Potent network analysis and visualization tools
7
2. Case study: Tree visualization
Problem: Traditional 2D node-link diagrams of trees become too large
Solutions:
- Treemaps: Nested Rectangles
- Cone Trees: 3D Interactive Animations
- Hyperbolic Trees: Focus + Context
Measures:
- Papers, articles, patents, citations, …
- Press releases, blog posts, tweets, …
- Users, downloads, sales, …
8
Treemaps: nested rectangles
9
Smartmoney MarketMap Feb 27, 2007
The green cell is A&P, the trading company that bought the Pathmark chain of stores; it was up 6% that day.
Source: smartmoney.com/marketmap
10
Cone trees: 3D interactive animations
Robertson, G. G., Card, S. K., and Mackinlay, J. D., Information visualization using 3D interactive animation, Communications of the ACM, 36, 4 (1993).
Robertson, G. G., Mackinlay, J. D., and Card, S. K., Cone trees: Animated 3D visualizations of hierarchical information, Proc. ACM SIGCHI Conference on Human Factors in Computing Systems, ACM Press, New York (April 1991).
11
Hyperbolic trees: focus & context
Lamping, J. and Rao, R., Laying out and visualizing large trees using a hyperbolic space, Proc. 7th Annual ACM Symposium on User Interface Software and Technology, ACM Press, New York (1994).
Lamping, J., Rao, R., and Pirolli, P., A focus+context technique based on hyperbolic geometry for visualizing large hierarchies, Proc. SIGCHI Conference on Human Factors in Computing Systems, ACM Press, New York (1995).
12
Tree visualization publishing
[Charts: Trade Press Articles, Academic Papers, Patents]
Legend: TM = Treemaps, CT = Cone Trees, HT = Hyperbolic Trees
13
Tree visualization citations
[Charts: Academic Papers, Patents]
Legend: TM = Treemaps, CT = Cone Trees, HT = Hyperbolic Trees
14
Insights
- Emerging ideas may benefit from open access
- Compelling demonstrations with familiar applications help
- Many components to commercial success
- 2D visualizations w/ spatial stability successful
- Term disambiguation & data cleaning are hard
Shneiderman, B., Dunne, C., Sharma, P. & Wang, P. (2011), "Innovation trajectories for information visualizations: Comparing treemaps, cone trees, and hyperbolic trees", Information Visualization.
15
3. Case study: Business intelligence news
ProQuest 2000-2009

Term                               Frequency
hyperion                           3122
data mining                        889
business intelligence              434
knowledge mgmt.                    221
data warehouse                     207
data warehousing                   139
cognos                             112
competitive intelligence           86
electronic data interchange        69
decision support system            39
business process reengineering     36
data mart                          29
business analytics                 21
text mining                        19
predictive analytics               18
business performance mgmt          6
online analytical processing       5
knowledge discovery in database    1

Terms listed without a frequency: meta data, ad hoc query
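Counts like those in the table could come from a simple phrase-matching pass over the article text. The slides do not say how the terms were counted, so the following is only a minimal sketch under that assumption: the function name `count_terms` and the two sample sentences are invented for illustration, and matching is case-insensitive on whole phrases (plurals such as "data marts" would not count).

```python
import re
from collections import Counter


def count_terms(documents, terms):
    """Count case-insensitive whole-phrase occurrences of each term."""
    counts = Counter({term: 0 for term in terms})
    for term in terms:
        # \b anchors the phrase at word boundaries on both sides.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for text in documents:
            counts[term] += len(pattern.findall(text))
    return counts


# Invented sample documents, not taken from the ProQuest data set.
docs = [
    "Data mining and business intelligence tools are converging.",
    "The vendor sells a data warehouse product with data mining features.",
]
freq = count_terms(docs, ["data mining", "business intelligence", "data warehouse"])
for term, n in freq.most_common():
    print(term, n)
```

Sorting with `most_common()` gives the same descending order used in the table.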
16
PQ Business Intelligence 2000-2009 Co-occurrence of concepts with organizations
[Line chart: Frequency by Year]
Concept: Data Mining
Organizations: National Security Agency (NSA), White House, FBI, AT&T, American Civil Liberties Union, Electronic Frontier Foundation, Dept. of Homeland Security, CIA
17
PQ Business Intelligence 2000-2009 Co-occurrence of organizations
[Line chart: Frequency by Year]
Organizations: NSA (Natl. Security Agency), White House, AT&T, FBI, EFF, ACLU, CIA, Pentagon
18
Business Intelligence
Matrix showing Co-Occurrence of concepts and orgs.
19
Business Intelligence
(subset)
20
Business Intelligence 2000-2009: Data mining
NSA, CIA, FBI, White House, Pentagon, DOD, DHS, AT&T, ACLU, EFF, Senate Judiciary Committee
21
Business Intelligence 2000-2009:
Tech1: Google, Yahoo, Stanford, Apple
Tech2: IBM, Cognos, Microsoft, Oracle
Finance: NASDAQ, NYSE, SEC
NCR, MicroStrategy
22
Business Intelligence 2000-2009:
Air Force, Army, Navy, GSA, UMD*
23
Insights
- Useful groupings in PQ BI terms based on events and long-term collaborators
- Interactive line charts useful for looking at co-occurrence relationships over time
- Clustered heatmaps useful for overall co-occurrence relationships
stick.ischool.umd.edu
24
4. Case study: Pennsylvania innovations
Innovation relationships during 1990
- State & federal funding
- Patents (both strong and weak ties)
- Location
Connecting
- State & federal agencies
- Universities
- Firms
- Inventors
25
[Network diagram]
Legend: Patent, Tech, SBIR (federal), PA DCED (state), Related patent
Numbered categories: 2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states
27
Pharmaceutical/Medical
[Network diagram]
Labeled regions: No Location, Philadelphia, Navy, Pharmaceutical/Medical, Pittsburgh Metro, Westinghouse Electric
Legend: Patent, Tech, SBIR (federal), PA DCED (state), Related patent
Numbered categories: 2: Federal agency; 3: Enterprise; 5: Inventors; 9: Universities; 10: PA DCED; 11/12: Phil/Pitt metro cnty; 13-15: Semi-rural/rural cnty; 17: Foreign countries; 19: Other states
29
Insights
Meta-layouts useful for showing:
- Groups (clusters, attributes, manual)
- Relationships between them
User comments
- "We've never been able to see anything like this"
- "This is going to be huge"
30
5. STICK approach
NSF SciSIP Program
- Science of Science & Innovation Policy
- Goal: Scientific approach to science policy
The STICK Project
- Science & Technology Innovation Concept Knowledge-base
- Goal: Monitoring, Understanding, and Advancing the (R)Evolution of Science & Technology Innovations
31
STICK approach cont…
Scientific, data-driven way to track innovations
- Vs. current expert-based, time-consuming approaches (e.g., Gartner's Hype Cycle, tire track diagrams)
- Includes both concept and product forms
Study the innovation ecosystem
- Study relationships between organizations & people
- Both those producing & using innovations
stick.ischool.umd.edu
32
STICK Process (overview)
- Identify concepts: business intelligence, cloud computing, customer relationship management, health IT, web 2.0, electronic health records, biotech
- Query data sources: News, Dissertation, Academic, Patent, Blogs
- Processing: automatic entity recognition, crowd-sourced verification, co-occurrence networks
- Visualizing & analyzing: overall statistics, network evolution
- Sharing results
33
Process
[Diagram stages: Collecting, Cleaning, Processing, Visualizing & Analyzing, Collaborating]
34
Collecting
Identify Concepts
- Begin with target concepts: Business Intelligence, Health IT, Cloud Computing, Customer Relationship Management, Web 2.0, Personal Health Records, Nanotechnology
- Develop sub concepts from domain experts, wikis
Data Sources
- News, Dissertation, Academic, Patent, Blogs
35
Collecting (2)
Form & Expand Queries
ABS("customer relationship management" OR "customers relationship management" OR "customer relation management") OR TEXT(…) OR SUB(…) OR TI(…)
Scrape Results
36
Processing
Automatic Entity Recognition
- BBN IdentiFinder
- Extract the most frequent 25% of organization names
Crowd-Sourced Verification
- Assign to CrowdFlower
- Workers check organization names and sample sentences
Notes: We first extract the top 25% most frequently appearing organization names, then ask MTurkers to check whether they are correct. Each MTurker is given a string and the sentence in which it appears, and selects the string if he or she thinks it is an organization. Three MTurkers vote on each automatically extracted name together with the sentence that contains it. If at least two of them vote that the name is an organization name, we keep it as a correctly formatted name. People's names are not currently used in relation extraction.
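The verification rule described above (keep the most frequent quarter of extracted names, then accept a name when at least two of three workers agree) can be sketched as follows. This is an illustrative sketch, not the project's code: the helper names `top_quartile` and `verified`, the sample names, and the votes are all invented, and "top 25%" is assumed to mean the top quarter of distinct names ranked by frequency.

```python
import math
from collections import Counter


def top_quartile(names):
    """Return the most frequent 25% of distinct names (at least one)."""
    counts = Counter(names)
    keep = max(1, math.ceil(len(counts) * 0.25))
    return [name for name, _ in counts.most_common(keep)]


def verified(votes, min_yes=2):
    """Keep names that at least `min_yes` workers marked as organizations."""
    return [name for name, ballots in votes.items() if sum(ballots) >= min_yes]


# Invented output of an entity recognizer, including two false positives.
extracted = ["IBM", "IBM", "IBM", "Oracle", "Oracle", "Monday", "Cognos", "Q3"]
candidates = top_quartile(extracted)

# Invented votes: one True/False ballot per worker, three workers per name.
votes = {
    "IBM": [True, True, True],
    "Oracle": [True, False, True],
    "Monday": [False, True, False],
}
accepted = verified(votes)
print(candidates, accepted)
```

With three voters and a threshold of two, the rule is a simple majority vote, so a single mistaken worker cannot accept or reject a name alone.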
37
Processing (2)
Compute Co-Occurrence Networks
- Overall edge weights
- Slice by time to see network evolution
Output
- CSV
- GraphML
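A co-occurrence network with overall and per-year edge weights can be computed in a few lines. The sketch below assumes each document has already been reduced to a year and a set of recognized entities, and that an edge weight is the number of documents mentioning both entities; the slides do not define the weighting, and the function names and sample records are invented. Only CSV output is shown; GraphML could be written from the same edge list with a graph library.

```python
import csv
import io
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(documents):
    """Return (overall, by_year) edge weights.

    Each document is (year, entities). An edge weight is the number of
    documents in which both entities appear.
    """
    overall = Counter()
    by_year = defaultdict(Counter)
    for year, entities in documents:
        # Sorting makes (a, b) a canonical key for the undirected edge.
        for a, b in combinations(sorted(set(entities)), 2):
            overall[(a, b)] += 1
            by_year[year][(a, b)] += 1
    return overall, by_year


def to_csv(by_year):
    """Serialize the time-sliced edge list as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["year", "source", "target", "weight"])
    for year in sorted(by_year):
        for (a, b), weight in sorted(by_year[year].items()):
            writer.writerow([year, a, b, weight])
    return buf.getvalue()


# Invented records; the entity names echo the slides, the counts do not.
docs = [
    (2005, {"data mining", "NSA", "AT&T"}),
    (2006, {"data mining", "NSA"}),
    (2006, {"data mining", "FBI"}),
]
overall, by_year = cooccurrence(docs)
print(overall[("NSA", "data mining")])
print(to_csv(by_year))
```

Keeping one counter per year means the same edge list can be filtered to a single slice or summed for the overall network.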
38
Visualizing & Analyzing
Spotfire
- Import: CSV, Database
- Standard charts
- Multiple coordinated views
- Highly scalable
NodeXL
- Import: CSV, Spigots, GraphML
- Automate feature: batch analysis & visualization
- Excel 2007/2010 template
39
Shared data & analysis repositories
Online Research Community
- Share data, tools, results
- Data & analysis downloads
- Spotfire Web Player
- Communication
- Co-creation, co-authoring
stick.ischool.umd.edu/community
40
Ongoing Work
- Collecting: Additional data sources and queries
- Processing: Improving entity recognition accuracy
- Visualizing & Analyzing: Visualizing network evolution (co-occurrence network sliced by time)
- Collaborating: Develop the STICK Open Community site
  - Motivate user participation
  - Improve the resources available
  - Invitation-only testing
41
Outline
1. Academic literature exploration
   Citation networks and text summarization
2. Case study: Tree visualization techniques
   Papers, patents, and trade press articles
3. Case study: Business intelligence news
   News term co-occurrence
4. Case study: Pennsylvania innovations
   Patents, funding, and locations
5. STICK approach
   Tracking innovations across papers, patents, news articles, and blog posts
42
Take Away Messages
Easier scientific, data-driven innovation analysis:
- Automatic collection & processing of innovation data
- Easy access to visual analytic tools for finding clusters, trends, outliers
- Communities for sharing data, tools, & results
43
Visual analytic tools for monitoring and understanding the emergence and evolution of innovations in science & technology
Links from this talk: bit.ly/stmwant
Cody Dunne
Dept. of Computer Science and Human-Computer Interaction Lab, University of Maryland
This work has been partially supported by NSF grants IIS (ASE) and SBE (STICK)