Data Mining: Crossing the Chasm Rakesh Agrawal IBM Almaden Research Center.



Thesis
  The greatest challenge facing data mining is making the transition from an early-market technology to a mainstream technology
  We have the opportunity to make this transition successful

Outline
  Chasm in the technology adoption life cycle, à la Geoffrey Moore †
  Experience with Quest/Intelligent Miner
  Ideas for successful chasm crossing
† Geoffrey A. Moore. Crossing the Chasm. Harper Business.

Technology Adoption Life Cycle
  Innovators (techies): Try it!
  Early Adopters (visionaries): Get ahead of the herd!
  Early Majority (pragmatists): Stick with the herd!
  Late Majority (conservatives): Hold on!
  Laggards (skeptics): No way!
  The psychographic profile of each group is different

Innovators: Technology Enthusiasts
  Intrigued by any fundamental advance in technology
  Like to alpha test new products
  Can ignore the missing elements
  Want access to top technologists
  Want no-profit pricing (preferably free)
  Gatekeepers to early adopters

Early Adopters: Visionaries
  Driven by a vision of dramatic competitive advantage via revolutionary breakthroughs
  Great imagination for strategic applications
  Not so price-sensitive
  Want rapid time to market
  Demand a high degree of customization
  Fund the development of the early market

Early Majority: Pragmatists
  Want sustainable productivity improvement through evolutionary change
  Astute managers of mission-critical apps
  Understand real-world issues and tradeoffs
  Focus on proven applications; want to see the solution in production
  Bulwark of the mainstream market

Late Majority: Conservatives
  Want to stay even with the competition
  Risk averse
  Price sensitive
  Need completely pre-assembled solutions
  Extend technology life cycles

Laggards: Skeptics
  Driven to maintain the status quo
  Good at debunking marketing hype
  Disbelieve productivity-improvement arguments
  Can be formidable opposition to early adoption of a technology
  Retard the development of high-tech markets

Crack in the curve
  [Diagram: adoption curve with the Chasm between the Early Market and the Mainstream Market]
  The greatest peril in the development of a high-tech market lies in making the transition from an early market dominated by a few visionaries to a mainstream market dominated by pragmatists.

Visionaries vs. Pragmatists
  Visionaries              Pragmatists
  Adventurous              Prudent
  First strike capability  Staying power
  Early buy-in             Wait-and-see
  State of the art         Industry standard
  Think big                Manage expectations
  Spend big                Spend to budget

Is data mining following this curve?
  Yes!!!
  My personal viewpoint, based on Quest/Intelligent Miner experience

Quest
  Started as a skunkworks project in the early nineties
  Inspired by needs articulated by industry visionaries:
    – Transaction data collected over a long period
    – Current tools/SQL don’t cut it
    – About ready to throw the data away

Approach
  Examine “real” applications
  Identify operations that cut across applications
  Design fast, scalable algorithms for each operation
  Develop applications by composing operations

Operations
  New operations (completeness, scalability):
    – Associations
    – Sequential patterns
    – Similar time series
  Adopted from statistics/learning (scalability):
    – Classification
    – Clustering
    – Deviations
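
To make the associations operation concrete, here is a minimal sketch that computes support and confidence for one rule over a handful of made-up transactions. The data and the function names are assumptions chosen for illustration; they are not taken from Quest, and a real implementation would need the scalable algorithms the slides refer to.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical market-basket transactions (illustrative data only).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diapers", "beer"}),
    frozenset({"milk", "diapers", "beer"}),
    frozenset({"bread", "milk", "diapers"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Support of (antecedent union consequent) divided by support of antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # rule never applies, so report zero confidence
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / base
```

For the rule "diapers implies beer", `support({"diapers", "beer"}, TRANSACTIONS)` measures how often the two items occur together, and `confidence({"diapers"}, {"beer"}, TRANSACTIONS)` measures how often beer appears given diapers.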

Bringing Quest to market
  Visionaries who inspired Quest did not become first customers:
    – Wanted evidence that the technology “worked”
  Frustrating attempts to interest major IBM customers:
    – Integration with existing applications
    – Too-far-out technology
    – Resistance from in-house analytic groups

First hits
  Small information-based companies that provided data in exchange for free results
  CIO who wanted to be seen as the technology pioneer in his industry
  CIO who wanted the success story to feature in the company’s annual report
  Led to the formation of a group offering services using Quest

Characteristics of engagements
  Mostly associations and sequential patterns
  Completeness a big plus
  Unanticipated uses
  Feedback for further development

Into the product land
  Formation of a small “out-of-plan” product group to productize Quest
  Facilitated by a closet mathematician
  Successes of the services group used for market validation
  Continued development and infusion of technology

Intelligent Miner
  Serious product
  Integrates technologies from various groups
  Fast, scalable, runs on multiple platforms
  Several “early market” success stories

Are we in the chasm?
  Perceived to be sophisticated technology, usable only by specialists
  Long, expensive projects
  Stand-alone, loosely coupled with data infrastructures
  Difficult to infuse into existing mission-critical applications

Chasm Crossing
  Personal speculations on some technical challenges
  These do not imply IBM research/product directions

XML-based Data Mining Standard (1)
  Model Building:
    – A pair of standard DTDs for each operation
    – Interchangeable library of operator implementations
  [Diagram labels: Operator, Model, Parameters, Data Specs, Standard DTD, Library]
  Ack: Mattos, Pirahesh, Schwenkreis

XML-based Data Mining Standard (2)
  Model Deployment:
    – A mapping XML object maps the names and formats in the model object to those in the data record
    – Model could have been developed on a different system
  [Diagram labels: Application, Result, Mapping, Standard DTDs, Standard DTD, Library, Model, Data Record]
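
The deployment idea can be sketched as follows: a model document built elsewhere is applied to a local data record through a separate mapping document. The element names, attribute names, and column names below are all hypothetical, since the slides do not give the DTDs, and this sketch maps field names only, not value formats.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical model document, possibly produced on a different system.
MODEL_XML = """
<RuleModel>
  <Rule label="responder">
    <Condition field="age_band" value="young"/>
    <Condition field="income" value="high"/>
  </Rule>
  <Default label="non-responder"/>
</RuleModel>
"""

# Hypothetical mapping document: model field name -> column name used by
# the deploying application's data records.
MAPPING_XML = """
<Mapping>
  <Field model="age_band" record="CUST_AGE_GRP"/>
  <Field model="income" record="CUST_INC"/>
</Mapping>
"""


def load_mapping(xml_text: str) -> Dict[str, str]:
    """Parse the mapping document into {model_field: record_column}."""
    root = ET.fromstring(xml_text)
    return {f.get("model"): f.get("record") for f in root.findall("Field")}


def score(record: Dict[str, str], model_xml: str = MODEL_XML,
          mapping_xml: str = MAPPING_XML) -> str:
    """Return the label of the first rule whose conditions all match the record."""
    mapping = load_mapping(mapping_xml)
    model = ET.fromstring(model_xml)
    for rule in model.findall("Rule"):
        if all(record.get(mapping[c.get("field")]) == c.get("value")
               for c in rule.findall("Condition")):
            return rule.get("label")
    return model.find("Default").get("label")
```

Because the model refers only to its own field names, the same model document could be reused against a differently named schema by swapping in another mapping document.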

Implications
  Standard interfaces for application developers to incorporate data mining
  Coupling with relational databases
    – mappings from DTDs to relational schemas
    – implementation using existing infrastructure

Data Mining Benchmarks
  UC Irvine repository
  Generating synthetic benchmarks modeled after real data sets is a hard problem
    – How to map names into meaningful literals
    – How to preserve empirical distributions
  Ack: Srikant, Ullman
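
A deliberately naive sketch shows why the two sub-problems are hard. It resamples a single column according to its empirical frequencies and replaces real names with opaque aliases. The function name and alias scheme are assumptions for illustration; the aliases are not meaningful literals, and only the marginal distribution of one column is preserved, so correlations across columns are lost.

```python
import random
from collections import Counter
from typing import List, Sequence


def synthesize(values: Sequence[str], n: int, seed: int = 0) -> List[str]:
    """Draw n synthetic values following the empirical frequencies of `values`,
    with each distinct real name replaced by an anonymous alias."""
    counts = Counter(values)
    # Alias names by frequency rank: the most common value becomes "item_0".
    aliases = {name: f"item_{rank}"
               for rank, (name, _) in enumerate(counts.most_common())}
    names = list(counts)
    weights = [counts[name] for name in names]
    rng = random.Random(seed)  # seeded so the benchmark is reproducible
    return [aliases[name] for name in rng.choices(names, weights=weights, k=n)]
```

Calling `synthesize(real_column, 1000)` yields a column with roughly the same value frequencies as `real_column` but none of its original names.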

Auto-focus data mining
  Automatic parameter tuning
  Automatic algorithm selection (à la join method selection in database query optimization)
  Ack: Andreas Arning
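
By analogy with a query optimizer choosing a join method, algorithm selection could be driven by a cost estimate computed from statistics about the data. The sketch below assumes two hypothetical candidate algorithms and placeholder cost formulas; none of these names or numbers come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class DataStats:
    """Summary statistics an optimizer might keep about a data set."""
    rows: int
    columns: int
    fits_in_memory: bool


# Hypothetical cost estimators, one per candidate algorithm. The formulas are
# placeholders; a real system would calibrate them against measurements.
COST_MODELS: Dict[str, Callable[[DataStats], float]] = {
    "in_memory_classifier": lambda s: (
        float(s.rows * s.columns) if s.fits_in_memory else float("inf")
    ),
    "disk_based_classifier": lambda s: 5.0 * s.rows * s.columns,
}


def choose_algorithm(stats: DataStats,
                     cost_models: Dict[str, Callable[[DataStats], float]] = COST_MODELS
                     ) -> str:
    """Pick the candidate with the lowest estimated cost for this data set."""
    return min(cost_models, key=lambda name: cost_models[name](stats))
```

The user states what to mine, and the system picks how, in the same way that SQL users do not name a join method.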

Web: Greatest opportunity
  Huge collection of data (e.g., Yahoo collecting ~50 GB every day)
  Universal digital distribution medium makes data mining results actionable in fundamentally new ways
  But watch for privacy pitfalls

Privacy-preserving data mining
  Technical vs. legislated solutions
  Implications for data mining algorithms when some fields of a data record have been fudged according to the user’s privacy sensitivity
  Ack: R. Srikant
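
One possible reading of "fudged" is value perturbation: each user adds random noise to a sensitive field, with a wider noise range for higher sensitivity. The sketch below assumes zero-mean uniform noise, which is an illustrative choice and not something the slides specify. It shows that an aggregate such as the mean remains estimable even though individual values are hidden.

```python
import random
from typing import List, Sequence


def perturb(value: float, sensitivity: float, rng: random.Random) -> float:
    """Return the value plus uniform noise drawn from [-sensitivity, +sensitivity]."""
    return value + rng.uniform(-sensitivity, sensitivity)


def estimate_mean(perturbed: Sequence[float]) -> float:
    """The noise has zero mean, so the sample mean of perturbed values
    is an unbiased estimate of the true mean."""
    return sum(perturbed) / len(perturbed)


def perturb_all(values: Sequence[float], sensitivity: float,
                seed: int = 0) -> List[float]:
    """Perturb every value with the same sensitivity, using a seeded generator."""
    rng = random.Random(seed)
    return [perturb(v, sensitivity, rng) for v in values]
```

Algorithms that need more than a mean, such as classifiers that split on value ranges, would have to reconstruct the underlying distribution from the perturbed values, which is the harder question the slide raises.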

Personalization
  The Internet might provide, for the first time, the tools necessary for users to capture information about themselves and to selectively release this information †
  Will we be providing these tools?
† John Hagel, Marc Singer. Net Worth. Harvard Business School Press.

What about Association Rules?
  Very long patterns
  Separating wheat from chaff
  Principled introduction of domain knowledge

What else?
  Formal foundations of data mining

Summary
  Closely couple data mining with database systems
  Embed data mining into applications
  Focus on the web
  Standard interfaces
  Benchmarks
  Auto-focusing
  Personalization
  Privacy

Concluding remarks
  Data mining, a great technology
    – Combination of intriguing theoretical questions with large commercial interest in the technology
  Poised for transitioning into mainstream technology
  Will we rise to the challenge as a community?

Acknowledgments