Published by Lucinda Hubbard. Modified over 9 years ago.
1
The Emergence of Data Science: Why Now? Ike Nassi (With contributions from Andrew McAfee, MIT Sloan) 17-Oct 2013 BSOE Research Day
2
What this talk is all about. Convince you that: There is a need. We have some tools. We need new approaches. We can’t do it all ourselves. Evidence-based decision making is important. And it needs more attention. It will happen anyway.
3
Outline: Societal, Economic, Technological
4
A Short Story – Point of View 1984. Configuration = 0. Configuration ≠ 0.
5
The Future: Hard to Predict Accurately iWatch? Skynet?
6
Changes happen faster than we think!
7
How well can experts predict?
8
2012 Political Campaign. “Bottom line: Romney 315, Obama 223. That sounds high for Romney. But he could drop Pennsylvania and Wisconsin and still win the election. Fundamentals.” Barone: Going out on a limb: Romney beats Obama, handily (315 to 223). The Washington Examiner | 11/2/12 | Michael Barone. slide by Andrew McAfee (MIT)
9
What about the experts? slide by Andrew McAfee (MIT)
10
A Meta-Study Scorecard: 136 studies of expert vs. algorithmic prediction. Tossup: 65 (48%). Algorithm Clearly Better: 63 (46%). Experts Clearly Better: 8 (6%). slide by Andrew McAfee (MIT)
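The scorecard percentages can be checked with a few lines of arithmetic. This is only a minimal sketch: the three counts and the total of 136 come from the slide, while the variable names are illustrative assumptions.

```python
# Counts as they appear on the slide; the total of 136 studies is also from the slide.
TOTAL_STUDIES = 136
counts = [65, 63, 8]

# The three outcome categories should account for every study.
assert sum(counts) == TOTAL_STUDIES

# Share of each category, rounded to the nearest whole percent.
percentages = [round(100 * c / TOTAL_STUDIES) for c in counts]
print(percentages)
```

The rounded shares agree with the 48%, 46%, and 6% shown on the slide; they add up to 100 only because of rounding.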
11
The Digital Frontier Keeps Expanding (slide contributed by Andy McAfee, MIT) Source: “Building Watson: It’s not so elementary, my dear” – W. Shih. HBS case #9-612-017
12
(slide contributed by Andrew McAfee, MIT) Ken Jennings
13
Why is Data Science happening now?
14
We can collect “Big Data” slide by Andrew McAfee (MIT)
15
Big Data slide by Andrew McAfee (MIT)
16
What can Economics tell us? We are collecting a lot more data, but… We are facing a rapidly changing economic landscape. And we are not very good at controlling the economy. Who is going to analyze it?
20
Capital vs. Labor Source: Federal Reserve Bank of St. Louis, Economic Research slide by Andrew McAfee (MIT)
21
Recent Trends Shaded areas indicate recessions slide by Andrew McAfee (MIT)
22
Recent Trends Shaded areas indicate recessions slide by Andrew McAfee (MIT)
23
Skill Disparities Source: http://econ-www.mit.edu/~dautor/hole-vol4/figs/fig-04.zip slide by Andrew McAfee (MIT)
24
Superstars Source: http://emlab.berkeley.edu/users/saez/piketty-saezOUP04US.pdf
25
How to effect change Make the experts more effective
26
Proactive and Reactive Approaches. Collect data, predict, act (proactive). E.g. Evidence-based medicine. Build systems that collect data, create feedback loops (reactive). E.g. Human body. Both are needed. [Diagram labels: Proactive, Analysis, Reactive]
27
Technology Requirements. Data sizes for data under management are monotonically increasing. Who wants less data? Our appetite for analysis is monotonically increasing. Do you think, or do you know? Trend toward evidence-based management. Our appetite for speed is monotonically increasing. Who wants questions answered more slowly? Hence the industry interest in in-memory data management systems. Our overall ability to manage complexity is not increasing.
28
Technology To Support Data Science. Processor speeds are limited. Processor core density has been increasing at a healthy rate. Memory density is increasing (but at a lower rate than core density)! Therefore, the memory/core ratio is going in the wrong direction! We haven’t significantly changed the memory/storage hierarchies for decades. Interconnects are getting faster – as fast as memory access? Memory access is slow; caches are fast!
29
Memory-Density/Core-Density Declining…
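The declining ratio follows directly from the two growth rates: if core density compounds faster than memory density, memory per core shrinks every year. Here is a minimal sketch of that arithmetic. The starting ratio and both growth rates are hypothetical placeholders chosen for illustration, not figures from the talk.

```python
def memory_per_core(initial_ratio, mem_growth, core_growth, years):
    """Return the memory/core ratio for year 0 through `years`.

    Both densities are assumed to compound annually at fixed rates,
    so the ratio changes by a factor of (1 + mem_growth) / (1 + core_growth)
    each year.
    """
    yearly_factor = (1 + mem_growth) / (1 + core_growth)
    return [initial_ratio * yearly_factor ** t for t in range(years + 1)]


# Hypothetical inputs: 4 GB per core today, memory density growing 20% a year,
# core density growing 40% a year.
ratios = memory_per_core(initial_ratio=4.0, mem_growth=0.20, core_growth=0.40, years=10)

for year, ratio in enumerate(ratios):
    print(f"year {year:2d}: {ratio:.2f} GB per core")
```

Whenever `mem_growth` is below `core_growth`, the yearly factor is below 1 and the printed ratio falls every year, which is the "wrong direction" the slides describe.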
30
Technological Solutions. It’s in our nature to tackle more ambitious problems. Need faster answers. SAP, Oracle, Neo4j, Objectivity, etc. More in-memory solutions (e.g. NYSE/Euronext – Steve Rubinow). Cannot get faster processors, but we can get more of them. But: parallelism is difficult. Legacy software is a huge problem. Need more machine learning, therefore, feedback.
31
What about memory?
32
Scaling out. When all you have is a hammer, every problem looks like a nail. Or, in my case, a thumb! Today we rely almost exclusively on “scale-out” systems, because that’s the main way we add processors and memory. Shard the data, intelligently target the queries – time consuming. It’s not easy to query partitioned databases. What is the best way to do it? Moving data is time-consuming. And you might have to change it. What if you could build systems that “scale-up”?
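To make "shard the data, intelligently target the queries" concrete, here is a minimal sketch of hash partitioning in a scale-out design. Everything in it is an illustrative assumption rather than anything shown in the talk: the shard count, the `order-N` keys, and the helper names. A lookup by key can be sent to a single shard, while a query on values has to visit every shard and combine the partial results.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard index using a deterministic hash (CRC-32)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def build_shards(records, num_shards=NUM_SHARDS):
    """Partition a dict of records into one dict per shard."""
    shards = [dict() for _ in range(num_shards)]
    for key, value in records.items():
        shards[shard_for(key, num_shards)][key] = value
    return shards


def point_lookup(shards, key):
    """Targeted query: only the shard that owns the key is consulted."""
    return shards[shard_for(key, len(shards))].get(key)


def scatter_gather_sum(shards, predicate):
    """Untargeted query: every shard computes a partial sum, then we combine."""
    partials = [sum(v for v in shard.values() if predicate(v)) for shard in shards]
    return sum(partials)


# Hypothetical data set: 100 orders with amounts 0, 10, 20, ...
records = {f"order-{i}": i * 10 for i in range(100)}
shards = build_shards(records)

print(point_lookup(shards, "order-7"))
print(scatter_gather_sum(shards, lambda amount: amount >= 500))
```

Changing `NUM_SHARDS` reassigns most keys to different shards, which is one reason moving data in a sharded system is time-consuming.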
33
What I’m doing about this. Enabling systems that scale-up (TidalScale Inc. mission). Software that sits below an operating system but above the hardware, that aggregates a set of servers together and runs that collection as a single virtual server running a single conventional operating system: dynamic scaling at linear cost; supporting unmodified legacy software and legacy operating systems; automatically, dynamically and hierarchically optimizing processors, memory, networks, and storage systems through machine learning; automatically evolving as hardware evolves. The computer begins to learn what it needs to do to manage itself!
34
Why Data Science Now? NEED: the future is increasingly complex and difficult to predict. NEED: we don’t have enough qualified experts, and experts often get it wrong. RAW MATERIALS: we are collecting huge amounts of data at an increasing rate. ENABLER: new hardware and software tools are emerging. THEREFORE: Data science is inevitable! We don’t have a choice.
35
What are the implications? Danny Hillis, inventor of the Connection Machine: “I want to build a computer that will be proud of me.” What about Skynet? Let’s leave that discussion for another day…
36
The Second Machine Age Andrew McAfee, MIT amcafee@mit.edu @amcafee
37
Thank you Ike Nassi UCSC Computer Science inassi@ucsc.edu and TidalScale, Inc. ike.nassi@tidalscale.com
38
Complexity Is the world getting more complex? Understanding complex systems: because we want to? Because we need to? Does it matter? Who manages our understanding of complexity? Who analyzes it? Is the world getting more complex, or are we making it more complex for ourselves based on a false sense of what we truly know?