BIG DATA BIG DATA The next frontier for emerging market USC CSSE Annual Research Review March 14, 2013 Rachchabhorn Wongsaroj Bank of Thailand, Visiting.


1 BIG DATA: The next frontier for emerging market. USC CSSE Annual Research Review, March 14, 2013. Rachchabhorn Wongsaroj, Bank of Thailand, Visiting Scholar @ USC

2 Outline • Current situation • What is big data? • Why big data is important? • Big data cases • Research challenges • Big data in Thailand • Future research

3 Current Situation. Lots of data is being created and collected (global data). Problems: data quantity, data quality, data variety, data timeliness.

4 What is big data? Big Data = Volume, Variety and Velocity. People to people, people to machine, machine to machine. 8 billion messages/day (845M active users); 340 million tweets/day (140M active users); 20 hours of video uploaded every minute. Source: Gartner & IBM

5 Emerging Technologies Hype Cycle 2011 (Gartner) Why big data is important?

6 Emerging Technologies Hype Cycle 2012 (Gartner)

7 Source: McKinsey Global Institute Analysis Why big data is important?

8 Big data can generate significant financial value across sectors (Source: McKinsey Global Institute Analysis)
US Health Care: $300 billion value/year; ~0.7% annual productivity growth
Europe Public Sector Administration: €250 billion value/year; ~0.5% annual productivity growth
Global Personal Location Data: $100 billion+ revenue for service providers; up to $700 billion value to end users
US Retail: 60+% increase in net margin possible; 0.5-1.0% annual productivity growth
Manufacturing: up to 50% decrease in product development; up to 7% reduction in working capital

9 Why big data is important? Health Care sector has potential to invest $300B (Source: US Department of Labor)
Clinical (transparency in clinical data and clinical decision support): 49%, $165B
R&D (personalized medicine, clinical trial design): 32%, $108B
Accounts (advanced fraud detection, performance-based drug pricing): 14%, $47B
Public health (surveillance and response systems): 3%, $9B
Business model (aggregation of patient records, online platform and communities): 2%, $5B

10 Big data cases
Data sources / Techniques | Output
Google patient search data, predictive model, etc. | Hospitalization pattern, customized insurance
Advanced analytic solutions | Process time reduction
Customer transactions | Customer defection prediction
Trading transactions & IP address | Possible frauds, financial bubble, money laundering
Real-time people & location data | Crime and terrorist prevention
Product search pattern, social media | Website outage/peak time support, travel trend and pattern

11 Research Challenges: big data retail levers by function (Source: McKinsey Global Institute Analysis)
Marketing: cross-selling; location-based marketing; in-store behavior analysis; customer micro-segmentation; sentiment analysis; enhancing the multichannel consumer experience
Merchandising: assortment optimization; pricing optimization; placement and design optimization
Operations: performance transparency; labor inputs optimization
Supply Chain: inventory management; distribution and logistics optimization; informing supplier negotiations
New Business Model: price comparison services; web-based markets
Highlighted on the slide: customer micro-segmentation, sentiment analysis, performance transparency, labor inputs optimization, price comparison services

12 Big data in Thailand: Challenges • Language • Cost of implementation • Magnitude of data • Demographic data generator • Data type

13 Language (natural language processing) • No spaces between words • Mixture of Thai and foreign languages in the same text • Lack of Thai text analytics components
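A common baseline for text written without word spaces is dictionary-based longest matching. The sketch below only illustrates that idea and is not a method taken from the presentation. The three-word dictionary and the sample sentence are toy assumptions; real Thai text needs a full lexicon plus handling of ambiguous splits and embedded foreign words.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation for text that has no word spaces."""
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, shrinking until a dictionary word fits.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            # Unknown character (e.g. a foreign letter): emit it on its own.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Toy dictionary (assumption): "I", "eat", "rice".
TOY_DICTIONARY = {"ฉัน", "กิน", "ข้าว"}
SAMPLE = "ฉันกินข้าว"

tokens = segment(SAMPLE, TOY_DICTIONARY)
print(tokens)
```

Greedy matching is easy to implement but can pick the wrong split when one word is a prefix of another, which is one reason dedicated Thai text analytics components matter.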

14 Big data in Thailand: Cost of implementation. 13 big data vendors in 2013. Hadoop requires ~$1 million for a cluster of between 125 and 250 nodes; distribution annual costs are ~$4,000 per node -> a small fraction of the cost of an enterprise data warehouse ($10s to $100s of millions).
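As a rough arithmetic check on the figures above, the annual distribution cost can be worked out for the two cluster sizes mentioned. This is a sketch that assumes the ~$4,000 figure applies per node per year and scales linearly with cluster size; the function name is hypothetical.

```python
COST_PER_NODE_USD = 4_000   # annual distribution cost per node, from the slide
CLUSTER_SIZES = (125, 250)  # node range quoted on the slide


def annual_distribution_cost(nodes, cost_per_node=COST_PER_NODE_USD):
    """Annual cost in USD, assuming linear scaling with node count."""
    return nodes * cost_per_node


low, high = (annual_distribution_cost(n) for n in CLUSTER_SIZES)
print(f"${low:,} to ${high:,} per year")  # prints "$500,000 to $1,000,000 per year"
```

Under these assumptions the yearly distribution cost stays at or below about $1 million, well under the $10s to $100s of millions quoted for an enterprise data warehouse.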

15 Big data in Thailand: Magnitude of data (as of September 2012). Overseas bandwidth: 405,860 Mbps. Local bandwidth (.th, or.th, etc.): 1,006,140 Mbps. 25% use smartphones; 8% use tablets; 60% use local bandwidth. Chart labels: 44%, 31%, 14%, 9%.

16 Big data in Thailand: Demographic data generator. Population: 65M; Internet users: 25M (39% of the population use the Internet). 85.9% of data is created by Internet users aged 6-24. Most data comes from the younger generations.

17 Big data in Thailand: Types of data, limited application of big data techniques. Only 2.12% focus on education. Source: http://www.prd.go.th/ewt_news.php?nid=23168

18 Bank of Thailand (BOT) Website – As is (Source: Bank of Thailand). Diagram labels: Financial institution, BOT data (Internet/Extranet), DB 1, DB 2, DB 3, Manual Checking, Template Input, Manual Submit, BTWS Working, BOT Website, Auto Submit. Problems • Too many steps • Once due: act first, fix later • Too many stakeholders • Bureaucratic management style

19 BOT data website – As is (Source: Bank of Thailand). Diagram labels: Input Data, Complex Validation, Cross Validation, Manual Check, Query Data (BO), Input Template, Manual Submit, Website Approve, Manual Checking. Concerns: Timeliness, Revision Policy, Accuracy & Reliability; Volume, Variety, Velocity.

20 BOT data website – To be (Source: Bank of Thailand). Diagram labels: Input Data, Complex Validation, Cross Validation, System Checking, System Warning, System Approve, Website Approve, Manual Checking, Accuracy & Reliability.
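The move from manual checking to system checking can be pictured as a small rule engine that either approves a submission or raises a warning. The sketch below is purely illustrative and does not describe the Bank of Thailand's actual system: the field names (`total_loans`, `secured_loans`, `unsecured_loans`), the rules, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical report fields, used only for illustration.
FIELDS = ("total_loans", "secured_loans", "unsecured_loans")


@dataclass
class CheckResult:
    status: str                                   # "approve" or "warning"
    messages: list = field(default_factory=list)  # reasons for a warning


def complex_validation(record):
    """Field-level rules: every field must be present and non-negative."""
    problems = []
    for name in FIELDS:
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif value < 0:
            problems.append(f"{name}: negative value {value}")
    return problems


def cross_validation(record):
    """Cross-field rule: the parts must add up to the reported total."""
    try:
        parts = record["secured_loans"] + record["unsecured_loans"]
        total = record["total_loans"]
    except (KeyError, TypeError):
        return []  # missing fields are already reported by complex_validation
    if parts != total:
        return [f"total_loans {total} != secured + unsecured {parts}"]
    return []


def system_check(record):
    """Run all rules; approve automatically only when nothing is flagged."""
    messages = complex_validation(record) + cross_validation(record)
    return CheckResult("warning" if messages else "approve", messages)


ok = system_check({"total_loans": 100, "secured_loans": 60, "unsecured_loans": 40})
bad = system_check({"total_loans": 100, "secured_loans": 60, "unsecured_loans": 50})
print(ok.status, bad.status, bad.messages)
```

Keeping each rule as a separate function means a warning can say exactly which check failed, so any remaining manual checking can focus on flagged submissions only.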

21 Future research • Data quality management • Tools • Template • Checklist • Process

22 References • Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute • Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, IBM • Gartner Report • Thailand National Statistical Office • Thailand Digital Statistic Source • Bank of Thailand (www.bot.or.th)

23 BIG DATA The next frontier for emerging market Rachchabhorn Wongsaroj Bank of Thailand Visiting Scholar @ USC Thank you Q & A

24 Big data technology
Google: Google File System (GFS), MapReduce (MR) programming model
Hadoop: uses Google's big data infrastructure as described in papers: GFS -> Hadoop Distributed File System (HDFS); MapReduce -> Hadoop MapReduce; etc.
Higher-level languages: Pig (Yahoo!), Jaql (IBM), Hive (Facebook); mostly use the MR technique (Pig >60%, Hive QL >90%)
Bing: Cosmos, Dryad, DryadLINQ, SCOPE
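The MapReduce programming model can be illustrated with the classic word count. The sketch below runs in a single Python process and only mimics the map, shuffle, and reduce phases; it does not use Hadoop's API, and the function names and sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "big data big value"])
print(counts)  # prints {'big': 3, 'data': 2, 'value': 1}
```

In a real cluster the map and reduce functions run in parallel on many nodes and the framework handles the shuffle; higher-level languages such as Pig and Hive generate jobs of this shape from queries.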

25 Big data technology: Relational DBMSs Versus MapReduce/Hadoop (Source: Kimball Group, April 2011)
Relational DBMSs | MapReduce/Hadoop
Proprietary, mostly | Open source
Expensive | Less expensive
Data requires structuring | Data does not require structuring
Great for speedy indexed lookups | Great for massive full data scans
Deep support for relational semantics | Indirect support for relational semantics, e.g., Hive
Indirect support for complex data structures | Deep support for complex data structures
Indirect support for iteration, complex branching | Deep support for iteration, complex branching
Deep support for transaction processing | Little or no support for transaction processing

26 What is big data? "Big data exceeds the reach of commonly used hardware environments and software tools to capture, manage, and process it within a tolerable elapsed time for its user population" (Teradata Magazine article, 2011). "Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze" (The McKinsey Global Institute, 2011).

27 Why big data is important?

