TDWI EXECUTIVE SUMMIT
From Traditional to Modern: How Rakuten Marketing Realized the Promise of a New Generation of BI
September 21, 2015
Donald Krapohl, Scott Wallace
About us; Problem; Environment; Engineering the Change; Process & Product; Outcomes; Lessons learned; Questions
Problem
Cost: Cost to scale
Capability: Limited by technology
Agility: Not built for change; Acquiring new companies
Alignment: Silos; Tech fragmentation
Environment
Global sensors: merchant feeds, website feeds, purchases, clicks, web logs, AR/AP, content metadata
Global audience: Global sensor network; Trillions of rows; Multinational data consumers
Compartmented Data: By business; By app
Technical limitations: Legacy at capacity
Engineering the Change
Described Minimum Viable Product (MVP)
Set MVP dates
Established minimal controls
Built prototype as dev environment
Extrapolated IaaS needs from prototype
Built and tested production core cluster
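One way to read the "Extrapolated IaaS needs from prototype" step is as a simple linear scaling exercise. The sketch below is only an illustration under that assumption: the function name `estimate_nodes`, the `headroom` parameter, and all numbers are hypothetical and do not come from the presentation.

```python
import math

def estimate_nodes(prototype_nodes, prototype_rows, target_rows, headroom=0.25):
    """Scale a prototype cluster linearly to a target data volume.

    Assumes (hypothetically) that capacity grows linearly with node count
    and that a fixed fraction of spare capacity is kept as headroom.
    """
    if prototype_nodes <= 0 or prototype_rows <= 0:
        raise ValueError("prototype measurements must be positive")
    rows_per_node = prototype_rows / prototype_nodes
    raw_nodes = target_rows / rows_per_node
    # Round up: a fractional node still has to be provisioned as a whole one.
    return math.ceil(raw_nodes * (1 + headroom))

# Hypothetical figures: a 4-node prototype holding 1 billion rows,
# sized up for 10 billion rows with 25% headroom.
nodes = estimate_nodes(4, 1_000_000_000, 10_000_000_000, headroom=0.25)
print(nodes)  # prints 50
```

Real workloads rarely scale perfectly linearly, so an estimate like this is best treated as a starting point to validate against the production core cluster.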
Engineering the Knowledge
Built the skills:
Didn’t overcomplicate
Prototyped first on cheap cloud
Automated build/config/deploy
Process & Product
Design; Process; Product; Infrastructure; Software
Performance; Data Recency; Cost; Scalability; Fault Tolerance
Infrastructure
Key Characteristics: High Scalability; Low Cost; High Availability; Fault Tolerance
Component | Traditional | Next Generation
Hardware | Expensive Appliances (Exadata), On Premises | Commodity, In Cloud (AWS)
Maintenance | Specialized Personnel, Consultants | DevOps, In House
Software
Key Characteristics: Open Source; Stable; Feature Rich; Active Community Development
Component | Traditional | Next Generation
Service Management | Disparate | Data/Service Hub (Cloudera Manager)
Data Storage | Enterprise RDBMS (Oracle 11g) | HDFS (Hadoop, Hive)
ETL | Enterprise ETL Tool (Informatica) | Realtime Computation (Storm), Bulk Data Load (Sqoop), Analytic Database (Impala)
BI | Enterprise BI Tool (OBIEE) | BYOBI, Interactive Analysis (Impala), Data Virtualization (Teiid), Custom API
Design
Key Characteristics: High Performance; Near Realtime Data Processing; Adaptable
Component | Traditional | Next Generation
Data Warehouse | Kimball |
Data Processing | Batch, Incremental ETL | Lambda, Realtime Streaming, Batch
Software Development Methodologies | Waterfall | Agile, Test Driven Development, Continuous Integration
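The "Lambda" entry refers to the general Lambda architecture pattern: a batch layer periodically recomputes views over all data, a speed layer covers events that arrived since the last batch run, and queries merge the two. The sketch below illustrates only that general idea with in-memory Python rather than the Storm, Sqoop, and Impala components named in the deck; the event data, function names, and cutoff value are all hypothetical.

```python
from collections import Counter

# Hypothetical click events: (timestamp_seconds, merchant_id).
EVENTS = [
    (100, "m1"), (160, "m2"), (220, "m1"), (290, "m1"), (310, "m2"),
]

def batch_view(events, batch_cutoff):
    """Batch layer: recompute counts from all events up to the last batch run."""
    return Counter(key for ts, key in events if ts <= batch_cutoff)

def speed_view(events, batch_cutoff):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(key for ts, key in events if ts > batch_cutoff)

def query(key, batch, speed):
    """Serving step: merge the batch and speed views at query time."""
    return batch[key] + speed[key]

batch = batch_view(EVENTS, batch_cutoff=250)
speed = speed_view(EVENTS, batch_cutoff=250)
print(query("m1", batch, speed))  # prints 3 (2 from batch + 1 from speed)
```

Because the speed view only has to cover the window since the last batch run, results can stay fresh without reprocessing the full history on every query.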
Outcomes
Alignment: Legacy compliant; Limits/isolates special skills & translators; Non-invasive to sources
Agility: Hot deploy; Fault tolerant; Highly mutable
Cost: 75% cost reduction on a $MM platform; $$/KPI assign-ability
Features: BYO BI tool; BYO Data source; Sub-minute recency; Event Attribution; Omnichannel Reporting
Lessons Learned
Get versions out quickly
Keep teams small
Have a capability-focused vision
NoSQL can limit your consumers
Questions
Contacts
Scott Wallace – scott.wallace@rakuten.com
Don Krapohl – donald.krapohl@rakuten.com