Published by Valerie Welch. Modified over 8 years ago.
1 Modern Approaches of Customer’s Dream Distribution Across the Cluster
Evgenij Kozhevnikov, Samara
August 4, 2015
2 About me
1+ years of production experience in BigData
- Edmunds.com
- BigData CC
3+ years of development experience in BigData
- Hadoop
- Spark
- Storm
- Akka
6+ years of development experience
- Java EE, IBM Websphere
- Spring
3 Successful Business - Growing Business
4 Growing Business – Growing Load
5 Our software should be ready to grow with business
1. Pay for what you need now, not for what you plan
2. Growth doesn’t require any changes in the application
3. Where there is one growing app, there are soon several growing apps
6 Our software should be ready to grow with business
7 Caching Proxy CDN
8 Our software should be ready to grow with business Caching Proxy CDN NoSQL Distributed cache
9 Our software should be ready to grow with business Caching Proxy CDN NoSQL Distributed cache
10 Does Edmunds need a cluster solution?
1. What trends do we see now?
2. Is the quality of the vehicle catalog good enough?
3. Is our advertising efficient?
4. What results do we get from A/B testing?
5. What can we recommend to our clients?
6. Where is the car that the client needs?
7. How many leads were sent to the dealer?
8. Is the dealer successful?
9. Are our visitors real people and not robots?
10. What revenue do we have this year?
11. Are we growing?
12. Are our dealers growing?
11 Does Edmunds need a cluster solution?
1. What trends do we see now?
2. Is the quality of the vehicle catalog good enough?
3. Is our advertising efficient?
4. What results do we get from A/B testing?
5. What can we recommend to our clients?
6. Where is the car that the client needs?
7. How many leads were sent to the dealer?
8. Is the dealer successful?
9. Are our visitors real people and not robots?
10. What revenue do we have this year?
11. Are we growing?
12. Are our dealers growing?
Answering these is not a competitive advantage: all competitors do that.
12 Does Edmunds need a cluster solution?
- Fast access to the whole data set is needed
- Historical data is as important as new data
- Dynamically extended hardware resources must be supported
- Several independent applications should be able to run on the same cluster
- Each application run requires a specific amount of resources
- A convenient monitoring tool and a fault-tolerant system are needed
- Code should be readable, and distributed algorithms should be maintainable
Growing amount of data
Growing number of tasks
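The requirements about sharing one cluster between independent applications, each with its own resource needs, can be sketched as a simple admission check. This is only an illustration under stated assumptions, not how YARN or Mesos actually schedule work: the `Cluster` and `AppRequest` names, the application names, and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AppRequest:
    """Resources one application run asks for (hypothetical units)."""
    name: str
    cores: int
    memory_gb: int


@dataclass
class Cluster:
    """A shared pool of free resources that can be extended at runtime."""
    cores: int
    memory_gb: int
    running: list = field(default_factory=list)

    def add_node(self, cores: int, memory_gb: int) -> None:
        # Dynamically extended hardware: new capacity joins the free pool.
        self.cores += cores
        self.memory_gb += memory_gb

    def try_start(self, app: AppRequest) -> bool:
        # Admit the application only if its whole request fits right now.
        if app.cores <= self.cores and app.memory_gb <= self.memory_gb:
            self.cores -= app.cores
            self.memory_gb -= app.memory_gb
            self.running.append(app.name)
            return True
        return False


cluster = Cluster(cores=8, memory_gb=32)
started_reports = cluster.try_start(AppRequest("reports", cores=6, memory_gb=16))
started_ml = cluster.try_start(AppRequest("recommendations", cores=4, memory_gb=8))

# Grow the cluster; the applications themselves do not change.
cluster.add_node(cores=8, memory_gb=32)
started_ml_retry = cluster.try_start(AppRequest("recommendations", cores=4, memory_gb=8))
```

In this sketch the second application is rejected until a node is added, which mirrors the idea that growth should be handled by adding hardware rather than by changing the application.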
13 Hadoop-based solutions: MapReduce, YARN
14 MapReduce across YARN
Diagram labels: Node
15 MapReduce across YARN
Diagram labels: Resource Manager, Name Node, Node
16 MapReduce across YARN
Diagram labels: Standby Resource Manager, Active Resource Manager, Hadoop Client, MR Application Master, Name Node, Data Node and MR Executor (shown twice)
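The MapReduce model that these executors run can be sketched in a single process. The example below counts leads per dealer, one of the questions listed earlier in the deck. It is a minimal illustration, not Hadoop code: the `leads` records, the field names, and the function names are hypothetical, and on a real cluster the map and reduce phases would run in separate executors on the data nodes.

```python
from collections import defaultdict

# Hypothetical input: each record is one lead sent to a dealer.
leads = [
    {"dealer": "dealer-a", "vehicle": "sedan"},
    {"dealer": "dealer-b", "vehicle": "truck"},
    {"dealer": "dealer-a", "vehicle": "suv"},
    {"dealer": "dealer-c", "vehicle": "sedan"},
    {"dealer": "dealer-a", "vehicle": "coupe"},
]


def map_phase(record):
    # Emit one (key, value) pair per input record.
    yield record["dealer"], 1


def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Collapse all values for one key into a single result.
    return key, sum(values)


def run_job(records):
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


leads_per_dealer = run_job(leads)
```

Because each map call looks at one record and each reduce call looks at one key, both phases can be split across many nodes without changing the functions themselves.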
17 Hadoop-based solutions: Spark, YARN
18 Spark across YARN
Diagram labels: Standby Resource Manager, Active Resource Manager, Spark Client, MR Application Master, Name Node, Data Node and Spark Executor (shown twice)
19 Mesosphere-based solutions: Spark, Mesos
20 Spark across Mesos
Diagram labels: Standby Mesos Master, Active Mesos Master, Spark Client, Spark Scheduler, Name Node, Data Node and Spark Executor (shown twice)
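With Mesos, the master offers available resources to a framework's scheduler, and the scheduler accepts or declines each offer; with YARN, the application asks the Resource Manager for containers. The offer cycle can be sketched in one process as below. This is an illustration under stated assumptions, not the Mesos API: `Offer`, `SparkLikeScheduler`, `master_offer_round`, the node names, and all sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Resources a master advertises from one node (hypothetical shape)."""
    node: str
    cores: int
    memory_gb: int


class SparkLikeScheduler:
    """Framework-side scheduler: decides which offers to accept."""

    def __init__(self, executors_needed, cores_per_executor, memory_per_executor_gb):
        self.executors_needed = executors_needed
        self.cores = cores_per_executor
        self.memory_gb = memory_per_executor_gb
        self.launched = []  # nodes where an executor was started

    def on_offer(self, offer):
        # Decline when enough executors are already running.
        if len(self.launched) >= self.executors_needed:
            return False
        # Decline when the offer is too small for one executor.
        if offer.cores < self.cores or offer.memory_gb < self.memory_gb:
            return False
        self.launched.append(offer.node)
        return True


def master_offer_round(offers, scheduler):
    """Master side: present each offer and record the framework's decision."""
    return {offer.node: scheduler.on_offer(offer) for offer in offers}


offers = [
    Offer("node-1", cores=4, memory_gb=16),
    Offer("node-2", cores=1, memory_gb=2),
    Offer("node-3", cores=8, memory_gb=32),
    Offer("node-4", cores=8, memory_gb=32),
]
scheduler = SparkLikeScheduler(
    executors_needed=2, cores_per_executor=2, memory_per_executor_gb=4
)
decisions = master_offer_round(offers, scheduler)
```

Here the scheduler declines `node-2` because the offer is too small and declines `node-4` because it already has the two executors it needs, leaving those resources for other frameworks on the same cluster.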
21 WHAT NEXT
1. Myriad
- YARN on Mesos
- Efficient access to Hadoop resources
- Dynamic nature of Mesos
2. Kubernetes
- Resource Manager for Docker-based infrastructure
- Solution from Google
3. Akka Cluster
- Efficient model for vertical and horizontal scaling
- Freedom of choosing the way of distribution
4. Task-specific tools
- Apache Storm
- Hive/Pig/Cascading…
- NoSQL solutions
- Kafka/Sqoop/Flume…
- Chef/Puppet/Ansible…
- Docker/Rocket/CoreOS
- Data Science
22 Modern Approaches of Customer’s Dream Distribution Across the Cluster
Evgenij Kozhevnikov, Samara