Published by Kevin Cobb. Modified over 8 years ago.
1
Big Data Anton Boyko
2
Agenda What is Big Data? Why Big Data? How to Big Data?
3
What is Big Data? Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Gigabytes, Terabytes, Petabytes…
4
Data growth Big Data Volume 10x Velocity 4.3 Variety 85%
5
How to process Big Data? Traditional way Appropriate way
6
Move data to compute
7
Move compute to data Fast storage vs. fast CPU and fast networking Linear scalability
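The contrast between the two approaches can be illustrated with a small in-memory sketch. The node names, the sample log lines, and both function names are hypothetical; the point is only that running the computation where the data lives ships small summaries instead of every record.

```python
from collections import Counter

# Hypothetical cluster: each node stores one partition of log lines locally.
partitions = {
    "node-1": ["error disk", "ok", "error net"],
    "node-2": ["ok", "ok", "error disk"],
    "node-3": ["error net", "ok"],
}

def move_data_to_compute(partitions):
    """Ship every record to one central machine, then count errors there."""
    shipped = [line for lines in partitions.values() for line in lines]
    counts = Counter(line for line in shipped if line.startswith("error"))
    return counts, len(shipped)  # number of records sent over the network

def move_compute_to_data(partitions):
    """Count errors on each node and ship only the per-node summaries."""
    total = Counter()
    shipped = 0
    for lines in partitions.values():
        local = Counter(line for line in lines if line.startswith("error"))
        shipped += len(local)  # one (key, count) pair per distinct key
        total.update(local)
    return total, shipped

central_counts, central_shipped = move_data_to_compute(partitions)
local_counts, local_shipped = move_compute_to_data(partitions)
print(central_shipped, local_shipped)
```

Both functions produce the same counts. On this toy input the saving is modest, but the shipped summaries grow with the number of distinct keys rather than with the number of records.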
8
Map/Reduce workflow: File system, Mappers (find matches), Reducers (combine matches), Mappers (invert keys and values), Reducer (combine results), DFS temp
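One plausible reading of this workflow is two chained jobs, with the first job's output held in temporary storage before the second job reads it. The sketch below simulates that reading in memory. The `run_job` helper, the search pattern, and the sample lines are all assumptions made for illustration, not part of Hadoop.

```python
import re
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Run one map, shuffle, and reduce pass entirely in memory."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group values by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

PATTERN = re.compile(r"\bdata\w*")  # hypothetical search pattern

def find_matches(line):
    """Job 1 mapper: emit (match, 1) for every match in the line."""
    for match in PATTERN.findall(line):
        yield match, 1

def combine_matches(word, ones):
    """Job 1 reducer: total the occurrences of each match."""
    yield word, sum(ones)

def invert(pair):
    """Job 2 mapper: swap key and value so the count becomes the key."""
    word, count = pair
    yield count, word

def combine_results(count, words):
    """Job 2 reducer: list the matches that share a count."""
    yield count, sorted(words)

lines = ["big data and database", "data moves", "dataset and data"]
temp = run_job(lines, find_matches, combine_matches)  # stands in for "DFS temp"
result = run_job(temp, invert, combine_results)
print(result)
```

Inverting keys and values in the second job lets the framework's own grouping and sorting order the matches by frequency.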
9
Map/Reduce – how it works

    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;
    using Microsoft.Hadoop.MapReduce;

    public class NamespaceMapper : MapperBase
    {
        // Called once per input line.
        public override void Map(string inputLine, MapperContext context)
        {
            // Match directives of the form "using Some.Namespace;".
            var reg = new Regex(@"(using)\s[A-Za-z0-9_\.]*\;");
            var matches = reg.Matches(inputLine);
            foreach (Match match in matches)
            {
                // Emit each matched directive as the key, with "1" as the value.
                context.EmitKeyValue(match.Value, "1");
            }
        }
    }

    public class NamespaceReducer : ReducerCombinerBase
    {
        // Receives one key with all of its values and counts the occurrences.
        public override void Reduce(string key, IEnumerable<string> values,
            ReducerCombinerContext context)
        {
            // Write back the key and the number of times it was emitted.
            context.EmitKeyValue(key, values.Count().ToString());
        }
    }
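The logic of this mapper and reducer can be tried without a cluster. The following is a minimal local sketch in Python, not the SDK's API: `map_line`, `reduce_key`, and `count_namespaces` are hypothetical names, and sorting followed by `groupby` stands in for the shuffle step that Hadoop performs between the two phases.

```python
import re
from itertools import groupby

# Same idea as the slide's regex: a "using" directive up to the semicolon.
USING = re.compile(r"using\s[A-Za-z0-9_.]*;")

def map_line(line):
    """Emit (directive, "1") for every using directive found in the line."""
    return [(m.group(0), "1") for m in USING.finditer(line)]

def reduce_key(key, values):
    """Count how many values were emitted for this key."""
    return key, str(len(list(values)))

def count_namespaces(lines):
    # Sorting brings equal keys together, as the shuffle step would.
    emitted = sorted(pair for line in lines for pair in map_line(line))
    return [
        reduce_key(key, (value for _, value in group))
        for key, group in groupby(emitted, key=lambda pair: pair[0])
    ]

source = [
    "using System;",
    "using System.Linq;",
    "using System; // repeated in another file",
    "var x = 1;",
]
counts = count_namespaces(source)
print(dict(counts))
```

Values stay strings here only to mirror the key/value text pairs the slide's code emits; a real job could equally sum integers.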
10
Traditional RDBMS vs. Map/Reduce
RDBMS: Terabytes of data; Static schema; Interactive and batch access; Nonlinear scaling
Map/Reduce: Exabytes of data (or more); Dynamic schema; Batch access only; Linear scaling
11
Hadoop – implementation of Map/Reduce engine
12
Hadoop ecosystem
13
Offering ODBC for Excel PowerPivot Windows Server or Windows Azure C#, Java, JavaScript
14
Demo
15
Pricing
Head Node: Single extra large instance (8 CPU, 14 GB); $0.32 per hour; $238 per month
Compute Node: One or more large instances (4 CPU, 7 GB); $0.16 per hour; $119 per month
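The monthly figures are consistent with the hourly rates multiplied over a 31-day month (744 hours). That month length is an inference from the numbers, not something the slide states. A small sketch of the arithmetic, with `monthly_cost` as a hypothetical helper:

```python
HEAD_NODE_RATE = 0.32      # USD per hour, from the slide
COMPUTE_NODE_RATE = 0.16   # USD per hour, from the slide
HOURS_PER_MONTH = 24 * 31  # assumption: a 31-day month, 744 hours

def monthly_cost(compute_nodes, hours=HOURS_PER_MONTH):
    """Cost in USD of one head node plus the given number of compute nodes."""
    if compute_nodes < 1:
        raise ValueError("a cluster needs at least one compute node")
    hourly = HEAD_NODE_RATE + compute_nodes * COMPUTE_NODE_RATE
    return round(hourly * hours, 2)

head_only = round(HEAD_NODE_RATE * HOURS_PER_MONTH)        # matches the slide's $238
compute_only = round(COMPUTE_NODE_RATE * HOURS_PER_MONTH)  # matches the slide's $119
print(head_only, compute_only, monthly_cost(1), monthly_cost(4))
```

Under these assumptions the smallest cluster (one head node, one compute node) comes to about $357 per month, and each extra compute node adds about $119.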
16
Questions? Anton Boyko boyko.ant@live.com