Published by Horatio Lindsey. Modified over 8 years ago.
Hadoop, Hive, JSON, and Data! Oh, my!!
TJay Belt
Database Administrator at Imagine Learning
eMail me: TJayBelt@yahoo.com
Read me: http://tjaybelt.blogspot.com
Follow me: @tjaybelt
Thanks to our Sponsors! Yearly Partners Gold Sponsors
Overview
Big Data ecosystem
30,000-foot view of our ecosystem
Issues found along the way
JSON (JavaScript Object Notation)
Lightweight data-interchange format.
It is easy for humans to read and write.
It is easy for machines to parse and generate.
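To make the "parse and generate" point concrete, here is a minimal sketch using Python's standard `json` module. The `record` dictionary is a hypothetical example modeled on the documents shown in this talk, not data taken from the real system.

```python
import json

# Hypothetical record, modeled on the kind of document used in this talk.
record = {
    "Revision": 12,
    "ModelData": {"GradeLevel": "Kindergarten", "FirstLanguage": "English"},
}

# Generate: serialize the Python object to JSON text.
text = json.dumps(record, indent=2)
print(text)

# Parse: turn the JSON text back into native Python objects.
parsed = json.loads(text)
grade_level = parsed["ModelData"]["GradeLevel"]
print(grade_level)
```

The round trip is lossless for this record because it only uses types JSON supports directly (strings, numbers, and nested objects).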
JSON (JavaScript Object Notation)
{
  "_id": "00000000-0000-0000-0000-000000000000",
  "Revision": 12,
  "ModelData": {
    "GradeLevel": "Kindergarten",
    "FirstLanguage": "English"
  },
  "SetTheStageData": {
    "LastSetTheStageLibraryWords": 1,
    "LastSetTheStageTakeATest": 0
  }
JSON (JavaScript Object Notation)
"TestInstances": [{
  "Product": "ILE",
  "Lesson": "30698aac-5a3d-4464-935c-16de4ba9db70",
  "LessonBranch": "Main",
  "TestType": "PlacementTest",
  "TimeStarted": "2015-11-13T15:16:51.8757165+00:00",
  "TimeCompleted": "2015-11-13T15:26:29.9646995+00:00",
  "TestInstanceId": "1",
  "TestSectionInstances": [{
    "TestSection": "Letter Recognition",
    "TestQuestionInstances": [{
      "TestQuestion": "q43",
      "TimeStarted": "2015-11-13T15:17:24.965+00:00",
      "TimeCompleted": "2015-11-13T15:17:33.432+00:00",
      "TestOptionInstances": [
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt256" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt258" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt257" },
        { "ClickCount": 1, "IsSelected": true, "ResponseLatency": -8467, "TestOption": "opt253" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt255" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt254" }
      ]
    },
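Documents nested this deeply usually have to be flattened before they can be queried like a table. The sketch below shows one way to do that in Python, assuming a trimmed copy of the sample above (one test, one section, one question, two options). The helper name `flatten_options` is hypothetical and is not part of the system described in this talk.

```python
import json

# Trimmed copy of the sample document: one test, one section, one question.
doc = json.loads("""
{
  "TestInstances": [{
    "TestType": "PlacementTest",
    "TestInstanceId": "1",
    "TestSectionInstances": [{
      "TestSection": "Letter Recognition",
      "TestQuestionInstances": [{
        "TestQuestion": "q43",
        "TestOptionInstances": [
          {"ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt256"},
          {"ClickCount": 1, "IsSelected": true, "ResponseLatency": -8467, "TestOption": "opt253"}
        ]
      }]
    }]
  }]
}
""")


def flatten_options(document):
    """Yield one flat dict per option, carrying parent keys down each level."""
    for test in document.get("TestInstances", []):
        for section in test.get("TestSectionInstances", []):
            for question in section.get("TestQuestionInstances", []):
                for option in question.get("TestOptionInstances", []):
                    yield {
                        "TestInstanceId": test["TestInstanceId"],
                        "TestType": test["TestType"],
                        "TestSection": section["TestSection"],
                        "TestQuestion": question["TestQuestion"],
                        "TestOption": option["TestOption"],
                        "IsSelected": option["IsSelected"],
                        "ResponseLatency": option["ResponseLatency"],
                    }


rows = list(flatten_options(doc))
for row in rows:
    print(row)
```

Each array level becomes one loop, so a document with four nested arrays yields one row per innermost element, with the parent identifiers repeated on every row.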
Blob Storage
Reliable, cost-effective cloud storage for large amounts of unstructured data.
Microsoft Azure Cloud
MongoDB
MongoDB (from "humongous") is a cross-platform, document-oriented database. Classified as a NoSQL database, it eschews the traditional table-based relational structure in favor of JSON-like documents with dynamic schemas, which makes integrating data in certain types of applications easier and faster.
Hadoop is a Java-based programming framework that supports the processing of large data sets in a distributed computing environment.
MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.
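The programming model can be sketched in a few lines of Python. This is a single-process illustration only, not a distributed implementation and not Hadoop's API. The input tuples and the function names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical.

```python
from collections import defaultdict

# Hypothetical input records: (TestQuestion, TestOption, IsSelected).
records = [
    ("q43", "opt256", False),
    ("q43", "opt253", True),
    ("q44", "opt301", True),
    ("q43", "opt253", True),
]


def map_phase(record):
    """Emit ((question, option), 1) for each selected option, else nothing."""
    question, option, is_selected = record
    if is_selected:
        yield (question, option), 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

On a real cluster the map calls run in parallel on different nodes and the shuffle moves data across the network. The per-record and per-key functions are the only parts the programmer writes.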
Hive
Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis. It supports queries expressed in a language called HiveQL, which automatically translates SQL-like queries into MapReduce jobs executed on Hadoop.
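As a rough illustration of that translation, the sketch below pairs a SQL-like query (shown in a comment) with the map, sort, and reduce steps it conceptually corresponds to. The table and column names (`students`, `grade_level`, `first_language`) are hypothetical, and this is not how Hive's query planner is implemented. It only shows how WHERE, GROUP BY, and COUNT line up with the phases.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical HiveQL-style query this sketch mirrors:
#   SELECT grade_level, COUNT(*)
#   FROM students
#   WHERE first_language = 'English'
#   GROUP BY grade_level;

students = [
    {"grade_level": "Kindergarten", "first_language": "English"},
    {"grade_level": "Kindergarten", "first_language": "Spanish"},
    {"grade_level": "Grade 1", "first_language": "English"},
    {"grade_level": "Kindergarten", "first_language": "English"},
]

# Map: apply the WHERE filter and emit (GROUP BY key, 1).
mapped = [
    (row["grade_level"], 1)
    for row in students
    if row["first_language"] == "English"
]

# Shuffle/sort: bring equal keys next to each other.
mapped.sort(key=itemgetter(0))

# Reduce: COUNT(*) becomes a sum over each key's values.
result = {
    key: sum(value for _, value in group)
    for key, group in groupby(mapped, key=itemgetter(0))
}
print(result)
```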
What do we have?
Things we tried
SQL Server JSON procs
SlamData
PowerQuery
DocumentDB
MongoDirector
SQL Azure
Issues I encountered
Thank You!
TJay Belt
Cell: (801) 735-9439
eMail: TJayBelt@Yahoo.com
Blog: http://tjaybelt.blogspot.com
LinkedIn: www.linkedin.com/in/tjaybelt
Twitter: @tjaybelt
Skype: tjaybelt
Google+: link