TEMPORAL DATA AND REAL-TIME ALGORITHMS
  AJ Jicha - Presenter
  Ryan Jicha - Presenter
  Ian Kaufer - Slide Maker
  Roy Zacharias - Slide Maker
  Frontiers in Massive Data Analysis, Chapter 4, Pages
  Group 3
Agenda
  Topic Overview
  Data Acquisition
  Processing, Representation and Inference
  System and Hardware
  Challenges
Topic Overview
  Temporal data - data which depends on time
    Advertising
    Google Maps: imaging & mapping with real-time traffic
    Protein folding research
    Cybersecurity (Security Information and Event Management systems)
  Shift in computing environment
    Distributed computing
Data Acquisition
  Various sources of data
  Different locations/destinations
  Processing requirements based on types of data
  Scheduling theories:
    Hard real-time
    Firm real-time
    Soft real-time
    Bounded-tardiness
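The four scheduling models above differ in how they treat a missed deadline. The following is a minimal sketch of that difference, not a scheduler. The names `Deadline` and `result_value`, the linear decay used for the soft model, and the `max_tardiness` parameter are assumptions made for illustration; they do not come from the slides.

```python
from enum import Enum


class Deadline(Enum):
    HARD = "hard"
    FIRM = "firm"
    SOFT = "soft"
    BOUNDED_TARDINESS = "bounded-tardiness"


def result_value(model, finish, deadline, max_tardiness=0.0, decay=1.0):
    """Return the usefulness (0.0 to 1.0) of a result finishing at `finish`.

    Raises RuntimeError where the model treats the miss as a failure.
    The linear decay for SOFT is an illustrative assumption.
    """
    lateness = finish - deadline
    if lateness <= 0:
        return 1.0  # on time: full value under every model
    if model is Deadline.HARD:
        # Hard real-time: any miss counts as a system failure.
        raise RuntimeError("hard deadline missed")
    if model is Deadline.FIRM:
        # Firm real-time: a late result is worthless, but the system continues.
        return 0.0
    if model is Deadline.SOFT:
        # Soft real-time: a late result is still useful, but less so.
        return max(0.0, 1.0 - decay * lateness)
    if model is Deadline.BOUNDED_TARDINESS:
        # Bounded tardiness: misses are acceptable up to a fixed bound.
        if lateness <= max_tardiness:
            return 1.0
        raise RuntimeError("tardiness bound exceeded")
    raise ValueError(f"unknown model: {model!r}")
```

For example, a result that finishes 2 time units late has value 0.0 under the firm model, 0.5 under the soft model with `decay=0.25`, and 1.0 under bounded tardiness with `max_tardiness=3`.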
Processing, Representation, Inference
  Processing
    High-speed data streams may exceed processing capacity
    Algorithms can be used to estimate the data that was missed
  Representation
    Coding vs sketching
  Inference
    Algorithms used to estimate answers based on real-time data
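As one concrete instance of sketching, a Count-Min sketch keeps approximate per-item counts of a stream in fixed memory. The code below is a minimal illustration under stated assumptions: the class name, the default `width` and `depth`, and the use of salted BLAKE2b hashes are choices made here, not something the slides specify.

```python
import hashlib


class CountMinSketch:
    """Fixed-size summary that approximates item counts in a stream."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        # One row of counters per hash function.
        self.rows = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # Derive one bucket per row by salting the hash with the row index.
        data = item.encode("utf-8")
        for row in range(self.depth):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._buckets(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest estimate and never undercounts.
        return min(self.rows[row][col] for row, col in self._buckets(item))
```

The memory used is `width * depth` counters regardless of how many items arrive. With non-negative counts, `estimate` may overcount because of hash collisions but never returns less than the true count.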
System and Hardware
  Distributed file systems are necessary
    Google File System (GFS), which is proprietary
  Large numbers of data-acquisition machines to funnel ingested data to processors
  Numerous engineers for system support
Major Challenges
  Algorithm design for massively distributed data that can adapt over time
  Algorithms that work on many platforms
  Distributed real-time acquisition, storage, and transmission
  Consistency
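One simple way an estimate can adapt over time is an exponentially weighted moving average, which gives recent observations more weight than old ones. This is only a sketch of that single idea, not a solution to the challenges listed above; the class name `DecayedMean` and the default `alpha` are hypothetical.

```python
class DecayedMean:
    """Running mean that gradually forgets old observations."""

    def __init__(self, alpha=0.1):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # weight given to each new observation
        self.value = None   # no estimate until the first observation

    def update(self, x):
        if self.value is None:
            self.value = float(x)
        else:
            # Move the estimate a fraction alpha of the way toward x.
            self.value += self.alpha * (x - self.value)
        return self.value
```

With `alpha=0.5`, feeding the values 10, 20, 20 yields the estimates 10.0, 15.0, 17.5. A larger `alpha` tracks changes faster but is noisier; a smaller one is smoother but slower to adapt.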
Summary
  Data acquisition
  Processing
  Representation
  Inferencing
  Infrastructure – Systems, Hardware, & Software
Terminology
  Inference
    Problem of turning data into knowledge using models
  Provenance
    Inferences on previously made inferences
  Temporal data
    Real-time, human-generated or measurements ...