
Harnessing Kansas City Open Data to Improve the Lives of Citizens SAMAA GAZZAZ – SUPERVISED BY DR. PRAVEEN RAO DEPT. COMPUTER SCIENCE ELECTRICAL ENGINEERING – SCHOOL OF COMPUTATION AND ENGINEERING UNIVERSITY OF MISSOURI-KANSAS CITY

Introduction SUROP

• Big data is a general term for datasets that are too large for existing data-processing applications to handle (e.g., real-life Kansas City crime data).
• To extract useful information from such datasets, we need a robust analysis application that can handle large datasets and also provide probabilistic implications of the data (e.g., BayesDB).

Objective
• We used BayesDB, an application developed by MIT researchers, to understand Kansas City's crime dataset, after first testing the scalability of BayesDB. BayesDB provides a built-in Bayesian Query Language (BQL) for analyzing a dataset and extracting probabilistic information from it.
• Research question: Are there relationships among the different factors recorded for a KC crime and, if so, what are they? Is BayesDB a convenient application for analyzing big datasets?

Materials
• BayesDB (probabilistic analysis application)
• KCPD_CrimeData2014.CSV

Approach
• To test the scalability of BayesDB, we will measure the time it needs to process and analyze datasets of different sizes.
• Next, we will explore the relationships between the crime factors using the statistical and probabilistic commands provided by BayesDB, which will help us expand our understanding of the raw data.
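The scalability test described above can be sketched as a small timing harness. This is a minimal sketch, not the setup used in the study: `time_by_size`, `placeholder_analysis`, and the toy rows are hypothetical names and made-up data, and the placeholder would be swapped for a call that actually loads and analyzes each subset with BayesDB.

```python
import time
from collections import Counter
from typing import Callable, Dict, Sequence

Row = Dict[str, str]


def time_by_size(
    rows: Sequence[Row],
    sizes: Sequence[int],
    analyze: Callable[[Sequence[Row]], object],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time in seconds of `analyze` for each subset size."""
    timings: Dict[int, float] = {}
    for n in sizes:
        subset = rows[:n]  # first n rows stand in for a dataset of size n
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            analyze(subset)
            best = min(best, time.perf_counter() - start)
        timings[n] = best
    return timings


def placeholder_analysis(rows: Sequence[Row]) -> Counter:
    """Hypothetical stand-in for the real analysis step: count values of one column."""
    return Counter(row["beat"] for row in rows)


# Made-up toy rows (not real KCPD records); only the column names follow the slides.
toy_rows = [
    {"beat": str(130 + i % 5), "involvement": "ARR" if i % 2 else "VIC"}
    for i in range(2000)
]

timings = time_by_size(toy_rows, [500, 1000, 2000], placeholder_analysis)
for n, seconds in timings.items():
    print(f"{n:>6} rows: {seconds:.6f} s")
```

Taking the best of several repeats reduces noise from other processes; plotting the timings against the subset sizes then shows how the processing time grows with dataset size.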

Results

Methods
• BayesDB features for harnessing Kansas City crime data:
  SUMMARIZE: Provide a comprehensive understanding of the dataset.
  ANALYZE: Learn the dataset and the dependencies between its attributes.
  INFER: Fill in missing data based on the learned model with a specified certainty.
  ESTIMATE: Provide the dependencies within the table and the strength of those dependencies.
  PLOT: Draw a simple bar chart showing the dependencies between columns.
  SIMULATE: Predict future data entries with respect to a certain value.

Results
• ESTIMATE PAIRWISE DEPENDENCE PROBABILITY FROM KCcrime;
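To build intuition for what a pairwise dependence score between two columns expresses, the sketch below computes the empirical mutual information of two categorical columns. This is a swapped-in, simpler measure and an assumption of this example: BayesDB derives its dependence probabilities from the model it learns during ANALYZE, not from this formula. The function name and the toy values are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Empirical mutual information (in nats) between two equally long columns."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("columns must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Made-up toy columns: the first pair is perfectly dependent, the second independent.
dependent = mutual_information(["a", "b", "a", "b"], ["a", "b", "a", "b"])
independent = mutual_information(["a", "a", "b", "b"], ["u", "v", "u", "v"])
print(dependent, independent)
```

A value of zero means the two columns look independent in the sample, while larger values indicate stronger dependence; unlike a probability, this score is not bounded by 1.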

Results
• From the resulting dependency graph, we can find these relations:

Results
• SIMULATE location_1,beat FROM KCcrime_demo GIVEN beat=134,involvement=ARR TIMES 10;
| location_1 | beat |
| ( , ) | 134 |
| ( , ) | 134 |
| ( , ) | 134 |
| ( , ) | 134 |
• BayesDB is also useful in identifying independent factors, which helps save time.
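The shape of the SIMULATE query above (target columns, GIVEN conditions, TIMES count) can be mimicked with a much simpler stand-in: resampling existing rows that satisfy the conditions. This is only an illustrative assumption. BayesDB generates values from its learned probabilistic model, whereas this sketch can only return values already present in the table. The function `simulate_given`, the location placeholders, and the "VIC" value are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, str]


def simulate_given(
    rows: Sequence[Row],
    targets: Sequence[str],
    given: Dict[str, str],
    times: int,
    seed: Optional[int] = None,
) -> List[Tuple[str, ...]]:
    """Resample `times` rows that match `given` and return their target columns."""
    matches = [r for r in rows if all(r.get(k) == v for k, v in given.items())]
    if not matches:
        return []  # nothing to resample from
    rng = random.Random(seed)
    samples = []
    for _ in range(times):
        row = rng.choice(matches)  # one draw per simulated entry
        samples.append(tuple(row[t] for t in targets))
    return samples


# Made-up toy rows; the location values are placeholders, not real coordinates.
toy_rows = [
    {"location_1": "loc_A", "beat": "134", "involvement": "ARR"},
    {"location_1": "loc_B", "beat": "134", "involvement": "ARR"},
    {"location_1": "loc_C", "beat": "134", "involvement": "VIC"},
    {"location_1": "loc_D", "beat": "212", "involvement": "ARR"},
]

samples = simulate_given(
    toy_rows,
    targets=["location_1", "beat"],
    given={"beat": "134", "involvement": "ARR"},
    times=10,
    seed=0,
)
for location, beat in samples:
    print(f"| {location} | {beat} |")
```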

Conclusion