Published by Emma May. Modified over 9 years ago.
1
UNLOCKING THE SECRETS HIDDEN IN YOUR DATA PART 1 Data and Data Analysis
2
Data What is Data? Data is information gathered from observation, experimentation or modeling Qualitative – not precise (usually descriptive) Quantitative – precise (usually numeric) The output of your model (e.g. number of healthy agents, number of infected agents, time…)
3
Data How do we gather data? Data collection is the systematic recording of information while changing variables (a variable is a quantity that may assume any given value or set of values). Collect the output of the model (e.g. number of healthy agents, number of infected agents, time…) while changing its variables (number of devils, number initially infected)
4
Data Why should we get data? To answer questions To develop understanding To validate experiments What should we do with data? Display – usually graph it to make it easier to see trends Analysis – use math skills to uncover patterns and trends in data sets Interpretation – propose possible explanations for those patterns and trends.
5
Extracting Data from StarlogoTNG There are three ways to extract data from StarlogoTNG Collect the data by hand Create a chart in Starlogo TNG and extract the data to Excel Create a table in Starlogo TNG and extract the data to Excel
6
Why Should We Display Data? What did you see? Makes your data visible Helps find obvious patterns Does the data make sense? Are your assumptions correct? Did you collect enough data?
7
Why Should We Analyze Data? What does it mean? Is there more information in the data (emergent behavior, unexpected patterns)? Was the hypothesis correct? Why does it matter? Draw conclusions from data (e.g. more grass gives more rabbits) To help you answer questions Provide visible evidence and support for your conclusions to your audience (e.g. Challenge judges) Validity of model, experiment, theory, …
8
Ways to Analyze Data Plotting Data Ways to visually understand data Statistics Makes it easier to compare data Mean, Median, Mode Makes it clear if you have NOISY data Range, Variance, Standard Deviation
9
Ways to Analyze Data Derivatives (Slopes) Tell if changes in parameters affect data Parameter 2 has a greater effect than Parameter 1 Get more information from data Slope = 0.08 Slope = 0.16 Slope = 0.39 Great Derivative
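One way to put a number on "Parameter 2 has a greater effect than Parameter 1" is to fit a straight line to the output recorded while sweeping each parameter and compare the slopes. The sketch below is only an illustration: the settings and outputs are invented, the helper name `fitted_slope` is hypothetical, and a least-squares fit is an assumption about how the slopes on the slide might be obtained.

```python
def fitted_slope(xs, ys):
    """Least-squares slope of ys plotted against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Invented example data: the same three slider settings for each parameter,
# and the model output recorded at each setting.
setting = [0, 10, 20]
output_when_varying_p1 = [5, 6, 7]
output_when_varying_p2 = [5, 9, 13]

slope_p1 = fitted_slope(setting, output_when_varying_p1)
slope_p2 = fitted_slope(setting, output_when_varying_p2)

print("Parameter 1 slope:", slope_p1)
print("Parameter 2 slope:", slope_p2)
print("Parameter 2 has the greater effect:", slope_p2 > slope_p1)
```

Comparing slopes this way is only meaningful when the two parameters are swept over comparable ranges and units.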
10
Collecting Data: Variable Sweeping Did you collect enough data? Did you vary the parameters throughout their ranges? If you have sliders (input variables) in your program, you need data for the full range of those sliders. Minimum 3 runs for a single variable (low, medium, high) With more than one slider (variable), you must vary them separately. 2 variables: perhaps 9 runs (3 levels × 3 levels)
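The run count for a sweep can be checked by listing every combination of slider settings. This is a minimal sketch with made-up slider values; the variable names are illustrative and not taken from the Tasmanian Devil model itself.

```python
from itertools import product

# Hypothetical slider levels (low, medium, high); the values are made up.
initial_population = [50, 100, 150]
percent_infected = [10, 30, 50]

# One variable at three levels needs 3 runs. Two variables at three
# levels each need every combination: 3 * 3 = 9 runs.
sweep = list(product(initial_population, percent_infected))

for run_number, (population, infected) in enumerate(sweep, start=1):
    print(f"run {run_number}: population={population}, percent infected={infected}")

print("total runs:", len(sweep))
```

Repeating each combination several times multiplies the total again, for example 9 combinations × 3 repeats = 27 runs.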
11
Collecting Data from Starlogo TNG Gathering Data by hand Tasmanian Devils Variable sweep More than one variable Multiple runs at each variable combination Average the data
12
Collecting Data from Starlogo TNG Let's Do It Open Tasmanian Devil Run a section of the data sheet Do variable sweep: Initial Population, Initial Percent Infected Multiple runs at each set of variables Collect output in data sheet: Number healthy after 200 ticks
13
Collecting Data from Starlogo TNG Put Data into Excel Calculate Averages
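The same averaging step can be sketched outside Excel. The rows below stand in for a data sheet with repeated runs at each variable combination; all numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data sheet rows:
# (initial population, initial percent infected, number healthy after 200 ticks)
runs = [
    (50, 10, 31), (50, 10, 35), (50, 10, 33),
    (100, 10, 58), (100, 10, 64), (100, 10, 61),
]

# Group the repeated runs by their variable settings.
grouped = defaultdict(list)
for population, infected, healthy in runs:
    grouped[(population, infected)].append(healthy)

# One average per variable combination: this is the summary table.
averages = {settings: mean(values) for settings, values in grouped.items()}

for (population, infected), average in averages.items():
    print(f"population={population}, infected={infected}%: average healthy = {average}")
```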
14
Collecting Data from Starlogo TNG Make a Summary Table Create XY Charts
15
Collecting Data from Starlogo TNG Make a 3D Chart
16
Plotting Data – Extracting from Starlogo TNG Data can be extracted from a graph or a table in Starlogo TNG Create a graph using the line graph block Put reset clock on Setup block to clear and reset graph LET’S DO IT – Tasmanian Devils !!
17
Plotting Data – Extracting from StarlogoTNG After program is run Click on graph in Spaceland Save File – Excel file LET’S DO IT – Tasmanian Devils !!
18
Data Analysis: Plotting Data – Types of Plots All plots from http://www.statcan.ca Pie Charts – music preference Pets purchased at pet store Bar Charts – preferred snacks
19
Data Analysis: Plotting Data – Types of Plots All plots from http://www.statcan.ca XY Graphs – cell phone use http://www.statcan.ca Scatter Plots http://en.wikipedia.org/wiki/Scatterplot
20
Plotting Data – Activity in Excel Open the Tasmanian Devil export file (CSV file) by double-clicking on the file In EXCEL - Insert Chart Select type of chart: XY Scatter Hit the Next button LET’S DO IT
21
Plotting Data – Activity in Excel Select Data Range Highlight data to be plotted
22
Plotting Data – Activity in Excel Label each data series NEXT - Label Graph and Axis
23
Plotting Data – Activity in Excel Choose where you want the graph to be Get your graph
24
Plotting Data – Extracting from Netlogo Two ways 1st way: Write code to extract the data you want – see File Output Example in the Code Examples Open file in setup procedure Create a write-to-file procedure
25
Plotting Data – Extracting from Netlogo 2nd way: Extract data from Netlogo graphs Have Netlogo generate graph on Interface page (example on later slide) Create a setup-plot procedure and a do-plot procedure Call the setup-plot procedure in setup procedure Call do-plot procedure in go procedure
26
Plotting Data – Extracting from Netlogo Run model until sufficient data obtained (PC) Right Click on Graph/ (Mac) Control Click on Graph Select Export Choose location and File name - select save Excel File is created – Next Slide Contains all the information in the plot and input parameters used. Contains excess information about the plot (color, pen down, mode, interval…) LET’S DO IT – Open Rabbits Grass Weeds
27
Plotting Data – Extracting from Netlogo This is what You need
28
Statistics Statistics help you Summarize data Describe data Analyze data Hard to describe the difference between the two data sets Now it is easy to summarize, describe and analyze the data…. The blue and the pink data have the same AVERAGE value (mean) but the blue data is “NOISIER” (greater standard deviation). Therefore…
29
Statistics – How to Calculate in Excel +, -, *, / are used for addition, subtraction, multiplication and division. Each cell has a label based on its column and row. Use cell references instead of numbers to perform calculations. Example: =(A4+B4)/C4 To perform the calculation on an entire column, copy and paste the formula. Warning: this changes the row number in each cell reference on every line. To fix a specific cell, use the $ symbol. Example: =(A4+B4)/$C$1 Excel has many built-in statistical functions. Makes life easy!
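The difference between a copied formula and a formula with a fixed `$` cell can be mimicked with plain lists. This is only an analogy: the lists `A`, `B` and `C` are hypothetical stand-ins for spreadsheet columns, with rows 1 to 3.

```python
# Hypothetical spreadsheet columns A, B and C, rows 1..3.
A = [2, 4, 6]
B = [8, 6, 4]
C = [5, 2, 10]

# Like copying =(A1+B1)/C1 down the column:
# each row uses the C cell from its own row.
relative = [(a + b) / c for a, b, c in zip(A, B, C)]

# Like copying =(A1+B1)/$C$1 down the column:
# every row divides by the same fixed cell, C1.
absolute = [(a + b) / C[0] for a, b in zip(A, B)]

print("row-by-row divisor:", relative)
print("fixed divisor C1:  ", absolute)
```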
30
Calculate in Excel Activity Open a blank spreadsheet in Excel Create 2 columns of numbers Then Add, Subtract, Multiply and Divide the first row Copy and paste the formulas
31
Statistics – Measurements of Central Tendency Mean (Average), Median, and Mode Definitions Mean (Average) – Sum divided by the number of data points Median – Middle data point when arranged from highest to lowest (with an even number of points, the average of the two middle points) Mode – Most frequent value Use data set to calculate Mean (Average), Median, Mode, Max and Min Select Cell where you want the value of the function to appear Select Insert then Function Select Statistical Select function wanted (AVERAGE, MEDIAN, or MODE) then hit OK Select Range of data you want to analyze by clicking on range symbol and highlighting range. Hit enter or OK LET’S DO IT : StarlogoTNG : Fish and Plankton data Netlogo : Rabbits and Grass data
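As a cross-check on the Excel functions, the same three measures can be computed with Python's standard `statistics` module. The data set is invented for illustration.

```python
from statistics import mean, median, mode

# Invented data set, deliberately left unsorted.
data = [7, 4, 12, 2, 8, 4, 5]

average = mean(data)    # sum (42) divided by the number of points (7)
middle = median(data)   # middle value of the sorted data: 2 4 4 5 7 8 12
most_common = mode(data)  # most frequent value

print("mean:", average)
print("median:", middle)
print("mode:", most_common)
print("max:", max(data), "min:", min(data))
```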
32
Statistics – Measurements of Data Spread Range, Variance and Standard Deviation Definitions Range = maximum - minimum Variance measures the noise of the data around the mean value (the average squared distance from the mean). Standard Deviation (S) is the square root of the variance. Most commonly used measure of spread (same units as the data). Another reason to use S, for roughly bell-shaped (normally distributed) data: ~68% of the data are in the interval Mean – S to Mean + S ~95% of the data are in the interval Mean – 2 S to Mean + 2 S ~99.7% of the data are in the interval Mean – 3 S to Mean + 3 S EXCEL does it for you!!! LET’S DO IT : StarlogoTNG : Fish and Plankton data Netlogo : Rabbits and Grass data
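The spread measures can be sketched the same way. The data set below is invented, and the population forms `pvariance` and `pstdev` are used only because they give round numbers here; `statistics.variance` and `statistics.stdev` are the sample versions. The slide does not say which form is intended, so treat that choice as an assumption.

```python
from statistics import mean, pstdev, pvariance

# Invented data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # maximum - minimum
center = mean(data)
variance = pvariance(data)           # average squared distance from the mean
s = pstdev(data)                     # square root of the variance

# Fraction of points inside the interval Mean - S to Mean + S.
inside = [x for x in data if center - s <= x <= center + s]
fraction_within_one_s = len(inside) / len(data)

print("range:", data_range)
print("mean:", center)
print("variance:", variance)
print("standard deviation:", s)
print("fraction within one S of the mean:", fraction_within_one_s)
```

With only eight points the fraction inside one S will not match the ~68% figure exactly; that rule of thumb applies to larger, roughly bell-shaped data sets.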
33
Derivatives What are Derivatives? A simple calculation using data Instantaneous rate of change = SLOPE Why use Derivatives? Get more information from data More ways to compare data Example: car moving down a road Data = the distance traveled Velocity = the 1st derivative of distance Acceleration = 2nd derivative of distance = the 1st derivative of velocity Slope of distance Slope of velocity
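With tabulated data, the slope between neighboring points gives an estimate of the derivative. The sketch below applies that to the car example using invented distance readings; the helper name `slopes` is hypothetical, and the result is an approximation of the instantaneous rate of change, not an exact derivative.

```python
def slopes(xs, ys):
    """Slope between each pair of neighboring data points."""
    return [
        (y1 - y0) / (x1 - x0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    ]

# Invented readings: time in seconds, distance traveled in meters.
times = [0, 1, 2, 3, 4]
distance = [0, 2, 8, 18, 32]

# Velocity = slope of distance (1st derivative of distance).
velocity = slopes(times, distance)

# Each velocity estimate belongs to the midpoint of its time interval.
midpoints = [(t0 + t1) / 2 for t0, t1 in zip(times, times[1:])]

# Acceleration = slope of velocity (2nd derivative of distance).
acceleration = slopes(midpoints, velocity)

print("velocity:", velocity)
print("acceleration:", acceleration)
```

Here the velocity keeps rising while the acceleration stays constant, which is information that is hard to read off the raw distance column.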
34
A Note on Randomness This data is not RANDOM Random means that there is an equal probability of getting each outcome (like rolling a die) There is scatter in the data but it is not random
35
Other Things to Think About Is there “scatter” in your model? Evaluate how the “scatter” affects your results – repeat model runs Make sure you get enough data to get good statistics Did you collect enough data? Did you let the model run long enough? Has the model reached “equilibrium”?