Chapter 1 – Stats start here

1 Chapter 1 – Stats start here

2 What are statistics and data?
• Statistics: the science of data.
• Data: a collection of numbers, characters, images, or other items, along with their context, that provides information about something.

3 Examples
• Facebook: If you have a Facebook account, you have probably noticed that the ads you see online tend to match your interests and activities. Much of your personal information has been sold to marketing or tracking companies. Your data are valuable! A company can find out your age, sex, education level, job, hobbies, and activities.

4 Examples
• Target stores build customer profiles by collecting data about people who use credit cards. Patterns the company discovers across similar customer profiles enable it to send you advertising and coupons that promote items you may be interested in purchasing.

5 Examples
• How dangerous is texting while driving?
• Researchers compared the reaction times of sober drivers, drunk drivers, and texting drivers. The results were striking: the texting drivers responded more slowly and were more dangerous than drivers who were above the legal limit for alcohol.

6 Statistics is about variation
• Data vary because we don’t see everything, and because even what we do see and measure, we measure imperfectly.
• Example: Ask different people the same question and you will get many different answers.
• Statistics helps us make sense of the world by seeing past the underlying variation to find patterns and relationships.

7 What are Data?
• Let’s start with an example: Amazon.com
• Background: Amazon started as a bookstore in 1995. By 1997, Amazon had sold 2.5 million books to more than 1.5 million customers in 150 countries. In 2010, sales reached $34.2 billion, and the company now sells nearly everything, including a $400,000 necklace, yak cheese from Tibet, and the largest book in the world.

8 What are Data?
• So how did they do it? How do they track their customers?
• The answer is data!

9 Numbers only?
• The amount of your last purchase: data, and a number.
• Your name and address: data, but not numbers.
• Zip code: a number, but does it make sense to use it in an analysis such as an average?

10 What are Data?
• Think of some data points that Amazon may collect:

105-2686834  Ohio           Nashville       Kansas   10.99  N  B00000I5Y6  Katherine H.
105-9318443  Illinois       Orange County   Boston   16.99  Y  B000002BK9  Samuel P.
105-1872500  Massachusetts  Bad Blood       Chicago  15.98  N  B000068ZVQ  Chris G.
103-2628345  Canada         Let Go          Mammals  11.99  N  B000001OAA  Monique D.
002-1663369  Ohio           Best of Kansas  Kansas   10.99  N  B002MXA7Q0  Katherine H.

• Try to guess what each column represents.

11 What are Data?
• Why is this hard?
• Because there is no context. If we don’t know what values are measured and what is measured about them, the values are meaningless.
• We can make the meaning clear if we organize them in a data table:

Order Number  Name          State/Country  Previous Album Download  New Purchase Artist  Price  Gift  ASIN
105-2686834   Katherine H.  Ohio           Nashville                Kansas               10.99  N     B00000I5Y6
105-9318443   Samuel P.     Illinois       Orange County            Boston               16.99  Y     B000002BK9
105-1872500   Chris G.      Massachusetts  Bad Blood                Chicago              15.98  N     B000068ZVQ
103-2628345   Monique D.    Canada         Let Go                   Mammals              11.99  N     B000001OAA
002-1663369   Katherine H.  Ohio           Best of Kansas           Kansas               10.99  N     B002MXA7Q0
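As a rough illustration of why context matters, the sketch below stores the five records from the data table as Python dictionaries keyed by column name. The field names (`order_number`, `price`, and so on) are illustrative labels, not Amazon's own, and prices are assumed to be in dollars. Once each value carries a label, questions such as "what is the average price?" or "which orders were gifts?" become answerable.

```python
from statistics import mean

# Illustrative labels for the table's column headers (an assumption).
COLUMNS = ["order_number", "name", "state_country", "previous_album",
           "new_artist", "price", "gift", "asin"]

# The five rows from the data table.
ROWS = [
    ("105-2686834", "Katherine H.", "Ohio", "Nashville", "Kansas", 10.99, "N", "B00000I5Y6"),
    ("105-9318443", "Samuel P.", "Illinois", "Orange County", "Boston", 16.99, "Y", "B000002BK9"),
    ("105-1872500", "Chris G.", "Massachusetts", "Bad Blood", "Chicago", 15.98, "N", "B000068ZVQ"),
    ("103-2628345", "Monique D.", "Canada", "Let Go", "Mammals", 11.99, "N", "B000001OAA"),
    ("002-1663369", "Katherine H.", "Ohio", "Best of Kansas", "Kansas", 10.99, "N", "B002MXA7Q0"),
]

# Attach context: every value is now paired with the name of its column.
records = [dict(zip(COLUMNS, row)) for row in ROWS]

# With context, the price column can be averaged (assumed to be dollars).
average_price = mean(r["price"] for r in records)

# The gift column is a Y/N flag, so it is filtered rather than averaged.
gift_orders = [r["order_number"] for r in records if r["gift"] == "Y"]

print(f"Average price: {average_price:.2f}")
print("Gift orders:", gift_orders)
```

Without the `COLUMNS` list, the same tuples would be the uninterpretable jumble from the previous slide.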

12 Context
• Data must have context to be meaningful.
• Without context, data cannot be interpreted.
• What information provides good context?
  • Who
  • What
  • Where
  • Why
  • When
  • How

13 17, 21, 44, 76
• Are the numbers listed above data?
• Data must have context to be meaningful. The numbers listed above could be test scores, ages of a group of golfers, or the uniform numbers of the starting backfield on the football team.
• Without context, data cannot be interpreted.

14 The How
• How the data are collected can make the difference between insight and nonsense. For example, data that come from a voluntary survey on the Internet are almost always worthless.
The When
• Time frame: data recorded in 1803 mean something very different from data recorded now.
The Where
• Place: data measured in India may be different from data measured in Mexico.
• More specific: indoors/outdoors, house/office.

15 The Who
• In general, the rows of a data table correspond to the individual cases about whom (or which) the data were collected, but cases go by different names depending on the situation:
  • Individuals who answer a survey are called respondents.
  • People on whom we experiment are called subjects or participants.
  • In a database, the rows are called records.
  • Otherwise, we call them what they are: customers, economic quarters, companies, etc.

16 The What
• Characteristics recorded about each individual are called variables. They are usually the columns of a data table.
• Variables can be broken into three categories:
  • Identifiers
  • Categorical
  • Quantitative

17 The What
• Identifiers are useful but not typically used for analysis.
• Each case has a unique identifier, which keeps cases from being confused with one another, but the identifiers themselves do not need to be analyzed.
• Examples: student ID numbers, driver license numbers, social security numbers.
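A minimal sketch of what an identifier is for, using the order numbers and customer names from the Amazon data table. The helper `is_unique` is a hypothetical name introduced for illustration. Order numbers tell the five cases apart even though one customer name appears twice.

```python
# Order numbers and customer names from the Amazon data table.
orders = [
    ("105-2686834", "Katherine H."),
    ("105-9318443", "Samuel P."),
    ("105-1872500", "Chris G."),
    ("103-2628345", "Monique D."),
    ("002-1663369", "Katherine H."),
]

order_numbers = [number for number, _ in orders]
names = [name for _, name in orders]


def is_unique(values):
    """Return True when no value in the list repeats."""
    return len(set(values)) == len(values)


# An identifier must be unique per case; a name need not be.
order_numbers_unique = is_unique(order_numbers)
names_unique = is_unique(names)

print("Order numbers unique:", order_numbers_unique)
print("Names unique:", names_unique)
```

The order number is useful for looking up a single case, but averaging or tallying it would say nothing about the customers.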

18 The What
• Categorical variables tell the group or category each individual belongs to.
• Values are usually text, not numbers. Descriptive responses are usually categorical.
• Examples: male/female, pierced/not, eye color, state, country.
• Examples with numeric values: zip code, area code.
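The sketch below illustrates why a zip code is categorical even though it looks like a number. The zip codes are made up for illustration. Counting how many individuals fall in each category is meaningful; the average can be computed mechanically, but it describes no real place or customer.

```python
from collections import Counter
from statistics import mean

# Hypothetical zip codes for six customers (made up for illustration).
# They are stored as text so that a leading zero, as in "02108", survives.
zip_codes = ["44101", "60601", "44101", "02108", "60601", "44101"]

# Meaningful summary for a categorical variable: a count per category.
counts = Counter(zip_codes)
most_common_zip, most_common_count = counts.most_common(1)[0]

# The arithmetic runs, but the result is not a useful summary.
meaningless_average = mean(int(z) for z in zip_codes)

print("Customers per zip code:", dict(counts))
print("Most common zip code:", most_common_zip)
print("Average of zip codes (not meaningful):", meaningless_average)
```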

19 The What
• Quantitative variables contain measured numerical values for which it makes sense to find an average, usually with units.
• The units give the values meaning and a scale, so we know how far apart two values are.
• Examples: cost, life span, distance, degrees.

20 The What
• Either/or: Some variables with numeric values can be either categorical or quantitative, depending on what we want to know.
• Example: Age
  • Quantitative: Amazon wants to know the average age of those customers who visit their site after 3 am.
  • Categorical: When deciding which album to feature when you visit the site, they’ll have categories child, teen, adult, senior.
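A small sketch of the age example, treating the same variable both ways. The ages are hypothetical, and the cutoffs separating child, teen, adult, and senior (13, 20, and 65) are assumptions chosen only for illustration; the slide does not define them.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer ages in years (made up for illustration).
ages = [9, 15, 22, 34, 41, 67, 70]


def age_group(age):
    """Map an age to a category. The cutoffs 13, 20, and 65 are assumptions."""
    if age < 13:
        return "child"
    if age < 20:
        return "teen"
    if age < 65:
        return "adult"
    return "senior"


# Quantitative use: the values have units (years), so an average makes sense.
average_age = mean(ages)

# Categorical use: each age is replaced by a group label, and groups are counted.
group_counts = Counter(age_group(a) for a in ages)

print(f"Average age: {average_age:.1f} years")
print("Customers per age group:", dict(group_counts))
```

Which treatment is appropriate depends on the question being asked, not on the variable itself.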

21 The What
• Example: Identify each variable as categorical or quantitative.
• A Consumer Reports article about 25 tablet computers lists each tablet’s manufacturer, cost, battery life (hours), operating system (iOS/Android), and overall performance score.
  • Manufacturer: categorical
  • Cost: quantitative
  • Battery life: quantitative
  • Operating system: categorical
  • Performance score: either

22 Example
• Suppose a Consumer Reports article (published in June 2005) on energy bars gave the brand name, flavor, price, number of calories, and grams of protein and fat. Identify the following:
  • Who:
  • What:
  • When:
  • Where:
  • How:
  • Why:
  • Categorical variables:
  • Quantitative variables (with units):

23 Exit Slip
• Popular magazines and websites rank colleges and universities on their “academic quality” in serving undergraduate students. Describe two categorical variables and two quantitative variables that you might record for each institution.

