Measuring Dietary Intake Raymond J. Carroll Department of Statistics Faculty of Nutrition and Toxicology Texas A&M University

Presentation transcript:

Measuring Dietary Intake. Raymond J. Carroll, Department of Statistics, Faculty of Nutrition and Toxicology, Texas A&M University.

Where I am From: Wichita Falls (ranked #10 in the worst jobs in Texas by Texas Monthly, 1980). Best pecans in the world.

What I am Not: I know that potato chips are not a basic healthy food group. However, if you ask me a detailed question about nutrition, then I will ask Joanne Lupton, Nancy Turner, or Meeyoung Hong.

You are what you eat, but do you know who you are? This talk is concerned with a simple question: will lowering her intake of fat decrease a woman’s chance of developing breast cancer? This is a hugely controversial question. The debate has a huge statistical component. It is also relevant to questions such as: if I lower my caloric intake, will I live longer?

Evidence in Favor of the Fat-Breast Cancer Hypothesis: animal studies; ecological comparisons; case-control studies.

International Comparisons: There are major differences in fat and saturated fat intake across countries. Are these related to breast cancer?

International Comparisons

Case-control studies: Find women who have breast cancer, and women who do not. Compare their current fat intakes. A problem on its face: we want past intake, not after-the-fact intake. Not much found in single studies, but pooling them over many diverse studies suggests a fat-breast cancer link.

Evidence against the Fat-Breast Cancer Hypothesis: prospective studies. These studies try to assess a woman’s diet, then follow her health progress to see if she develops breast cancer. The diets of those who developed breast cancer are compared to the diets of those who did not. Only (?) 1 prospective study has found firm evidence suggesting a fat and breast cancer link, and 1 has a negative link.

Prospective Studies: NHANES (National Health and Nutrition Examination Survey): n = 3,145 women aged; Nurses Health Study: n = 60,000+; Pooled Project: n = 300,000+; Norfolk (UK) study: n = 15,000+; AARP: n = 250,000+; WHI Controls: n = 30,000+. AARP and WHI available soon.

The Nurses Health Study, Fat and Breast Cancer: 60,000 women, followed for 10 years. Prospective study. Note that the breast cancer cases were eating less fat. (Pictured: Donna Spiegelman, the NHS statistician.)

Clinical Trials: The lack of consistent (even positive) findings led to the Women’s Health Initiative. Approximately 60,000 women randomized to two groups: healthy eating and typical eating.

WHI Diet Study Objectives

Objections to WHI: Cost ($100,000,000+). Whether Americans can really lower % Calories from Fat to 20%, from the current 35%. Even if the study is successful, difficulties in measuring diet mean that we will not know what components led to the decrease in risk. (Pictured: Ross Prentice of the WHI.)

How do we measure diet in humans? 24 hour recalls; diaries; Food Frequency Questionnaires (FFQ). Walt Willett has a popular book and a popular FFQ.

Objections to the 24 hour recall: Only measures yesterday’s diet, not typical diet. A single 24 hour recall finding a diet-cancer link is not universally scientifically acceptable. Need for repeated applications. Expensive: personal interview or phone interview.

NHANES: Fat is Protective (?) Typical % Calories from Fat. Cases: 35%. Controls: 37%.

NHANES: Calories are Protective (?) Typical Calories. Cases: 1,300. Controls: 1,500.

Food diaries: Hot topic at NCI. Only measures a few days’ diet, not typical diet. A single 3-day diary finding a diet-cancer link is not universally scientifically acceptable. Need for repeated applications. Induces behavioral change??

Typical (Median) Values of Reported Caloric Intake Over 6 Diary Days: WISH Study

The Food Frequency Questionnaire: Do you remember the SAT?

The Pizza Question

The Norfolk Study with ~Diaries and FFQ: 15,000 women, aged 45-74, followed for 8 years. 163 breast cancer cases. Diary: p = ; FFQ: p = . Directly contradicts NHANES (women aged 25-50).

Summary: FFQ does not find a fat and breast cancer link. 24 hour recalls and diaries are expensive. They have found links, but in opposite directions. Diaries also appear to modify behavior. Question: do any of these things actually measure dietary intake? How well or how badly? These are statistical questions!

Do We Know Who We Are? Karl Pearson was arguably the 1st great modern statistician: Pearson chi-squared test, Pearson correlation coefficient. (Pictured: Karl Pearson at age 30.)

Do We Know Who We Are? Pearson was deeply interested in self-reporting errors. In 1896, Pearson ran the following experiment: for each of 3 people, he set up 500 lines on sheets of paper, and had them bisected by hand. (Pictured: a gaggle of lines.)

Pearson’s Experiment: He then had a postdoc measure the error made by each person on each line, and averaged. “Dr. Lee spent several months in the summer of 1896 in the reduction of the observations.” (Pictured: a gaggle of lines, with my bisections.)

Pearson’s Personal Equations: Pearson computed the mean error committed by each individual: the “personal equations”. He found that the errors were individual: his errors were to the right, Dr. Lee’s to the left. (Pictured: Karl Pearson in later life.)

What Do Personal Equations Mean? Given the same set of data, when we are asked to report something, we all make errors, and our errors are personal. In the context of reporting diet, we call this “person-specific bias”. (Pictured: Laurence Freedman of NCI, with whom I did the work.)

What Errors Do FFQs Make? Pretend you and I eat the same amount of fat on average. We each fill out an FFQ twice, take the mean fat intake from the FFQ, and get different answers. Why? Random error: I will give different answers each time. Fixed bias due to societal factors: no one reports all the ice cream he/she eats. Personal equation: we all report differently.

Model Details for Statisticians: The model in symbols. Note how the existence of person-specific bias means that the variance of true intake is less than one would have thought.
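The slide's own formula is not reproduced in this transcript, so the following is only a sketch of one additive model that matches the three error sources described in the talk (random error, a fixed societal bias, and a person-specific bias). All symbols are assumptions introduced here, and the true intake, person-specific bias, and random error are assumed mutually independent.

```latex
% Assumed notation (not taken from the slides):
%   Q_{ij}        : j-th FFQ report of person i, j = 1, ..., k
%   T_i           : true usual intake of person i, with variance \sigma_T^2
%   \beta_0       : fixed bias shared by everyone (societal underreporting)
%   r_i           : person-specific bias (the "personal equation")
%   \epsilon_{ij} : random within-person error
\begin{align*}
Q_{ij} &= \beta_0 + \beta_1 T_i + r_i + \epsilon_{ij},
  \qquad \operatorname{var}(r_i) = \sigma_r^2, \quad
         \operatorname{var}(\epsilon_{ij}) = \sigma_\epsilon^2, \\
\operatorname{var}(\bar{Q}_{i\cdot}) &= \beta_1^2 \sigma_T^2 + \sigma_r^2
  + \sigma_\epsilon^2 / k .
\end{align*}
```

Under these assumptions, repeated FFQs identify $\sigma_\epsilon^2$ but cannot separate $\sigma_r^2$ from $\beta_1^2 \sigma_T^2$. A model that leaves out $r_i$ credits the whole between-person term $\beta_1^2 \sigma_T^2 + \sigma_r^2$ to true intake, which is why the variance of true intake comes out smaller once the person-specific bias is acknowledged.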

Our Hypothesis: We hypothesized that when measuring Fat intake, the personal equation, or person-specific bias, unique to each individual, is large and debilitating. The problem: the actual variability in American diets is much smaller than suspected. If true, the hypothesis says that one cannot really do an epidemiologic study of cancer for total energy or total fat with any degree of success.
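A small simulation can make the hypothesis concrete. The sketch below assumes an additive reporting model, report = fixed bias + slope × true intake + person-specific bias + random error, and every parameter value is made up for illustration; none comes from the talk or from any study.

```python
import random
import statistics

# All values are illustrative assumptions, not estimates from any study.
BETA0, BETA1 = -0.2, 0.5   # fixed (societal) bias and slope on true intake
SD_TRUE = 1.0              # SD of true usual intake T_i
SD_PERSON = 0.8            # SD of person-specific bias r_i
SD_RANDOM = 0.6            # SD of random error on each report
N_REPEATS = 2              # each person fills out the FFQ twice


def simulate(n_people=20000, seed=0):
    """Return (true intakes, mean FFQ reports) for n_people simulated people."""
    rng = random.Random(seed)
    true_intake, ffq_mean = [], []
    for _ in range(n_people):
        t = rng.gauss(0.0, SD_TRUE)      # true usual intake
        r = rng.gauss(0.0, SD_PERSON)    # personal equation, fixed per person
        reports = [BETA0 + BETA1 * t + r + rng.gauss(0.0, SD_RANDOM)
                   for _ in range(N_REPEATS)]
        true_intake.append(t)
        ffq_mean.append(statistics.fmean(reports))
    return true_intake, ffq_mean


true_intake, ffq_mean = simulate()
var_ffq = statistics.pvariance(ffq_mean)

# Naive reading: subtract the random-error part and call the rest "diet".
naive_signal = var_ffq - SD_RANDOM ** 2 / N_REPEATS
# What true intake actually contributes to the variance of the reports.
actual_signal = BETA1 ** 2 * statistics.pvariance(true_intake)

print(f"variance of mean FFQ report:     {var_ffq:.2f}")
print(f"naively attributed to diet:      {naive_signal:.2f}")
print(f"actually driven by true intake:  {actual_signal:.2f}")
```

With these made-up settings, most of the between-person spread in the reports comes from the personal equation rather than from diet, which is the sense in which real dietary variability can be much smaller than the questionnaire suggests.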

Can We Test Our Hypothesis? We need biomarker data that are not much subject to the personal equation. There is no biomarker for Fat. There are biomarkers for energy (calories) and Protein. We expect that studies are too small by orders of magnitude.

Biomarker Data. Protein: available from a number of European studies. Calories and Protein: available from NCI’s OPEN study. Results are surprising. Victor Kipnis was the driving force behind OPEN.

Sample Size Inflation: There are formulae for how large a study needs to be to detect a doubling of risk between low and high Fat/Energy diets. These formulae ignore the personal equation. We recalculated the formulae accounting for: random error in repeated FFQ; societal factors causing underreporting in general; Pearson’s personal equation (we report differently).

Biomarker Data: Sample Size Inflation. If you are interested in the effect of calories on health, multiply the sample size you thought you needed by 11. For protein, by 4.5.
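As a rough illustration of what these multipliers imply, the sketch below applies them to a hypothetical planned study size. It also uses a standard approximation for linear measurement error, namely that the required sample size grows like 1/ρ², where ρ is the correlation between the FFQ report and true intake. That link is an assumption made here for illustration; the talk does not say how its factors were derived.

```python
import math

# Inflation factors quoted in the talk.
INFLATION = {"calories": 11.0, "protein": 4.5}


def inflated_sample_size(n_planned, inflation):
    """Sample size after multiplying the naive plan by the inflation factor."""
    return math.ceil(n_planned * inflation)


def implied_correlation(inflation):
    """Correlation between FFQ and truth implied by inflation = 1 / rho**2.

    This mapping is an assumed approximation, not a formula from the talk.
    """
    return 1.0 / math.sqrt(inflation)


n_planned = 1000  # hypothetical study size computed while ignoring reporting error
for nutrient, factor in INFLATION.items():
    n_needed = inflated_sample_size(n_planned, factor)
    rho = implied_correlation(factor)
    print(f"{nutrient}: need about {n_needed} women, implied rho = {rho:.2f}")
```

Read this way, a factor of 11 corresponds to a questionnaire whose reports correlate only about 0.3 with true intake.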

Relative Odds: Suppose high fat/energy/? diets lead to twice the risk of breast cancer compared to low fat/energy. This is called the Relative Risk. What is the risk we would observe with the FFQ?

Relative Risk: If high calories in fact increase the risk of breast cancer by 100%, and you change your intake dramatically, the FFQ thinks doing so increases the risk by 4%. Result: It is not possible to tell if changing your absolute caloric intake, or your fat intake, or your protein intake will have any health effects.

Relative Risk, Food Composition: If high protein (fat) increases the risk of breast cancer by 100%, your calories remain the same, and you dramatically lower your protein (fat) intake, then the FFQ thinks your risk increases by 20%-30%. Result: It is very difficult to tell if changing your food composition while maintaining your caloric intake will have any health effects.
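The figures on the two relative-risk slides can be read through the usual attenuation approximation, in which the log relative risk seen through an error-prone instrument is the true log relative risk multiplied by an attenuation factor between 0 and 1. The sketch below back-calculates the factor implied by the quoted numbers. The approximation and the function names are assumptions introduced here, not the talk's own calculation.

```python
import math


def attenuation_factor(true_rr, rr_observed):
    """Factor lam such that rr_observed = true_rr ** lam (assumed model)."""
    return math.log(rr_observed) / math.log(true_rr)


def observed_rr(true_rr, lam):
    """Relative risk the instrument would report for a given attenuation."""
    return true_rr ** lam


TRUE_RR = 2.0  # a true 100% increase in risk

# Absolute calories: the FFQ sees a 4% increase.
lam_calories = attenuation_factor(TRUE_RR, 1.04)

# Food composition: the FFQ sees a 20%-30% increase.
lam_low = attenuation_factor(TRUE_RR, 1.20)
lam_high = attenuation_factor(TRUE_RR, 1.30)

print(f"attenuation for absolute calories: {lam_calories:.3f}")
print(f"attenuation for food composition:  {lam_low:.3f} to {lam_high:.3f}")
```

On this reading, the FFQ keeps only about 6% of the true log relative risk for absolute calories, but roughly a quarter to a third of it for food composition.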

Summary: Trying to establish a Fat and Breast Cancer link has proved difficult. Standard instruments hide effects. 24 hour recalls have found effects, but are very expensive. Diaries may(?) change behavior: difficult to believe what they say. There is hope to analyze food composition, not absolute intakes.

Summary: The AARP Study: 250,000+ women, by far the greatest number in any study. My best case conjecture: Huge size → statistical significance. FFQ → small measured increase in risk for dramatic behavioral change. Statistician’s dream: use Pearson’s idea to get at the true increase in risk. (Pictured: a happy statistician dreaming about AARP.)

Summary: The WHI Controls Study: 30,000+ women, all with > 32% Calories from Fat via FFQ. Also includes diaries. Will be able to compare diaries and FFQ. How many studies with 30,000+ diaries can we afford? (Pictured: a happy statistician doing field biology in the Kimberley, Northwest Australia.)

Summary: WHI, 2005, clinical trial. My best case conjecture: probably no statistical effects (?). Even if so, the FFQ is so bad that we will not know what to do: Decrease Fat? Decrease saturated Fat? Eat more grain? Eat more veggies (yuck)?

You are what you eat, but do you know who you are? Diet is incredibly hard to measure. Even 100% increases in risk cannot be seen in large studies. If you read about a diet intervention, measured by an FFQ, and it achieves statistical significance multiple times: wow!

You are what you eat, but do you know who you are? Much work at NCI and WHI and EPIC on new ways of measuring diet. EPIC may be a model, because of the wide distribution of intakes.

Reporting Biases: FFQ are not very good for measuring caloric intake. We do not want to admit our pizza, ice cream, etc.

Reporting Biases: 24 hour recalls are not very good for measuring caloric intake. They are better than FFQ (less bias, for example), but they still are not very good.

Reporting Biases: FFQ are better for % Calories from Protein. Our food composition is better known to us than the amounts. Inflation of sample size is only 2.3, not 4.5 as for actual protein.