Gaining Market Share for Nonparametric Statistics Michael J. Schell Moffitt Cancer Center University of South Florida.

Presentation transcript:

Gaining Market Share for Nonparametric Statistics Michael J. Schell Moffitt Cancer Center University of South Florida

Web of Science: Source of count data for this talk. Words/phrases found in title or abstract. Mainly title-only references before 1991. The number of articles has increased over the years, thus the need for benchmarking.

But is the Market Itself Expanding?

Non-Linear Regression Methods

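Article Counts and Growth Rate of Regression Sub-Fields. Table columns: Sub-Field, article counts, GR*. Sub-fields listed: Non-linear, Wavelets, Linear, Logistic, Mixed models, Data mining, Bioinformatics (the counts and rates did not survive in this transcript, apart from the fragment "429116," attached to Logistic). * Estimated 5-year rate obtained by doubling the count. GR = Growth Rate.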

How Many Discoveries Have Been Lost by Ignoring Modern Statistical Methods? Rand R. Wilcox, American Psychologist, 1998. Arbitrarily small departures from normality result in low power; even when distributions are normal, heteroscedasticity can seriously lower the power of standard ANOVA and regression methods. … most quantitative articles tend to be too technical for applied researchers. If the goal is to avoid low power, the worst method is the ANOVA F test. … the Theil-Sen estimator deserves consideration as well.
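
The Theil-Sen estimator mentioned in the Wilcox quote can be sketched in a few lines. This is a minimal illustration, not code from the talk: it takes the slope as the median of all pairwise slopes and the intercept as the median of the residuals `y - slope * x`, which is one common convention. The function name `theil_sen` and the sample data are hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Return (slope, intercept) using the median of all pairwise slopes."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # pairs with equal x have no defined slope
    ]
    if not slopes:
        raise ValueError("need at least two distinct x values")
    slope = median(slopes)
    # One common choice of intercept: the median of y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical data: a clean line y = 2x + 1 with one gross outlier at the end.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 100]
slope, intercept = theil_sen(x, y)
print(slope, intercept)
```

Because the estimate is a median, a single wild point moves it far less than it would move a least-squares fit.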

British Medical Journal articles by Doug Altman. The scandal of poor medical research, 1994: Why are errors so common? Put simply, much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform, and nobody stops them. Statistics and ethics in medical research. The misuse of statistics is unethical, 1980.

Marketing of Pharmaceuticals 1) Must have produced the drug and shown its efficacy 2) Need to produce the drug in mass quantities 3) Marketing

Marketing of Statistical Ideas 1) Must have derived the statistic and demonstrated its efficacy 2) Need to have available software 3) Need to disseminate the idea

Key Principle In an environment where ideas are not marketed, first on the market wins

First-on-the-market winners: T-test, 1905; ANOVA; Kolmogorov-Smirnov test, 1937; Duncan’s test, 1950; Kaplan-Meier curves, 1958; Cox regression, 1972

Hodges and Lehmann, th Berkeley Symposium Chernoff and Savage (1958) proved that the ARE of the normal scores test is at least 1 “The above results suggest that on the basis of power, at least for large samples, both the Wilcoxon and normal scores tests are preferable to the t-test for general use.”
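
For readers unfamiliar with the normal scores test, here is a rough two-sample sketch. It assumes the van der Waerden form of the scores (the standard normal quantile at rank/(N+1)), no tied observations, and a large-sample normal approximation for the p-value. None of these details are specified on the slide, and the function name `normal_scores_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_scores_z(xs, ys):
    """Two-sample normal scores test (van der Waerden scores, no ties assumed).

    Returns (z, two_sided_p) using a normal approximation.
    """
    pooled = list(xs) + list(ys)
    n_total = len(pooled)
    # Rank the pooled sample from 1 to N (assumes all values are distinct).
    order = sorted(range(n_total), key=lambda i: pooled[i])
    ranks = [0] * n_total
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    std_normal = NormalDist()
    # Replace each rank by the standard normal quantile at rank / (N + 1).
    scores = [std_normal.inv_cdf(r / (n_total + 1)) for r in ranks]
    mean_score = sum(scores) / n_total
    # Test statistic: sum of the scores belonging to the second sample.
    t = sum(scores[len(xs):])
    # Permutation variance of a linear rank statistic.
    var_t = (len(xs) * len(ys) / (n_total * (n_total - 1))) * sum(
        (a - mean_score) ** 2 for a in scores
    )
    z = (t - len(ys) * mean_score) / sqrt(var_t)
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p


# Hypothetical data for illustration only.
z, p = normal_scores_z([1.2, 2.4, 3.1, 4.8], [3.9, 5.5, 6.1, 7.7])
print(z, p)
```

With samples this small the normal approximation is crude; an exact permutation distribution would be the safer choice.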

First Simulation on Robustness of t-test CA Boneau, citations Conclusion: t-test is fine, exponential distribution simulation was done wrong Highest citation count on any subsequent simulation study (39 thru 2000) = 96

Textbook Placement. Basic Practice of Statistics, 4th Ed., David S. Moore (728 pages): Non-parametric tests don’t make the book; they appear in the virtual appendix. Statistics: A Biomedical Introduction, 1977, Hollander and Wolfe: T-test in Chapter 5; Wilcoxon in Chapter 13. Biostatistics, 2nd Ed., van Belle, Fisher, et al., 2004: T-test in Chapter 5; Wilcoxon in Chapter 8.

One-Way Layout for Books of Psalms. Table columns: Book, N, Mn, SD, Sk, Kurt, Range, Md (the table values did not survive in this transcript).

Results: ANOVA, p = .7015; ANOVA on logged data, p = .0586; Kruskal-Wallis, p = .0458; Normal scores, p = .0378. AD sum for data: 14 = AD sum for log data: 1.9 =
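
The Kruskal-Wallis comparison reported above can be sketched as follows. The Psalms data are not available in this transcript, so the groups below are hypothetical, and the p-value comes from a simple Monte Carlo permutation test rather than whatever method produced the slide's figures. The H statistic is computed without a tie correction, and the helper names are illustrative.

```python
import random


def midranks(values):
    """Ranks 1..N, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for H (group labels shuffled at random)."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    observed = kruskal_wallis_h(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if kruskal_wallis_h(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical skewed data in three groups (not the Psalms data).
groups = [[3, 5, 8, 120], [9, 11, 14, 15], [16, 20, 22, 300]]
print(kruskal_wallis_h(groups), permutation_p_value(groups))
```

Because H depends on the data only through ranks, taking logs of positive values leaves it unchanged, whereas the ANOVA p-values on the slide differ between the raw and logged data.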

Deciding Between ANOVA and KW on Principle: If one is convinced that the metric of the values is what one wants, then ANOVA is fine. ANOVA: political kin is the monarchy. KW: political kin is democracy. Power assessed as P(X < Y).
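
One way to read the monarchy/democracy contrast is that a mean can be ruled by one extreme value, while P(X < Y) gives every pair of observations an equal vote. The sketch below estimates P(X < Y) as the fraction of (x, y) pairs with x < y, counting ties as half. That tie convention, the function name, and the data are assumptions made for illustration.

```python
from statistics import mean


def prob_x_less_than_y(xs, ys):
    """Estimate P(X < Y) over all pairs, counting a tie as half a win."""
    wins = sum(
        1.0 if x < y else 0.5 if x == y else 0.0
        for x in xs
        for y in ys
    )
    return wins / (len(xs) * len(ys))


# Hypothetical data: one extreme value makes the mean of xs the larger one,
# yet most individual x values are smaller than the y values.
xs = [1, 2, 3, 4, 100]
ys = [5, 6, 7, 8, 9]
print(mean(xs), mean(ys), prob_x_less_than_y(xs, ys))
```

Here the comparison of means and the pairwise comparison point in opposite directions, which is the kind of disagreement the slide is describing.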

Cancer Research: It has been my experience as a statistician in cancer research that we are: 1) rarely sure of the metric for the data, 2) typically interested in answering the democratic question. Thus, nonparametric analysis has predominated in my applied articles.

Ethical Considerations: Applied statistical work is very important in decision-making. Educators have an ethical responsibility to properly train their “tool user” students in best practices. “Tool user” statisticians have an ethical responsibility to seek best practice information.