Sample Geometry and Random Sampling. Shyh-Kang Jeng, Department of Electrical Engineering / Graduate Institute of Communication / Graduate Institute of Networking and Multimedia

Presentation transcript:

1 Sample Geometry and Random Sampling Shyh-Kang Jeng Department of Electrical Engineering/ Graduate Institute of Communication/ Graduate Institute of Networking and Multimedia

2 Array of Data * a sample of size n from a p -variate population

3 Row-Vector View

4 Example 3.1

5 Column-Vector View

6 Example 3.2

7 Geometrical Interpretation of Sample Mean and Deviation

8 Decomposition of Column Vectors

9 Example 3.3

10 Lengths and Angles of Deviation Vectors

11 Example 3.4

12 Random Matrix

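13 Random Sample
– Row vectors $\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'$ represent independent observations from a common joint distribution with density function $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p)$.
– Mathematically, the joint density function of $\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_n'$ is the product $f(\mathbf{x}_1) f(\mathbf{x}_2) \cdots f(\mathbf{x}_n)$, where $f(\mathbf{x}_j) = f(x_{j1}, x_{j2}, \ldots, x_{jp})$.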

14 Random Sample
– Measurements from a single trial, such as $\mathbf{X}_j' = [X_{j1}, X_{j2}, \ldots, X_{jp}]$, will usually be correlated.
– Measurements from different trials must be independent.
– Independence of measurements from trial to trial may not hold when the variables are likely to drift over time.

15 Geometric Interpretation of Randomness
– The column vector $\mathbf{Y}_k' = [X_{1k}, X_{2k}, \ldots, X_{nk}]$ is regarded as a point in n dimensions.
– Its location is determined by the joint probability distribution $f(\mathbf{y}_k) = f(x_{1k}, x_{2k}, \ldots, x_{nk})$.
– For a random sample, $f(\mathbf{y}_k) = f_k(x_{1k}) f_k(x_{2k}) \cdots f_k(x_{nk})$.
– Each coordinate $x_{jk}$ contributes equally to the location through the same marginal distribution $f_k(x_{jk})$.

16 Result 3.1

17 Proof of Result 3.1

18 Proof of Result 3.1

19 Proof of Result 3.1

20 Some Other Estimators

21 Generalized Sample Variance

22 Geometric Interpretation for Bivariate Case 

23 Generalized Sample Variance for Multivariate Cases

24 Interpretation in p-space Scatter Plot
– Equation for points within a constant distance c from the sample mean
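The slide's equation did not survive in this transcript. A common way to write such a region, assumed here rather than taken from the slide, is $(\mathbf{x} - \bar{\mathbf{x}})' \mathbf{S}^{-1} (\mathbf{x} - \bar{\mathbf{x}}) \le c^2$. The minimal sketch below evaluates the left-hand side for the bivariate case using exact fractions. The data matrix `X` and the helper names are hypothetical.

```python
from fractions import Fraction

def sample_mean(X):
    """Component-wise mean of the n rows of X (integer data assumed)."""
    n = len(X)
    return [Fraction(sum(row[k] for row in X), n) for k in range(len(X[0]))]

def sample_cov(X):
    """Sample covariance matrix S with divisor n - 1."""
    n, p = len(X), len(X[0])
    m = sample_mean(X)
    return [[sum((r[i] - m[i]) * (r[k] - m[k]) for r in X) / (n - 1)
             for k in range(p)] for i in range(p)]

def squared_distance(x, X):
    """(x - xbar)' S^{-1} (x - xbar) for p = 2, via the closed-form 2x2 inverse."""
    m, S = sample_mean(X), sample_cov(X)
    det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    d1, d2 = x[0] - m[0], x[1] - m[1]
    return (S[1][1] * d1 * d1 - 2 * S[0][1] * d1 * d2 + S[0][0] * d2 * d2) / det_S

# Hypothetical sample: n = 4 observations on p = 2 variables.
X = [[1, 2], [3, 3], [5, 7], [7, 8]]
distances = [squared_distance(x, X) for x in X]

# A point lies inside the region for a given c when its squared distance <= c**2.
c = 2
inside = [x for x, d in zip(X, distances) if d <= c ** 2]
```

One useful check on such a computation: summed over the sample points themselves, the squared distances always total (n - 1)p.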

25 Example 3.8: Scatter Plots

26 Example 3.8: Sample Mean and Variance-Covariance Matrices

27 Example 3.8: Eigenvalues and Eigenvectors

28 Example 3.8: Mean-Centered Ellipse

29 Example 3.8: Semi-major and Semi-minor Axes

30 Example 3.8: Scatter Plots with Major Axes

31 Result 3.2
– The generalized variance is zero when the columns of the matrix of deviations, $\mathbf{X} - \mathbf{1}\bar{\mathbf{x}}'$, are linearly dependent.
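Result 3.2 can be checked numerically. The sketch below builds a hypothetical data matrix whose third column is a linear function of the first two, confirms that a nonzero vector `a` annihilates the matrix of deviations, and then evaluates |S| exactly with fractions. All names and data are illustrative assumptions.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small p)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def deviation_matrix(X):
    """Rows x_j' - xbar': each column has its sample mean removed."""
    n, p = len(X), len(X[0])
    mean = [Fraction(sum(row[k] for row in X), n) for k in range(p)]
    return [[row[k] - mean[k] for k in range(p)] for row in X]

def sample_cov(X):
    """S = D'D / (n - 1), where D is the matrix of deviations."""
    D = deviation_matrix(X)
    n, p = len(D), len(D[0])
    return [[sum(d[i] * d[k] for d in D) / (n - 1) for k in range(p)]
            for i in range(p)]

# Hypothetical data: column 3 = column 1 + 2 * column 2 + 3.
X = [[1, 4, 12], [2, 1, 7], [4, 3, 13], [5, 5, 18], [3, 2, 10]]

# The deviation columns satisfy d1 + 2*d2 - d3 = 0, i.e. D a = 0 for a = (1, 2, -1).
a = [1, 2, -1]
D_times_a = [sum(d[k] * a[k] for k in range(3)) for d in deviation_matrix(X)]

generalized_variance = det(sample_cov(X))                       # zero
reduced_variance = det(sample_cov([row[:2] for row in X]))      # positive
```

Dropping the dependent column restores a positive generalized variance, which is the usual remedy when a variable is redundant.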

32 Proof of Result 3.2

33 Proof of Result 3.2

34 Example 3.9

35 Example 3.9

36 Examples That Cause Zero Generalized Variance
Example 1
– Data are test scores
– Included variables that are sums of others
– e.g., algebra score and geometry score were combined into a total math score
– e.g., class midterm and final exam scores were summed to give total points
Example 2
– Total weight of chemicals was included along with that of each component
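The test-score case in Example 1 can be illustrated with a small sketch. The scores below are made up for illustration. The third column is the sum of the first two, so the generalized variance of all three variables is exactly zero, while the two component scores alone give a positive value.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def sample_cov(X):
    """Sample covariance matrix with divisor n - 1 (integer data assumed)."""
    n, p = len(X), len(X[0])
    mean = [Fraction(sum(row[k] for row in X), n) for k in range(p)]
    return [[sum((r[i] - mean[i]) * (r[k] - mean[k]) for r in X) / (n - 1)
             for k in range(p)] for i in range(p)]

# Hypothetical scores: (algebra, geometry) for five students.
components = [[70, 80], [85, 75], [60, 90], [95, 85], [80, 70]]

# Append the total math score as a third, redundant variable.
with_total = [[alg, geo, alg + geo] for alg, geo in components]

gv_with_total = det(sample_cov(with_total))   # zero: total = algebra + geometry
gv_components = det(sample_cov(components))   # positive once the total is dropped
```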

37 Example 3.10

38 Result 3.3
– If the sample size is less than or equal to the number of variables ($n \le p$), then $|\mathbf{S}| = 0$ for all samples.
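A quick numerical check of Result 3.3, offered as a sketch rather than a proof: the n deviation vectors sum to zero, so they span at most n - 1 dimensions, and with n ≤ p the p × p matrix S cannot have full rank. The code draws random integer samples (the seed and value range are arbitrary choices) and evaluates |S| exactly.

```python
import random
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def sample_cov(X):
    """Sample covariance matrix with divisor n - 1 (requires n >= 2)."""
    n, p = len(X), len(X[0])
    mean = [Fraction(sum(row[k] for row in X), n) for k in range(p)]
    return [[sum((r[i] - mean[i]) * (r[k] - mean[k]) for r in X) / (n - 1)
             for k in range(p)] for i in range(p)]

rng = random.Random(0)  # fixed seed so the run is repeatable

def random_sample(n, p):
    """n observations on p integer-valued variables."""
    return [[rng.randint(-9, 9) for _ in range(p)] for _ in range(n)]

# Try every (n, p) with 2 <= n <= p <= 4, five random samples each.
all_zero = all(
    det(sample_cov(random_sample(n, p))) == 0
    for p in (2, 3, 4)
    for n in range(2, p + 1)
    for _ in range(5)
)
```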

39 Proof of Result 3.3

40 Proof of Result 3.3

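41 Result 3.4
Let the p × 1 vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, where $\mathbf{x}_j'$ is the j-th row of the data matrix $\mathbf{X}$, be realizations of the independent random vectors $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$.
1. If the linear combination $\mathbf{a}'\mathbf{X}_j$ has positive variance for each nonzero constant vector $\mathbf{a}$, then, provided that $p < n$, $\mathbf{S}$ has full rank with probability 1 and $|\mathbf{S}| > 0$.
2. If, with probability 1, $\mathbf{a}'\mathbf{X}_j$ is a constant $c$ for all $j$, then $|\mathbf{S}| = 0$.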
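The second statement follows from a short argument, sketched here under the assumption that $\mathbf{a} \ne \mathbf{0}$. It is not necessarily the route taken on the proof slide, whose content did not survive in this transcript.

```latex
\begin{aligned}
\mathbf{a}'\mathbf{x}_j = c \ \text{for all } j
  &\;\Longrightarrow\; \mathbf{a}'\bar{\mathbf{x}}
     = \frac{1}{n}\sum_{j=1}^{n} \mathbf{a}'\mathbf{x}_j = c \\
  &\;\Longrightarrow\; (\mathbf{x}_j - \bar{\mathbf{x}})'\mathbf{a} = c - c = 0
     \ \text{for all } j \\
  &\;\Longrightarrow\; (\mathbf{X} - \mathbf{1}\bar{\mathbf{x}}')\,\mathbf{a} = \mathbf{0} \\
  &\;\Longrightarrow\; (n-1)\,\mathbf{S}\,\mathbf{a}
     = (\mathbf{X} - \mathbf{1}\bar{\mathbf{x}}')'(\mathbf{X} - \mathbf{1}\bar{\mathbf{x}}')\,\mathbf{a}
     = \mathbf{0}
  \;\Longrightarrow\; |\mathbf{S}| = 0 .
\end{aligned}
```

The last step uses the fact that a square matrix with a nonzero vector in its null space is singular.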

42 Proof of Part 2 of Result 3.4

43 Generalized Sample Variance of Standardized Variables

44 Volume Generated by Deviation Vectors of Standardized Variables

45 Example 3.11

46 Total Sample Variance

47 Sample Mean as Matrix Operation

48 Covariance as Matrix Operation

49 Covariance as Matrix Operation

50 Covariance as Matrix Operation
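The formulas on these slides did not survive in this transcript. The standard matrix forms, assumed here, are $\bar{\mathbf{x}} = \frac{1}{n}\mathbf{X}'\mathbf{1}$ and $\mathbf{S} = \frac{1}{n-1}\mathbf{X}'\left(\mathbf{I} - \frac{1}{n}\mathbf{1}\mathbf{1}'\right)\mathbf{X}$. The sketch below implements both with plain lists and exact fractions. The helper names and the data are hypothetical.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mean_vector(X):
    """xbar = (1/n) X' 1, returned as a p x 1 column."""
    n = len(X)
    ones = [[Fraction(1)] for _ in range(n)]
    return [[v / n for v in row] for row in matmul(transpose(X), ones)]

def centering_matrix(n):
    """I - (1/n) 1 1', which subtracts the column means when applied to X."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def cov_matrix(X):
    """S = (1/(n-1)) X' (I - (1/n) 1 1') X."""
    n = len(X)
    product = matmul(matmul(transpose(X), centering_matrix(n)), X)
    return [[v / (n - 1) for v in row] for row in product]

# Hypothetical sample: n = 4 observations on p = 2 variables.
X = [[1, 2], [3, 3], [5, 7], [7, 8]]
xbar = mean_vector(X)
S = cov_matrix(X)
```

Because the centering matrix is symmetric and idempotent, applying it once in the middle of the product is equivalent to centering both copies of X.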

51 Sample Standard Deviation Matrix

52 Result 3.5

53 Proof of Result 3.5

54 Proof of Result 3.5

55 Result 3.6