Plan for today
An example with 3 variables
Face ratings 1: age, gender, and attractiveness
– Histograms and scatter plots
– Using symbols and colors to visually segment data
– Full and partial correlations
– 3D scatter plots
Understanding intransitive correlations
Face ratings 2: dominance, neoteny, and attractiveness
– 3D scatter plots
– Stepwise and full regression
Fully crossed data
– Surface plots
Time permitting: simplifying high-dimensional data
– Principal components analysis (PCA)
Face ratings
Gender? Age? Attractiveness?
76 raters x 276 faces x 3 characteristics (with Corinne Olafsen ’14)
Mean data across 76 raters: 279 faces x 3 characteristics
Exploratory look at the data: histograms
– In Matlab:
>> age2 = [ … ];
>> gender2 = [ … ];
>> attr2 = [ … ];
>> figure(101)
>> hist(age2);
>> figure(102)
>> hist(gender2);
>> figure(103)
>> hist(attr2);
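Called as above, hist uses 10 bins by default. A minimal sketch of choosing the bin count explicitly and labeling the axes follows; the vector x, the bin count, and the figure number are made up for illustration and are not the face-rating data.

```matlab
% Made-up stand-in for a vector of mean ratings (NOT the real data).
x = [1.2 1.5 1.9 2.1 2.3 2.4 2.6 2.8 3.1 3.7]';

nbins = 6;                           % assumed bin count; hist defaults to 10
[counts, centers] = hist(x, nbins);  % with outputs, hist returns counts instead of plotting

figure(901); clf;                    % hypothetical figure number
bar(centers, counts, 1);             % bar width 1 makes adjacent bars touch
xlabel('Mean rating'); ylabel('Number of faces');
```

Trying a few values of nbins is a quick way to check that a bump or gap in the histogram is not just a binning artifact.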
3D Histograms!
– In Matlab:
>> figure(104)
>> hist3([age2 gender2]);
>> xlabel('Age'); ylabel('Gender');
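A bivariate histogram can also be sketched with base Matlab only, assuming a release that provides histcounts2. The data, the 4-by-3 bin grid, and the figure number below are made up for illustration.

```matlab
% Made-up stand-in data (NOT the real ratings).
x = [20 22 25 31 34 40 41 47 52 60]';            % e.g. age in years
y = [1.1 3.8 1.4 3.5 2.0 3.9 1.2 2.9 3.6 1.8]';  % e.g. gender rating, 1-4

% Count points in a 4-by-3 grid: rows of N are x bins, columns are y bins.
[N, xedges, yedges] = histcounts2(x, y, [4 3]);

figure(902); clf;          % hypothetical figure number
bar3(N');                  % transpose so x bins run along the plot's x-axis
xlabel('Age bin'); ylabel('Gender bin'); zlabel('Count');
```

The returned xedges and yedges give the bin boundaries, which is useful for relabeling the tick marks in the original units.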
Looking at 2 variables at a time
Pairwise scatter plots
(1) Attractiveness versus Gender
In Matlab:
>> figure(1); set(gca,'fontsize',16);
>> plot(gender2, attr2, '.k');
>> xlabel('Gender (1=f, 4=m)')
>> ylabel('Attractiveness (1-4)')
(2) Attractiveness versus Age
In Matlab:
>> figure(2); set(gca,'fontsize',16);
>> plot(age2, attr2, '.k');
>> xlabel('Age (years)')
>> ylabel('Attractiveness (1-4)')
(3) Age versus Gender
In Matlab:
>> figure(3); set(gca,'fontsize',16);
>> plot(gender2, age2, '.k');
>> xlabel('Gender (1=f, 4=m)')
>> ylabel('Age (years)')
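With more than three variables, typing each pairwise plot by hand gets tedious. One possible sketch loops over every pair of columns in a data matrix; the matrix X, its values, and the figure number are made up for illustration.

```matlab
% Made-up stand-in data, one column per variable (NOT the real ratings).
X = [20 1.1 3.2; 25 3.8 2.1; 31 1.4 3.0; 40 3.5 1.9; 47 2.0 2.4; 60 3.9 1.5];
names = {'Age (years)', 'Gender (1=f, 4=m)', 'Attractiveness (1-4)'};

pairs = nchoosek(1:size(X,2), 2);   % every pair of column indices
figure(903); clf;                   % hypothetical figure number
for k = 1:size(pairs,1)
    subplot(1, size(pairs,1), k);   % one panel per pair
    plot(X(:,pairs(k,1)), X(:,pairs(k,2)), '.k', 'markersize', 15);
    xlabel(names{pairs(k,1)}); ylabel(names{pairs(k,2)});
end
```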
Breaking it down
Attractiveness versus age (using different colors for gender)
In Matlab:
>> figure(801); set(gca,'fontsize',16);
>> m2 = find(gender2>2.5);
>> plot(age2(m2),attr2(m2),'.b', 'markersize',25);
>> hold on
>> f2 = find(gender2<=2.5);
>> plot(age2(f2),attr2(f2),'.r', 'markersize',25);
>> xlabel('Age'); ylabel('Attractiveness (1-4)');
>> legend({'male' 'female'});
Attractiveness versus gender (using different colors for gender)
In Matlab:
>> figure(802); set(gca,'fontsize',16);
>> plot(gender2(m2),attr2(m2),'.b', 'markersize',25);
>> hold on
>> plot(gender2(f2),attr2(f2),'.r', 'markersize',25);
>> xlabel('Gender'); ylabel('Attractiveness (1-4)');
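The same male/female split can be written with a logical mask instead of find, which keeps the two groups complementary by construction. This is only a sketch: the data values, the name isMale, and the figure number are made up for illustration.

```matlab
% Made-up stand-in data (NOT the real ratings).
gender2 = [1.2 3.6 2.5 3.9 1.8 2.6]';
attr2   = [3.1 2.0 2.7 1.9 3.3 2.2]';

isMale = gender2 > 2.5;      % logical mask, same split as find(gender2>2.5)
figure(904); clf; hold on    % hypothetical figure number
plot(gender2(isMale),  attr2(isMale),  '.b', 'markersize', 25);
plot(gender2(~isMale), attr2(~isMale), '.r', 'markersize', 25);
xlabel('Gender'); ylabel('Attractiveness (1-4)');
legend({'male' 'female'});
```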
Breaking it down
Correlations between gender and attractiveness?
Overall:
>> [r p] = corr(gender2, attr2)
r = …
p = …e-06
Just males:
>> [r p] = corr(gender2(m2), attr2(m2))
r = …
p = …
Just females:
>> [r p] = corr(gender2(f2), attr2(f2))
r = …
p = …e-11
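Splitting by group is one way to control for a third variable; a partial correlation is another. As a sketch under made-up data, the first-order partial correlation of x and y controlling for z can be computed from the three full correlations using base Matlab's corrcoef. The variable names and values are illustrative only.

```matlab
% Made-up stand-in data (NOT the real ratings).
x = [1 2 3 4 5 6 7 8]';            % e.g. gender rating
y = [2 1 4 3 6 5 8 7]';            % e.g. attractiveness
z = [1 1 2 2 3 3 4 4]';            % e.g. age, the variable to control for

R = corrcoef([x y z]);             % 3-by-3 matrix of pairwise (full) correlations
rxy = R(1,2); rxz = R(1,3); ryz = R(2,3);

% First-order partial correlation of x and y, controlling for z.
rxy_z = (rxy - rxz*ryz) / sqrt((1 - rxz^2) * (1 - ryz^2));
```

This equals the correlation between the residuals of x and y after each is regressed on z. If the Statistics Toolbox is available, partialcorr(x,y,z) computes the same quantity and also returns a p-value.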
Putting it all together with color
Hue = gender; brightness = age
>> figure(804); clf; set(gcf,'color','w'); set(gca,'fontsize',16);
>> for i=1:length(m2)
markercolor = [0 0 1-(age2(m2(i))-min(age2(m2)))/(max(age2(m2))-min(age2(m2)))];
plot(gender2(m2(i)),attr2(m2(i)),'.', 'markersize',35,'color',markercolor);
hold on
>> end
>> for i=1:length(f2)
markercolor = [1-(age2(f2(i))-min(age2(f2)))/(max(age2(f2))-min(age2(f2))) 0 0];
plot(gender2(f2(i)),attr2(f2(i)),'.', 'markersize',35,'color',markercolor);
>> end
>> xlabel('Gender'); ylabel('Attractiveness');
Putting it all together with color
– Blue = male
– Red = female
– Brighter = younger
– Darker = older
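The point-by-point loop can be replaced by a single scatter call that takes one RGB row per point. The sketch below builds that color matrix with the same mapping (blue or red for gender, brighter for younger within each group); the data values, the names isMale, bright, and C, and the figure number are made up for illustration.

```matlab
% Made-up stand-in data (NOT the real ratings).
age2    = [20 35 50 65 22 38 55 70]';
gender2 = [3.8 3.5 3.9 3.2 1.2 1.5 1.1 1.9]';
attr2   = [2.4 2.2 1.9 1.6 3.3 3.0 2.5 2.0]';

isMale = gender2 > 2.5;
% Brightness: 1 for the youngest face in each group, 0 for the oldest.
bright = zeros(size(age2));
for g = [true false]
    idx = (isMale == g);
    bright(idx) = 1 - (age2(idx) - min(age2(idx))) / (max(age2(idx)) - min(age2(idx)));
end

C = zeros(numel(age2), 3);        % one RGB row per face
C(isMale, 3)  = bright(isMale);   % males: shades of blue
C(~isMale, 1) = bright(~isMale);  % females: shades of red

figure(905); clf;                 % hypothetical figure number
scatter(gender2, attr2, 100, C, 'filled');
xlabel('Gender'); ylabel('Attractiveness');
```

scatter3 accepts the same color matrix, so the 3D version only needs the extra coordinate.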
3D scatter plots
>> figure(803); clf; set(gcf,'color','w'); set(gca,'fontsize',16);
>> plot3(age2,gender2,attr2, '.k')
>> xlabel('Age'); ylabel('Gender'); zlabel('Attractiveness');
3D scatter plots with colors
>> figure(805); clf; set(gcf,'color','w'); set(gca,'fontsize',16);
>> for i=1:length(m2)
markercolor = [0 0 1-(age2(m2(i))-min(age2(m2)))/(max(age2(m2))-min(age2(m2)))];
plot3(age2(m2(i)),gender2(m2(i)),attr2(m2(i)),'.', 'markersize',25,'color',markercolor);
hold on
>> end
>> for i=1:length(f2)
markercolor = [1-(age2(f2(i))-min(age2(f2)))/(max(age2(f2))-min(age2(f2))) 0 0];
plot3(age2(f2(i)),gender2(f2(i)),attr2(f2(i)),'.', 'markersize',25,'color',markercolor)
>> end
>> xlabel('Age'); ylabel('Gender'); zlabel('Attractiveness');
New example: Dominance, neoteny, and attractiveness (with Brianna Jeska ’15)
Predictions:
– Dominance → attractiveness
– Neoteny → attractiveness
– But dominance is negatively related to neoteny (???)
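Correlations are not transitive: two variables can each correlate positively with a third while correlating negatively with each other. The simulation below is a sketch of that possibility with entirely made-up numbers; the sample size, the weights, and the noise level are assumptions chosen for illustration, not estimates from the face data.

```matlab
rng(1);                        % fixed seed so the sketch is repeatable
n  = 2000;                     % hypothetical number of simulated faces
z1 = randn(n,1); z2 = randn(n,1);

dom  = z1;                               % simulated dominance
neot = -0.5*z1 + sqrt(0.75)*z2;          % built to correlate about -0.5 with dom
attr = dom + neot + 0.5*randn(n,1);      % both traits add to attractiveness

R = corrcoef([dom neot attr]);
r_dom_neot  = R(1,2);   % negative by construction
r_dom_attr  = R(1,3);   % positive
r_neot_attr = R(2,3);   % positive
```

Because attractiveness here is the sum of both traits, each trait carries its own positive contribution even though the traits trade off against one another.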
Dominance, neoteny, and attractiveness
Data: 13 raters x 39 faces x 3 characteristics
Mean data across 13 raters: 39 faces x 3 characteristics
Dominance, neoteny, and attractiveness
In Matlab:
>> dom = [ … ];
>> neot = [ … ];
>> attr = [ … ];
Pairwise scatter plots and correlations
Attractiveness vs. neoteny
In Matlab:
>> figure(1); set(gca,'fontsize',16);
>> plot(neot,attr,'.k');
>> xlabel('Neoteny')
>> ylabel('Attractiveness');
>> [r p] = corr(neot,attr)
r = …
p = …
marginally correlated
Attractiveness vs. dominance
In Matlab:
>> figure(2); set(gca,'fontsize',16);
>> plot(dom,attr,'.k');
>> xlabel('Dominance')
>> ylabel('Attractiveness');
>> [r p] = corr(dom,attr)
r = …
p = …e-04
strongly correlated
Neoteny vs. dominance?
In Matlab:
>> figure(3); set(gca,'fontsize',16);
>> plot(dom,neot,'.k');
>> xlabel('Dominance')
>> ylabel('Neoteny');
>> [r p] = corr(dom,neot)
r = …
p = …
negatively correlated!
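All three pairwise correlations and their p-values can also be obtained in one call. The sketch below uses base Matlab's corrcoef on made-up stand-in vectors; the values are illustrative only and will not reproduce the results from the real ratings.

```matlab
% Made-up stand-in data (NOT the real ratings).
dom  = [1.5 2.0 2.8 3.1 3.6 4.2 4.9 5.5]';
neot = [5.1 4.0 4.6 3.2 3.9 2.5 2.9 1.8]';
attr = [2.9 3.4 3.1 3.8 3.6 4.4 4.0 4.7]';

[R, P] = corrcoef([dom neot attr]);   % R: correlations, P: two-sided p-values
names = {'dom', 'neot', 'attr'};
for i = 1:2
    for j = i+1:3
        fprintf('%s vs. %s: r = %.2f, p = %.3g\n', names{i}, names{j}, R(i,j), P(i,j));
    end
end
```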
3D scatter plots
>> figure(4); set(gca,'fontsize',16);
>> plot3(dom,neot,attr,'.k');
>> xlabel('Dominance'); ylabel('Neoteny'); zlabel('Attractiveness');
Rotating the view recovers each pairwise plot:
>> view([0 0])     attr vs. dom
>> view([90 0])    attr vs. neot
>> view([0 90])    neot vs. dom
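The three views can be shown side by side rather than by rotating one figure. A minimal sketch, with made-up data and a hypothetical figure number:

```matlab
% Made-up stand-in data (NOT the real ratings).
dom  = [1.5 2.0 2.8 3.1 3.6 4.2]';
neot = [5.1 4.0 4.6 3.2 3.9 2.5]';
attr = [2.9 3.4 3.1 3.8 3.6 4.4]';

views  = [0 0; 90 0; 0 90];                        % [azimuth elevation] per panel
titles = {'attr vs. dom', 'attr vs. neot', 'neot vs. dom'};
figure(906); clf;                                  % hypothetical figure number
for k = 1:3
    subplot(1,3,k);
    plot3(dom, neot, attr, '.k', 'markersize', 15);
    xlabel('Dominance'); ylabel('Neoteny'); zlabel('Attractiveness');
    view(views(k,:)); title(titles{k});
end
```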
Stepwise and full regression
Stepwise regression model:
>> stepwise([dom neot],attr)
Full regression model:
>> c = regress(attr,[dom neot])
c = (coefficients of dom and neot)
>> y = c(1)*dom + c(2)*neot;
>> figure(5); set(gca,'fontsize',16);
>> plot(y,attr,'.');
>> xlabel('Attractiveness predictor');
>> ylabel('Attractiveness');
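The model above has no constant term, so the fitted plane is forced through the origin. As a sketch of the alternative, adding a column of ones to the predictors fits an intercept as well; the example uses base Matlab's backslash operator and made-up data in which attr is exactly linear, so the recovered coefficients are known in advance.

```matlab
% Made-up stand-in data (NOT the real ratings); attr is exactly linear here.
dom  = [1 2 3 4 5 6]';
neot = [6 4 5 2 3 1]';
attr = 1 + 0.5*dom + 0.25*neot;

X = [ones(size(dom)) dom neot];   % column of ones adds an intercept term
b = X \ attr;                     % least-squares fit: [intercept; dom; neot]

y  = X * b;                                                 % predicted attractiveness
R2 = 1 - sum((attr - y).^2) / sum((attr - mean(attr)).^2);  % proportion of variance explained
```

The same X can be passed to regress(attr, X) to get the coefficients from the Statistics Toolbox instead.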
Surface plots
Requirements:
– 2 independent variables
– Data for every combination of values on the independent variables
For example: a lexical decision task
– IVs:
  1. Orientation of the string (0°, 45°, 90°, 135°, 180°)
  2. String length (3 letters, 4 letters, 5 letters, 6 letters)
– DV: Reaction time
Surface plots
Made up data:
In Matlab:
>> v = [ … ];
>> figure(11)
>> surf(v)
OR
>> figure(12); set(gca,'fontsize',16)
>> a = [0:22.5:180];
>> wl = [3:8];
>> surf(a,wl,v);
>> xlabel('Angle')
>> ylabel('Word length')
>> zlabel('Reaction time')
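When surf is given axis vectors, the data matrix needs one row per value of the second vector and one column per value of the first. The sketch below builds a fully crossed grid for the lexical decision example (5 orientations by 4 string lengths); the reaction times come from an arbitrary made-up formula, and the figure number is hypothetical.

```matlab
angles  = [0 45 90 135 180];        % orientation of the string (degrees)
lengths = [3 4 5 6];                % string length (letters)

[A, L] = meshgrid(angles, lengths); % 4-by-5 grids: rows = lengths, columns = angles
% Made-up reaction times in ms (an arbitrary formula, NOT real data).
v = 500 + 0.5*A + 20*L;

figure(907); clf;                   % hypothetical figure number
surf(angles, lengths, v);           % v needs length(lengths) rows and length(angles) columns
xlabel('Angle'); ylabel('Word length'); zlabel('Reaction time');
```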
Dimensionality reduction
Principal components analysis
– A type of factor analysis, assuming normally distributed variables
– Reduces high-dimensional data into a manageable number of dimensions
Example: face space
Parameterizing silhouettes (Davidenko, Journal of Vision, 2007)
The resulting … x 480 matrix is the basis for PCA
PCA There is a lot of inter-correlation among the 36 original parameters. By using Principal Components Analysis, one can more efficiently represent the underlying data (in this case, silhouette faces) with fewer than 36 dimensions.
Principal Components Analysis
x-y representation versus PC representation
PCA in Matlab
In Matlab: e.g. x is an n by m matrix, where
n = number of data points
m = number of dimensions
>> [pc score latent tsquare] = princomp(x);
Note 1: n needs to be greater than m
Note 2: it is useful to use zscore(x) instead of x
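To show what the outputs mean, here is a sketch of PCA on standardized data using only base Matlab's svd. It is not the toolbox implementation: the data are randomly generated stand-ins, the sizes are hypothetical, and the signs of the components may differ from what princomp returns. Newer Matlab releases also provide pca in the Statistics Toolbox.

```matlab
rng(2);                                  % fixed seed so the sketch is repeatable
n = 200; m = 3;                          % hypothetical sizes: n data points, m dimensions
t = randn(n,1);
X = [t, 2*t + 0.3*randn(n,1), randn(n,1)];   % made-up data; columns 1 and 2 are correlated

% Standardize each column (what zscore does), using only base Matlab.
Z = bsxfun(@rdivide, bsxfun(@minus, X, mean(X)), std(X));

[U, S, V] = svd(Z, 'econ');
pc     = V;                              % columns are the principal component directions
score  = U * S;                          % data expressed in PC coordinates (equals Z*pc)
latent = diag(S).^2 / (n - 1);           % variance along each PC, largest first
explained = 100 * latent / sum(latent);  % percent of total variance per PC
```

With standardized columns the variances in latent sum to m, so explained shows directly how few components are needed to capture most of the variance.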
A slice of silhouette face space (Davidenko, Journal of Vision, 2007)
– The average of 480 faces
– Traveling along PC 1
– Traveling along PC 2
Thank you!
Slides, data, and Matlab code will be on the CSASS website: csass.ucsc.edu/short courses/index.html
Please contact me with any questions or if you would like help analyzing and/or visualizing your multivariate data.