
Data preparation for use in SEM
Ned Kock

Data in table format
Each column corresponds to a manifest variable. Some groups of columns correspond to a latent variable. Each row often contains the answers from one subject under a particular condition, and is also known as a “case”.
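The table layout described above can be sketched in a few lines of Python. This is only an illustration: the column names (ECU1, ECU2, Effi1, Effi2), the latent variable names, and all values are invented for the example and do not come from the slides.

```python
# Hypothetical example data: names and values are invented for illustration.
# Each column is a manifest variable; each row is one case (one subject).
columns = ["ECU1", "ECU2", "Effi1", "Effi2"]
cases = [
    [5, 4, 3, 4],  # answers from subject 1
    [2, 3, 5, 5],  # answers from subject 2
    [4, 4, 2, 3],  # answers from subject 3
]

# Groups of columns that correspond to the same latent variable.
latent_variables = {
    "ECU": ["ECU1", "ECU2"],
    "Effi": ["Effi1", "Effi2"],
}


def indicator_values(latent_name):
    """Return, for each case, the values of the columns tied to one latent variable."""
    idx = [columns.index(name) for name in latent_variables[latent_name]]
    return [[row[i] for i in idx] for row in cases]


print(indicator_values("ECU"))  # [[5, 4], [2, 3], [4, 4]]
```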

Missing values
A missing value is an empty cell in a data table. Missing values are a fact of life in many areas of research, including behavioral research. In behavioral research, missing values may be present when:
–Respondents do not answer one or more questions in a questionnaire.
–A researcher empties a data cell because a respondent answered a question with unusable data; e.g., by responding with a “0” (zero) when asked for his or her age.

Examples of missing values
Datasets with missing values are a common occurrence in behavioral research, as well as other types of research.

Percentage of missing data
A simple Excel formula can be used to calculate the percentage of missing data for a manifest variable. How much is too much? A Monte Carlo simulation (Kock, 2014) suggests that as much as 30% may be acceptable; more than that can lead to problems.
Supporting source: Kock, N. (2014). Single missing data imputation in PLS-SEM. Laredo, TX: ScriptWarp Systems.
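The Excel formula itself is not reproduced in this transcript. As a rough equivalent, the same percentage can be computed in Python. The function name, the sample column, and the choice to treat both None and blank strings as missing are assumptions made for this sketch.

```python
def percent_missing(column):
    """Percentage of cells in a column that are empty (None or a blank string)."""
    if not column:
        return 0.0
    missing = sum(1 for v in column if v is None or str(v).strip() == "")
    return 100.0 * missing / len(column)


# Hypothetical "age" column with 3 empty cells out of 10.
age = [34, None, 29, "", 41, 38, None, 52, 45, 30]

pct = percent_missing(age)
print(pct)  # 30.0

# The 30% guideline from the slide, applied to this column.
print("too much missing data" if pct > 30.0 else "within the 30% guideline")
```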

Dealing with missing values
A first step is to make an effort to ensure that no more than 30% of the data is missing in each column of a data table. This can be accomplished by employing data collection techniques that minimize missing data; e.g., targeted questionnaires and interviews. The remaining missing cells can then be filled using one of several imputation methods, such as:
–Arithmetic Mean Imputation
–Multiple Regression Imputation
–Hierarchical Regression Imputation
–Stochastic Multiple Regression Imputation
–Stochastic Hierarchical Regression Imputation
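Of the methods listed above, Arithmetic Mean Imputation is the simplest to illustrate: each empty cell in a column is replaced with the mean of the observed values in that column. The sketch below shows the general idea only; it is not WarpPLS's implementation, and the function name and sample data are invented.

```python
def mean_impute(column):
    """Replace each missing cell (None) with the mean of the observed cells."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


# Hypothetical column with two missing answers; the observed mean is 12 / 4 = 3.0.
scores = [4, None, 2, 5, None, 1]
print(mean_impute(scores))  # [4, 3.0, 2, 5, 3.0, 1]
```

The regression-based methods in the list instead predict each missing cell from other columns, so they need more machinery than this sketch provides.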

Missing data imputation with WarpPLS
Using deletion, listwise or pairwise, to deal with missing data:
Researchers have traditionally used deletion methods, often listwise and pairwise deletion, to deal with missing data. A report by the American Psychological Association Task Force on Statistical Inference stated that these techniques are “among the worst methods available for practical applications”.
Supporting source: Kock, N. (2014). Single missing data imputation in PLS-SEM. Laredo, TX: ScriptWarp Systems.
Main menu > Settings > View or change missing data imputation settings
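One commonly cited drawback of listwise deletion is that a small share of empty cells can remove a large share of cases. The sketch below illustrates this with invented data. It is not taken from the slides or from the cited report, and the function name is hypothetical.

```python
# Hypothetical data table: 5 cases, 3 manifest variables, None marks a missing cell.
rows = [
    [5, 4, None],
    [3, None, 2],
    [4, 4, 5],
    [None, 2, 1],
    [2, 3, 3],
]


def listwise_delete(table):
    """Keep only the cases (rows) that have no missing value in any column."""
    return [r for r in table if all(v is not None for v in r)]


kept = listwise_delete(rows)
total_cells = sum(len(r) for r in rows)
missing_cells = sum(v is None for r in rows for v in r)

print(f"missing cells: {missing_cells} of {total_cells}")
print(f"cases kept after listwise deletion: {len(kept)} of {len(rows)}")
```

In this example only 3 of 15 cells are empty, yet 3 of the 5 cases are discarded.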

Missing data imputation performance
Results from a Monte Carlo simulation:
–Multiple Regression Imputation yielded the least biased mean path coefficient estimates, followed by Arithmetic Mean Imputation.
–With respect to mean loading estimates, Arithmetic Mean Imputation yielded the least biased results, followed by Stochastic Hierarchical Regression Imputation and Hierarchical Regression Imputation.
Supporting source: Kock, N. (2014). Single missing data imputation in PLS-SEM. Laredo, TX: ScriptWarp Systems.
Main menu > Settings > View or change missing data imputation settings

Replacing missing values with SPSS

Creating source data file for WarpPLS
Source data files contain the data used in a WarpPLS analysis. They are often referred to as “raw data files”. Source data files should be prepared as follows:
–They should be .xls or .xlsx files (Excel), or plain text files with the names of the variables first, followed by each data case in the same order as the variables listed (missing data points do not have to be imputed a priori).
–If text files, variable names and numeric data should be separated from each other by tabs.
–If text files, the suffix of the data file should be .txt.
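A plain text file in the layout described above (variable names first, then one case per line, separated by tabs) can also be produced with Python's standard csv module. This is a minimal sketch: the function name, file name, and data are invented, and writing a missing value as an empty field is an assumption based on the slides' definition of a missing value as an empty cell.

```python
import csv
import os
import tempfile


def write_source_file(path, variable_names, cases):
    """Write a tab-delimited .txt file: variable names first, then one case per line."""
    with open(path, "w", newline="", encoding="ascii") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(variable_names)
        for case in cases:
            # Assumption: a missing value (None) is written as an empty field.
            writer.writerow(["" if v is None else v for v in case])


# Hypothetical variable names and data, written to the system temp directory.
path = os.path.join(tempfile.gettempdir(), "raw_data_example.txt")
write_source_file(path, ["ECU1", "ECU2", "Effi1"], [[5, 4, 3], [2, None, 5]])

with open(path, encoding="ascii") as f:
    content = f.read()
print(repr(content))
```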

Using Excel to create a .txt file

Important tips
–One file format that usually works well for a .txt file, and that is widely available, is the ASCII tab-delimited format.
–If you are using Excel to create a .txt file, save the Excel-formatted file first, and create the .txt file with a different name.
–With Excel, have only one worksheet with the raw data.
–You can also create .txt tab-delimited files using SPSS, in which case it is important to instruct SPSS to write the variable names into the .txt file. This is done by default when you use Excel.
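Before importing a .txt file, it can help to check that it follows the tips above: the first line holds variable names, and every data line has the same number of tab-separated fields. The checker below is a hypothetical helper written for this purpose. It is not part of WarpPLS, Excel, or SPSS, and its two rules are a simplification.

```python
def _is_number(text):
    """True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_tab_delimited(text):
    """Return a list of problems found in tab-delimited text (empty list if none)."""
    lines = text.splitlines()
    if not lines:
        return ["file is empty"]
    problems = []
    header = lines[0].split("\t")
    # Variable names are expected to be non-empty and non-numeric.
    if any(_is_number(name) or name.strip() == "" for name in header):
        problems.append("first line does not look like variable names")
    for number, line in enumerate(lines[1:], start=2):
        if len(line.split("\t")) != len(header):
            problems.append(f"line {number} has a different number of fields")
    return problems


good = "ECU1\tECU2\n5\t4\n2\t3\n"
bad = "5\t4\n2\n"  # no variable names, and a short second line
print(check_tab_delimited(good))  # []
print(check_tab_delimited(bad))
```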

Reading raw data file in WarpPLS
–File import wizard
–Viewing and accepting data

Acknowledgements
Adapted text, illustrations, and ideas from the following sources were used in the preparation of the preceding set of slides:
1. Kock, N. (2015). WarpPLS 5.0 User Manual. Laredo, TX: ScriptWarp Systems.
2. Kline, R.B. (1998), Principles and Practice of Structural Equation Modeling, The Guilford Press, New York, NY.
3. MS Excel, SPSS, and WarpPLS software applications.
4. Rencher, A.C. (1998), Multivariate Statistical Inference and Applications, John Wiley & Sons, New York, NY.
5. SPSS’ web site:
6. WarpPLS software.