Published by Pierce Carpenter. Modified over 6 years ago.
1
Data visualization and manipulation with Open Office Calc and Microsoft Excel
2
Agenda
About Excel/Calc
Spreadsheets
  What is the structure of spreadsheets?
  What can I do with them?
  Advantages/Disadvantages
Key Features
  How does it fit in with other tools?
Example: Import and analyze the Framingham data set
3
Consider the tasks you will need to perform
Data collection and editing?
Data transformation/calculations?
Basic vs. advanced statistics?
Making figures?
A single platform may not be enough, but use as few as possible.
[Table: software compared by task. Columns: Software, Mac, PC, Data entry/forms, Data editing, Transform data, Basic stats, Adv. stats, Figures. Rows: REDCap, Epi Info, MS Office/Open Office, R/RStudio. Surviving cell values: REDCap (web, +++, -, +); Epi Info (X, ++).]
4
Basics about Excel/Calc
Microsoft Excel: standard part of MS Office
Calc: similar to, and compatible with, Microsoft Excel
OpenOffice vs. Excel:
  OpenOffice is free (including upgrades) vs. paid for Excel
  OpenOffice has unlimited licenses
  OpenOffice uses more file formats
  Excel is more widely used
  Excel has more direct customer support
5
Spreadsheets
Can inspect, modify, and analyze data in table form
Can convert data between different formats (.xls, .csv, .txt, .tab, and more)
Intuitive interface
Little to no programming for most tasks
Can be used to enter data, though not recommended
  Poor security, easy to lose track and have conflicting data
6
Basic spreadsheet
7
Spreadsheet structure
Sheet = rows (1 to …) and columns (A to …)
A cell is described by a row and a column
  E.g. A1, E5, F10, etc.
Can specify a range of cells (A1:A5, A2:C10)
Can use cell location to perform calculations
  E.g. C5 = A5+B5
Multiple sheets together make a workbook
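To make the cell addressing above concrete, here is a minimal Python sketch of how references such as A1 or A2:C10 map onto row and column numbers. The helper names (`parse_cell`, `expand_range`) and the dictionary used as a sheet are illustrative assumptions; they are not part of Excel or Calc.

```python
import re

def parse_cell(ref):
    """Convert an A1-style reference such as 'C5' to (row, column), both 1-based."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters.upper():
        # Column letters behave like base-26 digits: A=1 ... Z=26, AA=27.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return int(digits), column

def expand_range(ref):
    """List every (row, column) pair covered by a range such as 'A1:A5'."""
    start, end = ref.split(":")
    (row1, col1), (row2, col2) = parse_cell(start), parse_cell(end)
    return [(r, c) for r in range(row1, row2 + 1) for c in range(col1, col2 + 1)]

# Hypothetical sheet stored as a dict keyed by (row, column).
sheet = {parse_cell("A5"): 120, parse_cell("B5"): 80}
# Equivalent of typing =A5+B5 into cell C5.
sheet[parse_cell("C5")] = sheet[parse_cell("A5")] + sheet[parse_cell("B5")]
```

The range A2:C10 expands to 9 rows by 3 columns, which is why a single range reference can feed a whole block of cells into one calculation.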
8
What can I do with spreadsheets?
Select, copy and paste rows or columns between data sets
Visual filtering and sorting
Link tables together using point-and-click
Create expressions and formulas using drag-and-drop with visual selection of desired cells
Can write more complex expressions but less flexible than other programs
9
What statistics can I do with Excel?
Descriptive statistics
  Mean/median/mode
  Distributions
  Tables (including cross-tables)
  Charts/figures
Comparisons
  Chi-square
  T-test
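For readers who want to see what these statistics compute, the following is a rough Python sketch using only the standard library. All data values are made up for illustration, and the `chi_square` helper is an assumption of this sketch; it returns only the Pearson test statistic, not a p-value.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical values, made up for illustration.
ages = [34, 45, 45, 52, 61, 45, 38]
summary = {"mean": mean(ages), "median": median(ages), "mode": mode(ages)}

# Cross-table: count each (smoker, hypertension) combination.
records = [("yes", "yes"), ("yes", "no"), ("no", "no"), ("no", "no"),
           ("yes", "yes"), ("no", "yes"), ("no", "no"), ("yes", "yes")]
crosstab = Counter(records)

def chi_square(table):
    """Pearson chi-square statistic for a table given as {(row, col): count}."""
    row_totals, col_totals = Counter(), Counter()
    for (row, col), count in table.items():
        row_totals[row] += count
        col_totals[col] += count
    total = sum(table.values())
    statistic = 0.0
    for row in row_totals:
        for col in col_totals:
            # Expected count if the row and column variables were independent.
            expected = row_totals[row] * col_totals[col] / total
            observed = table.get((row, col), 0)
            statistic += (observed - expected) ** 2 / expected
    return statistic
```

In a spreadsheet the first three values correspond to the AVERAGE, MEDIAN, and MODE functions, and the cross-table corresponds to a pivot table of counts.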
10
Why not use Excel as a database?
Data integrity is limited
  Limited field validation
  Multiple file versions
    Can’t tell which data is correct!!
  Improper sorting can lead to data issues
Limited rows and columns (about 1 million rows per sheet)
  Important for translational data (e.g. genomics)
Limited security
  Access (anyone can see your data)
  Auditing (you can’t tell who accessed or edited)
  Privileges (no way to limit what people can do to data)
11
Key features
It has a simple interface
Rows and columns are easily manipulated
It is user friendly: no programming is needed for basic tasks
It is compatible with other data analysis applications
Limited statistical analysis capacity for large datasets
12
Data Analysis Toolpak - PC
13
Data Analysis Toolpak - Mac
14
Importing data - Excel DEMO
15
Importing data - Calc DEMO
16
Filtering
The Filter function lets you show the data you want and hide the rest of the spreadsheet.
AutoFilter offers three types of filter: list values, format, and criteria.
A drop-down arrow means that filtering is enabled but not applied.
A Filter button means that a filter is applied.
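The idea behind filtering by list values and by criteria can be sketched in a few lines of Python. This is an illustration of the concept only, not how Excel or Calc implement AutoFilter; the rows, column names, and helper functions are hypothetical.

```python
# Hypothetical rows, made up for illustration; column names are assumptions.
rows = [
    {"id": 1, "sex": "F", "systolic_bp": 118},
    {"id": 2, "sex": "M", "systolic_bp": 142},
    {"id": 3, "sex": "F", "systolic_bp": 151},
    {"id": 4, "sex": "M", "systolic_bp": 125},
]

def filter_by_values(rows, column, allowed):
    """Filter by list values: keep rows whose column value is in `allowed`."""
    return [row for row in rows if row[column] in allowed]

def filter_by_criteria(rows, column, predicate):
    """Filter by criteria: keep rows whose column value satisfies `predicate`."""
    return [row for row in rows if predicate(row[column])]

women = filter_by_values(rows, "sex", {"F"})
high_bp = filter_by_criteria(rows, "systolic_bp", lambda value: value >= 140)
```

As in a spreadsheet filter, the original `rows` list is left untouched; the filter only decides which rows are shown.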
17
Filtering - Excel
18
Filtering - Calc
19
Descriptive statistics
Describes your data using sample statistics, which serve as estimates of population parameters
Examples:
  Measures of location, such as mean, median, and mode (quantitative) or percentiles and proportions (qualitative)
  Measures of dispersion and peakedness, such as standard deviation, variance, standard error, skewness, and kurtosis
Demonstrations using Framingham Data
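The dispersion and shape measures listed above can be written out explicitly. The sketch below is one common set of definitions, assuming moment-based skewness and excess kurtosis; spreadsheet functions such as SKEW and KURT apply sample-size corrections, so their results may differ slightly on small samples. The `describe` function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (divides by n - 1)
    # Central moments computed with divisor n, used for the shape measures.
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return {
        "mean": m,
        "variance": variance(values),      # sample variance (divides by n - 1)
        "std_dev": sd,
        "std_error": sd / sqrt(n),         # standard error of the mean
        "skewness": m3 / m2 ** 1.5,        # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }
```

For example, `describe([2, 4, 4, 4, 5, 5, 7, 9])` gives a mean of 5 and a positive skewness, reflecting the longer right tail of those made-up values.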
20
Rank and percentile
21
Derived variables - Excel
22
Derived variables - Calc
23
Creating figures - Excel
24
Creating Figures - Calc
25
Generating pie charts - Excel
26
Generating pie charts - Calc
27
Generating histograms
28
Generating tables - Excel
29
Generating tables - Calc
30
One-sided t-test
31
Guided Practice – Framingham
Import Framingham practice data CSV file from REDCap
Find mean systolic and diastolic BP (systolic_bp and diastolic_bp)
  (sum/number of measurements)
Make a derived variable of systolic (systolic_bp) – diastolic blood pressure (diastolic_bp)
Make a scatter plot of weight vs. systolic BP
DEMO
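The same steps can be sketched outside a spreadsheet. The Python example below is a minimal illustration under stated assumptions: the CSV text is a made-up stand-in for the exported file, the `weight` column name and the `bp_difference` variable name are assumptions, and only `systolic_bp` and `diastolic_bp` come from the slide.

```python
import csv
import io

# Made-up stand-in for the exported CSV file; not real Framingham data.
CSV_TEXT = """id,weight,systolic_bp,diastolic_bp
1,150,120,80
2,180,140,90
3,165,130,85
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    reader = csv.DictReader(io.StringIO(text))
    numeric = ("weight", "systolic_bp", "diastolic_bp")
    rows = []
    for row in reader:
        for name in numeric:
            row[name] = float(row[name])
        rows.append(row)
    return rows

def column_mean(rows, name):
    """Mean as described on the slide: sum divided by number of measurements."""
    return sum(row[name] for row in rows) / len(rows)

rows = load_rows(CSV_TEXT)
mean_systolic = column_mean(rows, "systolic_bp")
mean_diastolic = column_mean(rows, "diastolic_bp")

# Derived variable: systolic minus diastolic blood pressure.
for row in rows:
    row["bp_difference"] = row["systolic_bp"] - row["diastolic_bp"]

# (x, y) pairs that a plotting tool could draw as a scatter plot.
scatter_points = [(row["weight"], row["systolic_bp"]) for row in rows]
```

A real export will likely contain blank cells; `float("")` raises an error, so missing values would need to be skipped or handled before averaging.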
32
Summary
Calc/Excel is one of the most familiar and intuitive ways to interact with data
It should NOT be used as a primary data collection tool if possible
Use it to look for missing data, perform calculations, filter, sort, and make simple plots
Possible to make (pivot) tables, but it takes some practice
Generally need another program for statistics
33
Practice on your own
Determine mean diastolic blood pressure (diastolic_bp) overall and according to any history of hypertension (hypertension)
Create a scatter plot of ounces of hemoglobin (hgb) vs. diastolic BP (diastolic_bp)
Create a table counting the number of patients according to hypertension and education level.
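As a way to check spreadsheet answers for the grouped mean and the count table, here is a small Python sketch. The patient records are invented, the `education` column name and its values are assumptions, and `mean_by_group` is a hypothetical helper; only `diastolic_bp` and `hypertension` are taken from the exercise.

```python
from collections import Counter, defaultdict

# Invented records for illustration; not real Framingham data.
patients = [
    {"hypertension": 1, "education": "high school", "diastolic_bp": 92},
    {"hypertension": 0, "education": "college", "diastolic_bp": 78},
    {"hypertension": 1, "education": "college", "diastolic_bp": 88},
    {"hypertension": 0, "education": "high school", "diastolic_bp": 82},
    {"hypertension": 0, "education": "college", "diastolic_bp": 74},
]

def mean_by_group(rows, group_column, value_column):
    """Mean of `value_column` within each distinct value of `group_column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row[value_column])
    return {key: sum(values) / len(values) for key, values in groups.items()}

overall_mean = sum(p["diastolic_bp"] for p in patients) / len(patients)
group_means = mean_by_group(patients, "hypertension", "diastolic_bp")

# Count table: number of patients per (hypertension, education) combination.
counts = Counter((p["hypertension"], p["education"]) for p in patients)
```

The `counts` result plays the role of a pivot table with hypertension in the rows and education level in the columns.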