Data Entry and Manipulation Lesson 4: Data Collection Entry and Manipulation CC image by Cobalt123 on Flickr.

Slides:



Advertisements
Similar presentations
Copyright © 2003 Pearson Education, Inc. Slide 8-1 The Web Wizards Guide to PHP by David Lash.
Advertisements

Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Copyright © 2006 by The McGraw-Hill Companies,
1 Adding a statistics package Module 2 Session 7.
How to Write Quality Metadata Lesson 8: How to Write Quality Metadata CC image by Sara Bjork on Flickr.
Introduction Lesson 1 Microsoft Office 2010 and the Internet
Working with Tables Class 10 GISG 110. Objectives Working with Tables Table structure Table creation and manipulation Tabular formats Connecting tables.
Database Ed Milne. Theme An introduction to databases Using the Base component of LibreOffice LibreOffice.
Computer Concepts BASICS 4th Edition
© Paradigm Publishing, Inc Access 2010 Level 2 Unit 2Advanced Reports, Access Tools, and Customizing Access Chapter 8Integrating Access Data.
Test Case Management and Results Tracking System October 2008 D E L I V E R I N G Q U A L I T Y (Short Version)
John Porter Why this presentation? The forms data take for analysis are often different than the forms data take for archival storage Spreadsheets are.
33 CHAPTER BASIC APPLICATION SOFTWARE. © 2005 The McGraw-Hill Companies, Inc. All Rights Reserved. 3-2 Competencies Discuss common features of most software.
Lecture Microsoft Access and Relational Database Basics.
A Guide to SQL, Seventh Edition. Objectives Understand the concepts and terminology associated with relational databases Create and run SQL commands in.
Copyright 2003 The McGraw-Hill Companies, Inc CHAPTER Application Software computing ESSENTIALS    
Basic Concept of Data Coding Codes, Variables, and File Structures.
XP New Perspectives on Microsoft Access 2002 Tutorial 71 Microsoft Access 2002 Tutorial 7 – Integrating Access With the Web and With Other Programs.
MS Access 2007 IT User Services - University of Delaware.
Introduction to Data Management and Relational Databases.
U.S. Department of the Interior U.S. Geological Survey Tutorials on Data Management Lesson 2.1: Data Collection Entry and Manipulation CC image by Cobalt123.
Microsoft Access Database software. What is a database? … a database is an organized collection of data. A collection of data of similar information compiled.
MS Access: Database Concepts Instructor: Vicki Weidler.
B.A. (Mahayana Studies) Introduction to Computer Science November March Office Tools A look at the main tools most computer users.
General-Purpose APPLICATION SOFTWARE
Software Apps. Word, PowerPoint, Excel, Access Mr. Miller.
Systems Analysis – Analyzing Requirements.  Analyzing requirement stage identifies user information needs and new systems requirements  IS dev team.
ASP.NET Programming with C# and SQL Server First Edition
1Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall. Exploring Microsoft Office Access 2010 by Robert Grauer, Keith Mast, and Mary Anne.
Introduction to database systems
Introduction to SPSS Edward A. Greenberg, PhD
Best Practices for Preparing Data Sets Non-CO2 Synthesis Workshop Boulder, Colorado October 2008 Compiled by: A. Dayalu, Harvard University Adapted.
Administrative Software Chapter 7 Teaching and Learning with Technology.
Access Primer Africamuseum 5 June MS Access  Relational Database Management System Data/information resides in series of related tables Principle.
Open Your Mind to Open Source MPDO’s & EOPR’s Centre for IT & eGovernance AMR-APARD Hyderabad Welcome!
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
Introduction to Computers Lesson 10B. home Database A collection of related data or facts.
Introduction to Computers Lesson 10B. home Database A collection of related data or facts.
Presented By: Gail Rose-Innes Camps Bay High School ICT & CAT Department Microsoft Access 2010.
33 CHAPTER General- Purpose APPLICATION SOFTWARE.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
United Nations Economic Commission for Europe Statistical Division The Importance of Databases in the Dissemination Process Steven Vale, UNECE.
McGraw-Hill/Irwin The O’Leary Series © 2002 The McGraw-Hill Companies, Inc. All rights reserved. Microsoft Excel 2002 Lab 6 Creating and Using Lists and.
Database Management Systems.  Database management system (DBMS)  Store large collections of data  Organize the data  Becomes a data storage system.
1/62 Introduction to and Using MS Access Database Management and Analysis Yunho Song.
C OMPUTING E SSENTIALS Timothy J. O’Leary Linda I. O’Leary Presentations by: Fred Bounds.
IDigBio is funded by a grant from the National Science Foundation’s Advancing Digitization of Biodiversity Collections Program (Cooperative Agreement EF ).
McGraw-Hill Technology Education © 2006 by the McGraw-Hill Companies, Inc. All rights reserved. 33 CHAPTER BASIC APPLICATION SOFTWARE.
Database Concepts Track 3: Managing Information using Database.
Data Entry and Assembly. Data Acquisition  Best Practices for Creating Data  Data Entry Options  Data Manipulation Options  Gathering Existing Data.
Introduction to Information and Computer Science
Lesson 13 Databases Unit 2—Using the Computer. Computer Concepts BASICS - 22 Objectives Define the purpose and function of database software. Identify.
Data Organization Quality Assurance and Transformations.
PREPARED BY: PN. SITI HADIJAH BINTI NORSANI. LEARNING OUTCOMES: Upon completion of this course, students should be able to: 1. Understand the structure.
1 EPIB 698C Lecture 1 Instructor: Raul Cruz-Cano
Computers Are Your Future Tenth Edition Spotlight 5: Microsoft Office Copyright © 2009 Pearson Education, Inc. Publishing as Prentice Hall1.
UNIVERSITY OF UTAH GREEN INFRASTRUCTURE MONITORING DATABASE CVEEN 7970 Hydroinformatics Semester Project Zachary Magdol, Jai Kanth Panthail, Pratibha Sapkota,
Todd Quinn – Business & Economics Librarian
GO! with Microsoft Office 2016
Tutorials on Data Management
GO! with Microsoft Access 2016
CHAPTER 2 Computer Software.
ICT Database Lesson 1 What is a Database?.
Working with Data in Windows
Tutorial 8 Objectives Continue presenting methods to import data into Access, export data from Access, link applications with data stored in Access, and.
MANAGING DATA RESOURCES
Unit# 6: ICT Applications
Tutorial 7 – Integrating Access With the Web and With Other Programs
The ultimate in data organization
Data Entry & Manipulation
Presentation transcript:

Data Entry and Manipulation Lesson 4: Data Collection Entry and Manipulation CC image by Cobalt123 on Flickr

Data Entry and Manipulation Best Practices for Creating Data Files Data Entry Options Data Manipulation Options CC image by JISC on Flickr

Data Entry and Manipulation Recognize inconsistencies that can make a dataset difficult to understand and/or manipulate Describe characteristics of stable data formats and list reasons for using these formats Identify data entry tools Identify validation measures that can be performed as data is entered Describe the basic components of a relational database

Data Entry and Manipulation Plan Collect AssureDescribePreserveDiscoverIntegrateAnalyze

Data Entry and Manipulation Create data sets that are: o Valid o Organized to support ease of use CC image by Travis S on Flickr

Data Entry and Manipulation Inconsistency between data collection events – Location of Date information – Inconsistent Date format – Column names – Order of columns

Data Entry and Manipulation Inconsistency between data collection events – Different site spellings, capitalization, spaces in site nameshard to filter – Codes used for site names for some data, but spelled out for others – Mean1 value is in Weight column – Text and numbers in same column – what is the mean of 12, escaped < 15, and 91?

Data Entry and Manipulation Columns of data are consistent: only numbers, dates, or text Consistent Names, Codes, Formats (date) used in each column Data are all in one table, which is much easier for a statistical program to work with than multiple small tables which each require human intervention

Data Entry and Manipulation Create descriptive column names without spaces or special characters o Soil T30 Soil_Temp_30cm o Species-Code Species_Code (avoid using -,+,*,^ in column names. Some software may interpret these symbols as an operator) Use a descriptive file name. For instance, a file named SEV_SmallMammalData_v csv indicates the project the data is associated with (SEV), the theme of the data (SmallMammalData) and also when this version of the data was created (v ). This name is much more helpful than a file named mydata.xls.

Data Entry and Manipulation Missing data o Preferably leave field empty (NULL = no value) o In numeric fields, use a distinct value such as 9999 to indicate a missing value o In text fields, use NA (Not Applicable or Not Available) o Use Data flags in a separate column to qualify missing value DateTimeNO3_N_ConcNO3_N_Conc_Flag M E1 M1 = missing; no sample collected E1 = estimated from grab sample

Data Entry and Manipulation Enter complete lines of data Sorting an Excel file with empty cells is not a good idea!

Data Entry and Manipulation For the long term, store data in a consistent format that can be read well in to the future and that can be used by any application now or in the future Appropriate file types include: o Non-proprietary: Open, documented standard o Common usage by research community: Standard representation (ASCII, Unicode) o Unencrypted o Uncompressed ASCII formatted files will be readable into the future o Use ASCII (comma-separated) for tabular data

Data Entry and Manipulation 1. Best Practices for Preparing Environmental Data Sets to Share and Archive. September Les A. Hook, Suresh K. Santhana Vannan, Tammy W. Beaty, Robert B. Cook, and Bruce E. Wilson pdfhttp://daac.ornl.gov/PI/BestPractices pdf

Data Entry and Manipulation Googledocs Forms Spreadsheets

Data Entry and Manipulation

20

Data Entry and Manipulation Great for charts, graphs, calculations Flexible about cell content typecells in same column can contain numbers or text Lack record integrity--can sort a column independently of all others) Easy to use – but harder to maintain as complexity and size of data grows Easy to query to select portions of data Data fields are typed – For example, only integers are allowed in integer fields Columns cannot be sorted independently of each other Steeper learning curve than a spreadsheet

Data Entry and Manipulation A set of tables Relationships A command language *siteID site_name latitude longitude description *siteID site_name latitude longitude description Sample sites *speciesID species_name common_name family order *speciesID species_name common_name family order Species *sampleID siteID sample_date speciesID height flowering flag comments *sampleID siteID sample_date speciesID height flowering flag comments samples *sampleID siteID sample_date speciesID height flowering flag comments *sampleID siteID sample_date speciesID height flowering flag comments Samples

Data Entry and Manipulation DateSiteHeightFlowering Advantages quality control performance Advantages quality control performance

Data Entry and Manipulation DateSiteSpeciesFlowering? 2/13/2010ABOGR2y 2/13/2010BHODRy 4/15/2010BBOER4y 4/15/2010CPLJAn SiteLatitudeLongitude A B C DateSiteSpeciesFlowering?LatitudeLongitude 2/13/2010ABOGR2y /13/2010BHODRy /15/2010BBOER4y /15/2010CPLJAn Mix and Match data on the fly

Data Entry and Manipulation DatePlotTreatmentSensorDepthSoil_Temperature CR BC CR AN015.1 SQL examples: Select Date, Plot, Treatment, SensorDepth, Soil_Temperature from SoilTemp where Date = DatePlotTreatmentSensorDepthSoil_Temperature CR BC DatePlotTreatmentSensorDepthSoil_Temperature AN015.1 This table is called SoilTemp Select * from SoilTemp where Treatment=N and SensorDepth=0

Data Entry and Manipulation Forms can be created that make entering data in to a relational database as easy as entering it in to Excel. The screenshot below shows embedded forms that were quickly generated in MS Access for adding data to three tables in a database of plant cover measurements

Data Entry and Manipulation Be aware of Best Practices when designing data file structures Choose a data entry method that allows some validation of data as it is entered Consider investing time in learning how to use a database if datasets are large or complex CC image by fo.ol on Flickr

Data Entry and Manipulation Consider trying one of these: o Personal, single-user databases can be developed in MS Access, which is stored as a file on the users computer. MS Access comes with easy GUI tools to create databases, run queries, and write reports. o A more robust database that is free, accommodates multiple users and will run on Windows or Linux is MySQL. GUI interfaces for MySQL include phpMyadmin (free) and Navicat (inexpensive).

Data Entry and Manipulation Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design (2nd Edition) by Michael J. Hernandez. Addison-Wesley

Data Entry and Manipulation Useful for analyzing, subsetting and transforming data Can be used to quality assure data Options include SAS, SPSS, R, and Matlab o Not Free SAS: Has outstanding support SPSS: Has a user-friendly GUI Matlab: Analysis and Visualization platform that has toolboxes available for different disciplines, such as modeling or genomic analyses

Data Entry and Manipulation Free ( Produces publication quality graphics Lots of forums from which to get help Software (such as Kepler for developing workflows) will integrate analytical components written in R

Data Entry and Manipulation The full slide deck may be downloaded from: Suggested citation: DataONE Education Module: Data Entry and Manipulation. DataONE. Retrieved Nov12, From anipulation.pptx Copyright license information: No rights reserved; you may enhance and reuse for your own purposes. We do ask that you provide appropriate citation and attribution to DataONE.