Importing Data Text Data Parsing Scrubbing Data June 21, 2012.



Using “String” Functions to Scrub Data
When importing data from an external source, keep in mind that the data may never have been scrubbed or cleaned to ensure it was keyed in a proper format.
GIGO – Garbage In – Garbage Out
For example:
Planning, 53
SuPPort, 95
JoHn SMITH
Mary Johnson
carol ennen
Smith, Larry

Importing Data Files
• Excel has gotten “smarter” in its ability to open files of data that are “Delimited” or where fields are defined in fixed positions.
• It “knows” leading spaces in front of text and numbers are probably not correct and strips them out.
• If you OPEN a TEXT file in Excel, it will start the Text Import Wizard.
• You can override the field parsing by telling the Wizard the data is Fixed Fielded and then defining the field to be the entire length of the data.
• You are prompted for where you want to import the data to and can also change some settings and attributes.

EXERCISE: Problem Statement: IT_SKILLS_STEPS
You are to create an Excel Chart from the data on the following slide. You could CUT and PASTE this data into cell A1 of a spreadsheet, or use Excel to open the file using the [DATA] tab and {Get External Data}. For the first example, I don’t want to use the Text Import Wizard.
The data is also saved as a .TXT file called: IT_Skills_Data.txt
There are 4 “problems” with this data. The “correct” format has ONE space after the comma (,). The exercise assumes there is a comma after the description and before the number (but this might also be a possible data error that needs to be checked in future problems)!
You will use these String and Text Functions: FIND, MID, VALUE, LEN
Files: IT_Skills_Steps, IT_Skills_Data.TXT

IT_Skills_Data.TXT
Project / Program Management, 60%
Business Process Management, 55%
Business Analysis,53%
Application Development,52%
Database Management, 49%
Security, 42%
Enterprise Architect, 41%
Strategist/Internal Consultant, 40%
Systems Analyst, 39%
Web Services,33%
Help desk / User Support, 32%
Networking, 30%
Website Development, 30%
QA/Testing, 28%
IT Finance, 28%
Vendor Management / Procurement, 27%
IT - HR,21%
Other, 3%

STRING and TEXT FUNCTIONS
LEN(String): Return the number of characters (length) in a string.
FIND(Target, InString [,StartPos]): Look for Target in InString, starting at the optional StartPos, and return the position where it is found.
MID(String, StartAt, #_of_Characters): Starting at the StartAt position of String, extract #_of_Characters characters. (The #_of_Characters may be a value larger than what is actually there.)
VALUE(String): Take a string of digits and convert it into a number.
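For readers who think in code, the four functions can be mirrored roughly in Python. This is only a sketch for comparison: the `excel_*` names are made up for illustration, positions are 1-based as in Excel, and `excel_value` is a simplified stand-in that handles plain digits and a trailing percent sign only.

```python
def excel_len(text):
    """LEN: number of characters in the string."""
    return len(text)


def excel_find(target, in_string, start_pos=1):
    """FIND: 1-based position of target at or after start_pos."""
    index = in_string.find(target, start_pos - 1)
    if index == -1:
        raise ValueError("target not found")  # Excel shows an error value here
    return index + 1


def excel_mid(text, start_at, num_chars):
    """MID: up to num_chars characters beginning at 1-based start_at."""
    begin = start_at - 1
    return text[begin:begin + num_chars]  # slicing tolerates an oversized count


def excel_value(text):
    """VALUE (simplified): digits, optionally followed by %, to a number."""
    if text.endswith("%"):
        return float(text[:-1]) / 100
    return float(text)


line = "Security, 42%"
comma = excel_find(",", line)                        # position of the comma
number_text = excel_mid(line, comma + 2, excel_len(line))
number = excel_value(number_text)
```

Passing `excel_len(line)` as the character count works for the same reason it does in Excel: asking for more characters than remain simply returns the rest of the string.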

STEPS TO PARSE DATA
1) Cut the text from: IT_SKILLS_Data.txt
2) Open Excel and Paste the data into cell: A2
3) In A1, Type the column heading: IT SKILL
4) In B1, Type: COMMA (This is a temporary value to be used in Step 8)
5) In B2, Type the formula: =FIND(",",A2)
6) Copy the formula in B2 down thru B19
7) In C1, Type: VALUE
8) In C2, Type: =VALUE(MID(A2,B2+2,LEN(A2)))
9) In C2, format the column for Percent%
10) Copy the formula in C2 down thru C19.
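The helper-column approach in the steps above can be sketched in Python to see what columns B and C would hold. This is an illustration rather than part of the exercise files; the function name `parse_with_helper_column` is hypothetical, and the two sample rows are taken from IT_Skills_Data.TXT.

```python
rows = [
    "Database Management, 49%",  # one space after the comma (expected format)
    "Business Analysis,53%",     # no space after the comma
]


def parse_with_helper_column(line):
    # Column B: =FIND(",",A2) gives the 1-based position of the comma.
    comma = line.index(",") + 1
    # Column C: MID(A2,B2+2,LEN(A2)) starts two characters past the comma.
    tail = line[comma + 2 - 1:]
    # VALUE(...) on a percent string, simplified to strip the % and divide.
    return comma, float(tail.rstrip("%")) / 100


results = [parse_with_helper_column(row) for row in rows]
for row, (comma, value) in zip(rows, results):
    print(f"{row!r}: comma at {comma}, value {value}")
```

The fixed `+2` offset assumes exactly one space after the comma. On a row with no space, the extraction starts one character too late and drops the first digit, which is how a badly formatted row shows up as a wrong value in column C.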

11) Warning: THERE ARE 4 PROBLEMS WITH THE DATA THAT WILL NEED TO BE FIXED! The data should be in descending order. Can you find and correct the data in Column "A" so the data in column C is correct?
12) Hide column B
13) Highlight data in the range: A2:C19
14) [Insert] a Chart
15) Adjust the size of the graph to show all descriptions
16) Add Titles and format the Chart to be pretty!

You might want to redo the assignment and, when you get to step 11, rather than "FIX" the data, get fancy with nested functions. What if there were 1000’s of data lines? It is not efficient to manually change and update data.
WORK SMART / NOT HARD
The nested set of functions that fixes the errors and does everything in one step:
In D2: =VALUE(TRIM(MID(A2,FIND(",",A2)+1,LEN(A2))))
Format the cell as Percent and copy it down to D19. Compare columns C and D.
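The same idea as the nested formula can be sketched in Python: take everything after the comma, trim the spaces, then convert. The name `parse_percent` is hypothetical, and `strip()` only covers the leading and trailing spaces that matter here (Excel's TRIM also collapses repeated spaces inside the text).

```python
def parse_percent(line):
    """Rough equivalent of =VALUE(TRIM(MID(A2,FIND(",",A2)+1,LEN(A2))))."""
    after_comma = line[line.index(",") + 1:]  # MID starting just past the comma
    trimmed = after_comma.strip()             # TRIM removes the optional space
    return float(trimmed.rstrip("%")) / 100   # VALUE on a percent string


lines = [
    "Project / Program Management, 60%",
    "Business Process Management, 55%",
    "Business Analysis,53%",        # no space after the comma
    "Application Development,52%",  # no space after the comma
]
values = [parse_percent(line) for line in lines]

# The slide notes the data should be in descending order; this checks it.
is_descending = values == sorted(values, reverse=True)
```

Because the extraction starts right after the comma and leaves the space handling to the trim step, rows with and without the space parse the same way, so nothing has to be fixed by hand.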

Exercise: Data Parsing and Scrubbing
The completed workbook exercise is called: IT_Skills
There are 3 sheets, {Raw Data}, {Parse} and {Final}, that show the solution at various stages.
THERE IS A 2nd data file called IT_Skills_Data2, with different data, that can be brought into the workbook to test your process.

Exercise: Use the [Data] {Get External Data}
This is the same example, but rather than CUT/PASTE the text, it uses the FROM TEXT option to get the data.

{Get External Data} FROM TEXT
An OPEN file dialog box is presented and shows only .TXT files for selection. After you select the file, sample records are displayed so you can apply the Parse Pattern. You can also specify which ROW you want to start importing from.

Since these data fields are delimited by a comma, select the COMMA option and notice the PARSE line. Then proceed to the next step of the Wizard.
CSV: Comma Separated Values
CDF: Comma Delimited Files
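Outside Excel, the same comma-delimited split can be sketched with Python's standard `csv` module. The sample text below is an assumption for illustration: three rows borrowed from IT_Skills_Data.TXT with a made-up title line in front, so that the "start importing from row" setting has something to skip.

```python
import csv
import io
from itertools import islice

# Hypothetical file contents: a title line followed by comma-delimited rows.
sample = "IT skills survey\nSecurity, 42%\nWeb Services,33%\nOther, 3%\n"

start_row = 2  # 1-based row to start importing from, like the Wizard setting

reader = csv.reader(
    io.StringIO(sample),
    delimiter=",",
    skipinitialspace=True,  # ignore spaces that directly follow the delimiter
)
records = [tuple(row) for row in islice(reader, start_row - 1, None)]

for skill, percent in records:
    print(skill, percent)
```

The `skipinitialspace=True` option plays the role of the space handling described earlier: rows with and without a space after the comma produce the same field values.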

You have the Option to modify the DATA TYPE for different fields: General, Text, Date, or even to Omit a column from import.

The final option is to specify WHERE to import the data in the workbook. Since we “know” we want to add headers to the data, place your cursor in A2. There is also an advanced {PROPERTIES} option where you can specify other attributes about the import.

Excel will even “remember” the attributes and dialog steps you just completed, so the next time you select a file it will apply the same steps to parse the data. Use the {DATA Refresh} option to specify a new file to load.
Another TEXT file to load is called: IT_Skills_Data2.txt

Get Data FROM EXISTING Connections
You may have a workbook that pulls in data from another source, either to update a Chart or to do something else with it: Sort, Filter, Report, Summarize, etc.
The example Get_Student_Grades is a workbook that LINKS and LOADS the StudentGrades workbook.
ANY CHANGES MADE WILL NOT BE MADE IN THE ORIGINAL FILE

Get Data From a Website
It is also possible to link your spreadsheet to data that is saved on a website.
The FileName IS CASE SENSITIVE.
There is an Excel Spreadsheet saved as a Web Page: htm called:
You specify the selection of data you want to bring in by clicking on the Yellow Arrow Tab.
YOU CAN TRY LINKING TO OTHER SITES TO IMPORT DATA FROM

Exercise: Data Parsing and Scrubbing
The completed exercise is called: Parse_Names
You will use these String and Text Functions: &, PROPER, TRIM, LEFT, FIND, MID
File: Bad_Data_Names_2

STRING and TEXT FUNCTIONS
&: Used to concatenate (JOIN) strings of text together.
PROPER(String): Convert the 1st letter of each word to a capital letter and all the remaining letters in the word to lower case.
TRIM(String): Remove leading and trailing spaces and reduce repeated spaces within a string to a single space.
LEFT(String, #_of_Characters): Take the leftmost #_of_Characters from a String. It is like the MID function but starting at the first position.
MID(String, StartAt, #_of_Char): Starting at the StartAt position of String, extract #_of_Char characters. (The #_of_Char may be a value larger than what is actually there.)
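As a rough illustration of how these functions combine, here is a Python sketch that cleans the sample names from the first slide. The function names are hypothetical, and the rule that a "Last, First" entry should be turned into "First Last" is an assumption for illustration; the Parse_Names workbook may handle such entries differently.

```python
def excel_trim(text):
    """TRIM: drop leading/trailing spaces, collapse repeated spaces to one."""
    return " ".join(text.split())


def clean_name(raw):
    name = excel_trim(raw)
    if "," in name:
        comma = name.index(",")
        last = name[:comma]                    # LEFT(name, FIND(",",name)-1)
        first = excel_trim(name[comma + 1:])   # TRIM(MID(name, FIND(",",name)+1, LEN(name)))
        name = first + " " + last              # & joins the pieces
    return name.title()                        # PROPER: capitalize each word


raw_names = ["JoHn SMITH", "Mary Johnson", "carol  ennen", "Smith, Larry"]
cleaned = [clean_name(name) for name in raw_names]
```

Trimming first and fixing the capitalization last keeps each step simple: the comma test works on already tidy text, and the casing is applied once to the final result.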
