CS1100: Computer Science and Its Applications Text Parsing in Excel Martin Schedlbauer, Ph.D.

Slides:



Advertisements
Similar presentations
CATHERINE AND ANNIE Python: Part 3. Intro to Loops Do you remember in Alice when you could use a loop to make a character perform an action multiple times?
Advertisements

Basic Excel training KTHFS Amadeus Wennström Anders Bergvall
Dot Plot. Goal We will take two nucleotide base strings and look for common patterns – stretches where the bases match. GAATTCATACCAGATCACCGAAAACTGTCCTCCAA.
Computer Science & Engineering 2111 Text Functions 1CSE 2111 Lecture-Text Functions.
Character and String definitions, algorithms, library functions Characters and Strings.
CS1100: Computer Science and Its Applications Text Processing Created By Martin Schedlbauer
Consider the graphs of f(x) = x – 1 and f(x) = lnx. x y y = x – 1 y = lnx (1.0) Sections 2.6, 2.7, 2.8 We find that for any x > 0, x – 1 > lnx Now, suppose.
 Review answers  Grades and sample answers posted tonight  Comment sheets available tomorrow (interrupt any meetings) or in class on Thursday.
 Option 1 › Separate entries and hide fields › Hide columns or use separate spreadsheets  Option 2 › Build them up in pieces › Use parentheses if you.
 Just the word will find too many instances  Searching for blanks on both sides loses start and end  How to solve? › See last Thursday examples.
Mark Dixon Page 1 22 – Problem Solving. Mark Dixon Page 2 Session Aims & Objectives Aims –to provide a more explicit understanding of problem solving.
CS1100: Computer Science and Its Applications Table Lookup and Error Processing Martin Schedlbauer, Ph.D.
Spreadsheets Objective 6.02
2 Explain advanced spreadsheet concepts and functions Advanced Calculations 1 Sabbir Saleh_Lecture_17_Computer Application_BBA.
Computer Science 1000 Spreadsheets II Permission to redistribute these slides is strictly prohibited without permission.
Importing Data Text Data Parsing Scrubbing Data June 21, 2012.
MIS: Chapter 7: Data Base and Data Warehouses Cumulative concepts, features and functions, plus new functions COUNTIFS, SUMIFS, AVERAGEIFS All assigned.
Excel has a number of Count Functions that will total the number of cells in a selected range. We’re going to look at: COUNT COUNTA COUNTBLANK COUNTIF.
Lists in Python.
General Programming Introduction to Computing Science and Programming I.
Miscellaneous Excel Combining Excel and Access. – Importing, exporting and linking Parsing and manipulating data. 1.
Chapter 11 Creating Formulas that Count and Sum Microsoft Excel 2003.
Manipulation Masterclass By the VB Gods. In this masterclass, we will learn how to use some of the string manipulation function such as Len, Right, Left,
Strings The Basics. Strings can refer to a string variable as one variable or as many different components (characters) string values are delimited by.
Fall Week 4 CSCI-141 Scott C. Johnson.  Computers can process text as well as numbers ◦ Example: a news agency might want to find all the articles.
Exploring Office 2003 Vol 1 2/e - Grauer and Barber 1 Committed to Shaping the Next Generation of IT Experts. Chapter 4: Spreadsheets in Decision Making:
Colleague, Excel & Word Best of Friends Presented by: Joan Kaun & Yvonne Nelson College of the Rockies.
Topic 25 - more array algorithms 1 "To excel in Java, or any computer language, you want to build skill in both the "large" and "small". By "large" I mean.
MS-Excel XP Lesson 6. NOW Function 1.Returns the current date and time formatted as a date and time. 2.A1  =NOW() 3.A2  Select Insert menu Select Function.
Fundamentals of Python: First Programs
Working with the Data Table USINGQTP65-STUDENT-01A.
CS1100: Computer Science and Its Applications Building Flexible Models in Microsoft Excel Martin Schedlbauer, Ph.D.
Review 1 Arrays & Strings Array Array Elements Accessing array elements Declaring an array Initializing an array Two-dimensional Array Array of Structure.
More Strings CS303E: Elements of Computers and Programming.
Strings Mr. Smith AP Computer Science A. What are Strings? Name some of the characteristics of strings: A string is a sequence of characters, such as.
Spreadsheets COE 201- Computer Proficiency. Basic Interface Excel Book = Word Document Every book can contain up to 255 different sheets.
Working with Strings. String Structure A string in VBA can be variable length and has an internal structure Each character in the string has a position.
Processing Text Excel can not only be used to process numbers, but also text. This often involves taking apart (parsing) or putting together text values.
Excel Text Functions 1. LEFT(text, [num_chars])) Returns the number of characters specified starting from the beginning of the text string Syntax Text:
1 CSE 2337 Chapter 7 Organizing Data. 2 Overview Import unstructured data Concatenation Parse Create Excel Lists.
C language + The Preprocessor. + Introduction The preprocessor is a program that processes that source code before it passes through the compiler. It.
02/03/ Strings Left, Right and Trim. 202/03/2016 Learning Objectives Explain what the Left, Right and Trim functions do.
Excel for Everyone STORMY STARK ITC TRAINING SERVICES.
13/06/ Strings Left, Right and Trim. 213/06/2016 Learning Objectives Explain what the Left, Right and Trim functions do.
Sorts, CompareTo Method and Strings
Excel AVERAGEIF Function
Contents Introduction Text functions Logical functions
Logical Functions Excel Lesson 10.
TRACKER Contents Intro Excel 101 Math Operations Formulas 101.
Excel STDEV.S Function.
Excel IF Function.
Miscellaneous Excel Combining Excel and Access.
David Meredith Aalborg University
Data Validation and Protecting Workbook
CS1100: Computer Science and Its Applications
Dot Plot.
Summary of what we learned so far
Data types Numeric types Sequence types float int bool list str
Topics discussed in this section:
CS 106 Computing Fundamentals II Chapter 66 “Working With Strings”
Introduction to Strings
Topic 6 Lesson 1 – Text Processing
Introduction to Computer Science
Topic 3 Lesson 2 – Flexible Models
Introduction to Strings
MS Excel – Analyzing Data
The Basics of Excel Part I Monday, April 3rd 2017
Topic 25 - more array algorithms
Introduction to Computer Science
Presentation transcript:

CS1100: Computer Science and Its Applications Text Parsing in Excel Martin Schedlbauer, Ph.D.

Processing Text Excel can not only be used to process numbers, but also text. Processing text often involves taking apart the putting together text values (strings). The process of taking text values apart is called parsing. CS1100Text Parsing in Excel2

Text Processing Functions Excel provides a number of functions for parsing text: – RIGHT – take part of the right side of a text value – LEFT – take part of the left side of a text value – MID – take a substring within a text value – LEN – determine the number of characters in a text value – FIND – find the start of a specific substring within a text value CS1100Text Parsing in Excel3

Example Text processing is often necessary when files are imported from other programs: We’d like to extract the customer name and the payment terms from the text in column A. CS1100Text Parsing in Excel4

LEFT Function The LEFT function extracts a specific number of characters from the left side of a text value: CS1100Text Parsing in Excel5 =LEFT(A1,4)

RIGHT Function The RIGHT function extracts a specific number of characters from the right side of a text value: CS1100Text Parsing in Excel6 =RIGHT(A1,4)

MID Function The MID function extracts some number of characters starting at some position within a text value: CS1100Text Parsing in Excel7 =MID(A1,5,4)

FIND Function FIND returns the position where a substring starts within a string. Finds the first occurrence only. Returns a #VALUE! error if the substring cannot be found. CS1100Text Parsing in Excel8 =FIND("DEF",A1) =FIND(" ",A2) =FIND(",",A3)

Case Sensitivity Note that FIND is case sensitive. As an alternative, Excel has a SEARCH function which is not case sensitive but otherwise works the same way as FIND. CS1100Text Parsing in Excel9 =FIND("cde",A16) =SEARCH("cde",A17)

IFERROR and FIND Since FIND returns an error when a substring cannot be found, we need to use a sentinel value. CS1100Text Parsing in Excel10 =FIND("[",A5) =IFERROR(FIND("[",A5),"")

LEN Function The LEN function returns the total number of characters in a text, i.e., the “length” of the text value: CS1100Text Parsing in Excel11 =LEN(A9)

Parsing Text To extract parts of a text value (parsing) requires thoughtful analysis and a divide-and- conquer approach. For example: – Parse the text below into first and last name CS1100Text Parsing in Excel12

Strategy You need think about your strategy: – How do I detect where the first name starts? – Are there some delimiters? – What is the delimiter? – Does it always work? – Is there always a first or last name? Break the problem into several problems and create auxiliary or helper columns. CS1100Text Parsing in Excel13

HIDDEN COLUMNS Solving complex parsing problems often requires the use of intermediate values: – Solve the problem in pieces, don’t do it all in a single formula So, place intermediate values into temporary columns and then hide the column to make the model less confusing to read. CS1100Text Parsing in Excel14

COUNTA Function We have already seen COUNT as a way to count the number of cells in a range. However, COUNT only counts cells that contain numbers. – What about text? To count the number of cells that contain some value (either text or number), use COUNTA. CS1100Text Parsing in Excel15

COUNTBLANK Function As an alternative to COUNTA, there is COUNTBLANK. This function counts the number of cells in a range that do not contain any value (either text or number). CS1100Text Parsing in Excel16

Let’s Put This Together… Let’s return to the original example and see if we can parse the text into its name and terms components… Before starting with formulas, think about your strategy. – How can you recognize the beginning and end of the name component? – How about the beginning and end of the terms component? – Do we need intermediate values? CS1100Text Parsing in Excel17

Any Question? CS1100Text Parsing in Excel18