1
CSV Files and ETL: The Good, Bad, and Ugly
Eric Freeman
2
Comma-Separated Values: Overview
3
CSV Overview
CSV: comma-separated values
Plain text
Delimited text file
Each line is a new record
Not fully standardized!
4
CSV Evolution
1972: IBM Fortran compiler under OS/360
Input lists: commas or spaces only
5
CSV Evolution
1983: Osborne Executive computer with SuperCalc spreadsheet
Added quotes as field containers
6
CSV Evolution
2005: RFC 4180 (standardization initiative)
Common Format and MIME Type for Comma-Separated Values (CSV) Files
7
RFC 4180
RFC 4180 standardization initiative
Each record is delimited by a line break
The last record may or may not end with a line break
A header line is optional and has the same number of fields as the records
Double quotes may enclose fields: "abc","def","ghi" or abc,def,ghi
A double quote inside a quoted field is escaped by doubling it: "abc","de""f","ghi"
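The RFC 4180 rules on this slide can be sketched with Python's standard csv module. This is only an illustration, not part of the original deck: the helper names parse_rfc4180 and write_rfc4180 and the sample text are assumptions made up for the example, and the csv module's default dialect is close to, but not guaranteed identical with, RFC 4180.

```python
import csv
import io

# Hypothetical sample in the RFC 4180 style: CRLF line breaks, optional
# quoting, a doubled quote, and a line break inside a quoted field.
SAMPLE = 'abc,def,ghi\r\n"abc","de""f","ghi"\r\n"line\r\nbreak","a,b",c\r\n'


def parse_rfc4180(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    # newline="" keeps line breaks untranslated so quoted breaks survive.
    source = io.StringIO(text, newline="")
    reader = csv.reader(source, delimiter=",", quotechar='"', doublequote=True)
    return list(reader)


def write_rfc4180(rows):
    """Serialize rows with CRLF line breaks, quoting only where required."""
    target = io.StringIO(newline="")
    writer = csv.writer(target, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerows(rows)
    return target.getvalue()


rows = parse_rfc4180(SAMPLE)
for row in rows:
    print(row)
```

The quoted and unquoted forms of abc,def,ghi parse to the same fields, and writing the parsed rows back out and parsing them again should give the same rows.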
8
CSV Overview
Basic concept: clear
Line breaks
Commas
Quotes
Escape character
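A small sketch of why commas, quotes, and the escape character complicate an otherwise clear concept: splitting a line on commas breaks as soon as a field contains a quoted comma. The sample line is invented for illustration, and Python's csv module stands in for whichever CSV-aware parser is actually used.

```python
import csv

# Hypothetical record: field 2 contains a comma, field 3 contains quotes.
line = '1,"Smith, Jane","She said ""hi"""'

# Naive approach: split on every comma, ignoring the quoting rules.
naive = line.split(",")

# CSV-aware approach: the reader honors quoted fields and doubled quotes.
parsed = next(csv.reader([line]))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split yields four pieces with stray quote characters, while the CSV-aware parse yields the three intended fields.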
9
PowerShell CSV Functions
Export-Csv -InputObject <PSObject> [[-Path] <String>] [-LiteralPath <String>] [-Force] [-NoClobber] [-Encoding <String>] [-Append] [[-Delimiter] <Char>] [-IncludeTypeInformation] [-NoTypeInformation] [-WhatIf] [-Confirm]
10
Demo
11
PowerShell CSV Functions
Import-Csv [[-Delimiter] <Char>] [[-Path] <String[]>] [-LiteralPath <String[]>] [-Header <String[]>] [-Encoding <String>]
12
Demo
13
The Good
Simple file, comma delimiters only
BULK INSERT
14
Demo
15
The Bad
Huge CSV file with a consistent format
BULK INSERT with a format file
16
Demo
17
The Ugly
Huge CSV file with a changing format
Embedded quotes
May contain duplicate column names
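One possible way to cope with embedded quotes and duplicate column names before loading, sketched in Python. This is not the approach shown in the demo: dedupe_headers, read_records, the _2 suffix scheme, and the sample data are all hypothetical, and rows are streamed one at a time on the assumption that the file is too large to hold in memory.

```python
import csv
import io


def dedupe_headers(headers):
    """Rename repeated column names by appending _2, _3, ... until unique."""
    used = set()
    result = []
    for name in headers:
        candidate, n = name, 1
        while candidate in used:
            n += 1
            candidate = f"{name}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result


def read_records(stream):
    """Yield one dict per data row, keyed by the de-duplicated header names."""
    reader = csv.reader(stream)  # handles embedded (doubled) quotes
    headers = dedupe_headers(next(reader))
    for row in reader:
        yield dict(zip(headers, row))


# Hypothetical input: the column "id" appears twice, and one field has
# embedded quotes.
sample = 'id,name,id\r\n1,"Say ""hi""",7\r\n'
records = list(read_records(io.StringIO(sample, newline="")))
print(records)
```

For a real file, the stream would come from open(path, newline="") rather than an in-memory string. Renaming rather than dropping duplicate columns keeps every value available to the later load step.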
18
Demo