Lesson 5: Wrangling Tools


1 Lesson 5: Wrangling Tools
Chapter 5A – Union and Dataset Swapping

2 Lesson 5 – Chapter 5A: Union and Dataset Swapping
In this chapter, you will:
- Combine datasets using Union
- Apply your recipe to a second dataset through dataset swapping

A datasource is a reference to a set of data that has been imported into the system. The source is not modified within the application, and it can be used in multiple datasets. When you use Trifacta to wrangle a source, or file, the original file is not modified, so it can be used over and over, for example to prepare output in multiple ways.

Datasources are created in the Datasources page, or when a new dataset is created. There are two ways to add a datasource to your Trifacta instance:
- Locate and select a file in HDFS (the Hadoop Distributed File System) using the file browser.
- Upload a local file from your machine. Note that there is a 1 GB file size limit for local files.

Several file formats are supported:
- CSV
- LOG
- JSON
- AVRO
- EXCEL. Note that if you upload an Excel file with multiple worksheets, each worksheet will be imported as a separate source.

Trifacta. Confidential & Proprietary.
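The two ideas in this chapter, Union and dataset swapping, can be illustrated outside the product with a small plain-Python sketch. This is not Trifacta's API or recipe language: the helper names (`union`, `apply_recipe`, `trim_names`, `drop_missing_amount`) and the sample rows are hypothetical, and matching union columns by name is an assumption made for the illustration. The sketch shows a recipe as an ordered list of steps, the same recipe applied to a second source, and the original source left unmodified.

```python
from copy import deepcopy


def union(*datasets):
    """Stack rows from several datasets, matching columns by name.

    Columns missing from a row are filled with None (an assumption
    made for this sketch).
    """
    columns = []
    for rows in datasets:
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return [
        {name: row.get(name) for name in columns}
        for rows in datasets
        for row in rows
    ]


def apply_recipe(source, recipe):
    """Run each recipe step in order on a copy, so the source stays intact."""
    rows = deepcopy(source)
    for step in recipe:
        rows = step(rows)
    return rows


# Two hypothetical recipe steps.
def trim_names(rows):
    return [{**row, "name": row["name"].strip()} for row in rows]


def drop_missing_amount(rows):
    return [row for row in rows if row.get("amount") is not None]


recipe = [trim_names, drop_missing_amount]

# Two hypothetical sources with slightly different columns.
january = [{"name": " Ada ", "amount": 10}, {"name": "Bob", "amount": None}]
february = [{"name": "Cy ", "amount": 7, "region": "EU"}]

jan_out = apply_recipe(january, recipe)
feb_out = apply_recipe(february, recipe)  # same recipe, swapped source
combined = union(jan_out, feb_out)
```

Because `apply_recipe` works on a copy, `january` still holds its untrimmed rows afterwards, which mirrors the point that wrangling does not modify the original file.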

