1
I Learnded SQL And So Can You!
2
Background
Started with iovation as a Client Manager. After seven months, moved to a Data Analyst role.
Before iovation, I had experience with data analysis, but no experience retrieving data.
Starting my Data Analyst role at iovation, I had…
- No prior knowledge of SQL
- No computer science background
- Used Excel spreadsheets for data analysis
- Never heard of Postgres
My first task: Get familiar with iovation’s Postgres QA databases
3
Me after one day…
5
Learning SQL Takes Focus
“You must feel the force around you. Here… between you and me... The tree... The rock... The SELECT... The FROM... JOINING ON... everything.” - Yoda, The Empire Strikes Back (1980)
Luckily, the SQL syntax is pretty easy to understand 😄😄😄
6
How familiar are we with SQL?
- I can build and maintain large-scale, optimized Postgres databases
- I can write fancy SQL queries that run efficiently
- I regularly use relational databases and write SQL
- I’ve used relational databases like Postgres and know enough SQL to get by
- I’ve written a few SQL queries
- I’ve heard of SQL
7
Basic Layout of a SQL Query
- SELECT – The columns you want to see in the result set
- FROM – The table you are pulling data from
- JOIN – Other tables you are pulling data from, specifying how they are related to each other
- WHERE – Criteria to filter your result set
- GROUP BY – Summarize/compute results into groups
- ORDER BY – Sort the data
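To see all six clauses working together, here is a minimal sketch. It is an illustration under assumptions, not iovation's setup: it uses Python's built-in sqlite3 module as a stand-in for Postgres so it runs anywhere, borrows the patients and admission_type table names from the JOIN slide, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE admission_type (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE patients (patient_nbr INTEGER, age INTEGER, admission_type_id INTEGER);
    -- Invented sample rows, for illustration only.
    INSERT INTO admission_type VALUES (1, 'Emergency'), (2, 'Elective');
    INSERT INTO patients VALUES (101, 34, 1), (102, 71, 1), (103, 52, 2), (104, 15, 2);
""")

rows = conn.execute("""
    SELECT at.description, COUNT(*) AS patient_count   -- columns to show
    FROM patients p                                    -- main table
    JOIN admission_type at                             -- related table
      ON p.admission_type_id = at.id
    WHERE p.age >= 18                                  -- keep adults only
    GROUP BY at.description                            -- one row per admission type
    ORDER BY patient_count DESC, at.description        -- biggest group first
""").fetchall()

print(rows)
```

The 15-year-old patient is filtered out by WHERE before the groups are counted, so the two admission types end up with two and one patients.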
8
Reading SQL vs Understanding SQL
Reading order: SELECT, FROM, JOIN, WHERE, GROUP BY, ORDER BY
Understanding order: FROM, JOIN, SELECT, GROUP BY, WHERE, ORDER BY
9
CONCEPT – JOINs
Basic Concept: JOINs combine rows from two (or more) tables, based on a related column.
- The values in the related column of each table are compared, row by row, to see if they match.
- When they match, the rows are combined into a temporary table.
- Columns are then selected from the temporary table to be shown in the results.
Core concept of a relational database
Many types of JOINs to account for different ways of relating tables to one another (can even JOIN a table to itself)
The database’s Query Optimizer is in charge of figuring out the most efficient way to join these tables together
    SELECT p.patient_nbr, at.description, p.age
    FROM patients p
    JOIN admission_type at ON p.admission_type_id = at.id
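One way to get a feel for the "many types of JOINs" point is to run the slide's query as a plain (inner) JOIN and again as a LEFT JOIN. The sketch below assumes SQLite (through Python's sqlite3 module) in place of Postgres, and the rows are made up; patient 103 deliberately points at an admission type that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    CREATE TABLE admission_type (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE patients (patient_nbr INTEGER, age INTEGER, admission_type_id INTEGER);
    -- Invented rows; admission_type_id 9 has no match on purpose.
    INSERT INTO admission_type VALUES (1, 'Emergency'), (2, 'Elective');
    INSERT INTO patients VALUES (101, 34, 1), (102, 71, 2), (103, 52, 9);
""")

# Inner JOIN: only rows whose related columns match are kept.
inner_rows = conn.execute("""
    SELECT p.patient_nbr, at.description, p.age
    FROM patients p
    JOIN admission_type at ON p.admission_type_id = at.id
    ORDER BY p.patient_nbr
""").fetchall()

# LEFT JOIN: every patient is kept; a missing match shows up as NULL (None in Python).
left_rows = conn.execute("""
    SELECT p.patient_nbr, at.description, p.age
    FROM patients p
    LEFT JOIN admission_type at ON p.admission_type_id = at.id
    ORDER BY p.patient_nbr
""").fetchall()

print(inner_rows)
print(left_rows)
```

The inner JOIN silently drops patient 103, while the LEFT JOIN keeps the row with an empty description. Which behavior is right depends on the question being asked.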
10
CONCEPT – GROUP BY Clause
Basic Concept: Group rows together based on matched values, and compute (summarize, roll up) values based on the groups.
- Aggregation
    SELECT p.age, COUNT(p.patient_nbr)
    FROM patients p
    JOIN admission_type at ON p.admission_type_id = at.id
    GROUP BY p.age
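As a rough illustration of grouping plus aggregation, the sketch below computes two summaries per admission type. It assumes SQLite via Python's sqlite3 module rather than Postgres, and the three patient rows are invented so the arithmetic is easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    CREATE TABLE admission_type (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE patients (patient_nbr INTEGER, age INTEGER, admission_type_id INTEGER);
    -- Invented rows: two Emergency patients (ages 30 and 50), one Elective (age 60).
    INSERT INTO admission_type VALUES (1, 'Emergency'), (2, 'Elective');
    INSERT INTO patients VALUES (101, 30, 1), (102, 50, 1), (103, 60, 2);
""")

rows = conn.execute("""
    SELECT at.description,
           COUNT(p.patient_nbr) AS patient_count,  -- rows in each group
           AVG(p.age)           AS avg_age         -- summary computed per group
    FROM patients p
    JOIN admission_type at ON p.admission_type_id = at.id
    GROUP BY at.description
    ORDER BY at.description
""").fetchall()

print(rows)
```

Every column in the SELECT is either listed in GROUP BY or wrapped in an aggregate function, which is the rule Postgres enforces for grouped queries.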
11
Consistency … not data consistency (although that’s important too)
I’m talking about consistent process and organization for your work!
12
Consistency also means organized…
Whatever workflow process you have, be consistent and organized.
- Consistent requirements: Keeps you and the requestor/reviewer on the same page (sometimes it’s hard to get consistent requirements)
- Consistent validation steps: Reviewers can understand why you did what you did
- Consistent workflow: See Google Doc
13
Use Aliases
No aliases 👎 Has aliases 👍👍👍
Aliases are your friend! Use them and be consistent when using them. See examples. While this top query works without aliases, it’s confusing. We can’t tell which fields in the SELECT come from which table.
Aliases…
- Simplify your work: don’t have to write out full table names
- Allow you to rename fields to something more applicable
- Help with confusion (there happens to be a ‘description’ field in multiple tables...)
- If left out, often cause errors due to ambiguity
- Give reviewers a helping hand
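The ambiguity problem is easy to reproduce. In this sketch, which assumes SQLite through Python's sqlite3 module instead of Postgres, a hypothetical discharge_type table (not from the slides) also has a description column. The unaliased query fails, and the aliased one says exactly which description is which.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    CREATE TABLE admission_type (id INTEGER PRIMARY KEY, description TEXT);
    -- discharge_type is a hypothetical second table with its own 'description'.
    CREATE TABLE discharge_type (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE patients (patient_nbr INTEGER, admission_type_id INTEGER, discharge_type_id INTEGER);
    INSERT INTO admission_type VALUES (1, 'Emergency');
    INSERT INTO discharge_type VALUES (1, 'Home');
    INSERT INTO patients VALUES (101, 1, 1);
""")

# No aliases: the database cannot tell which 'description' is meant.
try:
    conn.execute("""
        SELECT patient_nbr, description
        FROM patients
        JOIN admission_type ON admission_type_id = admission_type.id
        JOIN discharge_type ON discharge_type_id = discharge_type.id
    """)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# With aliases: each column names its table, and the output columns get clearer names.
rows = conn.execute("""
    SELECT p.patient_nbr,
           at.description AS admission,
           dt.description AS discharge
    FROM patients p
    JOIN admission_type at ON p.admission_type_id = at.id
    JOIN discharge_type dt ON p.discharge_type_id = dt.id
""").fetchall()

print(error)
print(rows)
```

Postgres rejects the first query for the same reason, although the wording of its error message differs from SQLite's.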
14
Validation Extremely important. Often forgotten. Why?
- Accuracy - You want to be sure that the result set is the expected one.
- Code Reviews - Reviewers might want to know what you did.
- Data results can be surprising - Be prepared to defend your work!
- Revisiting a query - If you wrote a query a month ago, you might not remember why you wrote that weird WHERE clause.
Know how to validate your query
Simple things I look at…
- Do the row counts match?
- If the results look odd, they probably aren’t correct.
- Are there oddities in the data itself?
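A row-count check can be as small as comparing COUNT(*) before and after a JOIN. The sketch below is one possible version, assuming SQLite via Python's sqlite3 module in place of Postgres. The duplicated lookup row is invented to show the kind of oddity that inflates a result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    -- No primary key here, so a duplicate id can slip in (invented scenario).
    CREATE TABLE admission_type (id INTEGER, description TEXT);
    CREATE TABLE patients (patient_nbr INTEGER, admission_type_id INTEGER);
    INSERT INTO admission_type VALUES (1, 'Emergency'), (1, 'Emergency'), (2, 'Elective');
    INSERT INTO patients VALUES (101, 1), (102, 2), (103, 1);
""")

# Step 1: how many rows do we expect? One per patient.
base_count = conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]

# Step 2: how many rows does the joined query actually produce?
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM patients p
    JOIN admission_type at ON p.admission_type_id = at.id
""").fetchone()[0]

# Step 3: if the counts differ, look for duplicate keys in the lookup table.
duplicate_ids = conn.execute("""
    SELECT id, COUNT(*)
    FROM admission_type
    GROUP BY id
    HAVING COUNT(*) > 1
""").fetchall()

print(base_count, joined_count, duplicate_ids)
```

Three patients turning into five joined rows is the signal that something in the data is off, and the HAVING query points at the duplicated id.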
15
Tips, Tricks and Gotchas
- Comments
- Think ahead about how much data you’re retrieving
  - SELECT * FROM big_table is bad
  - COUNT(*) - Great for validation and learning about a table
- SELECT 1; SELECT 2+2; SELECT function(); - Great for date math!
- COALESCE();
- WHERE 1 = 1
- WHERE TRUE
- BEGIN TRANSACTION
Gotchas
- NULLs: are they truly NULL values, or are they empty strings?
- Joining on different data types
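A few of these tips and the NULL gotcha fit in one small experiment. This is a sketch under assumptions: SQLite via Python's sqlite3 module stands in for Postgres, and the visit_notes table and its rows are hypothetical. COALESCE and NULLIF exist in both databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    -- Hypothetical table: one real comment, one empty string, one true NULL.
    CREATE TABLE visit_notes (id INTEGER PRIMARY KEY, comment TEXT);
    INSERT INTO visit_notes VALUES (1, 'ok'), (2, ''), (3, NULL);
""")

# SELECT without a table is a quick way to try out an expression.
four = conn.execute("SELECT 2 + 2").fetchone()[0]

# COUNT(*) counts rows; COUNT(column) skips NULLs but still counts empty strings.
total_rows, non_null_comments = conn.execute(
    "SELECT COUNT(*), COUNT(comment) FROM visit_notes"
).fetchone()

# An empty string is not NULL, so these two filters return different rows.
null_ids = conn.execute("SELECT id FROM visit_notes WHERE comment IS NULL").fetchall()
empty_ids = conn.execute("SELECT id FROM visit_notes WHERE comment = ''").fetchall()

# NULLIF turns '' into NULL, then COALESCE swaps any NULL for a default.
cleaned = conn.execute("""
    SELECT id, COALESCE(NULLIF(comment, ''), 'n/a')
    FROM visit_notes
    ORDER BY id
""").fetchall()

print(four, total_rows, non_null_comments, null_ids, empty_ids, cleaned)
```

Running the two WHERE filters side by side is a cheap way to answer the "truly NULL or empty string?" question for any text column.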
16
Resources
Free
- stackoverflow.com
- W3schools.com
- discuss.codecademy.com
Not Free, but can be worth it
- Lynda.com
- Udemy.com
- Coursera.org
Free Data Sources
- NOAA climate datasets:
- American Economic Association (AEA):
- Google Finance:
- EHDP Large Health Data Sets:
- Airports and their locations: their-locations
- Machine Learning Data Set Repository:
- Facebook Social Networks (since 2007):