Using Structured Query Language (SQL) NCCS Applications –MS Access queries (“show SQL”) –SAS (PROC SQL) –MySQL (the new dataserver) –Visual Foxpro Other popular databases –Oracle, Microsoft SQL Server
SQL or “Zen and the Art of Working with Data” (Part 1)
Training Goals Understand CONCEPTS and VISUALIZE the data. Learn a few key terms that you can use for searching programming language guides or indexes. “Computer awareness”: Learn to think like a computer (when needed). Improve data problem solving.
Basic Concepts Database – collection of tables Table – rows (records) and columns (also called variables or fields)
Basic Commands SELECT.. FROM.. SELECT * FROM temp INSERT INTO... INSERT INTO temp (ein, name) VALUES (“ ”, “My org.”) UPDATE SET... UPDATE temp SET EIN = “ ”, NAME = “My org.” WHERE NAME = “My Old Org.” DELETE WHERE ein = “ ” CREATE DROP
SELECT CREATE TABLE tempNew AS SELECT ein, name FROM temp ORDER BY ein, name WHERE NAME = “ ” Which fields? Three choices: * (asterisk) List fields separated by commas Create new fields (“firstname || lastname AS fullname”) FROM what source? Sort?... ORDER BY ein, name For which records? WHERE NAME = “ ” Where is the result going? CREATE TABLE tempNew AS SELECT * FROM temp; (SAS) CREATE TABLE tempNew SELECT * FROM temp; SELECT... INTO TABLE tempNew (Foxpro) Summary or Detail Output?
“Calculated Fields”/Expressions Concatenate & manipulate strings: SELECT ein || taxper AS nccskey, LOWER(SUBSTR(name,1,30)) as name2... FROM core99... Create “row” percentages: SELECT p1tcont/p1totrev * 100 as contPct,... Flag problem records: SELECT IIF( p1tcont > p1totrev, 1, 0) as contbad... SELECT SUM( contbad) as bad, COUNT(ein) as count... GROUP BY nteecc
SELECT for Summarizing SELECT state, SUM(p1tcont) as totCont, COUNT( ein) as COUNT FROM temp GROUP BY state GROUP BY vs. ORDER BY Skip the GROUP BY, get a “total total” Summary functions: SUM() COUNT() - May behave differently in different programs. COUNT(*), COUNT(ein), COUNT( p1tcont) AVG() MIN() MAX()
Joining Tables: Sample Data
Inner & Outer Joins Inner = intersection of 2 sets SELECT core92.exps as exp92, core96.exps as exp96 FROM core92, core96 WHERE core92.ein = core96.ein; Outer = union of 2 sets SELECT core92.exps as exp92, core96.exps as exp96 FROM core92 LEFT JOIN core96 ON core92.ein = core96.ein;
Advanced Topics Subqueries: “Organizations in 1999 but NOT in 1998 files?” SELECT * FROM core99 WHERE EIN NOT IN (SELECT ein FROM core98); Dynasets (Access), Views (SAS) vs. Static sets: Data from some Access queries or Views can be edited. Working with LARGE datasets: INDEXing or SORTing dramatically improve performance. Find duplicates: SELECT ein, COUNT(ein) … GROUP BY ein HAVING count > 1;
Creating Tables, Inserting Rows Creating a table: CREATE new (ein C(9), name C(50)) Inserting a row: INSERT INTO new (ein, name) VALUES (core99.ein, core99.name);
Comparison Operators =, >, =, <= MySQL, SAS: name LIKE “%THEAT%” Access: name LIKE “*THEAT*”
ODBC: Open Data Base Connectivity To get there: Control Panel, ODBC or Admin. Tools, ODBC
Access: “SQL View” INSERT INTO vartest ( variable, format, disabled, label, qSubText, sequence, keyfield, display, required, form ) SELECT [Variable Dictionary].Field_name AS variable, IIf(IsNull([NDW_format]),"float0",[ndw_format]) AS format, [Variable Dictionary].NDW_disabled AS disabled, [Variable Dictionary].NDW_label AS LABEL, [Variable Dictionary].Description AS QSUBTEXT, [Variable Dictionary].NDW_sequence AS sequence, [Variable Dictionary].NDW_keyfield AS keyfield, [Variable Dictionary].NDW_display AS display, [Variable Dictionary].NDW_required AS required, "Core90Test" AS form FROM [Variable Dictionary] INNER JOIN (Files INNER JOIN [File/Variable Junction] ON (Files.[File ID] = [File/Variable Junction].[File ID]) AND (Files.[File ID] = [File/Variable Junction].[File ID])) ON [Variable Dictionary].FieldID = [File/Variable Junction].FieldID WHERE (((Files.filename1)="core90c3"));
Resources online documentationwww.mySql.com UI SAS users group intranet: Full SAS manuals are also online (somewhere)