Cursors Organized by Farrokh Alemi, Ph.D. Narrated by Yara Alemi

Presentation transcript:

This section provides a brief introduction to cursors, a tool used to test the predictive accuracy of models. The presentation was organized by Dr. Alemi and narrated by Yara Alemi.

Purpose of Cursor

In SQL Server, the cursor is used to go through data one row at a time and, for each row, redo a set of calculations. A row-by-row analysis of data is agonizing and time-consuming but may be needed in certain types of analysis.

Adjusted & Re-Run

A row-by-row analysis of data is needed when the SQL has to be adjusted, a field modified, and the SQL re-run. One makes minor changes and sees how the entire SQL calculation changes. We might want to do this to understand the sensitivity of results to specific assumptions. Or one might want to do row-by-row calculations to escape computational limitations that will not allow analysis of the entire data.

Doc, will it work for me?

A good example of the use of cursors occurs in precision medicine. In precision medicine, one has to find a subset of data that fits the patient. This is hard to do, and cursors allow for repeated SQL queries that change one parameter to see if the fit to the patient improves. Suppose we want to create decision support for the selection of antidepressants. The task seems simple at first: find the antidepressant that worked for matched cases in the database. In SQL, this would be done by calculating remission rates for different antidepressants among cases matched on the patient's features.

Cursors Can Be Used to Find Rough Matches

The problem arises with matching cases to the patient. Since patients have many features, no case in the data may match the characteristics of the patient at hand. A procedure is needed that repeatedly drops less relevant features and checks whether a sufficient number of cases in the database match the patient. This repeated dropping of features and re-matching of cases can be done using cursors. These slides show how we can do so.

Syntax of Cursor

The use of cursors requires several lines of interrelated code.

DECLARE cursor_name [ INSENSITIVE ] [ SCROLL ] CURSOR
    FOR select_statement
    [ FOR { READ ONLY | UPDATE [ OF column_name [ ,...n ] ] } ]
[;]

The first step is to declare the name of the cursor.

INSENSITIVE makes a temporary copy of the data in the tempdb system database. The cursor reads from that copy, so later changes to the base tables are not reflected in the fetched rows, and the cursor cannot be used to modify the data.

If SCROLL is not specified, NEXT is the only fetch option available. If SCROLL is specified, all fetch options (FIRST, LAST, PRIOR, NEXT, RELATIVE, ABSOLUTE) are available.

The select statement is an ordinary SELECT statement, but it cannot contain the INTO or FOR BROWSE keywords.

READ ONLY prevents updates made through the cursor.

UPDATE defines the updatable columns within the cursor. If OF column_name is specified, only the listed columns can be modified. If UPDATE is specified without a column list, all columns can be updated.
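
The DECLARE statement only defines the cursor. To walk through the rows, it is normally followed by OPEN, FETCH, CLOSE and DEALLOCATE. The following is a minimal sketch of that life cycle, not code from the slides. The temporary table #Features, its columns Position and Feature, and the cursor name feature_cursor are hypothetical names chosen for illustration.

```sql
-- Hypothetical lookup table: one row per patient feature (names are assumptions)
CREATE TABLE #Features (Position INT, Feature VARCHAR(20));
INSERT INTO #Features VALUES (1, 'Gender'), (2, 'Diabetes'), (3, 'Cancer');

DECLARE @Position INT, @Feature VARCHAR(20);

-- Declare a read-only cursor over the lookup table
DECLARE feature_cursor CURSOR FOR
    SELECT Position, Feature FROM #Features ORDER BY Position
    FOR READ ONLY;

OPEN feature_cursor;
FETCH NEXT FROM feature_cursor INTO @Position, @Feature;

-- @@FETCH_STATUS is 0 as long as the last FETCH returned a row
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Placeholder for the per-row calculations
    PRINT 'Row ' + CAST(@Position AS VARCHAR(10)) + ': ' + @Feature;
    FETCH NEXT FROM feature_cursor INTO @Position, @Feature;
END

-- Release the cursor and the temporary table
CLOSE feature_cursor;
DEALLOCATE feature_cursor;
DROP TABLE #Features;
```

Because SCROLL is not specified here, FETCH NEXT is the only fetch option the cursor supports, which is all a simple top-to-bottom pass needs.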

Syntax of Do While Command

As with cursors, the Do-While command requires several lines of interrelated code. In SQL Server the loop is written with the WHILE keyword.

-- Start an index & repeat calculations
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= Max_Value)   -- Max_Value stands for the number of iterations
BEGIN
    …
    -- Do intended calculations using SELECT;
    SET @Index = @Index + 1
END
GO

The Do-While command accomplishes the goals of a cursor by increasing an index value at each iteration. Think of the index value as the row in the cursor command. The index is typically declared as an integer and, since it is a local variable, its name must begin with an at sign (@). Typically, the initial index value is set to 1.

The WHILE command tells the computer when to stop: the loop continues as long as its condition holds. In this case, we are telling the computer to stop once the index exceeds the maximum value.

The BEGIN command marks the start of the calculations. The END command marks the end of the calculations.

After the calculations are made, the index value is increased by one. The code continues to run until the index value exceeds the maximum allowed in the WHILE command.

Until then, the code loops back to BEGIN and recalculates the variables, now using a new index value.
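
For readers who want to run the loop pattern as is, here is a minimal sketch in which the maximum is held in a variable. The variable @MaxValue and its value of 3 are assumptions made for illustration, and the PRINT statement stands in for the intended calculations.

```sql
DECLARE @Index INT = 1;      -- loop counter, starts at 1
DECLARE @MaxValue INT = 3;   -- assumed number of iterations

WHILE (@Index <= @MaxValue)
BEGIN
    -- Placeholder for the intended calculations
    PRINT 'Iteration ' + CAST(@Index AS VARCHAR(10));

    -- Move to the next index value; without this line the loop never ends
    SET @Index = @Index + 1;
END
```

The loop body runs for index values 1, 2 and 3 and exits when the index reaches 4, because the condition is checked before each pass.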

Example: Find Subset of Data

We demonstrate the use of cursors in the context of repeatedly looking for different subsets of cases in the database that fit the patient at hand.

Our Plan for the Code

[Flowchart: Initialize Cursor; Concatenate Variables (Gender, Diabetes, Cancer; All=010 at Index=0); Go to Next Row; Drop Index Variable; GROUP BY modified All; Do calculations; End?; Exit. At Index=1, All=-10 (1st variable always same). At Index=2, All=0-0 (2nd variable always same). At Index=3, All=01- (3rd variable always same).]

Since our plans are a bit complicated, we have drawn an outline of how we want the code to function.

All Features into One Field

We begin by initializing the data and concatenating all relevant features of the patient into one variable.

Look up which feature to drop

If we started with n features concatenated to each other, we look up the cursor row and see which feature is scheduled to be dropped next.

Both presence & absence = dash

The feature is dropped from the All field by giving both the presence and the absence of the feature the same symbol, in our case a dash.

Identify subset of data, redo SQL

We use the GROUP BY command to re-examine subsets of data that match the modified All field. Since the modified All field has replaced the presence or absence of one of the patient's features with a dash, GROUP BY ignores this feature. In effect, we have redone the SQL and calculated it in a subset of data minus one key feature.

Finish when all features dropped

The loop repeats these steps for each feature in turn and exits once every feature has been dropped.

Example: Walk through Code

Suppose we have a patient who is male, diabetic and has cancer. Suppose that in our data we do not have anyone who is male, diabetic and has cancer. We have to look for some other combination of features that we do have in the data. Let us walk through this code to see how we can match to a reduced subset of the patient's features.

Walk through with 3 Features

Since we have to examine the relationship of medications to remission across combinations of features, it is important to include data where the feature is present and where it is absent. So we include 3 fields in our features: gender, diabetes and cancer. Each of these fields can take two values. We concatenate these 3 features into one field, which we call All. If all the variables are present, we will have 111. A code of 010 means that we have a female patient who is diabetic but does not have cancer.

We look up the next row. We are now at row 1 if we are using a cursor and at index 1 if we are using a Do-While command.

1st variable always same

We need to drop the first feature. We recode the All field so that the first character is always a dash. Now this feature does not change across the data.

We now group by the All field and calculate the remission rate associated with various antidepressants in this subset of data.

2nd variable always same

We now repeat the analysis and move the cursor to the next row, row 2. Now we drop feature number 2. As before, we do this by making it always a dash. Finally, after the All field has been modified, we redo the calculations and find remission rates associated with various antidepressants.

3rd variable always same

In this step, the cursor moves to the next feature; we are now at row 3. As before, we drop the feature in row 3, in this case cancer. We calculate which antidepressant is most associated with symptom remission. At exit, we have examined matches to two out of three features of the patient and, within these subsets, identified which antidepressant was most associated with remission.
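
The three-feature walk-through can be sketched on made-up data as follows. This is an illustration under stated assumptions, not the presenters' code: the table #Toy, its columns, the four cases and the drug names are invented, and a WHERE clause is used to keep only the cases that match the patient on the remaining features, which is a simplification of grouping by the modified All field.

```sql
-- Made-up cases: AllFeatures = Gender + Diabetes + Cancer as '0'/'1' characters
CREATE TABLE #Toy (AllFeatures CHAR(3), Drug VARCHAR(20), Remission INT);
INSERT INTO #Toy VALUES
    ('110', 'Drug A', 1),
    ('010', 'Drug A', 0),
    ('011', 'Drug B', 1),
    ('000', 'Drug B', 0);

DECLARE @Patient CHAR(3) = '010';  -- female, diabetic, no cancer
DECLARE @Index INT = 1;

WHILE (@Index <= 3)
BEGIN
    -- Replace the character at position @Index with a dash, for the patient
    -- and for every case, then keep the cases that match on what remains
    SELECT STUFF(@Patient, @Index, 1, '-') AS Stratum,
           Drug,
           COUNT(*) AS Cases,
           AVG(CAST(Remission AS FLOAT)) AS RemissionRate
    FROM #Toy
    WHERE STUFF(AllFeatures, @Index, 1, '-') = STUFF(@Patient, @Index, 1, '-')
    GROUP BY Drug;

    SET @Index = @Index + 1;
END

DROP TABLE #Toy;
```

On the first pass the patient's stratum becomes -10, so the cases coded 110 and 010 both match once gender is ignored. The later passes do the same for diabetes and cancer.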

Example: SQL Code

Now let us look at the actual SQL code. This code was written to track the impact of 9 disabilities, gender and age on mortality in nursing home patients.

-- Alive in 6 months if never dead or dead after 180 days
SELECT [dbo].[Data].ID, AssessmentID, DayLast AS [Days], Cast(Age as float) as Age
    , CASE
        WHEN EverDead=0 THEN 0.
        WHEN EverDead=1 and DayLast>0 and cast(DayLast as Float)<=180 THEN 1.
        WHEN EverDead=1 and DayLast>0 and cast(DayLast as Float)>180 THEN 0.
        ELSE Null -- no 6 months outcomes are available on last assessment
      END AS Dead
    -- Concatenate gender, 9 disabilities and age into one text field
    , CASE WHEN Sex='M' THEN '1' ELSE '0' END
    + CASE WHEN uEat=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uSit=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uGroom=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uToilet=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uBathe=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uUrine=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uWalk=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uDress=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uBowel=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN (Cast(Age as Float) + CAST(DayFirst as float)/365.) > 74. THEN '1' ELSE '0' END
      AS AllVariables
INTO dbo.[Concatenate]
FROM [dbo].[Data] inner join #EverDead ON [dbo].[Data].ID=#EverDead.ID
WHERE Age>19 and DayLast>0 -- drop negative age and last assessment
GO

In the first step, we concatenate the features of the patient into a field called AllVariables. Gender, various disabilities of the patient, and age are converted to either 0 or 1 text characters and then concatenated together using the + operator, which joins text strings. We assume that in our data no case fits all 11 features of a patient, so we need to select a subset of these features for which we can find relevant cases in our data.
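
To see the concatenation step in isolation, here is a small sketch using two made-up patients and only three features. The inline rows and the column names Sex, Diabetes and Cancer are assumptions for illustration and do not come from the nursing home data.

```sql
-- Each CASE expression yields a one-character string; + joins them left to right
SELECT Sex, Diabetes, Cancer,
       CASE WHEN Sex = 'M'    THEN '1' ELSE '0' END
     + CASE WHEN Diabetes = 1 THEN '1' ELSE '0' END
     + CASE WHEN Cancer = 1   THEN '1' ELSE '0' END AS AllFeatures
FROM (VALUES ('F', 1, 0),    -- female, diabetic, no cancer -> '010'
             ('M', 1, 1))    -- male, diabetic, cancer      -> '111'
     AS p(Sex, Diabetes, Cancer);
```

The CASE results must be text ('1' and '0' in quotes). If they were numbers, + would add them instead of concatenating them.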

-- Start an index & repeat for each variable
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
    …

In the next step, we declare an integer variable called @Index. Instead of using a cursor that moves through the data one row at a time, we can use the index to track the row we are at. At the start, we set the index to one. We plan to go through all 9 disabilities, gender and age, so we plan to do this for 11 patient features. Therefore the WHILE condition is set to a maximum of 11 loops. We will exit after the 11th loop.

-- Start an index & repeat for each variable
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
    -- Set strata to all variables except case/control variable
    SELECT STUFF(AllVariables, @Index, 1, '_') AS cStrata
    INTO #Cases
    FROM [dbo].[Concatenate]
    …

The STUFF command replaces the character at the position given by the index with a placeholder symbol (an underscore in this code, shown as a dash in the earlier outline). It does this whether the feature is present or absent. De facto, the STUFF command removes one of the patient features from the AllVariables field and stores the result in a field called cStrata.
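
A quick way to see what STUFF does is to apply it to the three-character code from the earlier outline. This standalone sketch uses a dash, as in the outline, rather than the underscore used in the nursing home code; the column aliases are illustrative.

```sql
-- STUFF(string, start, length, replacement): delete `length` characters
-- beginning at position `start`, then insert `replacement` at that position
SELECT STUFF('010', 1, 1, '-') AS DropFirst,   -- '-10'
       STUFF('010', 2, 1, '-') AS DropSecond,  -- '0-0'
       STUFF('010', 3, 1, '-') AS DropThird;   -- '01-'
```

Positions are counted from 1, which is why the loop index can be passed to STUFF directly.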

-- Start an index & repeat for each variable
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
    -- Set strata to all variables except case/control variable
    SELECT STUFF(AllVariables, @Index, 1, '_') AS cStrata
    INTO #Cases
    FROM [dbo].[Concatenate]
    -- Group by all variables except the case/control variable
    GROUP BY STUFF(AllVariables, @Index, 1, '_')

The GROUP BY command tells the computer to aggregate the data within all combinations of the remaining features, ignoring the feature that the STUFF command has replaced with the placeholder symbol. Note that since the symbol is the same in all cases, the GROUP BY command is not affected by it.

-- Start an index & repeat calculations
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
    …
    -- Do intended calculations;
    -- e.g. calculate remission from antidepressants
    SET @Index = @Index + 1
END
GO

The code ends by increasing the index value by 1, hence moving the cursor to the next row. Once the index value exceeds 11, the code stops; otherwise it goes back to BEGIN and redoes the calculations. De facto, this procedure drops one feature at a time until we find cases in our database that match the patient. The cursor or the Do-While command helps repeatedly execute the SQL code and find numerous subsets of data that fit the patient at hand.
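
One practical detail when assembling the fragments above: SELECT ... INTO #Cases creates the temporary table, so a second pass through the loop would be expected to fail because #Cases already exists. The sketch below assembles the loop with a DROP TABLE at the end of each pass. The DROP TABLE line and the placeholder comment are additions made for illustration and are not part of the slides; the sketch assumes dbo.Concatenate has already been created by the earlier step.

```sql
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
    -- Set strata to all variables except the one at position @Index
    SELECT STUFF(AllVariables, @Index, 1, '_') AS cStrata
    INTO #Cases
    FROM [dbo].[Concatenate]
    GROUP BY STUFF(AllVariables, @Index, 1, '_')

    -- Placeholder: do the intended calculations with #Cases here

    -- Assumed addition: remove the temporary table so that the next
    -- pass can create it again with SELECT ... INTO
    DROP TABLE #Cases

    SET @Index = @Index + 1
END
GO
```

An alternative design is to create the table once before the loop and use INSERT INTO ... SELECT inside it, adding the index as a column so that the results of all 11 passes are kept side by side.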

Cursor and Do-While commands repeatedly execute SQL code.