Published by José Antonio Flores Jiménez. Modified over 6 years ago.
1
Cursors Organized by Farrokh Alemi, Ph.D. Narrated by Yara Alemi
This section provides a brief introduction to cursors, a tool used to test the predictive accuracy of models. This brief presentation was organized by Dr. Alemi and narrated by Yara Alemi.
2
Purpose of Cursor In SQL Server the cursor is used to go through data one row at a time and for each row redo a set of calculations. A row by row analysis of data is agonizing and time consuming but may be needed in certain types of analysis.
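As a rough analogy outside SQL Server, the sketch below uses Python's built-in sqlite3 module to walk a result set one row at a time and redo a calculation for each row. The table `visits`, its columns, and the 90-day rule are made up for illustration; this is not T-SQL cursor syntax.

```python
import sqlite3

# Hypothetical in-memory table, used only to illustrate row-by-row processing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER, days INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?, ?)", [(1, 30), (2, 200), (3, 95)])

cur = conn.execute("SELECT id, days FROM visits ORDER BY id")
results = []
row = cur.fetchone()          # fetch the first row
while row is not None:        # stop when no rows are left
    patient_id, days = row
    # Per-row calculation (illustrative): flag stays longer than 90 days.
    results.append((patient_id, days > 90))
    row = cur.fetchone()      # move to the next row
conn.close()

print(results)
```

A single set-based SELECT would normally be faster; the loop form is useful only when each row needs its own pass of calculations.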
3
Adjusted & Re-Run A row by row analysis of data is needed when the SQL has to be adjusted, a field modified, and the SQL re-run. One makes minor changes and sees how the entire SQL calculation changes. We might want to do this to understand the sensitivity of results to specific assumptions. Or one might want to do row by row calculations to escape computational limitations that will not allow analysis of the entire data.
4
Doc, will it work for me? A good example of the use of cursors occurs when we try to use precision medicine. In precision medicine one has to find a subset of data that fits the patient. This is hard to do, and cursors allow for repeated SQL runs that change one parameter to see if the fit to the patient improves. Suppose we want to create a decision support tool for selecting antidepressants. The task seems simple at first: find the antidepressant that worked for matched cases in the database. In SQL, this would be done by calculating remission rates for different antidepressants among cases matched on the patient’s features.
5
Cursors Can Be Used to Find Rough Matches
The problem arises with matching cases to the patient. Since patients have many features, no case in the data may match the characteristics of the patient at hand. A procedure is needed that repeatedly drops less relevant features and checks whether a sufficient number of cases in the database match the patient. This repeated dropping and re-matching of cases to the patient’s features can be done using cursors. These slides show how we can do so.
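The drop-and-rematch idea can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `relaxed_matches`, the drop order, and the `min_cases` threshold are hypothetical, and features are dropped cumulatively until enough cases match.

```python
def relaxed_matches(patient, cases, drop_order, min_cases):
    """Drop features in drop_order until at least min_cases cases match."""
    kept = list(patient)                       # features still used for matching
    matched = []
    for feature in [None] + list(drop_order):  # None = first try with every feature
        if feature is not None:
            kept.remove(feature)               # stop matching on this feature
        matched = [c for c in cases if all(c[f] == patient[f] for f in kept)]
        if len(matched) >= min_cases:
            break
    return kept, matched

# Illustrative data: 1 = present, 0 = absent.
patient = {"male": 0, "diabetes": 1, "cancer": 0}
cases = [
    {"male": 1, "diabetes": 1, "cancer": 0},
    {"male": 1, "diabetes": 1, "cancer": 1},
    {"male": 0, "diabetes": 0, "cancer": 0},
]
kept, matched = relaxed_matches(patient, cases, ["male", "cancer"], min_cases=2)
print(kept, len(matched))
```

In this toy data no case matches all three features, one case matches after gender is dropped, and two match once cancer is dropped as well.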
6
Syntax of Cursor The use of cursors requires several lines of interrelated code.
7
DECLARE cursor_name [ INSENSITIVE ] [ SCROLL ] CURSOR FOR select_statement [ FOR { READ ONLY | UPDATE [ OF column_name [ ,...n ] ] } ] [;]
The first step is to declare the name of the cursor.
8
INSENSITIVE makes a temporary copy of the data in the tempdb database.
9
If SCROLL is not specified, NEXT is the only fetch option available. If SCROLL is specified, all fetch options (FIRST, LAST, PRIOR, NEXT, RELATIVE, ABSOLUTE) are available.
10
The select statement is any ordinary SELECT statement, but it cannot contain the INTO or FOR BROWSE keywords.
11
READ ONLY prevents updates made through the cursor.
12
UPDATE defines the updatable columns within the cursor. If OF column_name is specified, only the columns listed can be modified. If UPDATE is specified without a column list, all columns can be updated.
13
Syntax of Do While Command
The use of cursors requires several lines of interrelated code.
14
-- Start an index & repeat calculations
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= Max_Value)
BEGIN
  … -- Do intended calculations using SELECT;
  SET @Index = @Index + 1
END
GO
The do-while command accomplishes the goals of a cursor by increasing an index value at each iteration. Think of the index value as the row in the cursor command. The index value is typically declared as an integer and, since it is a local variable, its name must be preceded by an at sign. Typically, the initial index value is set to 1.
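For readers who want to trace the counting pattern by hand, here is the same loop shape in Python. It is only a sketch: `max_value` stands in for the slide's Max_Value placeholder, and the list `visited` replaces the intended calculations.

```python
max_value = 3                  # hypothetical stand-in for Max_Value
index = 1                      # start the index at 1
visited = []
while index <= max_value:      # keep looping while the index is within range
    visited.append(index)      # placeholder for the intended calculations
    index += 1                 # increase the index by one

print(visited, index)
```

The body runs once for each index value from 1 through the maximum, and the loop exits when the index first exceeds it.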
15
A WHILE command tells the computer when to stop. In this case we are telling the computer to stop once the index passes the maximum value.
16
The BEGIN command marks the start of the calculations. The END command marks the end of the calculations.
17
After the calculations are made, the index value is increased by one. The code continues to run until the index value exceeds the maximum allowed in the WHILE command.
18
Until then, the code loops back to BEGIN and recalculates the variables, now using the new index value.
19
Example: Find Subset of Data
We demonstrate the use of cursors in the context of repeatedly looking for different subsets of cases in the database that fit the patient at hand.
20
Our Plan for the Code
[Flowchart. Boxes: Initialize Cursor; Concatenate Variables; Go to Next Row; Drop Index Variable; GROUP BY modified All; Do calculations; End?; Exit. Annotations: Index=0, All=010 (Gender, Diabetes, Cancer); Index=1, All=-10, 1st variable always same; Index=2, All=0-0, 2nd variable always same; Index=3, All=01-, 3rd variable always same.]
Since our plans are a bit complicated, we have drawn an outline of how we want the code to function.
21
Our Plan for the Code: All Features into One Field
We begin with initializing the data and concatenating all relevant features of the patient into one variable.
22
Our Plan for the Code: Look up which feature to drop
If we have started with n features concatenated to each other, then we look up the cursor row and see which feature is scheduled to be dropped next.
23
Our Plan for the Code: Both presence & absence = dash
The feature is dropped from the All field by giving both the presence and the absence of the feature the same symbol, in our case a dash.
24
Our Plan for the Code: Identify subset of data, redo SQL
We use the GROUP BY command to re-examine subsets of data that match the modified All field. Since the modified All field has replaced the presence or absence of one of the patient's features with a dash, GROUP BY ignores this feature. In effect, we have redone the SQL and calculated it in a subset of data minus one key feature.
25
Our Plan for the Code: Finish when all features dropped
The loop ends once every feature has been dropped in turn.
26
Example: Walk through Code
Suppose we have a patient who is male, diabetic and has cancer. Suppose that in our data we do not have anyone who is male, diabetic and has cancer. We have to look for some other combination of features that we do have in the data. Let us walk through this code to see how we can try to match to a reduced subset of patient features.
27
Walk through with 3 Features
Since we have to examine the relationships of medications to remission across combinations of features, it is important to include data where the feature is present and where it is absent. So we include 3 fields in our features: gender, diabetes and cancer. Each of these fields can have two values. We concatenate these 3 features into one field, which we call All. If all the variables are present we will have 111. If we have a code 010, then it means that we have a female patient who is diabetic but does not have cancer.
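The concatenation into the All field can be sketched as follows. The function name `encode_all` is hypothetical, and the coding (1 = male, 1 = feature present) is an assumption taken from the 010 example in the narration.

```python
def encode_all(male, diabetes, cancer):
    """Concatenate three yes/no features into one 3-character code."""
    # Each feature becomes the text character '1' (present) or '0' (absent).
    return "".join("1" if flag else "0" for flag in (male, diabetes, cancer))

# Female, diabetic, no cancer.
print(encode_all(False, True, False))
# Male, diabetic, with cancer.
print(encode_all(True, True, True))
```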
28
Walk through with 3 Features
We look up the next row. We are now at row 1 if we are using a cursor and at index 1 if we are using a do-while command.
29
Walk through with 3 Features
We need to drop the first feature. We recode the All field so that the first character is always a dash. Now this feature does not change across the data.
30
Walk through with 3 Features
We now group by the All field and calculate the remission rate associated with various antidepressants in this subset of data.
31
Walk through with 3 Features
We now repeat the analysis and move the cursor to the next row, row 2. Now we drop feature number 2. As before, we do this by making it always a dash. Finally, after the All field has been modified, we redo the calculations and find the remission rates associated with various antidepressants.
32
Walk through with 3 Features
In this step, the cursor moves to the next feature; we are now at row 3. As before, we drop the feature in row 3, in this case cancer. We calculate which antidepressant is most associated with symptom remission. At exit we have examined matches to two out of three features of the patient and, within these subsets, identified which antidepressant will work for the patient.
33
Example: SQL Code Now let us look at the actual SQL code. This code was written to track the impact of 9 disabilities, gender and age on mortality in nursing home patients.
34
-- Alive in 6 months if never dead or dead after 180 days
SELECT [dbo].[data].ID, AssessmentID, DayLast AS [Days]
  , Cast(Age as float) as Age
  -- Alive in 6 months if never dead or dead after 180 days
  , CASE
      WHEN EverDead=0 THEN 0.
      WHEN EverDead=1 and Daylast > 0 and cast(DayLast as Float)<=180 THEN 1.
      WHEN EverDead=1 and DayLast> 0 and cast(DayLast as Float)>180 THEN 0.
      ELSE Null -- no 6 months outcomes are available on last assessment
    END AS Dead
  , CASE WHEN Sex='M' THEN '1' ELSE '0' END
    + CASE WHEN uEat=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uSit=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uGroom=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uToilet=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uBathe=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uUrine=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uWalk=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uDress=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN uBowel=1 THEN '1' WHEN Alive='1' THEN '1' ELSE '0' END
    + CASE WHEN (Cast(Age as Float)+ CAST(DayFirst as float)/365.) >74. THEN '1' ELSE '0' END
    AS AllVariables
INTO dbo.[Concatenate]
FROM [dbo].[Data] inner join #EverDead ON [dbo].[Data].ID=#EverDead.ID
WHERE Age>19 and DayLast>0 -- drop negative age and last assessment
Go
In the first step we concatenate the features of the patient into a field called AllVariables. Gender, various disabilities of the patient, and age are converted to either 0 or 1 text characters and then concatenated together using the + operator. We assume that in our data no case fits all 11 features of a patient, so we need to select a subset of these features for which we can find relevant cases in our data.
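The six-month outcome rule in the first CASE expression can be read as the small Python function below. This is a sketch that assumes both inputs are non-null numbers; the function name `dead_within_six_months` is hypothetical and the thresholds are copied from the SQL.

```python
def dead_within_six_months(ever_dead, day_last):
    """Mirror of the CASE expression that produces the Dead field."""
    if ever_dead == 0:
        return 0.0                      # never died
    if ever_dead == 1 and 0 < day_last <= 180:
        return 1.0                      # died within 180 days
    if ever_dead == 1 and day_last > 180:
        return 0.0                      # died, but after 180 days
    return None                         # no six-month outcome available

print(dead_within_six_months(1, 90))
print(dead_within_six_months(1, 181))
```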
35
-- Start an index & repeat for each variable
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
  …
In the next step we declare an integer variable called Index. Instead of using a cursor that moves through the data one row at a time, we can use the index to track the row we are at. At the start we set the index to one. We plan to go through all 9 disabilities, gender and age, so we plan to do this for 11 patient features. Therefore the WHILE is set to a maximum of 11 loops. We will exit after the 11th loop.
36
-- Start an index & repeat for each variable
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
  -- Set strata to all variables except case/control variable
  SELECT STUFF(AllVariables, @Index, 1, '_') AS cStrata
  INTO #Cases
  FROM [dbo].[Concatenate]
  …
The STUFF command replaces the character at the position given by the index with a dash (written as an underscore in the code). It does this whether the feature is present or absent. De facto, the STUFF command removes one of the patient features from the AllVariables field and stores the result in a field called cStrata.
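To make the replacement concrete, here is a rough Python analogue of T-SQL STUFF(text, start, length, replacement). It is a sketch that covers only a 1-based start inside the string and a non-negative length; the 11-character code in the example is invented.

```python
def stuff(text, start, length, replacement):
    """Delete `length` characters at 1-based `start`, then insert `replacement`."""
    if start < 1 or start > len(text):
        return None  # T-SQL STUFF returns NULL when start is out of range
    return text[:start - 1] + replacement + text[start - 1 + length:]

code = "01011010101"             # hypothetical 11-feature code
print(stuff(code, 3, 1, "_"))    # mask the third feature
```

Because the masked position holds the same character in every row, rows that differ only in that feature end up with identical strings.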
37
-- Start an index & repeat for each variable
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
  -- Set strata to all variables except case/control variable
  SELECT STUFF(AllVariables, @Index, 1, '_') AS cStrata
  INTO #Cases
  FROM [dbo].[Concatenate]
  -- Group by all variables except the case/control variable
  GROUP BY STUFF(AllVariables, @Index, 1, '_')
The GROUP BY command tells the computer to aggregate the data within every combination of the features except the one that the STUFF command has replaced with a dash. Because the dash is the same in all cases, the GROUP BY command is not affected by it.
38
-- Start an index & repeat calculations
DECLARE @Index INT
SET @Index = 1
WHILE (@Index <= 11)
BEGIN
  …
  -- Do intended calculations;
  -- e.g. calculate remission from antidepressants
  SET @Index = @Index + 1
END
GO
The code ends by increasing the index value by 1, hence moving to the next row. If the index value has passed 11, the code stops; otherwise it goes back to BEGIN and redoes the calculations. This procedure de facto drops one feature at a time until we find cases in our database that match the patient. The cursor or the do-while commands help repeatedly execute the SQL code and find numerous subsets of data that fit the patient at hand.
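The whole loop can be tried end to end with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not the author's code: the data are three invented features instead of 11, and because SQLite has no STUFF function the masking is done with substr and the || operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Concatenate (AllVariables TEXT, Dead REAL)")
conn.executemany(
    "INSERT INTO Concatenate VALUES (?, ?)",
    [("010", 1.0), ("110", 0.0), ("011", 1.0), ("000", 0.0)],  # made-up rows
)

n_features = 3
summaries = {}
index = 1
while index <= n_features:
    # Mask position `index` with '_' and aggregate within the remaining features.
    rows = conn.execute(
        """
        SELECT substr(AllVariables, 1, ? - 1) || '_' || substr(AllVariables, ? + 1)
                   AS cStrata,
               COUNT(*) AS n,
               AVG(Dead) AS rate
        FROM Concatenate
        GROUP BY cStrata
        ORDER BY cStrata
        """,
        (index, index),
    ).fetchall()
    summaries[index] = rows
    index += 1                     # move on to the next feature
conn.close()

for i, rows in summaries.items():
    print(i, rows)
```

Each pass produces one set of strata in which a different feature is ignored, so a patient's own masked code can be looked up in whichever pass has enough cases.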
39
Cursor and Do-While commands repeatedly execute an SQL Code