Rank Order Function Farrokh Alemi, Ph.D.

In this section we discuss the rank order function in SQL. This brief presentation was organized by Dr. Alemi.

Order Data Based on a Column's Values
The purpose of the RANK and DENSE_RANK functions is to order records based on the values of one column. For example, we can find out whether a patient has been repeatedly admitted to the hospital for the same diagnosis, a situation that happens when the earlier treatment has not worked and the patient is readmitted for further treatment.

Rank & Dense_Rank
The RANK and DENSE_RANK functions differ in how they treat ties, that is, two or more records that have the same value. If two records share a rank, RANK skips the next rank number. DENSE_RANK does not.

For example, if two records tie at rank 1, the RANK function assigns rank 1 to both of them and rank 3 to the next record. It skips rank 2, producing 1, 1, 3, 4.

In contrast, DENSE_RANK ranks the first two records at 1 and the next one at 2, producing 1, 1, 2, 3.
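The two sequences above can be reproduced with a small experiment. The following is a minimal sketch, not the presenter's code: it assumes Python's built-in sqlite3 module with SQLite 3.25 or later (the first version with window functions), and the table name visits and its ages are made up for illustration. In standard SQL the dense variant is spelled DENSE_RANK.

```python
import sqlite3

# Hypothetical in-memory table; the name "visits" and the ages are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (age REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [(60.0,), (60.0,), (61.0,), (62.0,)],  # the first two records tie
)

rows = conn.execute(
    """
    SELECT age,
           RANK()       OVER (ORDER BY age) AS rnk,
           DENSE_RANK() OVER (ORDER BY age) AS dense_rnk
    FROM visits
    ORDER BY age
    """
).fetchall()

ranks = [r[1] for r in rows]
dense_ranks = [r[2] for r in rows]
print(ranks)        # [1, 1, 3, 4]  rank 2 is skipped after the tie
print(dense_ranks)  # [1, 1, 2, 3]  no gap
```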

ID   icd9     Age     Rank
10   I276.1   63.16   1
10   I276.1   64.08   2
10   I276.1   64.25   3
10   I276.1   64.25   3
10   I276.1   64.33   5
10   I276.1   64.66   6
10   I276.1   64.75   7
Here we see what happens for patient number 10. He has the diagnosis 276.1, hyposmolality, repeatedly at different ages. Two of these diagnoses are reported at the same age, so they receive the same rank. We see rank 1, then rank 2, then two records at rank 3; rank 4 is missing, and the next record jumps to rank 5. This is an example of a skip in the ranks.

Advice: If it makes sense, delete repeated entries
There is, of course, no difference between RANK and DENSE_RANK if no two records have the same order. This can be guaranteed by grouping by the fields used to order the rank, a first step that should often be done before using rank functions. When no two records have the same order, no two have the same rank. We then never have to deal with skip patterns, and RANK and DENSE_RANK produce the same results.

Patient having same diagnosis at same time
For example, a patient may have the same diagnosis recorded twice for the same hospital admission, once by one doctor and again by another clinician. With such a tie at rank 3, RANK gives 3, 3, 5, 6, whereas DENSE_RANK gives 3, 3, 4, 5. Ranking these records as two different occurrences of the same diagnosis is a mistake. In this situation, it makes sense to delete the repetitions of the diagnosis for the same person at the same time. This makes the ranking task more efficient and more sensible.

Rank Function Syntax
RANK ( ) OVER ( [ <partition_by_clause> ] <order_by_clause> )
Here is the syntax for the rank function.

The function name RANK starts the syntax. The function takes no arguments, so it is followed by parentheses with nothing in them.

The syntax requires the OVER clause, in which the ORDER BY clause specifies the field or fields used to order the ranks.

The PARTITION BY clause is optional. It defines subgroups of the records within which the rank order restarts.

RANK() OVER (PARTITION BY id, icd9 ORDER BY icd9, ageatdx)
Here is an example of the Rank command.

The ORDER BY clause says that we want to set the order based on the ICD code and the age at which it occurs. Because the ICD code is also a partition field, it is constant within each partition, so the age decides the order there.

The PARTITION BY clause says that we want the rank order to restart from 1 for each individual and each diagnosis.
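To see the restart in action, the sketch below runs the same RANK expression on a few made-up rows. It is an illustration under assumptions, not the presenter's data: the table dx, its codes "A" and "B", and the ages are hypothetical, and it relies on Python's sqlite3 module with SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical rows: two patients, with a repeated code "A" and a single code "B".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dx (id INTEGER, icd9 TEXT, ageatdx REAL)")
conn.executemany(
    "INSERT INTO dx VALUES (?, ?, ?)",
    [
        (1, "A", 50.0), (1, "A", 51.0), (1, "B", 50.5),
        (2, "A", 40.0), (2, "A", 42.0), (2, "A", 45.0),
    ],
)

result = conn.execute(
    """
    SELECT id, icd9, ageatdx,
           RANK() OVER (PARTITION BY id, icd9 ORDER BY icd9, ageatdx) AS repeated_dx
    FROM dx
    ORDER BY id, icd9, ageatdx
    """
).fetchall()

for row in result:
    print(row)

# The rank restarts at 1 whenever the patient or the diagnosis changes.
partition_ranks = [row[3] for row in result]
print(partition_ranks)  # [1, 2, 1, 1, 2, 3]
```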

-- Remove the temporary table left over from an earlier run
DROP TABLE #Temp
USE AgeDx
SELECT ID, icd9, AgeAtDx
    , RANK() OVER (PARTITION BY ID, icd9 ORDER BY icd9, AgeAtDx) AS [Repeated Dx]
INTO #Temp
FROM dbo.final
WHERE ID = 10
GROUP BY ID, icd9, AgeAtDx
-- Show the ranked records
SELECT * FROM #Temp
ORDER BY ID, icd9, [Repeated Dx]
Note that the code uses the table called final from the database called AgeDx; these may have different names in your data.

For ease, the WHERE clause restricts the query to the person with ID 10. Otherwise, these steps would take a long time to carry out.

Duplicates are removed, so RANK and DENSE_RANK give the same results. The GROUP BY clause collapses records in which the same patient has the same diagnosis at the same age into a single record.
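The effect of the GROUP BY step can be checked by running the ranking with and without it. This is a hedged sketch rather than the presenter's setup: it uses Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server, drops the temporary table, and fills a stand-in table named final with hypothetical rows; the code "X1" and the helper rank_list are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE final (ID INTEGER, icd9 TEXT, AgeAtDx REAL)")
conn.executemany(
    "INSERT INTO final VALUES (?, ?, ?)",
    [
        (10, "X1", 64.25), (10, "X1", 64.25),  # same diagnosis recorded twice at the same age
        (10, "X1", 64.75), (10, "X1", 65.25),
    ],
)

def rank_list(group_by: str) -> list:
    """Hypothetical helper: run the ranking query, optionally with a GROUP BY clause."""
    sql = f"""
        SELECT RANK() OVER (PARTITION BY ID, icd9 ORDER BY icd9, AgeAtDx) AS repeated_dx
        FROM final
        WHERE ID = 10
        {group_by}
        ORDER BY AgeAtDx
    """
    return [row[0] for row in conn.execute(sql)]

without_group_by = rank_list("")
with_group_by = rank_list("GROUP BY ID, icd9, AgeAtDx")
print(without_group_by)  # [1, 1, 3, 4]  the duplicate creates a tie and a skipped rank
print(with_group_by)     # [1, 2, 3]     duplicates collapsed, so no tie and no skip
```

Once the duplicate is collapsed, no two records share an order value, so DENSE_RANK would return the same numbers as RANK here.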

The ORDER BY clause includes two variables, icd9 and AgeAtDx.

The PARTITION BY clause also includes two variables, ID and icd9.

ID   icd9      AgeAtDx     Repeated Dx
10   I041.89   64.666666   1
10   I112.0    64.25       1
The slide shows the top 10 lines of the query result. All records refer to patient 10.

At age 64.66, the patient was hospitalized with diagnosis 041.89, a bacterial infection. This infection did not repeat, so the rank does not exceed 1.

The next disease is 112.0, which is candidiasis of the mouth. This disease does not repeat either.

The situation is different for hospitalization with diagnosis 253.6, which is “Other disorders of neurohypophysis.” The patient was hospitalized for this disease 3 times: first at 64.25, then at 64.75, and later at 65.25 years. We see that this disease is ranked 1, 2, and 3 for repetition.

Repetition also occurs for disease 272.4, which is unspecified hyperlipidemia.
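One way to pull out only the repeated diagnoses is to keep the rows whose rank exceeds 1. The sketch below is an assumption-laden illustration, not part of the presentation: it uses Python's sqlite3 module (SQLite 3.25 or later), a stand-in table named final, and only the ages quoted in the narration. The codes are written without the leading "I" seen in the result table, and diagnosis 272.4 is left out because its ages are not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE final (ID INTEGER, icd9 TEXT, AgeAtDx REAL)")
conn.executemany(
    "INSERT INTO final VALUES (?, ?, ?)",
    [
        (10, "041.89", 64.666666),  # occurs once
        (10, "112.0", 64.25),       # occurs once
        (10, "253.6", 64.25), (10, "253.6", 64.75), (10, "253.6", 65.25),  # occurs three times
    ],
)

# Rank within each patient and diagnosis, then keep only the repeat occurrences.
repeats = conn.execute(
    """
    SELECT icd9, AgeAtDx, repeated_dx
    FROM (
        SELECT icd9, AgeAtDx,
               RANK() OVER (PARTITION BY ID, icd9 ORDER BY AgeAtDx) AS repeated_dx
        FROM final
    )
    WHERE repeated_dx > 1
    ORDER BY icd9, AgeAtDx
    """
).fetchall()

print(repeats)  # [('253.6', 64.75, 2), ('253.6', 65.25, 3)]
```

The window function is computed in a subquery because a rank cannot be filtered in the WHERE clause of the same SELECT that defines it.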

This section described the rank order function. This function is useful in the analysis of data in electronic health records.