February 2012

The Mystery Of The SQL Cursor

by Dan Rachlis

Have you ever wondered who used an SQL Cursor or even understood what one was?

Have you ever thought of that question? Chances are that you have but never had the opportunity to put an SQL Cursor into practical use in any of your work related projects. I recently had the opportunity to use an SQL cursor for a project and realized how powerful and useful it is. Having technical tools to manipulate data can be very powerful in being efficient in managing your time to provide deliverables to clients. Actuaries, in particular, need to keep their technical skill set finely tuned to adapt to certain project needs and to provide deliverables to internal and external clients in a timely manner. This article defines and explains what an SQL Cursor is and aims to assist you with understanding and using them in the future.

Let's start with the basics, assuming you know the basics of SQL Server as a tool to manipulate and analyze data: What is an SQL cursor? An SQL cursor is a control structure that enables traversal over the records in a database. In normal language, cursors are used to process and perform logic on individual rows returned by a query. It can be viewed as a pointer to one row in a table. A cursor can only reference one row at a time, but can move through all the rows in a table. Historically, it has been recommended to use a cursor as a last resort because SQL was not meant to be transactional but more dataset oriented. But, it can be very powerful and more efficient than some work-around. A cursor can be used in the SQL query analyzer just like a normal select—from statement. You will only need to learn a few new SQL statements:

  • Declare a cursor that defines a result set or declare a local variable using as Select statement
  • Open the cursor to execute the Select statement and populate the result set
  • Fetch the data into local variables as needed from the cursor, one row at a time
  • Close the cursor when done
  • Deallocate resources when done

SQL Cursor Example—With Correct Syntax

In this example, I am going to define a few cursors (called CLAIM and LINE) which will be used to go through a health claim dataset and renumber the line numbers to be sequential and in a three digit character format. This is from an actual project, but I can't tell you what it is, other than we had to create claim records in a certain format (where the line numbers could not be duplicates and had to be three digits with leading zeros). Now that we have that cleared up, the first step is to create a temporary table from the existing dataset for the unique set of claims for which we need to renumber the claim line numbers. In case you don't know, a temporary table is the same as a regular database table, except that it has the # symbol in front of the name and is only available to use while the database is open.

Select distinct claim_id
Into #distinct_claim_id
From medical_claims_2010

The next step is to declare a few variables that will be used in the cursor. These variables will be used by the cursors in the SQL script below. An SQL local variable is an object that can hold a single data value of a specific type. Variables in SQL scripts are typically used:

  • As a counter either to count the number of times a loop is performed or to control how many times the loop is performed;
  • To hold a data value to be tested by a control-of-flow statement; or
  • To save a data value to be returned by a stored procedure return code or function return value.
declare @claimid_in varchar(50)
declare @claimid_update varchar(50)
declare @claimline# varchar(50)
declare @linesequence# varchar(3)
declare @counter int

The next step is to define the cursors. The CLAIM cursor is the unique claim number that we will use to step through the dataset for each claim and renumber the line number. The LINE cursor is the unique claim number and claim line number that is used to update the dataset with my line counter variable called @linesequence#. The @counter variable is my line number counter for the claim number that I am currently resetting.

Declare CLAIM cursor for select claim_id from #distinct_claim_id
Declare LINE cursor for select claim_id, claim_line_number
From medical_claims_2010 where claim_id=@claimid_in

When you put it all together it looks like the SQL query below (pay close attention to the syntax, as this is very important when writing SQL statements.) The temp table and variables are defined and then the CLAIM cursor is opened and the first record is Fetched from the dataset. The first record result from the query is put into the variable called @claim_in. The Fetch_Status=0 is the test to see if you are at the end of the file, and will kick out if there are no more records to process. Each cursor also has Begin and End statements, similar to a While statement, to tell SQL where it starts and ends.

The next step is to define the cursor LINE (think of this as nested queries or loops.) The cursor LINE is used to update the actual main dataset for each claim and claim line number combination. The LINE cursor is dependent on the CLAIM cursor and the @claimid_in variable that is currently being pointed at. This cursor is opened and the first record is Fetched and put into two local variables called @claim_update and @claimline#. The counter variables are then set to zero and then to the main action section. The current record that is pulled in from the LINE cursor matches the main claim from the CLAIM cursor. The counter is incremented and re-formatted with leading zeros using if condition statements. The current record is then updated using an Update statement. What is unique here is the statement "Where Current of Line," which SQL recognizes as the current record from the cursor, LINE in this case. The next record is Fetched for the claim we are currently processing. When there are no more records (or claim lines, as they are usually referred to) for this unique claim, the internal LINE cursor kicks out and transfers processing back to CLAIM cursor to get the next claim number. The Close statement and Deallocate statement releases the current record set and resources that are used by SQL for the cursor.

compact-2012-rachlis-2

The result of this query would be a transformation that looked like this:

compact-2012-rachlis-1

The cursor process is pretty simple and powerful once you understand the key statements, structure and syntax. Although we (as actuaries) usually don't process data row by row in the work we do, there may be times when you need to loop through a result set a row at a time and perform a certain action. While cursors may seem like a good idea, they can often cause your database application problems as they are slow, and can lock the tables that are used to populate the cursor while the rows in the cursor are looped through. In any case, it is always good to have the cursor knowledge when you want or need to use one.

Dan Rachlis is a specialist master in the Chicago office of Deloitte Consulting LLP. He can be reached at drachlis@deloitte.com or at 312.486.5631.