January 2014

Beyond Excel—SQL

By Andrew Chan

Excel is a global standard analytical tool that is used in every industry. It is powerful and easy to learn. Most analysts receive the data from their IT department. When they need more data, they have to go back to their IT department. It can take days or even weeks before they can obtain the additional data. It would be great if we could retrieve data ourselves. So how does IT retrieve data from the systems?

Most enterprise data is stored in databases, and IT uses structured query language (SQL) to process data held in a relational database management system (RDBMS). We are going to learn what SQL is and how we can use SQL to manage data. We are going to use Microsoft® SQL Server® 2012 Express and a sample database, AdventureWorks. They can be downloaded (FREE) from the following sites:Microsoft® SQL Server® 2012 Express AdventureWorks sample database

This article will focus on how to use SQL statements to retrieve data, and will not cover database architecture or design.

Microsoft SQL Server Management Studio
Once we have installed SQL Server 2012 Express and attached AdventureWorks Sample database, we can launch Microsoft SQL Server Management Studio, where we can start using SQL statements.

Figure 1

Simple SQL Statement

SELECT
* FROM
[AdventureWorks2012].[Person].[Person]
This is probably the simplest form of SQL statement. It will retrieve every column (*) and every row (19,972 rows) from table [person].

Figure 2

We may not always want every column; let’s say we just want [FirstName] and [LastName].
SELECT
[FirstName], [LastName] FROM [AdventureWorks2012].[Person].[Person]

Figure 3
View larger image

We may not always want all records; we can select, e.g., LastName = ‘Brown’.
SELECT
[FirstName], [LastName] FROM [AdventureWorks2012].[Person].[Person] WHERE LastName =‘Brown’

Figure 4
View larger image

We can make sure the records are listed in a certain order, e.g., LastName.
SELECT
[FirstName], [LastName]

FROM
[AdventureWorks2012].[Person].[Person]

WHERE
LastName
=
‘Brown’

ORDER
BY

FirstName

Let’s try another table, [Sales].[SalesOrderDetail].

SELECT
[ProductID]

,[OrderQty]

,[UnitPrice]

FROM
[AdventureWorks2012].[Sales].[SalesOrderDetail]

We may want to calculate the total price, i.e. ,[OrderQty] *
[UnitPrice] , so we add a calculated column [TotalPrice] .

SELECT
[ProductID]

,[OrderQty]

,[UnitPrice]

,[OrderQty] *
[UnitPrice]
as
[TotalPrice]

FROM
[AdventureWorks2012].[Sales].[SalesOrderDetail]

In a simple SQL statement, we discuss the basic elements of an SQL statement.

  • SELECT
  • * – return all columns
  • FROM
  • WHERE
  • ORDER

BY

 

Aggregated SQL Statement
There may be situations that we don’t need every detail; instead we would just like to have the summary, e.g., how many units and sales order per [ProductID].

SELECT
ProductID,
SUM(OrderQty)
AS TotalQty,
COUNT(*)
AS
NumsSales
FROM
Sales.SalesOrderDetail
GROUP
BY

ProductID

We can also filter these aggregated values using HAVING. The following SQL statement returns records that haveNumsSales bigger than 1,000.

SELECT
ProductID,
SUM(OrderQty)
AS TotalQty,
COUNT(*)
AS
NumsSales
FROM
Sales.SalesOrderDetail
GROUP
BY
ProductID
HAVING
COUNT(*)
> 1000

There are other aggregated functions:

  • AVG
  • Max
  • MIN
  • STDEV
  • VAR
  • ...

If we want a list of ProductID without the aggregated values, then we can use DISTINCT.

SELECT
DISTINCT

ProductID
FROM
Sales.SalesOrderDetail

We can easily create a summary report using:

  • GROUP
    BY
  • HAVING
  • SUM, COUNT....
  • DISTINCT

JOIN Statement

With the SQL tool in hand, the actuary is empowered to operate much more quickly and deliver more timely results to customers. The IT Department is spared a repetitive task that otherwise diverts them from their primary mission, to create new and better processes.

Andrew Chan, ASA, is a financial model engineer with ALG Consulting. He can be contacted at chan_a@algconsultings.com.