July 2013

Tables Database goes XtbML part 2

By Steve Strommen

In Part 1 of this article, we reviewed the history that led to the large database of actuarial tables in electronic form. This database is now being actively maintained and expanded by the Society of Actuaries. As part of that history, we noted that the project started in 1995 in the Technology Section, when a custom file format for actuarial tables was invented for the purpose. Years later a more flexible file format called XtbML was developed in cooperation with ACORD, and the database was translated into that format, which is in use today. This article will discuss some of the technical issues that arose in adapting the Table Manager software to use the XtbML file format. One might consider this a case study in adapting legacy software to a new environment.

Scope of Changes
A simple change in file format would not typically require a major rewrite of any program. However, in this case it was not just the format, but the data itself that needed to change. XtbML was developed to allow much more flexibility in the layout of tables. Basically, a table in XtbML consists of one or more sub-tables, each of which is a column of numbers or a rectangular array of numbers. The numbers are indexed by axes along the column and row dimensions. Each axis is defined by a minimum value, an increment between values, and a maximum value. The increment between values on an axis must be equal within any sub-table. Table values are retrieved by providing an index along each axis, so it is important that the axes for the sub-tables within a table not have overlapping ranges, so that software can determine which sub-table contains a requested value.

This presents a major change in logical data structure from the original Table Manager, which allowed either a single column of numbers or a select and ultimate table layout but no other layouts, and which required table values at every age or duration along an axis (the increment between axis values was always 1). Since the Table Manager is basically a tool for managing data, a major change in the logical data structure requires a near-total rewrite. To the extent possible, only the external interface could remain the same.

Choice of Implementation Tools
It had been over a decade since the Table Manager was originally designed, and the tools available for software development evolved significantly over that period. Since a near-total rewrite was needed, the choice of tools was wide open.

Let’s focus on the implementation of the Excel add-in version of the software. This was originally implemented as an .xll file, which is essentially a .dll that provides an interface to Excel defined by the Microsoft Excel System Developer’s Kit. This had been developed in C++. One option was to use the same approach for the new version. However, there were other options.

Two developments in software technology led to the choice of a different approach. First, the .NET platform made the deployment of multiple-use software modules a much simpler task. A .NET assembly, which is a special kind of .dll, can readily be used by a wide variety of other software through a fully object-oriented interface. The second development is the improvement and much wider use of Excel VBA, including the ability to directly use routines in .NET assemblies. It was decided that the core functions of the Table Manager software would be built as a .NET assembly. That way the Excel add-in could use that assembly through VBA code, and any other software (including a stand-alone table manager program or any modeling system or pricing system) could use exactly the same assembly. Users of the updated Excel add-in know that the current version is an .xla file containing the VBA code that accesses the .NET assembly.

Another implementation decision was needed regarding the XML parser to use, because XtbML is based on XML. Writing such a parser from scratch was not a realistic option, and many parsers are available. Conveniently, the .NET framework provides a simple XML reader and writer based on xmlLite, a library of code that is a standard part of Microsoft Windows. The availability of this tool was another reason for the decision to work within the .NET framework.

File Storage Decisions
The original Table Manager stored all tables in a single file, end-to-end. A separate index file contained the offset from the beginning of the tables file where each table’s data began. With XtbML, each table is stored in a separate file named tNNN.xml where the NNN is the table number and can include any number of digits. The use of a separate file for each table led to some changes.

First, all tables in a database must be contained somehow, so now they are contained in a dedicated folder. The Table Manager now needs to be given the name of a folder to work with, rather than the name of a file.

Second, while tables can now be easily read from the database given the table number (because it is part of the file name), an index is still needed to find the table number corresponding to the table you want. An index is maintained as a separate file containing the list of tables along with several kinds of categorical information. A user interface facilitates screening the full list of tables to find the one you want.

This structure is less than fully controlled, because the user can manually add tables to the folder without updating the index. The index is only updated when the user requests the Table Manager to do so, or when the Table Manager itself makes changes to the tables in the folder. This could be improved in the future by having the Table Manager update or verify the index periodically in a background thread, but that degree of sophistication is not yet implemented.

XtbML Usage Rules
XtbML is a very flexible file format. However, in order to facilitate a software interface to look up values in such tables from within actuarial software, some rules regarding the use of XtbML needed to be developed and enforced.

One rule is that the sub-tables within a table must not have overlapping axes. If the axes overlap, then a request for a table value given points on the table axes might refer to more than one table value, making it logically impossible to return a unique value. This means that tables that vary by non-numerical criteria such as sex or occupation class need to be encoded as a set of separate tables rather than one large table. While XtbML would allow many such sub-tables to be stored in a single table, it provided no standardized data key that software could use to select the appropriate sub-table.

Another rule defines the way that “keywords” stored with each table are to be handled. These keywords are used both to aid the user in screening tables and to let the software know what kind of sub-tables to expect and how to interpret them.

XtbML allows any number of keywords in any order. However, to facilitate use of these files in software, it has been agreed that the first three keywords will be as follows:

Keyword 1: Table layout. This keyword defines the sub-tables to expect, and whether they have one or two axes. An “Aggregate” table is a single column of numbers. A “Select” table contains two sub-tables, one being rectangular with select values and one being a single column of ultimate values. A “Continuance” table has multiple rectangular sub-tables, with the increment between duration values being different in each one (e.g., “weeks,” “months,” “years”). A “Generational” table has one or more rectangular sub-tables, with the range and increment between years of birth being different in each one.

Keyword 2: Usage. This keyword is simply for user screening purposes. ACORD has defined a very long list of “usage” keywords, including “Annuitant mortality,” “Disabled lives mortality,” “Generational mortality,” “Population mortality,” “Premium persistency,” “Projection scale,” “Remarriage,” “Selection factors,” “Claim termination,” and so on.

Keyword 3: Nation. This keyword names the nation where the table originated, and is again simply for user screening purposes.Now, it might be assumed that each of these keywords could have been set up as a specially named field in the definition of XtbML. Unfortunately, when XtbML was being created, there was not a full recognition of the needs of software that needs to retrieve values from tables. Therefore there is no dedicated field in XtbML for table layout. In addition, the “Nation” field exists, but it exists both at the table level and at the sub-table level, so it is not clear which of potentially several “Nation” fields should be the governing one. To settle the question, the third keyword has been dedicated for that purpose.

Software Interface Changes
Two kinds of changes were made in the software interface. These were required due to 1) the need to support tabulated values at axis increments other than 1, and 2) the need to support more table layouts.

To support tabulated values at axis increments other than 1, a means was needed to know whether a requested value was present in the table. To facilitate this, an argument was added to each method that requests a table value. The extra argument is a true/false variable that is modified by the method to indicate whether the requested value was found in the table. In the Excel add-in, the Excel formula may request a select table with arguments of age and duration. When calling the .NET library, the legacy Excel add-in function tblValSel(tableNum, age, duration) calls the .NET assembly method tblValSel(found, age, duration) on the table with the selected table number. If the requested value is not found, the .NET assembly method returns zero and sets the value of “found” to false. When “found” is false, the Excel add-in function transforms the returned value of zero into an error code displayed as #VALUE.

To support table layouts other than a single column or select and ultimate, new Excel methods were defined.

  • To retrieve values from continuance tables, the VBA function tlbValCont(tableNum, age, duration, axisName) was created. The final parameter for “axisName” specifies whether the duration parameter is in weeks, months or years. An analogous method was added to the .NET assembly.
  • To retrieve values from generational tables, the VBA function tblValGen(tableNum, age, birthYear) was created. An analogous function was added to the .NET assembly.

Ongoing Development Issues
The need for flexibility in the software interface to actuarial tables continues, and goes beyond what was anticipated by XtbML. Consider the case of disability incidence rates that vary by age, sex, occupation class and elimination period. XtbML may require a separate table for each sex, occupation class and elimination period, because those three dimensions are not easily characterized by a minimum value, maximum value and equal increments. There is no set of sub-table keys in XtbML to index sub-tables by sex or elimination period. So the 1985 CIDA disability incidence rates table may turn into a very large number of individual tables in XtbML.

A method for encoding just about any tabular values with any lookup keys is readily available. A standard database “table” structure can be used. Each record in the table contains one field for each lookup key (e.g., age, sex, occupation class, elimination period) and one field for the tabulated value. There are many readily available software tools for handling tables in that form and retrieving values from them.

This flexibility presents two issues, however. First, any calling program would need to know what lookup keys were required for each table so that it could supply them and construct an appropriate query. Second, performance could be compromised due to the need to construct and evaluate an SQL query every time a table value is needed. Note that the XtbML/Table Manager approach stores table values in arrays and does fast integer math, not SQL query lookup, to determine the array index from which the requested value can be retrieved.

Closing Thoughts
The use of values from standard tables is one of the staples of actuarial work. The Technology Section has been involved in facilitating the computerized use of such tables for well over a decade, and the Society of Actuaries is now maintaining the extensive database of tables that has been assembled. The ongoing development of the database and software to enable its use is surely of ongoing interest to the section and its membership. Continuing involvement of the Technology Section in this area is to be encouraged.

Stephen J. Strommen, FSA, CERA, MAAA, is owner of Blufftop LLC. He can be reached at stevestrommen@blufftop.com.