By Andrew M. Webster
The 2016 movie “Hidden Figures” highlights the 1960s careers of three brilliant mathematicians, Katherine Johnson, Dorothy Vaughan and Mary Jackson, who manually performed the calculations necessary to launch the spacecraft commanded by astronaut John Glenn into orbit. At the time, IBM’s FORTRAN (FORmula TRANslation) programming language was growing in popularity, and it dramatically changed the role mathematicians played in the space program.
As a general-purpose programming language, Python is suitable for many tasks, which is why it is often called the “Swiss army knife” of the programming world. For example, at Validate Health we gather hundreds of flat files each week containing health care claims data. We use Python to transfer and aggregate the files and to load the transferred data into SQL databases for permanent storage.
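A workflow of this kind can be sketched with nothing but Python's standard library. The example below is only an illustration, not Validate Health's actual pipeline: the function name `load_claims`, the `claims` table and the column names `claim_id`, `member_id` and `paid_amount` are all assumptions made for the sketch, and SQLite stands in for whatever SQL database is really used.

```python
import csv
import sqlite3
from pathlib import Path


def load_claims(folder, conn):
    """Append every CSV file in `folder` to a `claims` table; return rows loaded.

    The table layout and column names are hypothetical, chosen for illustration.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        "claim_id TEXT, member_id TEXT, paid_amount REAL, source_file TEXT)"
    )
    loaded = 0
    # Process files in a stable order so repeated runs behave the same way.
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            rows = [
                (r["claim_id"], r["member_id"], float(r["paid_amount"]), path.name)
                for r in csv.DictReader(handle)
            ]
        # Parameterized inserts keep the file contents out of the SQL text.
        conn.executemany("INSERT INTO claims VALUES (?, ?, ?, ?)", rows)
        loaded += len(rows)
    conn.commit()
    return loaded
```

Calling `load_claims("incoming", sqlite3.connect("claims.db"))` would load every CSV file in a folder named `incoming`. Both names are placeholders. Recording the source file name alongside each row makes it easier to trace a claim back to the flat file it came from.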
Python is simple to learn. The syntax is concise and consistent across libraries. Punctuation is used sparingly, and indentation rather than braces delimits code blocks, making the code highly readable. The code is interpreted at runtime rather than compiled to machine code ahead of time, and variables are dynamically typed. This makes it easier to develop and edit code but slows program execution. However, execution speed can be improved by connecting Python code to C, C++ or FORTRAN.
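The short sketch below illustrates two of these points: indentation alone marks the body of the function and the loop, and a variable name can be rebound to a value of a different type at runtime. The function `total_paid` and the variable names are made up for this example.

```python
def total_paid(claims):
    """Sum the paid amounts; indentation alone marks the loop body."""
    total = 0
    for amount in claims:
        total += amount
    return total


value = 42                          # the name is bound to an int
first_type = type(value).__name__
value = "forty-two"                 # the same name can later hold a str
second_type = type(value).__name__
```

No type declarations are needed anywhere, which keeps the code short. The trade-off is that type checks happen while the program runs, which is part of the reason execution is slower than in compiled languages.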
Python is interoperable with other programming languages. It is a popular language for writing and calling application programming interfaces (APIs). Some have termed Python “the programming glue” since it connects so many disparate systems. Another important benefit for actuaries who learn Python is that they can communicate directly with the IT department and take part in peer review of code.
Python is growing in popularity among data scientists. KDNuggets is a website that tracks the top analytics, data science and machine learning tools. In a 2018 KDNuggets poll of 2,300 data scientists, 66 percent reported using Python whereas 49 percent reported using R.1 Kaggle is an online platform where data scientists can participate in data science competitions. In a 2016 analysis published in Kaggle Scripts2, Python and R ranked high in usage and ratings. On a different front, the premier global “R in insurance” conference was renamed the “Insurance Data Science” conference in 2018 after a run of five successful years. It now incorporates Python sessions as well.3 Finally, the clearest indicator of interest in Python is the fact that the tech giants are investing heavily in analytic tools built in Python, such as Google’s open-sourced deep learning library TensorFlow, released in 2015. By mid-2018, three years later, the code base of TensorFlow was 48 percent C++ and 42 percent Python.4
Ultimately, the software tools and programming languages that an actuary uses depend on familiarity with what was learned in school, compatibility with legacy software and the availability of other actuaries knowledgeable in the same language to peer review and maintain the code. If actuaries want to explore non-traditional opportunities as data scientists, having Python skills certainly provides a competitive advantage over other candidates.
Andrew M. Webster, ASA, MAAA, M.S., is the founder of Validate Health, a member company of the HealthTech incubator MATTER in Chicago. He can be contacted at firstname.lastname@example.org.
1 Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis. https://www.kdnuggets.com/2018/05/poll-tools-analytics-data-science-machine-learning-results.html
2 Python vs R as seen in Kaggle Scripts. https://www.kaggle.com/mlearn/python-vs-r-as-seen-in-kaggle-scripts
3 Insurance Data Science Conference. https://magesblog.com/post/welcome-to-insurance-data-science/
4 TensorFlow. https://github.com/tensorflow/tensorflow
This article has been slightly modified from an article originally appearing on ACTEX Learning’s blog at: https://blog.actexmadriver.com/2018/06/13/why-actuaries-should-start-paying-attention-to-python/