Tuesday 2 September 2008

Dimensional units and programmers' productivity

A programmer's productivity is a function of personal ability and of the tools at hand. In this post I'll talk about how one of those tools, the programming language, can influence productivity.

In the latest release of the F# language it is possible to attach units of measure to numeric types. This lets the compiler detect certain programming errors at compile time, and it has no run-time cost because the units are erased during compilation. A typical application is scientific code, with units for distance, time, speed, mass, etc.
Another source of good examples for units is software for finance. Let's use currencies and exchange rates as our example. In F#:

[<Measure>] type usd
[<Measure>] type eur
[<Measure>] type jpy

I can now create some money for myself: 100.0<eur>. Exchange rates can be represented simply by numbers with the appropriate units:

let eur_usd = 1.4709<usd/eur>
let usd_jpy = 108.617<jpy/usd>

That is, if you have 1 eur, you can get 1.4709 usd. You can also define an exchange rate in terms of other rates:

let usd_eur = 1.0 / eur_usd
let eur_jpy = eur_usd * usd_jpy
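
To make the inferred units visible, here is a minimal self-contained sketch of the same definitions with explicit type annotations. The annotations and the round_trip value are my additions for illustration; F# infers the same units without them.

```fsharp
[<Measure>] type usd
[<Measure>] type eur
[<Measure>] type jpy

let eur_usd = 1.4709<usd/eur>
let usd_jpy = 108.617<jpy/usd>

// Inverting a rate inverts its unit: 1 / (usd/eur) = eur/usd.
let usd_eur : float<eur/usd> = 1.0 / eur_usd

// Multiplying rates cancels the shared unit: (usd/eur) * (jpy/usd) = jpy/eur.
let eur_jpy : float<jpy/eur> = eur_usd * usd_jpy

// A rate times its inverse is dimensionless, so it is a plain float.
let round_trip : float = eur_usd * usd_eur
```

If one of the annotations were wrong, for example float<usd/eur> on usd_eur, the compiler would reject the definition instead of silently accepting an inverted rate.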

How do you define a function to exchange money? Easy, it's just multiplication!

> 100.0<eur> * eur_usd;;
val it : float<usd> = 147.09
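
If you prefer a named function over a bare multiplication, the sketch below shows one way to write it with generic unit parameters. The name convert is hypothetical, chosen only for this example.

```fsharp
[<Measure>] type usd
[<Measure>] type eur

// Hypothetical helper: converts an amount in currency 'a into currency 'b,
// given a rate expressed in 'b per 'a.
let convert (rate : float<'b / 'a>) (amount : float<'a>) : float<'b> =
    amount * rate

let eur_usd = 1.4709<usd/eur>

// The result is a float<usd>.
let dollars = convert eur_usd 100.0<eur>

// Rejected by the compiler, because the rate expects an amount in eur:
// let wrong = convert eur_usd 100.0<usd>
```

The function works for any pair of currencies, because the compiler checks that the unit of the amount cancels against the denominator of the rate.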

But if you add, subtract, or compare values of different currencies, you get a compile-time error:

10.0<eur> + 5.0<usd>
error: The unit of measure 'usd' does not match the unit of measure 'eur'
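
The way to satisfy the compiler is to convert one operand first. A minimal sketch, reusing the rate from earlier in the post (the names total_usd and eur_is_more are mine):

```fsharp
[<Measure>] type usd
[<Measure>] type eur

let eur_usd = 1.4709<usd/eur>

// Both operands are float<usd> after the conversion, so the sum type-checks.
let total_usd = 10.0<eur> * eur_usd + 5.0<usd>

// Comparison works the same way once the units agree.
let eur_is_more = 10.0<eur> * eur_usd > 5.0<usd>
```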

The Fortress language also supports units of measurement. Hopefully units will make their way into the type systems of other programming languages soon.

Let's go back to the subject of productivity, while staying in the domain of finance. A lot of the software written in investment banks and funds is written in C++, Java, or C#. One open-source library of financial software is quantlib (versions in C++, C#, and other languages). Let's see how you model money and exchange rates in quantlib. One of the classes in quantlib C#, called ExchangeRate, provides essentially the functionality I have described above, but the implementation is radically different:

1. The code is more complex and much larger.
ExchangeRate has 5 data members and 9 methods. In addition, it is tightly coupled with 3 other classes of the same library: ExchangeRateManager, Money, and Currency. That amounts to several hundred lines of code, including the usual amount of boilerplate. For instance, Money defines methods for equality, inequality, getting hash codes, etc.

2. The code is more fragile, as several kinds of errors can occur at run time rather than at compile time. For instance, applying an exchange rate to Money of the wrong currency is a run-time error.

3. The code is also more complex for users of the library. Because some of these classes contain mutable data, users must synchronize access properly when using them in a concurrent setting.
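
To make point 2 concrete, here is a rough sketch of the run-time-checked style, written in F# for comparison. The Money and Rate records and the exchange function are hypothetical simplifications of mine and do not reproduce quantlib's actual API.

```fsharp
// Hypothetical types for illustration only; this is not quantlib's API.
type Money = { Amount : float; Currency : string }
type Rate  = { Source : string; Target : string; Value : float }

// The currency is ordinary data, so the check can only happen when the code runs.
let exchange (rate : Rate) (money : Money) : Money =
    if money.Currency <> rate.Source then
        invalidArg "money" (sprintf "expected %s, got %s" rate.Source money.Currency)
    { Amount = money.Amount * rate.Value; Currency = rate.Target }

let eur_usd = { Source = "EUR"; Target = "USD"; Value = 1.4709 }

let ok = exchange eur_usd { Amount = 100.0; Currency = "EUR" }

// This line compiles, but throws System.ArgumentException when evaluated:
// exchange eur_usd { Amount = 100.0; Currency = "USD" }
```

With units of measure, the equivalent mistake never gets past the compiler, and no currency tag has to be stored or compared at run time.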

(Note: I've looked at the C# version of quantlib, which is derived from the C++ original. I assume that things are no better in the C++ version.)

To summarize, it is possible to replace 4 closely coupled classes (a few hundred lines of code) with dimensioned floating-point numbers and simple arithmetic operations on them. In the process we gain huge reductions in code size and complexity, more static error detection, and better performance. It doesn't surprise me that banks are throwing away all their legacy C++ code and rewriting it in F#, Caml and Haskell :)
