22 April 2009

UML in the Functional Programming world

UML is a well-known standard notation in the context of object-oriented design and programming. It lets us define classes and their relationships. It makes it possible to display use cases and the activities performed for them. The way objects interact with each other can be visualized in more than one way.

The problem is: it's all about objects. But we are in the age of a new rising star: functional programming. Here we deal only with functions. There is an input, there is an output, no state. Just a rule to map the input to the output. No behavior. Just a chain of input-output mappings. It seems that UML doesn't fit very well. Some languages treat functions as objects, but in the end that doesn't help. Functions are a mathematical construct, and they are best defined in the way we learned in our mathematics education.
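To make the "chain of input-output mappings" concrete, here is a minimal sketch in Python. The helper `compose` and the three step functions are hypothetical names invented for this illustration; none of them holds any state, each one only maps its input to an output.

```python
from functools import reduce


def compose(*steps):
    """Chain functions left to right: the output of one step is the input of the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical example steps. Each is a pure mapping: same input, same output, no state.
def parse_amounts(text):
    return [float(part) for part in text.split(",")]


def keep_positive(amounts):
    return [a for a in amounts if a > 0]


def total(amounts):
    return sum(amounts)


# The whole "program" is just the chain of the three mappings.
positive_total = compose(parse_amounts, keep_positive, total)

print(positive_total("10.5,-3,4.5"))  # prints 15.0
```

There is no class and no relationship here that a class diagram could show, only the rule that maps a string to a number.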

Let's take a step back and ask: what is the goal when we use UML? It is used during the design phase, so one goal might be to communicate architectures. Another goal might be to write down design elements of the software to be built, on a level which allows us to use abstractions and to suppress unnecessary details. Of course, abstraction and communication are not only about data structures, but also about what is done with the data, and by which entities. And in the end this is the nature of using UML: finding out and documenting what is needed, what happens, and what comes out.

Oops. That reminds us of functions. There must be something wrong. So, again, one step back: how should we document software? Actually, the main purpose of any software (apart from software that controls some device, whose nature is just to be an extension of mechanics) is to provide information: information aggregated from other information, as well as calculated, filtered, selected and transformed information. The nature of information is to be data which has an intention: a question to answer, a goal to achieve. If we describe the data and its intention, i.e. the information, together with its origins and lifecycle, we have in fact described what a system is. It doesn't matter whether the technical implementation is done by functions or by interacting objects. That's a matter of the detailed technical description, which is important only for a small set of people (and it has to be decided for every project whether the code is the documentation or whether detailed UML diagrams or equations are needed). Or in other words: it is a matter of the level of detail.
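As a small illustration of this point, the following Python sketch produces the same information (revenue per customer) once with an object that accumulates state and once with a pure function. All names here (`Order`, `RevenueReport`, `revenue_by_customer`, the sample data) are made up for the example and are not taken from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """The data: who ordered, and for how much."""
    customer: str
    amount: float


# Made-up sample data for the illustration.
ORDERS = [Order("alice", 30.0), Order("bob", 12.5), Order("alice", 7.5)]


class RevenueReport:
    """Object-oriented variant: an object that collects orders and keeps state."""

    def __init__(self):
        self._totals = {}

    def add(self, order):
        self._totals[order.customer] = self._totals.get(order.customer, 0.0) + order.amount

    def totals(self):
        return dict(self._totals)


def revenue_by_customer(orders):
    """Functional variant: a pure mapping from a list of orders to totals."""
    customers = {order.customer for order in orders}
    return {c: sum(o.amount for o in orders if o.customer == c) for c in customers}


report = RevenueReport()
for order in ORDERS:
    report.add(order)

# Both variants answer the same question with the same information.
print(report.totals() == revenue_by_customer(ORDERS))  # prints True
```

A description of the data (orders), the intention (revenue per customer) and the lifecycle (orders come in, totals come out) covers both variants; only the detailed technical description differs.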

Let us note two additional observations:
1) The data coming from or going into the real world can be modeled better by objects than by functions.
2) The context, as part of the system environment, is the source of intention (what do I want to achieve, why am I doing something).

Bringing all this together, we can see that UML helps to document the data, its intention and its lifecycle via use case diagrams (maybe annotated with activity diagrams, for context and intention), collaboration diagrams (for lifecycles) and class diagrams (for the pure data). In this sense, UML should help to document software independently of the programming paradigm used, with a sufficient level of detail for most cases. And if more detail is needed, UML is appropriate for OOP, and mathematical equations for FP (and the pi-calculus for heavily concurrent systems).