This is part 4 of an ongoing series of posts about my units of measure library in Scala. You’ll want to read part 1 before diving in here: part 2 and part 3 are more under-the-hood and, to be honest, should probably have been after this post in the series. That’s the problem with posting these as I write them.
From reading thus far, you might think this series of posts is in order to promote the use of the unitsOfMeasure library. Nothing could be further from the truth: it’s underdeveloped, and the usage examples I’ve provided are bad Scala. That’s not the point: the point is where trains of thought prompted by the writing led me. It’s underdeveloped because it’s not a real library – it’s an academic example, implementing enough to put together some cogent examples and make my arguments. I’ll cover why it’s bad Scala as we go.
As may be clear from the title of this blog, the readability of software is something that particularly concerns me. Whilst code is very rarely actually “write once, read many”, it’s a truism that code is read more than it’s written, so the returns on making code easier to read are greater than the returns on making code easier to write. This is one of the reasons why I’m using a deliberately verbose syntax for Minstrel.
One of the things I really like about the units of measure library is how readable the resulting code is. For example, let’s take a look at a snippet of code using the library:
val length = 50 metres
val width = 200 metres
val tinyFarm = length * width
print(tinyFarm in acres)
Look how readable that is! It’s almost natural language, as if units of measure are a first class concept in the language. It’s unambiguous what’s intended by the code, and the amount of “codey” syntax is minimal. Scala’s almost uniquely well-suited to writing libraries which function as language extensions, or support building full-blown DSLs.
The thing is, there are two types of readability. There’s what it’s trying to achieve, and how it achieves that. For a really trivial example:
print(list.size())
What this line of code is trying to achieve is to print the size of a shopping list. It’s achieving that by invoking a method called size() on an object instance called list, and passing the return value of that method as an argument to a function called print. If you want to know how the size() method does what it does, you know to look on the class declaration for list.
This code is easy to read on both the initial “what is this supposed to do” and “how does it do it?” levels.
Let’s go back, now, to calculating the acreage of my tiny farm. We’ll start at the very end: it’s a very good place to start.
print(tinyFarm in acres)
What does this do? It prints the area of tinyFarm, in acres. It does this by passing the value of the expression tinyFarm in acres to the function print.
How does it evaluate tinyFarm in acres? It looks like in is an operator.
For example, in Python, in is a keyword with multiple meanings – in the context of if value in list, in is a binary operator which returns a boolean; in the context of for value in iterator, in assigns values from the iterator to value in turn. This functionality is baked into the language at the parser level.
But this is functionality we’ve brought in with a library. We’re clearly not recognising this operator at the parser level. So how are we implementing this? Well, Scala supports infix notation for method invocation on methods with one parameter. So, the following two lines of code are equivalent:
val words = "Hello world!".split(" ")
val words = "Hello world!" split " "
Therefore, the following two lines of code are equivalent:
print(tinyFarm in acres)
print(tinyFarm.in(acres))
So all we’re doing here is invoking the in method on tinyFarm, passing in the value acres as an argument. The latter syntax isn’t as nice as the former, but it’s not that much worse. Extra punctuation definitely has a readability cost – once parentheses start getting nested it’s much harder for the eye to follow – but the intent of tinyFarm.in(acres) is just as clear as tinyFarm in acres.
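To make the equivalence concrete, here is a minimal sketch of an in method that can be called either way. Area and AreaUnit are hypothetical stand-ins I've made up for illustration, not the library's real types; the only thing that matters is that in is an ordinary method with one parameter.

```scala
object InfixSketch {
  // Hypothetical unit type: a name plus the number of square metres in one unit.
  final case class AreaUnit(name: String, squareMetres: Double)

  // Hypothetical quantity type, stored internally in square metres.
  final case class Area(squareMetres: Double) {
    // An ordinary one-parameter method; nothing marks it as an operator.
    def in(unit: AreaUnit): Double = squareMetres / unit.squareMetres
  }

  val acres = AreaUnit("acre", 4046.8564224)
  val tinyFarm = Area(50.0 * 200.0)

  val infix = tinyFarm in acres    // infix notation
  val dotted = tinyFarm.in(acres)  // the same call with dot and parentheses
}
```

Both values come out at roughly 2.47, because both lines compile to the same method call.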
So, if the latter syntax isn’t as nice as the former, then why would we ever use it? That’s a good question, but one for another post.
Moving on – we now know what the last line actually does. What about the third line?
val tinyFarm = length * width
What does this do? It multiplies the length of the farm by its width, and assigns it to the variable tinyFarm. But length and width aren’t numbers, they’re instances of Quantity, which is a custom class. How does the program know how to multiply them? Let’s refer back to our reasoning about the last line.
If the following are equivalent:
val words = string split separator
val words = string.split(separator)
Then the following should also be equivalent:
val tinyFarm = length * width
val tinyFarm = length.*(width)
That’s exactly what Scala does. Arithmetic operators are actually methods. Scala allows a number of different ways to construct a method name, and one of them is from operator characters, such as +, -, / and so on (the rules for mixing and matching operator characters and alphanumerics are a bit more complex). So, if we want to find out how it’s multiplying two quantities together, we know where to look: it’s a method on the quantity, just like in is.
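As a rough illustration, a method named * can be declared like any other method. The Quantity below is a simplified, hypothetical version (a value and a unit label) rather than the library's actual class, which I'd expect to track units more carefully.

```scala
object OperatorSketch {
  // Hypothetical, simplified Quantity: a number plus a unit label.
  final case class Quantity(value: Double, unit: String) {
    // `*` is just a method name built from an operator character.
    def *(that: Quantity): Quantity =
      Quantity(value * that.value, s"${unit}*${that.unit}")
  }

  val length = Quantity(50, "m")
  val width = Quantity(200, "m")

  val viaOperator = length * width   // infix notation
  val viaMethod = length.*(width)    // the same call with dot and parentheses
}
```

Here the method only multiplies the values and joins the unit labels, which is enough to show that the two spellings invoke the same code.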
Being able to redefine operators like this is a double-edged sword. It’s very powerful, but it’s also prone to misuse and (in Scala, at least) comes with a whole bundle of complications about things like reserved operators, precedences, and conventions of use. There’s room for a great big debate on how best to handle operator overloading, but that’s one for another post.
We still have to look at the first two lines of the program, though, and this is where things unfortunately get a little bit more complicated.
val length = 50 metres
What does this do? It assigns 50 metres to length. How do 50 and metres interact, though? Well, here’s the first pain in the ass – given the following code:
val length = 50 metres
print(length in metres)
What metres refers to in the first line is not what metres refers to in the second line. On the second line, it’s a value of type UnitOfMeasure. On the first line, however…
Scala allows methods with no parameters to be invoked in a style similar to infix notation (called suffix notation, or postfix notation in Scala’s own documentation), by omitting the dot and parentheses. So the following are equivalent:
val length = 50 metres
val length = 50.metres()
So where metres in the second line refers to a value, in the first line it refers to a method on 50.
At this point, I should really point out that this is considered bad Scala. I’ll leave the arguments to the Scala style guide for now, and get into more depth in the follow up post on punctuation. I’m sticking with it in the example because I don’t really care about Scala here – I care about syntax, and that’s pretty close to an idealised syntax for this problem.
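For anyone who wants to try the notation, here is a small sketch using only the standard library. One practical detail: Scala 2.13 rejects this syntax unless the postfixOps language feature is enabled, which is done here with an import. The shopping list contents are made up for the example.

```scala
import scala.language.postfixOps

object PostfixSketch {
  val shoppingList = List("eggs", "milk", "bread")

  // Postfix (suffix) notation: no dot, no parentheses.
  val postfix = shoppingList size
  // The conventional spelling of the same call.
  val dotted = shoppingList.size
}
```

The next line after a postfix call deliberately starts with val: if it started with an expression, the parser could read it as an argument to the method, which is one of the reasons the style is discouraged.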
50 is of type Int, though, and Int doesn’t define the metres method – obviously, that has to come in with the library. So how can we be invoking a method on an Int that isn’t defined on Int? I guess we’re defining it somewhere else, and sellotaping it on to Int as and when the compiler decides we need to.
That’s pretty much what happens. Scala has a feature called implicit conversions. When the compiler encounters a type error, it can apply these implicit conversions in an attempt to convert the types to other types which don’t yield a type error. So, in the above example, we provide an implicit conversion from Double to LengthBuilder, and the compiler converts 50 from Int to Double (as a numeric promotion – a different mechanism to implicit conversions, only required because Scala doesn’t use a sensible default number representation) and then from Double to LengthBuilder (initialised with the value of the double). LengthBuilder then has the metres method.
So, the following are equivalent:
val length = 50 metres // what you type
val length = new LengthBuilder(50).metres() // what the compiler converts it into
So how does the compiler know that we want it to convert Double to LengthBuilder? You can define methods as implicit, and the compiler will try any and all imported implicit conversions. When we imported Length._, we didn’t just import the units, we also imported the implicit method. It’s more accurate to say the implicit method is applied to the value:
val length = 50 metres // what you type
val length = Length.double2LengthBuilder(50).metres() // what the compiler converts it to
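Putting those pieces together, a cut-down sketch of the mechanism might look like the following. The names LengthBuilder, Length and double2LengthBuilder come from the discussion above, but the class bodies are my own guesses for illustration and Quantity is reduced to a value and a label. I've assumed metres is declared without parentheses, and I've used 50.0 rather than 50 so that the sketch doesn't depend on the Int to Double step.

```scala
import scala.language.implicitConversions
import scala.language.postfixOps

object ImplicitSketch {
  // Hypothetical, simplified Quantity.
  final case class Quantity(value: Double, unit: String)

  // Hypothetical builder: one method per unit (only metres shown).
  final class LengthBuilder(value: Double) {
    def metres: Quantity = Quantity(value, "m")
  }

  object Length {
    // The implicit conversion; it comes into scope with `import Length._`.
    implicit def double2LengthBuilder(value: Double): LengthBuilder =
      new LengthBuilder(value)
  }

  import Length._

  // What you type: Double has no `metres`, so the compiler looks for a conversion.
  val sugared = 50.0 metres
  // Roughly what the compiler turns it into.
  val explicit = Length.double2LengthBuilder(50.0).metres
}
```

Removing the import Length._ line should make the sugared line fail to compile, which is a quick way to see that the conversion really is doing the work.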
So we need to introduce a new class (LengthBuilder), implement a method on it for each unit, implement an implicit conversion to create instances as required, and ensure that’s imported properly into each scope we want to declare quantities in. And what does it get us? It means we can do
val length = 50 metres
instead of
val length = metres(50)
I’m digressing slightly when I point out the latter form is already supported by UnitOfMeasure, via the apply() method, which is syntactic sugar for applying a value to another value as if it were a function – the following are all equivalent:
val length = metres(50)
val length = metres.apply(50)
val length = metres apply 50
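A minimal sketch of the apply() sugar, for comparison with the implicit conversion route. UnitOfMeasure here is a hypothetical one-method stand-in, not the library's real class, and the call-style form is written metres(50).

```scala
object ApplySketch {
  // Hypothetical, simplified Quantity.
  final case class Quantity(value: Double, unit: String)

  // Hypothetical stand-in for the library's UnitOfMeasure.
  final class UnitOfMeasure(symbol: String) {
    // Defining apply lets an instance be called as if it were a function.
    def apply(value: Double): Quantity = Quantity(value, symbol)
  }

  val metres = new UnitOfMeasure("m")

  val sugared = metres(50)         // the compiler rewrites this to metres.apply(50)
  val explicit = metres.apply(50)  // the explicit method call
  val infix = metres apply 50      // the same call in infix notation
}
```

No builder class, implicit conversion or extra import is needed for this form, which is the trade-off the paragraph above is weighing.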
Hmm. Well, that’s been an interesting little tour of what’s going on under the hood of the syntax, and it’s raised as many questions as it’s answered. We’ve got three outstanding topics that warrant further discussion now:
1) Infix/suffix notation and punctuation
2) Operator overloading
3) Implicit conversions as a solution to the subject problem
Guess I’d best get started on those, then!