I feared that the problems would become harder and harder, and after solving problem 88 I feared Problem 89 of Project Euler. Luckily for me I was wrong, problem 89 is not that difficult. The problem reads
The rules for writing Roman numerals allow for many ways of writing each number (see About Roman Numerals…). However, there is always a “best” way of writing a particular number.
For example, the following represent all of the legitimate ways of writing the number sixteen:
IIIIIIIIIIIIIIII
VIIIIIIIIIII
VVIIIIII
XIIIIII
VVVI
XVIThe last example being considered the most efficient, as it uses the least number of numerals.
The 11K text file, roman.txt contains one thousand numbers written in valid, but not necessarily minimal, Roman numerals; that is, they are arranged in descending units and obey the subtractive pair rule (see About Roman Numerals… for the definitive rules for this problem).
Find the number of characters saved by writing each of these in their minimal form.
Note: You can assume that all the Roman numerals in the file contain no more than four consecutive identical units.
My first thoughts regarding this problem was to make a converter for converting romans to arabic numbers and then a converter the other way as well. However after analyzing the problem a bit more I realized that the number of replacements that we can make is limited since we know that there are no more than 4 consecutive digits and we know that they are ordered unless they are already using the subtractive rule.
What we need to look for is where we can apply the subtractive rule our self and thereby shorten the numeral representation even further. There are three rules according to the explaining text.
- We can use I in front of V and X, meaning that we can convert IIII into IV and VIIII into IX
- We can use X in front of L and C, meaning that we can convert XXXX into XL and VXXX into XC
- We can use C in front of D and M, meaning that we can convert CCCC into CD and DCCCC into CM
That gives us a grand total of 6 substitution patterns to look for which all needs to be substituted with a 2 character replacement. So for counting purposes we don’t even need to make the correct substitutions, we just need to substitute all 6 patterns with a 2 character replacement and thus get the correct minimal string length.
What better method to do this than use one of my favorite tools, regular expressions? I love working with it, as it is a really powerful tool. But they are difficult to write correctly I think. However, for this problem they are just brilliant and this is why I chose to use them. For the .NET platform there is a guide for using regular expressions at MSDN library. If you want a more thorough walk through of Regular Expressions I can recommend Mastering Regular Expressions by Jeffrey Friedl.
Our pattern today is very simple though. So lets get right to the C# code for the problem
string roman = File.ReadAllText(filename); string pattern = "DCCCC|LXXXX|VIIII|CCCC|XXXX|IIII"; string replacement = "kk"; int result = roman.Length - Regex.Replace(roman, pattern, replacement).Length;
There you go, four lines of code and the problem is solved. I believe I could compress it to one line, but that would be ugly. This is assuming that the filename is parsed as a parameter.
Running this code gives us the following result
743 characters shorter Solution took 4,8019 ms
So it doesn’t even take that long to do the substitutions.
Wrapping up
Talking about romans always makes me think of Life Of Brian, and especially when we are talking substitutions the Pontius Pilate has a slight speech impedient resulting in sentences like “I’ve had enough of this wowdy webel sniggewing behaviour. Silence! Call yourselves Pwaetowian guards? You’re not – Seize him! Seize him! Blow your noses and seize him! “. An absolute must see movie in my opnion.
Anyway, back to the problem which was solved rather easily. You can find the source code here.
The beautiful image of roman numerals on a building was taken by Craig Hamnett who kindly shared it under the creative commons license which allows me to reuse it.
That was all I had to say about the solution for this little problem.


Written by Kristian on 24 March 2012
Topics: Project Euler