# Project Euler 22: What is the total of all the name scores in the file of first names?

Project Euler has a large variety in the sort of problems they pose. Problem 22, as we will solve now, has nothing to do with mathematics and everything to do with computer science and sorting algorithms. The problem reads

Using names.txt (right click and ‘Save Link/Target As…’), a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 x 53 = 49714.

What is the total of all the name scores in the file?

The way I want to solve this problem, is to split it into three parts

1. Read the input file and turn the data into a manageable data structure
2. Sort the data
3. Sum up and provide the answer

So this post will be split into three parts with most focus on the second part, as that is where the meat is. I will swap the sections a bit, such that I treat section 1 and 3 first to set up the framework. At last I will shortly treat two different sorting algorithms plus the .NET API way for sorting.

Reading a file for its content is a simple procedure which we have done before, but in order to get the data into a usable format, we need to make a bit of effort about the data structure in the file.

These are the notes I have made about the data structure in the file

1. There are no line breaks
2. All names are encapsulated in quotation marks – these needs to be stripped
3. All names are separated by a comma.
4. All names seems to be completely in upper case

With these facts/assumptions, it is fairly easy to write a small method that provides an output which is  a string array with just the names in all upper case. In C# this method can look as

```private string[] readInput(string filename) {
r.Close();

string[] names = line.Split(',');

for (int i = 0; i &lt; names.Length; i++) {
names[i] = names[i].Trim('"');
}

return names;
}
```

I read all of the data, split it around the comma and trim off quotation marks.

## Summing Up the values

In this section there are two pieces of code, one piece for is the main method which calls the input reader, and sums up all the names. The other piece of code is the method for finding the value of the individual name. And lets start with the main method which is pretty straight forward

```string filename = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) + "\\names.txt";
int sumResult = 0;

//Insert some sorting here

for (int i = 0; i &lt; names.Length; i++) {
sumResult += (i + 1) * sum(names[i]);
}
```

It is a really simple piece of code that does as described in the beginning of the post; read the input, sorts the names and sums them up. I have not inserted the sorting part yet, as the implementation is a bit different depending on the method, but we will get back to that later.

The summing function is really simple. I converts each character to the ASCII value, which for A is 65, B is 66 and so on. So by subtracting 64, you get the value asked for in the problem description. It can be implemented as

```private int sum(string name) {
int result = 0;
for (int i = 0; i &lt; name.Length; i++) {
result += Convert.ToInt32(name[i]) - 64;
}
return result;
}
```

## Sorting the data

With the framework in place we are now ready to deal with the matter of the subject. Before even starting this section I would like to point to Wikipedia’s section on sorting algorithms, since I find the description rather extensive and well written. The entry covers stability, memory usage and running time comparisons.

As mentioned earlier I will pick three strategies for the sorting. The first approach is to use the built in sorting function called Array.Sort(). However, one of the names in the list starts with AA, which at least on my machine is treated as the Danish/Norwegian letter Å, which comes in the end of the alphabet, and thus messes up the sorting. So we have to implement our own comparison function for this approach to work.  I will give you the source code I used to implement the comparison for our own sorting algorithms, but it is rather IComparer interface, and you can check the source code to see the specific details.

In C# I have implemented it as

```private bool CompareStrings(string str1, string str2) {
int i = 0;
while (i &lt; str1.Length &amp;&amp; i &lt; str2.Length) {
if (str1[i] == str2[i]) {
i++;
} else {
return str1[i] &lt; str2[i];
}
}
return str1.Length &lt;= str2.Length;
}
```

The main method is implemented as

```Array.Sort(names, new mySorter());
```

and the execution gives

```The sum of all names are: 871198282
Solution took 8 ms
```

so it is a rather fast implementation.

Second I will implement a Bubble sort which is easy to understand and one of the first sorting algorithms I learned about. However, it is also very slow.

The last algorithm I will implement is a Merge Sort which is much faster, but uses more memory than the buble sort. One common property for the two algorithms is that they are stable, meaning that they will work even if the same key appears more times. I expect the .NET implementation to be a merge sort as well, since it is very used as a standard implementation in many languages

### Buble Sort

The Bubble sort is an algorithm which steps through the list swapping items two items at a time until they are in the correct relative order. The algorithm continues to run through the set until it is completely sorted. It has a worst case running time of O(n2) which means it scales poorly.

My implementation looks like

```private string[] BubleSort(string[] strings) {
bool changed = true;

while (changed) {
changed = false;
for (int i = 1; i &lt; strings.Length; i++) {
if (!CompareStrings(strings[i - 1], strings[i])) {
string temp = strings[i - 1];
strings[i - 1] = strings[i];
strings[i] = temp;
changed = true;
}
}
}
return strings;
}
```

and the implementation in the main method is

```names = BubleSort(names);
```

The execution using buble sort gives us

```The sum of all names are: 871198282
Solution took 699 ms
```

So a factor 90 slower than the .NET implementation. Lets see if we can improve it with a Merge Sort.

### Merge Sort

Merge Sort is a stable sorting algorithm which works has a good worst case running time of O(n log n).

Conceptually, a merge sort works as follows (quote from wikipedia)

1. If the list is of length 0 or 1, then it is already sorted. Otherwise:
2. Divide the unsorted list into two sublists of about half the size.
3. Sort each sublist recursively by re-applying merge sort.
4. Merge the two sublists back into one sorted list.

My implementation in C# looks as

```private string[] MergeSort(string[] strings) {
if (strings.Length &lt;= 1) {
return strings;
}

int firstPart = strings.Length / 2;
string[] strings1 = new string[firstPart];
string[] strings2 = new string[strings.Length - firstPart];

string[] sorted = new string[strings.Length];

// Split the array into two;
for (int i = 0; i &lt; firstPart; i++) {
strings1[i] = strings[i];
}

for (int i = firstPart; i &lt; strings.Length; i++) {
strings2[i - firstPart] = strings[i];
}

strings1 = MergeSort(strings1);
strings2 = MergeSort(strings2);

int j = 0;
int k = 0;

for (int i = 0; i &lt; sorted.Length; i++) {
if(j == strings1.Length){
sorted[i] = strings2[k];
k++;
} else if(k == strings2.Length){
sorted[i] = strings1[j];
j++;
}else if(CompareStrings(strings1[j], strings2[k])){
sorted[i] = strings1[j];
j++;
} else {
sorted[i] = strings2[k];
k++;
}
}

return sorted;
}
```

A bit longer piece of code compared to the bubble sort, but it should be easy enough to follow from the conceptual description. The execution looks like

```The sum of all names are: 871198282
Solution took 8 ms
```

So we are able to create a sorting algorithm as fast as the implemented. However, we might as well just use the supplied functionality.

## Closing remarks

This blog post treated the solution of Problem 22, of Project Euler, which is all about sorting a list of strings. I gave you three different methods to sort the list, and found that the fastest own implementation I could provide is as fast as the supplied .NET implementation. The only advantage to implement it yourself beside the educational aspect, is that you can customize it later on.

As usual the source code is available for you to study. And remarks, comments and questions are encouraged. ### Posted by Kristian Chris

var names = File.ReadAllText(“C:\\names.txt”).Split(‘,’).Select(x => x.Trim(‘”‘).Trim()).OrderBy(x => x);
var result = names.Select((x, i) => x.ToCharArray().Sum(y => (int)y – 64) * (i + 1)).Sum(); Kristian

Hi Chris

Thanks for your comment with another solution. It is a lot shorter using linq as you do.

Maybe it is only on my system it poses a problem due to my language settings. However, when I run the code there is one small problems you need to address. One of the names contain AA, which is interpreted as the Norwegian/Danish letter Å, and thus the solution ends up being 2 off.

I am not familiar enough with the linq syntax to be able to remedy this. But I am pretty sure you can write a a line of code to tell it how to sort things.

/Kristian Kristian

Hi again Chris

Once again thank you for submitting your approach. I have been meaning to look into Linq for a while. I played around with your solution, and while I do think that it is more readable in some ways, it is certainly slower. Not so much that it matters. I was able to execute it in 38ms vs. 8ms on the fastest of the solutions I presented. I does show that the more abstract code (as I see Linq) also performs a bit slower since it uses lists instead of arrays and so on.

That being said, I do really appreciate your solution, and I am inspired by it, so maybe one day you will see this type of solutions from me as well.

/Kristian SuprDewd

I’m pretty sure LINQ is neither using Lists nor arrays.
Underneath, they’re using some kind of a state machine and somehow pipe elements through to the foreach loop.
But, of course, it isn’t as fast as using arrays.

I love LINQ and all it’s components. It *usually* makes code a lot more elegant and shorter. Kristian

After looking a bit more, I think you might very well be right. It would be fun to learn a bit more about. Does anyone have some good sources both for an introduction to Linq and what happens behind the scenes when using linq?

/Kristian SuprDewd

Microsoft has a collection of 101 LINQ examples here: http://msdn.microsoft.com/en-us/vcsharp/aa336746
They also have a set of introduction articles:
http://msdn.microsoft.com/en-us/library/bb397933.aspx

I still think the best way to learn is to try things out, and use google if you run into issues. Jens

I was working on this problem and hit the same curious case of AARON being the last element of a sorted list, seems there is an easy workaround however, when sorting, you can choose which kind of string comparer to use, but instead of rolling your own, you can chose one of the build-in ones that uses an invariant culture: Array.Sort(StringComparer.InvariantCulture), then it can do proper alphabetical sorting. Kristian

Ah nice, I didn’t know you could do that to avoid the special case here with Danish and similar languages. Patrick

Hey, this is an incredible resource…but I am learning c++ not c#, though I am aware that I will need to know both and am capable of converting from one to another but its quite touch and go because i never formerly attempted to learn c#..anyway, my point is, is there a resource as incredible as this that explains euler in c++? Kristian

Hi Patrick

I don’t think there is any resource as great as this… (okay, that was cheap). I haven’t found anything specifically for c++, but on the other hand I haven’t really looked.

For the majority of the problems here, I think the big part is figuring out a method for solving it, after that the implementation in a specific language should be the lesser part of it. I might be wrong though, but that is the way it is for me.

Also, I think that you should make a solution for the problem, and then afterwards maybe read the blog to see if there are other ways of solving the problem. Thus you learn c++ from your initial implementation. Bjarki Ágúst

I think you should rather look for an introduction to C#, and then come back here when you’re a little more comfortable with it. In my opinion, C++ and C# aren’t really that different. The syntax is nearly the same, and the only thing that differs significantly are the libraries, but if there’s something you don’t recognize (for example, the StreamReader class), a simple Google search should be enough. Patrick

thank you, ya I have been fairly comfortable translating C# into C++ in the past and like I said I really should learn both for the job I’m going for so maybe its just time I buckle down on C#, especially since, like you said, it’s not so different and therefore relatively easy to learn once a person’s already familiar with C+…Thanks Again! Gopinath

.NET implements Quick sort… an unstable one. http://msdn.microsoft.com/en-us/library/w56d4y5z.aspx Gopinath

Here is a more detailed source on what basis the framework chooses to implement a algorithm.. .NET uses insertion,heap sort and as well as Quick sort..
http://msdn.microsoft.com/en-us/library/aw9s5t8f.aspx Stian

``` Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US"); ```
then the AA’s will be treated as you want 🙂 Kristian

I knew there had to be something like that 🙂 Raynor

Hello, i’ve been working on this problem on my own for a while, it gave me a different result, with almost the same code. I can not find why my number is ~1000 off from the right answer. Any suggestions? I can send you my code if you give me an email adress. I use the built in Array.Sort() method. (.net 4.5 VS2012 C#). My array is a list of strings. No doublecharacter issue (AA is treated as AA). Tried to check the sorting, it seems correctly sorted. My list got 5163 elements in it. Jean-Marie Hachey

Table 1
The total of all and five-name scores in the file of first names.
[Re: Project Euler – Problem 22]

http://img4.hostingpics.net/pics/890182pe22table1names.jpg

The complete list contains 5 166 names.
The five first: “MARY”, “PATRICIA”, “LINDA”, “BARBARA”, “ELIZABETH”.
Execution times measured is the average of 3 consecutives determinations.

P.S.
https://projecteuler.net/project/resources/p022_names.txt
This link in the article is not available:
http://projecteuler.net/project/names.txt

___

Sources:
1) PE22, Kristian’s algorithm
http://www.mathblog.dk/files/euler/Problem22.cs
2) Microsoft Visual C# 2010 Express
3) Microsoft Office Excel (2007)
4) Microsoft Office Word (2007) Mateus

A very simple solution

Array.Sort (names);

ulong breakp = Convert.ToUInt64(names.Length), sum = 0;

for (ulong x = 0; x < breakp; x++)
{
ulong aux = 0;
for (int y = 0; y < names [x].Length; y++)
{
aux += Convert.ToUInt64 (names [x] [y]) – 64;
}
aux *= (x + 1);
sum += aux;
}

Console.WriteLine (sum);

Don't forget to add references to System.IO; felix

I’ve always admired the cleverness of the merge sort algorithm. This is my first attempt in iterative merge sort written in C#; a large enough array can cause a stackoverflow error. I’ve omitted the compare string function

private void Mergesort(ref string[] names)
{
//use in place merge sort
//split array into a list of sorted arrays (natural split: group names that are already in order)
List<string[]> sortedArraysList = new List<string[]>();
for (int i = 0; i < names.Length; i++ )
{
List tempList = new List();
if (i+1 != names.Length)
{
//keep adding names to the temp list if the names are in order
while (names[i] == ChooseFrontString(names[i], names[i + 1]))
}
//convert temp list to an aray and add the array to the sorted array list
}
List<string[]> nuSortedList;
while (sortedArraysList.Count > 1)
{
nuSortedList = new List<string[]>(); //clear this list first
//pic 2 elements of the sorted array list and sent to sorted merge
for (int i = 0; i < sortedArraysList.Count; i+=2)
{
string[] A = sortedArraysList.ElementAt(i);
//no more array pairs to add, break out of for loop
if (i+1 == sortedArraysList.Count)
string[] B = sortedArraysList.ElementAt(i + 1);
//add sorted array to new sorted array list
}
//replace old list with a new list that has fewer, longer sorted arrays
sortedArraysList.Clear(); sortedArraysList = nuSortedList;
}
names = sortedArraysList.ElementAt(0);
}

``` private string[] SortedMerge(string[] A, string[] B) { //standard sorted merge algorithm string[] mergedArr = new string[A.Length + B.Length]; int Awalk = 0; int Bwalk = 0; int mergedwalk = 0;```

``` while(Awalk < A.Length && Bwalk < B.Length) { //string of A should be in front; add to merged array if (A[Awalk] == ChooseFrontString(A[Awalk], B[Bwalk])) { mergedArr[mergedwalk] = A[Awalk]; Awalk++; } else //string of B should be in front, add to merged array { mergedArr[mergedwalk] = B[Bwalk]; Bwalk++; } mergedwalk++; } ```

``` //when either A or B are empty, add the rest of the nonempty array into sorted array if (Awalk == A.Length) { while (Bwalk < B.Length) { mergedArr[mergedwalk++] = B[Bwalk++]; } } else { while(Awalk< A.Length) { mergedArr[mergedwalk++] = A[Awalk++]; } } return mergedArr; } ``` Aale

Only came to verify my total – yes all good

Done in c++ though — if anyone comes looking – linux mint 19

#include
#include
#include
#include
#include

namespace nsp{
void trimR(std::string &str, char delim){
std::string::size_type pos = str.find_last_not_of(delim);
str.erase(pos + 1);
}

void trimL(std::string &str, char delim){
//std::string::size_type pos = str.find_first_not_of(delim);
//str.erase(0,pos);
str.erase(0,1);
}

std::vector splitLine(std::string &strLine, const char delim){
std::vector vv;
std::stringstream ss(strLine);
std::string str;
while (std::getline(ss, str, delim)) {
vv.push_back(str);
}
return vv;
}

int processName(std::string nme){
int numbr{0};
for(auto cc:nme){
numbr += (static_cast(cc)-64);
}
return numbr;
}
int total{0};
int LineNumber{0};

void processLine(std::string lne){
std::vector name = splitLine(lne,’,’);
for(size_t ii = 0; ii < name.size(); ii += 1){
trimL(name[ii],'”‘);
trimR(name[ii],'”‘);
}
//
std::sort(std::begin(name),std::end(name),std::locale(“en_AU.utf8″));
for(auto nn:name){
total += (processName(nn) * ++LineNumber);
std::cout << nn << ” – ” << processName(nn) << std::endl;
}
std::cout << “Total ==> ” << total << std::endl;
}
} // namespace nsp

int main(){
std::string line;
std::ifstream ifs(“names.txt”);
if (ifs.is_open()) {
while(std::getline(ifs,line)){
nsp::processLine(line);
}
}else{
std::cout << “Nah not able to open”;
}
return 0;
} Filip

My solution, as it follows:
var allNames = File.ReadAllText(“p022_names.txt”).Split(‘,’).Select(s => s.Trim(‘”‘)).OrderBy(s => s);
int sum = allNames.Select((s, k) => s.Sum(c => c – 64) * ++k).Sum();
didn’t wanna work without this line put at the beginning: