# Project Euler 107: Determining the most efficient way to connect the network.

After the description of the disjoint set data structure (which I think you should read), it cannot come as a surprise what kind of solution we want to use for problem 107 of Project Euler. However, let as always start with the problem description:

The following undirected network consists of seven vertices and twelve edges with a total weight of 243.

The same network can be represented by the matrix below.

 A B C D E F G A – 16 12 21 – – – B 16 – – 17 20 – – C 12 – – 28 – 31 – D 21 17 28 – 18 19 23 E – 20 – 18 – – 11 F – – 31 19 – – 27 G – – – 23 11 27 –

However, it is possible to optimise the network by removing some edges and still ensure that all points on the network remain connected. The network which achieves the maximum saving is shown below. It has a weight of 93, representing a saving of 243 – 93 = 150 from the original network.

Using network.txt a text file containing a network with forty vertices, and given in matrix form, find the maximum saving which can be achieved by removing redundant edges whilst ensuring that the network remains connected.

Let me say it in this way. If you know that there already exists algorithms for this kind of problem, then it is dead easy. If you don’t then I wouldn’t even know how to start. However, I was in the lucky first category. What we are trying to find is a minimum spanning tree. Which is the smallest set of vertices in a weighted graph, such that we only have one component in the graph (all vertices are connected).

There exists two algorithms which are used to find this tree; Prim’s algorithm and Kruskal’s algorithm and to me the latter seemed like the simplest one to understand and implement, so that is what I used.

From Wikipedia we have the description

• create a forest F (a set of trees), where each vertex in the graph is a separate tree
• create a set S containing all the edges in the graph
• while S is nonempty and F is not yet spanning
• remove an edge with minimum weight from S
• if that edge connects two different trees, then add it to the forest, combining two trees into a single tree

At the termination of the algorithm, the forest has only one component and forms a minimum spanning tree of the graph.

So all we need now is to implement the algorithm and the input reading around it. Oh… and not least figure out that we should find the saving not the cost of the minimum spanning tree. That took me 10 minutes of debugging.

Using the disjoint data set class we just covered the code looks like

```//Read the input file

//create a forest F (a set of trees),
//where each vertex in the graph is a separate tree
int N = lines[0].Split(',').Length;
DisjointSet vertices = new DisjointSet(N);

//create a set S containing all the edges in the graph
//The tuple contains weight,vertex, vertex
List<Tuple<int, int, int>> edges = new List<Tuple<int, int, int>>();
int initialWeight = 0;

for(int i = 0; i < N; i++){
string&#91;&#93; edge = lines&#91;i&#93;.Split(',');

for (int j = 0; j < i; j++) {
if (edge&#91;j&#93; != "-") {
int weight = Convert.ToInt32(edge&#91;j&#93;);
initialWeight += weight;
}
}
}

//Sort edges to have the minimum weight at top
edges.Sort();
int k = 0;

//while S is nonempty and F is not yet spanning
int minSpanningTreeWeight = 0;
//while(!vertices.isSpanning()){
while (!vertices.isSpanning()) {
//remove an edge with minimum weight from S
//Since we have a sorted list we just go through the list

//if that edge connects two different trees,
//then add it to the forest,
//combining two trees into a single tree
if(vertices.Find(edges[k].Item2) != vertices.Find(edges[k].Item3)){
vertices.Union(edges[k].Item2, edges[k].Item3);
minSpanningTreeWeight += edges[k].Item1;
}
k++;
}
```

Then what remains is to subtract the weight of the spanning tree from the weight of the original graph and we have the following solution (click to see the solution)

```The saving is 259679
```

which runs in

`Solution took 2,4353 ms`

## Wrapping up

Once you figure out that there are indeed algorithms for this kind of stuff the problem is reduced to something which is a matter of implementation. Something which should be trivial, but I often find myself struggling to find the right way to code the algorithms and data structures. In this problem I had help with the Disjoint-set data structure and that made it almost trivial.

The full source code can be found here.

How did you solve this problem?

### Posted by Kristian

Scott

I’m solving it using Prim’s Minimal Spanning Tree Algorithm. It’s not working yet though (implementation issues . . . ).

Chintan

I have been using the Kruskal’s algorithm too but I’m getting 259671 as the answer.It shows the total weight to be 261832 which seems to be correct.If you could post the weights of the edges in the MST,it would be easier for me to check my error! 🙂

Lang:C++

Kristian

I could do that, but there are a pretty lot of them (513), so would you mind me mailing them to you instead? And what format would you like them in?

Chintan

Yes sure!You can mail them to me,I don’t mind them at all!
And I don’t know C# so a C/C++ code to print them or a text file containing them or even if you paste them in the body,anything would be fine!

Bilbo

Can you repost the input file network.txt. The link seems to be broken.

I solved this using python and dictionaries. But I had trouble detecting if an edge belong to two different sub trees. That’s where my time consumes.
https://gist.github.com/5636502e56fe57fc0cea.git

It took only 8 seconds.. < 9000 time. Which is apparently faster.

Thevesh Theva

I believe there is a much more efficient way to do this. =)

Prim’s algorithm can actually be run on a table, or in this case, a matrix.

I have posted my Java solution to this problem here:
https://github.com/Thevesh/ProjectEuler/blob/master/Level005/problem107.java

My code runs super fast, as follows:

```
Total weight of 261832
MST has weight of 2153
Algorithm saves 259679
Minimum value of 1 at row 37
Your code took 0.001761356 seconds to execute.

```

Cheers, and thanks for your website!

Hi, I believe we don’t need to call isSpanning method at all; since this is MST, edge count should be N – 1; simple variable holding # of edges should be sufficient.

Ed Sirett

Well I thought the beauty of C++ is the STL. So I implemented the forest as an unordered_set of ‘copses’ each of which is also and unordered_set (hash table) containing the indicies of individual ‘trees’ or ‘nodes’ in the graph. The edges were a vector of a simple class containing the two ends indicies and their distance and a flag to say if it had been selected as part of the MST. Also needed a comparison functor to sort the edge objects by distance.
Ran in about 1ms , I suspect that most of time was sorting the edges into length order, there is more scope here for a speed up, as the distances are all integers in a defined range, so a radix sort (O(n) time) could be used but that is not in the STL. Another optimisation would be to bail out of the main loop when the all the nodes are in only one set and the MST is thus complete and no further edges will be added.

// Euler Project 107
#include
#include
#include
#include

using namespace std;

static const int nBushes = 40;
static const int NoRoad = -1;

static const int adjM[nBushes][nBushes] = {
};

// Define a class for Edges
class Edge {

public:
int enda;
int endb;
int length;
bool inMST; Edge( int a, int b, int dist): enda(a), endb(b), length(dist), inMST(false) {}
void setInUse() { inMST =true;}

};

typedef unordered_set Copse;

typedef unordered_set Forest;

struct EdgeSortFunc {
inline bool operator() ( Edge &a , Edge &b)
{ return a.length < b.length;}
};

Copse * findInForest(Forest &f, int bush)
{
for (auto fit=f.begin(); fit!= f.end() ; fit++)
{
if ( (*fit)->count(bush) > 0 )
{
return *fit;
}
}
return nullptr;
}

typedef vector< Edge > EdgeList;

int main()
{
// Step 1, Build and sort the Edge List , check the asdM for errors whilst we are doing this.
EdgeList edges= EdgeList();
for (int r=0; r < nBushes; r++)
{
for (int c= r+1; c< nBushes ; c++)
{
```int beforeTotal=0; for (auto i= edges.begin(); i!= edges.end(); i++) beforeTotal += i->length;```
``` // Eaxamin each edge in turn. for (auto it=edges.begin(); it != edges.end(); it++ ) { Copse *a=NULL, *b=NULL; a = findInForest(forest, it->enda); b = findInForest(forest, it->endb); // There are four cases cases, case 4 splits into trivial (forget the the road) or non-trivial join copses if (a==NULL && b== NULL) { // add both a and b form a new copse Copse* c = new Copse(); c->insert(it->enda); c->insert(it->endb); // add new copse to forest forest.insert(c); // set road as in the MST it->setInUse(); } else if (b==NULL) // thus A must be non null { // b to copse a which contains bush 'enda' a->insert(it->endb); it->setInUse(); } else if (a==NULL) // B must be non null etc. mirror of above { b->insert(it->enda); it->setInUse(); } else // Both ends in a copse { // first the easy case where the road is redundant if (a==b) continue; // otherwise Transfer the nodes from copse B to the copse A for ( auto nit= b->begin(); nit != b->end(); nit++) a->insert(*nit); // remove copse b as all its bushes are now in copse A forest.erase(b); delete b; it->setInUse(); } } // now get the after total int afterTotal=0; for (auto i= edges.begin(); i!= edges.end(); i++) if (i->inMST) afterTotal += i->length; cout << beforeTotal-afterTotal << endl; ```
```return 0; ```