Every morning, John walks to the nice little coffee shop not far from his house. He just can’t get enough of that chai latte with an extra shot and cinnamon on top. One day, immersed in his musings about the nature of reality, he wondered, between a sip and another: “If I made a different path from home every day, how long until I would have to repeat a path?” It didn’t take too long until he realized that the answer is “infintely many” (you can always walk in circles and so on). Being a reasonable man, though, John also wants to keep his journey relatively short. No more than four blocks. Sometimes five, if the weather is nice.

You can picture the city John lives in as something like this:

A city modeled as a graph

The numbered circles are either points of interest, such as John’s house or the coffee shop, or intersections between two streets. The streets themselves are represented by the blue lines. This is, of course, a graph, in which the locations or intersections are the nodes and the streets are the edges.

So how would we solve this problem computationally?

Adjacency matrices

The first thing we need to decide is how to represent this graph to be manipulated by our program. One way is by using a square matrix like this:

\[A = \begin{bmatrix} 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0\\ \end{bmatrix}\]

The matrix \(A\) was constructed by following two simple rules:

  1. If the graph has \(n\) nodes, we will need a \(n \times n\) matrix. In this case, \(n = 12\).
  2. If there is an edge between node \(i\) and node \(j\), then the value of \(a_{ij}\) (row \(i\), column \(j\) of the matrix) will be \(1\). Otherwise, the value will be \(0\).

This is called the adjacency matrix of the graph. By the way, since all streets in this graph are two-way streets (the graph is undirected), an edge from node \(i\) to node \(j\) appears twice in the matrix: in \(a_{ij}\) and \(a_{ji}\). In other words, the adjacency matrix of an undirected graph is symmetric. If you draw a line over its diagonal, the part of the matrix to the top of the diagonal is a mirror of the part on the bottom.

Number of paths

Before we tackle the general problem posed by John, let’s consider a simpler one. Suppose we wanted to know how many ways one can go from node 5 to node 9, crossing just one edge. That’s almost like a silly question. Just look at element \(a_{5,9}\) of the matrix and you will see that the value is \(0\). There is no single edge connecting the two. That’s what the adjacency matrix is for: to tell which paths of length 1 exist in your graph.

What about length 2? Let’s think about what that means, for a moment. You want to go from node 5 to node 9 by crossing two edges. So first you need to go from node 5 to some other node (let’s call it \(k\)) and then, from node \(k\) to node 9. Of course, there may be many different choices of \(k\). Each such choice determines a path of length 2. By looking at the graph, we can see that the only choices we can make are 7 and 10. So there are two paths of length 2 in this case.

We can express this intuition more precisely using the adjacency matrix. Remember that each row shows all outbound edges from a node whereas each column shows all inbound edges to a node. Let’s then focus only on row 5 and column 9 (our source and destination, respectively):

\[r_5 = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 0 & \mathbf{1} & 0 & 0 & \mathbf{1} & 0 & 0\\ \end{bmatrix}\] \[c_9^T = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & \mathbf{1} & 1 & 0 & \mathbf{1} & 1 & 0\\ \end{bmatrix}\]

Here I’m showing column 9 horizontally (also known as its transpose) to make the point more evident. If you think of them as two arrays of numbers and look at them position by position, you will notice that there are only two positions — highlighted in bold — where the number 1 appears in both arrays: 7 and 10, matching our visual inspection of the graph.

In general, the algorithm for counting the paths of length 2 between nodes \(i\) and \(j\) is:

counter = 0
for (k = 1 to n) do
  if (a[i, k] == 1 and a[k, j] == 1) then
    counter = counter + 1;
  end
end

which is equivalent to:

counter = 0
for (k = 1 to n) do
  counter = counter + a[i, k] * a[k, j]
end

which, in mathematical notation, becomes:

\[a^{(2)}_{ij} = \sum_{k = 1}^n a_{ik}a_{kj}\]

where \(a^{(2)}_{ij}\) is the number of paths of length 2 between \(i\) and \(j\). If you are familiar with the definition of matrix multiplication, you’ll recognize this as exactly \(A^2\), shown below:

\[A^2 = \begin{bmatrix} 3 & 0 & 0 & 0 & 2 & 2 & 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 2 & 1 & 2 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 3 & 2 & 0 & 0 & 1 & 0 & 2 & 0 & 1 & 0\\ 0 & 2 & 2 & 4 & 1 & 1 & 2 & 0 & 2 & 2 & 0 & 0\\ 2 & 0 & 0 & 1 & 4 & 2 & 2 & 0 & 2 & 1 & 0 & 1\\ 2 & 0 & 0 & 1 & 2 & 4 & 2 & 2 & 1 & 2 & 1 & 0\\ 1 & 1 & 1 & 2 & 2 & 2 & 5 & 1 & 2 & 2 & 1 & 1\\ 1 & 0 & 0 & 0 & 0 & 2 & 1 & 3 & 1 & 1 & 1 & 1\\ 0 & 0 & 2 & 2 & 2 & 1 & 2 & 1 & 5 & 1 & 1 & 2\\ 0 & 1 & 0 & 2 & 1 & 2 & 2 & 1 & 1 & 4 & 2 & 0\\ 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 1 & 2 & 3 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 2 & 0 & 0 & 2\\ \end{bmatrix}\]

But there is no reason to stop there. If we want to know all the paths of length 3, we can use the same method again. A path of length 3 is nothing more than a path of length 2 followed by a path of length 1. And we have this information already for any pair of nodes, using \(A^2\) and \(A\), respectively. So, \(A^2A = A^3\) gives us what we want. In general, if you want to know all paths of length \(m\), then \(A^m\) is your friend.

The computational matrix

Notice, however, that John doesn’t need to know the number of paths from all nodes to all nodes. Only from his house to the coffee shop, which saves us a lot of computation.

Suppose his house is node 5 and the coffee shop is node 12. Instead of computing the whole multiplication of \(A\) by itself, we only need to multiply row 5 (a \(1 \times 12\) matrix) by \(A\) (a \(12 \times 12\) matrix), obtaining another \(1 \times 12\) matrix:

\[\begin{bmatrix} 2 & 0 & 0 & 1 & 4 & 2 & 2 & 0 & 2 & 1 & 0 & 1\\ \end{bmatrix}\]

In terms of graphs, this matrix contains the number of paths of length 2 from node 5 to every other node. We can then mutiply this matrix by \(A\) again to obtain the number of paths of length 3 and so on. In theory, we can repeat this infinitely many times. In Haskell:

import Data.Matrix
import Data.Vector

iteratedMult :: Vector Int -> Matrix Int -> [Vector Int]
iteratedMult row m = fmap (getRow 1) (iterate (`multStd` m) (rowVector row))

The use of the Vector type is not strictly necessary here, but makes the type of the function iteratedMult more descriptive. This function returns a lazy infinite list of all the intermediate results of this multiplication. Each intermediate result is necessary to compute the next one, but in the end all we care about are the values at the destination. So we need a function to extract those values:

pathCounts :: Int -> [Vector Int] -> [Int]
pathCounts destination = fmap (Data.Vector.! (destination - 1))

Finally, we take only as many results from this infinite list as we need and compute the sum to get our end result. Putting everything together:

numberOfPaths :: Matrix Int -> Int -> Int -> Int -> Int
numberOfPaths adjacency source destination maxLength =
  let sourceRow = getRow source adjacency
      allProducts = iteratedMult sourceRow adjacency
      paths = Prelude.take maxLength allProducts
  in Prelude.sum (pathCounts destination paths)

Using this function to solve John’s problem:

city :: Matrix Int
city = fromLists [[0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
                 ,[1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
                 ,[1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
                 ,[1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
                 ,[0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
                 ,[0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
                 ,[0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
                 ,[0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0]
                 ,[0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0]
                 ,[0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1]
                 ,[0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1]
                 ,[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]]

main = print $ numberOfPaths city 5 12 5
-- result: 42

There you go! In way less than 7\(\frac{1}{2}\) million years, the computer gave us the answer to the ultimate question of John, the coffee shop and everything.