How do you find the shortest path in a network of which not all connections are known, such as the internet or protein pathways in the body? Scientists have been searching for the answer to that question for decades. Researchers from TU Delft, among others, have now found a solution for certain networks.
Finding the fastest route in a network is a well-known problem that you can solve if you know all the nodes and connections. A solution to this came in 1959 in the form of the Dijkstra’s algorithm, developed by Dutch computer scientist Edsger Dijkstra. This algorithm is used, for example, in navigation systems.
‘But as soon as we don’t know everything about the network, the shortest path problem quickly becomes very difficult,’ says network scientist Maxim Kitsak, from TU Delft. He developed with colleagues a new method which can calculate the shortest path between two nodes in certain networks. This turns out to be possible even if 90 percent of the connections in the network are hidden, or if the network contains fake links.
READ ALSO
Brain training apps lack scientific backing
A lot of apps claim to make their users smarter. Not only are these brain training apps quite boring, their effect on our performance is…
Trust in the internet
This method is in high demand. Most large-scale networks are incomplete. Take social networks for example. During the corona crisis, the government tried to map people’s social contacts. That is an incomplete network, because it is impossible – and undesirable from a privacy point of view – to know all of everyone’s contacts.
‘Another example is the internet,’ says Kitsak. ‘The management of this is largely in the hands of companies that all deal with part of the network.’ Those companies don’t disclose much information. In addition, changes happen every few minutes.
At the moment, the internet operates on the basis of trust. Routers, the nodes of the internet, tell their neighbors to which other nodes they have short lines. The neighbors pass this on and it becomes clear through which nodes you can reach someone.
Kitzak: ‘Imagine: I’m in contact with a great PhD student, and I’m telling you about it. You can then share that information with others and tell them: if you have a message for this great PhD candidate, let me know, I will pass it on to Maksim and he will pass it on to that PhD candidate.’
In this scenario, you have to trust that everyone is telling the truth, and that no one is sending your confidential email past an intelligence agency or tabloid.
Shortest path in Manhattan
The method developed by Kitsak and his colleagues can make the Internet more secure by detecting and removing fraudulent communication paths. This may allow us to avoid situations in which internet traffic is undesirably redirected via Russia or China, for example.
The method works as follows: the researchers first look for a so-called geometric representation of the network. This representation distributes the nodes over a geometric space. Depending on the network, this can be an ordinary (Euclidean) be flat or something complex, such as a hyperbolic Poincaré disk. The distribution of the nodes in that space represents their mutual distances. Then you draw the shortest line between the two nodes you want to connect – the so-called geodesic. If the geometric space is a plane, then it is a straight line. The nodes closest to that line then most likely form the shortest path between the two points.
This method proves to work well, even if you don’t know many connections between the nodes. Kitsak and his colleagues were able to find the shortest paths on the Internet even when up to 90 percent of connections were randomly dropped.
“You can compare it to finding your way to a skyscraper in New York’s Manhattan,” says Kitsak. ‘The streets there form a grid pattern. And if you want to walk to the Empire State Building, for example, you will see it towering above the other buildings. You can now easily walk to that building because you know how the streets are arranged. You get there by always choosing a street that runs as far as possible in the direction of the skyscraper.’
No one size fits all
“Our work does not offer a one-size-fits-all solution for all incomplete networks,” says Kitsak. ‘We make two assumptions about the network. The first is that you can make a geometric representation of it.’ This is possible with the internet, but it is not certain that it is possible with all networks.
The second assumption is that there are about the same number of missing connections everywhere in the network, and whether a connection is missing or not is completely random. In reality, that is rarely the case. For example, Internet connections can change locally or disappear due to geopolitical events.
Kitsak: ‘We are now going to find out what the limits are of these assumptions. And we hope that the method will continue to work if, for example, the connections are not completely randomly and uniformly distributed across the network.’