### Archive

Archive for December, 2011

Aaron (McDaid) and I had an interesting conversation about what we mean by message routing.

We first discussed what it meant to have a single message, or a number of message copies, and termed ‘ink and paper’ to be a ‘cost’ associated with making a copy of a message and passing it to another node, rather than just calling it ‘cost’.

We decided that in my simulations, ‘cost’ did not exactly mean this, as even if a node send 1 message to a node it met, and kept a copy of it in it’s archive, the ‘cost’ would be exactly the same as if it discarded the message.
Whereas in our ink and paper cost scenario, there would be an additional 1 ‘ink and paper cost’ for every ‘copy’ of the message sent. Unless, the original message was passed on.

I mentioned some ideas about how to estimate global properties from a local viewpoint (i.e. graham’s periodicity paper combined with vector clocks).

We then went on to determine what ‘periodic’ meant, in terms of future prediction of events, Aaron was concerned that in fact I meant ‘often’ as opposed to ‘regular’, as often implies that some event is repeated, but not necessarily at a regular interval. Regular implies that the event is repeated with the same interval. (e.g. we meed every thursday), he suggested that I be very careful when using the work ‘periodic’.

He also expressed the concern over my liberal use of the term ‘dynamic networks’ and he suggested that in fact the network may better be describes as a temporal network, as edges are made and broken over time.

We then spent some time discussing a mechanism to determine optimum routes through a series of interactions in a temporal network of people. (I am pretty sure I have read a paper that does something similar to what follows).

Aaron suggested we model contact events as nodes, so we worked through this idea:
We have 5 Nodes A to E.
The first event at time= 0 is where each node meets itself.
A new node and is created when two nodes meet, an edge between each node is created, and an edge between a previous event and the new event is created if at least one of the nodes was in the previous meeting.
Edges are labelled by the amount of time since the last meeting.

At the end of this period, we are now able to determine the best ‘routes’ for sending information between nodes. There are two optimal routes, the time optimal route, and the hop optimal route. Information about A received by C can travel from A to B to C to D, 3 hops and a cost of 2 + 6 + 4 = 12, or between A and C directly, 1 hop at a cost of 13.

In the case of C to D, there are two routes, C to B to D (cost 12), or C to D (cost 12), so how do we decide which to use. Well, we can assume that the best route, is one that minimises hops primarily, and time secondarily. If we add a small epsilon value to the time, e.g. 0.00001,

This gives us:

C to B to D = 2 hops = 8.00001 + 4.00001  = 12.00002

vs

C to D  = 1 hop = 12.00001

So the second, shorter (hops) route takes precedence, as the total time is less. So we have chosen the shortest number of hops when the time period would normally have been equal. The reckoning is that it’s better to ‘copy’  (ink and paper) fewer times, when the total time is the same.

The only case when there is conflict, is when there are an equal number of hops, and the information is delivered after the same time peroid. In this case we might adopt a few strategies:

1. Keep a record of all most optimum paths
2. Record all paths, with a rank of efficiency
3. Choose only paths with the most recent event
4. Choose only paths with the oldest event
5. Choose a path at random from the optimum paths

If we now wished to predict routes, Aaron suggested that an exhaustive mechanism is to predict all possible future connections, assuming existing connections are regular and periodic, calculate the best routes, and send messages accordingly. This would be quite intensive and perhaps infeasible.

Perhaps a better strategy is to use our knowledge of existing routes, and on meeting a node on the optimal path to the destination for a message we hold, pass the message, else keep it.

We didn’t get as far as solving the problem of how to predict the future connections, and therefore calculate the optimal path.

Categories:

## Distributed Community finding in Dynamic Networks

I Have been thinking about this as an extension to the work we have done with BubbleH and BUBBLE Rap. The biggest drawbacks of these is that they assume that community information is available to them. This is not necessarily realistic, as in the real-world, nodes would have to get this information from some where.

A solution to this is to create some sort of distributed community finding mechanism, that adapts to the ever changing dynamic network that is made up by the nodes doing the calculations.

In a previous post, I mentioned Vector Clocks and whilst they like a nice idea, the Vector Clock idea on it’s own doesn’t give us a routing scheme.

I think there is a way to use some of the concepts of the vector clock idea to inform a distributed community finding scheme, which estimates, for example, InfoMap routing, or, ideally, the hierarchical clustering of LinkClustering.

I think there is something in sharing edge infomation (i.e. edges and connected time) which will allow us to detect communities: perhaps using our assumptions about threshold values (Mean, Mediam, 80th percentile and 20th percentile), we can run lightweight algorithms at node level.

This can be either done by a node to calculate it’s own communities, or to calculate communities for other  nodes.

For example, if a node knows about itself and some other nodes communities, it can make decisions about routing. If it has a message for a node that it doesn’t know about, it assumes it belongs to a global community.

It may also be possible to estimate betweenness centrality based on degree over time (as Graham did), in which case we have a way to calculate global rank, and perhaps LocalRank. If we can find a way to detect community size (perhaps nodes advertise membership), then we can also determine BridgingCommunities.

I would have to be careful not to step on Graham’s toes though, as his Persistance Based Routing also relies on connectivity times between nodes, but as far as I am aware, it does not do any sort of clustering of nodes.

Categories: