Home > Datasets, what i've been doing > BubbleRAP routing using different clustering techniques

BubbleRAP routing using different clustering techniques

February 15th, 2011 Leave a comment Go to comments

After a lot of wrangling with code, and spending a good deal of time reading through notes, etc. I was able to work out how to generate community clusters using LocationSim, and how to feed in clusters determined by other mechanisms.

BubbleRAP uses and adapted form of KCLIQUE clustering to generate community structures, along with a score of Betweenness centrality (referred to as Hui Ranking) globally and within clusters to drive the routing process. The adaption relates to the use of a contact threshold, (the Bubblerap paper uses 4.5 days, Graham uses 2 days), which specifies how long on average nodes must be in contact for them to be counted.

I adapted the clustering phase to use MOSES, GCE (using Conrad’s code) and the algorithm by Yong-Yeol Ahn, James P. Bagrow, Sune Lehmann in Link communities in complex networks which I have name ABL. At this stage I have tested only the basic algorithms against MIT-OCT and MIT-ALL node lists.

Dataset # Communities Average Size
MIT-OCT
KCLIQUE t=4.5days
KCLIQUE t=2days
MOSES 8 29
GCE 3 41
ABL 64 8
MIT-ALL
KCLIQUE t=4.5 days 6 6
KCLIQUE t=2 days 7 9
MOSES
GCE 1 101
ABL 16 19

I found that KCLIQUE does not find any communities for the small time slice in MIT-OCT, this could be because the threshold means that in the course of a month, participants do not spend alot of time together on average.  I will need to explore this further to verify this this.

MOSES and GCE perform badly on the denser, MIT-ALL dataset, and after speaking to Aaron and Conrad, it seems this can be expected with such a closely connected dataset. My next step is to try to use GCE using edge weights, to see if that makes a difference.

in order to get some idea of results for MIT-ALL MOSES and GCE, and MIT-OCT KCLIQUE, I used the communities from their opposite positions to populate the datasets, this meant I was able to simulate BubbleRAP for all cluster types for both MIT-OCT and MIT-ALL.

Another intersting thing I discovered, was Grahams implementation of  Bubble-Degree, which instead of using the Hui Ranking for communities, nodes are ranked by degree over a 6 hour time period. Bubble-Connections worked in the same way but the ranking is based on the number of connections in a 6 hour period. (Graham explains this in section 6.1.2 in his thesis, and in the comments of his code).

The results are below for all cluster types, and both datasets, using bubble-degree, and bubblerap. Also included is unlimited flood to show the maximum possible delivery between all pairs of nodes.

Results for Bubble and Bubble-Degree for dataset MIT-ALL, using KCLIQUE, MOSES, GCE and ABL

Results for Bubble and Bubble-Degree for dataset MIT-ALL, using KCLIQUE, MOSES, GCE and ABL

Results for Bubble and Bubble-Degree for dataset MIT-OCT, using KCLIQUE, MOSES, GCE and ABL

Results for Bubble and Bubble-Degree for dataset MIT-OCT, using KCLIQUE, MOSES, GCE and ABL

This seems to suggest that the clustering algoritm does have some effect on the routing algorithm, and further analysis may be able to show whether this is due to simply the difference in number/size of communities.

Other tasks to do:

  • Investigate recent improvements of Bubble Rap (if any)
  • Consider different network data  (e.g. can we run a similar analysis over a phonecall network?)
  • Tweaks to clustering algorithms?
  • Chase up SocialSensing dataset
  • Continue to think about Location Based Routing – is it feasable still?
  1. Pádraig
    February 15th, 2011 at 19:39 | #1

    Hi Matt,
    If I am interpreting the results correctly then this might be promising.
    GCE has a couple of parameters so it may be possible to get it to produce more than one community on the ALL data. Can you ask Conrad about this?

    Let’s discuss this on Thursday.

    Pádraig.

  1. No trackbacks yet.