BubbleRAP routing using different clustering techniques
After a lot of wrangling with code, and spending a good deal of time reading through notes, etc. I was able to work out how to generate community clusters using LocationSim, and how to feed in clusters determined by other mechanisms.
BubbleRAP uses and adapted form of KCLIQUE clustering to generate community structures, along with a score of Betweenness centrality (referred to as Hui Ranking) globally and within clusters to drive the routing process. The adaption relates to the use of a contact threshold, (the Bubblerap paper uses 4.5 days, Graham uses 2 days), which specifies how long on average nodes must be in contact for them to be counted.
I adapted the clustering phase to use MOSES, GCE (using Conrad’s code) and the algorithm by Yong-Yeol Ahn, James P. Bagrow, Sune Lehmann in Link communities in complex networks which I have name ABL. At this stage I have tested only the basic algorithms against MIT-OCT and MIT-ALL node lists.
Dataset | # Communities | Average Size |
---|---|---|
MIT-OCT | ||
KCLIQUE t=4.5days | – | – |
KCLIQUE t=2days | – | – |
MOSES | 8 | 29 |
GCE | 3 | 41 |
ABL | 64 | 8 |
MIT-ALL | ||
KCLIQUE t=4.5 days | 6 | 6 |
KCLIQUE t=2 days | 7 | 9 |
MOSES | – | – |
GCE | 1 | 101 |
ABL | 16 | 19 |
I found that KCLIQUE does not find any communities for the small time slice in MIT-OCT, this could be because the threshold means that in the course of a month, participants do not spend alot of time together on average. I will need to explore this further to verify this this.
MOSES and GCE perform badly on the denser, MIT-ALL dataset, and after speaking to Aaron and Conrad, it seems this can be expected with such a closely connected dataset. My next step is to try to use GCE using edge weights, to see if that makes a difference.
in order to get some idea of results for MIT-ALL MOSES and GCE, and MIT-OCT KCLIQUE, I used the communities from their opposite positions to populate the datasets, this meant I was able to simulate BubbleRAP for all cluster types for both MIT-OCT and MIT-ALL.
Another intersting thing I discovered, was Grahams implementation of Bubble-Degree, which instead of using the Hui Ranking for communities, nodes are ranked by degree over a 6 hour time period. Bubble-Connections worked in the same way but the ranking is based on the number of connections in a 6 hour period. (Graham explains this in section 6.1.2 in his thesis, and in the comments of his code).
The results are below for all cluster types, and both datasets, using bubble-degree, and bubblerap. Also included is unlimited flood to show the maximum possible delivery between all pairs of nodes.
This seems to suggest that the clustering algoritm does have some effect on the routing algorithm, and further analysis may be able to show whether this is due to simply the difference in number/size of communities.
Other tasks to do:
- Investigate recent improvements of Bubble Rap (if any)
- Consider different network data (e.g. can we run a similar analysis over a phonecall network?)
- Tweaks to clustering algorithms?
- Chase up SocialSensing dataset
- Continue to think about Location Based Routing – is it feasable still?
Hi Matt,
If I am interpreting the results correctly then this might be promising.
GCE has a couple of parameters so it may be possible to get it to produce more than one community on the ALL data. Can you ask Conrad about this?
Let’s discuss this on Thursday.
Pádraig.