Updated plots for Community Structure
In my last meeting with Pádraig we went through the spectral clustering technique and discovered some errors in my calculation of the clusters. I took the time to redo the community calculations and to run the tests again, to make sure everything is correct. I used the MIT dataset, constrained to the month of October 2004 (as before). Messages are created and initially sent at the first time step. The community selected for each week is the biggest group, which in practice was roughly the same set of connected nodes from week to week. I chose to omit the nodes with a value of 0.0 in the matrix (V) (see data pipelines).
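As a rough illustration of the selection step, the sketch below picks the biggest group from a set of cluster assignments after dropping the nodes whose value in V is 0.0. The names `labels`, `v_values` and `largest_community` are hypothetical, and the dictionaries are only an assumed representation of the clustering output, not the actual data pipeline.

```python
from collections import defaultdict


def largest_community(labels, v_values):
    """Return the biggest cluster as a sorted list of node ids.

    labels:   dict mapping node -> cluster label (assumed clustering output)
    v_values: dict mapping node -> that node's value in V (assumed layout)
    """
    groups = defaultdict(list)
    for node, label in labels.items():
        # Omit nodes whose value in V is exactly 0.0. A node missing from
        # v_values is treated as 0.0 here, which is an assumption.
        if v_values.get(node, 0.0) == 0.0:
            continue
        groups[label].append(node)

    if not groups:
        return []

    # Pick the label with the most members; iterating over sorted labels
    # keeps the choice deterministic when two groups are the same size.
    best_label = max(sorted(groups), key=lambda lab: len(groups[lab]))
    return sorted(groups[best_label])


# Illustrative values only, not taken from the MIT dataset.
example_labels = {1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 1}
example_v = {1: 0.5, 2: 0.3, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.2}
community = largest_community(example_labels, example_v)
```

Running this once per week (and once over all four weeks) would give the five community sets used in the plots, under the assumptions above.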
Delivery Ratio over time

This plot shows the delivery ratio over time for the MIT-OCT dataset, for LBR communities (where the community is taken from week 1, 2, 3 or 4, or from all 4 weeks), PBR, Prophet, Unlimited Flood and Random (1.0, 0.2, 0.0).
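For reference, a delivery-ratio-over-time curve of this kind can be computed as a cumulative fraction. The sketch below assumes that all messages are created at the first time step (as stated above) and that the simulator reports the time step at which each delivered message arrived; the function name and input format are hypothetical and not the simulator's actual output.

```python
def delivery_ratio_over_time(delivery_steps, total_messages, num_steps):
    """Cumulative fraction of messages delivered by each time step.

    delivery_steps: list of time-step indices, one per delivered message
                    (assumed format; undelivered messages are simply absent)
    total_messages: number of messages created at the first time step
    num_steps:      number of time steps in the simulation
    """
    # Count how many messages arrive at each individual time step.
    delivered_at = [0] * num_steps
    for step in delivery_steps:
        if 0 <= step < num_steps:
            delivered_at[step] += 1

    # Accumulate the counts and normalise by the number of messages created.
    ratios = []
    delivered_so_far = 0
    for count in delivered_at:
        delivered_so_far += count
        ratios.append(delivered_so_far / total_messages if total_messages else 0.0)
    return ratios


# Illustrative values only: 4 messages, 3 of them delivered.
curve = delivery_ratio_over_time([1, 1, 3], total_messages=4, num_steps=5)
```

The last element of the returned list is the final delivery ratio for that protocol.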
Final Delivery Ratio and Cost

This chart shows the final delivery ratio and cost for each protocol. Note that Unlimited Flood does not report cost.
In these updated plots, it is encouraging to see that with the corrected community structure LBR actually performs well. In fact, with the community structure from week 1 it performs better overall than PBR, with weeks 2, 3, 4 and all weeks close behind. It is also good to see that LBR consistently beats Random 0.0 and Random 0.2. LBR with the week 1 community also tracks a small way behind Flooding for a short while between 05/10 and 08/10, which is quite interesting. The communities are listed below (or can be viewed here), for further analysis.
UPDATE:
Having given this a little more thought, the comparison above is a little misleading, so I have generated new plots in which all protocols are constrained to the same community nodes. Unfortunately, this yields a spread of results between the protocols similar to that of the earlier results, which did not consider community structure. Only the structure discovered in week 1 increases the delivery ratio.
Delivery Ratio for each community, for each protocol

This plot shows the delivery ratio of each protocol for each community, where the communities are generated from the individual weeks and from all weeks of October 2004 in the MIT Reality Mining dataset.
What I plan to do now is to go back to the ranking algorithm, and use the linked cells mechanism to re-calculate the rankings. It might also be useful to start thinking about being able to use the real-time location information (in this case cell towers/linked cell towers) in a routing algorithm directly. This would let us start to implement some simplistic location predictions.
I also created a clearer graph visualisation of the cell tower links. It shows that there is one component that encompasses a large percentage of the reported cells overall. If we were to consider this component to be a single location, it could cover a very large area.

This shows the cell towers that are connected when at least 100 messages are passed and reported at both cell towers. Edges are labelled with the count where it is more than 300.
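A minimal sketch of how such a graph and its largest component could be derived is given below. It assumes a hypothetical `pair_counts` dictionary mapping each pair of cell towers to the number of messages reported at both; the thresholds of 100 and 300 come from the caption above, while the function names and data layout are illustrative only.

```python
from collections import defaultdict, deque


def build_cell_graph(pair_counts, min_count=100, label_above=300):
    """Build an undirected tower graph from per-pair message counts.

    pair_counts: dict mapping (tower_a, tower_b) -> message count
                 (assumed input format)
    Returns (adjacency, edge_labels); edge_labels holds only the edges
    whose count is greater than label_above.
    """
    adjacency = defaultdict(set)
    edge_labels = {}
    for (a, b), count in pair_counts.items():
        # Ignore self-links and pairs below the connection threshold.
        if a == b or count < min_count:
            continue
        adjacency[a].add(b)
        adjacency[b].add(a)
        if count > label_above:
            edge_labels[frozenset((a, b))] = count
    return dict(adjacency), edge_labels


def largest_component(adjacency):
    """Return the node set of the largest connected component (BFS)."""
    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(best):
            best = component
    return best


# Illustrative counts only, not taken from the dataset.
example_counts = {("A", "B"): 150, ("B", "C"): 350, ("C", "D"): 50, ("E", "F"): 120}
graph, labels = build_cell_graph(example_counts)
giant = largest_component(graph)
# Share of connected towers that fall inside the largest component.
coverage = len(giant) / len(graph)
```

Comparing `coverage` across different values of `min_count` would be one way to check how much the size of the large component depends on the chosen threshold.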