Archive for January, 2012

Exhaustive Simulations

January 18th, 2012 No comments

I re-ran all of the simulations after finding an odd quirk in the MIT-NOV dataset: the Hui Centrality was not calculated correctly, so NaN was recorded for every node in every local community, which gave very odd results. This was down to a dataset error, which I have rectified; in future, when NaN is returned it is replaced with a rank of 0.0. I think the NaN came from a divide-by-zero error in the Hui Ranking code.
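A minimal sketch of the guard now in place (the function name and arguments are hypothetical, standing in for the actual division inside the Hui Ranking code):

```python
import math

def safe_hui_rank(numerator, denominator):
    """Hypothetical guard around the ranking division: a zero
    denominator would otherwise produce NaN (or an error), so
    fall back to a rank of 0.0 instead."""
    if denominator == 0:
        return 0.0
    rank = numerator / denominator
    # Also catch NaN propagated from upstream dataset errors.
    return 0.0 if math.isnan(rank) else rank
```

With this in place, a bad record yields a rank of 0.0 rather than poisoning every node in the local community.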

This produces less pronounced results for MIT-NOV than we previously saw. Other results were not affected.

The only other notable quirk was that Moses did not identify communities for InfoCom-2005, InfoCom-2006 and MIT-ALL. The plots below show results for each dataset.

Still working on getting a decent run out of Salathe-School and Studivz.

Categories: experiments

THE plan

January 12th, 2012 No comments

Complete exhaustive runs on all datasets:

  • Enron-a
  • Enron-b
  • Salathe??
  • Studivz

Implement Conga, using the number of communities found by InfoMap as the target number for Conga.
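Since Conga needs the cluster count supplied up front, the wiring might look like the sketch below. The partition format and the `conga_command` invocation are assumptions, not the real tool's interface:

```python
def target_community_count(partition):
    """Number of distinct communities in an InfoMap partition
    (assumed here to be a dict mapping node -> community id);
    this becomes CONGA's target cluster count."""
    return len(set(partition.values()))

def conga_command(edge_list_path, k):
    """Hypothetical command line for a CONGA run -- the real
    tool's flags may differ; this only shows passing k through."""
    return ["java", "-jar", "conga.jar", edge_list_path, "-n", str(k)]

# Example: InfoMap assigned five nodes to three communities.
partition = {"a": 0, "b": 0, "c": 1, "d": 2, "e": 2}
k = target_community_count(partition)
cmd = conga_command("graph.txt", k)
```

This keeps the two algorithms comparable: Conga is asked for exactly as many communities as InfoMap found on the same dataset.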

Implement InfoMap-H.

The above is to be completed by next week's meeting, on 19th Jan.

Write a thesis chapter based on this work, with a view to making it into a journal article. Chapter and article to be completed by the end of March 2012.

Then, conduct another piece of work to tie it all in, probably based on Vector Clocks.

Finally, fill in the (rather large, and difficult) blanks in the thesis.  PC: Easy when said quickly 🙂

Current Tasks

January 11th, 2012 No comments

Below is a list of tasks I had before I went on holiday for Christmas; since I came back, I have worked my way through most of them (completed items shown with strikethrough).

  • Incorporate the Blondel Louvain method into the simulator
  • Incorporate the Conga (steve ????) method into the simulator
  • Consider using the InfoMap hierarchical method
  • Implement multiple Random runs giving an average result
  • Incorporate the dataset from the Salathe paper (motes in a school)
  • Complete an exhaustive run:
    • Use entire dataset for training and testing (using all core nodes)
    • over all datasets (MIT-NOV, MIT-ALL, Social Sensing, Hypertext2009, Cambridge, InfoCom05, InfoCom06, Enron-a, Studivz (selected nodes), Salathe-School dataset)
    • using all algorithms (KCLIQUE, HGCE, InfoMap, LinkClustering, Moses, Blondel, Conga, Random)
    • selecting best of four threshold parameters (where appropriate)
  • Generate plots for all of the above
  • All that currently remains is to incorporate Conga, which is a little tricky: it is not easy to integrate into the simulator (there are no output options that can be worked into the later scripts), and there is no source code available.

    Also, hierarchical InfoMap needs to be looked at.

    Also, runs for MIT-ALL and Enron-a need to be set up.
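    The "multiple Random runs giving an average result" task above could be sketched as follows. The per-run score here is a dummy stand-in for whatever metric the simulator reports (the real run would invoke the simulator, not a random number generator):

```python
import random
import statistics

def random_run(seed):
    """Stand-in for one simulator run using the Random ranking;
    seeding makes each run reproducible. Here it just returns a
    dummy score in [0, 1] in place of the simulator's metric."""
    rng = random.Random(seed)
    return rng.uniform(0.0, 1.0)

def averaged_random(num_runs=10):
    """Average several seeded Random runs so a single lucky or
    unlucky node ordering does not skew the baseline."""
    scores = [random_run(seed) for seed in range(num_runs)]
    return statistics.mean(scores)
```

    Reporting the mean over a fixed set of seeds gives a stable Random baseline to plot against the community-detection algorithms.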

    Categories: what i've been doing