Archive for February, 2010

Paper reading with Davide 26 Feb 2010

February 26th, 2010

Spent a few hours going over the following paper with Davide:

Kossinets, G., Kleinberg, J., & Watts, D. (2008). The structure of information pathways in a social communication network. In Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining – KDD ’08 (p. 435). New York, New York, USA: ACM Press. doi: 10.1145/1401890.1401945.

This was a very useful session, as it meant that we were forced to really understand the paper.

The authors present an analysis of email datasets using vector clocks as a framework. They argue that the issues of out-of-date information and indirect paths are central to understanding the patterns of systemic communication.

They explore Granovetter’s theory of weak ties, which basically says that long range connections can give better information than close connections.

This paper led me to think that vector clocks would be a great way to discover the state of an unknown network from the point of view of each node. E.g. if a node exchanges vector clocks with other nodes it meets (perhaps with some probability, and some time restriction) it could quickly build up an internal state of the following:

  • the number of nodes in its connected network
  • hops to other nodes (?? maybe ?? – )
  • the range of other nodes (i.e. the amount of updated information a node gives about other nodes) – which can be used as a routing metric
  • membership clustering using the ball of radius T (i.e. the nodes it is up to date with to within some value T days, which could be based on periodicity)
  • the periodic degree of a node (using T as the period), which can also be used as a routing metric

Each node must have a unique identity (perhaps based on its BT MAC address). Even Open World nodes that are not members of the closed world can be counted when it comes to degree calculations, but only as a ramp up, as only closed world nodes (i.e. others using this algorithm) will be of interest when sending messages.
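To make the idea a bit more concrete, here is a minimal sketch in Python of how a node might keep and exchange such a clock. This is my own illustration, not the algorithm from the paper: the `Node` class, the `meet`, `known_nodes` and `ball` names, the symmetric exchange on every meeting and the use of days as the time unit are all assumptions. Each clock entry holds the latest time at which information from another node could have reached this one, directly or through intermediaries.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node that tracks how fresh its view of other nodes is."""

    node_id: str  # assumed unique, e.g. derived from a MAC address
    # clock[x] = latest time (in days) at which information originating at x
    # could have reached this node, directly or via other nodes.
    clock: dict = field(default_factory=dict)

    def meet(self, other: "Node", now: float) -> None:
        """Exchange clocks with `other` at time `now` (assumed symmetric)."""
        # Each node is always fully up to date about itself.
        self.clock[self.node_id] = now
        other.clock[other.node_id] = now
        # Both sides keep the freshest entry known to either of them.
        merged = {}
        for x in self.clock.keys() | other.clock.keys():
            merged[x] = max(
                self.clock.get(x, float("-inf")),
                other.clock.get(x, float("-inf")),
            )
        self.clock = dict(merged)
        other.clock = dict(merged)

    def known_nodes(self) -> int:
        """Number of nodes heard about so far, counting this node itself."""
        return len(set(self.clock) | {self.node_id})

    def ball(self, now: float, T: float) -> set:
        """Other nodes whose information is at most T days out of date."""
        return {
            x
            for x, t in self.clock.items()
            if x != self.node_id and now - t <= T
        }
```

With something like this, the size of `ball` for a given T could serve as a rough clustering or routing signal. Hop counts are not recoverable from the clock alone and would need extra state.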

Presentation 10th Feb 2010

February 11th, 2010

Gave a presentation to Paddy, Davide, Neil Cowzer and Fergal Reid (clique) about my quick and dirty analysis of the dataset that I have collected already.


General consensus was that there were not really enough users, and so there were some suggestions about other datasets that might be found: persuade a mobile phone company to give data about user movements, or mine Flickr/Twitter for geo-tagged photos/tweets and try to determine groups of people based on similar locations.

Also suggested that the General Mobility Analysis (GMA) is good for visualising data but not greatly interesting, while Periodicity Hunting (PH) is interesting, as is Spatial Dependence of the Degree (SPD). Buddy Discovery (BD) is useful as an application to gather data, but would need a very large engineering effort.

Paddy suggested that if we could make the data collection process very easy, then we could throw it out to the student population to start collecting data. Fergal said that in J2ME it would be very difficult, but by sticking to C++ it might work (for Nokia phones).

Also talked about getting ground truth for data. Fergal suggested collecting accelerometer data too (so if someone asked how we verified a GPS trace, we can say that we correlated it with the accelerometer data). I also suggested tagging locations.

Determined the following actions:

  • Look for access to datasets with good location – 1 week
    • WaveLAN Dataset
    • HeaNET – chase paddy – Eduroam
    • Mine location data from Flickr
  • Look at applying analysis to these datasets – specifically
    • Periodicity Hunting
    • Spatial Dependence on the Degree
  • See if we can construct overlay over these networks
    • e.g. drop nodes
      • Popular locations
      • popular people
      • Other?
      • Vector clocks might be the way to do it
  • Read up about Vector Clocks as suggested in the paper by Kossinets, Kleinberg and Watts at KDD ’08
  • Speak to Graham about whether I can easily integrate this data into his code; if so, do it, otherwise think about implementing it separately (robustly!)

Also planned to meet Paddy again next week to go over these things, and try to hammer out a better plan. Then meet with these people again in three weeks to show where I have got to.

Davide also talked about churn in proximity patterns, which might be worth thinking about (the example was when a person regularly sees other people and, after a while, one of those people drops off the radar: what does this mean?)

Paddy said that in his mind, the long-term goal is to be able to forward plan using the knowledge of data that has passed (prediction).

Discussion with Davide about plots etc 4th Feb 2010

February 4th, 2010

Four types of data analysis:

General Mobility Analysis

We calculate the distance between locations at the start of every time period (e.g. 1 hour) and plot the number of times that particular distance is travelled (to some granularity) over some longer time period (1 week maybe).
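A rough sketch of how this could be computed, assuming the readings are available as time-sorted `(unix_time, lat, lon)` tuples. That input format, the function names, the 1 km bin size and the choice to skip pairs of periods separated by a gap are all my assumptions rather than anything fixed in the discussion.

```python
import math
from collections import Counter


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def distance_histogram(fixes, period_s=3600, bin_km=1.0):
    """Count how often each (binned) distance is travelled between periods.

    fixes: iterable of (unix_time, lat, lon), sorted by time (assumed format).
    Returns a Counter mapping bin index -> number of occurrences, where
    bin index k covers distances in [k * bin_km, (k + 1) * bin_km).
    """
    # Keep the first fix seen in each period as the "location at the start".
    first = {}
    for t, lat, lon in fixes:
        first.setdefault(int(t // period_s), (lat, lon))

    slots = sorted(first)
    hist = Counter()
    for prev, cur in zip(slots, slots[1:]):
        if cur - prev != 1:
            continue  # no reading in between, so skip rather than guess
        d = haversine_km(first[prev], first[cur])
        hist[int(d // bin_km)] += 1
    return hist
```

Feeding in one week of fixes at a time would give the per-week plot described above.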

Periodicity Hunting

We measure the time spent at a location, and count the number of visits to it in a bounded time period (say a week), using the same timescale as above to bracket readings.

(People visit common locations frequently, or they visit some locations for a long period of time. Also think about the case where lots of people visit a common location infrequently/frequently.)
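As a sketch of the counting involved, assume the readings have already been turned into visits of the form `(location, arrive_s, leave_s)`. That format, the function name and the rule that a visit belongs to the window in which it starts are assumptions on my part, and the 1 hour bracketing of raw readings is left out here.

```python
from collections import defaultdict

WEEK_S = 7 * 24 * 3600  # one week in seconds


def weekly_visit_stats(visits, window_s=WEEK_S):
    """Visit count and total dwell time per location per window.

    visits: iterable of (location, arrive_s, leave_s) tuples (assumed format).
    Returns {(window_index, location): [visit_count, total_dwell_s]}.
    """
    stats = defaultdict(lambda: [0, 0.0])
    for loc, arrive, leave in visits:
        # Assumption: a visit is credited to the window in which it starts.
        key = (int(arrive // window_s), loc)
        stats[key][0] += 1
        stats[key][1] += leave - arrive
    return dict(stats)
```

Locations with many short visits and locations with few long visits then show up as different shapes in the two numbers, which is the distinction made above.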

Spatial Dependence of the Degree

We count the number of devices seen in a given time period (same as above – e.g. 1 hour) and note the location.
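A minimal sketch of this count, assuming sightings arrive as `(unix_time, location, device_id)` tuples. The tuple layout and the function name are hypothetical, and a device seen several times in the same slot and location is counted once.

```python
from collections import defaultdict


def degree_by_slot_and_location(sightings, period_s=3600):
    """Number of distinct devices seen per (time slot, location).

    sightings: iterable of (unix_time, location, device_id) (assumed format).
    Returns {(slot_index, location): distinct_device_count}.
    """
    seen = defaultdict(set)
    for t, loc, dev in sightings:
        # A set ensures repeated sightings of one device count only once.
        seen[(int(t // period_s), loc)].add(dev)
    return {key: len(devices) for key, devices in seen.items()}
```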

Buddy Discovery

We count the duration of the contacts between pairs (the user and the devices he can see) and also the location of the contacts, and try to see which devices are seen most often, and then try to see which devices are seen at multiple locations. (using the same time period as above – 1 hour slots over a week)
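One way this might look in code, assuming contacts are recorded as `(device_id, location, start_s, end_s)` tuples. The format and the `buddy_summary` name are hypothetical, and ranking by total contact duration (ties broken by device id) is just one possible reading of "seen most often".

```python
from collections import defaultdict


def buddy_summary(contacts):
    """Rank devices by total contact time and find multi-location devices.

    contacts: iterable of (device_id, location, start_s, end_s) (assumed format).
    Returns (ranked_devices, multi_location_devices).
    """
    duration = defaultdict(float)
    places = defaultdict(set)
    for dev, loc, start, end in contacts:
        duration[dev] += end - start
        places[dev].add(loc)

    # Longest total contact first; device id breaks ties deterministically.
    ranked = sorted(duration, key=lambda d: (-duration[d], d))
    multi = {d for d in places if len(places[d]) > 1}
    return ranked, multi
```

Devices that rank highly and also appear in the multi-location set would be the most obvious buddy candidates.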

Categories: discussions, Ideas, projects

Discussion with Paddy and Davide 2nd Feb 2010

February 2nd, 2010

Met with Paddy and Davide and discussed what we have been doing.

Actions from last meeting:

  • Said that I had been collecting data which seems to have good location information.
  • Had spoken with prag etc. but not really very useful
  • Davide has come up with some great questions for analysis of data
  • The only thing I hadn’t done was arrange a presentation for findings so far.

Paddy was happy with the progress so far, and after we discussed a number of things, we came to the following action points:

  1. Do a quick and dirty analysis of data
    1. Mobility analysis
    2. Periodicity
    3. Buddies
    4. Spatial degree
    5. Situation detection e.g. what does periodicity mean?
  2. This is so that we can ask:
    • Do we have the data we need already?
    • What are the limitations of the data?
    • Are there other questions we need to ask?
  3. Plan a presentation for next Wednesday morning (more of a brainstorm) to develop the ideas further, and really try to hammer down the larger plan

Paddy also suggested that we think about putting a paper into Ubicomp (deadline 13th March) about our analysis of this data, but put a spin on it, e.g. what does periodicity mean? Can we predict events based on this? Can we infer some useful context, based simply on the structure of the data, without the need for advanced techniques? (I call this Urban Guerilla Sensing.)

We suggested that we might be able to do two applications based on the buddy finding part of the analysis (see mobile_agents and PhD the Story). The first, which Paddy dubbed F3 (Facebook Friend Finder), is where we encourage people to collect data for us, in return for detecting the presence of other Facebook users and suggesting friends based on frequency of co-location. The second was a similar application, but for regular visitors to research seminars.

I mentioned my vision of the next three points of reference: the first being a paper about the collection and analysis of this dataset, the second being another work which ties this into a simulator for the dataset, which synthesises this data into a generic set that can be used to test MANETs etc. The final thing (I didn’t get this far) being the final writeup of my PhD, which brings all of these ideas together.

Paddy likes this, and suggested the idea of Pattern Language (used to describe patterns in software engineering), which had recently been applied to Ubicomp environments to describe patterns in situations. Paddy thought that this might be particularly relevant to this, and that he would like to see some language of description emerge from our analysis. This sounds like a great idea. 🙂

Finally, Paddy spoke anbo