How to run LocationSim for multiple parameters of GCEH
In "Plotting all parameters to GCE-H in BubbleH simulation" I showed the results of running many GCEH parameter combinations in a single simulation run. Here are some more details on how that was done.
The process worked as follows:
- Generate edge list file with weights for run period
- Run GCEH script with multiple parameter values, on the edge list
- Generate community list files, and hierarchy files from each output of GCE for use by LocationSim
- Prepare configuration files for community ranking and simulation
- Generate community rankings for each community
- Run simulation for each community set
- Visualise output
1) Edge List
java -jar dtnsim.jar xml/graphs/edgelist.xml 4 DATASET=mit-nov-cheat
<?xml version="1.0"?>
<!-- <property name="DATASET" value="generated-comm" /> -->
<!-- messages take a constant delay to reach other nodes: 1000ms -->
<!-- this experiment calculates the connected time data for all nodes -->
<!-- <property name="aggregate-graph-task-rounding" value="8" /> -->
2) Run GCEH Script
mattstabeler@erdos:~/GCE/master$ php -f ~/LocationSim/scripts/clustering/batchGCE.php ~/LocationSim/OUTPUT/edgelist-graphs/mit-nov-cheat/edge_list.dat
<?php
// Must be run from the home directory of gce_serial.py
if (!file_exists("gce_serial.py")) {
    die("must be run from the home directory of gce_serial.py (/home/mattstabeler/GCE/master/)\r\n");
}

// Drop the script name, then collect the input files given on the command line
$script = array_shift($argv);
$infiles = array();
foreach ($argv as $infile) {
    if (file_exists($infile)) {
        $infiles[] = $infile;
    } else {
        throw new Exception("File does not exist: " . $infile);
    }
}

// Parameter values to sweep over
$datasets = array("mit-nov-cheat");
$ks = array(3, 4, 5);
$es = array(0.15, 0.25);
$sts = array(0.9, 0.5);
$maps = array(0.9, 0.5);
$zs = array(0.2);

//~ /LocationSim/OUTPUT/edgelist-graphs/mit-nov-cheat/edge_list.dat
if (!$infiles) {
    die("No files specified\r\n");
}

foreach ($infiles as $infile) {
    $infile = realpath($infile);
    $info = pathinfo($infile);
    $outfile = $info['dirname'] . "/" . $info['basename'] . "";
    // Command template: the ${...} placeholders are filled in for each combination
    $command = 'cat ' . $infile . ' | cut -d " " -f "1,2,3" | python gce_serial.py -t5 -k${K} similarityThreshold=${ST} minAppearanceProp=${MAP} -e${E} -z${Z} > ' . $outfile . '.gce_output_K-${K}_ST-${ST}_MAP-${MAP}_E-${E}_Z-${Z}.dat';
    foreach ($ks as $k) {
        foreach ($es as $e) {
            foreach ($sts as $st) {
                foreach ($maps as $map) {
                    foreach ($zs as $z) {
                        // $, { and } are escaped so the placeholders match literally
                        $run = preg_replace('/\$\{K\}/', $k, $command);
                        $run = preg_replace('/\$\{E\}/', $e, $run);
                        $run = preg_replace('/\$\{Z\}/', $z, $run);
                        $run = preg_replace('/\$\{MAP\}/', $map, $run);
                        $run = preg_replace('/\$\{ST\}/', $st, $run);
                        $data = passthru($run);
                        //~ `$run`;
                    }
                }
            }
        }
    }
}
?>
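As a quick sanity check on how many GCE runs the script above launches, the parameter grid can be enumerated separately. This is only a sketch in Python, not part of LocationSim: the parameter values and the output file naming are copied from batchGCE.php, while the function name `output_names` is made up for illustration.

```python
from itertools import product

# Parameter values copied from batchGCE.php.
KS = [3, 4, 5]
ES = [0.15, 0.25]
STS = [0.9, 0.5]
MAPS = [0.9, 0.5]
ZS = [0.2]


def output_names(edge_list="edge_list.dat"):
    """Return one output file name per parameter combination.

    Hypothetical helper: it only mirrors the naming scheme used by
    batchGCE.php and does not run GCE itself.
    """
    names = []
    # Same nesting order as the PHP loops: K, E, ST, MAP, Z.
    for k, e, st, map_, z in product(KS, ES, STS, MAPS, ZS):
        names.append(
            f"{edge_list}.gce_output_K-{k}_ST-{st}_MAP-{map_}_E-{e}_Z-{z}.dat"
        )
    return names


if __name__ == "__main__":
    names = output_names()
    print(len(names), "GCE runs per edge list")
    for name in names:
        print(name)
```

With the values above this gives 3 x 2 x 2 x 2 x 1 = 24 runs for each edge list file.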
3) Generate files for LocationSim
mattstabeler@erdos:~/LocationSim/scripts/transform$ php -f ConvertGCEHOutput.php ~/LocationSim/OUTPUT/edgelist-graphs/mit-nov-cheat/edge_list.dat.gce_output*.dat
This generates three files for each GCE output: a rewritten GCE output file with consecutively numbered community IDs from 0 to N (e.g. edge_list.dat.gce_output_K-3_ST-0.5_MAP-0.5_E-0.15_Z-0.2.dat.renumbered.dat), a LocationSim-compatible list of communities (e.g. edge_list.dat.gce_output_K-3_ST-0.5_MAP-0.5_E-0.15_Z-0.2.dat.communites.dat) and a JSON representation of the communities for use in LocationSim (e.g. edge_list.dat.gce_output_K-5_ST-0.9_MAP-0.9_E-0.25_Z-0.2.dat.json).
<?php
require "../objects/GCEHParser2.php";

// Drop the script name, then collect the input files given on the command line
$script = array_shift($argv);
$infiles = array();
foreach ($argv as $infile) {
    if (file_exists($infile)) {
        $infiles[] = $infile;
    } else {
        throw new Exception("File does not exist: " . $infile);
    }
}

$force_parent = false;
$use_prefixes = false;

foreach ($infiles as $infile) {
    $parser = new GCEHParser($infile, $use_prefixes, $force_parent);
    $info = pathinfo($infile);
    $out_file = $info['dirname'] . '/' . $info['basename'] . "";
    if ($force_parent) {
        $out_file = $out_file . ".global_parent";
    }
    // Write the three output files next to the input file
    file_put_contents($out_file . ".json", $parser->toJson());
    file_put_contents($out_file . '.communites.dat', $parser->toSimCommunityList());
    file_put_contents($out_file . '.renumbered.dat', $parser->toString());
}
?>
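Because the parameter values are encoded in every output file name, they can be recovered later, for example when labelling plots. The following Python sketch assumes only the naming scheme shown above; `parse_params` is a hypothetical helper and is not part of LocationSim.

```python
import re

# Pattern follows the file names written by batchGCE.php, e.g.
# edge_list.dat.gce_output_K-3_ST-0.5_MAP-0.5_E-0.15_Z-0.2.dat
NUMBER = r"\d+(?:\.\d+)?"
PARAM_RE = re.compile(
    rf"gce_output_K-(?P<K>\d+)_ST-(?P<ST>{NUMBER})_MAP-(?P<MAP>{NUMBER})"
    rf"_E-(?P<E>{NUMBER})_Z-(?P<Z>{NUMBER})\.dat"
)


def parse_params(filename):
    """Extract the GCEH parameters from an output file name (hypothetical helper)."""
    match = PARAM_RE.search(filename)
    if match is None:
        raise ValueError("not a GCE output file name: " + filename)
    groups = match.groupdict()
    # K is an integer; the other parameters are floats.
    params = {"K": int(groups["K"])}
    for name in ("ST", "MAP", "E", "Z"):
        params[name] = float(groups[name])
    return params


if __name__ == "__main__":
    example = (
        "edge_list.dat.gce_output_K-3_ST-0.5_MAP-0.5_E-0.15_Z-0.2"
        ".dat.renumbered.dat"
    )
    print(parse_params(example))
```

The same pattern also matches the .json and .communites.dat variants, since the extra suffix comes after the part the pattern looks for.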
Move all of the output files to /datasets/communities/{dataset name}/{community type}
e.g. /home/mattstabeler/LocationSim/datasets/communities/mit-nov-cheat/GCEH/
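The move can be done by hand, but a small script avoids mistakes when there are many output files. This is a rough Python sketch under assumptions: the destination layout is the one described above, while the function name `move_outputs`, its arguments, and the `*.gce_output_*` file pattern are illustrative choices rather than anything LocationSim provides.

```python
import glob
import os
import shutil


def move_outputs(src_dir, datasets_root, dataset, community_type):
    """Move GCE output files into datasets/communities/{dataset}/{type}.

    Hypothetical helper: datasets_root is the LocationSim "datasets"
    directory, and only files whose names contain ".gce_output_" are moved.
    """
    dest = os.path.join(datasets_root, "communities", dataset, community_type)
    os.makedirs(dest, exist_ok=True)
    moved = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*.gce_output_*"))):
        name = os.path.basename(path)
        shutil.move(path, os.path.join(dest, name))
        moved.append(name)
    return moved


if __name__ == "__main__":
    home = os.path.expanduser("~/LocationSim")
    moved = move_outputs(
        os.path.join(home, "OUTPUT/edgelist-graphs/mit-nov-cheat"),
        os.path.join(home, "datasets"),
        "mit-nov-cheat",
        "GCEH",
    )
    print("moved", len(moved), "files")
```

The original edge_list.dat is left in place because its name does not match the pattern.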
4) Config files
Datasets config:
~/LocationSim/xml/datasets/gceh-communities.xml
<?xml version="1.0" standalone="yes" ?>
<!-- this will make sure that we can use multiple values of K and it will load new communities each run -->
Community Generation Config
~/LocationSim/xml/bubbleH/centrality-k.xml
<?xml version="1.0"?>
<!-- This configuration creates the community ranking -->
<!-- Create Global rankings (global.dat) -->
<!-- Create local rankings (community.n.dat) -->
<!-- End repeat COMMUNITY_TYPE -->
<!-- End repeat DATASET -->
<!-- End repeat Z -->
<!-- End repeat MAP -->
<!-- End repeat ST -->
<!-- End repeat E -->
<!-- End repeat K -->
Simulation Config
~/LocationSim/xml/bubbleH/bubbleH-k.xml
<?xml version="1.0"?>
<!-- This task resets the counters used by DEGREE and CONNECTION type metrics, but this must mean it is not a sliding window, but a fixed window mechanism -->
5) Generate community rankings
mattstabeler@erdos:~/LocationSim$ java -jar dtnsim.jar xml/bubbleH/centrality-k.xml 10
6) Run simulation
mattstabeler@erdos:~/LocationSim$ java -jar dtnsim.jar xml/bubbleH/bubbleH-k.xml 10
During the run, the system used only 4 cores (out of 24) and 6.1GB of memory (out of 124GB) – and should not have affected other users considerably (hopefully!).
7) Visualise output
mattstabeler@erdos:~/LocationSim$ python scripts/plot/plot-barchart-special.py 2 2 OUTPUT/bubbleH/mit-nov-cheat-k-test/GCEH-*/bubbleH/0.all-pairs.dat