Research begins. I met Professor Brockmeyer and the graduate students in my lab: Jawwad, who is working with me on this project, Xinjie, Chunbo, and Anne-Marie. During this week, I primarily acquainted myself with the implementation and theory of the simulator as it stands.
This week, we decided to test and evaluate the search algorithm used to explore our network. In addition to the blind Breadth First Search already implemented, we decided to try a greedy version that gives priority to network edges with lower latency. Furthermore, we decided to test a normal and a greedy Depth First Search algorithm.
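As a rough illustration of the difference between the blind and greedy searches, here is a minimal sketch in Python. It is not the simulator's code: the graph layout (a dict mapping each node to a list of (neighbor, latency) pairs) and the names bfs_path and path_latency are assumptions made for this example, and "greedy" is interpreted here as simply expanding each node's edges in order of increasing latency.

```python
from collections import deque

def bfs_path(graph, src, dst, greedy=False):
    """Breadth-first search from src to dst.

    graph is a hypothetical layout: node -> list of (neighbor, latency).
    With greedy=True, each node's edges are tried in order of increasing
    latency; otherwise they are tried in the order they are stored.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        edges = graph.get(node, [])
        if greedy:
            edges = sorted(edges, key=lambda e: e[1])
        for neighbor, _latency in edges:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None  # dst is unreachable

def path_latency(graph, path):
    """Sum the latencies of the edges along a path."""
    return sum(dict(graph[a])[b] for a, b in zip(path, path[1:]))

graph = {
    "A": [("B", 5), ("C", 1)],
    "B": [("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
blind = bfs_path(graph, "A", "D")
greedy = bfs_path(graph, "A", "D", greedy=True)
```

Both searches return a path with the same number of hops, but the greedy variant pays for a sort at every node it expands, which is the kind of extra cost that has to be weighed against any latency it saves.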
In order to accurately rate each algorithm, we needed a way to run each one on the same topology, with the same messages being sent to the same locations. Previously, the senders and receivers had been randomly chosen at runtime. This week, we implemented an option for the simulator to read in a list of messages from a text file, maintaining consistency.
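A fixed message list might be read along the following lines. This is only a sketch: the file layout (one message per line as "time sender receiver", with "#" starting a comment) and the name parse_messages are assumptions for illustration, not the simulator's actual format.

```python
def parse_messages(lines):
    """Parse lines of the hypothetical form 'time sender receiver'.

    Blank lines and anything after a '#' are ignored. Returns a list of
    (time, sender, receiver) tuples in file order.
    """
    messages = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        time_str, sender, receiver = line.split()
        messages.append((float(time_str), int(sender), int(receiver)))
    return messages

sample = [
    "# time sender receiver",
    "0.5 1 2",
    "",
    "1.0 3 4  # second message",
]
messages = parse_messages(sample)
```

Since any iterable of lines works, an open file object can be passed in directly, and every algorithm under test can be fed the identical list.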
Finally, we ran the various algorithms, and ultimately came to the conclusion that the original BFS was the best option. In most cases, the optimized version returned a lower average latency per message. However, the improvement was insufficient to warrant the additional time required.
This was a very interesting week on the home front for Jawwad, whose firstborn son Mahad arrived on Wednesday. Father, mother, and baby are doing fine, with both grandmothers in town to help out.
Having trouble remembering what else I did this week...
During this week, I began implementation of a feature that would enable the simulator to cause a particular edge to fail, invalidating all channels that had previously used that edge and precluding that edge from being used again. We also began creating a means for the simulator to dynamically fail nodes, and to recover these edges and nodes at a later time.
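The edge-failure behavior described above can be sketched roughly as follows. The Network class, its method names, and the representation of a channel as a list of undirected edges are all assumptions made for this example rather than the simulator's actual design.

```python
class Network:
    """Toy model: tracks failed edges and the channels that use each edge."""

    def __init__(self):
        self.failed_edges = set()
        self.channels = {}  # channel id -> list of edges along its path

    @staticmethod
    def edge(u, v):
        # Undirected edge, so (u, v) and (v, u) compare equal.
        return frozenset((u, v))

    def open_channel(self, channel_id, path):
        """Open a channel along path unless it would cross a failed edge."""
        edges = [self.edge(a, b) for a, b in zip(path, path[1:])]
        if any(e in self.failed_edges for e in edges):
            return False
        self.channels[channel_id] = edges
        return True

    def fail_edge(self, u, v):
        """Mark an edge failed and invalidate every channel that used it."""
        failed = self.edge(u, v)
        self.failed_edges.add(failed)
        broken = [cid for cid, edges in self.channels.items() if failed in edges]
        for cid in broken:
            del self.channels[cid]
        return broken

    def recover_edge(self, u, v):
        """Make a previously failed edge usable again."""
        self.failed_edges.discard(self.edge(u, v))

net = Network()
net.open_channel(1, [1, 2, 3])
net.open_channel(2, [4, 5])
broken = net.fail_edge(3, 2)
```

Node failure could be layered on top of this by failing every edge that touches the node, though that is only one possible approach.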
Ultimately, our goal is to test the efficiency of this network-level implementation of failure relative to the original model, which relied upon a "dice roll" based on the probability of failure of the channel as a whole or of either of its end nodes.
Most of this week was spent working on the implementation of the new failure paradigm discussed above. We also improved the user interface by moving user-declarable options from definitions embedded in the code to text files parsed by the program.
This week, we developed a mechanism to automatically generate the failures of nodes and edges by the program in an edge-level analysis. Prior to the start of the actual simulation of sending messages back and forth, the simulator will calculate how many nodes and edges need to be failed at each point in time, randomly choose them, and produce time-triggered events of failure and recovery.
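A pre-computed schedule of this kind might look something like the sketch below. The function name build_failure_schedule, the fixed downtime, and the event tuples are assumptions for illustration; the simulator's real event format is not shown in this journal.

```python
import random

def build_failure_schedule(components, failures_per_step, downtime, seed=0):
    """Build time-triggered failure and recovery events before the run.

    components:        node and edge identifiers (hypothetical labels)
    failures_per_step: how many components should fail at time 0, 1, 2, ...
    downtime:          assumed fixed delay between failure and recovery
    Returns a list of (time, kind, component) tuples sorted by time.
    """
    rng = random.Random(seed)
    up_again = {}  # component -> time at which it is back in service
    events = []
    for t, count in enumerate(failures_per_step):
        # Only components that are currently up can be chosen to fail.
        available = [c for c in components if up_again.get(c, 0) <= t]
        for c in rng.sample(available, min(count, len(available))):
            up_again[c] = t + downtime
            events.append((t, "fail", c))
            events.append((t + downtime, "recover", c))
    events.sort(key=lambda e: e[0])
    return events

events = build_failure_schedule(["n1", "n2", "e1", "e2"], [1, 2, 0], downtime=2, seed=1)
```

Because the whole schedule is built up front, the list of per-step counts must be finite, so the simulation needs a known end time.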
This week was primarily spent working on the improvement of the failure model. We decided that, instead of allowing the user to select upper and lower bounds for the failure rate and randomly selecting a value from within that range for each point in time, the user would input the average failure rate, and the rate at a given time would be generated using a Poisson distribution.
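Drawing a per-step failure count from a Poisson distribution with a user-supplied mean can be sketched as below. This uses Knuth's multiplication method with the standard library only; the function name poisson_sample is an assumption, and the simulator may well generate its values differently.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count using Knuth's method.

    Multiplies uniform random numbers together until the product drops
    to exp(-mean) or below; the number of multiplications before that
    point is the sample. Suitable for small means such as failure rates.
    """
    threshold = math.exp(-mean)
    k, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return k
        k += 1

rng = random.Random(42)
# Hypothetical usage: number of failures at each of 20000 time steps,
# given an average of 3 failures per step.
samples = [poisson_sample(3.0, rng) for _ in range(20000)]
average = sum(samples) / len(samples)
```

The user supplies only the mean, and the individual counts vary around it from step to step.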
During this week, we entered the final stretch. We decided to change the manner in which failure and recovery messages are generated. Rather than determine them ahead of time, we decided to produce them over the course of the simulation. This allows us to let the simulation run indefinitely, as the previous option required a set end to the predetermined failures. Also, this allows us to "skip over" the long stretches of time between meaningful (not failure or recovery) messages, and not go through the calculation of failures that, due to the length of time prior to recovery, are extremely unlikely to affect them.
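One way to generate failures during the run instead of beforehand is to keep only the next failure in the event queue and schedule its successor when it fires. The sketch below illustrates that idea only; the exponential gaps between failures, the fixed downtime, and every name in it are assumptions, and it does not attempt the "skip over" optimization.

```python
import heapq
import itertools
import random

def simulate(message_times, mean_gap, downtime, components, seed=0):
    """Event loop in which each failure schedules the next one lazily.

    Runs until every message has been processed, so no end time for the
    failures has to be chosen in advance. Returns the processed events
    as (time, kind, component) tuples.
    """
    rng = random.Random(seed)
    tie = itertools.count()  # tie-breaker so equal times never compare payloads
    queue = [(t, next(tie), "message", None) for t in message_times]
    heapq.heapify(queue)

    def push_next_failure(now):
        when = now + rng.expovariate(1.0 / mean_gap)
        heapq.heappush(queue, (when, next(tie), "fail", rng.choice(components)))

    push_next_failure(0.0)  # only the first failure exists at the start
    pending = len(message_times)
    log = []
    while pending:
        time, _, kind, component = heapq.heappop(queue)
        log.append((time, kind, component))
        if kind == "message":
            pending -= 1
        elif kind == "fail":
            heapq.heappush(queue, (time + downtime, next(tie), "recover", component))
            push_next_failure(time)
    return log

log = simulate([5.0, 12.0, 40.0], mean_gap=3.0, downtime=2.0, components=["n1", "e1"])
```

The queue never holds more than one future failure at a time, which is what removes the need for a predetermined end.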
We also attempted to run some experiments on the different paradigm options, to compare their time efficiency, memory usage, and faithfulness to real-world data. However, our efforts were stymied by the mysterious appearance of a segmentation fault between a pair of "cout" statements, indicating a deeper flaw in our program but offering no clues to its solution.
Time to leave. These few days were spent primarily on the scholarly tasks of modifying the documentation for our program and beginning the final paper. Despite the last-minute setback of a code error, this project has been a rewarding and valuable experience. I plan to stay in contact with Professor Brockmeyer and Jawwad, and hopefully we can get the problem solved and obtain useful results in the months ahead. It has been absolutely wonderful working in this lab, and I wish them all the best.