My next task is to create a scheduling simulator, and a scheduler, to further test my hypothesis.
I have been playing with Alea and GridSim. I will upload the papers to this post, but you can also search for both of these research projects.
1. GridSim is an established project that is used in the community. Its creator is Rajkumar Buyya, a very well-known researcher and author in this field (http://www.buyya.com/gridsim).
2. Alea, though neither well known nor an established project, presents a great idea for how to simulate a scheduler and large grid environments (http://www.fi.muni.cz/~xklusac/alea/).
Alea, however, was not complete, and I ended up rewriting the scheduler module, a fair-share scheduler, and a job loader. I am not sure whether I would like to open source the project, or my portions of it, as together they make a great tool for simulating large grid environments and may be commercialized.
The greatest challenge was to create a job loader that can generate a task profile similar to max{0, a*sin(bx)}. This pattern of task submission is very common in HPC environments. In a typical scenario, a client submits a certain number of tasks, waits for some responses to come back or creates another set of tasks, and repeats. This start-stop fashion of task submission is where the greatest ability to game the system comes from. For example, for one set of submissions a = 1000 (that is, 1000 tasks are submitted), but for a subsequent set a = 5000. This fluctuation, although not modeled in this version of the scheduling simulator, can present a great challenge to a scheduler aiming to distribute resources fairly.
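To make the max{0, a*sin(bx)} profile concrete, here is a minimal sketch in Python. It is not the job loader I wrote for Alea. The function name `submission_profile`, the discrete time steps, the rounding to whole tasks, and the parameter values are all assumptions chosen for illustration.

```python
import math


def submission_profile(a, b, steps):
    """Number of tasks submitted at each time step x = 0 .. steps-1.

    Follows max(0, a*sin(b*x)): the positive half of the sine wave is a
    burst of submissions, the negative half is clipped to zero (the
    client is idle, waiting for responses).
    """
    # Rounding to whole tasks is an assumption of this sketch.
    return [max(0, round(a * math.sin(b * x))) for x in range(steps)]


# Hypothetical parameters: a peak of 1000 tasks and a period of 8 steps.
profile = submission_profile(a=1000, b=2 * math.pi / 8, steps=8)
print(profile)
```

With these values the first half of the period carries the whole burst and the second half is empty, which is the start-stop behavior described above. Calling the function again with a = 5000 gives the larger subsequent burst.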
Art Sedighi