[gridengine users] Handling job dependences
reuti at staff.uni-marburg.de
Fri May 6 14:47:39 UTC 2011
On 06.05.2011 at 16:23, jensli at lavabit.com wrote:
> This is a problem we haven't figured out how to solve in a good way.
> We are going to use GE to run regression tests on servers that run our
> software. The test cases are distributed on a pool of servers that are
> used only for the regressions, one test case is one GE job, one queue
> instance on each server.
> The regression suites run against servers that have been configured in
> different ways. Say that there are two different configurations, Conf A and
> Conf B, and that there are 100 test cases in suite A that must be run on a
> server configured with Conf A and 50 cases in suite B on a server with
> Conf B. The servers have to be initially configured at least once at the
> start of a regression, and can then be reconfigured to run cases from another
> test suite.
By servers you mean exechosts here, right?
So the workflow would be:
- Regression B is ready and eligible to be scheduled, but we would need to reconfigure the exechosts first before it really starts?
Will there always be only one job per exechost, so that it is safe to reconfigure it? What needs to be reconfigured in detail?
One solution could be to run 3 VMs per exechost (so that each exechost appears three times in SGE), with only one VM at a time allowed to receive jobs?
> The problem is that it takes some time to reconfigure the test servers, so
> we want to avoid doing that as much as possible.
> What we would like is to have GE figure out when to reconfigure the
> servers, so that servers are automatically configured when there are
> pending jobs but not unnecessarily reconfigured.
> A simple, static, solution would be to, before the regression starts,
> configure 10 servers with Conf A, and 5 with Conf B. Then add a resource
> to the queue instances that corresponds to the configurations of that
> execution host. This is inflexible, but we really haven't come up with a
> better solution.
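The static setup described above can be sketched as follows. This is only an illustration under assumptions: it presumes a requestable string complex, here hypothetically named `conf`, has already been defined by the administrator (e.g. with `qconf -mc`), and it attaches the value at the exechost level rather than to the queue instance, which should be equivalent as long as there is one queue instance per host. The helper names and host names are made up; the functions only build the command lines and do not run them.

```python
def qconf_set_conf(host, conf, complex_name="conf"):
    """Build the qconf call that tags an execution host with its configuration.

    `complex_name` is a hypothetical complex that must already exist.
    """
    return ["qconf", "-mattr", "exechost", "complex_values",
            f"{complex_name}={conf}", host]


def qsub_for_conf(script, conf, complex_name="conf"):
    """Build the qsub call for a test case that needs a given configuration."""
    return ["qsub", "-l", f"{complex_name}={conf}", script]


if __name__ == "__main__":
    # Hypothetical pool: 10 hosts with Conf A, 5 hosts with Conf B.
    hosts = [f"node{i:02d}" for i in range(1, 16)]
    for host in hosts[:10]:
        print(" ".join(qconf_set_conf(host, "A")))
    for host in hosts[10:]:
        print(" ".join(qconf_set_conf(host, "B")))
    # One test case per job, requesting the matching configuration.
    print(" ".join(qsub_for_conf("case_001.sh", "A")))
```

With this in place the scheduler only dispatches a job to a host whose `conf` value matches the request, but the 10/5 split stays fixed until someone changes it by hand.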
> Other ideas:
> * Could 'Reconfigure' be made into a GE job that changes the resource
> setup for the queue on the execution host where it runs? Could GE be set up
> to run those jobs at an appropriate time?
> * Or could we use job dependencies in some way?
> * Maybe we have to write our own submission wrapper script, to examine the
> queue and the available execution hosts and decide when to reconfigure
> servers and change resources?
> * Could we abuse JSVs for this in some way?
> Hm. I hope the description is understandable.
> This is actually one of the classic planning problems of AI. Can GE be
> made to solve it?
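For the wrapper-script idea from the list above, the decision "when to reconfigure" could be reduced to a simple greedy rule that is evaluated whenever a host becomes idle. The sketch below is an assumption, not something GE provides: the function name and its inputs are hypothetical, and how the pending counts are obtained (for instance by parsing qstat output) is left out. A host keeps its configuration while jobs for it are still pending; otherwise it switches to the configuration with the most pending jobs per host already assigned.

```python
def next_conf_for_idle_host(current_conf, pending, hosts_per_conf):
    """Pick the configuration an idle host should have next.

    current_conf:   configuration the host has now, or None if unconfigured
    pending:        dict mapping configuration name -> pending test cases
    hosts_per_conf: dict mapping configuration name -> hosts already set up
    """
    # Avoid a costly reconfiguration while matching work is still queued.
    if current_conf is not None and pending.get(current_conf, 0) > 0:
        return current_conf

    candidates = {conf: n for conf, n in pending.items() if n > 0}
    if not candidates:
        # Nothing is waiting, so leave the host as it is.
        return current_conf

    def load(conf):
        # Pending jobs per host if this host joined the configuration.
        return candidates[conf] / (hosts_per_conf.get(conf, 0) + 1)

    # sorted() makes ties deterministic: the alphabetically first name wins.
    return max(sorted(candidates), key=load)


if __name__ == "__main__":
    # Suite A is finished, 50 cases of suite B are waiting.
    print(next_conf_for_idle_host("A", {"A": 0, "B": 50}, {"A": 10, "B": 5}))
```

A rule like this never reconfigures a host that still has suitable work, which addresses the main cost mentioned above, but it is a heuristic and makes no claim to find an optimal plan.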