[gridengine users] Running with clients and servers under SGE
William Hay
w.hay at ucl.ac.uk
Wed Jun 6 15:47:19 UTC 2012
On 28 May 2012 19:10, Earl Lazarus <earl.lazarus at gmail.com> wrote:
> I'll try to be as succinct as possible:
>
> 1) We have developed a CPU intensive simulation that is to be run on an SGE
> cluster.
> 2) Each simulation is a client that also requires an "environment" server to
> be running on that host. The environment server
> is associated with the physical environment (a location on the earth and
> a month). The client treats the
> server as a function call, making a query and waiting for a response, so the
> CPU impact of the server is minimal.
> The environment servers would generally be started before the main
> simulations are submitted to SGE and would be
> left running after each of the main simulations end. The servers can
> communicate with multiple clients needing the same
> environment representation. I might envision the servers running
> continuously for a week while the user submits hundreds of
> the Monte Carlo simulations, each taking 15 min of wall clock time. When
> the user is finished with his "study", he shuts down
> the servers.
> 2) Currently each host has a slot count equal to the number of CPUs (4).
> 3) There are other simulations already running under SGE on this same
> cluster; i.e. there are lots of other users.
> 4) At the moment I have 4 flavors of server, each representing a different
> physical environment. To get them up
> on each host will take 4 slots. If the host has only 4 CPUs, then it is
> saturated and no clients can run.
> 5) If I up the slot count to 8, then I can have 4 clients and 4 servers
> running. But the side effect is that if
> I am not running MY software, then SGE can feed that host 8 CPU intensive
> simulations belonging to
> someone else, thus oversubscribing the 4 CPUs by a factor of 2.
> 6) If only "slots" came in "flavors", then slots 5 thru 8 could only be used
> by my servers and no one else.
They do, more or less. Here's one way to do it:
Use one queue with 4 slots for the servers and another, also with
4 slots, for everything else. Specify -q normal (or whatever) in
the global sge_request file.
When submitting a "server" job, just specify -q server to override the
sge_request file.
If this is just for you, you could add an ACL/userset on the server
queue. If you are after a more general
facility, then possibly just set h_cpu low on the server queue so that
anyone trying to run real jobs on it
gets them killed real quick.
The above may not work if you already have a complex queue setup for
other reasons.
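
A rough sketch of the above as qconf/qsub commands, in case it helps.
The queue names (server.q, normal.q), the userset name (envservers),
the user name (earl), the h_cpu value and the job script name are all
made up for illustration, and both queues are assumed to exist already
(e.g. created with qconf -aq). The script only prints the commands
unless DRY_RUN=0 is set, so check them against your own setup first.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually apply them (requires SGE manager rights).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical queue names; both queues are assumed to exist already.
SERVER_Q=server.q
NORMAL_Q=normal.q

# Four slots on each queue: 4 for the servers, 4 for everything else.
run qconf -mattr queue slots 4 "$SERVER_Q"
run qconf -mattr queue slots 4 "$NORMAL_Q"

# Option A (just for one user): restrict the server queue with a userset.
# "earl" and "envservers" are placeholder names.
run qconf -au earl envservers
run qconf -mattr queue user_lists envservers "$SERVER_Q"

# Option B (general facility): a low CPU-time limit on the server queue,
# so CPU-heavy jobs submitted there are killed quickly.
# The one-minute value is an arbitrary example.
run qconf -mattr queue h_cpu 0:1:0 "$SERVER_Q"

# Default queue for everyone: add the following line to the global
# request file, $SGE_ROOT/$SGE_CELL/common/sge_request:
#   -q normal.q

# A server job overrides that default on the command line.
# start_env_server.sh is a placeholder job script.
run qsub -q "$SERVER_Q" start_env_server.sh
```

Options A and B are alternatives; you would normally pick one.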
William
>
> Any ideas?
> 1) I could ensure that I only run 2 flavors of server on a 4 CPU host,
> leaving 2 slots for CPU intensive simulations
> to be run (either mine or those of other users). But then I'm cutting
> down the throughput since the servers
> use such a small amount of CPU resources.