[gridengine users] I need a decoder ring for the qacct output

Reuti reuti at staff.uni-marburg.de
Thu Apr 25 15:56:27 UTC 2019


> On 25.04.2019 at 17:41, Mun Johl <mun.johl at kazan-networks.com> wrote:
> 
> Hi Skyler, Reuti,
> 
> Thank you for your reply.
> Please see my comments below.
> 
> On Thu, Apr 25, 2019 at 08:03 AM PDT, Reuti wrote:
>> Hi,
>> 
>>> On 25.04.2019 at 16:53, Mun Johl <mun.johl at kazan-networks.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I'm using 'qacct -P' in the hope of tracking metrics on a per project
>>> basis.  I am getting data out of qacct, however I don't fully comprehend
>>> what the data is trying to tell me.
>>> 
>>> I've searched the man pages and web for definitions of the output of
>>> qacct, but I have not been able to find a complete reference (just bits
>>> and pieces here and there).
>>> 
>>> Can anyone point me to a complete reference so that I can better
>>> understand the output of qacct?
>> 
>> There is a man page about it:
>> 
>> man accounting
> 
> Well, I _did_ look at that prior to posting but I guess I just didn't
> see the keywords I was looking for.  So maybe I'll just ask the specific
> questions regarding my confusion.
> 
> WALLCLOCK is pretty well defined by ru_wallclock.  So that's basically
> the total wall clock time the job was on the execution host.
> 
> UTIME is user time used.
> STIME is system time used.
> 
> Should (UTIME + STIME) >= WALLCLOCK?  It isn't in my case and is mainly
> why I am confused.  Or perhaps process wait time is not included?

Do you mean in the case of a parallel application? Did you set "accounting_summary" to "true", so that you get only a single record back?

Whether the sum matches depends on how the OS acquires the used CPU time, and on whether all created processes are taken into account, even those that jump out of the process tree (as with `setsid`). The CPU time that SGE itself collects via the additional group ID is more reliable.
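To see the difference for a single job, one can put the rusage-based sum next to the value SGE collects itself. The following is only a rough Python sketch: it assumes the record was produced by `qacct -j <job_id>` as plain "key value" lines, that the SGE-collected CPU time is the `cpu` field, and that some versions append an "s" to time values. The sample record and the helper names are made up for illustration. Note that a job which mostly waits (on I/O, a license, sleep) can legitimately show utime + stime well below the wallclock time.

```python
def parse_qacct_record(text):
    """Parse 'key value' lines of one qacct -j style record into a dict."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Skip separator lines ("=====") and lines without a value.
        if len(parts) == 2 and not parts[0].startswith("="):
            record[parts[0]] = parts[1].strip()
    return record


def seconds(value):
    """Convert '120', '95.5' or '120s' to float seconds (suffix is an assumption)."""
    return float(value.rstrip("s"))


def cpu_vs_wallclock(record):
    """Compare rusage CPU time (utime + stime) with wallclock and SGE's cpu value."""
    rusage_cpu = seconds(record["ru_utime"]) + seconds(record["ru_stime"])
    wallclock = seconds(record["ru_wallclock"])
    return {
        "rusage_cpu": rusage_cpu,
        "sge_cpu": seconds(record["cpu"]),
        "wallclock": wallclock,
        # Ratio below 1.0: the job spent part of its wallclock time not on a CPU,
        # or some child processes were not counted in the rusage values.
        "rusage_ratio": rusage_cpu / wallclock if wallclock else None,
    }


# Hypothetical record, not real accounting data.
SAMPLE = """\
==============================================================
ru_wallclock 120
ru_utime     95.5
ru_stime     2.5
cpu          230.0
"""

summary = cpu_vs_wallclock(parse_qacct_record(SAMPLE))
print(summary)
```

If `sge_cpu` is much larger than `rusage_cpu`, that would point at processes missing from the rusage numbers rather than at idle time.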

-- Reuti

