[gridengine users] I need a decoder ring for the qacct output

Mun Johl mun.johl at kazan-networks.com
Thu Apr 25 16:16:27 UTC 2019


Hi Reuti and Skylar,

Sorry for misspelling your name last time, Skylar.

On Thu, Apr 25, 2019 at 08:56 AM PDT, Reuti wrote:
> > On 25.04.2019 at 17:41, Mun Johl <mun.johl at kazan-networks.com> wrote:
> >
> > Hi Skyler, Reuti,
> >
> > Thank you for your reply.
> > Please see my comments below.
> >
> > On Thu, Apr 25, 2019 at 08:03 AM PDT, Reuti wrote:
> >> Hi,
> >>
> >>> On 25.04.2019 at 16:53, Mun Johl <mun.johl at kazan-networks.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I'm using 'qacct -P' in the hope of tracking metrics on a per project
> >>> basis.  I am getting data out of qacct, however I don't fully comprehend
> >>> what the data is trying to tell me.
> >>>
> >>> I've searched the man pages and web for definitions of the output of
> >>> qacct, but I have not been able to find a complete reference (just bits
> >>> and pieces here and there).
> >>>
> >>> Can anyone point me to a complete reference so that I can better
> >>> understand the output of qacct?
> >>
> >> There is a man page about it:
> >>
> >> man accounting
> >
> > Well, I _did_ look at that prior to posting but I guess I just didn't
> > see the keywords I was looking for.  So maybe I'll just ask the specific
> > questions regarding my confusion.
> >
> > WALLCLOCK is pretty well defined by ru_wallclock.  So that's basically
> > the total wall clock time the job was on the execution host.
> >
> > UTIME is user time used.
> > STIME is system time used.
> >
> > Should (UTIME + STIME) >= WALLCLOCK?  It isn't in my case and is mainly
> > why I am confused.  Or perhaps process wait time is not included?
> 
> You mean in the case of a parallel application? Did you set "accounting_summary" to "true" and get only a single record back?
> 
> This depends on how the OS acquires the used CPU time (and on whether all created processes are taken into account, even if they leave the process tree [as with `setsid`]). The CPU time that SGE collects via the additional group ID is more reliable.

Actually, we aren't running a parallel application yet.
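
For a single, non-parallel process, one likely reason UTIME + STIME stays
below WALLCLOCK is that time spent sleeping or waiting on I/O counts toward
wall clock but not toward CPU time.  A minimal Python sketch of that effect
follows.  It is not SGE-specific, and the helper name measure_sleep is made
up for illustration:

```python
import time


def measure_sleep(seconds):
    """Return (wallclock, cpu) in seconds consumed while sleeping."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    time.sleep(seconds)              # waiting: wall clock advances, CPU barely does
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


wall, cpu = measure_sleep(0.2)
print(f"wallclock={wall:.3f}s  utime+stime={cpu:.3f}s")
```

The same gap should show up in accounting for any job that spends part of
its run blocked rather than computing.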

I think the answers you two have provided resolve my confusion.  I
mainly just need to know the wallclock time spent per project.
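
'qacct -P' already reports WALLCLOCK per project, but if per-job records
ever need post-processing, a rough sketch could look like the following.
It assumes records in the "key value" layout that 'qacct -j' prints, with
records separated by lines of "=" characters and ru_wallclock given in
seconds (possibly with a trailing "s", depending on the version).  The
function name and the sample data are made up for illustration:

```python
from collections import defaultdict


def wallclock_per_project(qacct_text):
    """Sum ru_wallclock (seconds) by project from `qacct -j`-style text."""
    totals = defaultdict(float)
    record = {}
    # The extra "====" acts as a sentinel so the last record is flushed too.
    for line in qacct_text.splitlines() + ["===="]:
        if line.startswith("===="):
            if "project" in record and "ru_wallclock" in record:
                # Assumption: value is plain seconds, optionally suffixed with "s".
                seconds = float(record["ru_wallclock"].rstrip("s"))
                totals[record["project"]] += seconds
            record = {}
            continue
        parts = line.split(None, 1)  # split "key   value" on first whitespace run
        if len(parts) == 2:
            record[parts[0]] = parts[1].strip()
    return dict(totals)


# Made-up sample records, trimmed to the fields used here.
sample = """\
==============================================================
project      alpha
ru_wallclock 120
ru_utime     10.5
==============================================================
project      beta
ru_wallclock 30s
==============================================================
project      alpha
ru_wallclock 60
"""

print(wallclock_per_project(sample))  # {'alpha': 180.0, 'beta': 30.0}
```

Real input could come from something like 'qacct -j > jobs.txt'; any
other time format in ru_wallclock would need its own parsing.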

Thank you again for your informative and quick replies.

Regards,

-- 
Mun


