[gridengine users] Collecting of scheduler job information is turned off

Joseph Farran jfarran at uci.edu
Fri Jun 8 20:13:12 UTC 2012


Hey PK - I know you! :-)

Thanks, everyone, for your helpful information. I am now getting correct and useful information about why a job will not start.
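
For anyone who finds this thread in the archives, here is a minimal sketch of the checks discussed below. The helper name sched_info_disabled is hypothetical and not part of Grid Engine. The qstat, qconf and qalter invocations in the comments are standard Grid Engine commands, but they assume a reasonably recent 6.x-style installation, so check the man pages on your own cluster first.

```shell
# Hypothetical helper (not part of Grid Engine): reads `qstat -j` output on
# stdin and succeeds if it reports that scheduler job info is turned off.
sched_info_disabled() {
    # Join lines first, because the message can end up wrapped across two lines.
    tr '\n' ' ' | grep -q 'Collecting of scheduler job information is *turned off'
}

# Typical use on a live cluster (job id 63 is the one from this thread):
#   qstat -j 63 | sched_info_disabled && echo "scheduler job info is off"
#   qconf -ssconf | grep schedd_job_info   # show the current setting
#   qconf -msconf                          # open the scheduler config in an editor
#                                          # and set schedd_job_info to true
#   qalter -w p 63                         # the "poke" check suggested by Prakashan
```

The function only inspects text, so it can be tried on saved qstat output without touching the scheduler configuration.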

Best,
Joseph

On 06/08/2012 12:19 PM, Prakashan Korambath wrote:
> I have often found the output of "qalter -w p <jobid>" helpful in diagnosing such problems. Try "man qalter" and search for "poke" to get details.
>
> Prakashan
>
>
> On 06/08/2012 12:10 PM, Rayson Ho wrote:
>> This time it is for performance reasons... in fact, I was at a site that
>> was experiencing performance issues & the qmaster was using more
>> memory than they would have liked (their qmaster machine has many other
>> services running on it). So they also turned scheduler info off (the
>> default was on at that time - and yes, they were perfectly fine with
>> running a very old version of Grid Engine).
>>
>> To turn it back on, set schedd_job_info to TRUE:
>>
>> http://gridscheduler.sourceforge.net/htmlman/htmlman5/sched_conf.html
>>
>> Rayson
>>
>>
>>
>> On Fri, Jun 8, 2012 at 3:02 PM, Joseph Farran <jfarran at uci.edu> wrote:
>>> Me again :-)
>>>
>>> The Queue access list by Linux groups ( /etc/group ) is working perfectly!
>>>
>>> I submitted a test job to the bio queue from an account that has bio group
>>> ownership and the job runs. When I submit a test job to the bio queue
>>> from an account that does *not* belong to the bio linux group, the job does
>>> *not* start - as it should not.
>>>
>>> All works exactly as it should. However, for the job that does not start,
>>> when I try to find out why it is not running, expecting something like
>>> "you are not authorized to run in this queue" (the bio queue), it
>>> does not say anything.
>>>
>>> Doing "qstat -j" on the not running job says (at the bottom):
>>>
>>>     (Collecting of scheduler job information is turned off)
>>>
>>> I Googled this error, and it is not straightforward what I need to turn on
>>> to get job scheduler information. Is this why OGE does not give any
>>> meaningful information as to why the job is not running?
>>>
>>> Here is the output of qstat:
>>>
>>> $ qstat -j 63
>>> ==============================================================
>>> job_number:                 63
>>> exec_file:                  job_scripts/63
>>> submission_time:            Fri Jun  8 11:45:02 2012
>>> owner:                      test
>>> uid:                        509
>>> group:                      users
>>> gid:                        100
>>> sge_o_home:                 /data/users/test
>>> sge_o_log_name:             test
>>> sge_o_path:                 <long path removed>
>>> sge_o_shell:                /bin/bash
>>> sge_o_workdir:              /data/users/test/oge
>>> sge_o_host:                 hpc
>>> account:                    sge
>>> cwd:                        /data/users/test/oge
>>> mail_list:                  test at login-1-1.local
>>> notify:                     FALSE
>>> job_name:                   Test
>>> jobshare:                   0
>>> hard_queue_list:            bio
>>> shell_list:                 NONE:/bin/bash
>>> env_list:
>>> script_file:                job.sh
>>> scheduling info:            (Collecting of scheduler job information is turned off)
>>>
>>> _______________________________________________
>>> users mailing list
>>> users at gridengine.org
>>> https://gridengine.org/mailman/listinfo/users
>>
>
>


