[gridengine users] Maximum memory for running process?
Daniel Gruber
dgruber at univa.com
Tue Aug 9 06:28:52 UTC 2011
On 08.08.2011 at 18:41, William Deegan wrote:
> On 8/6/2011 12:59 AM, Daniel Gruber wrote:
>> On 03.08.2011 at 10:28, William Hay wrote:
>>
>>> On 2 August 2011 17:58, Rayson Ho <rayrayson at gmail.com> wrote:
>>>> It's a bug introduced by another bug fix in SGE 6.2u5, and Oracle was
>>>> the first to fix the bug in Oracle Grid Engine. Then we added a
>>>> workaround in SGE 6.2u5p1 in Open Grid Scheduler, and Son of Grid
>>>> Engine copied it. I think Univa also fixed the bug at some point, as
>>>> the fix was copied by Son of Grid Engine (and dropped the workaround).
>>>> OGS will just stick with the workaround as we don't like the
>>>> workaround or the fix...
>>>>
>>>> You will just need to upgrade your SGE 6.2u5 cluster with a patched
>>>> SGE execd - either compile execd yourself or in fact you can get it
>>>> from the hwloc drop-in upgrade package:
>>>>
>>>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>>>
>>> hwloc looks rather interesting. Do your integrations work with other
>>> versions of Grid Engine (we're at 6.2u3)?
>> Just for completeness: Univa Grid Engine 8.0.1 is going to support
>> hwloc as well.
>
> Are those changes already in the https://github.com/gridengine/gridengine git repo?
>
Not yet.
Daniel
> Thanks,
> Bill
>
>>
>> Cheers,
>>
>> Daniel
>>
>>> From poking around on the hwloc pages it appears to support cgroups
>>> which can do a lot more than just bind cpus and memory.
>>> Presumably if one had a cgroup based system one could just extend the
>>> hwloc created cgroup with the required additional features.
>>>
>>> William
>>>
>>>
>>>
>>>> Rayson
>>>>
>>>>
>>>> On Tue, Aug 2, 2011 at 8:15 AM, Jesse Becker <beckerjes at mail.nih.gov> wrote:
>>>>> On Mon, Aug 01, 2011 at 07:41:41PM -0400, William Deegan wrote:
>>>>>> Should the maxvmem column in the accounting file be the true max memory
>>>>>> footprint of the running process? (and children?)
>>>>> I've seen problems with 6.2u5 in the accounting records. It appears to
>>>>> "wrap" at 4GB, which probably indicates a 32/64 bit issue. I think
>>>>> there's information about it in the mailing list.
>>>>>
>>>>> I'm not sure about child processes.
>>>>>
>>>>> --
>>>>> Jesse Becker
>>>>> NHGRI Linux support (Digicon Contractor)
>>>>>
>
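For anyone reading the archive later, the 4GB "wrap" Jesse describes in the quoted text is what you would expect if a byte count were stored in an unsigned 32-bit field somewhere along the way. The following is only a rough sketch of that arithmetic, not code from the SGE execd; the function name and the sample values are made up for illustration.

```python
# Illustration only: this is NOT code from SGE's execd.
# It shows how a byte count "wraps" if it passes through an
# unsigned 32-bit field, which is the kind of 32/64 bit issue
# suspected in the thread.

UINT32_MODULUS = 2 ** 32  # 4 GiB when the unit is bytes
GIB = 1024 ** 3


def wrap_to_uint32(n_bytes: int) -> int:
    """Return the value that survives storage in an unsigned 32-bit field."""
    return n_bytes % UINT32_MODULUS


# Hypothetical job whose true peak virtual memory was 5 GiB.
true_maxvmem = 5 * GIB
recorded_maxvmem = wrap_to_uint32(true_maxvmem)

print(recorded_maxvmem // GIB)  # 1
```

So under that assumption a job that really peaked at 5 GiB would show up with a maxvmem of about 1 GiB, while anything below 4 GiB would be recorded correctly.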