[gridengine users] Qmaster Failing

Dave Love d.love at liverpool.ac.uk
Tue May 3 17:09:20 UTC 2011

"Murphy, Brian (E IT F 45)" <brian.murphy at siemens.com> writes:

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x47df6940 (LWP 2193)]
> 0x000000000055c698 in lCopySwitchPack ()
> (gdb) 
> which I guess we already knew.

The tight integration bug can segv in different places, but odds are
that's the problem.

> I'll look at monit and getting a coredump.

Don't bother with monit if you have a working solution, and I doubt the
core dump will be useful -- been there.

> We had added 'ulimit -c unlimited' to the master init.d script in hopes
> of getting a dump that way, but no such luck.

Yes.  That's what the libcore thing was about.
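
For what it's worth, one way to see whether the ulimit from the init
script actually reached the running daemon is to read the limit back
from /proc. This is only a sketch, assuming Linux with /proc mounted;
the helper name core_limit_of is made up for illustration, and
sge_qmaster as the process name is an assumption about the install.

```shell
#!/bin/sh
# Hypothetical helper: print the soft core-file limit recorded in a
# /proc/<pid>/limits-style file (Linux only).
core_limit_of() {
    # Line format: "Max core file size  <soft>  <hard>  <units>"
    awk '/^Max core file size/ { print $5 }' "$1"
}

# Usage against a running qmaster (process name is an assumption):
#   core_limit_of "/proc/$(pgrep -o sge_qmaster)/limits"
```

If that prints "unlimited" and there is still no core, the limit itself
isn't what is stopping the dump. Linux normally refuses to dump a
process that has changed uid unless /proc/sys/fs/suid_dumpable allows
it, so that may be worth a look; whether it applies here is a guess.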

> It's probably all a moot point anyway since it appears to be a confirmed
> bug with 6.2u5.

Odds on, anyhow.  In that case the fix is trivial if you can rebuild
qmaster; you don't have to re-install anything else.
