[gridengine users] Qmaster Failing
d.love at liverpool.ac.uk
Tue May 3 17:09:20 UTC 2011
"Murphy, Brian (E IT F 45)" <brian.murphy at siemens.com> writes:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x47df6940 (LWP 2193)]
> 0x000000000055c698 in lCopySwitchPack ()
> which I guess we already knew.
The tight integration bug can segfault in different places, but odds are
that it's the problem here.
> I'll look at monit and getting a coredump.
Don't bother with monit if you have a working solution, and I doubt the
core dump will be useful -- been there.
> We had added 'ulimit -c unlimited' to the master init.d script in hopes
> of getting a dump that way, but no such luck.
Yes. That's what the libcore thing was about.
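For anyone wanting to check whether the 'ulimit -c unlimited' from the init.d script actually reached the running daemon, here is a minimal sketch. It assumes a Linux host where /proc/<pid>/limits is available. The function names are made up for illustration and are not part of Grid Engine. A non-zero limit is only one precondition for a core file, so this may not explain the missing dump on its own.

```python
import re


def parse_core_limit(limits_text):
    """Return (soft, hard) for 'Max core file size' from the text of
    /proc/<pid>/limits, or None if that row is not present."""
    for line in limits_text.splitlines():
        if line.startswith("Max core file size"):
            # Columns in /proc/<pid>/limits are separated by runs of spaces;
            # the limit name itself contains single spaces, so split on 2+.
            fields = re.split(r"\s{2,}", line.strip())
            return fields[1], fields[2]
    return None


def core_limit_of(pid):
    """Read the core file size limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_core_limit(fh.read())
```

Calling core_limit_of() with the pid of sge_qmaster (run as root or as the daemon's owner) returns the soft and hard limits as strings, e.g. a soft limit of "0" would mean the ulimit setting did not take effect for that process.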
> It's probably all a moot point anyway since it appears to be a confirmed
> bug with 6.2u5.
Odds on, anyhow, but then the solution is trivial if you can rebuild
qmaster; you don't have to re-install anything else.