[gridengine users] Core Binding on Magny-Cours
Ansgar.Esztermann at mpi-bpc.mpg.de
Mon Mar 7 13:42:47 UTC 2011
Is anyone using the core binding feature on AMD Magny-Cours? If so, how do you do it? For me, loadcheck misreads the topology as (SCTTCTTCTTCTTCTTCTT)*4 rather than (SCCCCCCCCCCCC)*4. As far as I can tell, there are two problems:
- the kernel (2.6.18-194.26.1.el5) only exposes core_id and physical_package_id for each core, with core_id running from 0 to 5 and physical_package_id from 1 to 4. There are, however, two dice (nodes) per socket, and this is not shown in the device tree.
- loadcheck does not even seem to look for that information. get_processor_ids_linux() simply counts the number of cores sharing the same core and socket number, and get_topology_linux() blithely assumes that any number >1 means multiple threads.
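
To illustrate the two points above, here is a minimal Python sketch of the counting heuristic as I understand it. It is not the actual loadcheck code: the function name topology_string and the input format (one (physical_package_id, core_id) pair per logical CPU) are made up for the example.

```python
from collections import Counter

def topology_string(cpus):
    """Build a topology string from (physical_package_id, core_id) pairs.

    Hypothetical sketch of the heuristic described above: CPUs sharing
    the same socket and core number are assumed to be hardware threads.
    """
    counts = Counter(cpus)
    out = []
    current_socket = None
    for (socket, core), n in sorted(counts.items()):
        if socket != current_socket:
            out.append("S")
            current_socket = socket
        out.append("C")
        if n > 1:
            # More than one CPU with this (socket, core) pair:
            # treated as n threads of a single core.
            out.append("T" * n)
    return "".join(out)

# What the kernel reports here: 4 sockets, core_id 0..5, and each
# (socket, core_id) pair occurring twice, once per die.
magny_cours = [(s, c) for s in range(1, 5) for _die in range(2) for c in range(6)]
print(topology_string(magny_cours))

# What it would look like if core_id ran from 0 to 11 within each socket.
expected = [(s, c) for s in range(1, 5) for c in range(12)]
print(topology_string(expected))
```

With the ids as reported, every (socket, core) pair shows up twice, so the sketch yields SCTTCTTCTTCTTCTTCTT for each socket; with unique core ids per socket it yields one S followed by twelve Cs.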
Later on, this leads to the problem that any job submitted with -binding actually blocks twice the number of cores I've asked for, effectively reducing our 48-core nodes to 24 cores. To be honest, the causal link here is only my assumption.
Max-Planck-Institut für biophysikalische Chemie, Abteilung 105