[gridengine users] private BDB spooling and backup (was: Letter to the Grid Engine Community (+ need your help!))

Dave Love d.love at liverpool.ac.uk
Tue Jun 12 13:38:14 UTC 2012


The impressive shoot-the-messenger wasn't corrected while I was off-air,
and the original bad advice and extra confusion being sown could end up
with people's spools being clobbered.

Rayson Ho <rayson at scalablelogic.com> writes:

> Dave Love has
> started countless bashing on the Open Grid Scheduler Project, and
> spreading FUDs against Open Grid Scheduler/Grid Engine,

People can assess the accuracy of that to calibrate other claims.  Let's
address the program behaviour behind the poisonous smoke screen.

> Note that Dave Love referenced the Berkeley DB documentation, which says,
>
>  "[DB_PRIVATE] should not be specified if more than a single process
> is accessing the environment because it is likely to cause database
> corruption and unpredictable behavior. For example, if both a server
> application and Berkeley DB utilities (for example, db_archive,
> db_checkpoint or db_stat) are expected to access the environment, the
> DB_PRIVATE flag should not be specified."

It's presumably authoritative and people should take note.  If it's just
wrong, the people who know better than the BDB maintainers should have
got it corrected.

> TRUTH:
> When DB_PRIVATE is used, the *BerkeleyDB environment* is not backed by
> a physical file in the filesystem, and the BerkeleyDB utilities that
> try to open the environment would find that the environment file is

[There's no single "environment file".  The normal disk-backed "regions"
are in several __db... files.]

> missing and act accordingly. So the next question one might have is,
> as the environment file is missing, wouldn't db_archive,
> db_checkpoint, or db_stat then fail to do their job??

Well, the OP was advised to run them.  Per the documentation, they are
likely not to function safely, whether or not they're "doing their
job".  I'd have thought someone lecturing like this would know what
they do.
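
One rough way to tell whether a spool directory has disk-backed regions
at all, before pointing any utility at it, is to look for the __db...
files.  The following is only a sketch: the helper name is made up, the
__db.* glob is an assumption about how the region files are named, and
the absence of such files does not by itself prove the environment was
opened privately (it may simply not be open).

```shell
# Hypothetical helper: succeed if DIR contains any region files.
# Assumption: region files match the glob __db.* (e.g. __db.001).
has_region_files() {
    dir=$1
    for f in "$dir"/__db.*; do
        # If the glob matched nothing, $f is the literal pattern,
        # which does not exist as a file.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example use (the path is a placeholder, not a real default):
# if has_region_files /path/to/spooldb; then
#     echo "disk-backed region files present"
# else
#     echo "no region files: possibly a private environment"
# fi
```

If no region files are present while qmaster is running, the
documentation's warning about running separate utilities against the
environment would seem to apply.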

Note the modifications behind qmaster's back (where db_dump is the one
people would want to run):

  spooldb$ ls -l
  total 52
  -rw------- 1 sgeadmin sgeadmin 10485760 2012-06-10 23:04 log.0000000001
  -rw-r--r-- 1 sgeadmin sgeadmin    24576 2012-06-10 23:03 sge
  -rw------- 1 sgeadmin sgeadmin     8192 2012-06-10 23:00 sge_job
  spooldb$ db_stat -d sge | head -n 3
  Tue Jun 12 11:11:16 2012        Local time
  53162   Btree magic number
  9       Btree version number
  spooldb$ db_dump sge | head -n 3
  VERSION=3
  format=bytevalue
  type=btree
  spooldb$ ls -l
  total 52
  -rw------- 1 sgeadmin sgeadmin 10485760 2012-06-10 23:04 log.0000000001
  -rw-r--r-- 1 sgeadmin sgeadmin    24576 2012-06-12 11:11 sge
  -rw------- 1 sgeadmin sgeadmin     8192 2012-06-10 23:00 sge_job
  spooldb$ db_checkpoint -1
  spooldb$ ls -l
  total 52
  -rw------- 1 sgeadmin sgeadmin 10485760 2012-06-12 11:12 log.0000000001
  -rw-r--r-- 1 sgeadmin sgeadmin    24576 2012-06-12 11:11 sge
  -rw------- 1 sgeadmin sgeadmin     8192 2012-06-10 23:00 sge_job
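
The same before-and-after comparison can be scripted rather than done
by eye with ls.  This is a minimal sketch with a made-up helper name; it
relies on file modification times, so it only shows that something was
written, not what.  It is meant for a scratch copy of a spool or a test
cluster, not a live one, since running the utilities is the very thing
in question.

```shell
# Hypothetical helper: run a command and list the files under DIR
# whose modification time changed while it ran.
# Usage: changed_by DIR command [args...]
changed_by() {
    dir=$1; shift
    marker=$(mktemp) || return 1   # timestamp reference, outside DIR
    sleep 1                        # allow for 1-second timestamp resolution
    "$@" >/dev/null 2>&1           # the command under test; output discarded
    find "$dir" -type f -newer "$marker" | sort
    rm -f "$marker"
}

# Example on a test spool (path is a placeholder):
# changed_by /path/to/test-spooldb db_stat -h /path/to/test-spooldb -d sge
```

An empty result means no file under the directory was written during
the run.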
  
> Note that when
> Grid Engine is not spooling onto a Berkeley RPC server, the
> db_archive, db_checkpoint functionality is implemented inside the
> qmaster (the qmaster handles that by calling the Berkeley DB
> programming APIs), and that is the reason why the "bdb_checkpoint.sh"
> script is not needed when one is not using the RPC spooling server
> (this is true since SGE 6.0 - we did not change the user interface).

Of course, but the OP was advised to run them.  [Checkpointing is
obvious by inspection of the spool, without the need to read the code
for those who haven't already looked at it.]

> And
> thus to support sites running Berkeley RPC spooling and wanting to
> upgrade to Grid Engine 2011.11, we do support RPC spooling in GE
> 2011.11,

Cf. <http://sourceforge.net/mailarchive/message.php?msg_id=29359255>.
(The DB server code is actually still in SoGE, and spooldb will build
with BDB versions which don't have it, but the install script support
was removed, similarly to OGS.)

Yes, the save_sge_config.sh reference in the bootstrap man page is
wrong; it must have been a paste error from another script I was going
to include.  Obviously I know how the script works, having modified it,
but we're not all perfect.  (By the way, the "backup" option advertised
by spoolinit isn't actually implemented.)

> It's WRONG again!!
>
> - Note that inst_sge -bup accesses the BerkeleyDB files directly using
> "cp -f", 

The polite word is "disingenuous":

  gridscheduler/trunk$ grep -A 1 db_dump source/dist/util/install_modules/inst_common.sh 
  #         DUMPIT="$SGE_ROOT/utilbin/sol-sparc/db_dump -f"
  #         ExecuteAsAdmin $DUMPIT $backup_dir/$DATE.dump -h $db_home sge
  --
           DUMPIT="$SGE_UTILBIN/db_dump -f"
           ExecuteAsAdmin $DUMPIT $backup_dir/$DATE.dump -h $db_home sge
  --
        DB_BIN="$SGE_ROOT/utilbin/sol-sparc/db_load $SGE_ROOT/utilbin/sol-sparc/db_dump"
        DB_LIB="$SGE_ROOT/lib/sol-sparc/libdb-4.2.so"
  --
              $INFOTEXT "32 bit version of db_load or db_dump not found. These binaries needs \n" \
                        "to be installed to perform a backup/restore of your BDB RPC Server. \n" \
  --
              $INFOTEXT -log "32 bit version of db_load or db_dump not found. These binaries needs \n" \
                             "to be installed to perform a backup/restore of your BDB RPC Server. \n" \

Alternatively run it with "sh -x" to check what creates the .dump file.
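
A sketch of that check, for anyone who wants to repeat it: the helper
name is made up, and it assumes the script is a plain sh script whose
relevant commands run in the same shell or in sourced files (child
scripts started separately are not traced by "sh -x").  The trace goes
to a temporary file so that the script's normal output and prompts
still reach the terminal.

```shell
# Hypothetical helper: run a script under "sh -x" and show only the
# traced commands that match PATTERN.
# Usage: trace_grep PATTERN script [args...]
trace_grep() {
    pattern=$1; shift
    log=$(mktemp) || return 1
    sh -x "$@" 2>"$log"            # the -x trace is written to stderr
    grep -- "$pattern" "$log"
    status=$?                      # grep's status: 0 if anything matched
    rm -f "$log"
    return $status
}

# Example, assuming it is run from $SGE_ROOT on a test installation:
# trace_grep db_dump ./inst_sge -bup
```

Anything the script itself writes to stderr ends up in the same
temporary file, so error messages will not appear on the terminal
during the run.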

> As a responsible person and a responsible Grid Engine implementor, I
> would read the "How to Upgrade to 6.1 Software Using Classic/Berkeley
> DB Spooling" docs first to understand how users are supposed to
> upgrade from 1 version of Grid Engine to another.

[...]

> Ref: http://docs.oracle.com/cd/E19957-01/820-0697/eoqss/index.html

Skip the smoke and mirrors and see the current documentation on the
relevant feature of "automatic" (live) backup at
<http://docs.oracle.com/cd/E24901_01/doc.62/e21973/chapter2.htm#CHDEJAAG>,
although it's basically the same as in the 6.1 docs.  I'd assumed
commercial OGS customers got extra doc with a health warning about that
sort of thing that others don't, but it seems not.

> So it is wrong to claim that DB_PRIVATE is not safe to use with
> inst_sge -bup, giving users WRONG impressions that it is safe when
> DB_PRIVATE is not used. DB_PRIVATE just does not play any roles in
> this context. One *NEEDS TO* shutdown the cluster first before doing
> the inst_sge -bup backup!!

The OGS inst_sge apparently implements the method in the Oracle docs,
and I assume sites use it.  Better rant at those implementors/documentors.

> As William Bryce is not a technical person (he has technical
> background, but not down to the level to understand the details), he
> was misled by you.  What if Univa launches new market ads saying how
> unsafe Open Grid Scheduler/Grid Engine is when using BerkeleyDB
> spooling based on your info??

I gather Bill takes documentation seriously and he obviously has experts
to consult, so I doubt
<http://gridengine.eu/grid-engine-internals/104-univa-grid-engine-81-features-part-4-new-spooling-method-postgresql-spooling-2012-06-01>¹
relies on anything I've said.  Is it just "showing how naïve and
technically incapable other Grid Engine implementors are" who originally
implemented the feature and avoided a private database?

If the limits of my technical incompetence are trusting developers'
documentation in agreement with experiment, and cock-ups writing
documentation, then I'll be pleased and surprised.

__
1. Would someone like to work on free postgres spooling
   <https://arc.liv.ac.uk/trac/SGE/ticket/1331>?

-- 
Community Grid Engine:  http://arc.liv.ac.uk/SGE/



