[gridengine users] Anyone using S-GAE reporting app with Univa grid engine?

Jesse Becker beckerje at mail.nih.gov
Tue Mar 3 00:38:23 UTC 2015


On Mon, Mar 02, 2015 at 04:21:57PM -0500, Jesse Becker wrote:
>One thing that we *have* learned is that you should keep all of the
>raw records.  They compress well, and disk space is cheap.  Our UGE
>logs compress about 85% using gzip -9, and is fast.  Other methods
>(xz) get almost 90%, but take about 100 times longer to compress.
>(The specific method doesn't matter, even LZO would do nicely).


Some numbers for reference.  YMMV, etc.

The source file covers a recent time period, with 1,485,006 jobs in
about a week of time.

Normalized compression times (CPU user time, gzip = 1.0).  Options were
"-9" for gzip, "-e" for xz, and levels 1, 3, 6, and 9 for lzo:

    lzo1 =  0.055
    lzo3 =  0.061
    lzo6 =  0.062
    gzip =  1.000
    lzo9 =  2.510
    xz   = 23.538
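
For anyone wanting to run a similar comparison on their own logs, here is
a rough sketch in Python.  It is not how the numbers above were produced
(those came from the command-line tools).  It assumes the standard-library
gzip and lzma modules are acceptable stand-ins for "gzip -9" and "xz -e",
and it leaves out lzo because the standard library has no LZO module.  The
function names are made up for illustration.  Note that
time.process_time() counts user plus system CPU time, which is close to,
but not the same as, the user time quoted above.

```python
import gzip
import lzma
import time


def cpu_seconds(func, data):
    """Return the CPU time (user + system) spent running func(data)."""
    start = time.process_time()
    func(data)
    return time.process_time() - start


def normalized_compress_times(data):
    """Time each compressor on data and scale so that gzip = 1.0."""
    compressors = {
        # compresslevel=9 mirrors "gzip -9"
        "gzip": lambda d: gzip.compress(d, compresslevel=9),
        # preset 6 is the xz default; PRESET_EXTREME mirrors "xz -e"
        "xz": lambda d: lzma.compress(d, preset=6 | lzma.PRESET_EXTREME),
    }
    raw = {name: cpu_seconds(fn, data) for name, fn in compressors.items()}
    base = raw["gzip"]
    if base == 0:
        raise ValueError("input too small to time reliably")
    return {name: t / base for name, t in raw.items()}
```

Passing the whole accounting file means reading roughly 750 MB into
memory, so a sample of the file may be more practical for a quick check.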

    [jb at host sge]$ ls -l accounting.0*
    -rw-r--r-- 1 jb jb 748505122 Feb 14 13:26 accounting.0
    -rw-r--r-- 1 jb jb 201460466 Mar  2 18:49 accounting.0.lzo1 (73.1%)
    -rw-r--r-- 1 jb jb 200146962 Mar  2 18:49 accounting.0.lzo3 (73.3%)
    -rw-r--r-- 1 jb jb 200146962 Mar  2 18:49 accounting.0.lzo6 (73.3%)
    -rw-r--r-- 1 jb jb 109623543 Mar  2 16:06 accounting.0.gz   (85.4%)
    -rw-r--r-- 1 jb jb 135796662 Mar  2 18:51 accounting.0.lzo9 (81.9%)
    -rw-r--r-- 1 jb jb  75222044 Mar  2 16:17 accounting.0.xz   (90.0%)

    [jb at host sge]$ wc -l accounting.0
    1485006 accounting.0

Note that lzo9 took 2.5 times as long as gzip to compress, yet produced
a notably larger file (81.9% reduction vs. 85.4% for gzip).
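
The percentages in parentheses in the listing above are size reductions
relative to the uncompressed file, i.e. 100 * (1 - compressed/original).
A small Python sketch of that arithmetic, using the byte counts from the
ls output (the names ORIGINAL, COMPRESSED and reduction_percent are just
for illustration):

```python
# Sizes in bytes, copied from the ls -l output above.
ORIGINAL = 748505122
COMPRESSED = {
    "lzo1": 201460466,
    "lzo3": 200146962,
    "lzo6": 200146962,
    "gz": 109623543,
    "lzo9": 135796662,
    "xz": 75222044,
}


def reduction_percent(compressed, original=ORIGINAL):
    """Percentage of the original size saved by compression."""
    return 100.0 * (1.0 - compressed / original)


for name, size in COMPRESSED.items():
    print(f"{name:5s} {reduction_percent(size):5.1f}%")
```

Rounded to one decimal place, this reproduces the figures shown next to
each file.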


Decompression times (writing to /dev/null), normalized to gzip:

    lzo9: 0.332
    lzo6: 0.351
    lzo3: 0.361
    lzo1: 0.362
    gzip: 1.000
    xz:   2.002

Interesting that xz fares so much better at decompression: only about
twice gzip's time, versus about 23.5 times gzip's time for compression.
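
A decompression test along these lines could be sketched in Python as
follows.  Again, this is an illustration rather than the method used for
the numbers above: it assumes the standard-library gzip and lzma modules
in place of the command-line tools, skips lzo, and the function names and
chunk size are arbitrary choices.

```python
import gzip
import lzma
import time


def decompress_cpu_seconds(opener, path, chunk_size=1 << 20):
    """CPU time to stream-decompress path, discarding the output."""
    start = time.process_time()
    with opener(path, "rb") as stream:
        # Read and drop fixed-size chunks, the in-process analogue
        # of redirecting the decompressor's output to /dev/null.
        while stream.read(chunk_size):
            pass
    return time.process_time() - start


def xz_vs_gzip(gz_path, xz_path):
    """Return xz decompression time normalized to gzip = 1.0."""
    gz_time = decompress_cpu_seconds(gzip.open, gz_path)
    xz_time = decompress_cpu_seconds(lzma.open, xz_path)
    if gz_time == 0:
        raise ValueError("input too small to time reliably")
    return xz_time / gz_time
```

For example, xz_vs_gzip("accounting.0.gz", "accounting.0.xz") would give
the ratio for the two files in the listing, though the result will depend
on the machine and library versions.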

-- 
Jesse Becker (Contractor)


