Links

Content Skeleton

This Page

Previous topic

TODO

Next topic

Sys Admin Docs

LOG

Review Lessons from recovery

  1. not having an operational repo + trac + docs : serious hit on ability to recover and build these
  1. benefit of plain text sphinx docs (compared to wiki docmentation : hidden away in sqlite DB)
  1. self-hosting env (the scripts for building) not a good idea
  1. move at least env to github / google code ?
  2. move important documentation to Sphinx
  1. use of older machines generally means source builds for usable versions as distros are behind

Failed to find viable interim server

  1. cms01, system apache config for heprez too sensitive to touch as currently in use

    1. a front to tomcat + exist and static presenter
  2. belle7, in use for OUM + dybslv

    1. sudo yum --enablerepo=epel info trac at 0.10.5 MAYBE NEW ENOUGH : USING BELLE1
  3. belle1, made some progress but balked at having to downgrade 0.11 to 0.10.5 (and expectation that this would entail lots of trac config effort)

cms01 as stand-in server ?

  1. try to apache-edit to use existing trac+svn install on cms01
    • cms01 config has been changed for heprez
[blyth@cms01 ~]$ sudo apachectl stop
Syntax error on line 17 of /etc/httpd/conf/svnsetup/repos.conf:
Unknown DAV provider: svn

[blyth@cms01 ~]$ sudo apachectl start
Syntax error on line 17 of /etc/httpd/conf/svnsetup/repos.conf:
Unknown DAV provider: svn

[blyth@cms01 conf.d]$ ll /usr/lib/httpd/modules/*dav*
-rwxr-xr-x  1 root root  39324 Oct 25  2011 /usr/lib/httpd/modules/mod_dav_fs.so
-rwxr-xr-x  1 root root  80316 Oct 25  2011 /usr/lib/httpd/modules/mod_dav.so
-rwxr-xr-x  1 root root 426545 Dec 31  2007 /usr/lib/httpd/modules/mod_dav_svn.so

[blyth@cms01 ~]$ ll /usr/lib/httpd/modules/*svn*
-rwxr-xr-x  1 root root  29087 Dec 31  2007 /usr/lib/httpd/modules/mod_authz_svn.so
-rwxr-xr-x  1 root root 426545 Dec 31  2007 /usr/lib/httpd/modules/mod_dav_svn.so

Key precursor for apache config is svnsetup-, mismatch:

[Wed May 02 17:36:03 2012] [notice] Apache/2.0.52 (Scientific Linux) configured -- resuming normal operations
[Wed May 02 17:36:19 2012] [error] sys:1: RuntimeWarning: Python C API version mismatch for module _apache: This Python has API version 1013, module _apache has version 1012.
[Wed May 02 17:36:19 2012] [error] make_obcallback: could not import mod_python.apache.\n

belle1 as stand-in : balked at trac downgrade 0.11 to 0.10.5

EPEL distro trac 0.10.5

Yum epel based installation on belle1 : lighttpd/fastcgi/trac + apache/mod_dav_svn + ...

[blyth@belle1 ~]$ sudo yum --enablerepo=epel deplist trac
sqlite3 db/trac.db "update system set value=20 where name='database_version'"

EPEL distro config is not very smooth:

[blyth@belle1 trac-0.10.5]$ cat  /etc/httpd/conf.d/trac.conf
# Replace all occurrences of /srv/trac with your trac root below
# and uncomment the respective SetEnv and PythonOption directives.
<LocationMatch /cgi-bin/trac\.f?cgi>
    #SetEnv TRAC_ENV /srv/trac
</LocationMatch>
<IfModule mod_python.c>
<Location /cgi-bin/trac.cgi>
    SetHandler mod_python
    PythonHandler trac.web.modpython_frontend
    #PythonOption TracEnv /srv/trac
</Location>
</IfModule>
[blyth@belle1 trac-0.10.5]$

lighttpd fastcgi

Maybe can avoid apache with lighttpd:

[blyth@belle1 httpd]$ sudo yum --enablerepo=epel install lighttpd
[blyth@belle1 httpd]$ sudo yum --enablerepo=epel install lighttpd-fastcgi


[blyth@belle1 ~]$ rpm -ql trac | grep trac.fcgi
/var/www/cgi-bin/trac.fcgi

moving trac fast cgi into fastcgi avoids error,

2012-05-02 19:49:23: (server.c.1512) server stopped by UID = 0 PID = 17722 2012-05-02 19:49:26: (log.c.166) server started

but no show, nope need to

include "conf.d/fastcgi.conf"

gets to http://belle1.nuu.edu.tw/trac/ with:

Internal Error

The user root requires read _and_ write permission to the database file
/var/scm/tracs/test/db/trac.db and the directory it is located in.

Dealing with that:

[blyth@belle1 httpd]$ lighttpd-chown /var/scm/tracs/test
=== lighttpd-chown : sudo chown lighttpd:lighttpd -R /var/scm/tracs/test

gets to raw trac : using lighttpd + fastcgi on the test instance at

apache mod_dav_svn

unfortunately cannot do away with apache, as need http access to repo with mod_dav_svn distinct possibility this will mess up the trac install from epel ? Comparing deplist yields hope it will work:

[blyth@belle1 ~]$ sudo yum deplist mod_dav_svn
[blyth@belle1 ~]$ sudo yum deplist trac

Both .so from mod_dav_svn:

[blyth@belle1 ~]$ rpm -ql mod_dav_svn
/etc/httpd/conf.d/subversion.conf
/usr/lib/httpd/modules/mod_authz_svn.so
/usr/lib/httpd/modules/mod_dav_svn.so

Problem with using lighttpd/fastcgi/trac + apache/mod_dav_svn is port clash change the port for apache : as main use if via SVN when ugly URLs are not as glaring as with trac:

[blyth@belle1 ~]$ sudo /sbin/service httpd start
Starting httpd: (98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
                                                           [FAILED]

Open the port for me:

IPTABLES_PORT=8080 iptables-webopen-ip 140.112.102.77

Succeeds to make revision 0 of test repo visible

Can checkout:

g4pb-2:~ blyth$ svn co http://belle1.nuu.edu.tw:8080/repos/test
Checked out revision 0.
g4pb-2:~ blyth$
g4pb-2:~ blyth$ cd test
g4pb-2:test blyth$ svn st
g4pb-2:test blyth$ svn info
Path: .
Working Copy Root Path: /Users/blyth/test
URL: http://belle1.nuu.edu.tw:8080/repos/test
Repository Root: http://belle1.nuu.edu.tw:8080/repos/test
Repository UUID: 99398d55-88de-4a62-90a3-7cc4ad44927c
Revision: 0
Node Kind: directory
Schedule: normal
Last Changed Rev: 0
Last Changed Date: 2012-05-02 19:31:08 +0800 (Wed, 02 May 2012)

But checkin fails:

g4pb-2:test blyth$ svn add check.txt
A         check.txt
g4pb-2:test blyth$ svn ci -m "test mod_dav_svn on belle1 "
svn: E000013: Commit failed (details follow):
svn: E000013: Can't create directory '/var/scm/repos/test/db/transactions/0-1.txn': Permission denied

Fixing apache ownership allows checkin to work:

[blyth@belle1 conf.d]$ apache-chown /var/scm/repos -R
=== apache-chown : sudo chown -R apache:apache /var/scm/repos


g4pb-2:test blyth$ svn ci -m "test mod_dav_svn on belle1 "
Adding         check.txt
Transmitting file data .
Committed revision 1.

Visible via mod_dav_svn and trac:

belle1 next

  • multi “tracs” and “repos”
  • authentication + authorization
  • trac plugins : which extensions are really needed ? 0.10.4 probably means some difficulties

change to TRAC_ENV_PARENT_DIR in /etc/lighttpd/conf.d/fastcgi.conf succeeds to list projects and serve them in extensible manner

~/env/trac/tracdep.bash I have played with trac fastcgi previously it seems : with trac 0.11

generalize svnsetup- to the lighttpd/fastcgi/trac + apache/mod_dav_svn and get working with epel yum trac 10.4 + corresponding AccountManager plugin

svnsetup-repos- anon-or-real repos YES

trac version issues

http://trac-hacks.org/wiki/AccountManagerPlugin
(before Trac 0.11 this has been a separate trac:WebAdmin plugin)

need users and authz files to proceed, so transfer the backups:

[blyth@cms01 scm]$ rsync -e ssh -razvt /data/var/scm/backup/cms02 belle1.nuu.edu.tw:/var/scm/backup/

furnish these from backup with:

svnsetup-; svnsetup-from-backup-bootstrap

now get permission denied for http://belle1.nuu.edu.tw:8080/repos/test/

cms02 source build

replacement cms02 evaluation

  1. replacement cms02 network accessible + setup sudoer user account

  2. setup ssh keys, affording single keystoke login:

    g4pb-2:~ blyth$ ssh--putkey cms02.phys.ntu.edu.tw
  3. check what we have, hmm older than cms01 ... means a source build of trac for a reasonable version:

    [blyth@hfag blyth]$ cat /etc/redhat-release
    Scientific Linux SL release 3.0.9 (SL)
    
    [blyth@cms02 ~]$ cat /etc/redhat-release
    Scientific Linux SL release 4.5 (Beryllium)
    
    [blyth@cms01 ~]$ cat /etc/redhat-release
    Scientific Linux CERN SLC release 4.8 (Beryllium)
    
    [blyth@belle7 repos]$ cat /etc/redhat-release
    Scientific Linux SL release 5.1 (Boron)
  4. check distro versions

  • trac version from EPEL for SL4 is 0.9.3 sudo yum --enablerepo=epel info trac : too old for compatibility

basic yum installs

The former hep6 has nowt:

sudo yum install gcc
sudo yum install gcc-c++
sudo yum install curl
sudo yum install zlib-devel

tracpreq-again-source one-by-one

doing the tracpreq-again-source one-by-one:

t tracpreq-again-source
tracpreq-again-source is a function
tracpreq-again-source ()
{
    local msg="=== $FUNCNAME :";
    [ "$(tracpreq-mode)" != "source" ] && echo $msg ABORT this is for tracpreq-mode:source only && return 1;
    local ans;
    read -p "$msg ENTER YES TO PROCEED" ans;
    [ "$ans" != "YES" ] && echo $msg skipping && return 1;
    log-;
    log-init $FUNCNAME;
    python-;
    pythonbuild-;
    pythonbuild-again | log-- $FUNCNAME pythonbuild-again;
    configobj-;
    configobj-get | log-- $FUNCNAME configobj-get;
    swig-;
    swigbuild-;
    swigbuild-again | log-- $FUNCNAME swigbuild-again;
    apache-;
    apache-again | log-- $FUNCNAME apache-again;
    svn-;
    svnbuild-;
    svnbuild-again | log-- $FUNCNAME svnbuild-again;
    sqlite-;
    sqlite-again | log-- $FUNCNAME sqlite-again
}

Mis-ordering in the build

Hmm ordering seems wrong,

  1. configobj-get requires svn
  2. swigbuild-get requires svn

subversion : configure failures

configuring svn:

checking zlib.h presence... no
checking for zlib.h... no
configure: error: subversion requires zlib
make: *** No targets specified and no makefile found.  Stop.
make: *** No rule to make target `install'.  Stop.
Can't open Makefile: No such file or directory.
Can't open Makefile: No such file or directory.
diff: Makefile.orig: No such file or directory
diff: Makefile: No such file or directory
make: *** No rule to make target `swig-py'.  Stop.
make: *** No rule to make target `install-swig-py'.  Stop.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named svn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named svn

Need the zlib-devel:

[blyth@cms02 subversion-1.4.6]$ rpm -ql zlib
/usr/lib64/libz.so.1
/usr/lib64/libz.so.1.2.1.2
/usr/share/doc/zlib-1.2.1.2
/usr/share/doc/zlib-1.2.1.2/README
/usr/lib/libz.so.1
/usr/lib/libz.so.1.2.1.2
/usr/share/doc/zlib-1.2.1.2
/usr/share/doc/zlib-1.2.1.2/README
[blyth@cms02 subversion-1.4.6]$

Missing a --shared in subversion config:

       cd subversion/libsvn_ra_dav && /bin/sh /data/env/system/svn/build/subversion-1.4.6/libtool --tag=CC --silent --mode=link gcc  -g -O2  -g -O2 -pthread   -L/data/env/system/svn/build/subversion-1.4.6/apr-util/xml/expat/lib  -rpath /data/env/system/svn/subversion-1.4.6/lib -o libsvn_ra_dav-1.la  commit.lo fetch.lo file_revs.lo log.lo merge.lo options.lo props.lo replay.lo session.lo util.lo ../../subversion/libsvn_delta/libsvn_delta-1.la ../../subversion/libsvn_subr/libsvn_subr-1.la /data/env/system/svn/build/subversion-1.4.6/apr-util/libaprutil-0.la /data/env/system/svn/build/subversion-1.4.6/apr-util/xml/expat/lib/libexpat.la /data/env/system/svn/build/subversion-1.4.6/apr/libapr-0.la -lrt -lm -lcrypt -lnsl  -lpthread -ldl /data/env/system/svn/build/subversion-1.4.6/neon/src/libneon.la -lz
       /usr/bin/ld: /data/env/system/svn/build/subversion-1.4.6/neon/src/.libs/libneon.a(ne_request.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
       /data/env/system/svn/build/subversion-1.4.6/neon/src/.libs/libneon.a: could not read symbols: Bad value
       collect2: ld returned 1 exit status
       make: *** [subversion/libsvn_ra_dav/libsvn_ra_dav-1.la] Error 1



* http://svn.apache.org/repos/asf/subversion/site/publish/faq.html#relocation-against-local-symbol

Backup Transfer

Transfer the cms01 backups to the new cms02:

[blyth@cms01 ~]$ rsync -av /data/var/scm/backup/cms02 cms02.phys.ntu.edu.tw:/var/scm/backup/
blyth@cms02.phys.ntu.edu.tw's password:

pysqlite : later handled by fixing python build

pysqlite failing... is it really needed?:

building 'pysqlite2._sqlite' extension
creating build/temp.linux-x86_64-2.5
creating build/temp.linux-x86_64-2.5/src
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DMODULE_NAME="pysqlite2.dbapi2" -DSQLITE_OMIT_LOAD_EXTENSION=1 -I/data/env/system/python/Python-2.5.6/include/python2.5 -c src/module.c -o build/temp.linux-x86_64-2.5/src/module.o
In file included from src/module.c:24:
src/connection.h:33:21: sqlite3.h: No such file or directory
In file included from src/module.c:24:
src/connection.h:38: error: syntax error before "sqlite3"
src/connection.h:38: warning: no semicolon at end of struct or union

from http://trac.edgewall.org/wiki/0.11/TracInstall maybe not:

If you're using Python 2.3 or 2.4 and need pysqlite,

tracbuild-auto

Launch tracbuild-auto, abort as find no easy_install or setuptools:

Checked out revision 4117.
Traceback (most recent call last):
  File "setup.py", line 9, in <module>
    from setuptools import setup
ImportError: No module named setuptools
=== package-look-version : version in the setup trunk/setup.py

Try setuptools-get but meet zlib issue, the zlib-devel was not available when building python:

[blyth@cms02 bitextra]$ setuptools-get
python ez_setup.py ... from /tmp/env/setuptools
Downloading http://pypi.python.org/packages/2.5/s/setuptools/setuptools-0.6c11-py2.5.egg
Traceback (most recent call last):
  File "ez_setup.py", line 278, in <module>
    main(sys.argv[1:])
  File "ez_setup.py", line 212, in main
    from setuptools.command.easy_install import main
zipimport.ZipImportError: can't decompress data; zlib not available
[blyth@cms02 bitextra]$

pythonbuild

back to pythonbuild-configure then make, but think this was not needed... should have just done make : the zlib handling is done in setup.py after interpreter created not at config level ? many curses errors from python make:

running build_ext
INFO: Can't locate Tcl/Tk libs and/or headers
building '_curses' extension
gcc -pthread -fPIC -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/data/env/system/python/build/Python-2.5.6/./Include -I/data/env/system/python/Python-2.5.6/include -I. -IInclude -I./Include -I/usr/local/include -I/data/env/system/python/build/Python-2.5.6/Include -I/data/env/system/python/build/Python-2.5.6 -c /data/env/system/python/build/Python-2.5.6/Modules/_cursesmodule.c -o build/temp.linux-x86_64-2.5/data/env/system/python/build/Python-2.5.6/Modules/_cursesmodule.o
In file included from /data/env/system/python/build/Python-2.5.6/Modules/_cursesmodule.c:113:
/data/env/system/python/build/Python-2.5.6/./Include/py_curses.h:45:20: curses.h: No such file or directory
In file included from /data/env/system/python/build/Python-2.5.6/Modules/_cursesmodule.c:113:
/data/env/system/python/build/Python-2.5.6/./Include/py_curses.h:73: error: syntax error before "WINDOW"
/data/env/system/python/build/Python-2.5.6/./Include/py_curses.h:73: warning: no semicolon at end of struct or union

eliminated after:

sudo yum install ncurses
sudo yum install ncurses-devel

still one complaint from make:

running build_ext
INFO: Can't locate Tcl/Tk libs and/or headers
running build_scripts

aftre python make install setuptools-get succeeds

back to tracbuild-auto run into lack of tracdev (duh the server is dead ... need to skip bitextra):

=== package-get : brn:trunk bnm:trunk pkt:svn tba:trunk url:http://dayabay.phys.ntu.edu.tw/repos/tracdev/annobit/trunk
=== package-get : svn checkout http://dayabay.phys.ntu.edu.tw/repos/tracdev/annobit/trunk rev 123 into /data/env/local/env/trac/package/bitextra with basename trunk
svn: PROPFIND request failed on '/repos/tracdev/annobit/trunk'
svn: PROPFIND of '/repos/tracdev/annobit/trunk': could not connect to server (http://dayabay.phys.ntu.edu.tw)
=== package-get : ABORT failed to checkout ...

skip packages requireing tracdev via temporary hiding:

[blyth@cms02 package]$ mv bitextra.bash.tmp-hide tmp-hide/bitextra.bash
[blyth@cms02 package]$ mv trac2latex.bash tmp-hide/
[blyth@cms02 package]$ mv trac2mediawiki.bash tmp-hide/
[blyth@cms02 package]$ l tmp-hide/
total 24
-rw-r--r--  1 blyth blyth 2681 Jan 10 09:30 bitextra.bash
-rw-r--r--  1 blyth blyth 3624 Jan 10 09:30 trac2latex.bash
-rw-r--r--  1 blyth blyth 3090 Jan 10 09:30 trac2mediawiki.bash

issue with docutils, move it out of way also, install it ordinarily env/python/docutils.bash rather than as a trac package

cms02 : tarball recovery

Try scm-backup-; scm-recover-all cms02:

=== scm-backup-synctrac : resyncing the instance with the repository ... as repository_dir has changed ... avoiding the yellow banner
=== trac-admin- : trac-admin : /data/env/system/python/Python-2.5.6/bin/trac-admin
=== trac-admin- : python : /data/env/system/python/Python-2.5.6/bin/python
=== trac-admin-
 LLP
 /data/env/system/sqlite/sqlite-3.3.16/lib
/data/env/system/svn/subversion-1.4.0/lib/svn-python/libsvn
/data/env/system/svn/subversion-1.4.0/lib/svn-python/svn

python: error while loading shared libraries: libpython2.5.so.1.0: cannot open shared object file: No such file or directory
=== trac-admin- : ABORT non-supported sqlite/pysqlite version ... see http://dayabay.phys.ntu.edu.tw/tracs/env/wiki/TracSQLiteMemoryExhaustion
env-abort
=== env-abort : ABORT ... sleeping forever

Reproducible:

[blyth@cms02 ~]$ trac-
[blyth@cms02 ~]$ TRAC_INSTANCE=data trac-admin-- resync
Password:
=== trac-admin- : trac-admin : /data/env/system/python/Python-2.5.6/bin/trac-admin
=== trac-admin- : python : /data/env/system/python/Python-2.5.6/bin/python
=== trac-admin-
 LLP
 /data/env/system/sqlite/sqlite-3.3.16/lib
/data/env/system/svn/subversion-1.4.0/lib/svn-python/libsvn
/data/env/system/svn/subversion-1.4.0/lib/svn-python/svn

python: error while loading shared libraries: libpython2.5.so.1.0: cannot open shared object file: No such file or directory
=== trac-admin- : ABORT non-supported sqlite/pysqlite version ... see http://dayabay.phys.ntu.edu.tw/tracs/env/wiki/TracSQLiteMemoryExhaustion
env-abort
=== env-abort : ABORT ... sleeping forever

python sqlite issue

Seems the pythonbuild did not find the sqlite, both an order problem and fact that the python setup.py looks only in hardcoded locations for sqlite headers. Kludge the setup.py of the python build with added path:

sqlite_inc_paths = [ '/usr/include',
                           '/usr/include/sqlite',
                           '/usr/include/sqlite3',
                           '/usr/local/include',
                           '/usr/local/include/sqlite',
                           '/usr/local/include/sqlite3',
                           '/data/env/system/sqlite/sqlite-3.3.16/include',
                         ]
   pythonbuild-cd
   make
   make install

This enables the version check to work:

trac-admin-sqlite-check-
sqlite_version_string:3.3.16 have_pysqlite:2

The sudo bash environment has the wrong SVN_HOME:

[blyth@cms02 Python-2.5.6]$ sudo bash -c "export ENV_HOME=$ENV_HOME ; . $ENV_HOME/env.bash ; env- ; echo \$SVN_HOME "
/data/env/system/svn/subversion-1.4.0
[blyth@cms02 Python-2.5.6]$ echo $SVN_HOME
/data/env/system/svn/subversion-1.4.6

Needed to set versions for NODE_TAG C2R to get correct paths, looked like bug with python-path but was not, just the C2R issue.

recover-all has some fails in Resyncing repository history... namely:

Command failed: /var/scm/svn/dybaux does not appear to be a Subversion repository.
Command failed: /var/scm/svn/dybsvn does not appear to be a Subversion repository.
Command failed: /var/scm/svn/toysvn does not appear to be a Subversion repository.

which correspond to traconlyies(how come?) in backup:

[blyth@cms02 scm]$ l backup/cms02/repos/
total 48
drwxr-xr-x  6 blyth blyth 4096 May  1 13:02 aberdeen
drwxr-xr-x  6 blyth blyth 4096 May  1 13:02 data
drwxr-xr-x  6 blyth blyth 4096 May  1 13:02 env
drwxr-xr-x  6 blyth blyth 4096 May  1 13:02 heprez
drwxr-xr-x  6 blyth blyth 4096 May  1 13:02 newtest
drwxr-xr-x  6 blyth blyth 4096 May  1 13:02 tracdev

[blyth@cms02 scm]$ l backup/cms02/tracs/
total 72
drwxr-xr-x  6 blyth blyth 4096 May  1 13:03 aberdeen
drwxr-xr-x  6 blyth blyth 4096 May  1 13:03 data
drwxr-xr-x  6 blyth blyth 4096 May  1 13:03 env
drwxr-xr-x  6 blyth blyth 4096 May  1 13:04 heprez
drwxr-xr-x  6 blyth blyth 4096 May  1 13:04 newtest
drwxr-xr-x  6 blyth blyth 4096 May  1 13:04 tracdev

drwxr-xr-x  4 blyth blyth 4096 May  1 13:04 toysvn
drwxr-xr-x  6 blyth blyth 4096 May  1 13:03 dybaux
drwxr-xr-x  3 blyth blyth 4096 Oct 17  2011 dybsvn

Also some broken link in htdocs issues:

=== apache-chown : sudo chown -R nobody:nobody heprez
/bin/chown: cannot dereference `heprez/htdocs/docs': No such file or directory
/bin/chown: cannot dereference `tracdev/htdocs/docs': No such file or directory


[blyth@cms02 scm]$ l tracs/*/htdocs/docs
lrwxrwxrwx  1 root root 48 May  4 20:12 tracs/heprez/htdocs/docs -> /data/usr/local/heprez/src/hfag/mods/webapp/docs
lrwxrwxrwx  1 root root 48 May  4 20:13 tracs/tracdev/htdocs/docs -> /data/usr/local/heprez/src/hfag/mods/webapp/docs

cms02 : apache hookup

Next, getting SVN+trac hooked up to source apache with svnsetup-apache

SVN and Trac access working:

http://dayabay.phys.ntu.edu.tw/repos/env/trunk/

Service hookup with apache-initd symbolic linking to apachectl

Post commit failure:

g4pb-2:env blyth$ ci -m "bring belle1 N1 into the sshconf fold "
...
Transmitting file data .....
Committed revision 3444.

Warning: 'post-commit' hook failed with error output:
/var/scm/repos/env/hooks/post-commit: line 8: /data/env/system/python/Python-2.5.1/bin/python: No such file or directory

cms02 : backups + rsync + chkconfig

C2R env hookup

Hookup root@cms02 ssh C2R to env to do backups, in .bash_profile:

export ENV_HOME=/home/blyth/env  ; env-(){      [ -r $ENV_HOME/env.bash ]           && . $ENV_HOME/env.bash            && env-env $* ; }
#env-

scm-backup-all misses svnlook

Wrong svn version dumped:

[root@cms02 ~]# scm-backup-all
=== scm-backup-all : starting from pwd /root
=== scm-backup-lock : /var/scm/LOCKED/scm-backup-all-started-2012-05-07@18:25:42
/data/env/system/python/Python-2.5.6/bin/python
/data/env/system/sqlite/sqlite-3.3.16/lib
/data/env/system/python/Python-2.5.6/lib
/data/env/system/svn/subversion-1.4.0/lib/svn-python/libsvn
/data/env/system/svn/subversion-1.4.0/lib/svn-python/svn

=== scm-backup-all : svn === skip non-folder /var/scm/svn/*
=== scm-backup-repo : name aberdeen path /var/scm/repos/aberdeen base /var/scm/backup/cms02 stamp 2012/05/07/182542 site ntu ===
=== scm-backup-repo : mkdir -p /var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542 && /data/env/system/svn/build/subversion-1.4.6/tools/backup/hot-backup.py --archive-type=gz /var/scm/repos/aberdeen /var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542 && cd /var/scm/backup/cms02/repos/aberdeen && rm -f last && ln -s 2012/05/07/182542 last
Beginning hot backup of '/var/scm/repos/aberdeen'.
Youngest revision is 1599
Backing up repository to '/var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542/aberdeen-1599'...
Done.
Archiving backup to '/var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542/aberdeen-1599.tar.gz'...
Archive created, removing backup '/var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542/aberdeen-1599'...
-bash: svnlook: command not found
tar: /var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542/aberdeen-.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error exit delayed from previous errors
=== scm-tgzcheck-ztvf : tgz /var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542/aberdeen-.tar.gz integrity check FAILURE 2
=== scm-backup-repo : tgz /var/scm/backup/cms02/repos/aberdeen/2012/05/07/182542/aberdeen-.tar.gz rev integrity check failure 2
[root@cms02 aberdeen]#
[root@cms02 aberdeen]#

scm-backup-rsync

Generate ssh config and setup keys for auto rsync:

[root@cms02 ~]# sshconf-
[root@cms02 ~]# local-tags                        ## corresponds to backup nodes
C N H1
[root@cms02 ~]# sshconf-gen                       ## generate .ssh/config with sections for the local-tags

[root@cms02 .ssh]# mv id_rsa id_rsa.pub former/   ## move aside old keys that I do not has passphrases for
[root@cms02 .ssh]# ssh--keygen
[root@cms02 ~]# ssh--putkeys                      ## mammoth session of password and passphrase entry to all backup nodes

[root@cms02 ~]# ssh--agent-start
[root@cms02 ~]# scm-backup-rsync                  ## manual test of backup

Bogged down by big dybsvn tarballs

Move big fat foreigners into dedicated folder:

[blyth@cms02 scm]$ mkdir -p foreign/cms02/tracs
[blyth@cms02 scm]$ mv backup/cms02/tracs/toysvn foreign/cms02/tracs/
[blyth@cms02 scm]$ mv backup/cms02/tracs/dybsvn foreign/cms02/tracs/
[blyth@cms02 scm]$ mv backup/cms02/tracs/dybaux foreign/cms02/tracs/

Also remote DNA checking

Restrict remote DNA check to the source LOCAL_NODE to avoid slow irrelevant DNA checks

Add belle1 to backup nodes

  1. add to local-tags , sshconf-gen adding to ssh config and ssh--putkey N1 from C2R

DNA check failures from a disappearing L (python version difference?)

Digest matches but the size has lost an L:

=== scm-backup-dnachecktgzs : OK /var/scm/backup/cms02/tracs/env/2012/05/01/130104/env.tar.gz
1c1
< {'dig': '9e1a36c02cdb837c55404d38b33def8b', 'size': 53648743}
---
> {'dig': '9e1a36c02cdb837c55404d38b33def8b', 'size': 53648743L}

Reproduce:

[root@cms02 ~]# scm-backup-dnachecktgzs /var/scm/backup/cms02/repos/heprez
1c1
< {'dig': 'df6aa49ac8917b8d4144de5abb1a02cc', 'size': 4215148L}
---
> {'dig': 'df6aa49ac8917b8d4144de5abb1a02cc', 'size': 4215148}
=== scm-backup-dnachecktgzs : FAIL /var/scm/backup/cms02/repos/heprez/2012/05/01/130104/heprez-764.tar.gz
=== scm-backup-dnachecktgzs : OK /var/scm/backup/cms02/repos/heprez/2012/05/07/183131/heprez-764.tar.gz
1c1
< {'dig': '6a1024a2b03e128bfd65aad168ff3c90', 'size': 4214891L}
---
> {'dig': '6a1024a2b03e128bfd65aad168ff3c90', 'size': 4214891}
=== scm-backup-dnachecktgzs : FAIL /var/scm/backup/cms02/repos/heprez/2012/04/29/130108/heprez-764.tar.gz
1c1
< {'dig': '2f02f6b5e844ac07c665ebf82850e561', 'size': 4214780L}
---
> {'dig': '2f02f6b5e844ac07c665ebf82850e561', 'size': 4214780}
=== scm-backup-dnachecktgzs : FAIL /var/scm/backup/cms02/repos/heprez/2012/04/30/130103/heprez-764.tar.gz
[root@cms02 ~]#
[root@cms02 ~]#
[root@cms02 ~]#
[root@cms02 ~]# ll /var/scm/backup/cms02/repos/heprez/2012/05/07/183131/heprez-764.tar.gz.dna
-rw-r--r--  1 root root 61 May  7 18:33 /var/scm/backup/cms02/repos/heprez/2012/05/07/183131/heprez-764.tar.gz.dna
[root@cms02 ~]# cat  /var/scm/backup/cms02/repos/heprez/2012/05/07/183131/heprez-764.tar.gz.dna
{'dig': 'f3e450f405704c1cabb3f32f4196af96', 'size': 4226393}
[root@cms02 ~]#

Seems harmless, will go away once have all new .tar.gz.dna

chkconfig setup : allowing auto-revival on reboot

Add chkconfig lines to:

[blyth@cms02 init.d]$ ls -l httpd
lrwxrwxrwx  1 root root 50 May  7 15:34 httpd -> /data/env/system/apache/httpd-2.0.64/bin/apachectl
# chkconfig: 345 95 05
# description: source apache serving Trac + SVN repositories via mod python and mod_dav_svn

cms02 chkconfig for auto-restart

June 20 2012, OOM killer httpd issue forced restart of C2, symptom: C2 is ping-able but not ssh-able. Restart failed to auto-restart httpd. Need to remember to chkconfig add

[blyth@cms02 ~]$ sudo /sbin/chkconfig --list httpd
service httpd supports chkconfig, but is not referenced in any runlevel (run 'chkconfig --add httpd')
[blyth@cms02 ~]$
[blyth@cms02 ~]$ sudo /sbin/chkconfig --add httpd
[blyth@cms02 ~]$
[blyth@cms02 ~]$
[blyth@cms02 ~]$ sudo /sbin/chkconfig --list httpd
httpd           0:off   1:off   2:off   3:on    4:on    5:on    6:off
[blyth@cms02 ~]$
[blyth@cms02 ~]$ sudo /sbin/service httpd start
[blyth@cms02 ~]$

cron hookup + notification mail

Adapt root and blyth cronlines from G:

SHELL = /bin/bash
59 13 * * *  ( export HOME=/root ; export NODE=cms02 ; export MAILTO=blyth@hep1.phys.ntu.edu.tw ; export ENV_HOME=/home/blyth/env ; . /home/blyth/env/env.bash ; env-  ; scm-backup- ; scm-backup-nightly ) >  /var/scm/log/scm-backup-nightly.log 2>&1

hfag chkconfig : starts undesired services

Tidied up the extraneous apache links in /etc/init.d/

Recovering from Yet Another NTU Powercut, Thu 10 May 2012 ~13:30

  1. cms01 : cannot access cms01, no ping:

       simon:env blyth$ date
       Thu 10 May 2012 14:17:36 CST
       simon:env blyth$ ping cms01.phys.ntu.edu.tw
       PING cms01.phys.ntu.edu.tw (140.112.101.190): 56 data bytes
    
    #. from console, twas stuck at BIOS initialization ... powercycling regained access
    
    #. usual manual mount::
    
          [blyth@cms01 ~]$ sudo mount /data
    
    #. do a manual ``exist-start`` as improper shutdown, this hangs ... but doing a exist-service-start succeeds
       and the XMLDB is operational, succeeded to to a heprez-propagate to G for backup
  2. hfag : again started too much, manually stop tomcat and exist, httpd OK:

    [blyth@hfag blyth]$ sudo /sbin/service tomcat stop
    [blyth@hfag blyth]$ sudo /sbin/service exist stop