The mpi API on beowulf at Hanford

Legend:
green ball: Normal status or debugging message
yellow ball: Notable condition which may be a non-fatal error
orange ball: Error condition not fatal to job
red ball: Error condition fatal to job
blue ball: Notable condition which is not an error
purple ball: Currently undefined
email: Condition requires email notification of the responsible administrator of this API
telephone: Condition requires phone notification of the responsible administrator of this API

11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP archiveLog file "/ldas_outgoing/logs/LDASmpi.log.html" already closed. (archived as /ldas_outgoing/logs/archive/mpiAPI/LDASmpi.847567169)
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP closeListenSock no cid registered for service 'data'
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP mpi::init unused data port 10018 closed
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP mpi::init port 10018 (jobstate) opened on beowulf as sock7
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP bgLoop Looping process watchlogs started
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP openListenSock port 10016 (operator) opened on beowulf as sock8
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP openListenSock port 10017 (emergency) opened on beowulf as sock9
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP leakLogger inital size of mpi API: 21008 kB
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 STARTUP bgLoop Looping process etchosts started
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 IDLE bgLoop Looping process statpagefile started
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 IDLE bgLoop Looping process killedjobreaper started
11/14/06-11:19:21 PST 
11/14/06-19:19:21 GMT 847567175 IDLE bgLoop Looping process logrotate started
11/14/06-11:19:22 PST 
11/14/06-19:19:22 GMT 847567176 IDLE setFTPandHTTPinfo (::FTPURL 'ftp://198.129.208.245') (::FTPDIR '') (::HTTPURL 'http://198.129.208.245/ldas_outgoing/jobs') (::HTTPDIR '/ldas_outgoing/jobs') (::GRIDFTPURL 'gridftp:/export/grid/ldas') (::GRIDFTPDIR '/export/grid/ldas') (::LDAS_GATEWAY 'ldas 198.129.208.245') (::LDAS_SYSTEM 'ldas-wa') (::RUNCODE 'LDAS-WA')
11/14/06-11:19:27 PST 
11/14/06-19:19:27 GMT 847567181 STARTUP mpi::killAllMpirun cleaning up for user ldas
11/14/06-11:19:28 PST 
11/14/06-19:19:28 GMT 847567182 STARTUP mpi::killAllMpirun ran kill 10 times in 1.095 seconds
11/14/06-11:19:28 PST 
11/14/06-19:19:28 GMT 847567182 STARTUP mpi::prestartLamds running lamboot for user search01
11/14/06-11:19:28 PST 
11/14/06-19:19:28 GMT 847567182 STARTUP mpi::prestartLamds running lamboot for user search02
11/14/06-11:19:29 PST 
11/14/06-19:19:29 GMT 847567183 STARTUP mpi::prestartLamds running lamboot for user search03
11/14/06-11:19:30 PST 
11/14/06-19:19:30 GMT 847567184 STARTUP mpi::prestartLamds running lamboot for user search04
11/14/06-11:19:31 PST 
11/14/06-19:19:31 GMT 847567185 STARTUP mpi::prestartLamds running lamboot for user search05
11/14/06-11:19:31 PST 
11/14/06-19:19:31 GMT 847567185 STARTUP mpi::prestartLamds running lamboot for user search06
11/14/06-11:19:32 PST 
11/14/06-19:19:32 GMT 847567186 STARTUP mpi::prestartLamds running lamboot for user search07
11/14/06-11:19:33 PST 
11/14/06-19:19:33 GMT 847567187 STARTUP mpi::prestartLamds running lamboot for user search08
11/14/06-11:19:34 PST 
11/14/06-19:19:34 GMT 847567188 STARTUP mpi::prestartLamds running lamboot for user search09
11/14/06-11:19:35 PST 
11/14/06-19:19:35 GMT 847567189 STARTUP mpi::prestartLamds running lamboot for user search10
11/14/06-11:19:35 PST 
11/14/06-19:19:35 GMT 847567189 STARTUP mpi::prestartLamds running lamboot for user search11
11/14/06-11:19:37 PST 
11/14/06-19:19:37 GMT 847567191 STARTUP mpi::prestartLamds running lamboot for user search12
11/14/06-11:19:38 PST 
11/14/06-19:19:38 GMT 847567192 STARTUP mpi::prestartLamds running lamboot for user search13
11/14/06-11:19:38 PST 
11/14/06-19:19:38 GMT 847567192 STARTUP mpi::prestartLamds running lamboot for user search14
11/14/06-11:19:39 PST 
11/14/06-19:19:39 GMT 847567193 STARTUP mpi::prestartLamds running lamboot for user search15
11/14/06-11:19:40 PST 
11/14/06-19:19:40 GMT 847567194 STARTUP mpi::prestartLamds running lamboot for user search16
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search01 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search02 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search03 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search04 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search05 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search06 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search07 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search08 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search09 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search10 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search11 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 STARTUP mpi::prestartLamds STARTUP search12 beowulf ok!
11/14/06-11:19:41 PST 
11/14/06-19:19:41 GMT 847567195 IDLE setFTPandHTTPinfo (::FTPURL 'ftp://198.129.208.245') (::FTPDIR '') (::HTTPURL 'http://198.129.208.245/ldas_outgoing/jobs') (::HTTPDIR '/ldas_outgoing/jobs') (::GRIDFTPURL 'gridftp:/export/grid/ldas') (::GRIDFTPDIR '/export/grid/ldas') (::LDAS_GATEWAY 'ldas 198.129.208.245') (::LDAS_SYSTEM 'ldas-wa') (::RUNCODE 'LDAS-WA')
11/14/06-11:19:42 PST 
11/14/06-19:19:42 GMT 847567196 IDLE mpi::updateCmonNodelist updated ::beowulfNodes in cntlmonAPI to 'beowulf beowulf beowulf beowulf beowulf beowulf beowulf beowulf beowulf'
11/14/06-11:19:42 PST 
11/14/06-19:19:42 GMT 847567196 STARTUP mpi::prestartLamds STARTUP search13 beowulf ok!
11/14/06-11:19:42 PST 
11/14/06-19:19:42 GMT 847567196 STARTUP mpi::prestartLamds STARTUP search14 beowulf ok!
11/14/06-11:19:43 PST 
11/14/06-19:19:43 GMT 847567197 STARTUP mpi::prestartLamds STARTUP search15 beowulf ok!
11/14/06-11:19:44 PST 
11/14/06-19:19:44 GMT 847567198 STARTUP mpi::prestartLamds STARTUP search16 beowulf ok!
11/14/06-11:19:44 PST 
11/14/06-19:19:44 GMT 847567198 STARTUP mpi::killAllMpirun {ldas@beowulf:mpirun: child process exited abnormally} {ldas@beowulf:wrapperAPI: child process exited abnormally}
11/14/06-11:19:47 PST 
11/14/06-19:19:47 GMT 847567201 IDLE mpi::updateCmonNodelist updated ::beowulfNodes in cntlmonAPI to 'beowulf beowulf beowulf beowulf beowulf beowulf beowulf beowulf beowulf'
01/31/07-17:24:30 PST 
02/01/07-01:24:30 GMT 854328284 SHUTDOWN closeListenSock port 10016 (sock8) (operator) closed on beowulf
pehrens@ligo.caltech.edu,  gmendell@ligo-wa.caltech.edu, bjohnson@ligo-wa.caltech.edu 854328284 SHUTDOWN mpi::sHuTdOwN Subject: LDAS Hanford mpi shutdown at 854328284 ( 01/31/07 17:24:30 PST ); Body: mpi shutting down NOW
01/31/07-17:24:30 PST 
02/01/07-01:24:30 GMT 854328284 search12 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:30 PST 
02/01/07-01:24:30 GMT 854328284 search04 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:30 PST 
02/01/07-01:24:30 GMT 854328284 search14 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:31 PST 
02/01/07-01:24:31 GMT 854328285 search06 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:31 PST 
02/01/07-01:24:31 GMT 854328285 search16 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:31 PST 
02/01/07-01:24:31 GMT 854328285 search08 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:31 PST 
02/01/07-01:24:31 GMT 854328285 search01 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:32 PST 
02/01/07-01:24:32 GMT 854328286 search11 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:32 PST 
02/01/07-01:24:32 GMT 854328286 search03 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:32 PST 
02/01/07-01:24:32 GMT 854328286 search13 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:32 PST 
02/01/07-01:24:32 GMT 854328286 search05 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:33 PST 
02/01/07-01:24:33 GMT 854328287 search15 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:33 PST 
02/01/07-01:24:33 GMT 854328287 search07 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:33 PST 
02/01/07-01:24:33 GMT 854328287 search10 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 search09 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 search02 mpi::abortJobInDcApi datacond API unreachable!!: sock::open: could not connect to datacond emergency on port 10014 on datacon. {connection refused}
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search01'
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search02'
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search03'
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search04'
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:34 PST 
02/01/07-01:24:34 GMT 854328288 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search05'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search06'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search07'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search08'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search09'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search10'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search11'
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:35 PST 
02/01/07-01:24:35 GMT 854328289 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search12'
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search13'
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search14'
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search15'
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN ::mpi::atExit calling lam::halt for user 'search16'
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN lam::halt Permission denied, please try again. Permission denied, please try again. Permission denied (publickey,gssapi-with-mic,password).
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN closeListenSock port 10017 (sock9) (emergency) closed on beowulf
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN closeListenSock no cid registered for service 'data'
01/31/07-17:24:36 PST 
02/01/07-01:24:36 GMT 854328290 SHUTDOWN closeLog /ldas_outgoing/logs/LDASmpi.log.html (file5) closed