On Domino thread IDs and Linux/Windows process IDs
A short tip on something which many people are probably not aware of, but which can be a huge time saver when you’re troubleshooting a Domino problem.
As an example, see this error message from a Domino log:
[062372:000014-00007F8001776700] 28/02/2023 13:16:20 CertStore: Error opening CertStore database [CN=PROD02/OU=SRV/O=ACME!!certstore.nsf] : The server is not responding. The server may be down or you may be experiencing network or VPN problems. Contact your system administrator if this problem persists.
[062372:000014-00007F8001776700] 28/02/2023 13:16:20 CertStore: Error opening CertStore on [CN=PROD02/OU=SRV/O=ACME] : The server is not responding. The server may be down or you may be experiencing network or VPN problems. Contact your system administrator if this problem persists.
Your first hunch might be that this error is caused by the CertMgr process; it’s related to the Certificate Store, after all. But is this really the case? The answer is in the thread ID at the beginning of the line, 062372:000014-00007F8001776700, and specifically in its first part: 062372.
Let’s have a look at the process IDs of the various Domino tasks:
ps -ef|grep notes
notes 61849 1 0 Feb28 ? 00:00:00 /bin/bash /opt/nashcom/startscript/rc_domino_script start
notes 61906 61849 0 Feb28 ? 00:14:58 /opt/hcl/domino/notes/latest/linux/server
notes 61914 61906 0 Feb28 ? 00:01:01 /opt/hcl/domino/notes/latest/linux/logasio NOTESLOGGER reserved
notes 61922 61906 0 Feb28 ? 00:02:59 /opt/hcl/domino/notes/latest/linux/event
notes 62145 61906 0 Feb28 ? 00:02:15 /opt/hcl/domino/notes/latest/linux/dircat -x
notes 62147 61906 0 Feb28 ? 00:01:33 /opt/hcl/domino/notes/latest/linux/adminp
notes 62148 61906 0 Feb28 ? 00:00:34 /opt/hcl/domino/notes/latest/linux/smtp
notes 62149 61906 0 Feb28 ? 00:00:21 /opt/hcl/domino/notes/latest/linux/daosmgr
notes 62371 61906 0 Feb28 ? 00:00:25 /opt/hcl/domino/notes/latest/linux/amgr
notes 62372 61906 3 Feb28 ? 00:00:35 /opt/hcl/domino/notes/latest/linux/http
notes 62373 61906 0 Feb28 ? 00:00:16 /opt/hcl/domino/notes/latest/linux/replica
notes 62374 61906 0 Feb28 ? 00:02:17 /opt/hcl/domino/notes/latest/linux/router
notes 62375 61906 1 Feb28 ? 00:24:10 /opt/hcl/domino/notes/latest/linux/update
notes 62578 62371 0 Feb28 ? 00:00:17 /opt/hcl/domino/notes/latest/linux/amgr -e 1
notes 62579 62371 1 Feb28 ? 00:18:50 /opt/hcl/domino/notes/latest/linux/amgr -e 2
notes 62580 62371 0 Feb28 ? 00:17:54 /opt/hcl/domino/notes/latest/linux/amgr -e 3
notes 62581 62371 4 Feb28 ? 01:19:52 /opt/hcl/domino/notes/latest/linux/amgr -e 4
notes 62597 62374 0 Feb28 ? 00:00:46 /opt/hcl/domino/notes/latest/linux/mtc
notes 67060 61906 0 Feb28 ? 00:00:30 /opt/hcl/domino/notes/latest/linux/certmgr
This is Linux, but on Windows you can see the process IDs in Task Manager. Do you notice the match between the first part of the Domino thread ID (062372) and the PID of the HTTP task (62372)? So the problem wasn’t caused by the CertMgr task but by the HTTP task, and in this case it was solved by simply restarting the HTTP task.
So if you have a weird log message and you want to know which task was responsible, just check the thread ID and compare it to the PIDs of the tasks on your OS.
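If you do this often, the lookup is easy to script. Below is a minimal Python sketch, not an official HCL tool: it assumes the log prefix always looks like the one in the example above (digits, a colon, then the rest of the thread ID in square brackets) and that the process list comes from ps -ef with the usual eight columns. The function names pid_from_log_line and tasks_by_pid are made up for this illustration.

```python
import re

# Bracketed prefix of a Domino log line, e.g. [062372:000014-00007F8001776700].
# Only the first group (the zero-padded PID) is used; the parts after the
# colon are matched but not interpreted here.
PREFIX_RE = re.compile(r"^\[(\d+):[0-9A-Fa-f]+-[0-9A-Fa-f]+\]")


def pid_from_log_line(line):
    """Return the PID from the log prefix as an int, or None if no prefix."""
    match = PREFIX_RE.match(line.strip())
    return int(match.group(1)) if match else None


def tasks_by_pid(ps_output):
    """Map PID -> command line, given the text output of `ps -ef`.

    Assumes the standard column order: UID PID PPID C STIME TTY TIME CMD.
    Rows that do not fit (header, stray lines) are skipped.
    """
    tasks = {}
    for row in ps_output.splitlines():
        fields = row.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        tasks[int(fields[1])] = fields[7]
    return tasks


if __name__ == "__main__":
    log_line = (
        "[062372:000014-00007F8001776700] 28/02/2023 13:16:20 "
        "CertStore: Error opening CertStore database"
    )
    ps_output = (
        "notes 62372 61906 3 Feb28 ? 00:00:35 /opt/hcl/domino/notes/latest/linux/http\n"
        "notes 67060 61906 0 Feb28 ? 00:00:30 /opt/hcl/domino/notes/latest/linux/certmgr\n"
    )
    pid = pid_from_log_line(log_line)
    # Prints: 62372 /opt/hcl/domino/notes/latest/linux/http
    print(pid, tasks_by_pid(ps_output).get(pid, "no matching process"))
```

Converting the prefix to an integer drops the leading zero, so 062372 from the log compares equal to 62372 from the process list.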
Thanks Daniel Nashed for teaching me this!