
Friday, February 18, 2022

If your server hangs during DAOS resync...


Recently I worked on an issue with a Domino server that became unresponsive after a program document ran the command "tell daosmgr resync force" at night. We only noticed it in the morning, and you know how stressful that can be :)

I checked the server via the Domino console and it looked alive: it reacted to commands like "sh ta" or "sh ser", but Notes clients couldn't connect to it. I also noticed other messages indicating that something was wrong, such as:

Error connecting to server <SERVERNAME>: Remote system no longer responding
Server task Router on <SERVERNAME> is no longer responding
Server task SMTP Server on <SERVERNAME> is no longer responding

I restarted the server, and when it came back up I noticed the following errors:

The DAOS catalog cannot be opened. DAOS cannot operate normally.: The integrity of a database storage container has been lost - the container will be rebuilt.

and also many errors like this:

The database D:\IBM\Domino\data\<database.nsf> was unable to write to file D:\HCL\DAOS\0001\912B6AA17EFDE1C9C12587E9002FDA051912B6AA17EFDE1P.nlo: Release 12.0.1 HF24|January 19, 2022

Without thinking twice I ran "tell daosmgr resync" manually, and within 10 minutes the server hung again with the same symptoms: it reacted to commands on the Domino console, but Notes clients couldn't connect.

I restarted the server again. All the errors from the previous start appeared again, but this time I checked the DAOS status and it was REBUILDING.

This time I decided not to run a resync while the catalog was rebuilding, but to wait until the evening and then try the resync again. That time the resync worked without issues. The only stupid thing I did was not checking the catalog status before running the resync, so I can't say whether the server did anything on its own during the day before I ran the resync again in the evening.
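
The lesson here (check the catalog state before starting a resync) can be sketched as a small helper. This is only an illustration, not an HCL tool: it assumes you have already captured the console text of a DAOS status command (for example "tell daosmgr status") and that it contains a line shaped like "Catalog Status: <STATE>". That line format and the function names are assumptions for this sketch; only the REBUILDING state is taken from the incident above.

```python
import re

# States in which a resync should be postponed. Only REBUILDING comes from
# the incident described above; extend the set for your own environment.
BUSY_STATES = {"REBUILDING"}


def catalog_state(console_output):
    """Return the catalog state found in the console text, or None.

    Assumes a line shaped like 'Catalog Status: <STATE>' (hypothetical format).
    """
    match = re.search(
        r"catalog\s+status\s*:\s*([A-Za-z_ ]+)", console_output, re.IGNORECASE
    )
    if match is None:
        return None
    return match.group(1).strip().upper()


def safe_to_resync(console_output):
    """True only if a state was found and it is not one of the busy states."""
    state = catalog_state(console_output)
    return state is not None and state not in BUSY_STATES
```

If the state cannot be found at all, the helper errs on the side of caution and reports that a resync is not safe, which matches what I should have done: look first, then run the command.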

I discussed this case with HCL, and they told me that one possible solution is to resync the catalog while the server is down by running "ndesign.exe resync" (similar to the way you can compact names.nsf or log.nsf).
