Hi
Unfortunately, FP1 for Domino 12.0.1 still contains some bugs.
I know of two Domino environments running 12.0.1 FP1 (at two different companies) that have constantly experienced issues since moving to Domino 12.0.1 and then to 12.0.1 FP1.
Before FP1 there were issues with DAOS.
Now, with FP1, there are issues with other tasks (mainly Replicator), and all cases were accompanied by the error "The caller's SemWait timeout expired." for a specific database.
Here is a common case:
Environment: two or more Domino 12.0.1 FP1 servers in a cluster, Windows Server 2016 or higher
The case: you start to see errors like this:
[1358:0002-135C] 16.07.2022 22:07:47 Unable to replicate <SERVER> <DATABASE>: The caller's SemWait timeout expired.
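To see which databases are affected and how often, a console log can be scanned for this message. The following is a minimal sketch, assuming the log lines look exactly like the one quoted above; the regular expression, the function name `count_semwait_errors` and the sample server and database names are illustrative and not part of Domino.

```python
import re
from collections import Counter

# Pattern modelled on the single console line quoted above. Other Domino
# versions or locales may format the prefix and timestamp differently.
SEMWAIT_RE = re.compile(
    r"^\[(?P<thread>[0-9A-Fa-f]+:[0-9A-Fa-f]+-[0-9A-Fa-f]+)\]\s+"
    r"(?P<date>\d{2}\.\d{2}\.\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"Unable to replicate (?P<target>.+?): "
    r"The caller's SemWait timeout expired\.\s*$"
)


def count_semwait_errors(lines):
    """Count SemWait replication errors per '<SERVER> <DATABASE>' target."""
    counts = Counter()
    for line in lines:
        match = SEMWAIT_RE.match(line.strip())
        if match:
            counts[match.group("target")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; replace with the lines of a real console log.
    sample = [
        "[1358:0002-135C] 16.07.2022 22:07:47 Unable to replicate "
        "SRV1/ACME mail\\jdoe.nsf: The caller's SemWait timeout expired.",
        "[1358:0002-135C] 16.07.2022 22:08:10 Replicator is idle",
    ]
    for target, hits in count_semwait_errors(sample).most_common():
        print(f"{hits:5d}  {target}")
```

A database that keeps appearing at the top of this list is a candidate for checking whether a compact task is still running against it.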
It is unclear why it started happening, but some of the cases might be connected with a stuck nCompact.exe on this database and high CPU usage by the MTA task. It is not clear, though, whether nCompact hung and caused the error "The caller's SemWait timeout expired." or whether the error was the reason for nCompact hanging.
Nothing helped except a Domino restart.
Sometimes Domino even refused to shut down properly: most of the tasks quit successfully, but the final message "server shutdown complete" might never appear, so in some cases it was necessary to kill the server.
After the server started again, you could see a new error for the database, saying "<DATETIME> Database <DATABASE> time is too far in the future."
A check of the database icon showed that its modification time was in the future.
Hi, did you open a support case with HCL specifically for 12.0.1 FP1?
Hi, yes, I did.
Keep us posted about the outcome of the support ticket.
What you are seeing is time creep: time-related actions (e.g. creating a unique number, creating a document UniqueID, creating a new document, creating new databases, etc.) are performed so often that Domino's internal time moves into the future. This is not a product issue; it is working as designed. It is typically caused by your own self-developed applications, not by the product. The error message you are seeing is a warning to admins to watch out before more damage is caused. Typically the root cause of time creep is an agent that is going crazy (e.g. creating millions of documents in a short period of time). So please take a look at what your agents are doing...
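The time-creep mechanism described in this comment can be illustrated with a toy model. This is only a sketch of the general idea (a clock that must hand out a unique timestamp for every request), not Domino's actual implementation; the class `CreepingClock`, the function `creep_after_burst` and the integer tick size are all invented for illustration.

```python
class CreepingClock:
    """Toy clock that never issues the same timestamp twice."""

    def __init__(self, tick=1):
        self.tick = tick         # smallest distinguishable time step (assumed)
        self.last_issued = None  # most recent timestamp handed out

    def next_stamp(self, wall_time):
        """Return a stamp that is >= wall_time and > every earlier stamp."""
        if self.last_issued is None or wall_time > self.last_issued:
            stamp = wall_time                     # real time is ahead: use it
        else:
            stamp = self.last_issued + self.tick  # collision: push forward
        self.last_issued = stamp
        return stamp


def creep_after_burst(requests, wall_time=0, tick=1):
    """How far ahead of wall_time the clock is after a burst of requests."""
    clock = CreepingClock(tick)
    stamp = wall_time
    for _ in range(requests):
        # The real clock is frozen at wall_time during the burst.
        stamp = clock.next_stamp(wall_time)
    return stamp - wall_time


if __name__ == "__main__":
    # Drift, in ticks, after one million requests within a single tick.
    print(creep_after_burst(1_000_000))
```

In this model the drift grows by one tick for every request that arrives faster than the real clock advances, and it only disappears once real time catches up with the last issued stamp.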
All the issues happened with mail databases without any customizations. There were no custom agents dealing with the mail databases either.
We never had such issues before moving to 12.0.1.
We didn't change the server configuration and didn't develop or introduce anything new.
Besides that, HCL confirmed that 12.0.1 had several deadlock errors related to DAOS (see https://ds_infolib.hcltechsw.com/ldd/fixlist.nsf?OpenDatabase&Start=1&Count=30&Expand=2.11); this is just another example of a deadlock, but in another place.
Hmm, I don't see that the problem you're facing is related to the DAOS deadlock mentioned in the fix list. What is the outcome of your support ticket so far?
The text of the DAOS-related error was exactly the same; see my earlier post https://ypastov.blogspot.com/2022/01/both-domino-1201-and-if1-contain-bug.html
There is no outcome yet; HCL is still investigating the problem.