The following fixes have been made:
- Update LocalTransactionLeader and ReplicaTransaction to better handle failures of the TransactionService/Transact RPC. This fixes an edge case where the RPC fails after the replica has sent its prepare response but has not yet granted its write token. Previously, this could leave the write lock waiting indefinitely (or at least until the write transaction timed out).
- Fix a race condition in RemoteTransactionLeader that could cause a yield request to be sent to the (remote) transaction leader before the grant message was sent. When this happened, the remote leader logged a message but did not yield the token, because it did not hold it yet. Because the yield request was never retried, the token was never yielded to the higher-priority transaction, and the write lock could deadlock.
- Add a recurring job to DefaultWriteTokenRegistry that re-sends the release request for a token if the token holder hasn't released it within the (configurable) timeout.
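
To illustrate the recurring retry job in the last item, here is a minimal sketch in Java. It is not the actual DefaultWriteTokenRegistry code: the class `TokenReleaseRetrier`, its methods, and the `releaseRequestSender` callback are hypothetical names introduced for illustration, and the sketch assumes that re-sending a release request is safe to repeat.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a release-request retry job; not the real registry. */
final class TokenReleaseRetrier implements AutoCloseable {
    // Token id -> time (in nanoseconds) at which a release was last requested.
    private final Map<String, Long> lastRequestNanos = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long timeoutNanos;
    private final Consumer<String> releaseRequestSender; // assumed transport hook

    TokenReleaseRetrier(Duration timeout, Consumer<String> releaseRequestSender) {
        if (timeout.isZero() || timeout.isNegative()) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        this.timeoutNanos = timeout.toNanos();
        this.releaseRequestSender = releaseRequestSender;
    }

    /** Starts the recurring job; it runs once per timeout interval. */
    void start() {
        scheduler.scheduleWithFixedDelay(
                () -> retryOverdue(System.nanoTime()),
                timeoutNanos, timeoutNanos, TimeUnit.NANOSECONDS);
    }

    /** Sends the initial release request and records when it was sent. */
    void requestRelease(String tokenId, long nowNanos) {
        lastRequestNanos.put(tokenId, nowNanos);
        releaseRequestSender.accept(tokenId);
    }

    /** Called once the holder has released the token; stops further retries. */
    void onReleased(String tokenId) {
        lastRequestNanos.remove(tokenId);
    }

    /** Re-sends every request that has been outstanding for at least the timeout. */
    int retryOverdue(long nowNanos) {
        int retried = 0;
        for (Map.Entry<String, Long> entry : lastRequestNanos.entrySet()) {
            Long sentAt = entry.getValue();
            if (nowNanos - sentAt < timeoutNanos) {
                continue;
            }
            // Only retry if the entry was not released or refreshed concurrently.
            if (lastRequestNanos.replace(entry.getKey(), sentAt, nowNanos)) {
                try {
                    releaseRequestSender.accept(entry.getKey());
                } catch (RuntimeException e) {
                    // Swallow so that one failed send does not cancel the recurring job.
                }
                retried++;
            }
        }
        return retried;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Passing the current time into `retryOverdue` keeps the retry logic testable without waiting for the scheduler. Catching exceptions inside the loop matters because a `ScheduledExecutorService` suppresses all later runs of a periodic task once one run throws.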