splice_locked + channel_reestablish bug fixes#4654
Conversation
When a splice confirms while disconnected after our channel_reestablish was generated, promotion clears pending_splice before their reestablish arrives. Fall back to the current promoted splice funding txid so we still send splice_locked after reestablishing.
If we have pending updates to send to our counterparty on reestablishment, while also pending a `splice_locked` send, then we must send our `splice_locked` first as the pending updates are considering the post-splice-locked state.
A stale ChannelManager can be reloaded after a monitor update has already completed in a prior runtime and released its post-update messages to the counterparty. The latest ChannelMonitor is not stale, but the serialized manager may still contain the old in-flight monitor state and `monitor_pending_*` resend flags from before the completion action ran. This becomes observable when startup monitor-completion background events are interleaved with splice promotion. On reload, the completed monitor update is queued as a background event. If a splice confirmation is processed before that background event fully resumes the channel, splice promotion can create a new `RenegotiatedFundingLocked` monitor update. The old completion is then blocked behind the new in-flight splice update. Once the channel reconnects and the splice update completes, `monitor_updating_restored` may consume the stale `monitor_pending_revoke_and_ack` / `monitor_pending_commitment_signed` flags and release a duplicate `revoke_and_ack` or `commitment_signed`. The peer's `channel_reestablish` commitment numbers are authoritative for this case. If `next_remote_commitment_number` says the peer is not waiting for a `revoke_and_ack`, clear `monitor_pending_revoke_and_ack`. Likewise, if `next_local_commitment_number` says the peer already has our latest `commitment_signed`, clear `monitor_pending_commitment_signed`.
|
👋 Thanks for assigning @TheBlueMatt as a reviewer! |
| }).or_else(|| { | ||
| // If a splice confirms after we've sent `channel_reestablish` but before we've | ||
| // received theirs, we may promote the splice while disconnected and clear | ||
| // `pending_splice`. We still need to send `splice_locked` after reestablishing | ||
| // unless the promoted txid was already included in our `channel_reestablish`. | ||
| let current_funding_txid = self.funding.get_funding_txid()?; | ||
| (self.funding.channel_transaction_parameters.splice_parent_funding_txid.is_some() | ||
| && Some(current_funding_txid) != funding_locked_txid_sent_in_reestablish) | ||
| .then(|| msgs::SpliceLocked { | ||
| channel_id: self.context.channel_id, | ||
| splice_txid: current_funding_txid, | ||
| }) | ||
| }); |
There was a problem hiding this comment.
This or_else fires whenever the first closure returns None, which happens not only when self.pending_splice is None (the intended case per the comment), but also when self.pending_splice is Some and sent_funding_txid was filtered out because it matched funding_locked_txid_sent_in_reestablish.
In a sequential-splice scenario (first splice promoted, second splice negotiated and confirmed locally, then reconnect), this would send splice_locked for the previously promoted splice's txid. If the peer has already promoted that splice and has a pending_splice for the second splice, their splice_locked handler would check negotiated_candidates for the old txid, not find it, and force-close the channel with "unknown splice funding txid".
Consider adding self.pending_splice.is_none() to the guard to match the comment's intent—this branch should only handle the case where pending_splice was cleared due to promotion:
| }).or_else(|| { | |
| // If a splice confirms after we've sent `channel_reestablish` but before we've | |
| // received theirs, we may promote the splice while disconnected and clear | |
| // `pending_splice`. We still need to send `splice_locked` after reestablishing | |
| // unless the promoted txid was already included in our `channel_reestablish`. | |
| let current_funding_txid = self.funding.get_funding_txid()?; | |
| (self.funding.channel_transaction_parameters.splice_parent_funding_txid.is_some() | |
| && Some(current_funding_txid) != funding_locked_txid_sent_in_reestablish) | |
| .then(|| msgs::SpliceLocked { | |
| channel_id: self.context.channel_id, | |
| splice_txid: current_funding_txid, | |
| }) | |
| }); | |
| }).or_else(|| { | |
| // If a splice confirms after we've sent `channel_reestablish` but before we've | |
| // received theirs, we may promote the splice while disconnected and clear | |
| // `pending_splice`. We still need to send `splice_locked` after reestablishing | |
| // unless the promoted txid was already included in our `channel_reestablish`. | |
| let current_funding_txid = self.funding.get_funding_txid()?; | |
| (self.pending_splice.is_none() | |
| && self.funding.channel_transaction_parameters.splice_parent_funding_txid.is_some() | |
| && Some(current_funding_txid) != funding_locked_txid_sent_in_reestablish) | |
| .then(|| msgs::SpliceLocked { | |
| channel_id: self.context.channel_id, | |
| splice_txid: current_funding_txid, | |
| }) |
Review SummaryOne issue found:
The remaining changes look correct:
|
| }) | ||
| }).or_else(|| { | ||
| // If a splice confirms after we've sent `channel_reestablish` but before we've | ||
| // received theirs, we may promote the splice while disconnected and clear |
There was a problem hiding this comment.
I don't understand the "we may promote the splice while disconnected and clear" part. Isn't the fact that we've sent channel_reestablish indicate we are connected? Should this have said, "we may have promoted the splice while disconnected and have cleared"?
Commit message is similarly confusing to me.
This PR features commits to fix three different issues found by the
chanmon_consistencytarget around the interaction ofsplice_lockedandchannel_reestablish. It fixes the following payloads from #4363: