Channel usability (is_usable) was checked at four separate points: htlc_intercepted, peer_connected, process_pending_htlcs, and calculate_htlc_actions_for_peer. Each had its own deferral logic, and they had to coordinate: the timer skipped channel opens on the assumption that peer_connected had already handled them.

This coordination broke. PR #9 made peer_connected call process_htlcs_for_peer during reestablish, which saw an empty capacity map because non-usable channels were filtered out, and emitted a spurious OpenChannel on every reconnect with a pending HTLC.

Move the usability check to execute_htlc_actions, right before forward_intercepted_htlc. If no usable channel exists, the forward is skipped and the HTLC stays in the store for the timer to retry. htlc_intercepted, peer_connected, and process_pending_htlcs now all call process_htlcs_for_peer unconditionally.

calculate_htlc_actions_for_peer now includes all channels in the capacity map regardless of is_usable, so it correctly sees that a reestablishing channel has sufficient capacity and does not request a spurious new channel.

Change the pre-forward guard from is_peer_connected to has_usable_channel. This covers the disconnect+reconnect race where the peer is connected but the channel has not finished reestablishing.
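To make the split between planning and execution concrete, here is a minimal sketch under simplifying assumptions. The `Channel`, `Action`, and `Outcome` types and both function names are hypothetical stand-ins, not the code in this PR; only the field names is_usable and outbound_capacity_msat come from the commit message.

```rust
/// Hypothetical, simplified view of a channel (not the real ChannelDetails).
#[derive(Clone, Copy)]
struct Channel {
    is_usable: bool,
    outbound_capacity_msat: u64,
}

#[derive(Debug, PartialEq)]
enum Action {
    Forward,
    OpenChannel,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Forwarded,
    KeptForRetry,
    OpenRequested,
}

/// Planning step: capacity is counted on every channel, usable or not,
/// so a reestablishing channel does not trigger a new channel open.
fn calculate_action(channels: &[Channel], htlc_msat: u64) -> Action {
    let has_capacity = channels
        .iter()
        .any(|c| c.outbound_capacity_msat >= htlc_msat);
    if has_capacity {
        Action::Forward
    } else {
        Action::OpenChannel
    }
}

/// Execution step: the only place that looks at usability.
/// If nothing is usable yet, the HTLC is kept for the timer to retry.
fn execute_action(action: Action, channels: &[Channel]) -> Outcome {
    match action {
        Action::OpenChannel => Outcome::OpenRequested,
        Action::Forward => {
            if channels.iter().any(|c| c.is_usable) {
                Outcome::Forwarded
            } else {
                Outcome::KeptForRetry
            }
        }
    }
}
```

With a single reestablishing channel (is_usable=false, enough capacity), planning yields `Forward` and execution yields `KeptForRetry`, so no OpenChannel is emitted on reconnect.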
After the previous commit moved usability checks to execute time, the timer can call process_htlcs_for_peer repeatedly while a channel is still opening. calculate_htlc_actions_for_peer sees no is_channel_ready channels and requests a new one each time, producing duplicate OpenChannel events.

Add a pending_channel_opens set (RwLock<HashSet<PublicKey>>). execute_htlc_actions inserts the peer when it emits OpenChannel, and channel_ready removes it. If the set already contains the peer, the OpenChannel is suppressed.

calculate_htlc_actions_for_peer now filters by is_channel_ready instead of including all channels. Channels that are still opening (is_channel_ready=false) report outbound_capacity_msat but reject forwards with "Channel is still opening", which consumes the InterceptId and loses the HTLC. These are zero-conf channels, so on-chain confirmation is not the issue; the channel simply hasn't finished its opening handshake yet. Reestablishing channels (is_channel_ready=true, is_usable=false) can forward once reestablish completes, so they are still included, which preserves the spurious-open fix from the previous commit.
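The dedup set and the is_channel_ready filter can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `PeerId` stands in for PublicKey, and `ChannelInfo`, `OpenTracker`, `try_emit_open_channel`, and `needs_new_channel` are hypothetical names.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Hypothetical stand-in for a node public key.
type PeerId = [u8; 33];

/// Hypothetical, simplified channel view.
struct ChannelInfo {
    is_channel_ready: bool,
    outbound_capacity_msat: u64,
}

struct OpenTracker {
    pending_channel_opens: RwLock<HashSet<PeerId>>,
}

impl OpenTracker {
    fn new() -> Self {
        OpenTracker { pending_channel_opens: RwLock::new(HashSet::new()) }
    }

    /// Returns true if OpenChannel should be emitted for this peer.
    /// HashSet::insert returns false when the peer is already present,
    /// which is exactly the "suppress the duplicate" case.
    fn try_emit_open_channel(&self, peer: PeerId) -> bool {
        self.pending_channel_opens.write().unwrap().insert(peer)
    }

    /// Clears the in-flight marker once the channel is ready.
    fn channel_ready(&self, peer: &PeerId) {
        self.pending_channel_opens.write().unwrap().remove(peer);
    }
}

/// Only channels past the opening handshake count towards capacity.
/// Reestablishing channels (ready but not usable) still count.
fn needs_new_channel(channels: &[ChannelInfo], htlc_msat: u64) -> bool {
    !channels
        .iter()
        .filter(|c| c.is_channel_ready)
        .any(|c| c.outbound_capacity_msat >= htlc_msat)
}
```

In this sketch a channel that is still opening makes `needs_new_channel` return true on every timer tick, but `try_emit_open_channel` returns true only for the first tick until `channel_ready` is called.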
    pub fn channel_ready(
        &self, counterparty_node_id: &PublicKey,
    ) -> Result<(), APIError> {
        self.pending_channel_opens.write().unwrap().remove(counterparty_node_id);
If this never executes, the node could end up stuck indefinitely and unable to forward HTLCs.
We should probably add a timeout, for example removing it after a minute.
Also, we likely need to listen for a channel_failed (or similar) event. If the channel fails to open, we should remove it from pending_channel_opens as well.
Author
I think listening for the failure side should be sufficient, or can this get stuck somewhere in between?
What happens with long-lived nodes that never get a peer_disconnected but can still get a channel failure and get stuck?
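For reference, the timeout idea raised in this thread could look roughly like the sketch below. It is not part of the PR: `PeerId`, `TimedPendingOpens`, `should_emit_open`, and the 60-second `OPEN_TIMEOUT` are all hypothetical, and the current time is passed in as a parameter so the behaviour can be checked without sleeping.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a node public key.
type PeerId = [u8; 33];

/// Assumed expiry for an in-flight open ("after a minute").
const OPEN_TIMEOUT: Duration = Duration::from_secs(60);

struct TimedPendingOpens {
    opened_at: RwLock<HashMap<PeerId, Instant>>,
}

impl TimedPendingOpens {
    fn new() -> Self {
        TimedPendingOpens { opened_at: RwLock::new(HashMap::new()) }
    }

    /// Returns true if OpenChannel should be emitted at time `now`.
    /// An entry older than OPEN_TIMEOUT is treated as stale and replaced.
    fn should_emit_open(&self, peer: PeerId, now: Instant) -> bool {
        let mut map = self.opened_at.write().unwrap();
        let still_pending = map
            .get(&peer)
            .map_or(false, |started| now.duration_since(*started) < OPEN_TIMEOUT);
        if still_pending {
            return false;
        }
        map.insert(peer, now);
        true
    }

    /// Clears the entry once the channel is ready.
    fn channel_ready(&self, peer: &PeerId) {
        self.opened_at.write().unwrap().remove(peer);
    }
}
```

Storing a timestamp instead of a bare set entry means a stuck open stops blocking new OpenChannel events on its own, even if neither channel_ready nor a disconnect is ever observed.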
If a channel open is in flight and the peer disconnects, the open is dead (LDK can't complete the funding handshake without a connected peer). Without this cleanup, the pending_channel_opens set would block future OpenChannel events for that peer permanently, since channel_ready never fires for a failed open.

This does not reintroduce the duplicate-open problem from the previous commit. That bug was caused by the timer firing repeatedly while the peer stays connected and the channel is still opening. A disconnect/reconnect is a genuine restart of the channel lifecycle, so re-emitting OpenChannel is correct.
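A minimal sketch of the disconnect cleanup, assuming the set is cleared from a disconnect handler. `PeerId`, `PendingOpens`, and the method names are hypothetical and only mirror the behaviour described above.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Hypothetical stand-in for a node public key.
type PeerId = [u8; 33];

struct PendingOpens {
    pending_channel_opens: RwLock<HashSet<PeerId>>,
}

impl PendingOpens {
    fn new() -> Self {
        PendingOpens { pending_channel_opens: RwLock::new(HashSet::new()) }
    }

    /// True only for the first request while an open is in flight.
    fn should_emit_open(&self, peer: PeerId) -> bool {
        self.pending_channel_opens.write().unwrap().insert(peer)
    }

    /// A disconnect kills any in-flight open, so forget it here.
    /// The next request after reconnect may emit OpenChannel again.
    fn peer_disconnected(&self, peer: &PeerId) {
        self.pending_channel_opens.write().unwrap().remove(peer);
    }
}
```

Repeated timer ticks while the peer stays connected are still suppressed; only a disconnect resets the marker for that peer.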
5bfbc90 to dc10470
This primarily does two things. The first commit simplifies the flow by limiting the places where we gate on channel usability, and replaces a lingering is_peer_connected check. The second adds a set that tracks pending channel opens, which prevents unnecessary JIT channels when one is already in flight.