
Spurious JIT channel Fix #10

Open

amackillop wants to merge 3 commits into lsp-0.2.0 from austin_spurious-jit-channel-fix-2

Conversation

@amackillop

This does two things. The first commit simplifies the flow by limiting the places where we gate on channel usability and replaces a lingering peer_connected check. The second adds a set to track pending channel opens, preventing unnecessary JIT channels when one is already in flight.

Channel usability (is_usable) was checked at four separate points:
htlc_intercepted, peer_connected, process_pending_htlcs, and
calculate_htlc_actions_for_peer. Each had its own deferral logic,
and they had to coordinate (the timer skipped channel opens
assuming peer_connected already handled them). This coordination
broke: PR #9 made peer_connected call process_htlcs_for_peer
during reestablish, which saw an empty capacity map because
non-usable channels were filtered out, and emitted a spurious
OpenChannel on every reconnect with a pending HTLC.

Move the usability check to execute_htlc_actions, right before
forward_intercepted_htlc. If no usable channel exists, the
forward is skipped and the HTLC stays in store for the timer to
retry. htlc_intercepted, peer_connected, and process_pending_htlcs
now all call process_htlcs_for_peer unconditionally.

calculate_htlc_actions_for_peer includes all channels in the
capacity map regardless of is_usable, so it correctly sees that a
reestablishing channel has sufficient capacity and does not request
a spurious new channel.
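The deferral described above can be sketched as follows. This is a minimal model, not the real implementation: PeerId, Channel, and Store stand in for the actual LDK-backed types, and only the gate-at-execute-time behavior is shown.

```rust
use std::collections::HashMap;

// Illustrative stand-ins for the real types.
type PeerId = u8;

#[derive(Clone, Copy)]
struct Channel {
    is_usable: bool,
}

struct Store {
    // peer -> intercepted HTLC amounts still waiting to forward
    pending_htlcs: HashMap<PeerId, Vec<u64>>,
}

impl Store {
    /// The usability gate, moved here from htlc_intercepted,
    /// peer_connected, and the timer. Forwards only when the peer has
    /// a usable channel; otherwise the HTLCs stay in the store for the
    /// next timer tick to retry. Returns the number forwarded.
    fn execute_htlc_actions(&mut self, peer: PeerId, channels: &[Channel]) -> usize {
        if !channels.iter().any(|c| c.is_usable) {
            return 0; // no usable channel yet: keep HTLCs, retry later
        }
        self.pending_htlcs.remove(&peer).map(|h| h.len()).unwrap_or(0)
    }
}
```

Because the check happens only here, the callers (htlc_intercepted, peer_connected, process_pending_htlcs) can all invoke the processing path unconditionally without coordinating deferral logic among themselves.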

Change the pre-forward guard from is_peer_connected to
has_usable_channel, which covers the disconnect+reconnect race
where the peer is connected but the channel has not finished
reestablishing.

After the previous commit moved usability checks to execute time,
the timer can call process_htlcs_for_peer repeatedly while a
channel is still opening. calculate_htlc_actions_for_peer sees no
is_channel_ready channels and requests a new one each time,
producing duplicate OpenChannel events.

Add a pending_channel_opens set (RwLock<HashSet<PublicKey>>).
execute_htlc_actions inserts the peer when it emits OpenChannel,
and channel_ready removes it. If the set already contains the
peer, the OpenChannel is suppressed.
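The dedup guard can be sketched like this. PeerId stands in for bitcoin::secp256k1::PublicKey, and the struct holds only the guard set; method names are illustrative.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

// Stand-in for PublicKey in this sketch.
type PeerId = u8;

struct JitChannelGuard {
    pending_channel_opens: RwLock<HashSet<PeerId>>,
}

impl JitChannelGuard {
    fn new() -> Self {
        Self { pending_channel_opens: RwLock::new(HashSet::new()) }
    }

    /// Returns true if an OpenChannel should be emitted for this peer,
    /// i.e. no open is already in flight. HashSet::insert returns false
    /// when the peer is already present, which suppresses the duplicate.
    fn try_begin_open(&self, peer: PeerId) -> bool {
        self.pending_channel_opens.write().unwrap().insert(peer)
    }

    /// Called from channel_ready (and from failure cleanup) so a later
    /// HTLC for this peer can trigger a fresh open.
    fn finish_open(&self, peer: &PeerId) {
        self.pending_channel_opens.write().unwrap().remove(peer);
    }
}
```

Relying on HashSet::insert's return value makes the check-and-mark a single operation under the write lock, so two concurrent timer ticks cannot both decide to open.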

calculate_htlc_actions_for_peer now filters by is_channel_ready
instead of including all channels. Channels still opening
(is_channel_ready=false) report outbound_capacity_msat but reject
forwards with "Channel is still opening", consuming the
InterceptId and losing the HTLC. These are zero-conf channels, so
on-chain confirmation is not the issue; the channel simply hasn't
finished its opening handshake yet. Reestablishing channels
(is_channel_ready=true, is_usable=false) can forward once
reestablish completes and are included, preserving the
spurious-open fix from the previous commit.
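The filter described above can be sketched as follows, with a trimmed-down ChannelDetails carrying only the three fields the argument turns on (in the real code these come from LDK's ChannelDetails):

```rust
// Minimal stand-in for the relevant ChannelDetails fields.
struct ChannelDetails {
    is_channel_ready: bool,
    is_usable: bool,
    outbound_capacity_msat: u64,
}

/// Capacity the peer can actually forward over. Channels that finished
/// the open handshake count even when temporarily not usable
/// (mid-reestablish), since they can forward once reestablish
/// completes. Still-opening channels (is_channel_ready=false) are
/// excluded: they report capacity but would reject the forward and
/// burn the InterceptId.
fn forwardable_capacity_msat(channels: &[ChannelDetails]) -> u64 {
    channels
        .iter()
        .filter(|c| c.is_channel_ready)
        .map(|c| c.outbound_capacity_msat)
        .sum()
}
```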
```rust
pub fn channel_ready(
    &self, counterparty_node_id: &PublicKey,
) -> Result<(), APIError> {
    self.pending_channel_opens.write().unwrap().remove(counterparty_node_id);
```

If this never executes, the node could end up stuck indefinitely and unable to forward HTLCs.

We should probably add a timeout, for example removing it after a minute.

Also, we likely need to listen for a channel_failed (or similar) event. If the channel fails to open, we should remove it from pending_channel_opens as well.

Author


I think listening for the failure side should be sufficient or can this be stuck in between somehow?


@martinsaposnic Mar 24, 2026


What happens with long-lived nodes that never get a peer_disconnected but can still hit a channel failure and get stuck?

If a channel open is in flight and the peer disconnects, the
open is dead (LDK can't complete the funding handshake without a
connected peer). Without this cleanup, the pending_channel_opens
set would block future OpenChannel events for that peer
permanently, since channel_ready never fires for a failed open.

This does not reintroduce the duplicate-open problem from the
previous commit. That bug was caused by the timer firing
repeatedly while the peer stays connected and the channel is
still opening. A disconnect/reconnect is a genuine restart of
the channel lifecycle, so re-emitting OpenChannel is correct.
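A minimal sketch of that cleanup, with PeerId standing in for PublicKey and the surrounding LSP state reduced to just the guard set (method name mirrors the lifecycle hook it describes):

```rust
use std::collections::HashSet;
use std::sync::RwLock;

// Stand-in for PublicKey in this sketch.
type PeerId = u8;

struct Lsp {
    pending_channel_opens: RwLock<HashSet<PeerId>>,
}

impl Lsp {
    /// A disconnect kills any in-flight open (the funding handshake
    /// cannot complete without a connected peer), so clear the guard.
    /// The reconnect path may then legitimately emit a fresh
    /// OpenChannel rather than being blocked forever.
    fn peer_disconnected(&self, peer: &PeerId) {
        self.pending_channel_opens.write().unwrap().remove(peer);
    }
}
```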
@amackillop amackillop force-pushed the austin_spurious-jit-channel-fix-2 branch from 5bfbc90 to dc10470 Compare March 24, 2026 11:57