fix: resolve Dandelion++ ABBA deadlock in CheckDandelionEmbargoes (Bug #29) #394
Merged
JaredTate merged 1 commit into DigiByte-Core:feature/digidollar-v1 on Apr 2, 2026
…DigiByte-Core#29)

CheckDandelionEmbargoes() held m_dandelion_embargo_mutex while calling usingDandelion() and localDandelionDestinationPushInventory(), both of which acquire m_nodes_mutex internally. This violates the established lock ordering (m_nodes_mutex → m_dandelion_embargo_mutex) used by DandelionShuffle() and CloseDandelionConnections(), creating an ABBA deadlock when the shuffle timer and embargo check fire concurrently.

Symptoms: sendtoaddress hangs at "Processing Dandelion relay", RPC timeout, shutdown hangs on threadDandelionShuffle.join().

Reported by DanGB on Windows 11 (RC26, ~1 week uptime). Probabilistic: requires two timer threads to race.

Fix: restructure CheckDandelionEmbargoes() into two phases:

Phase 1: scan embargo map under m_dandelion_embargo_mutex, collect txids needing stem routing into a local vector.
Phase 2: release embargo lock, then perform stem routing (which acquires m_nodes_mutex safely), re-acquire embargo lock briefly to mark each tx as routed.

usingDandelion() moved before the embargo lock acquisition. The bool may be one cycle stale; no correctness impact.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
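The two-phase structure described in the commit message can be sketched in isolation. This is a minimal illustration, not the code from this PR: the struct EmbargoSketch, its members, and the helpers UsingDandelion() and RouteToStem() are hypothetical stand-ins, and the embargo map is reduced to a txid-to-flag map. It only assumes what the commit message states, namely that the routing helpers take the nodes lock internally.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the connection manager; not the DigiByte classes.
struct EmbargoSketch {
    std::mutex nodes_mutex;    // plays the role of m_nodes_mutex
    std::mutex embargo_mutex;  // plays the role of m_dandelion_embargo_mutex
    std::map<std::string, bool> embargoes;  // txid -> already stem-routed?
    std::vector<std::string> routed;        // guarded by nodes_mutex
    bool dandelion_enabled = true;          // guarded by nodes_mutex

    // Stand-in for usingDandelion(): takes nodes_mutex internally.
    bool UsingDandelion() {
        std::lock_guard<std::mutex> lock(nodes_mutex);
        return dandelion_enabled;
    }

    // Stand-in for the stem-routing call: also takes nodes_mutex internally.
    void RouteToStem(const std::string& txid) {
        std::lock_guard<std::mutex> lock(nodes_mutex);
        routed.push_back(txid);
    }

    void CheckEmbargoes() {
        // Snapshot taken before the embargo lock; it may be one cycle stale.
        const bool use_stem = UsingDandelion();

        // Phase 1: collect work while holding only the embargo lock.
        std::vector<std::string> pending;
        {
            std::lock_guard<std::mutex> lock(embargo_mutex);
            for (const auto& entry : embargoes) {
                if (use_stem && !entry.second) pending.push_back(entry.first);
            }
        }  // embargo lock released here

        // Phase 2: route with no embargo lock held, then re-lock briefly to mark.
        for (const auto& txid : pending) {
            RouteToStem(txid);
            std::lock_guard<std::mutex> lock(embargo_mutex);
            auto it = embargoes.find(txid);
            if (it != embargoes.end()) it->second = true;  // entry may have been erased meanwhile
        }
    }
};
```

In this sketch no code path ever holds embargo_mutex while acquiring nodes_mutex, so the inverted ordering cannot occur. The cost is that the map can change between the two phases, which is why phase 2 looks each txid up again before marking it.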
Looks good. Thank you for this!
Merged commit a7f23b9 into DigiByte-Core:feature/digidollar-v1
1 of 2 checks passed
JaredTate added a commit that referenced this pull request on Apr 2, 2026
…ull commit list

- Add Bug #29 (Dandelion++ ABBA deadlock fix, PR #394 by JohnnyLawDGB)
- Add Bug #33 (wrong mint tooltip limits + silent decimal truncation)
- Add Post-Quantum Cryptography plan documentation section
- Update commit list with all commits since RC26
- Update test suite status to reflect full validation
Summary
- Symptom: sendtoaddress hangs forever at "Processing Dandelion relay", shutdown hangs on threadDandelionShuffle.join(), and RPC times out. Reported by DanGB (Windows 11, RC26, ~1 week uptime). Probabilistic: requires the shuffle timer and embargo check timer to race.
- Root cause: an ABBA deadlock between m_nodes_mutex and m_dandelion_embargo_mutex. The established lock ordering is m_nodes_mutex → m_dandelion_embargo_mutex (used by DandelionShuffle, CloseDandelionConnections, DisconnectNodes). CheckDandelionEmbargoes() violated this by holding m_dandelion_embargo_mutex first, then calling usingDandelion() and localDandelionDestinationPushInventory(), which acquire m_nodes_mutex internally.
- Fix: restructure CheckDandelionEmbargoes() into two phases: scan the embargo map under the embargo lock and collect txids needing stem routing, then release the lock and perform routing (which safely acquires m_nodes_mutex). Single file change, zero changes to dandelion.cpp / net.cpp / net.h.

Deadlock diagram
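The inverted ordering can also be reproduced in isolation. The sketch below is an assumption-laden illustration rather than DigiByte code: AbbaProbe and BothPathsBlocked() are hypothetical names, and try_lock() is used in place of a blocking lock so the demonstration reports the stall instead of hanging. One thread takes the nodes lock first (like the shuffle path), the other takes the embargo lock first (like the old embargo check), and each then asks for the other's lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical probe: two mutexes acquired in opposite orders by two threads.
struct AbbaProbe {
    std::mutex nodes_mutex;    // plays the role of m_nodes_mutex
    std::mutex embargo_mutex;  // plays the role of m_dandelion_embargo_mutex
    std::atomic<int> holding_first{0};
    std::atomic<int> attempts_done{0};

    // Hold `first`, wait until the other thread holds its own first lock,
    // then try (without blocking) to take `second`.
    bool HoldFirstThenTrySecond(std::mutex& first, std::mutex& second) {
        std::lock_guard<std::mutex> hold(first);
        holding_first.fetch_add(1);
        while (holding_first.load() < 2) std::this_thread::yield();

        const bool got_second = second.try_lock();
        if (got_second) second.unlock();

        // Keep `first` held until both threads have made their attempt.
        attempts_done.fetch_add(1);
        while (attempts_done.load() < 2) std::this_thread::yield();
        return got_second;
    }
};

// Returns true when neither thread could obtain its second lock.
inline bool BothPathsBlocked() {
    AbbaProbe probe;
    bool shuffle_got = true;
    bool embargo_got = true;
    std::thread shuffle([&] {
        // Established order: nodes lock, then embargo lock.
        shuffle_got = probe.HoldFirstThenTrySecond(probe.nodes_mutex, probe.embargo_mutex);
    });
    std::thread embargo([&] {
        // Inverted order: embargo lock, then nodes lock.
        embargo_got = probe.HoldFirstThenTrySecond(probe.embargo_mutex, probe.nodes_mutex);
    });
    shuffle.join();
    embargo.join();
    return !shuffle_got && !embargo_got;
}
```

With blocking lock() calls in place of try_lock(), both threads would wait on each other indefinitely, which matches the hang on join() reported above. The counters force the unlucky interleaving every time; in the real node it depends on timer scheduling, hence the probabilistic nature of the bug.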
Lock acquisition map (5 deadlock points identified)
| Code path | First lock | Second lock |
| --- | --- | --- |
| DandelionShuffle() | m_nodes_mutex | m_dandelion_embargo_mutex |
| CloseDandelionConnections() | m_nodes_mutex | m_dandelion_embargo_mutex |
| DisconnectNodes() | m_nodes_mutex | m_dandelion_embargo_mutex |
| CheckDandelionEmbargoes() → usingDandelion() | m_dandelion_embargo_mutex | m_nodes_mutex |
| CheckDandelionEmbargoes() → localDandelionDestinationPushInventory() | m_dandelion_embargo_mutex | m_nodes_mutex |

Note: Commit 0caa0e84a1 (Feb 10) made the deadlock more likely by ensuring CloseDandelionConnections() always acquires the embargo mutex, eliminating any chance of the lock acquisition being skipped.

Test plan
- dandelion_tests unit tests: 2/2 pass
- p2p_dandelion.py functional test (blocked by missing digibyte_scrypt Python module; pre-existing environment issue)

🤖 Generated with Claude Code