
Fix issue #14 #15

Merged
lohedges merged 1 commit into devel from fix_14
Mar 24, 2026

Conversation

@lohedges
Contributor

This PR closes #14.

When a GCMC simulation crashes (e.g. by running out of waters), the CUDA context stack is left in a dirty state, causing the error "PyCUDA ERROR: The context stack was not empty upon module cleanup" and a core dump. The root cause is that CUDAPlatform.cleanup() always popped the context exactly once, but a crash mid-move leaves an extra unpopped entry on the stack. This PR fixes that by tracking the number of outstanding pushes in _push_count and popping that many times in cleanup(). Additionally, all push/pop pairs around GCMC operations in the SOMD2 repex and base runners are now wrapped in try/finally, so that _push_count stays accurate even when an exception propagates and cleanup() always leaves the stack balanced.
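
The counting approach described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: FakeContext, PlatformSketch, and run_move are hypothetical names, and the stand-in context only records pushes and pops in a Python list instead of touching a GPU. Only _push_count and cleanup() are taken from the description.

```python
class FakeContext:
    """Hypothetical stand-in for a CUDA context (no GPU involved)."""

    # Shared list that plays the role of the driver's context stack.
    stack = []

    def push(self):
        FakeContext.stack.append(self)

    @classmethod
    def pop(cls):
        cls.stack.pop()


class PlatformSketch:
    """Hypothetical wrapper that counts its own outstanding pushes."""

    def __init__(self, context):
        self._context = context
        self._push_count = 0

    def push(self):
        self._context.push()
        self._push_count += 1

    def pop(self):
        # Never pop more entries than this object pushed.
        if self._push_count > 0:
            self._context.pop()
            self._push_count -= 1

    def cleanup(self):
        # Pop once per outstanding push instead of exactly once.
        while self._push_count > 0:
            self.pop()


def run_move(platform, move):
    """Run a callable with the context pushed, popping even on failure."""
    platform.push()
    try:
        return move()
    finally:
        platform.pop()
```

With this shape, an exception raised inside move() still triggers the matching pop, and any push that was not matched for some other reason is drained by cleanup(), so the stack ends up empty either way.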

lohedges added the bug label Mar 24, 2026
lohedges merged commit 480fffc into devel Mar 24, 2026
4 checks passed
lohedges deleted the fix_14 branch March 24, 2026 12:14
lohedges added a commit that referenced this pull request Mar 24, 2026

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

Detach primary CUDA context before deleting