[Misc] QD_DUMP_IR dumps to files with the task_id added to the filename#441
[Misc] QD_DUMP_IR dumps to files with the task_id added to the filename#441hughperkins wants to merge 7 commits intomainfrom
Conversation
… tasks When a kernel contains multiple top-level for loops, each loop is compiled as a separate offloaded task. Previously all tasks wrote their LLVM IR to the same file (kernel->name + "_llvm.ll"), so only the last task to finish was visible. Use the per-task name instead so each gets its own file. Made-with: Cursor
…tiple tasks Same issue as the _llvm.ll fix: moduleToDumpName returns only the first function name, so multi-task kernels produce only one dump file. Use new moduleDumpNames helper to iterate over all defined functions and dump one file per task.
- Apply clang-format fix in codegen_llvm.cpp - Avoid string-concatenation temporaries in jit_cuda.cpp PTX dump path to satisfy clang-tidy performance-inefficient-string-concatenation Made-with: Cursor
…asks Made-with: Cursor # Conflicts: # quadrants/runtime/cuda/jit_cuda.cpp
Made-with: Cursor
|
@claude review |
…tional log - Capture dump name before compile_module_to_ptx so LOAD_PTX uses pre-rename names matching the dump path (fixes name divergence when convert() mangles symbols) - Dump module/PTX once per add_module call instead of N identical copies per task, since the whole module is written each time anyway - Move "PTX dumped to:" message inside the file-open check so it only prints on success - Remove unused moduleDumpNames() helper
|
@claude review |
There was a problem hiding this comment.
Previous review feedback has been addressed — rename-timing mismatch, duplicate per-task dumps, and unconditional success log are all fixed; the remaining nit about DUMP_IR/LOAD_IR empty-tasks asymmetry is minor and debug-only.
Extended reasoning...
Overview
This PR modifies debug-only IR dumping paths in two files: codegen_llvm.cpp (LLVM IR dump/load on the CPU/generic codegen side) and jit_cuda.cpp (PTX and pre-PTX LLVM IR dump/load on the CUDA side). The changes are gated on QUADRANTS_DUMP_IR and QUADRANTS_LOAD_PTX environment variables and have no impact on compiled kernel correctness.
Security risks
None. The changes only affect developer-facing debug output paths gated on environment variables. No auth, crypto, permissions, or user-input parsing is touched. Filenames are derived from internal kernel/task/function names, not from user-controlled input.
Level of scrutiny
Low — this is a debug-only feature. The original overwrite problem (all tasks writing to the same file and only the last one being visible) is resolved by using offloaded_tasks[0].name on the LLVM side and the pre-compile function name on the CUDA side. The LOAD side now uses the same name captured pre-compile, fixing the asymmetry I flagged previously.
Other factors
The remaining inline nit (DUMP_IR has a kernel->name fallback for empty offloaded_tasks while LOAD_IR hard-asserts) is cosmetic — in practice offloaded_tasks is populated before this block runs, and even if the divergence were hit, the impact is a debug crash instead of a silent load failure. Not worth blocking the PR; the author can decide whether to tighten the asymmetry in a follow-up. The previous high/medium-severity bugs have been resolved, so I'm comfortable approving.
Both paths now consistently assert rather than having a defensive fallback in DUMP_IR that LOAD_IR can never consume.
|
@claude review |
There was a problem hiding this comment.
LGTM — all prior review feedback has been addressed: dump_name is captured before convert() runs, per-task filename duplication is gone, the PTX success message is gated on a successful open, and DUMP_IR/LOAD_IR now assert symmetrically.
Extended reasoning...
Overview
This is a small debug-only change to two files (codegen_llvm.cpp, jit_cuda.cpp) that renames the LLVM IR / PTX dump filename from a kernel-level name to a per-task name so multi-task kernels no longer clobber each other's dump files. The LOAD_IR / LOAD_PTX paths are updated in lockstep so that what is written is what is read back.
Security risks
None. The affected code is gated on QD_DUMP_IR, QD_LOAD_IR, and QUADRANTS_LOAD_PTX env vars intended for developer debugging. No auth, crypto, permission, or externally-reachable input path is touched. The filenames are derived from internal LLVM module/function names and written to a configured debug_dump_path.
Level of scrutiny
Low. Debug-only path, small diff (~20 changed lines), straightforward intent, no effect on compiled kernel correctness. The only user-visible failure mode is a developer getting fewer (or no) dump files in a misconfigured run, which falls back cleanly.
Other factors
Every prior bug I flagged on earlier revisions is now addressed in the current diff: the jit_cuda.cpp rename-timing asymmetry is fixed by capturing dump_name pre-compile_module_to_ptx, the per-task loop that previously produced N bit-identical files is gone in favor of a single offloaded_tasks[0].name file, the misleading unconditional "PTX dumped to:" std::cout is now inside the successful-open branch, and DUMP_IR/LOAD_IR both assert !offloaded_tasks.empty() so they cannot disagree on filename handling for a degenerate empty-tasks case.
… tasks
When a kernel contains multiple top-level for loops, each loop is compiled as a separate offloaded task. Previously all tasks wrote their LLVM IR to the same file (kernel->name + "_llvm.ll"), so only the last task to finish was visible. Use the per-task name instead so each gets its own file.
Made-with: Cursor
Issue: #
Brief Summary
copilot:summary
Walkthrough
copilot:walkthrough