Change ThreadedLoop to default to 2 inner axes by Lestropie · Pull Request #3293 · MRtrix3/mrtrix3

Lestropie · 2026-04-01T06:26:02Z

Extension of #3280.

Closes #3284.

As demonstrated in #3280, where an image processing operation is comparatively cheap, it can be substantially faster to place 2 image axes rather than 1 within the inner loop of the multi-threading mechanism, as less time is spent synchronising across threads.
Conversely, I believe the purpose of processing 1D stripes of image data by default, rather than 2D slices, is load balancing: when the number of threads is large relative to the number of slices, a substantial fraction of the runtime could be spent waiting for the last slices to complete, during which time not all available threads are utilised.
As such, my expectation is that the fastest execution will be achieved using:

- 1 axis in the inner loop for very expensive voxel-wise operations
- 2 axes in the inner loop for inexpensive operations
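
The trade-off between the two options can be made concrete with a rough cost model. The sketch below is illustrative only and is not MRtrix3 code: `LoopStats` and `model_loop` are hypothetical names, and the model assumes that every work unit costs the same and that threads proceed in lockstep rounds. It counts how many work units the threads must fetch (a proxy for synchronisation overhead) and the fraction of thread-time that is busy (a proxy for threads idling at the tail end).

```cpp
#include <array>
#include <cstddef>

// Hypothetical cost model for illustration; not part of MRtrix3.
struct LoopStats {
  std::size_t work_units; // number of units handed out to threads (one synchronisation each)
  double utilisation;     // busy thread-time / total thread-time, assuming equal-cost units
};

// dims:       image size along each of 3 axes
// inner_axes: number of leading axes processed inside a single work unit
// threads:    number of worker threads (assumed > 0)
inline LoopStats model_loop(const std::array<std::size_t, 3>& dims,
                            std::size_t inner_axes,
                            std::size_t threads) {
  // Axes not in the inner loop are iterated by the outer (multi-threaded) loop;
  // each combination of their indices is one work unit.
  std::size_t units = 1;
  for (std::size_t axis = inner_axes; axis < dims.size(); ++axis)
    units *= dims[axis];

  // With equal-cost units, the job takes ceil(units / threads) rounds;
  // in the final round some threads may have nothing left to do.
  const std::size_t rounds = (units + threads - 1) / threads;
  const double utilisation =
      static_cast<double>(units) / static_cast<double>(rounds * threads);
  return {units, utilisation};
}
```

Under these assumptions, a 96×96×60 image on 32 threads gives 5760 work units at full utilisation with 1 inner axis, versus 60 work units at 93.75% utilisation with 2 inner axes. Whether the 96-fold reduction in synchronisation outweighs the idle time depends on the per-voxel cost, which this model deliberately ignores.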

Exactly where the threshold lies I do not know.
I did however find, after a first attempt at refining this, that only a minority of multi-threaded image operations are likely to be expensive enough to warrant one axis in the inner loop. Currently these are exclusively:

- dwi2fod
- dwi2tensor
- dwidenoise
- FOD reorientation

If there's contention about what should and shouldn't be in this list, we can run speed tests, but these need to use realistic data, not the CI test data.

(@MRtrix3/mrtrix3-devs: note also the use of std::optional for function arguments, which makes it possible to differentiate between explicit and default values)
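
As a minimal sketch of that pattern (the names `default_inner_axes`, `resolve_inner_axes` and `was_explicit` are hypothetical and are not taken from the MRtrix3 source), an empty `std::optional` can stand for "caller did not specify", so that an explicit request for 2 axes remains distinguishable from falling back to the default of 2. The sketch reads the value only through `value_or()` and `has_value()`, since applying `operator*` to an empty optional is undefined behaviour and `value()` throws `std::bad_optional_access`.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical names for illustration; not MRtrix3's actual interface.
constexpr std::size_t default_inner_axes = 2;

// An empty optional means "not specified"; a contained value is an explicit request.
inline std::size_t resolve_inner_axes(
    std::optional<std::size_t> requested = std::nullopt) {
  // value_or() is well-defined whether or not a value is present.
  return requested.value_or(default_inner_axes);
}

// Reports whether the caller made an explicit choice, even if that choice
// happens to coincide with the default.
inline bool was_explicit(const std::optional<std::size_t>& requested) {
  return requested.has_value();
}
```

A command with an expensive voxel-wise operation could then pass an explicit 1, while all other call sites omit the argument and pick up the default.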

Original code compiled and ran in debug mode locally, but failed CI tests due to assertion failure in std::optional usage.

Lestropie added 2 commits April 1, 2026 17:11

Change ThreadedLoop to default to 2 inner axes

ea45ca0

threaded_copy() to obey inner loop axis count defaults

4a7de11

Lestropie requested a review from jdtournier April 1, 2026 06:26

Lestropie self-assigned this Apr 1, 2026

Lestropie added the performance label Apr 1, 2026


ThreadedLoop: Change usage of std::optional

234be5c


Lestropie wants to merge 3 commits into dev from threaded_loop_inner_axes
