Weaning off std::complex arithmetic by TysonRayJones · Pull Request #729 · QuEST-Kit/QuEST

TysonRayJones · 2026-04-17T19:54:12Z

Experiment with fully custom cpu_qcomp and gpu_qcomp types to avoid...

- compiler-specific flags to overcome qcomp = std::complex performance pitfalls
- HIP-specific functions to overcome cu_qcomp correctness pitfalls

TysonRayJones · 2026-04-20T16:21:05Z

@JPRichings @otbrown Here's a first draft of the proposed qcomp changes (ignore that it provokes a stack-overflow bug in multithreaded MSVC, grr). One only really needs to look at cpu_types.hpp and gpu_types.cuh, and at an example of how cpu_qcomp is used within cpu_subroutines.cpp, like here.

You'll see most of the cpu_types.hpp and gpu_types.cuh boilerplate is identical, merely expressing the arithmetic operators of (c|g)pu_qcomp via their qreal components. I propose further simplification by defining a new complex.hpp (or similarly named, perhaps basetypes.hpp?) in core/ (rather than in cpu/ or gpu/) which defines all the operator overloads of a templated type, with (c|g)pu_qcomp then defined as a typedef of that "ancestor type" in cpu/ or gpu/. Adding a new operator overload to all backends then becomes trivial (just add one function to complex.hpp), as does adding a backend-specific function (define it exclusively within cpu_types.hpp or gpu_types.cuh). I'm noting this here in case you object to the unification (so I can see what to roll back if I find time to implement it beforehand, which is alas unlikely!)
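To make the proposal concrete, here is a minimal sketch of what such a shared header might look like. It is an illustration under assumptions, not the code in this PR: the `Tag` template parameter, the `cpu_tag`/`gpu_tag` structs, the member names `re`/`im`, the `getNormSquared` helper and the `qreal = double` alias are all hypothetical, and a GPU build would additionally need `__host__ __device__` qualifiers on each function.

```cpp
#include <type_traits>

// Hypothetical stand-in for QuEST's precision typedef.
using qreal = double;

// Hypothetical shared "ancestor" type, as might live in core/complex.hpp.
// The Tag parameter exists only to keep the backend types distinct.
template <typename Tag>
struct base_qcomp {
    qreal re;
    qreal im;
};

// Arithmetic expressed purely via the qreal components.
template <typename Tag>
constexpr base_qcomp<Tag> operator+(base_qcomp<Tag> a, base_qcomp<Tag> b) {
    return {a.re + b.re, a.im + b.im};
}

template <typename Tag>
constexpr base_qcomp<Tag> operator*(base_qcomp<Tag> a, base_qcomp<Tag> b) {
    return {a.re * b.re - a.im * b.im,
            a.re * b.im + a.im * b.re};
}

// In-place variant, defined once for every backend.
template <typename Tag>
constexpr base_qcomp<Tag>& operator*=(base_qcomp<Tag>& a, base_qcomp<Tag> b) {
    a = a * b;
    return a;
}

// The backend headers would then only need to name their own type.
struct cpu_tag {};
struct gpu_tag {};
using cpu_qcomp = base_qcomp<cpu_tag>;
using gpu_qcomp = base_qcomp<gpu_tag>;

// The tags keep the two aliases from collapsing into one type.
static_assert(!std::is_same<cpu_qcomp, gpu_qcomp>::value,
              "cpu_qcomp and gpu_qcomp should stay distinct");

// A backend-specific function (hypothetical) lives only beside its own type.
constexpr qreal getNormSquared(cpu_qcomp a) {
    return a.re * a.re + a.im * a.im;
}
```

The tag is a design choice of this sketch: with a plain alias of an untagged struct, cpu_qcomp and gpu_qcomp would be the same type, so accidentally passing one where the other is expected would compile silently.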

otbrown · 2026-04-20T17:44:11Z

Thanks Tyson!

I have 5(!) hours of meetings tomorrow, so unlikely to get to this then, but will make time later in the week!

otbrown · 2026-04-21T14:24:06Z

Hi Tyson, one of my meetings got cancelled 🥳

Moving away from the standard library entirely is philosophically upsetting, but clearly has practical benefits!

> I propose further simplification by defining a new complex.hpp

I strongly support this proposal. Structurally it would be nice to have the common interface in core to then be specialised/overwritten in cpu/gpu. I am still a bit nervous about needing to maintain a set of maths functions but if we can also contain them in one place in core (relying on the concrete implementations of arithmetic primitives elsewhere if needed), that would be good.

 * TODO:
 * OLD UNPACKERS
 * which I am hestitant to switch to the CPU-style until I better
 * understand why the explicit gpu_qcomp instantiation is necessary
 * (iirc static HIP structs have a different alignment than qcomp?!)

If it's helpful I can volunteer @eessmann as ~~tribute~~ a second pair of eyes and hands to dig into this? We now have a task in QATCH (phase 2 of the Quantum Software Lab) which is broadly "help maintain QuEST (and other scalable emulators)", that Erich is assigned to.

TysonRayJones · 2026-04-22T06:17:45Z

> Moving away from the standard library entirely is philosophically upsetting, but clearly has practical benefits!

Yea it's outrageous but at least it won't keep us up at night about esoteric performance pitfalls! (We'll trade that for alignment nightmares)

> I strongly support this proposal. Structurally it would be nice to have the common interface in core to then be specialised/overwritten in cpu/gpu. I am still a bit nervous about needing to maintain a set of maths functions but if we can also contain them in one place in core (relying on the concrete implementations of arithmetic primitives elsewhere if needed), that would be good.

Made common definitions in the previous commit. We can still fall back, at our discretion, to the std::complex operators within the CPU custom maths, paying the NaN performance penalties, when convenient (as pow in cpu_qcomp.hpp currently does).
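As a rough illustration of that fallback pattern (not the actual contents of cpu_qcomp.hpp), the sketch below keeps hot-path multiplication on the raw components while a rarely called power function round-trips through std::complex. The struct layout, the name `powViaStd`, the `checkPowViaStd` helper and the `qreal = double` alias are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using qreal = double;  // hypothetical stand-in for QuEST's precision typedef

// Hypothetical custom complex type; the member names are assumptions.
struct cpu_qcomp {
    qreal re;
    qreal im;
};

// Hot-path arithmetic touches only the qreal components.
inline cpu_qcomp operator*(cpu_qcomp a, cpu_qcomp b) {
    return {a.re * b.re - a.im * b.im,
            a.re * b.im + a.im * b.re};
}

// A rarely called function converts to std::complex, calls the standard
// library, and converts back, accepting whatever overhead that implies.
inline cpu_qcomp powViaStd(cpu_qcomp base, cpu_qcomp exponent) {
    std::complex<qreal> result = std::pow(
        std::complex<qreal>(base.re, base.im),
        std::complex<qreal>(exponent.re, exponent.im));
    return {result.real(), result.imag()};
}

// Self-check: i squared should equal -1 up to rounding error.
inline void checkPowViaStd() {
    cpu_qcomp r = powViaStd({0, 1}, {2, 0});
    assert(std::abs(r.re + 1) < 1e-12);
    assert(std::abs(r.im) < 1e-12);
}
```

The conversions cost a couple of copies per call, which seems an acceptable trade for functions outside the inner simulation loops.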

> If it's helpful I can volunteer @eessmann as ~~tribute~~ a second pair of eyes and hands to dig into this? We now have a task in QATCH (phase 2 of the Quantum Software Lab) which is broadly "help maintain QuEST (and other scalable emulators)", that Erich is assigned to.

That'd be super helpful! I preserved the method (used in relation to that comment) in the GPU backend, so I'm hoping things just work on HIP. I don't have an AMD machine handy to test on, so it would be terrific if Erich can give it a whirl! I can also set up the (paid) GPU GitHub Action runners again if billing has been sorted out.

Note the new cpu_qcomp type might introduce a similar issue (misalignment of static arrays); currently, the qcomp[2] and qcomp[4] within a DiagMatr1 and DiagMatr2 are being converted/decayed to cpu_qcomp pointers. An explicit copy might be necessary, like in the GPU case, but I've so far been unable to trigger the alignment problem.
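For reference, a sketch of what compile-time layout guards and the explicit copy could look like on the CPU side. Everything here is hypothetical (the struct layout, the `copyToCpuQcomps` name, and `qreal = double`), and matching size and alignment is only a necessary condition: it does not by itself make reinterpreting a qcomp pointer as a cpu_qcomp pointer well-defined in standard C++.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <type_traits>

using qreal = double;               // hypothetical stand-in for QuEST's precision typedef
using qcomp = std::complex<qreal>;  // qcomp = std::complex, as discussed in this thread

// Hypothetical custom complex type; the member names are assumptions.
struct cpu_qcomp {
    qreal re;
    qreal im;
};

// Compile-time guards which fail the build if the two layouts ever diverge.
static_assert(sizeof(cpu_qcomp) == sizeof(qcomp),
              "cpu_qcomp and qcomp differ in size");
static_assert(alignof(cpu_qcomp) == alignof(qcomp),
              "cpu_qcomp and qcomp differ in alignment");
static_assert(std::is_trivially_copyable<cpu_qcomp>::value,
              "cpu_qcomp should be trivially copyable");

// Explicit element-wise copy, which avoids reinterpreting the qcomp array.
inline void copyToCpuQcomps(const qcomp* in, cpu_qcomp* out, std::size_t numElems) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < numElems; i++)
        out[i] = cpu_qcomp{in[i].real(), in[i].imag()};
}
```

For the two or four elements of a DiagMatr1 or DiagMatr2, a copy like this should be negligible next to the cost of the kernel that consumes it.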

TysonRayJones · 2026-04-22T06:20:43Z

You can see the custom backend-specific maths/overloads here and here - so far, it's just pow!

created cpu_qcomp

15028ba

TysonRayJones marked this pull request as draft April 17, 2026 19:54

TysonRayJones added 14 commits April 17, 2026 16:51

renamed cu_qcomp to gpu_qcomp

ab8215c

just to isolate this big diff - structural changes to gpu_qcomp are coming

Restoring gpu_qcomp in-place operators

7353cbf

changed gpu_qcomp getter names to match CPU

43ab41a

and changed &arr[ind] to arr + ind, for visual clarity

created gpu_qcomp

d86c126

but there's so much boilerplate overlap with cpu_qcomp, I wonder if we should unify the two! We can retain separate cpu_qcomp and gpu_qcomp types (for clarity) through a typedef

patch CPU MSVC compiler pitfall

1d62e75

woopsiedoodle

4a3b5af

CUDA patch

7707622

patch HIP

e1dc06c

patch Linux

fddc7f2

silence unused warning

b4fa4c4

attempt patch MSVC

91f39ee

compiler stack-overflows when OpenMP is enabled - possibly due to thread-private instantiation of this 2D array?

temp disabling test compilation

7fc91d8

to debug MSVC + OpenMP compilation failure in CI

more msvc debug

35dcd7c

story of my life bruddah

syntax error

9a4d11a

nice one chatgpt

Unify common cpu_qcomp and gpu_qcomp definitions into base_qcomp

799b9f5


TysonRayJones wants to merge 16 commits into devel from custom-complex-types
