Relativistic 2e integrals #399

Draft
kshitij-05 wants to merge 19 commits into master from kshitij/feature/2e_rkb_ints
kshitij-05 commented Feb 9, 2026

  • Implement 2-electron 4-center relativistic integrals with the restricted kinetic balance (RKB) condition.

    • (LL|SS)
    • (SS|SS)
  • Implement 2-electron 3-center relativistic integrals with RKB.

    • (X|SS)
    • (X|LS)

loriab left a comment

I just noticed this and saw some tweaks to propose. You might want to add the new class to INSTALL.md, too.

  message(VERBOSE "setting components ${_amlist}")

- foreach(_cls ONEBODY;ERI;ERI3;ERI2;G12;G12DKH)
+ foreach(_cls ONEBODY;ERI;RKB_ERI;ERI3;ERI2;G12;G12DKH)
there's a slight disadv to the underscore if ppl are splitting the integral codes (e.g., rkb_eri_ffff_d1) on underscore, but I think RKB_ERI is fine.
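To make the concern concrete, here is a minimal sketch of a downstream consumer that splits an integral code on underscores. The helper `split_on_underscore` and the non-RKB code `eri_ffff_d1` are hypothetical and used only for illustration; only `rkb_eri_ffff_d1` comes from the comment above.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split an integral code into fields on '_'.
inline std::vector<std::string> split_on_underscore(const std::string& code) {
  std::vector<std::string> parts;
  std::stringstream ss(code);
  std::string tok;
  while (std::getline(ss, tok, '_')) parts.push_back(tok);
  return parts;
}

// "eri_ffff_d1" (assumed analog) yields 3 fields: class, AM string, derivative order.
// "rkb_eri_ffff_d1" yields 4 fields, so a parser that assumes the class name
// is always the first field would read "rkb" as the class and "eri" as the AM string.
```

A consumer that matches the class name as a known prefix, rather than by field position, would not be affected.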

… unique am shell sets and phase change for this operator
…+ progress bar + sign fix

- ShellQuartetSetPredicate: add braket-swap tiebreaker for bra_ket_coswappable
  operators (σpσpCoulombσpσp). When la+lb == lc+ld, use max(la,lb) <= lc to
  pick one canonical representative, reducing duplicate quartet generation.

- Engine (engine.impl.h): update swap_braket logic for opop_coulomb_opop to
  match the new predicate tiebreaker. Add coupled-swap sign correction in the
  swap_braket branch (was missing — exposed by d-shell testing).

- build_libint.cc: disable CSE (do_cse/condense_expr) for multi-component
  operators since their 16 components share no intermediates at the expression
  level. This eliminates the superlinear optimize_rr_out bottleneck (e.g.,
  8.8s → 71ms for (ss|ds) prerequisite DAG).

- build_libint.cc: fix compilation when only LIBINT_INCLUDE_RKB_ERI is defined
  (without LIBINT_INCLUDE_ERI): extend #ifdef guards for build_TwoPRep_2b_2k,
  add forward declaration, move make_descr to detail namespace, use if constexpr
  for component descriptor construction.

- buildtest.h: add CodeGenProgress spinner showing elapsed time, function count,
  and current task name on stderr during code generation.

- int_am.cmake: fix typo in OPT_AM variable reference.
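
The braket-swap tiebreaker from the first item can be sketched as a standalone predicate. This is not libint's `ShellQuartetSetPredicate`; the function name is hypothetical, and the primary rule (keep the quartet whose bra angular-momentum sum is smaller) is an assumption made so that the tie case has something to break. Only the tie rule `max(la,lb) <= lc` is taken from the commit message.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: returns true if (la lb | lc ld) is kept as the canonical
// representative of the pair {(la lb | lc ld), (lc ld | la lb)}.
inline bool keep_under_braket_swap(int la, int lb, int lc, int ld) {
  const int bra = la + lb;
  const int ket = lc + ld;
  // Assumed primary ordering: prefer the quartet with the smaller bra sum.
  if (bra != ket) return bra < ket;
  // Tiebreaker quoted in the commit message for la+lb == lc+ld.
  return std::max(la, lb) <= lc;
}
```

For example, with equal sums the quartet (d s | p p) is rejected because max(2,0) <= 1 is false, while its swapped partner (p p | d s) is kept because max(1,1) <= 2 holds, so only one of the two is generated.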

Add a static type_index → visit function cache in optimal_rr(). After the
first vertex of each C++ type is matched via the linear mpl::for_each scan
over MasterIntegralTypeList (~48 types with dynamic_pointer_cast each),
subsequent vertices of the same type dispatch directly to the matching
handler. This eliminates ~31 wasted dynamic_pointer_cast calls per vertex
in RKB prerequisite DAGs where all vertices are TwoPRep_11_11_sq.
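
A minimal sketch of the caching idea, assuming a toy two-type hierarchy in place of MasterIntegralTypeList. All names here (`Vertex`, `TypeA`, `TypeB`, `linear_scan`, `dispatch`, `scan_count`) are hypothetical and are not the identifiers used in optimal_rr(); the sketch only shows how a `std::type_index` key lets later vertices of the same dynamic type skip the cast chain.

```cpp
#include <cassert>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Vertex { virtual ~Vertex() = default; };
struct TypeA : Vertex {};
struct TypeB : Vertex {};

using Handler = std::string (*)(const Vertex&);

inline std::string handle_a(const Vertex&) { return "A"; }
inline std::string handle_b(const Vertex&) { return "B"; }

// Counts how often the slow path runs, for demonstration only.
inline int& scan_count() { static int n = 0; return n; }

// Slow path: try each known type in turn with dynamic_cast.
inline Handler linear_scan(const Vertex& v) {
  ++scan_count();
  if (dynamic_cast<const TypeA*>(&v)) return &handle_a;
  if (dynamic_cast<const TypeB*>(&v)) return &handle_b;
  return nullptr;
}

// Fast path: look up the handler by the vertex's dynamic type; fall back to
// the linear scan only the first time a given type is seen.
inline std::string dispatch(const Vertex& v) {
  static std::unordered_map<std::type_index, Handler> cache;
  const std::type_index key(typeid(v));
  auto it = cache.find(key);
  if (it == cache.end()) it = cache.emplace(key, linear_scan(v)).first;
  return it->second ? it->second(v) : std::string("unhandled");
}
```

Dispatching two `TypeB` vertices runs the scan once; the second call is a single hash lookup. The static map in this sketch is not synchronized, so it assumes single-threaded use.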

Add LIBINT_NUM_WORKERS/LIBINT_WORKER_ID env vars to partition shell quartets
across multiple build_libint processes. Each worker generates code for its
subset of RKB quartets and writes iface fragments to separate files. Worker 0
then merges all fragments and produces the final interface headers.

Non-RKB integrals (onebody, ERI) are generated by all workers (duplicated
but fast). Only the RKB quartet loop is partitioned since it dominates
generation time at higher AM.

Includes bin/build_libint_parallel.sh wrapper script that manages the
two-phase workflow: workers 1..N-1 run in parallel, then worker 0 merges.

Measured 2.8x speedup with 4 workers at RKB_MAX_AM=2 (334s -> 118s).
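
A minimal sketch of how a worker might select its share of quartets from the two environment variables. The commit message does not say how quartets are assigned to workers, so the round-robin rule below is an assumption, and `WorkerSlice`, `env_int`, `slice_from_env`, and `owns` are hypothetical names rather than the code in build_libint.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical description of one worker's place in the pool.
struct WorkerSlice {
  int num_workers = 1;
  int worker_id = 0;
};

// Read an integer environment variable, returning `fallback` if it is
// unset, empty, or not a plain decimal integer.
inline int env_int(const char* name, int fallback) {
  const char* s = std::getenv(name);
  if (s == nullptr || *s == '\0') return fallback;
  char* end = nullptr;
  const long v = std::strtol(s, &end, 10);
  return (*end == '\0') ? static_cast<int>(v) : fallback;
}

// Build the slice from LIBINT_NUM_WORKERS / LIBINT_WORKER_ID; invalid
// combinations fall back to a single serial worker.
inline WorkerSlice slice_from_env() {
  WorkerSlice w;
  w.num_workers = env_int("LIBINT_NUM_WORKERS", 1);
  w.worker_id = env_int("LIBINT_WORKER_ID", 0);
  if (w.num_workers < 1 || w.worker_id < 0 || w.worker_id >= w.num_workers) {
    w = WorkerSlice{};
  }
  return w;
}

// Assumed round-robin assignment: quartet i belongs to worker i mod N.
inline bool owns(const WorkerSlice& w, std::size_t quartet_index) {
  return quartet_index % static_cast<std::size_t>(w.num_workers) ==
         static_cast<std::size_t>(w.worker_id);
}
```

Under this rule every quartet index is owned by exactly one worker, which is what lets the per-worker fragments be merged without duplicates.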

When LIBINT2_NUM_WORKERS > 1, the export step runs build_libint via the
build_libint_parallel.sh wrapper, launching N parallel workers.

Usage: cmake -S . -B build -DLIBINT2_NUM_WORKERS=4

Default is 1 (serial, same behavior as before).

Removes LIBINT2_NUM_WORKERS, WorkerConfig, build_libint_parallel.sh, and
all worker partitioning logic. The process-level parallelism produced
incomplete output (missing CR header files) because generate_rr_code needs
external symbols from ALL quartets but workers only discover their subset.

Retains: type dispatch cache, CSE disable, braket tiebreaker, progress bar.
@kshitij-05 kshitij-05 force-pushed the kshitij/feature/2e_rkb_ints branch from 25265a6 to 41f1dad Compare March 22, 2026 18:19