Draft
loriab (Collaborator) reviewed on Feb 9, 2026 and left a comment:
I just noticed this and saw some tweaks to propose. You might want to add the new class to INSTALL.md, too.
      message(VERBOSE "setting components ${_amlist}")

    - foreach(_cls ONEBODY;ERI;ERI3;ERI2;G12;G12DKH)
    + foreach(_cls ONEBODY;ERI;RKB_ERI;ERI3;ERI2;G12;G12DKH)
There's a slight disadvantage to the underscore if people are splitting the integral codes (e.g., rkb_eri_ffff_d1) on underscores, but I think RKB_ERI is fine.
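To make the concern concrete, here is a minimal sketch of what a downstream consumer that splits integral codes on underscores would see. The helper `split_on_underscore` is hypothetical (it is not part of libint), and the unprefixed code `eri_ffff_d1` is assumed by analogy with the `rkb_eri_ffff_d1` example above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of libint): split an integral code such as
// "rkb_eri_ffff_d1" into its underscore-separated fields.
std::vector<std::string> split_on_underscore(const std::string& code) {
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type pos = code.find('_', start);
    if (pos == std::string::npos) {
      // Last field: everything after the final underscore.
      fields.push_back(code.substr(start));
      break;
    }
    fields.push_back(code.substr(start, pos - start));
    start = pos + 1;
  }
  return fields;
}
```

With this helper, `eri_ffff_d1` yields three fields (`eri`, `ffff`, `d1`), while `rkb_eri_ffff_d1` yields four, so code that assumes the class name is always the first field would read `rkb` rather than `rkb_eri`.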
… unique am shell sets and phase change for this operator
…l differentiator when on MacOS
…t and more cleanup
…+ progress bar + sign fix
- ShellQuartetSetPredicate: add braket-swap tiebreaker for bra_ket_coswappable operators (σpσpCoulombσpσp). When la+lb == lc+ld, use max(la,lb) <= lc to pick one canonical representative, reducing duplicate quartet generation.
- Engine (engine.impl.h): update swap_braket logic for opop_coulomb_opop to match the new predicate tiebreaker. Add coupled-swap sign correction in the swap_braket branch (was missing; exposed by d-shell testing).
- build_libint.cc: disable CSE (do_cse/condense_expr) for multi-component operators since their 16 components share no intermediates at the expression level. This eliminates the superlinear optimize_rr_out bottleneck (e.g., 8.8s → 71ms for (ss|ds) prerequisite DAG).
- build_libint.cc: fix compilation when only LIBINT_INCLUDE_RKB_ERI is defined (without LIBINT_INCLUDE_ERI): extend #ifdef guards for build_TwoPRep_2b_2k, add forward declaration, move make_descr to detail namespace, use if constexpr for component descriptor construction.
- buildtest.h: add CodeGenProgress spinner showing elapsed time, function count, and current task name on stderr during code generation.
- int_am.cmake: fix typo in OPT_AM variable reference.
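The braket-swap tiebreaker from the first bullet can be sketched as a small predicate. This is an illustration of the stated comparison only, not the actual ShellQuartetSetPredicate code; the function name `keep_on_braket_tie` is hypothetical, and it covers just the tied case la+lb == lc+ld.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not the libint implementation): for a quartet
// (la lb | lc ld) whose bra and ket carry the same total angular momentum,
// apply the comparison quoted in the commit message to decide whether this
// ordering is kept or its bra-ket swap is used instead.
bool keep_on_braket_tie(int la, int lb, int lc, int ld) {
  assert(la + lb == lc + ld);  // only meaningful for the tied case
  return std::max(la, lb) <= lc;
}
```

For example, (pp|ds) gives `keep_on_braket_tie(1, 1, 2, 0) == true`, while its swap (ds|pp) gives `keep_on_braket_tie(2, 0, 1, 1) == false`, so only one of the pair is generated. If shells within each pair are ordered so that la >= lb and lc >= ld (an assumption here; the commit message does not state the in-pair ordering), the test reduces to la <= lc and both orderings pass only when the quartet equals its own swap.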
Add a static type_index → visit function cache in optimal_rr(). After the first vertex of each C++ type is matched via the linear mpl::for_each scan over MasterIntegralTypeList (~48 types with dynamic_pointer_cast each), subsequent vertices of the same type dispatch directly to the matching handler. This eliminates ~31 wasted dynamic_pointer_cast calls per vertex in RKB prerequisite DAGs where all vertices are TwoPRep_11_11_sq.
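The caching idea can be sketched in isolation as follows. This is a simplified stand-in, not the optimal_rr() code: the `Vertex`, `TypeA`, `TypeB` classes, the handlers, and the `probe_count` counter are all hypothetical, and a two-entry if-chain stands in for the mpl::for_each scan over MasterIntegralTypeList.

```cpp
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical vertex hierarchy standing in for the integral types.
struct Vertex { virtual ~Vertex() = default; };
struct TypeA : Vertex {};
struct TypeB : Vertex {};

using Handler = int (*)(const Vertex&);
int handle_a(const Vertex&) { return 1; }
int handle_b(const Vertex&) { return 2; }

// Counts dynamic_cast probes so the effect of the cache is observable.
int probe_count = 0;

// Slow path: try each candidate type in turn until one matches.
Handler find_handler_by_scan(const Vertex& v) {
  ++probe_count;
  if (dynamic_cast<const TypeA*>(&v) != nullptr) return &handle_a;
  ++probe_count;
  if (dynamic_cast<const TypeB*>(&v) != nullptr) return &handle_b;
  return nullptr;
}

// Fast path: remember the handler found for each dynamic type, so later
// vertices of the same type skip the scan entirely.
int dispatch(const Vertex& v) {
  static std::unordered_map<std::type_index, Handler> cache;
  const std::type_index key(typeid(v));  // dynamic type of the vertex
  auto it = cache.find(key);
  if (it == cache.end()) {
    it = cache.emplace(key, find_handler_by_scan(v)).first;
  }
  return it->second != nullptr ? it->second(v) : -1;
}
```

In this sketch the first `TypeB` vertex costs two probes and every later `TypeB` vertex costs none. The static map is not guarded against concurrent insertion, so this form assumes single-threaded use.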
Add LIBINT_NUM_WORKERS/LIBINT_WORKER_ID env vars to partition shell quartets across multiple build_libint processes. Each worker generates code for its subset of RKB quartets and writes iface fragments to separate files. Worker 0 then merges all fragments and produces the final interface headers.
Non-RKB integrals (onebody, ERI) are generated by all workers (duplicated but fast). Only the RKB quartet loop is partitioned since it dominates generation time at higher AM.
Includes bin/build_libint_parallel.sh wrapper script that manages the two-phase workflow: workers 1..N-1 run in parallel, then worker 0 merges.
Measured 2.8x speedup with 4 workers at RKB_MAX_AM=2 (334s -> 118s).
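A minimal sketch of how such a partition might look is given below. The commit message does not say how quartets are assigned to workers, so the round-robin rule here is an assumption, and both `read_env_uint` and `quartets_for_worker` are hypothetical helpers rather than build_libint functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: read an unsigned integer from an environment variable,
// returning `fallback` when it is unset, empty, or not a plain number.
unsigned read_env_uint(const char* name, unsigned fallback) {
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') return fallback;
  char* end = nullptr;
  const unsigned long value = std::strtoul(raw, &end, 10);
  return (*end == '\0') ? static_cast<unsigned>(value) : fallback;
}

// Hypothetical helper: indices of the quartets handled by one worker under an
// assumed round-robin assignment (quartet i goes to worker i % num_workers).
std::vector<std::size_t> quartets_for_worker(std::size_t num_quartets,
                                             unsigned num_workers,
                                             unsigned worker_id) {
  assert(num_workers > 0 && worker_id < num_workers);
  std::vector<std::size_t> mine;
  for (std::size_t i = worker_id; i < num_quartets; i += num_workers) {
    mine.push_back(i);
  }
  return mine;
}
```

A worker would call `read_env_uint("LIBINT_NUM_WORKERS", 1)` and `read_env_uint("LIBINT_WORKER_ID", 0)` and then loop over `quartets_for_worker(...)` only. With 10 quartets and 4 workers, worker 1 would handle quartets 1, 5, and 9.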
When LIBINT2_NUM_WORKERS > 1, the export step runs build_libint via the build_libint_parallel.sh wrapper, launching N parallel workers.
Usage: cmake -S . -B build -DLIBINT2_NUM_WORKERS=4
Default is 1 (serial, same behavior as before).
Removes LIBINT2_NUM_WORKERS, WorkerConfig, build_libint_parallel.sh, and all worker partitioning logic. The process-level parallelism produced incomplete output (missing CR header files) because generate_rr_code needs external symbols from ALL quartets but workers only discover their subset. Retains: type dispatch cache, CSE disable, braket tiebreaker, progress bar.
25265a6 to 41f1dad
Implement 2-electron 4-center relativistic integrals with restricted kinetic balance condition (RKB).
Implement 2e 3-center relativistic integrals with RKB.