{2025.06}[foss/2025b] SPH-EXA-0.96.1 CUDA-12.9.1 #1453

Open
pescobar wants to merge 2 commits into EESSI:main from pescobar:sph-exa-2025.6

@pescobar
Contributor

No description provided.

@ocaisa
Member

ocaisa commented Mar 20, 2026

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80
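The build command above is a list of space-separated `key:value` arguments, with the `for:` argument carrying comma-separated `key=value` pairs. As a rough illustration of that structure only, here is a minimal Python sketch; `parse_bot_command` is a hypothetical helper and is not taken from the EESSI bot's code, which may parse commands differently.

```python
def parse_bot_command(line: str) -> dict:
    """Split a 'bot: <action> key:value ...' comment into a dictionary.

    Hypothetical helper for illustration; not the EESSI bot's real parser.
    """
    prefix = "bot:"
    if not line.startswith(prefix):
        raise ValueError("not a bot command")
    tokens = line[len(prefix):].split()
    parsed = {"action": tokens[0]}
    for token in tokens[1:]:
        # Split only on the first colon so values may contain further colons.
        key, _, value = token.partition(":")
        parsed[key] = value
    # The 'for' argument holds comma-separated key=value pairs (arch, accel).
    if "for" in parsed:
        parsed["for"] = dict(
            item.split("=", 1) for item in parsed["for"].split(",")
        )
    return parsed


command = (
    "bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf "
    "for:arch=x86_64/intel/icelake,accel=nvidia/cc80"
)
result = parse_bot_command(command)
print(result["action"], result["repo"], result["for"])
```

Read this way, the command asks the `eessi-bot-surf` instance to build for the `x86_64/intel/icelake` CPU target with the `nvidia/cc80` accelerator.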

@eessi-bot-surf

eessi-bot-surf bot commented Mar 20, 2026

New job on instance eessi-bot-surf for repository eessi.io-2025.06-software
Building on: intel-icelake and accelerator nvidia/cc80
Building for: x86_64/intel/icelake and accelerator nvidia/cc80
Job dir: /projects/eessibot/eessi-bot-surf/jobs/2026.03/pr_1453/20996749

| date | job status | comment |
| --- | --- | --- |
| Mar 20 12:25:56 UTC 2026 | submitted | job id 20996749 will be eligible to start in about 20 seconds |
| Mar 20 12:26:07 UTC 2026 | received | job awaits launch by Slurm scheduler |
| Mar 20 12:26:21 UTC 2026 | running | job 20996749 is running |

@ocaisa
Member

ocaisa commented Mar 20, 2026

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80

@eessi-bot-surf

eessi-bot-surf bot commented Mar 20, 2026

New job on instance eessi-bot-surf for repository eessi.io-2025.06-software
Building on: intel-icelake and accelerator nvidia/cc80
Building for: x86_64/intel/icelake and accelerator nvidia/cc80
Job dir: /projects/eessibot/eessi-bot-surf/jobs/2026.03/pr_1453/20996760

| date | job status | comment |
| --- | --- | --- |
| Mar 20 12:27:02 UTC 2026 | submitted | job id 20996760 will be eligible to start in about 20 seconds |
| Mar 20 12:27:11 UTC 2026 | received | job awaits launch by Slurm scheduler |
| Mar 20 12:27:27 UTC 2026 | running | job 20996760 is running |
Mar 20 12:29:29 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-20996760.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-x86_64-intel-icelake-accel-nvidia-cc80-17740097090.tar.zst
size: 0 MiB (22 bytes)
entries: 0
modules under 2025.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80/modules/all
no module files in tarball
software under 2025.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80/software
no software packages in tarball
reprod directories under 2025.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80/reprod
no reprod directories in tarball
other under 2025.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80
no other files in tarball
Mar 20 12:29:29 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2024a-CUDA-12.6.0 %scale=1_4_node %device_type=gpu /526cd259 @BotBuildTests:gpu_a100+default [Skipping GPU test : only 1 GPU available for this test case]
[ SKIP ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2024a-CUDA-12.6.0 %scale=1_4_node %device_type=gpu /416eaee1 @BotBuildTests:gpu_a100+default [Skipping GPU test : only 1 GPU available for this test case]
[ SKIP ] (3/4) EESSI_OSU_pt2pt_GPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2024a-CUDA-12.6.0 %scale=1_4_node /73a202f1 @BotBuildTests:gpu_a100+default [Skipping test : 1 GPU(s) available for this test case, need exactly 2]
[ SKIP ] (4/4) EESSI_OSU_pt2pt_GPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2024a-CUDA-12.6.0 %scale=1_4_node /7f04eb2b @BotBuildTests:gpu_a100+default [Skipping test : 1 GPU(s) available for this test case, need exactly 2]
[ PASSED ] Ran 0/4 test case(s) from 4 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-20996760.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
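The build checklist in the bot report reads like a set of pattern searches over the job output file: some patterns must be absent (`FATAL:`, `ERROR:`, `FAILED:`, `required modules missing:`) and others must be present (`No missing installations`, `.tar.* created!`). The sketch below reproduces that pass/fail logic under those assumptions. The patterns are taken from the report, but `run_checks`, `summarize`, and the sample log are hypothetical and are not the bot's actual implementation.

```python
import re

# (label, regex, must_be_absent): labels and patterns follow the bot report;
# treating them as simple regex searches is an assumption.
BUILD_CHECKS = [
    ("FATAL:", r"FATAL:", True),
    ("ERROR:", r"ERROR:", True),
    ("FAILED:", r"FAILED:", True),
    ("required modules missing:", r"required modules missing:", True),
    ("No missing installations", r"No missing installations", False),
    (".tar.* created!", r"\.tar\..* created!", False),
]


def run_checks(log_text, checks=BUILD_CHECKS):
    """Return (label, found, passed) for each check against the log text."""
    results = []
    for label, pattern, must_be_absent in checks:
        found = re.search(pattern, log_text) is not None
        passed = (not found) if must_be_absent else found
        results.append((label, found, passed))
    return results


def summarize(results):
    """A job succeeds only if every individual check passed."""
    return "SUCCESS" if all(passed for _, _, passed in results) else "FAILURE"


# Hypothetical log excerpt, invented for illustration; only the
# "required modules missing" line mirrors text quoted in this PR.
sample_log = """ERROR: hypothetical error line
FAILED: hypothetical failure line
5 out of 60 required modules missing:
/tmp/hypothetical.tar.zst created!
"""

results = run_checks(sample_log)
for label, found, passed in results:
    print("PASS" if passed else "FAIL", label)
print(summarize(results))
```

With this sample input the sketch yields the same pattern as the build report: the `FATAL:` and tarball checks pass, the other four fail, and the overall status is a failure.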

Comment on lines +2 to +3
- UCX-CUDA-1.19.0-GCCcore-14.3.0-CUDA-12.9.1.eb:
- UCC-CUDA-1.4.4-GCCcore-14.3.0-CUDA-12.9.1.eb:
Member

Suggested change
- UCX-CUDA-1.19.0-GCCcore-14.3.0-CUDA-12.9.1.eb:
- UCC-CUDA-1.4.4-GCCcore-14.3.0-CUDA-12.9.1.eb:

No need to include these; they will be built automatically as dependencies.

@ocaisa
Member

ocaisa commented Mar 20, 2026

5 out of 60 required modules missing:

* GDRCopy/2.5-GCCcore-14.3.0 (GDRCopy-2.5-GCCcore-14.3.0.eb)
* UCX-CUDA/1.19.0-GCCcore-14.3.0-CUDA-12.9.1 (UCX-CUDA-1.19.0-GCCcore-14.3.0-CUDA-12.9.1.eb)
* NCCL/2.27.7-GCCcore-14.3.0-CUDA-12.9.1 (NCCL-2.27.7-GCCcore-14.3.0-CUDA-12.9.1.eb)
* UCC-CUDA/1.4.4-GCCcore-14.3.0-CUDA-12.9.1 (UCC-CUDA-1.4.4-GCCcore-14.3.0-CUDA-12.9.1.eb)
* SPH-EXA/0.96.1-foss-2025b-CUDA-12.9.1 (SPH-EXA-0.96.1-foss-2025b-CUDA-12.9.1.eb)

It's failing because when we build for a GPU, we only allow GPU dependencies to be built automatically. This PR requires GDRCopy-2.5-GCCcore-14.3.0.eb, which does not need a GPU, so a separate CPU PR will have to be created for that first.
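To make the rule concrete, the sketch below splits the five missing modules into GPU and CPU-only groups. It assumes, purely for illustration, that a `-CUDA-` version suffix in the module name marks a GPU build; the real build tooling may decide this differently, and `is_gpu_module` is a hypothetical helper.

```python
# Missing modules as listed in the comment above.
MISSING_MODULES = [
    "GDRCopy/2.5-GCCcore-14.3.0",
    "UCX-CUDA/1.19.0-GCCcore-14.3.0-CUDA-12.9.1",
    "NCCL/2.27.7-GCCcore-14.3.0-CUDA-12.9.1",
    "UCC-CUDA/1.4.4-GCCcore-14.3.0-CUDA-12.9.1",
    "SPH-EXA/0.96.1-foss-2025b-CUDA-12.9.1",
]


def is_gpu_module(module: str) -> bool:
    """Assumption: a '-CUDA-' version suffix identifies a GPU build."""
    return "-CUDA-" in module


gpu_modules = [m for m in MISSING_MODULES if is_gpu_module(m)]
cpu_only_modules = [m for m in MISSING_MODULES if not is_gpu_module(m)]

print("Eligible for automatic build in a GPU job:")
for module in gpu_modules:
    print("  ", module)
print("Needs a separate CPU PR first:")
for module in cpu_only_modules:
    print("  ", module)
```

Under that assumption GDRCopy is the only CPU-only module in the list, which matches the explanation that it has to be installed through a CPU build before this GPU build can proceed.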
