Allow EESSI to be re-initialised in a subshell (since a subshell may restore a system module tool) #195

Open
ocaisa wants to merge 6 commits into EESSI:main from ocaisa:reset_in_new_shell

Conversation

ocaisa (Member) commented Apr 1, 2026

No description provided.

ocaisa (Member, Author) commented Apr 1, 2026

@boegel I could not figure out how to add a test for this with csh, but the other cases seem to be OK. I did test this with csh and it works as expected.
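
One way to picture the problem named in the title, assuming initialisation is normally skipped when a marker variable inherited from the parent shell is already set: in a subshell the marker is still there, but the system's startup files may have pointed `module` back at a system module tool. The sketch below is not the code in this PR. The function name `needs_eessi_init`, the marker variable `EESSI_INIT_MARKER`, and the `/cvmfs/software.eessi.io` prefix are assumptions made for illustration; `LMOD_CMD` is the variable Lmod sets to the path of its command.

```shell
# Hypothetical helper, not taken from this PR.
#   $1: value of an "already initialised" marker variable (may be empty)
#   $2: path of the Lmod command the current shell would use (e.g. $LMOD_CMD)
#   $3: prefix under which the EESSI module tool is assumed to live
# Returns 0 when EESSI should be (re-)initialised, 1 otherwise.
needs_eessi_init() {
    # No marker at all: this environment has never been initialised.
    if [ -z "$1" ]; then
        return 0
    fi
    # Marker inherited, but the active module tool is outside the EESSI
    # prefix: a system module tool has been restored, so re-initialise.
    case "$2" in
        "$3"/*) return 1 ;;
        *)      return 0 ;;
    esac
}

# Example call; EESSI_INIT_MARKER is a made-up name for illustration.
if needs_eessi_init "${EESSI_INIT_MARKER:-}" "${LMOD_CMD:-}" /cvmfs/software.eessi.io; then
    echo "EESSI needs (re-)initialisation in this shell"
fi
```

The point of the sketch is only that "marker is set" and "EESSI's module tool is active" are two separate conditions in a subshell; how the PR actually tells them apart is in the diff, not here.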

boegel (Contributor) left a comment

lgtm

boegel (Contributor) commented Apr 3, 2026

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2
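
The two commands above follow a `bot: build key:value ...` shape with `repo:`, `instance:` and `for:` arguments. As a rough illustration of how such a line breaks down, here is a minimal sketch; `parse_bot_build` is a hypothetical helper and is not how the EESSI bot itself parses commands.

```shell
# Hypothetical parser for the "bot: build ..." comment format shown above.
#   $1: one full command line
# Prints one key=value line per recognised argument; returns 1 if the line
# does not start with "bot: build ".
parse_bot_build() {
    case "$1" in
        "bot: build "*) ;;
        *) return 1 ;;
    esac
    (
        # Run in a subshell so that disabling globbing and the temporary
        # variables do not leak into the caller.
        set -f
        rest=${1#"bot: build "}
        for word in $rest; do
            case "$word" in
                repo:*)     echo "repo=${word#repo:}" ;;
                instance:*) echo "instance=${word#instance:}" ;;
                for:*)      echo "for=${word#for:}" ;;
            esac
        done
    )
}

parse_bot_build 'bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2'
```

Only the first colon of each argument is treated as the separator, so `for:arch=x86_64/amd/zen2` yields the value `arch=x86_64/amd/zen2` unchanged.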

eessi-bot-aws (bot) commented Apr 3, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2023.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_195/145409

date | job status | comment
Apr 03 13:24:15 UTC 2026 submitted job id 145409 awaits release by job manager
Apr 03 13:24:33 UTC 2026 released job awaits launch by Slurm scheduler
Apr 03 14:45:43 UTC 2026 finished
🤷 UNKNOWN (click triangle for detailed information)
  • Job results file _bot_job145409.result does not exist in job directory or reading it failed.
  • No artefacts were found/reported.
Apr 03 14:45:43 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job145409.test does not exist in job directory or reading it failed.

eessi-bot-aws (bot) commented Apr 3, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2026.04/pr_195/145410

date | job status | comment
Apr 03 13:24:20 UTC 2026 submitted job id 145410 awaits release by job manager
Apr 03 13:24:31 UTC 2026 released job awaits launch by Slurm scheduler
Apr 03 14:45:41 UTC 2026 finished
🤷 UNKNOWN (click triangle for detailed information)
  • Job results file _bot_job145410.result does not exist in job directory or reading it failed.
  • No artefacts were found/reported.
Apr 03 14:45:41 UTC 2026 test result
🤷 UNKNOWN (click triangle for detailed information)
  • Job test file _bot_job145410.test does not exist in job directory or reading it failed.

boegel (Contributor) commented Apr 3, 2026

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-deucalion for:arch=aarch64/a64fx

eessi-bot-deucalion (bot) commented Apr 3, 2026

New job on instance eessi-bot-deucalion for repository eessi.io-2023.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2026.04/pr_195/1117959

date | job status | comment
Apr 03 14:43:47 UTC 2026 submitted job id 1117959 awaits release by job manager
Apr 03 14:44:51 UTC 2026 released job awaits launch by Slurm scheduler
Apr 03 14:45:54 UTC 2026 running job 1117959 is running
Apr 03 14:55:21 UTC 2026 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-1117959.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-17752276910.tar.zst
size: 0 MiB (4400 bytes)
entries: 6
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2023.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2023.06/software/linux/aarch64/a64fx
2023.06/init/lmod/bash
2023.06/init/lmod/csh
2023.06/init/lmod/fish
2023.06/init/lmod/ksh
2023.06/init/lmod/sh
2023.06/init/lmod/zsh
Apr 03 14:55:21 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] ( 1/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 2/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 3/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ SKIP ] ( 4/10) Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed
[ OK ] ( 5/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos %scale=1_node /aeb2d9df @BotBuildTests:a64fx+default
P: perf: 579.708 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos %scale=1_node /04ff9ece @BotBuildTests:a64fx+default
P: perf: 524.34 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /15cad6c4 @BotBuildTests:a64fx+default
P: latency: 1.66 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /6672deda @BotBuildTests:a64fx+default
P: latency: 1.74 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %scale=1_node /2a9a47b1 @BotBuildTests:a64fx+default
P: bandwidth: 8074.57 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %scale=1_node /1b24ab8e @BotBuildTests:a64fx+default
P: bandwidth: 8125.19 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 6/10 test case(s) from 10 check(s) (0 failure(s), 4 skipped, 0 aborted)
Details
✅ job output file slurm-1117959.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Apr 04 09:21:11 UTC 2026 uploaded transfer of eessi-2023.06-software-linux-aarch64-a64fx-17752276910.tar.zst to S3 bucket succeeded
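
The build "Details" list in the report above amounts to a set of patterns that must be absent from the job output file and a set that must be present. A minimal sketch of that kind of check follows; `check_build_log` is a hypothetical function written for illustration, the patterns are copied from the report, and the bot's real implementation may differ.

```shell
# Sketch only: mirrors the checks listed in the bot report, not the bot's code.
#   $1: path to a job output file (for example a slurm-<jobid>.out file)
# Prints one line per check and returns 0 only if every check passes.
check_build_log() {
    check_status=0
    # Patterns that must NOT appear anywhere in the log.
    for pattern in 'FATAL:' 'ERROR:' 'FAILED:' 'required modules missing:'; do
        if grep -q -e "$pattern" "$1"; then
            echo "FAIL: found message matching $pattern"
            check_status=1
        else
            echo "OK: no message matching $pattern"
        fi
    done
    # Patterns that MUST appear in the log (treated as basic regular expressions).
    for pattern in 'No missing installations' '.tar.* created!'; do
        if grep -q -e "$pattern" "$1"; then
            echo "OK: found message matching $pattern"
        else
            echo "FAIL: no message matching $pattern"
            check_status=1
        fi
    done
    return "$check_status"
}
```

Called as `check_build_log slurm-1117959.out`, it would print six lines, one per pattern, and its exit status could be used to decide between a success and a failure report.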

eessi-bot-deucalion (bot) commented Apr 3, 2026

New job on instance eessi-bot-deucalion for repository eessi.io-2025.06-software
Building on: a64fx
Building for: aarch64/a64fx
Job dir: /home/eessibot/new-bot/jobs/2026.04/pr_195/1117960

date | job status | comment
Apr 03 14:43:53 UTC 2026 submitted job id 1117960 awaits release by job manager
Apr 03 14:44:48 UTC 2026 released job awaits launch by Slurm scheduler
Apr 03 14:45:57 UTC 2026 running job 1117960 is running
Apr 03 14:53:17 UTC 2026 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-1117960.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-aarch64-a64fx-17752276400.tar.zst
size: 0 MiB (4476 bytes)
entries: 6
modules under 2025.06/software/linux/aarch64/a64fx/modules/all
no module files in tarball
software under 2025.06/software/linux/aarch64/a64fx/software
no software packages in tarball
reprod directories under 2025.06/software/linux/aarch64/a64fx/reprod
no reprod directories in tarball
other under 2025.06/software/linux/aarch64/a64fx
2025.06/init/lmod/bash
2025.06/init/lmod/csh
2025.06/init/lmod/fish
2025.06/init/lmod/ksh
2025.06/init/lmod/sh
2025.06/init/lmod/zsh
Apr 03 14:53:17 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ SKIP ] (1/5) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:a64fx+default [Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed]
[ SKIP ] (2/5) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:a64fx+default [Skipping test: nodes in this partition only have 30720 MiB memory available (per node) according to the current ReFrame configuration, but 49152 MiB is needed]
[ OK ] (3/5) EESSI_LAMMPS_lj %device_type=cpu %module_name=LAMMPS/22Jul2025-foss-2024a-kokkos %scale=1_node /ade8cad7 @BotBuildTests:a64fx+default
P: perf: 558.809 timesteps/s (r:0, l:None, u:None)
[ OK ] (4/5) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:a64fx+default
P: latency: 0.89 us (r:0, l:None, u:None)
[ OK ] (5/5) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:a64fx+default
P: bandwidth: 7941.78 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 3/5 test case(s) from 5 check(s) (0 failure(s), 2 skipped, 0 aborted)
Details
✅ job output file slurm-1117960.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Apr 04 09:21:19 UTC 2026 uploaded transfer of eessi-2025.06-software-linux-aarch64-a64fx-17752276400.tar.zst to S3 bucket succeeded
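
The final `[ PASSED ]` line of each ReFrame summary carries the counts needed to sanity-check a run: in both reports, the test cases run plus the skipped ones add up to the total (6 + 4 = 10 and 3 + 2 = 5). A small sketch for pulling those numbers out of such a line is shown below; `parse_reframe_summary` is a hypothetical helper, and the expression is written against the two lines in this conversation rather than against ReFrame's documented output format.

```shell
# Hypothetical helper: extract the counters from a ReFrame summary line.
#   $1: a line such as
#       "[ PASSED ] Ran 3/5 test case(s) from 5 check(s) (0 failure(s), 2 skipped, 0 aborted)"
# Prints "ran=.. total=.. checks=.. failures=.. skipped=.. aborted=..",
# or nothing if the line does not have this shape.
parse_reframe_summary() {
    # One captured run of digits, in basic regular expression syntax.
    num='\([0-9][0-9]*\)'
    printf '%s\n' "$1" | sed -n \
        "s|.*Ran $num/$num test case(s) from $num check(s) ($num failure(s), $num skipped, $num aborted).*|ran=\1 total=\2 checks=\3 failures=\4 skipped=\5 aborted=\6|p"
}

parse_reframe_summary '[ PASSED ] Ran 3/5 test case(s) from 5 check(s) (0 failure(s), 2 skipped, 0 aborted)'
```

In basic regular expressions the unescaped parentheses in `case(s)` and `check(s)` match literally, which is why only the digit groups are written with `\(` and `\)`.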

boegel (Contributor) commented Apr 3, 2026

There's trouble in Terraform Cloud currently (see https://status.hashicorp.com/incidents/01KN93PZZ0VD0NZ6NW577RH51K), so the zen2 nodes are not being spun up.
I've cancelled those jobs and am using a64fx as an escape hatch (the CPU target doesn't matter for changes to init/*).
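
The a64fx artefact listings earlier in this conversation are consistent with that remark: each tarball has six entries, all under `<version>/init/lmod/`, and no modules or software. A minimal sketch of checking that a list of tarball entries stays under `init/` is given below; `only_init_changes` is a hypothetical helper for illustration and is not part of the bot.

```shell
# Hypothetical helper: succeed only if every entry lies under <version>/init/.
#   $1:   EESSI version prefix, e.g. 2023.06
#   $2..: tarball entries (paths relative to the repository root)
only_init_changes() {
    version="$1"
    shift
    for entry in "$@"; do
        case "$entry" in
            "$version"/init/*) ;;      # entry under init/: fine
            *) return 1 ;;             # anything else: not an init-only change
        esac
    done
    return 0
}

if only_init_changes 2023.06 \
    2023.06/init/lmod/bash 2023.06/init/lmod/csh 2023.06/init/lmod/fish \
    2023.06/init/lmod/ksh 2023.06/init/lmod/sh 2023.06/init/lmod/zsh; then
    echo "only init/ files in tarball"
fi
```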

boegel added the labels 2025.06-software.eessi.io (2025.06 version of software.eessi.io), 2023.06-software.eessi.io (2023.06 version of software.eessi.io), and bot:deploy on Apr 4, 2026
Labels

2023.06-software.eessi.io (2023.06 version of software.eessi.io)
2025.06-software.eessi.io (2025.06 version of software.eessi.io)
bot:deploy

Projects

None yet

2 participants