
A100 scoring changes #916

Merged
priyakasimbeg merged 9 commits into dev from a100_scoring_changes
Mar 27, 2026

@aahladc aahladc commented Mar 19, 2026

Updating scripts to work well on Slurm with the new round of submissions, which run on A100.

priyakasimbeg and others added 7 commits March 9, 2026 13:13
1. Add finewebedu workload.
2. Make num_trials and num_studies flag-defined, since they can vary between the self-tuning and external tuning rulesets.
3. Have every run use a different seed for more variability
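
As a rough illustration of the three points above, the sketch below reads the study and trial counts from flags and derives a distinct seed for every (study, trial) run. It is not the code from this PR: the flag names, the placeholder defaults, the use of `argparse`, and the helper `make_run_seeds` are all assumptions made for the example.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse study/trial counts from flags instead of hardcoding them."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag names and placeholder defaults; the real values
    # depend on the tuning ruleset.
    parser.add_argument("--num_studies", type=int, default=1)
    parser.add_argument("--num_tuning_trials", type=int, default=1)
    parser.add_argument("--base_seed", type=int, default=0)
    return parser.parse_args(argv)


def make_run_seeds(num_studies, num_trials, base_seed):
    """Return a dict mapping (study, trial) to a distinct seed.

    Sampling without replacement guarantees that no two runs share a seed,
    and seeding the generator keeps the assignment reproducible.
    """
    rng = random.Random(base_seed)
    seeds = rng.sample(range(2**31 - 1), num_studies * num_trials)
    return {
        (study, trial): seeds[study * num_trials + trial]
        for study in range(num_studies)
        for trial in range(num_trials)
    }
```

For example, `make_run_seeds(2, 3, base_seed=7)` yields six different seeds, and calling it again with the same arguments returns the same mapping.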
1. Ensure any variable can be passed in via flags. Folks shouldn't have to edit the file and hardcode variables for any reason.
2. Pass max global steps via a flag.
3. Update some default values for the new submission (repo/image/config file/logs bucket)
The script is meant to be used only on the Slurm cluster. It requires a specific directory structure and checks for it at startup. If the structure is not as expected, it raises an error and explains the structure it expects.

It also includes a dry run flag, which runs the job for 10 steps. A command showing how to use it is at the top of the file.
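
The behaviour described above could look roughly like the following. This is a sketch in Python for illustration only, and the actual script may be written differently. The directory names in `EXPECTED_SUBDIRS` and the helpers `check_layout` and `resolve_max_steps` are hypothetical. Only the 10-step dry run comes from the description.

```python
from pathlib import Path

# Hypothetical layout; the real script defines its own expected structure.
EXPECTED_SUBDIRS = ("submissions", "logs")


def check_layout(root, expected=EXPECTED_SUBDIRS):
    """Fail fast, with an explanation, if the directory layout is wrong."""
    root = Path(root)
    missing = [name for name in expected if not (root / name).is_dir()]
    if missing:
        layout = "\n".join(f"  {root.name}/{name}/" for name in expected)
        raise FileNotFoundError(
            f"Missing directories under {root}: {', '.join(missing)}\n"
            f"Expected layout:\n{layout}"
        )


def resolve_max_steps(max_global_steps, dry_run, dry_run_steps=10):
    """In dry run mode, cap the job at a handful of steps."""
    return dry_run_steps if dry_run else max_global_steps
```

Running the check before any job is submitted means a misplaced directory surfaces immediately rather than after the job has been queued.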
Also update the README to explain this script.
1. Update base workloads to 9 (with finewebedu).
2. Remove all logic related to test targets, since they are no longer used. Work only with validation targets.
3. Fix step time computation.
@aahladc aahladc requested a review from a team as a code owner March 19, 2026 20:50
github-actions bot commented Mar 19, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

aahladc added 2 commits March 24, 2026 15:21
1. Fix linter issue in make_job_config
2. Fix typo in docker image creation script
3. Add importlib resources to ensure backwards compatibility. Without it, the tests seem to fail with ModuleNotFoundError.
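
The commit message does not say how the import was added. One common pattern for keeping `resources.files` available on older interpreters is to fall back to the `importlib_resources` backport, sketched below under that assumption. The helper `packaged_file_exists` is hypothetical and only demonstrates the call.

```python
import sys

if sys.version_info >= (3, 9):
    # importlib.resources.files() is available in the standard library.
    from importlib import resources
else:
    # Assumption: the third-party backport is installed on older Pythons.
    import importlib_resources as resources


def packaged_file_exists(package, filename):
    """Return True if `filename` ships inside the importable `package`."""
    return (resources.files(package) / filename).is_file()
```

With this guard, the rest of the code can use `resources.files(...)` without caring which interpreter version it runs on.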
@priyakasimbeg priyakasimbeg merged commit d77c538 into dev Mar 27, 2026
30 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2026