Add generic module reload validation and harden fastrpc unload hang coverage #336
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| metadata: | ||
| name: Module_Reload_Validation | ||
| format: "Lava-Test Test Definition 1.0" | ||
| description: "Generic profiled kernel module unload/reload regression validation" | ||
| maintainer: | ||
| - Srikanth kumar | ||
| os: | ||
| - linux | ||
| scope: | ||
| - functional | ||
|
|
||
| params: | ||
| PROFILE: "" | ||
| ITERATIONS: "3" | ||
| MODE: "" | ||
| TIMEOUT_UNLOAD: "30" | ||
| TIMEOUT_LOAD: "30" | ||
| TIMEOUT_SETTLE: "20" | ||
| ENABLE_SYSRQ_HANG_DUMP: "1" | ||
|
|
||
| run: | ||
| steps: | ||
| - REPO_PATH=$PWD | ||
| - cd Runner/suites/Kernel/Baseport/Module_Reload_Validation | ||
|
Contributor
Append `|| true`.
Contributor
Author
It appears that, aside from `run.sh`, this approach isn't being followed in this repository. |
||
| - SYSRQ_ARG="" | ||
| - if [ "${ENABLE_SYSRQ_HANG_DUMP}" = "0" ]; then SYSRQ_ARG="--disable-sysrq-hang-dump"; fi | ||
| - ./run.sh --module "${PROFILE}" --iterations "${ITERATIONS}" --mode "${MODE}" --timeout-unload "${TIMEOUT_UNLOAD}" --timeout-load "${TIMEOUT_LOAD}" --timeout-settle "${TIMEOUT_SETTLE}" ${SYSRQ_ARG} || true | ||
|
Contributor
Is there a specific reason for using `${SYSRQ_ARG}`? We could pass `--enable-sysrq-hang-dump 0` directly, which would reduce the number of CLI parameters and remove the condition on line 26. You are already using `ENABLE_SYSRQ_HANG_DUMP` to set the value to 1/0 anyway.
Contributor
Author
The current runner CLI uses explicit boolean flags (`--enable-sysrq-hang-dump` / `--disable-sysrq-hang-dump`) and does not accept `--enable-sysrq-hang-dump 0`, so I’ve kept the YAML-side mapping to the optional disable flag unchanged. |
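The YAML-side mapping discussed in this thread can be isolated into a small helper for illustration. This is only a sketch: `sysrq_flag_for` is a hypothetical name that does not appear in the PR, and the final `echo` stands in for the real `run.sh` invocation.

```sh
#!/bin/sh
# Hypothetical helper (not part of the PR): map a 0/1 setting to the
# optional disable flag. Prints nothing when the dump stays enabled.
sysrq_flag_for() {
    if [ "$1" = "0" ]; then
        printf '%s\n' "--disable-sysrq-hang-dump"
    fi
}

ENABLE_SYSRQ_HANG_DUMP="${ENABLE_SYSRQ_HANG_DUMP:-1}"
SYSRQ_ARG="$(sysrq_flag_for "$ENABLE_SYSRQ_HANG_DUMP")"

# Left unquoted on purpose: an empty value must expand to no argument at all,
# not to an empty-string argument.
# shellcheck disable=SC2086
echo ./run.sh --module fastrpc ${SYSRQ_ARG}
```

Leaving `${SYSRQ_ARG}` unquoted is the reason the empty default works with a runner that only accepts explicit boolean flags.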
||
| - $REPO_PATH/Runner/utils/send-to-lava.sh Module_Reload_Validation.res | ||
|
Contributor
Append `|| true`.
Contributor
Author
`run.sh` already handles the exit code. I don't believe the `.res` file step specifically requires this. |
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,227 @@ | ||
| # Module_Reload_Validation | ||
|
|
||
| ## Overview | ||
| `Module_Reload_Validation` is a generic, profile-driven kernel module unload/reload regression suite. | ||
|
|
||
| It is intended to catch issues such as: | ||
| - module unload hangs, | ||
| - failed reloads, | ||
| - service/device rebind regressions after reload, | ||
| - issues that reproduce on the 1st, 2nd, or later reload iteration. | ||
|
|
||
| The suite uses: | ||
| - a **generic engine** in `run.sh`, | ||
| - shared helper logic in `Runner/utils/lib_module_reload.sh`, | ||
| - **module-specific profiles** under `profiles/`. | ||
|
|
||
| ## Folder layout | ||
|
|
||
| ```text | ||
| Runner/suites/Kernel/Baseport/Module_Reload_Validation/ | ||
| ├── run.sh | ||
| ├── Module_Reload_Validation.yaml | ||
| ├── profiles/ | ||
| │ ├── enabled.list | ||
|
||
| │ └── fastrpc.profile | ||
|
|
||
| Runner/utils/ | ||
| └── lib_module_reload.sh | ||
| ``` | ||
|
|
||
| ## Main components | ||
|
|
||
| ### `run.sh` | ||
| Thin orchestration layer that: | ||
| - parses CLI arguments, | ||
| - resolves the selected profile(s), | ||
| - invokes the generic library engine, | ||
| - writes `Module_Reload_Validation.res`. | ||
|
|
||
| ### `lib_module_reload.sh` | ||
| Shared module reload engine that handles: | ||
| - module state checks, | ||
| - timeout-controlled unload/load execution, | ||
| - per-iteration evidence collection, | ||
| - timeout-path hang evidence, | ||
| - profile hook dispatch, | ||
| - result handling. | ||
|
|
||
| ### `profiles/*.profile` | ||
| Each profile provides module-specific metadata and optional hook logic. | ||
|
|
||
| Example profile fields: | ||
| - `PROFILE_NAME` | ||
| - `PROFILE_DESCRIPTION` | ||
| - `MODULE_NAME` | ||
| - `PROFILE_MODE_DEFAULT` | ||
| - `PROFILE_REQUIRED_CMDS` | ||
| - `PROFILE_SERVICES` | ||
| - `PROFILE_DEVICE_PATTERNS` | ||
| - `PROFILE_SYSFS_PATTERNS` | ||
|
|
||
| Optional hooks: | ||
| - `profile_prepare` | ||
| - `profile_warmup` | ||
| - `profile_quiesce` | ||
| - `profile_post_unload` | ||
| - `profile_post_load` | ||
| - `profile_smoke` | ||
| - `profile_finalize` | ||
|
|
||
| ## Current starter profile | ||
|
|
||
| ### `fastrpc.profile` | ||
| Current profile covers FastRPC unload/reload validation and supports service lifecycle-based testing. | ||
|
|
||
| Default mode: | ||
| - `daemon_lifecycle` | ||
|
|
||
| Relevant services: | ||
| - `adsprpcd.service` | ||
| - `cdsprpcd.service` | ||
|
|
||
| ## Execution flow | ||
| For each selected profile: | ||
|
|
||
| 1. Validate the profile. | ||
| 2. Ensure the module is loaded before starting iteration work. | ||
| 3. Run warmup hook. | ||
| 4. Capture pre-state logs. | ||
| 5. Run quiesce hook. | ||
| 6. Attempt module unload with timeout. | ||
| 7. Validate module absence. | ||
| 8. Run post-unload hook. | ||
| 9. Attempt module reload with timeout. | ||
| 10. Validate module presence. | ||
| 11. Run post-load hook. | ||
| 12. Run smoke hook. | ||
| 13. Capture post-load state. | ||
| 14. Repeat for all iterations. | ||
| 15. Run finalize hook. | ||
|
|
||
| ## Hang handling policy | ||
| Sysrq dump is **not** triggered on normal passing iterations. | ||
|
|
||
| It is triggered only when an unload/load action actually times out. | ||
|
|
||
| Current behavior: | ||
| - normal pass -> no sysrq dump | ||
| - quick non-timeout failure -> normal failure evidence only | ||
| - timeout / hang -> hang evidence bundle + optional sysrq dump | ||
|
|
||
| Default behavior in current suite: | ||
| - sysrq hang dump enabled | ||
| - but only used on timeout paths | ||
|
|
||
| ## Evidence collected | ||
| Per profile / iteration, the suite can capture: | ||
| - command logs, | ||
| - `lsmod`, | ||
| - `modinfo`, | ||
| - `ps`, | ||
| - `dmesg`, | ||
| - service status and recent journal, | ||
| - profiled device path presence, | ||
| - profiled sysfs path presence, | ||
| - `/sys/module/<module>/holders`, | ||
| - timeout PID `/proc` snapshots, | ||
| - optional sysrq task/block dumps. | ||
|
|
||
| Results are stored under: | ||
|
|
||
| ```text | ||
| results/Module_Reload_Validation/<profile>/iter_XX/ | ||
| ``` | ||
|
|
||
| ## CLI usage | ||
|
|
||
| ### Run one profile | ||
| ```sh | ||
| ./run.sh --module fastrpc | ||
| ``` | ||
|
|
||
| ### Run one profile with more iterations | ||
| ```sh | ||
| ./run.sh --module fastrpc --iterations 5 | ||
| ``` | ||
|
|
||
| ### Override mode | ||
| ```sh | ||
| ./run.sh --module fastrpc --mode daemon_lifecycle | ||
| ./run.sh --module fastrpc --mode basic | ||
| ``` | ||
|
|
||
| ### Override timeouts | ||
| ```sh | ||
| ./run.sh --module fastrpc --timeout-unload 60 --timeout-load 60 --timeout-settle 30 | ||
| ``` | ||
|
|
||
| ### Disable sysrq timeout-path dumps | ||
| ```sh | ||
| ./run.sh --module fastrpc --disable-sysrq-hang-dump | ||
| ``` | ||
|
|
||
| ### Run all enabled profiles | ||
| ```sh | ||
| ./run.sh | ||
| ``` | ||
|
|
||
| If `--module` is empty or not given, the suite runs all profiles listed in `profiles/enabled.list`. | ||
|
|
||
| If `--mode` is empty or not given, the profile default mode is used. | ||
|
|
||
| ## YAML usage in LAVA | ||
| Current YAML is generic and takes profile input from the test plan. | ||
|
|
||
| Important params: | ||
| - `PROFILE` | ||
| - `ITERATIONS` | ||
| - `MODE` | ||
| - `TIMEOUT_UNLOAD` | ||
| - `TIMEOUT_LOAD` | ||
| - `TIMEOUT_SETTLE` | ||
| - `ENABLE_SYSRQ_HANG_DUMP` | ||
|
|
||
| Behavior: | ||
| - `PROFILE="fastrpc"` -> runs only `fastrpc.profile` | ||
| - `PROFILE=""` -> runs all enabled profiles | ||
| - `MODE=""` -> uses the profile default mode | ||
|
|
||
| ## How to add a new profile | ||
| 1. Create a new file under `profiles/`, for example: | ||
| - `profiles/ath11k_pci.profile` | ||
| 2. Define the required metadata: | ||
| - `PROFILE_NAME` | ||
| - `MODULE_NAME` | ||
| 3. Add hooks only if module-specific lifecycle handling is needed. | ||
| 4. Add the profile basename to `profiles/enabled.list`. | ||
|
||
| 5. Run locally with: | ||
|
|
||
| ```sh | ||
| ./run.sh --module ath11k_pci | ||
| ``` | ||
|
|
||
| No YAML duplication is needed for new profiles. | ||
|
|
||
| ## Result policy | ||
|
|
||
| ### PASS | ||
| - all requested iterations for a profile pass successfully. | ||
|
|
||
| ### FAIL | ||
| - unload/load timeout, | ||
| - unload/load command failure, | ||
| - module state validation failure, | ||
| - profile hook failure, | ||
| - smoke validation failure. | ||
|
|
||
| ### SKIP | ||
| - module not present, | ||
| - module built into the kernel, | ||
| - required commands not available, | ||
| - profile explicitly not reloadable. | ||
|
|
||
| ## Notes | ||
| - This suite is intended for **profiled, supported modules**, not blind reload of every loaded kernel module. | ||
| - The current structure avoids hidden run.sh-to-library globals as much as possible by passing explicit arguments and hook context. | ||
| - Profile hooks receive context arguments from the engine, so module-specific logic can store logs in the correct iteration directory without depending on hidden globals. | ||
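The hang handling policy described in the README above (sysrq dump only on an actual timeout, and only when enabled) could be sketched roughly as follows. All function names here are hypothetical and are not taken from `lib_module_reload.sh`. The sketch also assumes GNU coreutils `timeout`, which exits with status 124 when the limit expires; other implementations may behave differently.

```sh
#!/bin/sh
# Sketch only: names are illustrative, not the library's real API.

# Map an exit status to an outcome. 124 is the status GNU coreutils
# `timeout` returns when the time limit is hit (assumption: GNU timeout).
classify_rc() {
    case "$1" in
        0)   echo pass ;;
        124) echo timeout ;;
        *)   echo fail ;;
    esac
}

# Run a command under a time limit and print its outcome.
run_with_timeout() {
    limit="$1"
    shift
    rc=0
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    classify_rc "$rc"
}

# A sysrq dump is warranted only on the timeout path, and only if enabled.
want_sysrq_dump() {
    if [ "$1" = "timeout" ] && [ "$2" = "1" ]; then
        echo yes
    else
        echo no
    fi
}

# `true` is a harmless stand-in for `rmmod <module>`, which would need root.
outcome="$(run_with_timeout 5 true)"
echo "outcome=$outcome sysrq=$(want_sysrq_dump "$outcome" 1)"
```

Keeping the classification separate from the dump decision mirrors the three documented cases: a normal pass, a quick non-timeout failure, and a timeout.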
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| fastrpc |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| #!/bin/sh | ||
| # Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. | ||
| # SPDX-License-Identifier: BSD-3-Clause-Clear | ||
|
|
||
| PROFILE_NAME="fastrpc" | ||
| PROFILE_DESCRIPTION="FastRPC module reload validation" | ||
| MODULE_NAME="fastrpc" | ||
|
|
||
| MODULE_RELOAD_SUPPORTED="yes" | ||
| PROFILE_MODE_DEFAULT="daemon_lifecycle" | ||
|
|
||
| PROFILE_REQUIRED_CMDS="modprobe rmmod systemctl ps sed" | ||
| PROFILE_SERVICES="adsprpcd.service cdsprpcd.service" | ||
| PROFILE_PROC_PATTERNS="/usr/bin/adsprpcd /usr/bin/cdsprpcd" | ||
| PROFILE_DEVICE_PATTERNS="/dev/fastrpc-* /dev/*dsp*" | ||
| PROFILE_SYSFS_PATTERNS="/sys/module/fastrpc /sys/class/remoteproc/*" |
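As an illustration of how a field such as `PROFILE_REQUIRED_CMDS` might be consumed, the sketch below checks a space-separated command list and reports the first missing entry, which corresponds to the README's SKIP condition for unavailable commands. The helper name `first_missing_cmd` is hypothetical; the actual engine may perform this check differently.

```sh
#!/bin/sh
# Hypothetical helper (not from the PR): print the first command in a
# space-separated list that is not found on PATH, and return non-zero.
first_missing_cmd() {
    # Word splitting on $1 is intentional: the list is space-separated.
    for cmd in $1; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 1
        fi
    done
    return 0
}

PROFILE_REQUIRED_CMDS="modprobe rmmod systemctl ps sed"
if missing="$(first_missing_cmd "$PROFILE_REQUIRED_CMDS")"; then
    echo "all required commands present"
else
    echo "SKIP: required command not available: $missing"
fi
```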