Add generic module reload validation and harden fastrpc unload hang coverage #336
smuppand wants to merge 2 commits into qualcomm-linux:main from
Is there an example job using this code? It almost seems like there is too much code for one commit.
I'm also puzzled by the complexity of this. I was expecting <30 lines in total! For example, while I agree it makes sense to make the load/unload test generic of the specific module being tested, why is
Thanks, this is a fair concern. I agree the current split is not ideal if future profiles have to repeat this much code. The right end state is for a profile to be mostly declarative (module name, services, patterns, mode, a few knobs), with the common service/process wait/kill helpers moved into lib_module_reload.sh. For this PR, I started with the fastrpc-specific flow to make the unload-hang regression reproducible first. I can follow up by shrinking fastrpc.profile and moving the reusable pieces into the shared library so that adding a second module profile is much smaller and cleaner.
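As a rough illustration of the "mostly declarative" end state described above, a profile could reduce to a handful of variable assignments that a shared helper reads. This is only a sketch: every variable name, the service and pattern values, the mode value, and the `profile_summary` helper are hypothetical and are not taken from fastrpc.profile or lib_module_reload.sh.

```sh
#!/bin/sh
# Hypothetical declarative profile. All names and values below are
# illustrative assumptions, not the contents of the real fastrpc.profile.
PROFILE_MODULE="fastrpc"                  # module to unload and reload
PROFILE_SERVICES="example-rpcd.service"   # placeholder service name
PROFILE_PROC_PATTERNS="example-rpcd"      # placeholder process pattern
PROFILE_MODE="example-mode"               # placeholder for a --mode value
PROFILE_TIMEOUT_UNLOAD=30                 # knob in seconds, illustrative

# Hypothetical shared-library helper that consumes the variables above.
profile_summary() {
    printf 'module=%s services=%s patterns=%s mode=%s timeout_unload=%s\n' \
        "$PROFILE_MODULE" "$PROFILE_SERVICES" "$PROFILE_PROC_PATTERNS" \
        "$PROFILE_MODE" "$PROFILE_TIMEOUT_UNLOAD"
}

profile_summary
```

With this shape, a second module profile would only need to change the assignments, while the procedural wait/kill logic stays in the shared layer.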
…ence capture
Avoid wedging the test harness when unload or load commands time out by returning immediately after timeout evidence collection. Also narrow module state capture to the target module and improve timeout-path stdout reporting for easier debugging.
Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
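The behaviour this commit message describes (bounded command, evidence capture limited to the target module, immediate return) might look roughly like the sketch below. It assumes the GNU coreutils `timeout` command, which exits with status 124 when the time limit is hit. The function names and log prefixes are hypothetical and are not the suite's actual helpers.

```sh
#!/bin/sh
# Sketch only: assumes GNU coreutils `timeout`. Function names are hypothetical.

# Capture state for the target module only, rather than the whole module list.
collect_timeout_evidence() {
    module=$1
    echo "[evidence] state of ${module} only:"
    if [ -r /proc/modules ]; then
        grep "^${module} " /proc/modules || echo "[evidence] ${module} not listed"
    fi
}

# Run a command with a time limit; on timeout, report, collect evidence,
# and return immediately instead of continuing the iteration.
run_with_timeout() {
    module=$1; secs=$2; shift 2
    timeout "$secs" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "[timeout] '$*' exceeded ${secs}s"
        collect_timeout_evidence "$module"
        return 124
    fi
    return "$rc"
}

# Example call (not executed here):
#   run_with_timeout fastrpc 30 modprobe -r fastrpc
```

One caveat with this approach: `timeout` works by signalling the child, so a task blocked in uninterruptible sleep inside the kernel may not exit promptly even after the limit expires.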
I’ve pushed changes to reduce the amount of procedural logic living in fastrpc.profile. The common service/process lifecycle handling is being kept in the shared module-reload layer so future profiles can stay smaller and mostly parameter-driven, with only the module-specific lifecycle kept in the profile. Please take another look.
- cd Runner/suites/Kernel/Baseport/Module_Reload_Validation
- SYSRQ_ARG=""
- if [ "${ENABLE_SYSRQ_HANG_DUMP}" = "0" ]; then SYSRQ_ARG="--disable-sysrq-hang-dump"; fi
- ./run.sh --module "${PROFILE}" --iterations "${ITERATIONS}" --mode "${MODE}" --timeout-unload "${TIMEOUT_UNLOAD}" --timeout-load "${TIMEOUT_LOAD}" --timeout-settle "${TIMEOUT_SETTLE}" ${SYSRQ_ARG} || true
Is there a specific reason for using ${SYSRQ_ARG}? We could instead pass --enable-sysrq-hang-dump 0 directly, which would reduce the number of CLI parameters and remove the condition on line 26.
I see you are already using the same ENABLE_SYSRQ_HANG_DUMP to set the value to 1/0 anyway.
The current runner CLI uses explicit boolean flags (--enable-sysrq-hang-dump / --disable-sysrq-hang-dump) and does not accept --enable-sysrq-hang-dump 0, so I’ve kept the YAML-side mapping to the optional disable flag unchanged.
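To illustrate the distinction drawn in this reply, a parser built around explicit boolean flags treats each flag as a complete switch that takes no value. The following is a minimal sketch, not the actual run.sh parser: the function name, the default of "enabled", and the handling of unknown arguments are all assumptions.

```sh
#!/bin/sh
# Sketch only: hypothetical parser for a pair of explicit boolean flags.
# Prints 1 (enabled) or 0 (disabled); returns 2 on an unrecognised argument.
parse_sysrq_flags() {
    enable_sysrq_hang_dump=1   # assumed default: enabled
    while [ $# -gt 0 ]; do
        case "$1" in
            --enable-sysrq-hang-dump)  enable_sysrq_hang_dump=1 ;;
            --disable-sysrq-hang-dump) enable_sysrq_hang_dump=0 ;;
            *) echo "unknown argument: $1" >&2; return 2 ;;
        esac
        shift
    done
    printf '%s\n' "$enable_sysrq_hang_dump"
}
```

Under these assumptions, `--enable-sysrq-hang-dump 0` fails because the trailing `0` is seen as a separate, unrecognised argument, which is why the caller maps the 1/0 variable to an optional `--disable-sysrq-hang-dump` flag instead.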
run:
  steps:
    - REPO_PATH=$PWD
Append || true
It appears that, aside from run.sh, this approach isn't being followed in this repository.
run:
  steps:
    - REPO_PATH=$PWD
    - cd Runner/suites/Kernel/Baseport/Module_Reload_Validation
Append || true
It appears that, aside from run.sh, this approach isn't being followed in this repository.
    - SYSRQ_ARG=""
    - if [ "${ENABLE_SYSRQ_HANG_DUMP}" = "0" ]; then SYSRQ_ARG="--disable-sysrq-hang-dump"; fi
    - ./run.sh --module "${PROFILE}" --iterations "${ITERATIONS}" --mode "${MODE}" --timeout-unload "${TIMEOUT_UNLOAD}" --timeout-load "${TIMEOUT_LOAD}" --timeout-settle "${TIMEOUT_SETTLE}" ${SYSRQ_ARG} || true
    - $REPO_PATH/Runner/utils/send-to-lava.sh Module_Reload_Validation.res
Append || true
run.sh already handles the exit code. I don't believe the .res file step specifically requires this.
Update the fastrpc profile to stop and mask rpc daemons, verify remaining daemon processes, and provide clearer stdout context when quiesce or unload paths fail.
Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
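The quiesce step described in this commit could be sketched along the following lines, using `systemctl stop` and `systemctl mask` so that a stopped daemon is not started again, followed by a `pgrep` check for leftovers. The `RUN` dry-run variable, the function names, and the example unit name are hypothetical and not taken from the profile itself.

```sh
#!/bin/sh
# Sketch only. RUN, quiesce_services and remaining_procs are hypothetical names.
# Set RUN=echo for a dry run that prints the commands instead of executing them.

# Stop each unit, then mask it so it cannot be started again until unmasked.
quiesce_services() {
    for svc in "$@"; do
        ${RUN:-} systemctl stop "$svc"
        ${RUN:-} systemctl mask "$svc"
    done
}

# Exit status 0 if any process whose command line matches the pattern remains.
remaining_procs() {
    pgrep -f "$1" >/dev/null
}

# Example dry run (prints the two systemctl commands):
#   RUN=echo; quiesce_services example-rpcd.service
```

A matching cleanup step would need `systemctl unmask` and `systemctl start` for the same units once the reload iterations finish.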
Add a new generic Module_Reload_Validation suite as per the new requirement #326 and wire in an initial fastrpc profile for unload/reload regression coverage.
Key changes in this PR:
The first target is fastrpc, where unload can hang after daemon activity. This suite is meant to make that regression reproducible in automation and to preserve useful evidence when it happens.
Necessary logs will be available.