FROMLIST: iommu/arm-smmu: Use pm_runtime in fault handlers #796

Open
bibekpatro wants to merge 1 commit into qualcomm-linux:tech/mem/iommu from bibekpatro:tech/mem/iommu

Commit d4a44f0 ("iommu/arm-smmu: Invoke pm_runtime across the driver") enabled pm_runtime for the arm-smmu device. On systems where the SMMU sits in a power domain, all register accesses must be done while the device is runtime active to avoid unclocked register reads and potential NoC errors.

So far, this has not been an issue for most SMMU clients because stall-on-fault is enabled by default. While a translation fault is being handled, the SMMU stalls further translations for that context bank, so the fault handler would not race with a powered-down SMMU.

Adreno SMMU now disables stall-on-fault in the presence of fault storms to avoid saturating SMMU resources and hanging the GMU. With stall-on-fault disabled, the SMMU can generate faults while its power domain may no longer be enabled, which makes unclocked accesses to fault-status registers in the SMMU fault handlers possible.

Guard the context and global fault handlers with pm_runtime_get_if_active() and pm_runtime_put_autosuspend() so that all SMMU fault register accesses are done with the SMMU powered. If the device is not runtime active, the fault can safely be ignored, because the SMMU is reset on runtime resume and its fault registers are cleared.

List: https://lore.kernel.org/all/20260313-smmu-rpm-v2-1-8c2236b402b0@oss.qualcomm.com/
Fixes: b130440 ("drm/msm: Temporarily disable stall-on-fault after a page fault")
Co-developed-by: Pratyush Brahma <pratyush.brahma@oss.qualcomm.com>
Signed-off-by: Pratyush Brahma <pratyush.brahma@oss.qualcomm.com>
Signed-off-by: Prakash Gupta <prakash.gupta@oss.qualcomm.com>
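
The guard described in the commit message can be illustrated with a small userspace sketch. This is not the patch itself: `struct mock_dev`, the `mock_*` helpers, and the `MOCK_IRQ_*` values are hypothetical stand-ins for the kernel's device, runtime-PM calls, and IRQ return codes, and returning "not handled" for an inactive device is an assumption made for illustration. The mock also leaves out the error return that the real pm_runtime_get_if_active() gives when runtime PM is disabled.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for a runtime-PM-managed SMMU. */
struct mock_dev {
	bool runtime_active;     /* true while the power domain is on */
	int usage_count;         /* runtime-PM usage counter */
	unsigned int fsr;        /* mock fault status register */
	int unclocked_reads;     /* register reads made while powered down */
};

enum mock_irqreturn { MOCK_IRQ_NONE, MOCK_IRQ_HANDLED };

/*
 * Mock of pm_runtime_get_if_active(): takes a usage reference and
 * returns 1 only if the device is already active. It never resumes
 * the device, which is what makes it usable from a fault handler.
 */
static int mock_pm_runtime_get_if_active(struct mock_dev *dev)
{
	if (!dev->runtime_active)
		return 0;
	dev->usage_count++;
	return 1;
}

/* Mock of pm_runtime_put_autosuspend(): drops the usage reference. */
static void mock_pm_runtime_put_autosuspend(struct mock_dev *dev)
{
	dev->usage_count--;
}

/* Register read that records whether it happened while unpowered. */
static unsigned int mock_read_fsr(struct mock_dev *dev)
{
	if (!dev->runtime_active)
		dev->unclocked_reads++;
	return dev->fsr;
}

/* Fault handler sketch: touch fault registers only under a PM reference. */
static enum mock_irqreturn mock_context_fault(struct mock_dev *dev)
{
	enum mock_irqreturn ret = MOCK_IRQ_NONE;

	/* Device is powered down: skip the fault, no register access. */
	if (!mock_pm_runtime_get_if_active(dev))
		return MOCK_IRQ_NONE;

	if (mock_read_fsr(dev)) {
		dev->fsr = 0;            /* acknowledge the fault */
		ret = MOCK_IRQ_HANDLED;
	}

	mock_pm_runtime_put_autosuspend(dev);
	return ret;
}
```

Calling `mock_context_fault()` on a device with `runtime_active` set reads and clears the fault status and leaves the usage count balanced. Calling it on an inactive device returns without any register access, which mirrors the reasoning that such faults are cleared anyway when the SMMU is reset on resume.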