Labels
module: qnn (Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/)
partner: qualcomm (For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm)
Description
Hello,
I am working with an older platform, the SA8295, which has HTP version v68. With both this repo and the AI Hub repo, when I use the w4a16 recipe to quantize a model such as SmolVLM or Qwen, it fails with the following error:
[QNN Partitioner Op Support]: aten.view_copy.default | True
[QNN Partitioner Op Support]: aten.linear.default | True
[QNN Partitioner Op Support]: aten.view_copy.default | True
[QNN Partitioner Op Support]: aten.view_copy.default | True
[QNN Partitioner Op Support]: aten.permute_copy.default | True
[QNN Partitioner Op Support]: aten.view_copy.default | True
[QNN Partitioner Op Support]: aten.permute_copy.default | True
[QNN Partitioner Op Support]: aten.view_copy.default | True
[QNN Partitioner Op Support]: aten.view_copy.default | True
[ERROR] [Qnn ExecuTorch]: <E> [4294967295] has incorrect Value 68, expected >= 73.
[ERROR] [Qnn ExecuTorch]: <E> QnnBackend_validateOpConfig failed 3110
[ERROR] [Qnn ExecuTorch]: <E> Failed to validate op aten_native_layer_norm_default_24 with error 0xc26
However, some papers, such as AutoNeural, use W8A16 for the ViT and W4A16 for the language model (page 10, Table 2).
So it seems that the v68 architecture supports these kinds of ops. I hope you can give me some information on how I can use w4a16 with LLMs/LVMs so that larger models can be supported. :)
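For anyone triaging a similar log, here is a minimal Python sketch that pulls the failing op and the version check out of the error lines quoted above. The helper name `summarize_validation_errors` and the field names are hypothetical and not part of ExecuTorch or QNN. Reading 68 and 73 as HTP architecture versions is my assumption from context, since the message itself does not name the field. It does show that the op rejected here is a layer norm (`aten_native_layer_norm_default_24`), not a linear op, and that the hex code 0xc26 is the same value as the 3110 printed on the line before it.

```python
import re

# Error lines copied verbatim from the log in this issue.
LOG = """\
[ERROR] [Qnn ExecuTorch]: <E> [4294967295] has incorrect Value 68, expected >= 73.
[ERROR] [Qnn ExecuTorch]: <E> QnnBackend_validateOpConfig failed 3110
[ERROR] [Qnn ExecuTorch]: <E> Failed to validate op aten_native_layer_norm_default_24 with error 0xc26
"""

# Patterns are written against the message text seen above only; other QNN
# versions may word these messages differently (assumption).
VALUE_RE = re.compile(r"has incorrect Value (\d+), expected >= (\d+)")
FAILED_OP_RE = re.compile(r"Failed to validate op (\S+) with error (0x[0-9a-fA-F]+)")


def summarize_validation_errors(log_text):
    """Return one dict per 'Failed to validate op' line (hypothetical helper)."""
    failures = []
    pending = None  # latest (found, required) pair not yet attached to an op
    for line in log_text.splitlines():
        value_match = VALUE_RE.search(line)
        if value_match:
            pending = (int(value_match.group(1)), int(value_match.group(2)))
            continue
        op_match = FAILED_OP_RE.search(line)
        if op_match:
            failures.append({
                "op": op_match.group(1),
                "error_code": int(op_match.group(2), 16),  # hex string to int
                "found": pending[0] if pending else None,
                "required": pending[1] if pending else None,
            })
            pending = None
    return failures


if __name__ == "__main__":
    for failure in summarize_validation_errors(LOG):
        print(failure)
```

Running this over a full partitioner log should list every op that fails validation, which makes it easier to see whether only a few op types are rejected on v68 or whether the whole w4a16 recipe is.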
cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin