add triton backend sampler #7015
base: release/2.4
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -50,7 +50,7 @@ | |
| # Set attention backend. "NATIVE_ATTN", "APPEND_ATTN" | ||
| # and "MLA_ATTN" can be set currently. | ||
| "FD_ATTENTION_BACKEND": lambda: os.getenv("FD_ATTENTION_BACKEND", "APPEND_ATTN"), | ||
| - # Set sampling class. "base", "base_non_truncated", "air" and "rejection" can be set currently. | ||
| + # Set sampling class. "base", "base_non_truncated", "air", "rejection" and "triton" can be set currently. | ||
| "FD_SAMPLING_CLASS": lambda: os.getenv("FD_SAMPLING_CLASS", "base"), | ||
Comment on lines +53 to 54
| # Set moe backend."cutlass","marlin" and "triton" can be set currently. | ||
| "FD_MOE_BACKEND": lambda: os.getenv("FD_MOE_BACKEND", "cutlass"), | ||
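Having .flake8 ignore indentation/alignment rules such as E121/E131 for the whole of top_k_top_p_triton.py will mask real indentation errors, not just table alignment. If the goal is only to keep the columns of the constant table aligned, consider using `# noqa: E241` (or similar) locally in the table section, or disabling only the rules that are actually needed, instead of turning off indentation checks for the entire file.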
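To make the suggestion concrete, here is a minimal sketch of a column-aligned table that suppresses only E241 (multiple spaces after a comma) on the rows that need it. The table name, its values, and the `lookup` helper are hypothetical stand-ins; the real constants in top_k_top_p_triton.py are not shown in this diff.

```python
# Hypothetical constant table, standing in for the one in
# top_k_top_p_triton.py. The extra spaces after the commas keep the
# columns aligned, which pycodestyle reports as E241. The per-line
# "noqa" silences only that rule, and only on these rows.
PIVOT_TABLE = [
    (1,    0.5,   16),  # noqa: E241
    (64,   0.25,  32),  # noqa: E241
    (1024, 0.125, 64),  # noqa: E241
]


def lookup(size):
    """Return the row with the largest first column not exceeding size."""
    chosen = PIVOT_TABLE[0]
    for row in PIVOT_TABLE:
        if row[0] <= size:
            chosen = row
    return chosen
```

With this approach E121/E131 stay active for the rest of the file, so a genuinely mis-indented continuation line is still reported. If per-line markers are too noisy, narrowing the existing per-file ignore to E241 alone would presumably achieve a similar effect.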