feat: add new kep to provider cpuidle capability for besteffort QoS pod #5904

Open
hahahaheihei wants to merge 2 commits into kubernetes:master from hahahaheihei:feat/cpu-idle-kep

@hahahaheihei

  • One-line PR description:
    add a new KEP to enable cpu.idle for BestEffort QoS pods
  • Other comments:

@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Feb 9, 2026
@k8s-ci-robot
Contributor

Welcome @hahahaheihei!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Feb 9, 2026
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 9, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hahahaheihei
Once this PR has been reviewed and has the lgtm label, please assign dchen1107 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Hi @hahahaheihei. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 9, 2026
@hahahaheihei
Author

@ffromani CC

@ffromani
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 10, 2026
@ffromani left a comment
Contributor

initial review

replaces: []

# The target maturity stage in the current dev cycle for this KEP.
stage: stable
Contributor

should be "beta", this tracks the current feature maturity, which starts as beta

The feature require cgroupV2 & linux kernel 5.4 or later(cpu.idle enable)
Users can configure the use of the CPU.idle function according to their needs or continue using cpu.shares/cpu.weight.

The feature prototype code is ready. [code](https://github.com/kubernetes/kubernetes/pull/136458)
Contributor

The KEP represents the design of the feature, so a PoC can help illustrate the design. However, the code is transitory (it evolves and changes), while the design should be stable and fixed in time.
I don't think linking the PR helps much here.

Safe Colocation: It allows safe colocation of BestEffort batch jobs with latency-sensitive services without the risk of performance degradation for the latter


### Non-Goals
Contributor

I guess non-goals are

  • change the cgroups weighting except the minimal changes required for this work
  • review QOS class handling in general


the new design , besteffort levels pod cpu.idle values set to 1

![](./new-qos.png)
Contributor

how do we plan to implement the enable/disable toggle hinted in the "Proposal" above?

Comment on lines +180 to +182
cat besteffort qos level cpu.idle value
0 mean the feature is disabled
1 mean the feature is enabled
Contributor

we need automated tests here

@bleon-ethical

/retest

@bleon-ethical

Thank you so much for contributing to Kubernetes. I'm gathering information so I can help you better.

@bleon-ethical

Hi @hahahaheihei, after reviewing the test failures (pull-enhancements-test and verify), here's what's needed for them to pass:

Fix the kep.yaml:

  • Change stage: stable to stage: beta (as suggested by @ffromani).
  • Make sure the latest-milestone field matches the current Kubernetes version.

Complete the PRR questionnaire:

  • In the README.md file, find the Production Readiness Review Questionnaire section.
  • You must answer all the questions (Enablement, Rollback, Monitoring, etc.). The tests are failing because the validator sees that these sections are empty or contain the default answers.

Update the Table of Contents:

  • Run make update-toc in your terminal and push the changes. This will fix the failure in the verify-toc.sh script.

Cleanup:

  • Remove the link to the prototype code from README.md, as the KEP should focus on the design, not a temporary implementation.

Thank you again for your help, have a great day.

@bleon-ethical

/assign

@AutuSnow

/cc

@k8s-ci-robot
Contributor

@AutuSnow: GitHub didn't allow me to request PR reviews from the following users: AutuSnow.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Contributor

@hahahaheihei: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-enhancements-test | 457c414 | link | true | /test pull-enhancements-test |
| pull-enhancements-verify | 457c414 | link | true | /test pull-enhancements-verify |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@chzhj

chzhj commented Mar 9, 2026

@hahahaheihei Is there any update ?

@hahahaheihei
Author

> @hahahaheihei Is there any update ?

The design remains unchanged, with only the document format and obvious errors modified.

@chzhj

chzhj commented Mar 10, 2026

> > @hahahaheihei Is there any update ?
>
> The design remains unchanged, with only the document format and obvious errors modified.

Hi @hahahaheihei @ffromani ,

Since this KEP missed the v1.36 release window, I would love to see how we can keep the momentum going.

Is it possible to target the upcoming v1.37 release?

@ffromani
Contributor

> Is it possible to target the upcoming v1.37 release?

Yes, totally (note that it is too early to set the labels, and we'd need a sig-node lead anyway)
