KEP-5915: Standardize async trace context propagation#5916
KEP-5915: Standardize async trace context propagation#5916artem-tkachuk wants to merge 4 commits intokubernetes:masterfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: artem-tkachuk The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Welcome @artem-tkachuk! |
|
Hi @artem-tkachuk. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
cc @dashpole - incorporated your feedback from the design doc! |
keps/sig-instrumentation/5915-async-trace-context-propagation/README.md
Outdated
Show resolved
Hide resolved
|
|
||
| **Phase 1: Injection (Producer)** | ||
|
|
||
| When a component creates or updates an object's **spec** based on an incoming request, it is responsible for "stamping" the object with the current trace context. |
There was a problem hiding this comment.
Not sure how broadly "components" is meant here. I assume users calling the API are also responsible for stamping objects if they are making updates in the context of a span?
| **When to inject:** | ||
| - Creating a new object (e.g., API Server creates a Pod from user request) | ||
| - Updating the **spec** (e.g., changing a Deployment's replica count, updating a label) | ||
| - Making semantic changes that represent user intent or control plane decisions |
There was a problem hiding this comment.
Can you give an example of this kind of change that doesn't update the spec? I assume things like updating a label wouldn't apply, right?
| **When NOT to inject:** | ||
| - Updating **status** fields (e.g., Pod phase transitions, condition updates) | ||
| - Heartbeats or leader election updates | ||
| - Internal housekeeping that doesn't represent a new "action" |
There was a problem hiding this comment.
an example would be helpful here as well
| The guideline: **Inject when the update represents a new action or decision**, not for routine status reporting. | ||
|
|
||
| **Who is responsible for injection:** | ||
| - **API Server**: When handling user requests that create objects (e.g., `kubectl create pod`) |
There was a problem hiding this comment.
This doesn't make sense to me. If a user creates a pod, then the APIServer is responsible for adding the annotation. But if a user updates a pod's spec, then the user is responsible for addition the annotation? Seems like it would be nice for either the user (client) or the APIServer to be responsible for adding the annotation.
|
|
||
| // Option 1: Don't inject (preserve creation context) | ||
| pod.Spec.NodeName = node | ||
| s.client.Update(ctx, pod) |
There was a problem hiding this comment.
I believe the scheduler doesn't actually write to the pod object -- it uses a subresource callled binding, IIRC. I wonder if we should exclude subresources generally (which also includes status)... But that would mean no context propagation on the resize subresource. Because subresources can only make specific updates (e.g. binding can only update the pod's nodeName, and resize can only update the number of replicas, etc.), you probably can't update the object annotation in the same request. Controllers that update via subresources may not even have RBAC permissions to update annotations!
|
|
||
| #### Beta | ||
|
|
||
| - [ ] Adoption by at least one core component (e.g., Scheduler or Controller Manager). |
There was a problem hiding this comment.
For beta, I would like to demonstrate at least two controllers working together. E.g. scheduler and kubelet, or controller manager and kubelet, etc. Otherwise, i'm not sure we can trust that the propagation mechanisms work reliably and don't cause issues.
|
|
||
| This feature is for observability and does not have direct SLOs. It should not impact existing API Server SLOs. | ||
|
|
||
| ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? |
There was a problem hiding this comment.
This might be a bit detailed for what we are doing, and how often it occurs. I think it is OK for this feature to not have self-obs metrics given the result can be seen from the resulting trace instrumentation.
|
|
||
| - **Tracing enabled in Kubernetes components**: Any W3C TraceContext-compliant tracing implementation. Kubernetes currently uses OpenTelemetry (see KEP-647 for API Server, KEP-2831 for Kubelet). | ||
| - **Trace collection pipeline**: Any system that can collect and forward traces (e.g., OpenTelemetry Collector, Jaeger Agent, Zipkin Collector). | ||
| - **Trace storage backend**: Any backend that supports W3C TraceContext and Span Links (e.g., Jaeger, Tempo, Lightstep, Honeycomb). |
There was a problem hiding this comment.
| - **Trace storage backend**: Any backend that supports W3C TraceContext and Span Links (e.g., Jaeger, Tempo, Lightstep, Honeycomb). | |
| - **Trace storage backend**: Any backend that supports OTLP traces (e.g., Jaeger, Tempo, Lightstep, Honeycomb). |
The trace backends actually don't depend on W3C at all. Span links are part of the OTLP payload, so lets just require OTLP support
|
|
||
| ### Automatic Instrumentation via Controller Runtime | ||
|
|
||
| Instead of requiring explicit calls to `InjectContext` and `ExtractContext`, the controller runtime (e.g., controller-runtime or client-go's `SharedInformer`) could automatically inject/extract trace context. |
There was a problem hiding this comment.
I was thinking controller runtime would only automatically extract context, but injection would be up to the controller code. But i'm also OK to start with making controllers call extract in code. It isn't too bad...
a41ffb2 to
a5fb584
Compare
Proposes standardized mechanism for propagating W3C TraceContext across asynchronous boundaries in Kubernetes using object annotations and OpenTelemetry Span Links. Adds helper functions to component-base for controllers to inject/extract trace context and create linked spans. Signed-off-by: Artem Tkachuk <artemtkachuk@yahoo.com>
a5fb584 to
c8f7757
Compare
Signed-off-by: Artem Tkachuk <artemtkachuk@yahoo.com>
Signed-off-by: Artem Tkachuk <artemtkachuk@yahoo.com>
Per reviewer feedback: drop the webhook bullet since that approach is not planned; the Alternatives section still documents why it was not chosen. Signed-off-by: Artem Tkachuk <artemtkachuk@yahoo.com>
Proposes a standardized mechanism for propagating W3C TraceContext across asynchronous boundaries in Kubernetes using object annotations and OpenTelemetry Span Links. Adds helper functions to component-base for controllers to inject/extract trace context and create linked spans.
One-line PR description: Initial KEP proposal for standardizing async trace context propagation
Issue link: #5915
Other comments: This KEP proposes a standard way to propagate trace context across async boundaries in Kubernetes by storing W3C TraceContext in object annotations (tracing.k8s.io/traceparent) and using OpenTelemetry Span Links. Includes library functions in component-base for easy adoption by controllers.