HDDS-12542. Support S3 signed multi chunks payload verification #10006

Draft
chungen0126 wants to merge 1 commit into apache:master from chungen0126:HDDS-12542-design-doc

@chungen0126
Contributor

What changes were proposed in this pull request?

This PR adds a design doc for S3 signed multi-chunk payload verification.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-12542

How was this patch tested?

None.

@chungen0126 changed the title from "s3 multi-chunks verification design doc" to "HDDS-12542. Support S3 signed multi chunks payload verification" on Mar 30, 2026
@errose28 added s3 S3 Gateway design labels on Mar 30, 2026

# Context & Motivation

Ozone S3 Gateway (S3G) currently uses SignedChunksInputStream to handle the aws-chunked content encoding for AWS Signature V4. However, it does not yet perform any signature verification. This proposal aims to complete the existing SignedChunksInputStream so that chunk signatures are verified correctly with minimal performance overhead.
Contributor

Suggested change
Ozone S3 Gateway (S3G) currently utilizes SignedChunksInputStream to handle aws-chunked content-encoding for AWS Signature V4. However, it doesn’t do any signature verification now. This proposal aims to complete the existing SignedChunksInputStream to make sure signature verification is correct and minimize performance overhead.

To guarantee the correctness, stability, and security of the newly introduced chunk verification logic, a comprehensive testing strategy will be executed. This plan covers both granular unit testing for the stream parsing logic and end-to-end integration testing using official AWS SDKs.

- Unit Tests (SignedChunksInputStream.java): A test class for SignedChunksInputStream already exists. To complete it, we should verify that the stream behaves correctly with different signatures.
- Integration Tests:
Contributor

We should likely also add integration tests for STREAMING-AWS4-HMAC-SHA256-PAYLOAD and STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER.


## Incremental Hashing

To maintain a low memory footprint during the continuous buffering process, the signature calculation utilizes Mac.update() (Incremental Hashing) directly on the incoming byte streams. This ensures that we validate the payload on the fly without allocating massive temporary byte arrays, avoiding Garbage Collection (GC) spikes during large multi-gigabyte uploads.
Contributor

I don't think having the signature calculation use Mac.update() (Incremental Hashing) directly on the incoming byte streams will work.

I was reading "Task 3: Calculate Signature", and I saw that the signing is done over a StringToSign, which needs to be generated by following the instructions in "Create a string to sign".
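
To make the point concrete, here is a minimal sketch of a per-chunk signature for STREAMING-AWS4-HMAC-SHA256-PAYLOAD, assuming the chunk string-to-sign layout described in the AWS SigV4 streaming documentation. The class `ChunkSignatureSketch` and its method names are hypothetical and are not part of Ozone. The chunk bytes can still be hashed incrementally with MessageDigest.update(), but the HMAC itself is computed over the short string to sign rather than over the raw payload.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper (not an Ozone class): computes one chunk signature. */
public final class ChunkSignatureSketch {
  /** Hex SHA-256 of the empty string, a fixed field in the chunk string to sign. */
  static final String EMPTY_SHA256 =
      "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";

  private ChunkSignatureSketch() { }

  static byte[] hmac(byte[] key, String data) throws GeneralSecurityException {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(key, "HmacSHA256"));
    return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
  }

  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(Character.forDigit((b >> 4) & 0xF, 16));
      sb.append(Character.forDigit(b & 0xF, 16));
    }
    return sb.toString();
  }

  /** Standard SigV4 signing-key derivation for the s3 service. */
  static byte[] signingKey(String secretKey, String date, String region)
      throws GeneralSecurityException {
    byte[] kDate = hmac(("AWS4" + secretKey).getBytes(StandardCharsets.UTF_8), date);
    byte[] kRegion = hmac(kDate, region);
    byte[] kService = hmac(kRegion, "s3");
    return hmac(kService, "aws4_request");
  }

  /**
   * chunkDigest must already have been fed the chunk bytes via update();
   * digest() is called here, which also resets it for the next chunk.
   */
  static String chunkSignature(byte[] signingKey, String timestamp, String scope,
      String previousSignature, MessageDigest chunkDigest)
      throws GeneralSecurityException {
    String stringToSign = String.join("\n",
        "AWS4-HMAC-SHA256-PAYLOAD",
        timestamp,            // e.g. 20130524T000000Z
        scope,                // e.g. 20130524/us-east-1/s3/aws4_request
        previousSignature,    // seed signature for the first chunk
        EMPTY_SHA256,
        hex(chunkDigest.digest()));
    return hex(hmac(signingKey, stringToSign));
  }
}
```

With this shape, memory use stays bounded by the read buffer: each buffer read from the stream goes through chunkDigest.update(), and only the 32-byte digest enters the string to sign once the chunk ends.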

- Native Module Compilation: Similar to Ozone's existing rocks-native module, we will introduce a dedicated module (e.g., hdds-aws-crt-native). This module will compile the AWS CRT C/C++ source code using CMake directly during the Ozone Maven build process.
- Dynamic Loading: We will leverage Ozone's existing NativeLibraryLoader infrastructure to safely extract and load the compiled dynamic libraries (.so, .dylib) at runtime.
- JNI Integration: A Java wrapper will be implemented to pass the secret key to the native KDF function to derive the ECDSA public key, which will then be used by a new ECDSAChunkSignatureVerifier.

Contributor

It may also be worth calling out in the design that, for the -PAYLOAD-TRAILER types, we need to:

  1. Include x-amz-trailer in the request headers, listing the trailing header names as a comma-separated string.
  2. Write all trailing headers after the final chunk. When uploading the data in multiple chunks, a final chunk with 0 bytes of data must be sent before the trailing headers.

Ref: https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html
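
As a rough illustration of these two points, the sketch below parses the comma-separated x-amz-trailer value and detects the zero-length final chunk from a chunk header line of the form `<hex-size>;chunk-signature=<signature>`. The class `TrailerHeaderSketch` and its methods are hypothetical names for illustration only, and the header value in the usage note is just an example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not an Ozone class) for -PAYLOAD-TRAILER requests. */
public final class TrailerHeaderSketch {
  private TrailerHeaderSketch() { }

  /** Splits the x-amz-trailer header value into normalized header names. */
  static List<String> declaredTrailers(String xAmzTrailer) {
    List<String> names = new ArrayList<>();
    if (xAmzTrailer == null) {
      return names;
    }
    for (String part : xAmzTrailer.split(",")) {
      String name = part.trim().toLowerCase(Locale.ROOT);
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /** True when the chunk header announces a 0-byte chunk, after which trailers follow. */
  static boolean isFinalChunk(String chunkHeaderLine) {
    int semi = chunkHeaderLine.indexOf(';');
    String sizeHex = semi < 0 ? chunkHeaderLine : chunkHeaderLine.substring(0, semi);
    return Long.parseLong(sizeHex.trim(), 16) == 0L;
  }
}
```

For example, `declaredTrailers("x-amz-checksum-crc32c")` yields a single name, and a verifier could compare that list against the trailing headers it actually reads once `isFinalChunk` returns true.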

@peterxcli self-requested a review on March 31, 2026 06:48