Skip to content

fix: Attempting to serialize bytes type causes error (5660)#5684

Closed
aviruthen wants to merge 2 commits intoaws:masterfrom
aviruthen:fix/attempting-to-serialize-bytes-type-causes-error-5660
Closed

fix: Attempting to serialize bytes type causes error (5660)#5684
aviruthen wants to merge 2 commits intoaws:masterfrom
aviruthen:fix/attempting-to-serialize-bytes-type-causes-error-5660

Conversation

@aviruthen
Copy link
Copy Markdown
Collaborator

Description

The bytes type is not included in the primitive type checks in sagemaker-core/src/sagemaker/core/utils/utils.py. There are three functions that define what counts as a 'primitive': is_not_primitive(), is_primitive_list() (which delegates to is_not_primitive), and is_primitive_class(). When serialize() encounters a bytes value, is_not_primitive(b'1') returns True, causing it to fall into the _serialize_shape() branch which calls vars() on the bytes object, raising a TypeError. The fix is to add bytes to the primitive type tuples in is_not_primitive() and is_primitive_class().

Related Issue

Fixes GitHub issue 5660

Changes Made

  • sagemaker-core/src/sagemaker/core/utils/utils.py
  • sagemaker-core/tests/unit/generated/test_utils.py

AI-Generated PR

This PR was automatically generated by the PySDK Issue Agent.

  • Confidence score: 85%
  • Classification: bug
  • SDK version target: V3

Merge Checklist

  • Changes are backward compatible
  • Commit message follows prefix: description format
  • Unit tests added/updated
  • Integration tests added (if applicable)
  • Documentation updated (if applicable)

Copy link
Copy Markdown
Member

@mufaddal-rohawala mufaddal-rohawala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR correctly fixes a bug where bytes type was not recognized as a primitive, causing serialize() to fail with a TypeError. The logic change is sound and backward compatible. However, there is an indentation issue in the modified source file that needs to be fixed before merging.


def is_not_primitive(obj):
return not isinstance(obj, (int, float, str, bool, datetime.datetime))
return not isinstance(obj, (int, float, str, bool, bytes, datetime.datetime))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation error: This line has 8 spaces of indentation (two levels) but should have 4 spaces (one level) to match the original function body. This will still work in Python since it's consistently indented within the function, but it breaks the project's style conventions and will likely fail linting.

Suggested change
return not isinstance(obj, (int, float, str, bool, bytes, datetime.datetime))
return not isinstance(obj, (int, float, str, bool, bytes, datetime.datetime))


def is_primitive_class(cls):
return cls in (str, int, bool, float, datetime.datetime)
return cls in (str, int, bool, float, bytes, datetime.datetime)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same indentation error here: 8 spaces instead of 4. Please fix to match the rest of the codebase.

Suggested change
return cls in (str, int, bool, float, bytes, datetime.datetime)
return cls in (str, int, bool, float, bytes, datetime.datetime)

@@ -373,6 +373,31 @@ def test_serialize_method_nested_shape():
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: There's an extra blank line here (two blank lines between test functions inside the same module is fine per PEP 8 for top-level definitions, but the rest of this test file uses single blank lines between test functions). Minor, but worth keeping consistent with the existing file style.

result = serialize({"body": b"1"})
assert result == {"body": b"1"}


Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good test, but the comment describes an implementation detail (walrus operator in _serialize_dict) that could become stale if the implementation changes. Consider rephrasing to describe the behavior rather than the implementation:

Suggested change
# Empty bytes is falsy, so serialize omits it from the result
result = serialize({"body": b""})
assert result == {}

@aviruthen aviruthen closed this Mar 26, 2026
@aviruthen aviruthen deleted the fix/attempting-to-serialize-bytes-type-causes-error-5660 branch March 26, 2026 20:29
Copy link
Copy Markdown
Collaborator

@sagemaker-bot sagemaker-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR correctly fixes a bug where bytes type was not recognized as a primitive, causing serialize() to fail with a TypeError. The code changes are minimal, backward compatible, and well-tested. A few minor observations on the tests and an extra blank line, but overall this is a clean fix.

result = serialize({"body": b"1"})
assert result == {"body": b"1"}


Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The comment here is helpful for understanding the behavior, but it's worth noting that this test documents a potentially surprising behavior — users might expect b"" to be preserved in the output (similar to how 0 or False might be treated). If the _serialize_dict walrus operator skips all falsy values, that could also affect 0, False, "", etc. This isn't a problem with your PR, but worth flagging as a pre-existing behavior that may warrant a follow-up discussion.


def is_not_primitive(obj):
return not isinstance(obj, (int, float, str, bool, datetime.datetime))
return not isinstance(obj, (int, float, str, bool, bytes, datetime.datetime))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor: there are now two consecutive blank lines after the function body (the added blank line plus the existing one). PEP 8 expects exactly two blank lines between top-level definitions, so this results in three blank lines total. CI formatting checks (black/flake8) should catch this, but worth fixing proactively.

Same issue applies at line 290 (is_primitive_class).


def is_primitive_class(cls):
return cls in (str, int, bool, float, datetime.datetime)
return cls in (str, int, bool, float, bytes, datetime.datetime)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same extra blank line issue here — three blank lines between is_primitive_class and class Unassigned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants