Conversation
The dateTimeStamp XML standard type supports fractional second notation: https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp Before this patch, the tool crashed when fractional timestamps were given in e.g. CreationInfo. Current implementation is limited to microsecond resolution. More fine-grained timestamps (nanosecond), will be silently truncated to microsecond resolution. Signed-off-by: Zalan Blenessy <zalan.blenessy@volvocars.com>
|
PR Checks pass, please review. |
|
Hi @armintaenzertng ! Afaik. the ISO8601 spec support fractional seconds too. Would you be ok if we fix the schemas (all versions) instead for consistency? Example (v2.2.1):
|
|
Changing the spec is not for me to decide. @goneall, what's your stance on this? Should we allow fractional seconds? |
Changing the spec would involve creating an issue in the spdx-model repo and updating quite a few libraries.
If the proposal is to update the schemas to reflect the SPDX subset of the XML datetimestamp, that seems like a reasonable approach. |
Based on the proposed regex in this issue you seem to be adopting fractional seconds format (more ISO8601 compatibility). Do you really want a new issue that would propose quite the opposite (maintaining SPDX 2.3 timestamp => less ISO8601 compatibility)?
To be clear, I would incorporate the format restriction from the HTML specs: |
Good catch - I totally forgot about this prior suggestion - no need to introduce another issue. The proposal will likely only impact version 3.1 or later on the spec, so updating the schemas still makes sense for the prior versions. After thinking about this a bit more, creating a pull request directly in the SPDX spec repo for the 2.X versions would make more sense than creating an issue in the model repo. For 3.1, we should decide in our weekly tech call if we're going to support the more granular timestamps. |
|
My strong preference would be to retain the I see little value in having more granularity. For SPDXv2, we should update unqualified references to "ISO 8601" (in the schemas). For SPDXv3, we can improve the DateTime dataclass format restriction, as per spdx/spdx-3-model#865. |
|
On this specific patch, the initial description says:
This is correct. The specification does not allow arbitrary Unfortunately, it seems that Datatypes are not (yet) visible in the generated JSON-LD schema. |
|
Agree to improve the datetime format in SPDX 3.x. For SPDX 2.x spec, I tend to think we should not touch it, unless we are quite certain that once we update the SPDX 2.x spec, tools maintainers of major programming languages will update the tools accordingly. If we can make sure that there will be no huge gap in datetime compatibilities between tools, it may be ok. A middle ground is probably the approach this PR takes: being strict in output but flexible in input. The spec should be strict (no microsecond allowed), but it can also suggest that implementations CAN truncated it WITH warnings and NO microsecond will be recorded in the SBOM. -- This can be written in the spec, along with the update of the description in the schema to say "ISO 8601 but ...". -- On the implementation side, as this PR increases robustness of the library, I see the value of it. But until the spec got update, it should not truncate microsecond silently. Warning should be raised to clearly communicate that the input is not by the spec. (if needed, this robustness can be toggled). |
|
I think this PR is wrong, since it adds support for microsecond resolution, while the spec does not allow it. For SPDXv2, the issue (as I understand it) is that, the spec only allows our restricted format (e.g. clearly stated in the Annotation date definition), but the wording copied into the schema simply mentions "ISO 8601 format". My suggestion was to update the wording in the schema to say "subset of ISO 8601 format" or something. This is essentially for people who do not read the spec and only follow the schema. Any tool that accepted or generated fractional seconds in SPDXv2 (or v3) is clearly not following the spec. |
| # Based on the https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp | ||
| # The secondFrag allows fractional second notation as well. | ||
| # normalize to micro seconds so that we can use %f with strptime | ||
| date_str = re.sub(r"\.(\d{1,6})\d*Z$", r".\1Z", date_str) |
There was a problem hiding this comment.
| date_str = re.sub(r"\.(\d{1,6})\d*Z$", r".\1Z", date_str) | |
| date_str = re.sub(r"\.(\d{1,6})\d*Z$", r".\1Z", date_str) | |
| warnings.warn( | |
| "Invalid date format. Expected YYYY-MM-DDThh:mm:ssZ " | |
| "Sub-second fractions have been discarded", | |
| category=UserWarning, | |
| stacklevel=2 | |
| ) |
Line 18 is fine. It increases robustness, as a tool.
But it should not just go silently. It should raise a warning.
Users should be notified that this is wrong.
| # The secondFrag allows fractional second notation as well. | ||
| # normalize to micro seconds so that we can use %f with strptime | ||
| date_str = re.sub(r"\.(\d{1,6})\d*Z$", r".\1Z", date_str) | ||
| return datetime.strptime(date_str, "%Y-%m-%dT%H:%M:%S.%fZ") # raises ValueError if format does not match |
There was a problem hiding this comment.
| return datetime.strptime(date_str, "%Y-%m-%dT%H:%M:%S.%fZ") # raises ValueError if format does not match | |
| return datetime.strptime(date_str, "%Y-%m-%dT%H:%M:%SZ") # raises ValueError if format does not match |
Line 19 is not fine. It returns a data format not supported by the spec.
Fractions should be dropped.
The dateTimeStamp XML standard type supports fractional second notation: https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp
Before this patch, the tool crashed when fractional timestamps were given in e.g. CreationInfo.
Current implementation is limited to microsecond resolution. More fine-grained timestamps (nanosecond), will be silently truncated to microsecond resolution.