Skip to content

[WIP] Add Bbox primitive directly to Feature next to geometry#485

Open
Jennings Anderson (jenningsanderson) wants to merge 3 commits intodevfrom
add-bbox-to-overture-feature
Open

[WIP] Add Bbox primitive directly to Feature next to geometry#485
Jennings Anderson (jenningsanderson) wants to merge 3 commits intodevfrom
add-bbox-to-overture-feature

Conversation

@jenningsanderson
Copy link
Copy Markdown
Collaborator

@jenningsanderson Jennings Anderson (jenningsanderson) commented Mar 31, 2026

Description

Testing the addition of bbox directly into the Overture Feature to see how it propagates to the documentation.

...The bbox primitive is different from the GeoParquet bbox that we create, so we'll need to re-think this — but great to see how a 1 line addition flows so nicely from schema to schema-docs!

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 31, 2026

🗺️ Schema reference docs preview is live!

🌍 Preview https://staging.overturemaps.org/schema/pr/485/schema/index.html
🕐 Updated Apr 09, 2026 18:54 UTC
📝 Commit 678e653
🔧 env SCHEMA_PREVIEW true

Note

♻️ This preview updates automatically with each push to this PR.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the core OvertureFeature model to surface a bbox field alongside geometry.

Changes:

  • Add a bbox: BBox field to OvertureFeature.
  • Reformat imports and a textwrap.dedent(...) call for the ext_* JSON Schema description.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# this is an enum in the JSON Schema, but that prevents OvertureFeature from being extended
type: TypeT
geometry: Geometry
bbox: BBox
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bbox is being overridden here as a required BBox, which changes the GeoJSON/Feature contract and conflicts with Feature.bbox being optional (packages/overture-schema-system/src/overture/schema/system/feature.py:198-205). This will also break existing construction/validation paths that omit bbox (e.g., GeoJSON without a bbox). If bbox is meant to remain optional, remove this override and inherit the field from Feature, or type it as Omitable[BBox] (or BBox | None with an appropriate default) and keep the Field description consistent with Feature.

Suggested change
bbox: BBox

Copilot uses AI. Check for mistakes.
Comment on lines +7 to +11
from overture.schema.system.primitive import BBox, Geometry
from overture.schema.system.ref import Id, Identified
from overture.schema.system.string import (
CountryCodeAlpha2,
)
from overture.schema.system.string import CountryCodeAlpha2
from pydantic import BaseModel, ConfigDict, Field, GetJsonSchemaHandler, model_validator
from pydantic.json_schema import JsonSchemaValue
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import grouping/order here appears to violate the repo’s Ruff isort convention (third-party imports before first-party). In other core modules (e.g., packages/overture-schema-core/src/overture/schema/core/cartography.py:5-12), pydantic is imported before overture.*. Reorder these imports into stdlib / third-party / first-party blocks to avoid ruff (I001) failures.

Copilot uses AI. Check for mistakes.
@jenningsanderson Jennings Anderson (jenningsanderson) changed the title Add Bbox primitive directly to Feature next to geometry [WIP] Add Bbox primitive directly to Feature next to geometry Mar 31, 2026
@vcschapp
Copy link
Copy Markdown
Collaborator

Victor Schappert (vcschapp) commented Apr 1, 2026

Do we need to do this? My understanding is the CDP is now using the Pydantic version of the schema.

I think we can be out of the business of updating the JSON Schema now?


Given the shapeshifting nature of bbox structure across data formats, it's ideal if we can avoid codifying a single format-dependent variant. The Pydantic version tries to keep bbox format-agnostic.

@jenningsanderson
Copy link
Copy Markdown
Collaborator Author

Do we need to do this? My understanding is the CDP is now using the Pydantic version of the schema.

I think we can be out of the business of updating the JSON Schema now?

Given the shapeshifting nature of bbox structure across data formats, it's ideal if we can avoid codifying a single format-dependent variant. The Pydantic version tries to keep bbox format-agnostic.

Sorry for the complete lack of context on this one — let me shed some light:

As we move away from the conceptual GeoJSON schema to the what-you-see-is-what-you-get mode, I think we should ensure that the bbox attribute actually shows up in the schema documentation — not the primitive one referenced here, but the actual implementation of it as a struct of xmin, ymin, xmax, ymax. While this is really a convenience column added to the data in the central data pipeline, it is the only column present in the release parquet that's not discussed in the schema docs explicitly.

I don't think we should alter the bbox primitive that's part of a Geometry, as this is clearly foundational to the geometry — but instead introduce a new concept of an Overture Feature's bbox, built from this bbox primitive where we can define things like precision, data type, etc.

If defined in Pydantic, this would be accessible to the data pipeline validations and prevent future schema-related errors.

Bonus: This also gives us an opportunity to define what makes a bounding box valuable across themes. Since it's primarily for data-filtering, does it makes sense to clamp it to some arbitrary precision? In many cases, it's probably the most-read column, which means going across the wire — maybe it doesn't need centimeter level precision? Does that impact query time?

@vcschapp
Copy link
Copy Markdown
Collaborator

Do we need to do this? My understanding is the CDP is now using the Pydantic version of the schema.
I think we can be out of the business of updating the JSON Schema now?
Given the shapeshifting nature of bbox structure across data formats, it's ideal if we can avoid codifying a single format-dependent variant. The Pydantic version tries to keep bbox format-agnostic.

Sorry for the complete lack of context on this one — let me shed some light:

As we move away from the conceptual GeoJSON schema to the what-you-see-is-what-you-get mode, I think we should ensure that the bbox attribute actually shows up in the schema documentation — not the primitive one referenced here, but the actual implementation of it as a struct of xmin, ymin, xmax, ymax. While this is really a convenience column added to the data in the central data pipeline, it is the only column present in the release parquet that's not discussed in the schema docs explicitly.

I don't think we should alter the bbox primitive that's part of a Geometry, as this is clearly foundational to the geometry — but instead introduce a new concept of an Overture Feature's bbox, built from this bbox primitive where we can define things like precision, data type, etc.

If defined in Pydantic, this would be accessible to the data pipeline validations and prevent future schema-related errors.

Bonus: This also gives us an opportunity to define what makes a bounding box valuable across themes. Since it's primarily for data-filtering, does it makes sense to clamp it to some arbitrary precision? In many cases, it's probably the most-read column, which means going across the wire — maybe it doesn't need centimeter level precision? Does that impact query time?

bbox is defined in Pydantic as a primitive.

Because the aim is to keep the schema portable across data formats (including GeoJSON), the "primitive" Bbox explicitly does not state what its format-dependent structure is, same as Geometry. If we want people to know the Parquet structure, I think what we can do is write up the canned content of the docs page for Bbox in a way that explains what it looks like in different formats.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants