REST Spec: Add single and batch endpoints for loading relational objects (table, view, and future MV) #15830

Open
stevenzwu wants to merge 3 commits into apache:main from stevenzwu:rest-spec-universal-load

@stevenzwu stevenzwu commented Mar 30, 2026

Introduce GET /v1/{prefix}/namespaces/{namespace}/relations/{relation}
to resolve a namespace-qualified name as a table or view in one round
trip, returning a discriminated LoadRelationResult.

The response uses a nested structure where each object-type branch
wraps the type-specific payload under a named key (table or view).
This design avoids field-name collisions and allows future composite
types like materialized views to carry both view and storage-table
payloads in a single response.

Made-with: Cursor
Model: claude-4.6-opus-high-thinking
view:
$ref: '#/components/schemas/LoadViewResult'

LoadRelationResult:
Contributor Author

Future materialized-view would look like

{
  "object-type": "materialized-view",
  "view": { },
  "storage-table": { }
}
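
To illustrate how a client might consume the nested structure, here is a minimal Python sketch that dispatches on the `object-type` discriminator. The function name `describe_relation` is hypothetical, and the `materialized-view` branch only mirrors the possible future shape shown above; it is not part of this PR.

```python
from typing import Any, Dict, List


def describe_relation(result: Dict[str, Any]) -> str:
    """Summarize which payload keys a LoadRelationResult-like dict carries.

    Sketch only: payload contents are treated as opaque dicts.
    """
    object_type = result.get("object-type")
    if object_type == "table":
        payload_keys: List[str] = ["table"]
    elif object_type == "view":
        payload_keys = ["view"]
    elif object_type == "materialized-view":
        # Assumption: hypothetical future branch, not defined by this PR.
        payload_keys = ["view", "storage-table"]
    else:
        raise ValueError(f"unknown object-type: {object_type!r}")

    # Each branch wraps its payload under a named key, so check they exist.
    missing = [key for key in payload_keys if key not in result]
    if missing:
        raise ValueError(f"{object_type} result is missing keys: {missing}")
    return f"{object_type}: " + ", ".join(payload_keys)
```

Because every branch keeps its payload under its own key, a composite type can carry two payloads without any field-name collision.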

Add POST /v1/{prefix}/relations/batch-load to batch load relations
(tables and views) across namespaces in one request.

Each request item uses a TableIdentifier (namespace + name) with
optional per-item etag and snapshots parameters. Each response item
includes a status field (200, 304, 404) and a nested result that
reuses LoadRelationResult with its object-type discriminator.

Servers may return unprocessed-identifiers to cap computation or
response payload size; clients should retry them.

Made-with: Cursor
Model: claude-4.6-opus-high-thinking
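
A rough Python sketch of the client-side retry for `unprocessed-identifiers` follows. It assumes a hypothetical `send_batch` callable that performs the POST and returns the decoded JSON body, and it assumes the per-item results live under a `results` key; neither name is confirmed by the spec text quoted here.

```python
from typing import Any, Callable, Dict, List

Identifier = Dict[str, Any]  # e.g. {"namespace": ["db"], "name": "events"}


def batch_load_all(
    identifiers: List[Identifier],
    send_batch: Callable[[List[Identifier]], Dict[str, Any]],
    max_rounds: int = 5,
) -> List[Dict[str, Any]]:
    """Call send_batch until the server reports no unprocessed identifiers."""
    pending = list(identifiers)
    loaded: List[Dict[str, Any]] = []
    for _ in range(max_rounds):
        if not pending:
            break
        response = send_batch(pending)
        # Assumption: per-item results are returned under "results".
        loaded.extend(response.get("results", []))
        # The server may cap work per request; retry whatever it skipped.
        pending = list(response.get("unprocessed-identifiers", []))
    if pending:
        raise RuntimeError(f"{len(pending)} identifiers still unprocessed")
    return loaded
```

The `max_rounds` bound is a design choice of this sketch so that a server which never makes progress cannot keep the client looping forever.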
Made-with: Cursor
@stevenzwu stevenzwu force-pushed the rest-spec-universal-load branch from fd3e2d9 to 9a1edfd on March 30, 2026 20:40
Comment on lines +2284 to +2285
"GET /v1/{prefix}/namespaces/{namespace}/relations/{relation}",
"POST /v1/{prefix}/relations/batch-load"
Contributor

I have a minor objection to the use of "relations" here, since we want this to be a general endpoint for resolution. Objects like tables and views are considered relations, but something like a function would not be (unless you're going for a strictly relational-algebra definition, but that's not consistent with SQL usage).

We may also include other objects in the future, so a more general term like resolve, identifiers, resources, or entities might be better.

Contributor Author

@stevenzwu stevenzwu Apr 1, 2026

This is related to the separate discussion on identifier conflict domains, where we want to allow the same identifier for a relational object (like a table) and a function. Under that assumption, the endpoint needs the object category in the path to distinguish them. Otherwise, we would require identifier uniqueness across all object types, which is not the consensus from that discussion.

description:
"
Load metadata for multiple relations in one request. Identifiers may span different namespaces.
Each item includes a `TableIdentifier` and optional per-item parameters (`etag` and `snapshots`).
Contributor

This seems like a good time to introduce an Identifier type with the same structure as the table identifier. It seems odd to keep using an object-specific identifier type to reference multiple object types.

Since it has the same structure, you could possibly just have them extend Identifier (depending on how that affects the OpenAPI structure and generated code).

Contributor Author

The server resolves each identifier as a table or view.


The per-item `status` in the response indicates the outcome:
Contributor

@danielcweeks danielcweeks Apr 1, 2026

This feels awkward because we're using HTTP status codes for non-request/internal results. I don't think that makes a lot of sense, and I'd prefer we indicate result behavior in a different way.

Contributor Author

What about defining an enum schema in the REST spec? It would be similar to the HTTP status names, though.

    BatchLoadItemResultStatus:
      type: string
      description: |
        The outcome of loading a single item in a batch load response.
      enum:
        - success
        - not-modified
        - not-found

Open to other suggestions.
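
For illustration, a small Python sketch of how a client could model the proposed enum and relate it to the integer codes (200, 304, 404) used in the current draft. The helper `from_http_status` is hypothetical and only shows that the two representations map one to one.

```python
from enum import Enum


class BatchLoadItemResultStatus(Enum):
    """Mirrors the enum values proposed in the schema above."""

    SUCCESS = "success"
    NOT_MODIFIED = "not-modified"
    NOT_FOUND = "not-found"


# Assumption: the draft's per-item integer codes correspond to these values.
_FROM_HTTP = {
    200: BatchLoadItemResultStatus.SUCCESS,
    304: BatchLoadItemResultStatus.NOT_MODIFIED,
    404: BatchLoadItemResultStatus.NOT_FOUND,
}


def from_http_status(code: int) -> BatchLoadItemResultStatus:
    """Translate a per-item integer status into the enum form."""
    try:
        return _FROM_HTTP[code]
    except KeyError:
        raise ValueError(f"unsupported per-item status: {code}") from None
```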

Contributor Author

BTW, @gaborkaszab and @jbonofre previously suggested using HTTP status codes in the design doc comments.

Contributor Author

@stevenzwu stevenzwu Apr 3, 2026

I did a little more research. Two patterns are common.

  1. Use HTTP status codes.
  • Microsoft Graph API: JSON batching where each item has its own integer HTTP status, headers, and body.
  • Facebook/Meta Graph API: each item has a code field (HTTP integer) plus optional headers and body.
  • Elasticsearch Bulk API: each item has an integer status field (e.g. 200, 201, 404, 409) plus a string result field (e.g. "created", "updated", "not_found").
  2. Split results into separate lists. AWS services commonly use this pattern.
  • AWS DynamoDB BatchGetItem: found items in Responses, absent items silently omitted, incomplete items in UnprocessedKeys. No status field.
  • AWS SQS SendMessageBatch / DeleteMessageBatch: results split into Successful and Failed lists. Failed entries have Code (a string error code like "InvalidParameterValue"), not HTTP status integers.
  • AWS S3 DeleteObjects: in verbose mode, successful deletes are listed in Deleted and failures in Errors with a string Code (e.g. "AccessDenied").
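
To make the difference between the two patterns concrete, here is a small Python sketch that converts a per-item status list (pattern 1) into separate lists (pattern 2). The key names `identifier`, `status`, `loaded`, `not-modified`, and `not-found` are assumptions for illustration, not names taken from the spec.

```python
from typing import Any, Dict, List


def split_by_outcome(items: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Regroup per-item status results into one list per outcome."""
    bucket_for = {200: "loaded", 304: "not-modified", 404: "not-found"}
    split: Dict[str, List[Any]] = {name: [] for name in bucket_for.values()}
    for item in items:
        status = item["status"]
        if status not in bucket_for:
            raise ValueError(f"unsupported per-item status: {status}")
        if status == 200:
            # Loaded items keep their full payload.
            split["loaded"].append(item)
        else:
            # Other outcomes only need the identifier.
            split[bucket_for[status]].append(item["identifier"])
    return split
```

With per-item statuses the response stays a single ordered list; with separate lists the client skips the status check but has to correlate identifiers across lists itself.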

type: string
description: |
The type of a catalog object.
enum:
Contributor Author

We may add values such as materialized-view or function in the future.

…cription

Address review feedback to avoid referencing what might come in the
future. The materialized-view and function context is captured in PR
comments instead.

Made-with: Cursor
Model: claude-4.6-opus-high-thinking
@stevenzwu stevenzwu force-pushed the rest-spec-universal-load branch from cebd6ab to b093e29 on April 2, 2026 19:40