Doc: storage.yaml #12609

Open
masaori335 wants to merge 6 commits into apache:11-Dev from masaori335:doc-storage-yaml

masaori335 commented Oct 24, 2025

Revives #11000 for the config project. Differences from #11000:

  1. Deprecates storage.config and volume.config but keeps them for compatibility, so this can be merged to the master branch.
  2. Renames spans.id to spans.name for clarity.
  3. Adds volumes.avg_obj_size and volumes.fragment_size support.

If this looks good, I'll work on the implementation.
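
For orientation, a minimal sketch of how the three changes might look in a storage.yaml file. The key names come from this PR; the span name, device path, numeric values, and the exact nesting are illustrative assumptions, not taken from the proposed documentation.

```yaml
cache:
  spans:
    - name: disk.1            # spans.name (renamed from spans.id); value is hypothetical
      path: /dev/sdb          # hypothetical device path
  volumes:
    - id: 1
      avg_obj_size: 65536     # hypothetical per-volume override, assumed to be in bytes
      fragment_size: 1048576  # hypothetical per-volume override, assumed to be in bytes
      spans:
        - use: disk.1         # refers to the span by its name
```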

Copilot AI left a comment

Pull request overview

Introduces documentation for the new unified storage.yaml cache configuration, and marks the legacy storage.config and volume.config formats as deprecated. It also updates the admin config index and documents per-volume avg_obj_size and fragment_size overrides.

Changes:

  • Add comprehensive storage.yaml documentation including schema (spans, volumes, volumes.spans), allocation rules, and migration/backwards compatibility information.
  • Mark storage.config and volume.config as deprecated in favor of storage.yaml and cross-link them appropriately.
  • Update the admin “Configuration Files” index to include storage.yaml and to reflect the new deprecation status of storage.config and volume.config.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| doc/admin-guide/files/storage.yaml.en.rst | New, full reference and examples for storage.yaml, covering storage spans, volume definitions (including avg_obj_size and fragment_size), allocation behavior, and compatibility with existing storage.config/volume.config. |
| doc/admin-guide/files/storage.config.en.rst | Adds an .. important:: block to mark storage.config as deprecated in favor of storage.yaml. |
| doc/admin-guide/files/volume.config.en.rst | Adds an .. important:: block to mark volume.config as deprecated in favor of storage.yaml. |
| doc/admin-guide/files/index.en.rst | Registers storage.yaml in the configuration files toctree and updates one-line descriptions for storage.config, storage.yaml, and volume.config to reflect the new deprecations. |

masaori335 and others added 2 commits February 2, 2026 13:21
Co-authored-by: Alan M. Carroll <amc@apache.org>
masaori335 marked this pull request as ready for review February 2, 2026 04:21
masaori335 requested a review from Copilot February 2, 2026 04:21

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


@bryancall

Feature comparison: storage.config + volume.config vs storage.yaml

I compared the documented features in the existing config files against the new storage.yaml docs and found a few gaps:

| # | Missing from storage.yaml docs | Notes |
| --- | --- | --- |
| 1 | Human-readable size suffixes (K, M, G, T) | storage.config documents these. The new YAML examples only show raw byte values (e.g., 134217728 instead of 128M). Should document whether YAML sizes support the same suffixes. |
| 2 | Fragment size maximum (4MB) | volume.config documents "this setting has a maximum value of 4MB". Not mentioned in storage.yaml fragment_size description. |
| 3 | Cache invalidation warning | Both storage.config and volume.config warn that changes effectively invalidate the cache. The new storage.yaml docs do not include this warning. |
| 4 | Minimum span/volume sizes | storage.config says "a formatted or raw disk must be at least 128 MB". volume.config says "128 MB is the smallest value" and sizes must be multiples of 128 MB. Not mentioned in storage.yaml. |
| 5 | Striping behavior | volume.config explains how volumes stripe across disks for parallel I/O. This is useful context missing from the new docs. |

Items 1-3 are the most important -- users will hit these in practice. 4 and 5 are nice-to-have.

Otherwise the docs look thorough. The examples are clear, the backwards compatibility section is helpful, and the avg_obj_size / fragment_size per-volume overrides are well documented.


bryancall left a comment

Please see my comment on the PR about the changes in the documentation.

masaori335 commented Feb 19, 2026

@bryancall

  1. Human-readable size suffixes (K, M, G, T)

Added an example with 64G.

  2. Fragment size maximum (4MB)

Updated.

  3. Cache invalidation warning

There is a warning after the key descriptions.

  4. Minimum span/volume sizes

This is already mentioned, in exactly the same wording as storage.config (it was copied from there).

  5. Striping behavior

Copied the sentence from volume.config.
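
As a rough illustration of items 1 and 2, a volume might combine a suffixed size with a fragment size at the documented limit. This is only a sketch: placing the 64G value on the volume's size key is an assumption (the reply does not say where the example appears), and the span name and path are hypothetical.

```yaml
cache:
  spans:
    - name: disk.1            # hypothetical span name
      path: /dev/sdb          # hypothetical device path
  volumes:
    - id: 1
      size: 64G               # suffixed size from the reply; placement on this key is assumed
      fragment_size: 4194304  # 4 MB in bytes, the maximum noted in item 2
      spans:
        - use: disk.1
```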

@masaori335

[approve ci centos]


Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| fragment_size | integer | Overrides the global :ts:cv:`proxy.config.cache.target_fragment_size` configuration for this volume. |
| | | This allows for a smaller, or larger, fragment size for a particular volume. This may be useful |
| | | together with ``avg_obj_size`` as well, since a larger fragment size could reduce the number of |
| | | directory entries needed for a large object. Note that this setting has a maximmum value of 4MB. |

Copilot AI Mar 25, 2026

Typo in this table cell: "maximmum" should be "maximum".

Suggested change
- | | | directory entries needed for a large object. Note that this setting has a maximmum value of 4MB. |
+ | | | directory entries needed for a large object. Note that this setting has a maximum value of 4MB. |

which will effectively clear most of the cache. This can be a problem when drives fail and a system
reboot causes the path names to change.

The :arg:`name` option can be used to create a fixed string that an administrator can use to keep the

Copilot AI Mar 25, 2026

The assignment-table paragraph refers to the ":arg:name option" as controlling the assignment hash seed, but earlier the document defines hash_seed as the key for isolating lookups from path changes. This looks inconsistent with the key definitions (and with storage.config where id= seeds the assignment table). Consider updating this to refer to the hash_seed key (and use a role consistent with other YAML-key references, e.g., hash_seed / :code:).

Suggested change
- The :arg:`name` option can be used to create a fixed string that an administrator can use to keep the
+ The :code:`hash_seed` key can be used to create a fixed string that an administrator can use to keep the

Comment on lines +195 to +214
/dev/disk2 volume=3 # storage.config
volume=3 scheme=http size=512 # volume.config

The corresponding configuration would be

.. code-block:: yaml

cache:
spans:
- name: disk.2
path: /dev/disk2
volumes:
- id: 3
spans:
- use: disk.2
size: 512

Because volume sizes that are percentages are computed on span storage not already explicitly allocated, this will
leave none of "disk.2" for such allocation and therefore "disk.2" will be used only by volume "1". Note this
configuration is more flexible. If it was useful to have two linear volumes, each using exclusively half of the

Copilot AI Mar 25, 2026

Backwards-compatibility example appears inconsistent: it defines volume id: 3 but the explanation says the span will be used only by volume "1". Also, the example uses size: 512 (bytes per earlier text) which conflicts with the 128MB allocation granularity described later and is likely meant to be 512M (matching the legacy volume.config example).
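
One possible way to resolve both points, sketched under assumptions: the volume id stays 3 to match the legacy lines (so the explanatory text would refer to volume "3"), and the size is written as 512M on the assumption that storage.yaml accepts the same suffixes as storage.config. The nesting, with size on the volume entry, is also an assumption because the quoted snippet does not show its indentation.

```yaml
cache:
  spans:
    - name: disk.2
      path: /dev/disk2
  volumes:
    - id: 3          # same id as the legacy volume=3 lines
      size: 512M     # assumed suffix form of the legacy size=512 (megabytes)
      spans:
        - use: disk.2
```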

Comment on lines +93 to 96
Defines cache space usage by individual protocols. (Deprecated in favor of :file:`storage.yaml`)

:doc:`jsonrpc.yaml.en`
Defines some of the configurable arguments of the jsonrpc endpoint.

Copilot AI Mar 25, 2026

These two definition entries are indented one extra space compared to the rest of the list, which is likely to render oddly in Sphinx. Align the indentation of the volume.config.en and jsonrpc.yaml.en description lines with the other entries (4 spaces).

Suggested change
-      Defines cache space usage by individual protocols. (Deprecated in favor of :file:`storage.yaml`)
- :doc:`jsonrpc.yaml.en`
-      Defines some of the configurable arguments of the jsonrpc endpoint.
+     Defines cache space usage by individual protocols. (Deprecated in favor of :file:`storage.yaml`)
+ :doc:`jsonrpc.yaml.en`
+     Defines some of the configurable arguments of the jsonrpc endpoint.

Comment on lines +462 to +471
cache:
spans:
- name: disk
path: "/dev/sdb"
- name: ram.1
path: "/dev/ram.1"
- name: ram.2
path: "/dev/ram.2"
volumes:
- id: 1

Copilot AI Mar 25, 2026

The YAML indentation in this example is inconsistent with the other snippets (extra leading spaces before spans: / volumes:). Keeping indentation consistent makes the examples easier to copy/paste and compare.
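
For reference, the quoted lines with uniform two-space nesting might look like the sketch below. The indentation width is an assumption about what the other snippets use, and the volume definition is left incomplete because it continues past the quoted range.

```yaml
cache:
  spans:
    - name: disk
      path: "/dev/sdb"
    - name: ram.1
      path: "/dev/ram.1"
    - name: ram.2
      path: "/dev/ram.2"
  volumes:
    - id: 1
      # remainder of this volume lies outside the quoted lines
```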
