Skip to content

refactor(orch): make base provision.sh idempotent#2435

Open
tvi wants to merge 1 commit intomainfrom
t/prov-sh
Open

refactor(orch): make base provision.sh idempotent#2435
tvi wants to merge 1 commit intomainfrom
t/prov-sh

Conversation

@tvi
Copy link
Copy Markdown
Contributor

@tvi tvi commented Apr 18, 2026

Make the base template provisioning script safe to re-run without producing duplicated configuration lines or failing on already-applied state changes. This enables the same script to be executed multiple times against the same rootfs (e.g. when re-running the base phase of template builds) without drift or errors.

Non-idempotent behaviors addressed:

  • chattr +i / -i on /etc/resolv.conf would fail (under set -eu) when the immutable bit was already in the target state. Both calls are now guarded by a new is_immutable helper that inspects lsattr output.

  • PS1 exports were blindly appended to /etc/profile and /root/.bashrc with >>, producing duplicate lines on every run. They are now appended only when the exact line is not already present.

  • The "source ~/.bashrc / ~/.profile" snippet appended to /etc/profile had the same duplication problem and is now guarded the same way.

  • The sshd_config heredoc appended PermitRootLogin, PermitEmptyPasswords, and PasswordAuthentication directives on every invocation, producing repeated entries. The directives are now appended individually via a loop, each guarded against duplication. A touch ensures the file exists before grep on first run.

  • fs.inotify.max_user_watches=65536 written to /etc/sysctl.conf via tee -a would duplicate on re-runs. Replaced with a guarded append.

  • rm -rf on the self-deletion paths switched to rm -f; semantics are unchanged but intent is clearer for single-file removal.

Introduced three POSIX-sh helpers (compatible with the #!/bin/sh shebang and BusyBox sh / dash):

  • is_package_installed (pre-existing, relocated up)
  • append_line_if_missing
  • is_immutable

Helpers are defined at the top of the script, before the first use site, so the immutable-bit check can leverage is_immutable.

Already-idempotent operations left unchanged: package install loop (skips when installed), passwd -d root, chrony config writes (use > truncate), systemctl mask invocations, rm -rf /etc/machine-id, ln -sf for the systemd init symlink, and the final RESULT_PATH write.

Verified with sh -n syntax check.

Make the base template provisioning script safe to re-run without
producing duplicated configuration lines or failing on already-applied
state changes. This enables the same script to be executed multiple
times against the same rootfs (e.g. when re-running the base phase of
template builds) without drift or errors.

Non-idempotent behaviors addressed:

- chattr +i / -i on /etc/resolv.conf would fail (under set -eu) when
  the immutable bit was already in the target state. Both calls are
  now guarded by a new is_immutable helper that inspects lsattr output.

- PS1 exports were blindly appended to /etc/profile and /root/.bashrc
  with >>, producing duplicate lines on every run. They are now
  appended only when the exact line is not already present.

- The "source ~/.bashrc / ~/.profile" snippet appended to /etc/profile
  had the same duplication problem and is now guarded the same way.

- The sshd_config heredoc appended PermitRootLogin,
  PermitEmptyPasswords, and PasswordAuthentication directives on every
  invocation, producing repeated entries. The directives are now
  appended individually via a loop, each guarded against duplication.
  A touch ensures the file exists before grep on first run.

- fs.inotify.max_user_watches=65536 written to /etc/sysctl.conf via
  tee -a would duplicate on re-runs. Replaced with a guarded append.

- rm -rf on the self-deletion paths switched to rm -f; semantics are
  unchanged but intent is clearer for single-file removal.

Introduced three POSIX-sh helpers (compatible with the #!/bin/sh
shebang and BusyBox sh / dash):

- is_package_installed <pkg> (pre-existing, relocated up)
- append_line_if_missing <line> <file>
- is_immutable <file>

Helpers are defined at the top of the script, before the first use
site, so the immutable-bit check can leverage is_immutable.

Already-idempotent operations left unchanged: package install loop
(skips when installed), passwd -d root, chrony config writes (use >
truncate), systemctl mask invocations, rm -rf /etc/machine-id,
ln -sf for the systemd init symlink, and the final RESULT_PATH write.

Verified with sh -n syntax check.
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean mechanical refactor with no logic changes beyond guarding appends and chattr calls.

Extended reasoning...

Overview

Single-file change to provision.sh: introduces two small POSIX-sh helpers (append_line_if_missing, is_immutable) and applies them consistently to replace unconditional >> appends and unguarded chattr calls. The rm -rfrm -f cleanup is also correct for single-file removal.

Security risks

None introduced. SSH directives, sysctl settings, and resolv.conf handling are unchanged in semantics; the refactor only prevents duplicates and guards against already-applied state.

Level of scrutiny

Low — this is a provisioning helper script for template builds, not production runtime code. The changes are purely mechanical: no new behavior, no new config values, no changed defaults. The idempotency improvements are the stated goal and are correctly implemented.

Other factors

The two bugs flagged are pre-existing (the source bash-ism in SOURCE_LINE) and a low-risk nit (no trailing-newline guard) with negligible practical impact given all target files are standard POSIX text files. Neither was introduced by this PR, and neither blocks merging.

Comment on lines +20 to +24
append_line_if_missing() {
line=$1
file=$2
grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >>"$file"
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The append_line_if_missing helper appends via echo "$line" >>"$file" without first verifying the file ends with a newline; if any target file lacks a trailing newline, the new line is concatenated directly onto the last line (e.g. kernel.core_uses_pid=1fs.inotify.max_user_watches=65536), silently corrupting both entries. A subsequent grep -xF would fail to find the merged line, also breaking idempotency. In practice this is unlikely given all target files are standard Ubuntu system files that POSIX convention guarantees end with newlines, but a defensive printf '%s\n' with a conditional leading newline would harden the helper against edge cases.

Extended reasoning...

What the bug is and how it manifests

The append_line_if_missing helper (lines 20–24) uses echo "$line" >>"$file" to append a new configuration line. The POSIX echo command writes the text followed by a newline character, but it does not prepend a newline before the text. If the target file's last byte is not a newline character, the new content is appended immediately after the last byte with no separator, effectively merging the two lines into one.

The specific code path that triggers it

Every call to append_line_if_missing is affected: appending PS1_LINE to /etc/profile and /root/.bashrc, appending SOURCE_LINE to /etc/profile, appending the three SSH directives to /etc/ssh/sshd_config, and appending fs.inotify.max_user_watches=65536 to /etc/sysctl.conf. The grep guard only prevents a duplicate when the line is found as an exact whole-line match (-xF); it does not protect against the newline-concatenation scenario.

Why existing code doesn't prevent it

The grep -qxF guard checks whether the exact line already exists in the file. If the file is missing a trailing newline and a previous append already corrupted the last line, the grep will not find the intended line (it finds a mangled version), so the guard will incorrectly allow another append, compounding the corruption on every re-run. There is no check anywhere in the script that ensures the file ends with a newline before appending.

What the impact would be

For /etc/sysctl.conf, two merged directives like kernel.core_uses_pid=1fs.inotify.max_user_watches=65536 would cause sysctl --system to silently ignore both settings (or emit parse errors), meaning the inotify watch limit would not be raised. For /etc/ssh/sshd_config, concatenated directives would cause sshd to fail to parse the config, potentially blocking SSH access entirely.

How to fix it

Before appending, check whether the file ends with a newline (e.g., [ -s "$file" ] && [ "$(tail -c1 "$file" | wc -l)"] -eq 0 ] or tail -c1) and conditionally prepend a newline. A simple defensive form: printf '%s\n' "$line" >>"$file" still has the same problem; the real fix is: { [ \! -s "$file" ] || [ "$(tail -c1 "$file")"=$'\n' ] || printf '\n'; printf '%s\n' "$line"; } >>"$file"`.

Step-by-step proof

  1. Suppose /etc/sysctl.conf has been manually edited and its last line kernel.core_uses_pid=1 is missing its trailing newline (final byte is 1, not \n).
  2. grep -qxF 'fs.inotify.max_user_watches=65536' /etc/sysctl.conf returns non-zero (line not found).
  3. echo 'fs.inotify.max_user_watches=65536' >> /etc/sysctl.conf appends: file now ends with kernel.core_uses_pid=1fs.inotify.max_user_watches=65536\n.
  4. On re-run, grep -qxF 'fs.inotify.max_user_watches=65536' again returns non-zero (only the merged line exists), and the append repeats: kernel.core_uses_pid=1fs.inotify.max_user_watches=65536\nfs.inotify.max_user_watches=65536\n — now corrupt and duplicated.

Addressing the refutation

The refutation correctly notes that all target files on a fresh Ubuntu rootfs are POSIX text files that virtually always end with newlines, and that touch /etc/ssh/sshd_config creates a safely empty file. This makes the real-world risk very low in the controlled provisioning environment. The bug is rated nit accordingly — not blocking, but the helper is fragile by design and a one-line fix would make it robust against edge cases such as custom base images or manually modified files.

Comment on lines +71 to +72
SOURCE_LINE='if [ -f ~/.bashrc ]; then source ~/.bashrc; fi; if [ -f ~/.profile ]; then source ~/.profile; fi'
append_line_if_missing "$SOURCE_LINE" /etc/profile
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 The SOURCE_LINE written to /etc/profile uses 'source ~/.bashrc' and 'source ~/.profile', which is a bash-ism not present in POSIX sh (dash, which is /bin/sh on Debian/Ubuntu). This is a pre-existing issue that the PR does not introduce — the identical line content was already being appended before the idempotency refactor — but the PR is a good opportunity to fix it by replacing 'source' with '.' (dot) in SOURCE_LINE.

Extended reasoning...

What the bug is

The SOURCE_LINE constant defined at line 71 is:

SOURCE_LINE='if [ -f ~/.bashrc ]; then source ~/.bashrc; fi; if [ -f ~/.profile ]; then source ~/.profile; fi'

The source keyword is a bash/ksh extension. It is not part of the POSIX sh specification. On Debian and Ubuntu systems, /bin/sh is dash, which strictly implements POSIX sh and does not have a source builtin. The POSIX-portable equivalent is . (dot).

The specific code path

This string is appended to /etc/profile via append_line_if_missing. The file /etc/profile is a system-wide login shell initialization file that is sourced by any POSIX-compatible login shell, not just bash. The script itself declares #\!/bin/sh at line 1, signaling intent for POSIX portability.

Why existing code doesn't prevent it

The idempotency change in this PR correctly prevents duplicate appends but does not alter the content of SOURCE_LINE itself. The string written to /etc/profile is byte-for-byte identical to what was previously appended with >> in the pre-PR code. No validation or substitution of source. is performed.

Impact

When any process sources /etc/profile via a non-bash sh interpreter (e.g., a PAM module invoking /bin/sh -l, a CI script using sh -l, or a system service), it will encounter:

/etc/profile: XX: source: not found

The intended ~/.bashrc and ~/.profile sourcing silently fails. In practice the primary use case (root SSH login with bash as the configured shell) is unaffected since bash understands source, but any non-bash login shell path is broken.

Concrete proof

  1. Provision runs; /etc/profile now contains the line: if [ -f ~/.bashrc ]; then source ~/.bashrc; fi; ...
  2. A PAM login or system script invokes dash -l (i.e., /bin/sh -l).
  3. Dash sources /etc/profile as part of login initialization.
  4. Dash reaches source ~/.bashrc → prints source: not found and exits the command with error.
  5. The ~/.bashrc customizations (PATH, aliases, etc.) are never applied for that session.

How to fix

Replace source with . in the SOURCE_LINE definition:

SOURCE_LINE='if [ -f ~/.bashrc ]; then . ~/.bashrc; fi; if [ -f ~/.profile ]; then . ~/.profile; fi'

This is functionally identical for bash (which supports both forms) and correct for all POSIX-compliant shells.

Pre-existing note

This bug predates the PR — the same string was appended unconditionally before the idempotency refactor. The PR is flagged because it directly touches this line and is an obvious opportunity to correct the content at the same time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants