(docs) Add manual restore guidance for snapshots

Document the manual restore workflow in the README and surface snapshot-
specific restore commands on the snapshot detail page.

The guidance keeps restores intentionally manual for now: inspect the
snapshot data directory, run rsync with --dry-run, restore to staging
first, and treat hardlinked snapshot files as read-only.
This commit is contained in:
2026-05-21 02:01:40 +02:00
parent 8858e049ee
commit b78f102e9d
4 changed files with 95 additions and 0 deletions

View File

@@ -154,6 +154,37 @@ The UI includes:
- `/self-check/` for runtime checks
- `/logs/` for filtered pobsync service logs
## Restoring Data
pobsync 1.0 treats restores as an explicit manual operation. The control panel shows restore guidance on each snapshot
detail page, but it does not run restore commands for you yet. That is deliberate: restores should be inspected and
tested before data is copied back into a live system.
Each snapshot directory contains:
```
<snapshot>/data/ # backed-up filesystem contents
<snapshot>/meta/ # metadata and rsync logs
```
Use the `data/` directory as the rsync source. Start with a dry run and restore to a staging path first:
```
rsync -aHAX --numeric-ids --info=progress2 --dry-run /backups/example.org/scheduled/<snapshot>/data/ /restore/example.org/
rsync -aHAX --numeric-ids --info=progress2 /backups/example.org/scheduled/<snapshot>/data/ /restore/example.org/
```
After validating the staged files, copy the specific files or directories back to the target machine. For a full-host
restore, use another dry run before writing to the remote root:
```
rsync -aHAX --numeric-ids --info=progress2 --dry-run /backups/example.org/scheduled/<snapshot>/data/ root@example.org:/
```
Snapshots may use hardlinks for files that are unchanged between backups. That saves disk space and is safe for normal
restore copies, but do not edit files inside snapshot directories. Treat snapshots as read-only and copy data out with
rsync.
## SSH Keys
SSH keys can be managed from `/ssh-credentials/`. The recommended flow is to generate keys from Django or during the

View File

@@ -60,6 +60,38 @@
</section>
{% endif %}
<section class="panel">
<h2>Restore Guidance</h2>
<div class="stack spaced">
<div><strong>Snapshot data source:</strong> {{ restore.source_path }}</div>
<div><strong>Example staging destination:</strong> {{ restore.destination_path }}</div>
<div class="muted">
Restore from the snapshot's <code>data/</code> directory. Start with a dry run, restore to a staging path first,
and only then copy data back to a live host or service path.
</div>
</div>
<div class="stack spaced">
<div><strong>Inspect the snapshot:</strong></div>
<pre>{{ restore.inspect_command }}</pre>
</div>
<div class="stack spaced">
<div><strong>Dry-run restore to staging:</strong></div>
<pre>{{ restore.dry_run_command }}</pre>
</div>
<div class="stack spaced">
<div><strong>Restore to staging:</strong></div>
<pre>{{ restore.local_command }}</pre>
</div>
<div class="stack spaced">
<div><strong>Dry-run restore back to the source host:</strong></div>
<pre>{{ restore.remote_dry_run_command }}</pre>
</div>
<p class="muted">
Snapshots can contain hardlinks to files shared with earlier snapshots. Treat snapshot directories as read-only:
copy data out with rsync instead of editing files in place.
</p>
</section>
<section class="panel">
<h2>Backup Runs</h2>
<table>

View File

@@ -1421,6 +1421,13 @@ class ViewTests(TestCase):
self.assertContains(response, "Stats")
self.assertContains(response, "Files seen:</strong> 100")
self.assertContains(response, "Hardlinked files:</strong> 9")
self.assertContains(response, "Restore Guidance")
self.assertContains(response, f"{base.path}/data")
self.assertContains(response, f"/restore/{host.host}")
self.assertContains(response, "rsync -aHAX --numeric-ids --info=progress2 --dry-run")
self.assertContains(response, f"{base.path}/data/")
self.assertContains(response, "root@web-01.example.test:/")
self.assertContains(response, "Treat snapshot directories as read-only")
self.assertContains(response, child.dirname)
self.assertContains(response, f"Run {run.id}")
self.assertContains(response, reverse("run_detail", args=[run.id]))

View File

@@ -1,6 +1,7 @@
from __future__ import annotations
import json
import shlex
import shutil
import subprocess
from pathlib import Path
@@ -494,12 +495,14 @@ def snapshot_detail(request, snapshot_id: int):
SnapshotRecord.objects.select_related("host", "base").prefetch_related("derived_snapshots", "backup_runs"),
id=snapshot_id,
)
restore = _snapshot_restore_guidance(snapshot)
context = {
"snapshot": snapshot,
"stats": snapshot.metadata.get("stats") if isinstance(snapshot.metadata, dict) else {},
"metadata_json": _pretty_json(snapshot.metadata),
"backup_runs": snapshot.backup_runs.select_related("host").order_by("-created_at"),
"derived_snapshots": snapshot.derived_snapshots.select_related("host").order_by("-started_at", "dirname"),
"restore": restore,
}
return render(request, "pobsync_backend/snapshot_detail.html", context)
@@ -790,6 +793,28 @@ def _pretty_json(value: object) -> str:
return json.dumps(value or {}, indent=2, sort_keys=True)
def _snapshot_restore_guidance(snapshot: SnapshotRecord) -> dict[str, str]:
source_path = Path(snapshot.path) / "data"
destination_path = Path("/restore") / snapshot.host.host
quoted_source = _quote_path_with_trailing_slash(source_path)
quoted_destination = _quote_path_with_trailing_slash(destination_path)
quoted_remote_destination = shlex.quote(f"root@{snapshot.host.address or snapshot.host.host}:/")
common_args = "rsync -aHAX --numeric-ids --info=progress2"
return {
"source_path": str(source_path),
"destination_path": str(destination_path),
"inspect_command": f"ls -la {quoted_source}",
"dry_run_command": f"{common_args} --dry-run {quoted_source} {quoted_destination}",
"local_command": f"{common_args} {quoted_source} {quoted_destination}",
"remote_dry_run_command": f"{common_args} --dry-run {quoted_source} {quoted_remote_destination}",
}
def _quote_path_with_trailing_slash(path: Path) -> str:
return shlex.quote(str(path).rstrip("/") + "/")
def _run_rsync_log_path(run: BackupRun) -> Path | None:
if isinstance(run.result, dict):
log = run.result.get("log")