Peter van Arkel fe8e65e12e (feature) add queued backup worker foundation
Move backup execution out of the management command into a reusable
backup runner service that can execute an existing BackupRun record.

Add queue primitives and a run_pobsync_worker command so manual backup
requests can be recorded as queued SQL state and processed outside the
web request path.

Add a worker Docker service and pobsync worker CLI alias, with tests for
queued run creation, worker execution, manual run typing, and command
mapping.
2026-05-19 13:00:12 +02:00

pobsync

pobsync is a pull-based backup service. It runs on a central backup server and pulls data from remote machines via rsync over SSH.

The refactor direction is SQL-first:

  • Django is the management layer and source of truth.
  • SQLite is the default database; MariaDB is optional.
  • Backups still use the existing rsync snapshot engine internally.
  • Scheduling is handled by a Django/Docker scheduler process, not host cron.
  • Legacy YAML import/export exists only for migration and inspection.

Requirements

On the backup server or in the container:

  • Python 3.11+
  • rsync
  • ssh
  • SSH key-based access from the backup server to remotes

Local Development

python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install -e .
mkdir -p var
python3 manage.py migrate
python3 manage.py createsuperuser
python3 manage.py runserver

The admin is available at:

Staff-only JSON endpoints are available at:

SQL-First Setup

Create global config:

pobsync configure-global --backup-root /mnt/backups/pobsync

Create a host config:

pobsync configure-host <host> --address <host-or-ip>

Run a backup:

pobsync backup <host> --prune

Create or update a schedule:

pobsync schedule <host> --cron "15 2 * * *" --prune
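
The expression "15 2 * * *" uses the standard five cron fields (minute, hour, day of month, month, day of week), so this schedule fires every day at 02:15. As a rough illustration of how such an expression can be checked against a timestamp, here is a minimal sketch. `cron_matches` is a hypothetical helper, not pobsync's scheduler code. It supports only `*` and plain integers, and it requires every field to match, which is simpler than full cron semantics.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if a simple five-field cron expression matches `when`.

    Hypothetical helper: only `*` and plain integers are supported,
    not ranges, lists, or steps.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    # Cron counts Sunday as 0; isoweekday() counts Sunday as 7.
    actual = (when.minute, when.hour, when.day, when.month, when.isoweekday() % 7)
    return all(f == "*" or int(f) == value for f, value in zip(fields, actual))
```

A scheduler loop could call a check like this once per minute for each stored schedule and start a run on a match.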

Run the scheduler:

pobsync scheduler --loop --interval 60

Plan or apply retention manually:

pobsync retention <host>
pobsync retention <host> --apply --yes --max-delete 10
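
The two invocations above separate planning from applying. The sketch below illustrates that split under stated assumptions: a keep-newest-N policy, and `--max-delete` acting as a safety cap that aborts when the plan would delete more than the limit. Both are guesses for illustration; the actual retention policy and flag semantics are not described here, and `plan_retention` and `check_apply` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPlan:
    keep: list[str]
    delete: list[str]


def plan_retention(snapshots: list[str], keep_last: int) -> RetentionPlan:
    """Plan only: keep the newest `keep_last` snapshots, mark the rest.

    Assumes snapshot names sort chronologically (for example timestamps).
    """
    ordered = sorted(snapshots, reverse=True)
    return RetentionPlan(keep=ordered[:keep_last], delete=ordered[keep_last:])


def check_apply(plan: RetentionPlan, confirmed: bool, max_delete: int) -> list[str]:
    """Return the snapshots to delete, or raise if a guard fails."""
    if not confirmed:
        raise PermissionError("refusing to delete without explicit confirmation")
    if len(plan.delete) > max_delete:
        raise RuntimeError(
            f"plan deletes {len(plan.delete)} snapshots, limit is {max_delete}"
        )
    return plan.delete
```

Keeping the plan as plain data means the same object can be displayed read-only or handed to the destructive step.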

Discover snapshots already present on disk:

pobsync discover-snapshots --host <host>

The pobsync executable is a thin wrapper around Django management commands. Direct Django access is also available:

pobsync django check
python3 manage.py run_pobsync_backup <host> --prune
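
To make the wrapper idea concrete, here is a minimal sketch of how a short subcommand could be translated into a management command invocation. It is not the actual pobsync entrypoint. The `backup` and `worker` entries follow command names mentioned in this README; `to_manage_argv` and the shape of the table are assumptions.

```python
# Mapping from short subcommands to Django management command names.
# Only names that appear in this README are listed; the real table
# may differ.
COMMAND_MAP = {
    "backup": "run_pobsync_backup",
    "worker": "run_pobsync_worker",
}


def to_manage_argv(argv: list[str]) -> list[str]:
    """Translate wrapper arguments into a manage.py argument list."""
    if not argv:
        raise SystemExit("usage: pobsync <command> [args...]")
    command, rest = argv[0], argv[1:]
    if command == "django":
        # Pass-through: `pobsync django check` becomes `manage.py check`.
        return ["manage.py", *rest]
    try:
        return ["manage.py", COMMAND_MAP[command], *rest]
    except KeyError:
        raise SystemExit(f"unknown command: {command}") from None
```

With this sketch, `to_manage_argv(["backup", "web01", "--prune"])` yields the same arguments as the direct `manage.py run_pobsync_backup` call shown above (`web01` is a placeholder host name).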

Migration Helpers

Import existing legacy YAML configs:

python3 manage.py import_pobsync_configs --prefix /opt/pobsync

Export SQL config to legacy runtime YAML for inspection or one-off compatibility:

python3 manage.py export_pobsync_configs --prefix /opt/pobsync

These commands are migration helpers, not the normal operating model.

Docker With SQLite

docker compose up --build web

This starts Django on:

Run the scheduler and worker alongside the web admin:

docker compose up --build web scheduler worker

The containers persist /opt/pobsync and the SQLite database in Docker volumes. Backup data is mounted at /backups inside the containers. By default the mount source is ./backups on the host. Override it with POBSYNC_BACKUP_ROOT:

POBSYNC_BACKUP_ROOT=/mnt/backups/pobsync docker compose up --build web scheduler worker

In the Django global config, set the backup root to /backups when running in Docker. For local, non-Docker use, set it directly to the host path, for example /mnt/backups/pobsync.

Docker With MariaDB

docker compose --profile mariadb up --build web-mariadb

With the scheduler and worker:

docker compose --profile mariadb up --build web-mariadb scheduler-mariadb worker-mariadb

SQLite remains the default because it is enough for a single backup server and keeps deployment simple.

Current Architecture

The public command surface is Django-first. The old YAML/cron CLI has been retired from the pobsync entrypoint.

  • Discovered snapshots are stored in SnapshotRecord, including the base snapshot metadata and a nullable SQL link to the base record when it is known.
  • The Django retention command plans from SnapshotRecord instead of rediscovering snapshots from the filesystem.
  • Post-backup pruning from Django also uses the SQL retention service after the completed snapshot is recorded.
  • Staff-only JSON endpoints expose service status, hosts, snapshots, and backup runs for lightweight inspection.
  • Staff-only dashboard views expose the same operational state through Django templates.
  • Host pages include a safe snapshot discovery action that records existing snapshots into SQL.
  • Host pages also include a read-only SQL retention plan view before any destructive pruning action.
  • Schedules can be created or updated from host pages using the same SQL-backed scheduler model.
  • Host config can be edited from host pages while keeping host identity stable.

The remaining internal engine code still contains reusable backup primitives:

  • snapshot naming and metadata
  • rsync command construction and execution
  • retention planning and pruning
  • host locking

Next refactor targets:

  • Move more snapshot lifecycle details into typed domain objects.
  • Replace remaining dictionary-shaped config at engine boundaries.
  • Remove legacy YAML import/export once production migration no longer needs it.