pobsync/README.md
Peter van Arkel 1297a839d4 (config) Harden Docker deployment for remote servers
Run the Django control panel with Gunicorn instead of the development
runserver and serve static files through WhiteNoise.

Add restart policies, healthchecks, .env-driven production settings, and
a sample .env file for single-server deployments. Update the Docker
entrypoint to collect static assets and document the remote server
deployment and update flow in the README.
2026-05-19 15:33:09 +02:00

# pobsync
`pobsync` is a pull-based backup service. It runs on a central backup server and pulls data from remote machines via rsync over SSH.
The refactor direction is SQL-first:
- Django is the management layer and source of truth.
- SQLite is the default database; MariaDB is optional.
- Backups still use the existing rsync snapshot engine internally.
- Scheduling is handled by a Django/Docker scheduler process, not host cron.
- Legacy YAML import/export exists only for migration and inspection.
## Requirements
On the backup server or in the container:
- Python 3.11+
- rsync
- ssh
- SSH key-based access from the backup server to remotes
## Local Development
```
python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install -e .
mkdir -p var
python3 manage.py migrate
python3 manage.py createsuperuser
python3 manage.py runserver
```
The admin is available at:
- http://127.0.0.1:8000/
- http://127.0.0.1:8000/admin/
Staff-only JSON endpoints are available at:
- http://127.0.0.1:8000/api/
- http://127.0.0.1:8000/api/status/
## SQL-First Setup
Create global config:
```
pobsync configure-global --backup-root /mnt/backups/pobsync
```
Create a host config:
```
pobsync configure-host <host> --address <host-or-ip>
```
Run a backup:
```
pobsync backup <host> --prune
```
Create or update a schedule:
```
pobsync schedule <host> --cron "15 2 * * *" --prune
```
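To make the cron expression concrete: `15 2 * * *` means minute 15 of hour 2, every day. The sketch below is a deliberately minimal matcher and is not taken from pobsync's scheduler. The `cron_matches` helper is hypothetical and only understands `*` and single integers per field.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` matches a five-field cron expression.

    Simplified on purpose: each field is either `*` or a single integer.
    Ranges, lists, and step values are not handled.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")

    # Cron counts weekdays from Sunday=0; isoweekday() returns Monday=1..Sunday=7.
    actual = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        when.isoweekday() % 7,
    )
    return all(field == "*" or int(field) == value for field, value in zip(fields, actual))
```

A real scheduler would typically rely on a full cron parser; this only illustrates how the five fields map onto a timestamp.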
Run the scheduler:
```
pobsync scheduler --loop --interval 60
```
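Conceptually, `--loop --interval 60` suggests a polling loop that wakes up roughly once a minute and runs whatever is due. The following is a rough sketch of that pattern under that assumption, not pobsync's implementation; `run_scheduler_loop`, `run_due_jobs`, and `max_ticks` are hypothetical names.

```python
import time
from typing import Callable, Optional


def run_scheduler_loop(
    run_due_jobs: Callable[[], None],
    interval: float = 60.0,
    max_ticks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `run_due_jobs` once per interval and return the number of ticks run.

    `max_ticks=None` loops forever; a finite value is useful for testing.
    """
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        started = time.monotonic()
        try:
            run_due_jobs()
        except Exception as exc:
            # One failing tick should not stop the scheduler process.
            print(f"scheduler tick failed: {exc}")
        ticks += 1
        # Subtract the time spent working so ticks stay close to the interval.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval - elapsed))
    return ticks
```

Passing `sleep` as a parameter keeps the loop testable without actually waiting.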
Plan or apply retention manually:
```
pobsync retention <host>
pobsync retention <host> --apply --yes --max-delete 10
```
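The two-step flow above (plan first, then apply with a deletion cap) can be sketched as follows. This is a hedged illustration, not pobsync's retention service: the keep-newest-N policy, the timestamp-style snapshot names, and the choice to abort when the plan exceeds the cap are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RetentionPlan:
    keep: list[str]
    delete: list[str]


def plan_retention(snapshots: list[str], keep_last: int) -> RetentionPlan:
    """Plan only: keep the `keep_last` newest snapshots, mark the rest for deletion.

    Assumes snapshot names sort chronologically (hypothetical naming scheme).
    """
    ordered = sorted(snapshots, reverse=True)
    return RetentionPlan(keep=ordered[:keep_last], delete=ordered[keep_last:])


def apply_plan(plan: RetentionPlan, max_delete: int, delete_fn: Callable[[str], None]) -> int:
    """Apply a plan, refusing to run if it would delete more than `max_delete`."""
    if len(plan.delete) > max_delete:
        raise RuntimeError(
            f"plan deletes {len(plan.delete)} snapshots, above the cap of {max_delete}"
        )
    for name in plan.delete:
        delete_fn(name)
    return len(plan.delete)
```

Separating planning from applying lets the plan be reviewed before anything destructive happens, and the cap guards against a misconfigured policy removing far more than intended.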
Discover snapshots already present on disk:
```
pobsync discover-snapshots --host <host>
```
The `pobsync` executable is a thin wrapper around Django management commands. Direct Django access is also available:
```
pobsync django check
python3 manage.py run_pobsync_backup <host> --prune
```
## Migration Helpers
Import existing legacy YAML configs:
```
python3 manage.py import_pobsync_configs --prefix /opt/pobsync
```
Export SQL config to legacy runtime YAML for inspection or one-off compatibility:
```
python3 manage.py export_pobsync_configs --prefix /opt/pobsync
```
These commands are migration helpers, not the normal operating model.
## Docker With SQLite
```
docker compose up --build web
```
This starts Django on:
- http://127.0.0.1:8010/
- http://127.0.0.1:8010/admin/
- http://127.0.0.1:8010/api/
- http://127.0.0.1:8010/api/status/
Run the scheduler and worker alongside the web admin:
```
docker compose up --build web scheduler worker
```
The web service runs Django through Gunicorn and serves static files with WhiteNoise. `/opt/pobsync` and the SQLite
database are persisted in Docker volumes.
Backup data is always available at `/backups` inside the containers. By default this maps to `./backups` on the host.
Override the host-side mount with `POBSYNC_BACKUP_ROOT`:
```
POBSYNC_BACKUP_ROOT=/mnt/backups/pobsync docker compose up --build web scheduler worker
```
The Django setup UI keeps the backup root fixed at `/backups`; only the Docker mount decides which host directory
that points to.
## Remote Server Deployment
For a single backup server, use Docker Compose with the SQLite services and put a reverse proxy such as Caddy, nginx,
or Traefik in front of `web`.
Create a `.env` from the example:
```
cp .env.example .env
```
Set at least:
```
POBSYNC_BACKUP_ROOT=/mnt/backups/pobsync
POBSYNC_DJANGO_ALLOWED_HOSTS=backup.example.com,localhost,127.0.0.1
POBSYNC_DJANGO_SECRET_KEY=<long-random-secret>
POBSYNC_DJANGO_DEBUG=0
POBSYNC_WEB_BIND=127.0.0.1
```
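A value for `<long-random-secret>` can be generated with Python's standard `secrets` module. The sketch below also shows one common way to parse such variables in a Django settings module. The helper names are hypothetical and the parsing rules are assumptions, not pobsync's actual settings code.

```python
import secrets


def generate_secret_key(nbytes: int = 64) -> str:
    """Return a URL-safe random string suitable for use as a secret key."""
    return secrets.token_urlsafe(nbytes)


def parse_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else (including "0") is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def parse_hosts(value: str) -> list[str]:
    """Split a comma-separated host list, dropping empty entries and whitespace."""
    return [host.strip() for host in value.split(",") if host.strip()]
```

For example, `parse_hosts("backup.example.com,localhost,127.0.0.1")` yields a three-item list, and `parse_bool("0")` is `False`.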
Deploy or update:
```
git pull
docker compose build web scheduler worker
docker compose up -d --force-recreate web scheduler worker
docker compose exec web python manage.py migrate
```
Check service state:
```
docker compose ps
docker compose logs --tail=100 worker
docker compose logs --tail=100 scheduler
```
`web`, `scheduler`, and `worker` use `restart: unless-stopped` and Docker healthchecks. If `POBSYNC_WEB_BIND` is
`127.0.0.1`, the web port is reachable only from the server itself; expose the app through your reverse proxy rather
than publishing it directly to the internet.
## Django-Managed SSH Keys
SSH keys can be managed from the Django UI at `/ssh-credentials/`. Add a private key there, optionally paste
`known_hosts` entries, and select the credential either as the global default or as a per-host override.
When a backup starts, the worker writes the selected key to `/opt/pobsync/state/ssh-credentials/<id>/identity`
inside the container with `0600` permissions and injects `IdentityFile` into the rsync SSH command. If `known_hosts`
is configured, the worker also writes a matching `known_hosts` file and injects `UserKnownHostsFile`.
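The key-handling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the worker's actual code: `write_private_file` and `build_ssh_command` are hypothetical helpers, and passing the result to rsync via its `-e` option is assumed rather than confirmed by this README.

```python
import os
import shlex
from pathlib import Path
from typing import Optional


def write_private_file(path: Path, content: str) -> None:
    """Write `content` to `path` so that only the owner can read or write it."""
    # The mode applies to the final directory only and is subject to the umask.
    path.parent.mkdir(parents=True, exist_ok=True, mode=0o700)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # Enforce 0600 even if the file already existed with broader permissions.
        os.fchmod(fd, 0o600)
        handle.write(content)


def build_ssh_command(identity: Path, known_hosts: Optional[Path] = None) -> str:
    """Build an ssh command string with the per-credential options injected."""
    parts = ["ssh", "-o", f"IdentityFile={identity}"]
    if known_hosts is not None:
        parts += ["-o", f"UserKnownHostsFile={known_hosts}"]
    return shlex.join(parts)
```

The returned string is the kind of value rsync accepts as its remote shell, for example `rsync -e "<command>" ...`.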
## Docker With MariaDB
```
docker compose --profile mariadb up --build web-mariadb
```
With the scheduler and worker:
```
docker compose --profile mariadb up --build web-mariadb scheduler-mariadb worker-mariadb
```
SQLite remains the default because it is enough for a single backup server and keeps deployment simple.
## Current Architecture
The public command surface is Django-first. The old YAML/cron CLI has been retired from the `pobsync` entrypoint.
Discovered snapshots are stored in `SnapshotRecord`, including the base snapshot metadata and a nullable SQL link to the
base record when it is known.
The Django retention command plans from `SnapshotRecord` instead of rediscovering snapshots from the filesystem.
Post-backup pruning from Django also uses the SQL retention service after the completed snapshot is recorded.
Staff-only JSON endpoints expose service status, hosts, snapshots, and backup runs for lightweight inspection.
Staff-only dashboard views expose the same operational state through Django templates.
Host pages include a safe snapshot discovery action that records existing snapshots into SQL.
Host pages also include a read-only SQL retention plan view before any destructive pruning action.
Schedules can be created or updated from host pages using the same SQL-backed scheduler model.
Host config can be edited from host pages while keeping host identity stable.
The remaining internal engine code still contains reusable backup primitives:
- snapshot naming and metadata
- rsync command construction and execution
- retention planning and pruning
- host locking
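As an illustration of the last primitive, a per-host lock that prevents two backups of the same host from overlapping could look like the sketch below. It uses an advisory `flock` on a lock file. The `host_lock` helper, the `HostLockedError` exception, and the lock file layout are hypothetical and not taken from the pobsync engine.

```python
import fcntl
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


class HostLockedError(RuntimeError):
    """Raised when another backup already holds the lock for a host."""


@contextmanager
def host_lock(lock_dir: Path, host: str) -> Iterator[None]:
    """Hold an exclusive, non-blocking advisory lock for `host` (Unix only)."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    fd = os.open(lock_dir / f"{host}.lock", os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise HostLockedError(f"host {host!r} is already locked") from exc
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

Because the kernel releases the lock when the descriptor is closed, a crashed process does not leave a stale lock behind.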
Next refactor targets:
- Move more snapshot lifecycle details into typed domain objects.
- Replace remaining dictionary-shaped config at engine boundaries.
- Remove legacy YAML import/export once production migration no longer needs it.