Peter van Arkel 372a857f15 (feature) Add full native installer and self-check page
Expand the systemd installer so it can perform a complete native
installation with sensible defaults: copy the checkout into the target
app directory, create runtime directories, write the environment file,
install dependencies, configure systemd units, and optionally configure
nginx.

Add a staff-only Django self-check page that verifies runtime settings,
required binaries, writable paths, database connectivity, global config
state, and systemd service status when available.

Document installer overrides and expose the self-check from the main
navigation.
2026-05-19 16:05:03 +02:00

# pobsync
`pobsync` is a pull-based backup service. It runs on a central backup server and pulls data from remote machines via rsync over SSH.
The refactor direction is SQL-first:
- Django is the management layer and source of truth.
- SQLite is the default database; MariaDB is optional.
- Backups still use the existing rsync snapshot engine internally.
- Scheduling is handled by a Django scheduler service, not host cron.
- Legacy YAML import/export exists only for migration and inspection.
## Requirements
On the backup server or in the container:
- Python 3.11+
- rsync
- ssh
- SSH key-based access from the backup server to remotes
- systemd for the recommended production deployment
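
The requirements above can be checked before installing. The following is a minimal sketch, not part of pobsync itself: the function name `check_requirements` is hypothetical, and it only checks the Python version and whether `rsync` and `ssh` are on `PATH`. It does not verify SSH key access to remotes or the presence of systemd.

```python
import shutil
import sys

# Binaries named in the Requirements list above.
REQUIRED_BINARIES = ("rsync", "ssh")
MIN_PYTHON = (3, 11)


def check_requirements(binaries=REQUIRED_BINARIES, min_python=MIN_PYTHON):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required, found {found}")
    for name in binaries:
        # shutil.which returns None when the binary is not on PATH.
        if shutil.which(name) is None:
            problems.append(f"missing binary: {name}")
    return problems
```

Calling `check_requirements()` on the backup server returns an empty list when the interpreter and both binaries are available.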
## Local Development
```
python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install -e .
mkdir -p var
python3 manage.py migrate
python3 manage.py createsuperuser
python3 manage.py runserver
```
The admin is available at:
- http://127.0.0.1:8000/
- http://127.0.0.1:8000/admin/
Staff-only JSON endpoints are available at:
- http://127.0.0.1:8000/api/
- http://127.0.0.1:8000/api/status/
## SQL-First Setup
Create global config:
```
pobsync configure-global --backup-root /mnt/backups/pobsync
```
Create a host config:
```
pobsync configure-host <host> --address <host-or-ip>
```
Run a backup:
```
pobsync backup <host> --prune
```
Create or update a schedule:
```
pobsync schedule <host> --cron "15 2 * * *" --prune
```
Run the scheduler:
```
pobsync scheduler --loop --interval 60
```
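To illustrate what a schedule such as `15 2 * * *` means when the scheduler wakes up at each interval, here is a rough sketch of five-field cron matching. It is not pobsync's scheduler code; `cron_matches` and `_field_matches` are hypothetical names. The sketch is deliberately simplified: it requires both day-of-month and day-of-week to match (classic cron treats them as alternatives when both are restricted), and it accepts only `0` for Sunday.

```python
from datetime import datetime


def _field_matches(field: str, value: int, low: int, high: int) -> bool:
    """Check one cron field: '*', 'a', 'a-b', '*/n', 'a-b/n', and comma lists."""
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start_text, end_text = rng.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = int(rng)
            # 'a/n' is read as 'from a to the maximum, every n'.
            end = high if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True when `when` (to the minute) satisfies the cron expression."""
    minute, hour, dom, month, dow = expr.split()
    # Cron counts Sunday as 0; isoweekday() counts Sunday as 7.
    cron_dow = when.isoweekday() % 7
    return (
        _field_matches(minute, when.minute, 0, 59)
        and _field_matches(hour, when.hour, 0, 23)
        and _field_matches(dom, when.day, 1, 31)
        and _field_matches(month, when.month, 1, 12)
        and _field_matches(dow, cron_dow, 0, 6)
    )
```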
Plan or apply retention manually:
```
pobsync retention <host>
pobsync retention <host> --apply --yes --max-delete 10
```
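The split between planning and applying can be sketched as follows. This is only an illustration under loud assumptions: the README does not describe the retention policy or the exact semantics of `--max-delete`, so the keep-last-N policy, the "refuse when over the cap" behaviour, and every name below (`RetentionPlan`, `plan_keep_last`, `apply_plan`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPlan:
    keep: list[str]
    delete: list[str]


def plan_keep_last(snapshots: list[str], keep_last: int) -> RetentionPlan:
    """Plan only: keep the newest `keep_last` snapshots, touch nothing.

    Assumes snapshot names sort chronologically (for example timestamps).
    """
    ordered = sorted(snapshots)
    cut = max(len(ordered) - keep_last, 0)
    return RetentionPlan(keep=ordered[cut:], delete=ordered[:cut])


def apply_plan(plan: RetentionPlan, *, confirmed: bool, max_delete: int, delete_fn) -> int:
    """Delete planned snapshots only when confirmed and within the cap."""
    if not confirmed:
        raise PermissionError("refusing to delete without explicit confirmation")
    if len(plan.delete) > max_delete:
        # Assumed behaviour: abort rather than delete a truncated subset.
        raise ValueError(
            f"plan deletes {len(plan.delete)} snapshots, cap is {max_delete}"
        )
    for name in plan.delete:
        delete_fn(name)
    return len(plan.delete)
```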
Discover snapshots already present on disk:
```
pobsync discover-snapshots --host <host>
```
The `pobsync` executable is a thin wrapper around Django management commands. Direct Django access is also available:
```
pobsync django check
python3 manage.py run_pobsync_backup <host> --prune
```
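A thin wrapper of this kind usually just rewrites its arguments into a `manage.py` invocation. The sketch below shows the idea and is not the actual entrypoint: `to_manage_argv` and `COMMAND_MAP` are hypothetical, and the only mapping included (`backup` to `run_pobsync_backup`) is inferred from the two examples above rather than confirmed by the source.

```python
# Hypothetical mapping from wrapper subcommands to Django management commands.
# Only this single pair is suggested by the README examples.
COMMAND_MAP = {"backup": "run_pobsync_backup"}


def to_manage_argv(argv: list[str]) -> list[str]:
    """Translate wrapper arguments into arguments for manage.py."""
    if not argv:
        raise ValueError("no subcommand given")
    command, rest = argv[0], argv[1:]
    if command == "django":
        # Pass-through: `pobsync django check` becomes `manage.py check`.
        return ["manage.py", *rest]
    if command not in COMMAND_MAP:
        raise ValueError(f"unknown subcommand: {command}")
    return ["manage.py", COMMAND_MAP[command], *rest]
```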
## Migration Helpers
Import existing legacy YAML configs:
```
python3 manage.py import_pobsync_configs --prefix /opt/pobsync
```
Export SQL config to legacy runtime YAML for inspection or one-off compatibility:
```
python3 manage.py export_pobsync_configs --prefix /opt/pobsync
```
These commands are migration helpers, not the normal operating model.
## Production With Systemd
The recommended production deployment is native systemd services on the backup server. This avoids Docker friction around
SSH, filesystems, large backup mounts, and host-level service logs.
Recommended layout:
```
/opt/pobsync/app # git checkout
/opt/pobsync/venv # Python virtualenv
/etc/pobsync/pobsync.env # settings and secrets
/var/lib/pobsync # SQLite database, state, runtime SSH key files, static files
/backups # backup storage, or set POBSYNC_BACKUP_ROOT to another absolute path
```
Install OS packages first:
```
sudo apt install python3 python3-venv rsync openssh-client
```
From a checked-out copy of this repository, run:
```
sudo scripts/install-systemd
```
By default the installer copies the checkout to `/opt/pobsync/app`, creates `/opt/pobsync/venv`, writes
`/etc/pobsync/pobsync.env`, creates `/var/lib/pobsync` and `/backups`, installs dependencies, runs migrations, collects
static files, and starts the services.
Common overrides:
```
sudo scripts/install-systemd \
--app-dir /opt/pobsync/app \
--backup-root /mnt/backups/pobsync \
--allowed-hosts backup.example.com,localhost,127.0.0.1 \
--csrf-trusted-origins https://backup.example.com
```
Use `--force-env` when you intentionally want the installer to rewrite an existing `/etc/pobsync/pobsync.env`.
The installer creates or updates:
- `pobsync-web.service` for Gunicorn on `127.0.0.1:8010`
- `pobsync-worker.service` for queued backup runs
- `pobsync-scheduler.service` for SQL-backed schedules
- `/etc/pobsync/pobsync.env`, written only if it does not already exist (or when `--force-env` is given)
Edit `/etc/pobsync/pobsync.env` before exposing the service:
```
POBSYNC_DJANGO_ALLOWED_HOSTS=backup.example.com,localhost,127.0.0.1
POBSYNC_DJANGO_CSRF_TRUSTED_ORIGINS=https://backup.example.com
POBSYNC_BACKUP_ROOT=/backups
POBSYNC_WEB_BIND=127.0.0.1:8010
```
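A quick sanity check of the environment file can catch common mistakes before the service is exposed. The sketch below is an illustration, not pobsync code: the helper names are hypothetical, the parser handles only plain `KEY=VALUE` lines without quoting, and it assumes the values feed Django's `ALLOWED_HOSTS` and `CSRF_TRUSTED_ORIGINS` settings (Django 4.0 and later requires a scheme on each trusted origin).

```python
import os.path


def parse_env_file(text: str) -> dict[str, str]:
    """Parse plain KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def split_csv(value: str) -> list[str]:
    """Split a comma-separated setting into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def validate_env(values: dict[str, str]) -> list[str]:
    """Return a list of problems found in the parsed settings."""
    problems = []
    if not split_csv(values.get("POBSYNC_DJANGO_ALLOWED_HOSTS", "")):
        problems.append("POBSYNC_DJANGO_ALLOWED_HOSTS is empty")
    for origin in split_csv(values.get("POBSYNC_DJANGO_CSRF_TRUSTED_ORIGINS", "")):
        if not origin.startswith(("http://", "https://")):
            problems.append(f"trusted origin lacks a scheme: {origin}")
    root = values.get("POBSYNC_BACKUP_ROOT", "")
    if root and not os.path.isabs(root):
        problems.append(f"POBSYNC_BACKUP_ROOT is not absolute: {root}")
    return problems
```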
Restart after changes:
```
sudo systemctl restart pobsync-web pobsync-worker pobsync-scheduler
```
Check service state and logs:
```
systemctl status pobsync-web pobsync-worker pobsync-scheduler
journalctl -u pobsync-worker -f
```
The Django UI also has a staff-only `/self-check/` page that verifies runtime settings, required binaries, writable
paths, database connectivity, global config state, and systemd service state when systemd is available.
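
Two of those checks, writable paths and database connectivity, can be approximated outside Django. The following is a sketch under assumptions and not the code behind `/self-check/`: the function names are hypothetical, and the database check covers only the default SQLite case, opening the file read-only so that a missing database is reported rather than silently created.

```python
import sqlite3
import tempfile
from pathlib import Path


def check_writable(directory: Path) -> bool:
    """Try to create (and automatically remove) a temporary file in `directory`."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        return False


def check_sqlite(db_path: Path) -> bool:
    """Open the database read-only and run a trivial query."""
    uri = db_path.resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error:
        return False
    try:
        conn.execute("SELECT 1").fetchone()
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```

For the layout above, `check_writable(Path("/var/lib/pobsync"))` would be a natural first call; the database file name inside that directory is not stated here, so it is left to the caller.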
Update an existing native install:
```
git pull
sudo scripts/install-systemd --app-dir /opt/pobsync/app
```
Use an existing reverse proxy by forwarding to `http://127.0.0.1:8010`. To install a simple nginx site file as a
starting point:
```
sudo scripts/install-systemd --with-nginx --server-name backup.example.com
```
## Docker With SQLite
Docker Compose is still useful for local development and disposable test installs. Native systemd is preferred for
production backup servers.
```
docker compose up --build web
```
This starts Django on:
- http://127.0.0.1:8010/
- http://127.0.0.1:8010/admin/
- http://127.0.0.1:8010/api/
- http://127.0.0.1:8010/api/status/
Run the scheduler and worker alongside the web admin:
```
docker compose up --build web scheduler worker
```
The web service runs Django through Gunicorn and serves static files with WhiteNoise. The container persists `/opt/pobsync`
and the SQLite database in Docker volumes.
Backup data is always available at `/backups` inside the containers. By default this uses `./backups` on the host.
Override the host-side mount with `POBSYNC_BACKUP_ROOT`:
```
POBSYNC_BACKUP_ROOT=/mnt/backups/pobsync docker compose up --build web scheduler worker
```
In the Docker setup, the Django setup UI keeps the backup root fixed at `/backups`; only the Docker mount decides which
host directory that path maps to.
## Django-Managed SSH Keys
SSH keys can be managed from the Django UI at `/ssh-credentials/`. Add a private key there, optionally paste
`known_hosts` entries, and select the credential either as the global default or as a per-host override.
When a backup starts, the worker writes the selected key to `$POBSYNC_HOME/state/ssh-credentials/<id>/identity`
with `0600` permissions and injects `IdentityFile` into the rsync SSH command. If `known_hosts` is configured, the
worker also writes a matching `known_hosts` file and injects `UserKnownHostsFile`.
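
The key-materialisation step described above can be sketched as follows. This is an approximation, not the worker's actual code: `build_ssh_command` and `rsync_remote_shell` are hypothetical names, `state_dir` stands in for `$POBSYNC_HOME/state`, and giving the `known_hosts` file the same `0600` mode is an assumption. `IdentityFile` and `UserKnownHostsFile` are standard OpenSSH options, and `-e` is rsync's option for choosing the remote shell.

```python
import os
import shlex
from pathlib import Path


def _write_private(path: Path, text: str) -> None:
    """Write `text` to `path` so that it is only ever readable by the owner."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # Enforce the mode even if the file already existed with other permissions.
    os.chmod(path, 0o600)


def build_ssh_command(state_dir: Path, credential_id: int, private_key: str,
                      known_hosts: str = "") -> list[str]:
    """Write the credential files and return the ssh argv that uses them."""
    cred_dir = state_dir / "ssh-credentials" / str(credential_id)
    cred_dir.mkdir(parents=True, exist_ok=True)
    identity = cred_dir / "identity"
    _write_private(identity, private_key)
    ssh = ["ssh", "-o", f"IdentityFile={identity}"]
    if known_hosts:
        hosts_file = cred_dir / "known_hosts"
        _write_private(hosts_file, known_hosts)
        ssh += ["-o", f"UserKnownHostsFile={hosts_file}"]
    return ssh


def rsync_remote_shell(ssh_argv: list[str]) -> list[str]:
    """Return the rsync arguments that select the given ssh command."""
    return ["-e", shlex.join(ssh_argv)]
```

Opening the file with mode `0600` from the start avoids a window in which the key would be readable by other users.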
## Docker With MariaDB
```
docker compose --profile mariadb up --build web-mariadb
```
With the scheduler and worker:
```
docker compose --profile mariadb up --build web-mariadb scheduler-mariadb worker-mariadb
```
SQLite remains the default because it is enough for a single backup server and keeps deployment simple.
## Current Architecture
The public command surface is Django-first. The old YAML/cron CLI has been retired from the `pobsync` entrypoint.
Discovered snapshots are stored in `SnapshotRecord`, including the base snapshot metadata and a nullable SQL link to the
base record when it is known.
The Django retention command plans from `SnapshotRecord` instead of rediscovering snapshots from the filesystem.
Post-backup pruning from Django also uses the SQL retention service after the completed snapshot is recorded.
Staff-only JSON endpoints expose service status, hosts, snapshots, and backup runs for lightweight inspection.
Staff-only dashboard views expose the same operational state through Django templates.
Host pages include a safe snapshot discovery action that records existing snapshots into SQL.
Host pages also include a read-only SQL retention plan view before any destructive pruning action.
Schedules can be created or updated from host pages using the same SQL-backed scheduler model.
Host config can be edited from host pages while keeping host identity stable.
The remaining internal engine code still contains reusable backup primitives:
- snapshot naming and metadata
- rsync command construction and execution
- retention planning and pruning
- host locking
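
Of the primitives listed above, host locking is the easiest to illustrate in isolation. The sketch below shows one common approach on Unix, an advisory `flock` on a per-host lock file, so that two backups for the same host cannot overlap. It is an assumption about the technique, not a description of pobsync's engine; `host_lock`, `HostLocked`, and the lock file naming are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager
from pathlib import Path


class HostLocked(RuntimeError):
    """Raised when another run already holds the lock for a host."""


@contextmanager
def host_lock(lock_dir: Path, host: str):
    """Hold an exclusive, non-blocking lock for `host` while the block runs."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    fd = os.open(lock_dir / f"{host}.lock", os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise HostLocked(f"backup already running for {host}") from exc
        yield
    finally:
        # Closing the descriptor releases the lock if it was acquired.
        os.close(fd)
```

A backup run would wrap its work in `with host_lock(lock_dir, host):` and treat `HostLocked` as "skip this run".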
Next refactor targets:
- Move more snapshot lifecycle details into typed domain objects.
- Replace remaining dictionary-shaped config at engine boundaries.
- Remove legacy YAML import/export once production migration no longer needs it.