refactor: replace legacy CLI with Django command surface

Retire the old YAML- and cron-oriented pobsync CLI commands and expose a
SQL-first Django-backed command surface instead. Add schedule and
retention management commands, move shared defaults/parsing out of legacy
commands, remove obsolete command modules, and update documentation and
tests for the new workflow.
2026-05-19 05:14:29 +02:00
parent 6d9ddc4457
commit e564262c72
22 changed files with 351 additions and 2043 deletions

226
README.md

@@ -1,123 +1,25 @@
# pobsync
`pobsync` is a pull-based backup tool that runs on a central backup server and pulls data from remote servers via rsync over SSH.
`pobsync` is a pull-based backup service. It runs on a central backup server and pulls data from remote machines via rsync over SSH.
Key points:
The refactor direction is SQL-first:
- All backup data lives on the backup server.
- Snapshots are rsync-based and use hardlinking (--link-dest) for space efficiency.
- Designed for scheduled runs (cron) and manual runs.
- Minimal external dependencies (currently only PyYAML).
- Django is the management layer and source of truth.
- SQLite is the default database; MariaDB is optional.
- Backups still use the existing rsync snapshot engine internally.
- Scheduling is handled by a Django/Docker scheduler process, not host cron.
- Legacy YAML import/export exists only for migration and inspection.
## Requirements
On the backup server:
On the backup server or in the container:
- Python 3
- Python 3.11+
- rsync
- ssh
- SSH key-based access from the backup server to remotes
## Canonical installation (no venv, repo used only for deployment)
This project uses a simple and explicit deployment model:
- The git clone is only used as a deployment input (and later for updates).
- Runtime code is deployed into /opt/pobsync/lib.
- The canonical entrypoint is /opt/pobsync/bin/pobsync.
### Install
```
git clone https://code.hosting.hippogrief.nl/hippogrief/pobsync.git
cd pobsync
sudo ./scripts/deploy --prefix /opt/pobsync
# Install default configurations
pobsync install --backup-root /mnt/backups/pobsync
# Check that the installation was done correctly
pobsync doctor
```
### Update
```
cd /path/to/pobsync
git pull
sudo ./scripts/deploy --prefix /opt/pobsync
sudo /opt/pobsync/bin/pobsync doctor
```
## Configuration
Global configuration is stored at:
- /opt/pobsync/config/global.yaml
Per-host configuration files are stored at:
- /opt/pobsync/config/hosts/<host>.yaml
## Some useful commands to get you started
Create a new host configuration:
`pobsync init-host <host>`
List configured remotes:
`pobsync list-remotes`
Inspect the effective configuration for a host:
`pobsync show-config <host>`
## Running backups
Run a scheduled backup for a host:
`pobsync run-scheduled <host>`
Optionally apply retention pruning after the run:
`pobsync run-scheduled <host> --prune`
## Scheduling (cron)
Create a cron schedule (writes into /etc/cron.d/pobsync by default):
`pobsync schedule create <host> --daily 02:15 --prune`
List existing schedules:
`pobsync schedule list`
Remove a schedule:
`pobsync schedule remove <host>`
Cron output is redirected to:
- /var/log/pobsync/<host>.cron.log
## Development (optional)
For development you can still use an editable install, which is why pyproject.toml still exists. On systems with an externally managed Python installation, create a virtualenv first.
```
python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install -e .
pobsync --help
```
For production use, always use the canonical entrypoint:
/opt/pobsync/bin/pobsync
## Django backend (early refactor layer)
The Django backend is becoming the management layer and source of truth for pobsync. Structured SQL fields store backup, SSH, rsync, retention, schedule, run, and snapshot state; legacy JSON/YAML remains only as an import/export compatibility path while the engine is being refactored.
### Local SQLite development
## Local Development
```
python3 -m venv .venv
@@ -133,40 +35,69 @@ The admin is available at:
- http://127.0.0.1:8000/admin/
Import existing YAML configs into the database:
## SQL-First Setup
Create global config:
```
pobsync configure-global --backup-root /mnt/backups/pobsync
```
Create a host config:
```
pobsync configure-host <host> --address <host-or-ip>
```
Run a backup:
```
pobsync backup <host> --prune
```
Create or update a schedule:
```
pobsync schedule <host> --cron "15 2 * * *" --prune
```
Run the scheduler:
```
pobsync scheduler --loop --interval 60
```
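As a rough illustration of what a scheduler tick has to decide, the sketch below checks whether a five-field cron expression such as `15 2 * * *` matches a given minute. It is not pobsync's scheduler code: the function names are hypothetical, only `*`, plain numbers, and comma lists are handled, and all five fields must match (real cron also supports ranges and steps, and treats day-of-month and day-of-week differently when both are restricted).

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', single numbers, and comma lists only."""
    for part in field.split(","):
        if part == "*" or int(part) == value:
            return True
    return False


def is_due(cron: str, now: datetime) -> bool:
    """Return True if the five-field cron expression matches the minute in `now`."""
    minute, hour, day, month, weekday = cron.split()
    # cron uses 0 for Sunday; isoweekday() returns 7 for Sunday, so fold it to 0.
    cron_weekday = now.isoweekday() % 7
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(day, now.day)
        and field_matches(month, now.month)
        and field_matches(weekday, cron_weekday)
    )
```

With a 60 second interval, a loop that calls a check like this once per tick would see each matching minute roughly once, which is presumably why the example pairs minute-resolution cron expressions with `--interval 60`.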
Plan or apply retention manually:
```
pobsync retention <host>
pobsync retention <host> --apply --yes --max-delete 10
```
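The plan-then-apply split can be pictured with a small sketch. Everything here is an assumption for illustration: the "keep the newest N" policy, the `RetentionPlan` type, and the reading of `--max-delete` as a safety cap that rejects an oversized plan. The actual retention rules live in the pobsync engine and may differ.

```python
from dataclasses import dataclass


@dataclass
class RetentionPlan:
    keep: list[str]
    delete: list[str]


def plan_retention(snapshots: list[str], keep_last: int) -> RetentionPlan:
    """Keep the newest `keep_last` snapshots (hypothetical policy).

    Snapshot names are assumed to sort chronologically.
    """
    ordered = sorted(snapshots)
    if keep_last <= 0:
        return RetentionPlan(keep=[], delete=ordered)
    return RetentionPlan(keep=ordered[-keep_last:], delete=ordered[:-keep_last])


def check_apply(plan: RetentionPlan, max_delete: int) -> None:
    """Refuse to apply a plan that would delete more snapshots than allowed."""
    if len(plan.delete) > max_delete:
        raise RuntimeError(
            f"plan deletes {len(plan.delete)} snapshots, limit is {max_delete}"
        )
```

Separating planning from applying means the plan can be printed and reviewed first, and a cap on deletions limits the damage if a misconfigured policy selects far more snapshots than intended.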
The `pobsync` executable is a thin wrapper around Django management commands. Direct Django access is also available:
```
pobsync django check
python3 manage.py run_pobsync_backup <host> --prune
```
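A thin wrapper of this kind can be sketched as a lookup from CLI subcommands to management command names. The pairs below are inferred from the parallel examples in this README rather than taken from the wrapper's source, `translate` is a hypothetical helper, and subcommands whose management command names are not shown here (such as `schedule` and `retention`) are left out.

```python
# Subcommand -> management command, inferred from the examples in this README.
COMMAND_MAP = {
    "configure-global": "configure_pobsync_global",
    "configure-host": "configure_pobsync_host",
    "backup": "run_pobsync_backup",
    "scheduler": "run_pobsync_scheduler",
}


def translate(argv: list[str]) -> list[str]:
    """Turn pobsync CLI arguments into manage.py-style arguments."""
    if not argv:
        raise SystemExit("usage: pobsync <command> [args...]")
    command, rest = argv[0], argv[1:]
    if command == "django":
        # `pobsync django check` passes everything after `django` straight through.
        return rest
    if command not in COMMAND_MAP:
        raise SystemExit(f"unknown command: {command}")
    return [COMMAND_MAP[command], *rest]
```

Under these assumptions, `translate(["backup", "web01", "--prune"])` yields `["run_pobsync_backup", "web01", "--prune"]`, which a wrapper could then hand to Django's `django.core.management.execute_from_command_line` with a program name prepended.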
## Migration Helpers
Import existing legacy YAML configs:
```
python3 manage.py import_pobsync_configs --prefix /opt/pobsync
```
Create SQL-backed configuration directly:
```
python3 manage.py configure_pobsync_global --backup-root /mnt/backups/pobsync
python3 manage.py configure_pobsync_host <host> --address <host-or-ip>
```
Run a backup through Django while still using the existing pobsync engine:
```
python3 manage.py run_pobsync_backup <host> --prefix /opt/pobsync --prune
```
The Django backup command reads backup and retention config from SQL directly. Runtime YAML export is kept as a compatibility tool for older CLI flows during the transition.
Export database configs to runtime YAML for legacy CLI compatibility:
Export SQL config to legacy runtime YAML for inspection or one-off compatibility:
```
python3 manage.py export_pobsync_configs --prefix /opt/pobsync
```
Run due schedules from the database:
These commands are migration helpers, not the normal operating model.
```
python3 manage.py run_pobsync_scheduler --loop --interval 60
```
### Docker with SQLite
## Docker With SQLite
```
docker compose up --build web
@@ -176,15 +107,15 @@ This starts Django on:
- http://127.0.0.1:8000/admin/
The container persists `/opt/pobsync` and the SQLite database in Docker volumes.
Run the Django scheduler alongside the web admin:
Run the scheduler alongside the web admin:
```
docker compose up --build web scheduler
```
### Docker with MariaDB
The container persists `/opt/pobsync` and the SQLite database in Docker volumes.
## Docker With MariaDB
```
docker compose --profile mariadb up --build web-mariadb
@@ -196,15 +127,22 @@ With the scheduler:
docker compose --profile mariadb up --build web-mariadb scheduler-mariadb
```
The MariaDB profile is optional. SQLite remains the default because it is enough for a single backup server and keeps deployment simple.
SQLite remains the default because it is enough for a single backup server and keeps deployment simple.
### Refactor direction
## Current Architecture
Recommended next steps:
The public command surface is Django-first. The old YAML/cron CLI has been retired from the `pobsync` entrypoint.
- Remove remaining legacy YAML-first commands after SQL-first setup covers all workflows.
- Record more engine-side run details into `BackupRun` and `SnapshotRecord`.
- Treat SQL as the source of truth and export YAML only as a compatibility layer for the current engine.
- Run schedules from Django/Docker instead of writing host cron files.
- Add a snapshot discovery command that syncs existing snapshot metadata into `SnapshotRecord`.
- Add tests around retention, scheduling, and config merge before deeper internal reshaping.
The remaining internal engine code still contains reusable backup primitives:
- snapshot naming and metadata
- rsync command construction and execution
- retention planning and pruning
- host locking
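To make the rsync primitive concrete, here is a minimal sketch of building a hardlinked snapshot command with `--link-dest`. The function name, the flag selection, and the directory layout are assumptions for illustration; the engine's real command construction is not shown in this README.

```python
from pathlib import Path
from typing import Optional


def build_rsync_command(
    source: str, dest: Path, link_dest: Optional[Path] = None
) -> list[str]:
    """Build an rsync argument list for one snapshot (hypothetical helper).

    With --link-dest, files unchanged since the previous snapshot are
    hardlinked into the new snapshot instead of being copied again.
    """
    cmd = ["rsync", "-a", "--delete", "-e", "ssh"]
    if link_dest is not None:
        cmd.append(f"--link-dest={link_dest}")
    cmd += [source, str(dest)]
    return cmd
```

Returning a list rather than a shell string lets the caller pass it to `subprocess.run` without shell quoting, and the first snapshot for a host simply omits `link_dest`.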
Next refactor targets:
- Record discovered snapshots into `SnapshotRecord`.
- Move more snapshot lifecycle details into typed domain objects.
- Replace remaining dictionary-shaped config at engine boundaries.
- Remove legacy YAML import/export once production migration no longer needs it.