# Complete history and project memory: SentinelMesh

## Repository references

- **Gitea**: https://git.maison43gil.com/gilles/SentinelMesh.git
- **User**: gilles
- **Token PAT** : 8bb9ee27860bd2f66c4113406dbcc0d545ba6ac6
- **Push**:
```bash
git remote set-url origin "https://gilles:8bb9ee27860bd2f66c4113406dbcc0d545ba6ac6@git.maison43gil.com/gilles/SentinelMesh.git"
git push origin main
git remote set-url origin "https://git.maison43gil.com/gilles/SentinelMesh.git"
```

---

## Working rules

- All replies, code comments, and commit messages are written **in French only**
- Commit and push **after each logical step**, not only at the end of a phase
- Always prefix shell commands with `rtk` (e.g. `rtk cargo build`, `rtk git status`)

---

## Technical stack

| Layer | Technologies |
|--------|-------------|
| Backend | Rust, Axum 0.8, Tokio, Serde JSON, SQLx 0.8, SQLite |
| Agents | Rust, Tokio, sysinfo 0.32, rumqttc 0.24, serde_yaml |
| Widgets | Vanilla HTML/JS, Glance `custom-api` |
| API | REST JSON `/api/v1/`, OpenAPI (utoipa 5), SSE, Prometheus |
| MQTT | rumqttc, QoS 0 (realtime) / QoS 1 (events) |
| Deployment | Docker Compose, multi-arch: amd64, arm64, armv7 |
| CI/CD | Gitea Actions (GitHub Actions-compatible), cross-rs |

---

## Cargo workspace structure

```
Cargo.toml                       # workspace, resolver = "2"
├── backend/                     # sentinelmesh-backend
├── agents/agent-scan-network/
└── agents/agent-metric/
```

Shared dependencies (`workspace.dependencies`): tokio (features=full), serde (derive), serde_json, anyhow, tracing, tracing-subscriber (env-filter)

---

## Backend (`backend/`)

### Cargo dependencies

```toml
axum = "0.8"
sqlx = { version = "0.8", features = ["sqlite", "runtime-tokio", "macros", "migrate"] }
tower-http = { version = "0.6", features = ["cors", "trace"] }
utoipa = { version = "5", features = ["axum_extras"] }
chrono = { version = "0.4", features = ["serde"] }
tokio-stream = { version = "0.1", features = ["sync"] }
futures = "0.3"
```

### Environment variables

| Variable | Default | Description |
|----------|--------|-------------|
| `DATABASE_URL` | `sqlite://sentinelmesh.sqlite` | SQLite path |
| `LISTEN_ADDR` | `0.0.0.0:8080` | Listen address |
| `RUST_LOG` | `info` | Log level |

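
The defaults in the table above could be applied with a small helper like the one below. This is a minimal sketch using only the standard library: the `Settings` struct, `env_or`, and `load_settings` are hypothetical names, and treating an empty variable as unset is an assumption rather than documented backend behavior.

```rust
use std::env;

/// Returns the value of `key`, or `default` when the variable is unset or empty.
/// (Treating an empty value as unset is an assumption of this sketch.)
fn env_or(key: &str, default: &str) -> String {
    match env::var(key) {
        Ok(value) if !value.is_empty() => value,
        _ => default.to_string(),
    }
}

/// Hypothetical settings struct; field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Settings {
    database_url: String,
    listen_addr: String,
    rust_log: String,
}

/// Reads the three documented variables, falling back to the documented defaults.
fn load_settings() -> Settings {
    Settings {
        database_url: env_or("DATABASE_URL", "sqlite://sentinelmesh.sqlite"),
        listen_addr: env_or("LISTEN_ADDR", "0.0.0.0:8080"),
        rust_log: env_or("RUST_LOG", "info"),
    }
}
```

Calling `load_settings()` with no variables set would yield the three default values listed in the table.
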
### Internal architecture

```
src/
├── main.rs              # AppState::new(), router, listener
├── db.rs                # SqliteConnectOptions::create_if_missing(true) + migrate!()
├── error.rs             # AppError(anyhow::Error) → JSON {"error":"..."}
├── models.rs            # Agent, Device, Metric, Event + Push* structs
├── state.rs             # AppState { db, metrics_tx, network_tx } + FromRef<AppState> for SqlitePool
└── routes/
    ├── mod.rs           # api_router(), 14 routes
    ├── health.rs        # GET /api/v1/health → {"status":"ok","version":"0.1.0"}
    ├── agents.rs        # GET+POST /api/v1/agents, GET /api/v1/agents/{id}
    ├── network.rs       # GET+POST /api/v1/network, GET /api/v1/network/{ip}
    ├── metrics.rs       # GET+POST /api/v1/metrics, GET /api/v1/metrics/{agent_id}
    ├── history.rs       # GET /api/v1/history/{agent_id}?hours=N (max 168 h, 7-day retention)
    ├── events.rs        # GET+POST /api/v1/events
    ├── widgets.rs       # GET /api/v1/widgets/network + /metrics
    ├── sse.rs           # GET /api/v1/stream, merged metrics + network SSE
    └── prometheus.rs    # GET /metrics, Prometheus text/plain format
```

### Full route list

| Method | Endpoint | Description |
|---------|----------|-------------|
| GET | `/api/v1/health` | Health + version |
| GET/POST | `/api/v1/agents` | List / registration |
| GET | `/api/v1/agents/{id}` | Agent detail |
| GET/POST | `/api/v1/network` | Devices / scan push |
| GET | `/api/v1/network/{ip}` | Device detail |
| GET/POST | `/api/v1/metrics` | Current metrics / push |
| GET | `/api/v1/metrics/{agent_id}` | Metrics for one agent |
| GET | `/api/v1/history/{agent_id}` | History (param: `?hours=24`) |
| GET/POST | `/api/v1/events` | Events / push |
| GET | `/api/v1/widgets/network` | Glance network widget |
| GET | `/api/v1/widgets/metrics` | Glance metrics widget |
| GET | `/api/v1/stream` | Real-time SSE |
| GET | `/metrics` | Prometheus scrape |
| GET | `/api-docs/openapi.json` | OpenAPI spec |

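
The `hours` parameter of the history route is capped at 168 h. The sketch below shows one way such a cap could be enforced. The function name, the default of 24 (inferred from the `?hours=24` example), and the lower bound of 1 are assumptions, not confirmed details of `history.rs`.

```rust
/// Upper bound from the route summary: 168 h, i.e. the 7-day retention window.
const MAX_HOURS: u32 = 168;
/// Assumed default, inferred from the `?hours=24` example in the route table.
const DEFAULT_HOURS: u32 = 24;

/// Resolves the optional `hours` query parameter to a value in 1..=168.
/// The lower bound of 1 is an assumption of this sketch.
fn effective_hours(requested: Option<u32>) -> u32 {
    requested.unwrap_or(DEFAULT_HOURS).clamp(1, MAX_HOURS)
}
```

With this helper, a request for `?hours=500` would be served as if it asked for 168 hours.
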
### SQLite schema (`migrations/`)

**001_init.sql**: main tables
- `agents` (id PK, hostname, agent_type CHECK IN ('scan-network','metric'), ip, version, status CHECK IN ('online','offline'), last_seen, created_at)
- `devices` (ip PK, mac, hostname, vendor, state CHECK IN ('online','offline','sleep','unknown'), services JSON, open_ports JSON, last_seen, first_seen)
- `metrics` (agent_id PK REFERENCES agents, timestamp, cpu_percent, ram_percent, load_avg, temperature_c, disk_percent, net_rx_bps, net_tx_bps, extra JSON); one row per agent, overwritten on each push
- `events` (id AUTOINCREMENT, agent_id, event_type, timestamp, data JSON)
- Indexes: idx_events_agent, idx_events_ts, idx_devices_state

**002_history.sql**: time series
- `metrics_history` (id AUTOINCREMENT, agent_id, timestamp, cpu_percent, ram_percent, disk_percent, temperature_c, net_rx_bps, net_tx_bps)
- Index: idx_mh_agent_ts ON (agent_id, timestamp DESC)
- 7-day retention, probabilistic purge with probability 1/60 (about once per minute with pushes every 1 s)

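
The retention rule above combines a fixed cutoff with a purge that runs on roughly one push out of 60. A minimal sketch of both pieces follows. It assumes timestamps are Unix seconds and that the caller supplies a uniformly distributed random integer, neither of which is confirmed by the schema description. Both function names are hypothetical.

```rust
/// 7 days, expressed in seconds.
const RETENTION_SECS: u64 = 7 * 24 * 3600;

/// Oldest timestamp (Unix seconds, an assumption) that is still kept.
fn retention_cutoff(now_unix: u64) -> u64 {
    now_unix.saturating_sub(RETENTION_SECS)
}

/// Decides whether this push should trigger a purge.
/// `roll` is expected to be a uniformly distributed random integer,
/// so the result is true for about 1 push in 60.
fn should_purge(roll: u32) -> bool {
    roll % 60 == 0
}
```

When `should_purge` returns true, rows of `metrics_history` older than `retention_cutoff(now)` would be deleted.
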
### AppState / FromRef pattern

```rust
// state.rs: lets routes that take State<SqlitePool> coexist with State<AppState>
use axum::extract::FromRef;
use sqlx::SqlitePool;
use tokio::sync::broadcast;

impl FromRef<AppState> for SqlitePool {
    fn from_ref(state: &AppState) -> Self { state.db.clone() }
}

// SSE broadcast channels: capacity of 128 messages each
let (metrics_tx, _) = broadcast::channel(128);
let (network_tx, _) = broadcast::channel(128);
```

### SSE (`routes/sse.rs`)

- Merges two `BroadcastStream`s (metrics + network) via `StreamExt::merge()`
- Named events: `metrics` and `network`
- `Sse::new(merged).keep_alive(KeepAlive::default())`
- JS client: `new EventSource("/api/v1/stream")` then `es.addEventListener("metrics", ...)`

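
For reference, a named event reaches the `EventSource` client as a small text frame. Axum builds these frames itself, so the hypothetical `sse_frame` helper below is only an illustration of the wire format, not code from `sse.rs`.

```rust
/// Builds one Server-Sent Events frame for a named event.
/// Each line of the payload becomes its own `data:` field, and a blank
/// line terminates the event.
fn sse_frame(event: &str, data: &str) -> String {
    let mut frame = format!("event: {event}\n");
    for line in data.lines() {
        frame.push_str("data: ");
        frame.push_str(line);
        frame.push('\n');
    }
    frame.push('\n');
    frame
}
```

A frame built with the event name `metrics` is what triggers the `es.addEventListener("metrics", ...)` callback on the client side.
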
### Prometheus (`routes/prometheus.rs`)

- text/plain output built by hand (no external crate)
- Header `text/plain; version=0.0.4; charset=utf-8`
- Metrics: `sentinelmesh_cpu_percent`, `ram_percent`, `disk_percent`, `temperature_c`, `net_rx_bps`, `net_tx_bps`
- Labels: `agent="<id>",hostname="<hostname>"`
- Compatible with `prometheus.yml`: `targets: ['sentinelmesh:8080']`

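
Building the exposition format by hand with `format!` requires escaping the literal braces around the labels as `{{` and `}}`, and using named arguments avoids a positional `{}` left without a value. The `prom_line` function below is a hypothetical sketch of one sample line. It does not escape quotes or backslashes inside label values, which a complete implementation would need to do.

```rust
/// Formats one sample in the Prometheus text exposition format.
/// `{{` and `}}` produce literal braces; `{name}`, `{agent}`, `{hostname}`
/// and `{value}` are named arguments captured from the function parameters.
fn prom_line(name: &str, agent: &str, hostname: &str, value: f64) -> String {
    format!("{name}{{agent=\"{agent}\",hostname=\"{hostname}\"}} {value}\n")
}
```
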

---

## Agent scan-network (`agents/agent-scan-network/`)

### Additional dependencies

```toml
serde_yaml = "*"
reqwest = { version = "0.12", features = ["json"] }
chrono = "0.4"
ipnetwork = "0.20"
rumqttc = "0.24"
```

### Modules

```
src/
├── main.rs      # scan loop + MQTT push
├── config.rs    # Config: backend, agent, scan, api, mqtt: Option<MqttConfig>
├── scanner.rs   # scan_all(), ping(), read_arp_table(), scan_ports(), detect_services()
├── oui.rs       # LazyLock<HashMap<&str,&str>> with ~70 OUI prefixes
├── backend.rs   # BackendClient: register(), push_devices(); local_ip() via UDP 8.8.8.8:80
├── api.rs       # SharedState = Arc<RwLock<Vec<DiscoveredDevice>>>, GET /devices on :9100
└── mqtt.rs      # publish_scan() → sentinelmesh/<host>/network/scan (QoS 0)
```

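
The `oui.rs` table maps the first three octets of a MAC address to a vendor. The sketch below shows the general shape of such a lookup with `std::sync::LazyLock`. The two entries are an illustrative subset, and `vendor_for_mac` with its normalization rules is hypothetical rather than the module's actual API.

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

/// Illustrative subset only; the real table is described as holding ~70 prefixes.
static OUI: LazyLock<HashMap<&'static str, &'static str>> = LazyLock::new(|| {
    HashMap::from([
        ("B8:27:EB", "Raspberry Pi"),
        ("DC:A6:32", "Raspberry Pi"),
    ])
});

/// Looks up the vendor from the first three octets of a MAC address.
/// Accepts lower-case input and `-` separators (an assumption of this sketch).
fn vendor_for_mac(mac: &str) -> Option<&'static str> {
    let prefix = mac.get(..8)?.to_ascii_uppercase().replace('-', ":");
    OUI.get(prefix.as_str()).copied()
}
```

The map is built once on first access, so repeated lookups during a scan only pay for the hash lookup.
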
### Configuration (`config.example.yaml`)

```yaml
backend:
  url: http://localhost:8080
  token: ""
agent:
  id: ""          # auto: scan-<hostname>
  hostname: ""    # auto: /etc/hostname
scan:
  subnets: [10.0.0.0/22]
  interval_seconds: 60
  ping_timeout_ms: 1000
  service_timeout_ms: 300
  concurrency: 50
  ports: [22, 80, 443, 445, 2049, 1883, 2375, 8006, 8123, 3000, 9090, 9100]
api:
  listen: "0.0.0.0:9100"
mqtt:
  enabled: false
  broker: "localhost"
  port: 1883
  topic_prefix: "sentinelmesh"
  client_id: ""   # auto: sentinelmesh-scan-<hostname>
```

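
Several fields in this file fall back to a value derived from the hostname when left empty. A minimal sketch of that rule follows. The `or_auto` helper is hypothetical, and treating whitespace-only values as empty is an assumption.

```rust
/// Returns the configured value, or `prefix + hostname` when the value is empty.
/// Whitespace-only values are treated as empty (an assumption of this sketch).
fn or_auto(configured: &str, prefix: &str, hostname: &str) -> String {
    if configured.trim().is_empty() {
        format!("{prefix}{hostname}")
    } else {
        configured.to_string()
    }
}
```

For a host named `nas`, `or_auto("", "scan-", "nas")` would give the agent id and `or_auto("", "sentinelmesh-scan-", "nas")` the MQTT client id.
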
### Rootless ping technique

No ICMP is used, since it requires CAP_NET_RAW. Instead, the agent attempts TCP connections on ports 80/22/443/8080 with a configurable timeout, and combines this with reading `/proc/net/arp` to catch hosts that do not answer.

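
Both halves of this technique can be sketched with the standard library alone. The function names below are hypothetical, and counting a host as alive only when a connection succeeds is an assumption; the real scanner may also treat a refused connection as a sign of life.

```rust
use std::net::{IpAddr, SocketAddr, TcpStream};
use std::time::Duration;

/// Ports probed, as listed in the text above.
const PROBE_PORTS: [u16; 4] = [80, 22, 443, 8080];

/// Returns true if a TCP connection to `ip:port` succeeds within `timeout`.
fn tcp_open(ip: IpAddr, port: u16, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&SocketAddr::new(ip, port), timeout).is_ok()
}

/// A host counts as reachable when any probe port accepts a connection.
fn tcp_ping(ip: IpAddr, timeout: Duration) -> bool {
    PROBE_PORTS.iter().any(|&port| tcp_open(ip, port, timeout))
}

/// Parses the text of `/proc/net/arp` into (ip, mac) pairs.
/// The header line and incomplete entries (flags `0x0`) are skipped.
fn parse_arp_table(text: &str) -> Vec<(String, String)> {
    text.lines()
        .skip(1)
        .filter_map(|line| {
            let cols: Vec<&str> = line.split_whitespace().collect();
            match cols.as_slice() {
                [ip, _hw_type, flags, mac, ..] if *flags != "0x0" => {
                    Some((ip.to_string(), mac.to_string()))
                }
                _ => None,
            }
        })
        .collect()
}
```

A scan pass would call `tcp_ping` for each address in the configured subnets with `ping_timeout_ms` as the timeout, then merge in any addresses found by `parse_arp_table`.
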
### Services detected by port

22→SSH, 80→HTTP, 443→HTTPS, 445→SMB, 2049→NFS, 1883→MQTT, 2375→Docker-API, 8006→Proxmox, 8123→Home-Assistant, 3000→Grafana, 9090→Prometheus, 9100→Node-Exporter

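
The mapping above translates directly into a `match`. The helper names `service_name` and `services_for` are hypothetical and do not claim to match the signatures in `scanner.rs`.

```rust
/// Maps a well-known port to the service label listed above.
fn service_name(port: u16) -> Option<&'static str> {
    match port {
        22 => Some("SSH"),
        80 => Some("HTTP"),
        443 => Some("HTTPS"),
        445 => Some("SMB"),
        2049 => Some("NFS"),
        1883 => Some("MQTT"),
        2375 => Some("Docker-API"),
        8006 => Some("Proxmox"),
        8123 => Some("Home-Assistant"),
        3000 => Some("Grafana"),
        9090 => Some("Prometheus"),
        9100 => Some("Node-Exporter"),
        _ => None,
    }
}

/// Labels every open port that has a known service, preserving input order.
fn services_for(open_ports: &[u16]) -> Vec<&'static str> {
    open_ports.iter().filter_map(|&p| service_name(p)).collect()
}
```

Ports without a known label are simply dropped from the result.
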
### Published MQTT messages

Topic: `sentinelmesh/<hostname>/network/scan`

Payload: `{"total": N, "online": M, "devices": [...]}`

---

## Agent metric (`agents/agent-metric/`)

### Additional dependencies

```toml
serde_yaml = "*"
reqwest = { version = "0.12", features = ["json"] }
chrono = "0.4"
sysinfo = "0.32"
rumqttc = "0.24"
```

### Modules

```
src/
├── main.rs              # tokio::select! rt_ticker + med_ticker
├── config.rs            # Config: backend, agent, intervals, api, mqtt
├── backend.rs           # push_realtime/medium/static/event
├── api.rs               # AgentState {hardware,realtime,medium}, GET /metrics on :9101
├── mqtt.rs              # publish_realtime/medium/event
└── collectors/
    ├── realtime.rs      # CPU, RAM, network, load (every 1 s)
    ├── medium.rs        # disks, hwmon, SMART (every 30 min)
    └── static_info.rs   # DMI, CPU model, total RAM (boot + 12 h)
```

### Collection frequencies

| Interval | Data |
|-----------|---------|
| 1 s (`realtime_ms: 1000`) | CPU %, RAM %, network rx/tx throughput (bps), load average |
| 30 min (`medium_s: 1800`) | Disk usage %, hwmon temperatures, SMART |
| Boot + 12 h | hostname, DMI, CPU model/cores, total RAM, network interfaces, BIOS, OS |
| Instant | boot, shutdown, sleep, and wake events |

### System sources

- **CPU/RAM**: `sysinfo::System::new_all()`, `refresh_cpu_usage()`, `refresh_memory()`
- **Network**: `sysinfo::Networks::new_with_refreshed_list()`, byte delta per second
- **Disks**: `sysinfo::Disks::new_with_refreshed_list()`, `refresh()` (no arguments in 0.32)
- **Temperatures**: `/sys/class/hwmon/hwmonN/tempN_input` (divided by 1000 → °C)
- **SMART**: `smartctl -H /dev/sdX` subprocess → status ok/warn/unknown
- **DMI**: `/sys/devices/virtual/dmi/id/` (board_name, sys_vendor, bios_version, etc.)
- **hostname**: `/etc/hostname`

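
Two of these conversions are simple enough to sketch without `sysinfo`: turning a hwmon reading into degrees Celsius, and deriving a per-second rate from two cumulative byte counters. The function names are hypothetical, and returning 0 when a counter goes backwards (for example after an interface reset) is an assumption.

```rust
/// Converts the content of a hwmon `tempN_input` file (millidegrees Celsius) to °C.
fn parse_hwmon_temp(raw: &str) -> Option<f64> {
    raw.trim().parse::<i64>().ok().map(|milli| milli as f64 / 1000.0)
}

/// Throughput in bytes per second from two cumulative byte counters.
/// Returns 0.0 for a non-positive interval or a counter that went backwards.
fn bytes_per_sec(previous: u64, current: u64, elapsed_secs: f64) -> f64 {
    if elapsed_secs <= 0.0 {
        return 0.0;
    }
    current.saturating_sub(previous) as f64 / elapsed_secs
}
```

For example, a file containing `45500` corresponds to 45.5 °C, and counters of 1000 then 3000 bytes one second apart give 2000 bytes/s.
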
### MQTT topics

```
sentinelmesh/<hostname>/metrics/realtime   (JSON, 1s, QoS 0)
sentinelmesh/<hostname>/metrics/medium     (JSON, 30min, QoS 1)
sentinelmesh/<hostname>/events             (JSON, boot/shutdown…, QoS 1)
```

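
The topic layout and QoS levels listed above could be captured in one small type, as sketched here. The `Channel` enum and `topic` function are hypothetical names, and the QoS is returned as a plain integer rather than the `rumqttc` type to keep the example free of external crates.

```rust
/// The three channels published by the metric agent.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Channel {
    Realtime,
    Medium,
    Events,
}

impl Channel {
    /// Topic suffix after `<prefix>/<hostname>/`.
    fn suffix(self) -> &'static str {
        match self {
            Channel::Realtime => "metrics/realtime",
            Channel::Medium => "metrics/medium",
            Channel::Events => "events",
        }
    }

    /// MQTT QoS level as listed above: 0 for realtime, 1 for the others.
    fn qos(self) -> u8 {
        match self {
            Channel::Realtime => 0,
            Channel::Medium | Channel::Events => 1,
        }
    }
}

/// Builds the full topic, e.g. `sentinelmesh/<hostname>/metrics/realtime`.
fn topic(prefix: &str, hostname: &str, channel: Channel) -> String {
    format!("{prefix}/{hostname}/{}", channel.suffix())
}
```
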

---

## Glance widgets (`widgets/`)

### widget-network-scan

- Glance type: `custom-api`
- Endpoint: `http://sentinelmesh/api/v1/widgets/network`
- Recommended cache: `30s`
- Display: state (colored dot), IP, hostname, vendor, services (badges), sorted online/offline
- CSS classes: `sm-state-dot`, `sm-badge`, `collapsible-container` (collapses after 8 items)

### widget-agent-metrics

- Glance type: `custom-api`
- Endpoint: `http://sentinelmesh/api/v1/widgets/metrics`
- Recommended cache: `1s`
- Bars with color thresholds:
  - CPU: ok <60%, warn <85%, crit ≥85%
  - RAM: ok <70%, warn <90%, crit ≥90%
  - Disk: ok <75%, warn <90%, crit ≥90%
  - Temp: ok <70°C, warn <85°C, crit ≥85°C

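
The threshold rules above reduce to a single comparison function parameterized by the warn and crit limits. The widget itself is HTML/JS, so the Rust sketch below only illustrates the logic. All names are hypothetical except the `sm-bar-*` class names, which are the CSS classes used by the widgets.

```rust
/// Severity level of a bar.
#[derive(Debug, PartialEq)]
enum Level {
    Ok,
    Warn,
    Crit,
}

impl Level {
    /// CSS class applied to the bar.
    fn css_class(&self) -> &'static str {
        match self {
            Level::Ok => "sm-bar-ok",
            Level::Warn => "sm-bar-warn",
            Level::Crit => "sm-bar-crit",
        }
    }
}

/// ok below `warn_at`, warn below `crit_at`, crit at or above `crit_at`.
fn classify(value: f64, warn_at: f64, crit_at: f64) -> Level {
    if value >= crit_at {
        Level::Crit
    } else if value >= warn_at {
        Level::Warn
    } else {
        Level::Ok
    }
}

fn cpu_level(percent: f64) -> Level { classify(percent, 60.0, 85.0) }
fn ram_level(percent: f64) -> Level { classify(percent, 70.0, 90.0) }
fn disk_level(percent: f64) -> Level { classify(percent, 75.0, 90.0) }
fn temp_level(celsius: f64) -> Level { classify(celsius, 70.0, 85.0) }
```

Keeping the limits as parameters means each metric only differs by its two numbers.
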
### Custom CSS (`sentinelmesh.css`)

- `.sm-state-dot`: colored dot (online = var(--color-positive), offline = opacity 0.5)
- `.sm-badge`: service text badge
- `.sm-bar-track` / `.sm-bar`: animated bar with a CSS transition
- `.sm-bar-ok/warn/crit`: colors by threshold

---

## Deployment

### Production Docker Compose (`docker-compose.yml`)

```yaml
context: .                      # workspace root (IMPORTANT)
dockerfile: backend/Dockerfile
ports: ["8080:8080"]
volumes: [sentinelmesh-data:/data]
environment:
  DATABASE_URL: sqlite:///data/sentinelmesh.sqlite
  LISTEN_ADDR: 0.0.0.0:8080
healthcheck: wget -qO- http://localhost:8080/api/v1/health
restart: unless-stopped
```

### Backend Dockerfile

- Base image: `rust:1.86-alpine` (edition 2024, plus dependencies that require Rust ≥ 1.86)
- Layer-cache pattern: `fn main(){}` stubs for every workspace member → build dependencies → copy the real sources → rebuild
- Final image: `alpine:3.21`, `mkdir /data`, binary and migrations copied in

### Agent Dockerfiles

- Multi-arch via `ARG TARGETPLATFORM` + shell `case` → Rust musl target
- `agent-scan-network`: final image is alpine + iputils (ping)
- `agent-metric`: final image is alpine + smartmontools; requires `--privileged` for `/sys`

### Install script (`install/install.sh`)

Usage: `curl -fsSL http://<backend>:8080/install.sh | sudo bash -s -- --server URL --token TOKEN --agent-type TYPE`

Steps:
1. Architecture detection (x86_64/aarch64/armv7l) → binary download from the Gitea releases
2. Creation of the YAML config in `/etc/sentinelmesh/<type>.yaml` (chmod 600)
3. Creation of the systemd service `/etc/systemd/system/sentinelmesh-<type>.service`
   - `agent-scan-network`: `AmbientCapabilities=CAP_NET_RAW CAP_NET_ADMIN`
4. `daemon-reload + enable + restart`
5. Registration via POST `/api/v1/agents`

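
Step 1 amounts to mapping the machine architecture to one of the three release targets. The script itself is shell, so the Rust function below only illustrates the mapping. Pairing each `uname -m` value with the target triple of the same architecture from the release workflow is an assumption, and `release_target` is a hypothetical name.

```rust
/// Maps the output of `uname -m` to the release target built for that
/// architecture. Returns None for unsupported machines.
fn release_target(uname_m: &str) -> Option<&'static str> {
    match uname_m.trim() {
        "x86_64" => Some("x86_64-unknown-linux-gnu"),
        "aarch64" => Some("aarch64-unknown-linux-gnu"),
        "armv7l" => Some("armv7-unknown-linux-gnueabihf"),
        _ => None,
    }
}
```
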

---

## CI/CD Gitea Actions

### `.gitea/workflows/ci.yaml` (on push/PR to main)

- `cargo check --workspace`
- `cargo clippy --workspace -- -D warnings`
- `cargo fmt --all -- --check`
- `cargo test --workspace` (DATABASE_URL: sqlite://:memory:)

### `.gitea/workflows/release.yaml` (on `v*` tags)

- 3-target matrix: x86_64-unknown-linux-gnu, aarch64-unknown-linux-gnu, armv7-unknown-linux-gnueabihf
- Build with `cross-rs` (`cross build --release`)
- 9 binaries produced: (backend + agent-scan-network + agent-metric) × 3 architectures
- Gitea release created automatically, with a Markdown table of the binaries

---

## Fixed bugs (reference)

| Bug | Cause | Fix |
|----|-------|----------|
| `Router: From<SwaggerUi>` trait error | utoipa-swagger-ui 8 incompatible with axum 0.8 | Removed; the JSON is served manually |
| `refresh(true)` does not compile | sysinfo 0.32 API changed | `refresh()` with no argument |
| `edition2024` not supported | Rust 1.82 too old | Dockerfile → rust:1.86-alpine |
| `idna_adapter requires rustc 1.86` | Transitive dependency | Rust 1.86 required |
| SQLite "unable to open database file" | File not created on first start | `SqliteConnectOptions::create_if_missing(true)` |
| Docker build fails (workspace) | `./backend` context lacks the root Cargo.lock | Context `.` (root), workspace stubs |
| `/data` not found in the container | Directory not created in the image | `mkdir -p /data` in the Dockerfile |
| Stray commit pushed to Gitea | `.claude/skills/` files not gitignored | Reset + force push + extended `.gitignore` |
| HTTPS push fails (special character) | Password containing `*` in the URL | Use a PAT instead |
| `publish_device_online` warning | Method never called | `#[allow(dead_code)]` |
| Prometheus format string error | Positional `{}` without an argument | Named: `{name}` |

---

## Status at the time of writing (May 2026)

- All 6 phases complete and pushed to Gitea
- Tag `v0.1.0` created → multi-arch release pipeline triggered
- Docker deployment test successful: backend operational, endpoints validated
- Files excluded from the repository via `.gitignore`: `.claude/`, `doc_brainstorming/`, `repo_glance/`, `consigne_claude_project_sentinelmesh_md.md`
- `tokengite.md` is versioned (token in clear text, an explicit choice by the user)

## Future features (deferred from the ROADMAP)

- Home Assistant MQTT auto-discovery
- PostgreSQL (SQLite is sufficient for a homelab)
- Bidirectional WebSocket
- Direct InfluxDB / Grafana integration
- Detailed Glance popups (requires an `extension` widget with a separate HTTP server)
- Local icons per device type
- Reverse DNS resolution (PTR)