Complete implementation of the observability stack for monitoring the multi-tenant Classeo platform.

## Error Tracking (GlitchTip)
- Sentry SDK integration with self-hosted GlitchTip
- PII scrubber applied before sending (GDPR: emails, JWT tokens, French NIR, i.e. the national social security number)
- Enriched context: tenant_id, user_id, correlation_id
- Backend (sentry.yaml) and frontend (sentry.ts) configuration

## Metrics (Prometheus)
- /metrics endpoint with IP restriction in production
- HTTP metrics: requests_total, request_duration_seconds (histogram)
- Security metrics: login_failures_total per tenant
- Health metrics: health_check_status (postgres, redis, rabbitmq)
- Redis storage so metric values persist across requests

## Logs (Loki)
- Monolog processors: CorrelationIdLogProcessor, PiiScrubberLogProcessor
- PII detection: emails, French phone numbers, JWT tokens, French NIR
- Structured labels: tenant_id, correlation_id, level

## Dashboards (Grafana)
- Main dashboard: P50/P95/P99 latency, error rate, RPS
- Per-tenant dashboard: metrics isolated by subdomain
- Infrastructure dashboard: postgres/redis/rabbitmq health
- Datasources with fixed UIDs for portability

## Alerts (Alertmanager)
- HighApiLatencyP95/P99: SLA monitoring (200ms/500ms)
- HighErrorRate: error rate > 1% for 2 min
- ExcessiveLoginFailures: brute-force detection
- ApplicationUnhealthy: health check failures

## Infrastructure
- InfrastructureHealthChecker: shared service (DRY)
- HealthCheckController: /health endpoint for load balancers
- Pre-push hook: make ci && make e2e before push
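The alerts listed above could be expressed as Prometheus rules along the following lines. This is only a sketch, not the project's actual `alerts.yml`: the metric names are taken from the commit message as written (the real ones may carry a namespace prefix), while the `status` and `tenant_id` label names, the `5m` rate windows, the `for:` duration on the latency alert, the login-failure threshold, and the severity values are all assumptions.

```yaml
# Hypothetical alerts.yml sketch. Label names, windows, and the
# brute-force threshold are assumptions, not values from the project.
groups:
  - name: classeo-api
    rules:
      # P95 latency above the 200ms SLA target
      - alert: HighApiLatencyP95
        expr: |
          histogram_quantile(
            0.95,
            sum by (le) (rate(request_duration_seconds_bucket[5m]))
          ) > 0.2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "API P95 latency is above 200ms"

      # Error rate above 1% sustained for 2 minutes
      - alert: HighErrorRate
        expr: |
          sum(rate(requests_total{status=~"5.."}[5m]))
            / sum(rate(requests_total[5m])) > 0.01
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "More than 1% of requests are failing"

      # Possible brute-force attempt against a single tenant
      - alert: ExcessiveLoginFailures
        expr: sum by (tenant_id) (increase(login_failures_total[5m])) > 20
        labels:
          severity: warning
        annotations:
          summary: "Unusual number of login failures for tenant {{ $labels.tenant_id }}"
```

Aggregating the histogram buckets by `le` before calling `histogram_quantile` gives a platform-wide percentile; adding `tenant_id` to the `by` clause would give one series per tenant, if that label exists on the histogram.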
53 lines
1.3 KiB
YAML
# Prometheus Configuration for Classeo
# Scrapes metrics from PHP backend and other services

global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    environment: ${ENVIRONMENT:-development}
    project: classeo

# Alerting configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

# Load alert rules
rule_files:
  - /etc/prometheus/alerts.yml

# Scrape configurations
scrape_configs:
  # Prometheus self-monitoring
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # PHP Backend metrics
  - job_name: 'classeo-backend'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['php:8000']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'classeo-backend'

  # Redis metrics (via redis_exporter would be added in production)
  # For now, we rely on application-level metrics

  # PostgreSQL metrics (via postgres_exporter would be added in production)
  # For now, we rely on application-level metrics

  # RabbitMQ metrics
  - job_name: 'rabbitmq'
    static_configs:
      - targets: ['rabbitmq:15692']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'classeo-rabbitmq'
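The comments in the configuration mention that Redis and PostgreSQL exporters would be added in production. A minimal sketch of what those extra scrape jobs might look like follows. The host names `redis-exporter` and `postgres-exporter` are hypothetical service names, and the ports are the upstream defaults of redis_exporter (9121) and postgres_exporter (9187), not values confirmed by this repository.

```yaml
# Hypothetical entries to append under scrape_configs.
# Host names are assumed; ports are the exporters' upstream defaults.
scrape_configs:
  # Redis metrics via redis_exporter
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']

  # PostgreSQL metrics via postgres_exporter
  - job_name: 'postgresql'
    static_configs:
      - targets: ['postgres-exporter:9187']
```

Until such jobs exist, infrastructure health is only visible through the application-level `health_check_status` metric exposed by the backend.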