Complete implementation of the observability stack for monitoring the multi-tenant Classeo platform.

## Error Tracking (GlitchTip)

- Sentry SDK integration with self-hosted GlitchTip
- PII scrubber applied before sending (GDPR: emails, JWT tokens, French NIR)
- Enriched context: tenant_id, user_id, correlation_id
- Backend (sentry.yaml) and frontend (sentry.ts) configuration

## Metrics (Prometheus)

- /metrics endpoint with IP restriction in production
- HTTP metrics: requests_total, request_duration_seconds (histogram)
- Security metrics: login_failures_total per tenant
- Health metrics: health_check_status (postgres, redis, rabbitmq)
- Redis storage for persistence across requests

## Logs (Loki)

- Monolog processors: CorrelationIdLogProcessor, PiiScrubberLogProcessor
- PII detection: emails, French phone numbers, JWT tokens, French NIR
- Structured labels: tenant_id, correlation_id, level

## Dashboards (Grafana)

- Main dashboard: P50/P95/P99 latency, error rate, RPS
- Per-tenant dashboard: metrics isolated by subdomain
- Infrastructure dashboard: postgres/redis/rabbitmq health
- Datasources with fixed UIDs for portability

## Alerts (Alertmanager)

- HighApiLatencyP95/P99: SLA monitoring (200ms/500ms)
- HighErrorRate: error rate > 1% for 2 min
- ExcessiveLoginFailures: brute-force detection
- ApplicationUnhealthy: health check failures

## Infrastructure

- InfrastructureHealthChecker: shared service (DRY)
- HealthCheckController: /health endpoint for load balancers
- Pre-push hook: make ci && make e2e before push
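
The PII scrubbing mentioned for both GlitchTip events and Loki logs can be illustrated with a small sketch. This is not the project's PiiScrubberLogProcessor (a Monolog processor, so presumably PHP); the regular expressions, the placeholder strings, and the `scrub` / `scrub_context` names below are assumptions chosen for illustration. The NIR pattern is deliberately simplified: it expects an unspaced 13 or 15 digit number.

```python
import re

# Hypothetical patterns: the real processor may detect PII differently.
# JWT: three base64url segments; header and payload both start with "eyJ".
JWT_RE = re.compile(r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# French NIR (simplified): sex, year, month, department (incl. 2A/2B),
# commune + order number (6 digits), optional 2-digit key.
NIR_RE = re.compile(r"\b[12]\d{2}(?:0[1-9]|1[0-2])(?:\d{2}|2[AB])\d{6}(?:\d{2})?\b")
# French phone: 0X XX XX XX XX or +33 X XX XX XX XX, with optional space/dot separators.
PHONE_RE = re.compile(r"(?:\+33[ .]?|\b0)[1-9](?:[ .]?\d{2}){4}\b")

# Order matters: longer, more specific tokens are replaced first.
RULES = [
    (JWT_RE, "[JWT]"),
    (EMAIL_RE, "[EMAIL]"),
    (NIR_RE, "[NIR]"),
    (PHONE_RE, "[PHONE]"),
]


def scrub(text: str) -> str:
    """Replace every detected PII token in a string with a placeholder."""
    for pattern, placeholder in RULES:
        text = pattern.sub(placeholder, text)
    return text


def scrub_context(context: dict) -> dict:
    """Scrub string values in a log context; leave other types untouched."""
    return {
        key: scrub(value) if isinstance(value, str) else value
        for key, value in context.items()
    }
```

Identifiers such as tenant_id and correlation_id pass through unchanged, so the structured labels listed above remain usable for filtering after scrubbing.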
# Loki Configuration for Classeo
# NFR-OB4: Log retention 30 days

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  log_level: info

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://alertmanager:9093

# NFR-OB4: 30 days retention
limits_config:
  retention_period: 720h # 30 days
  max_query_length: 721h
  max_query_parallelism: 32
  max_entries_limit_per_query: 10000
  ingestion_rate_mb: 4
  ingestion_burst_size_mb: 6

compactor:
  working_directory: /loki/compactor
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
  delete_request_store: filesystem

analytics:
  reporting_enabled: false
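
The NFR-OB4 comments tie `retention_period: 720h` to a 30-day requirement, and `max_query_length: 721h` sits just above it so that a query can span the whole retained window. A quick way to check that arithmetic is sketched below. The `parse_duration` and `check_retention` helpers are hypothetical, handle only `h`, `m` and `s` units, and take values copied by hand from the config rather than parsed from the YAML file.

```python
import re
from datetime import timedelta

# Maps a duration suffix to the matching timedelta keyword argument.
_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(value: str) -> timedelta:
    """Parse a simple duration such as '720h' or '1h30m' (h, m, s only)."""
    parts = re.findall(r"(\d+)([hms])", value)
    # Reject anything the pattern did not fully consume (e.g. '30d', '').
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    total = timedelta()
    for amount, unit in parts:
        total += timedelta(**{_UNITS[unit]: int(amount)})
    return total


def check_retention(retention: timedelta, max_query: timedelta,
                    required_days: int = 30) -> list[str]:
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    if retention < timedelta(days=required_days):
        problems.append("retention_period is shorter than the required retention")
    if max_query < retention:
        problems.append("max_query_length cannot cover the full retention window")
    return problems


# Values copied from limits_config above.
retention = parse_duration("720h")
max_query = parse_duration("721h")
print(check_retention(retention, max_query))
```

With the values from this file the check reports no problems: 720 hours is exactly 30 days, and 721 hours exceeds it by one hour.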