# Data Retention Service

A lightweight, standalone microservice responsible for automated data cleanup based on tenant retention policies.

## Overview

This service runs as a separate Docker container and performs the following functions:

- **Automated Cleanup**: Daily scheduled cleanup at 2:00 AM UTC
- **Tenant-Aware**: Respects individual tenant retention policies
- **Lightweight**: Minimal resource footprint (~64-128MB RAM)
- **Resilient**: Continues operation even if individual tenant cleanups fail
- **Logged**: Comprehensive logging and health monitoring

## Architecture

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Main Backend   │    │ Data Retention  │    │   PostgreSQL    │
│   Container     │    │    Service      │    │    Database     │
│                 │    │                 │    │                 │
│ • API Endpoints │    │ • Cron Jobs     │◄──►│ • tenant data   │
│ • Business Logic│    │ • Data Cleanup  │    │ • detections    │
│ • Rate Limiting │    │ • Health Check  │    │ • heartbeats    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

## Features

### 🕒 Scheduled Operations
- Runs daily at 2:00 AM UTC via cron job
- Configurable immediate cleanup for development/testing
- Graceful shutdown handling

### 🏢 Multi-Tenant Support
- Processes all active tenants
- Respects individual retention policies:
  - `-1` = Unlimited retention (no cleanup)
  - `N` = Delete data older than N days
  - Default: 90 days if not specified

### 🧹 Data Cleanup
- **Drone Detections**: Historical detection records
- **Heartbeats**: Device connectivity logs
- **Security Logs**: Audit trail entries (if applicable)

### 📊 Monitoring & Health
- Built-in health checks for Docker
- Memory usage monitoring
- Cleanup statistics tracking
- Error logging with tenant context

## Configuration

### Environment Variables

```bash
# Database Connection
DB_HOST=postgres         # Database host
DB_PORT=5432             # Database port
DB_NAME=drone_detection  # Database name
DB_USER=postgres         # Database user
DB_PASSWORD=password     # Database password

# Service Settings
NODE_ENV=production      # Environment mode
IMMEDIATE_CLEANUP=false  # Run cleanup on startup
LOG_LEVEL=info           # Logging level
```

### Docker Compose Integration

```yaml
data-retention:
  build:
    context: ./data-retention-service
  container_name: drone-detection-data-retention
  restart: unless-stopped
  environment:
    DB_HOST: postgres
    DB_PORT: 5432
    DB_NAME: drone_detection
    DB_USER: postgres
    DB_PASSWORD: your_secure_password
  depends_on:
    postgres:
      condition: service_healthy
  deploy:
    resources:
      limits:
        memory: 128M
```

## Usage

### Start with Docker Compose

```bash
# Start all services including data retention
docker-compose up -d

# Start only data retention service
docker-compose up -d data-retention

# View logs
docker-compose logs -f data-retention
```

### Manual Container Build

```bash
# Build the container
cd data-retention-service
docker build -t data-retention-service .

# Run the container
docker run -d \
  --name data-retention \
  --env-file .env \
  --network drone-network \
  data-retention-service
```

### Development Mode

```bash
# Install dependencies
npm install

# Run with immediate cleanup
IMMEDIATE_CLEANUP=true npm start

# Run in development mode
npm run dev
```

## Logging Output

### Startup
```
🗂️ Starting Data Retention Service...
📅 Environment: production
💾 Database: postgres:5432/drone_detection
✅ Database connection established
⏰ Scheduled cleanup: Daily at 2:00 AM UTC
✅ Data Retention Service started successfully
```

### Cleanup Operation
```
🧹 Starting data retention cleanup...
⏰ Cleanup started at: 2024-09-23T02:00:00.123Z
🏢 Found 5 active tenants to process
🧹 Cleaning tenant acme-corp - removing data older than 90 days
✅ Tenant acme-corp: Deleted 1250 detections, 4500 heartbeats, 89 logs
⏭️ Skipping tenant enterprise-unlimited - unlimited retention
✅ Data retention cleanup completed
⏱️ Duration: 2347ms
📊 Deleted: 2100 detections, 8900 heartbeats, 156 logs
```

### Health Monitoring
```
💚 Health Check - Uptime: 3600s, Memory: 45MB, Last Cleanup: 2024-09-23T02:00:00.123Z
```

## API Integration

The main backend provides endpoints to interact with retention policies:

```bash
# Get current tenant limits and retention info
GET /api/tenant/limits

# Preview what would be deleted
GET /api/tenant/data-retention/preview
```

## Error Handling

### Tenant-Level Errors
- Service continues if individual tenant cleanup fails
- Errors logged with tenant context
- Failed tenants skipped, others processed normally

### Service-Level Errors
- Database connection issues cause service restart
- Health checks detect and report issues
- Graceful shutdown on container stop signals

### Example Error Log
```
❌ Error cleaning tenant problematic-tenant: SequelizeTimeoutError: Query timeout
⚠️ Errors encountered: 1
   - problematic-tenant: Query timeout
```

## Performance

### Resource Usage
- **Memory**: 64-128MB typical usage
- **CPU**: Minimal, only during cleanup operations
- **Storage**: Logs rotate automatically
- **Network**: Database queries only

### Cleanup Performance
- Batch operations for efficiency
- Indexed database queries on timestamp fields
- Parallel tenant processing where possible
- Configurable batch sizes for large datasets

## Security

### Database Access
- Read/write access only to required tables
- Connection pooling with limits
- Prepared statements prevent SQL injection

### Container Security
- Non-root user execution
- Minimal base image (node:18-alpine)
- No exposed ports
- Isolated network access

## Monitoring

### Health Checks

The `docker exec` and `docker stats` commands in this README assume the container is named `data-retention`, as in the manual `docker run` example. The Docker Compose configuration above sets `container_name: drone-detection-data-retention`, so substitute that name when running under Compose.

```bash
# Docker health check
docker exec data-retention node healthcheck.js

# Container status
docker-compose ps data-retention

# Service logs
docker-compose logs -f data-retention
```

### Metrics
- Cleanup duration and frequency
- Records deleted per tenant
- Memory usage over time
- Error rates and types

## Troubleshooting

### Common Issues

**Service won't start**
```bash
# Check database connectivity
docker-compose logs postgres
docker-compose logs data-retention

# Verify environment variables
docker-compose config
```

**Cleanup not running**
```bash
# Check cron schedule
docker exec data-retention ps aux | grep cron

# Force immediate cleanup
docker exec data-retention node -e "
const service = require('./index.js');
service.performCleanup();
"
```

**High memory usage**
```bash
# Check cleanup frequency
docker stats data-retention

# Review tenant data volumes
docker exec data-retention node -e "
const { getModels } = require('./database');
// Check tenant data sizes
"
```

### Configuration Validation

```bash
# Test database connection
docker exec data-retention node healthcheck.js

# Verify tenant policies
docker exec -it data-retention node -e "
const { getModels } = require('./database');
(async () => {
  const { Tenant } = await getModels();
  const tenants = await Tenant.findAll();
  console.log(tenants.map(t => ({
    slug: t.slug,
    retention: t.features?.data_retention_days
  })));
})();
"
```

## Migration from Integrated Service

If upgrading from a version where data retention was part of the main backend:

1. **Deploy new container**: Add data retention service to docker-compose.yml
2. **Verify operation**: Check logs for successful startup and database connection
3. **Remove old code**: The integrated service code is automatically disabled
4. **Monitor transition**: Ensure cleanup operations continue normally

The service is designed to be backward compatible and will work with existing tenant configurations without changes.
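
To make the retention rules concrete for existing tenant configurations, the sketch below shows one way the policy described under Multi-Tenant Support could be resolved into a deletion cutoff. It is an illustration only and not the service's actual code: the function `resolveCutoff`, the constant names, and the handling of invalid values are assumptions. Only the `features.data_retention_days` field and the `-1` / `N` / 90-day default semantics come from this README.

```javascript
// Hypothetical helper, not taken from the service's source.
const DEFAULT_RETENTION_DAYS = 90; // documented default when no policy is set
const UNLIMITED = -1;              // documented value for "no cleanup"
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/**
 * Returns the Date before which a tenant's records are eligible for deletion,
 * or null when the tenant has unlimited retention and should be skipped.
 */
function resolveCutoff(tenant, now = new Date()) {
  const configured = tenant.features?.data_retention_days;
  // Fall back to the default when the value is missing or not an integer.
  const days = Number.isInteger(configured) ? configured : DEFAULT_RETENTION_DAYS;

  if (days === UNLIMITED) return null;
  if (days < 0) {
    // Assumption: any other negative value is treated as a configuration error.
    throw new RangeError(`Invalid retention for ${tenant.slug}: ${days}`);
  }
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// Example run with a fixed clock so the result is reproducible.
const now = new Date('2024-09-23T02:00:00.000Z');
const tenants = [
  { slug: 'acme-corp', features: { data_retention_days: 30 } },
  { slug: 'enterprise-unlimited', features: { data_retention_days: -1 } },
  { slug: 'no-policy', features: {} },
];

for (const tenant of tenants) {
  const cutoff = resolveCutoff(tenant, now);
  console.log(tenant.slug, cutoff ? cutoff.toISOString() : 'unlimited (skipped)');
}
// acme-corp 2024-08-24T02:00:00.000Z
// enterprise-unlimited unlimited (skipped)
// no-policy 2024-06-25T02:00:00.000Z
```

A `null` result corresponds to the "Skipping tenant ... unlimited retention" log line shown earlier. Otherwise, records with a timestamp earlier than the returned date would be the ones eligible for deletion.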