Data Retention Service
A lightweight, standalone microservice responsible for automated data cleanup based on tenant retention policies.
Overview
This service runs as a separate Docker container and has the following characteristics:
- Automated Cleanup: Daily scheduled cleanup at 2:00 AM UTC
- Tenant-Aware: Respects individual tenant retention policies
- Lightweight: Minimal resource footprint (~64-128MB RAM)
- Resilient: Continues operation even if individual tenant cleanups fail
- Logged: Comprehensive logging and health monitoring
Architecture
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Main Backend │ │ Data Retention │ │ PostgreSQL │
│ Container │ │ Service │ │ Database │
│ │ │ │ │ │
│ • API Endpoints │ │ • Cron Jobs │◄──►│ • tenant data │
│ • Business Logic│ │ • Data Cleanup │ │ • detections │
│ • Rate Limiting │ │ • Health Check │ │ • heartbeats │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Features
🕒 Scheduled Operations
- Runs daily at 2:00 AM UTC via cron job
- Optional immediate cleanup on startup (IMMEDIATE_CLEANUP=true) for development/testing
- Graceful shutdown handling
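As an illustration of the schedule described above, the daily 2:00 AM UTC run corresponds to the standard cron expression `0 2 * * *` evaluated in UTC. The dependency-free sketch below computes the delay until the next run; `msUntilNextCleanup` is a hypothetical helper name, and the service itself may well rely on a cron library instead.

```javascript
// Hypothetical helper (not taken from the service source):
// milliseconds from `now` until the next 02:00:00 UTC.
function msUntilNextCleanup(now = new Date()) {
  const next = new Date(Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
    2, 0, 0, 0
  ));
  // If 02:00 UTC today has already passed (or is exactly now), use tomorrow.
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next.getTime() - now.getTime();
}
```

A scheduler built on this would call `setTimeout(runCleanup, msUntilNextCleanup())` and re-arm itself after each run; `runCleanup` is likewise a placeholder name.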
🏢 Multi-Tenant Support
- Processes all active tenants
- Respects individual retention policies:
  - -1 = Unlimited retention (no cleanup)
  - N = Delete data older than N days
  - Default: 90 days if not specified
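The retention rules above can be sketched as a small function that turns a tenant's retention value into a deletion cutoff. This is a minimal sketch, not the service's actual code: `retentionCutoff` and the constant names are hypothetical, and the input is assumed to come from the tenant's `features.data_retention_days` field.

```javascript
// Assumed constants; 90 days is the documented default.
const DEFAULT_RETENTION_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the cutoff Date (records older than this are eligible for deletion),
// or null when the tenant has unlimited retention (-1).
function retentionCutoff(retentionDays, now = new Date()) {
  // Fall back to the default when the policy is missing (null or undefined).
  const days = retentionDays ?? DEFAULT_RETENTION_DAYS;
  if (days === -1) {
    return null; // unlimited retention: skip cleanup for this tenant
  }
  return new Date(now.getTime() - days * MS_PER_DAY);
}
```

A caller would skip the tenant when the result is `null` and otherwise delete rows whose timestamp is earlier than the returned date.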
🧹 Data Cleanup
- Drone Detections: Historical detection records
- Heartbeats: Device connectivity logs
- Security Logs: Audit trail entries (if applicable)
📊 Monitoring & Health
- Built-in health checks for Docker
- Memory usage monitoring
- Cleanup statistics tracking
- Error logging with tenant context
Configuration
Environment Variables
# Database Connection
DB_HOST=postgres # Database host
DB_PORT=5432 # Database port
DB_NAME=drone_detection # Database name
DB_USER=postgres # Database user
DB_PASSWORD=password # Database password
# Service Settings
NODE_ENV=production # Environment mode
IMMEDIATE_CLEANUP=false # Run cleanup on startup
LOG_LEVEL=info # Logging level
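For reference, one way a Node.js service could read these variables is sketched below. `loadConfig` is a hypothetical name, and the fallback values simply mirror the examples above; whether the service applies the same defaults is an assumption.

```javascript
// Hypothetical config loader; defaults mirror the example values above.
function loadConfig(env = process.env) {
  return {
    db: {
      host: env.DB_HOST || 'postgres',
      port: Number.parseInt(env.DB_PORT || '5432', 10),
      name: env.DB_NAME || 'drone_detection',
      user: env.DB_USER || 'postgres',
      password: env.DB_PASSWORD, // deliberately no default for secrets
    },
    nodeEnv: env.NODE_ENV || 'production',
    // Environment variables are strings, so compare against 'true' explicitly.
    immediateCleanup: env.IMMEDIATE_CLEANUP === 'true',
    logLevel: env.LOG_LEVEL || 'info',
  };
}
```

Passing the environment as a parameter keeps the function easy to test, for example `loadConfig({ IMMEDIATE_CLEANUP: 'true' })`.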
Docker Compose Integration
data-retention:
  build:
    context: ./data-retention-service
  container_name: drone-detection-data-retention
  restart: unless-stopped
  environment:
    DB_HOST: postgres
    DB_PORT: 5432
    DB_NAME: drone_detection
    DB_USER: postgres
    DB_PASSWORD: your_secure_password
  depends_on:
    postgres:
      condition: service_healthy
  deploy:
    resources:
      limits:
        memory: 128M
Usage
Start with Docker Compose
# Start all services including data retention
docker-compose up -d
# Start only data retention service
docker-compose up -d data-retention
# View logs
docker-compose logs -f data-retention
Manual Container Build
# Build the container
cd data-retention-service
docker build -t data-retention-service .
# Run the container
docker run -d \
  --name data-retention \
  --env-file .env \
  --network drone-network \
  data-retention-service
Development Mode
# Install dependencies
npm install
# Run with immediate cleanup
IMMEDIATE_CLEANUP=true npm start
# Run in development mode
npm run dev
Logging Output
Startup
🗂️ Starting Data Retention Service...
📅 Environment: production
💾 Database: postgres:5432/drone_detection
✅ Database connection established
⏰ Scheduled cleanup: Daily at 2:00 AM UTC
✅ Data Retention Service started successfully
Cleanup Operation
🧹 Starting data retention cleanup...
⏰ Cleanup started at: 2024-09-23T02:00:00.123Z
🏢 Found 5 active tenants to process
🧹 Cleaning tenant acme-corp - removing data older than 90 days
✅ Tenant acme-corp: Deleted 1250 detections, 4500 heartbeats, 89 logs
⏭️ Skipping tenant enterprise-unlimited - unlimited retention
✅ Data retention cleanup completed
⏱️ Duration: 2347ms
📊 Deleted: 2100 detections, 8900 heartbeats, 156 logs
Health Monitoring
💚 Health Check - Uptime: 3600s, Memory: 45MB, Last Cleanup: 2024-09-23T02:00:00.123Z
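A health line of this shape could be produced roughly as follows. This is a sketch under assumptions: `formatHealthLine` is a hypothetical name, memory is taken to be the process resident set size, and the `never` placeholder for a service that has not yet run a cleanup is invented for illustration.

```javascript
// Hypothetical formatter for the health log line shown above.
function formatHealthLine({ uptimeSeconds, rssBytes, lastCleanup }) {
  const memoryMb = Math.round(rssBytes / (1024 * 1024));
  // `lastCleanup` is a Date, or null/undefined if no cleanup has run yet.
  const last = lastCleanup ? lastCleanup.toISOString() : 'never';
  return `💚 Health Check - Uptime: ${Math.floor(uptimeSeconds)}s, ` +
    `Memory: ${memoryMb}MB, Last Cleanup: ${last}`;
}
```

In a running process the inputs would come from `process.uptime()` and `process.memoryUsage().rss`.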
API Integration
The main backend provides endpoints to interact with retention policies:
# Get current tenant limits and retention info
GET /api/tenant/limits
# Preview what would be deleted
GET /api/tenant/data-retention/preview
Error Handling
Tenant-Level Errors
- Service continues if individual tenant cleanup fails
- Errors logged with tenant context
- Failed tenants skipped, others processed normally
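The isolation behaviour described above can be sketched as a loop that wraps each tenant in its own try/catch. The names `cleanupAllTenants` and `cleanupTenant` are hypothetical, as is the shape of the per-tenant result (`detections`, `heartbeats`, `logs` counts, chosen to match the log output); the real service may process tenants differently.

```javascript
// Hypothetical sketch: process every tenant, collecting errors instead of aborting.
// `cleanupTenant(tenant)` is assumed to resolve to
// { detections, heartbeats, logs } counts of deleted rows.
async function cleanupAllTenants(tenants, cleanupTenant) {
  const totals = { detections: 0, heartbeats: 0, logs: 0 };
  const errors = [];
  for (const tenant of tenants) {
    try {
      const deleted = await cleanupTenant(tenant);
      totals.detections += deleted.detections;
      totals.heartbeats += deleted.heartbeats;
      totals.logs += deleted.logs;
    } catch (err) {
      // Record the failure with tenant context and move on to the next tenant.
      errors.push({ tenant: tenant.slug, message: err.message });
    }
  }
  return { totals, errors };
}
```

Catching inside the loop is what lets one failing tenant be reported without affecting the others.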
Service-Level Errors
- Database connection issues cause service restart
- Health checks detect and report issues
- Graceful shutdown on container stop signals
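Graceful shutdown on stop signals typically means reacting to SIGTERM (sent by `docker stop`) and SIGINT, closing resources, and then exiting. The sketch below shows one way to structure that; `createShutdownHandler` and the `closers` list are hypothetical and not taken from the service source.

```javascript
// Hypothetical sketch: build a shutdown function that runs each closer once.
// `closers` is an array of functions (sync or async) that release resources,
// e.g. stopping the scheduler or closing the database pool.
function createShutdownHandler(closers, exit = process.exit) {
  let shuttingDown = false;
  return async function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}, shutting down...`);
    let exitCode = 0;
    for (const close of closers) {
      try {
        await close();
      } catch (err) {
        console.error(`Error during shutdown: ${err.message}`);
        exitCode = 1;
      }
    }
    exit(exitCode);
  };
}

// Registration would look like this in the service entry point:
//   const shutdown = createShutdownHandler([closeDatabase]);
//   process.on('SIGTERM', () => shutdown('SIGTERM'));
//   process.on('SIGINT', () => shutdown('SIGINT'));
```

Injecting `exit` keeps the handler testable; in production the default `process.exit` is used.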
Example Error Log
❌ Error cleaning tenant problematic-tenant: SequelizeTimeoutError: Query timeout
⚠️ Errors encountered: 1
- problematic-tenant: Query timeout
Performance
Resource Usage
- Memory: 64-128MB typical usage
- CPU: Minimal, only during cleanup operations
- Storage: Logs rotate automatically
- Network: Database queries only
Cleanup Performance
- Batch operations for efficiency
- Indexed database queries on timestamp fields
- Parallel tenant processing where possible
- Configurable batch sizes for large datasets
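To illustrate the batching idea, the sketch below deletes in fixed-size chunks until a batch comes back short. It is an assumption-laden example: `deleteInBatches` is a hypothetical name, the default batch size of 1000 is invented, and `deleteBatch(limit)` stands in for whatever query the service actually issues.

```javascript
// Hypothetical sketch: repeatedly delete up to `batchSize` rows at a time.
// `deleteBatch(limit)` is assumed to delete at most `limit` expired rows
// and resolve to the number of rows it removed.
async function deleteInBatches(deleteBatch, batchSize = 1000) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  let total = 0;
  for (;;) {
    const deleted = await deleteBatch(batchSize);
    total += deleted;
    // A short (or empty) batch means nothing older than the cutoff remains.
    if (deleted < batchSize) {
      return total;
    }
  }
}
```

Keeping each delete small bounds lock time and memory use on large tenants, at the cost of more round trips to the database.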
Security
Database Access
- Read/write access only to required tables
- Connection pooling with limits
- Prepared statements prevent SQL injection
Container Security
- Non-root user execution
- Minimal base image (node:18-alpine)
- No exposed ports
- Isolated network access
Monitoring
Health Checks
# Docker health check (when started via Docker Compose, the container is named drone-detection-data-retention)
docker exec data-retention node healthcheck.js
# Container status
docker-compose ps data-retention
# Service logs
docker-compose logs -f data-retention
Metrics
- Cleanup duration and frequency
- Records deleted per tenant
- Memory usage over time
- Error rates and types
Troubleshooting
Common Issues
Service won't start
# Check database connectivity
docker-compose logs postgres
docker-compose logs data-retention
# Verify environment variables
docker-compose config
Cleanup not running
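# Check for a cron process (only present if the image runs a system cron daemon)
docker exec data-retention ps aux | grep cron
# Confirm the schedule was registered at startup
docker-compose logs data-retention | grep "Scheduled cleanup"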
# Force immediate cleanup
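docker exec data-retention node -e "
const service = require('./index.js');
// performCleanup is assumed to return a promise; report failures and exit when done
Promise.resolve(service.performCleanup())
  .then(() => process.exit(0))
  .catch((err) => { console.error(err); process.exit(1); });
"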
High memory usage
# Check live memory and CPU usage
docker stats data-retention
# Review tenant data volumes
docker exec data-retention node -e "
const { getModels } = require('./database');
// Check tenant data sizes
"
Configuration Validation
# Test database connection
docker exec data-retention node healthcheck.js
# Verify tenant policies
docker exec -it data-retention node -e "
const { getModels } = require('./database');
(async () => {
const { Tenant } = await getModels();
const tenants = await Tenant.findAll();
console.log(tenants.map(t => ({
slug: t.slug,
retention: t.features?.data_retention_days
})));
})();
"
Migration from Integrated Service
If upgrading from a version where data retention was part of the main backend:
- Deploy new container: Add data retention service to docker-compose.yml
- Verify operation: Check logs for successful startup and database connection
- Remove old code: No manual removal is needed; the integrated service code is automatically disabled
- Monitor transition: Ensure cleanup operations continue normally
The service is designed to be backward compatible and will work with existing tenant configurations without changes.