Data Retention Service

A lightweight, standalone microservice responsible for automated data cleanup based on tenant retention policies.

Overview

This service runs as a separate Docker container and performs the following functions:

  • Automated Cleanup: Daily scheduled cleanup at 2:00 AM UTC
  • Tenant-Aware: Respects individual tenant retention policies
  • Lightweight: Minimal resource footprint (~64-128MB RAM)
  • Resilient: Continues operation even if individual tenant cleanups fail
  • Logged: Comprehensive logging and health monitoring

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Main Backend  │    │  Data Retention │    │   PostgreSQL    │
│    Container    │    │     Service     │    │    Database     │
│                 │    │                 │    │                 │
│ • API Endpoints │    │ • Cron Jobs     │◄──►│ • tenant data   │
│ • Business Logic│    │ • Data Cleanup  │    │ • detections    │
│ • Rate Limiting │    │ • Health Check  │    │ • heartbeats    │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Features

🕒 Scheduled Operations

  • Runs daily at 2:00 AM UTC via cron job
  • Optional immediate cleanup on startup (IMMEDIATE_CLEANUP=true) for development/testing
  • Graceful shutdown handling
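As a rough illustration of the daily 2:00 AM UTC schedule, the sketch below computes the delay until the next run and re-arms a timer after each cleanup. It is an assumption-laden stand-in, not the service's actual scheduler (which may well use a cron library); `msUntilNextRun` and `scheduleDaily` are hypothetical names.

```javascript
// Sketch only: timer-based daily scheduling at a fixed UTC hour.
// Function names are hypothetical and not taken from the service code.
const DAY_MS = 24 * 60 * 60 * 1000;

// Milliseconds from `now` until the next occurrence of hourUtc:00:00 UTC.
function msUntilNextRun(now = new Date(), hourUtc = 2) {
  const next = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc, 0, 0, 0
  ));
  // If today's slot has passed (or is exactly now), use tomorrow's slot.
  if (next.getTime() <= now.getTime()) {
    next.setTime(next.getTime() + DAY_MS);
  }
  return next.getTime() - now.getTime();
}

// Runs `task` once per day; returns a function that cancels the schedule.
function scheduleDaily(task, { hourUtc = 2, runImmediately = false } = {}) {
  let timer = null;
  const arm = () => {
    timer = setTimeout(async () => {
      try {
        await task();
      } catch (err) {
        console.error('Cleanup failed:', err);
      }
      arm(); // re-arm for the following day
    }, msUntilNextRun(new Date(), hourUtc));
  };
  if (runImmediately) {
    Promise.resolve().then(task).catch((err) => console.error('Cleanup failed:', err));
  }
  arm();
  return () => clearTimeout(timer);
}
```

A caller could wire this up as `scheduleDaily(performCleanup, { runImmediately: process.env.IMMEDIATE_CLEANUP === 'true' })` and invoke the returned function during shutdown to cancel the pending timer.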

🏢 Multi-Tenant Support

  • Processes all active tenants
  • Respects individual retention policies (read from each tenant's features.data_retention_days setting):
    • -1 = Unlimited retention (no cleanup)
    • N = Delete data older than N days
    • Default: 90 days if not specified
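The retention rules above can be sketched as a small function that turns a tenant's policy into a cutoff date. This is a minimal sketch, assuming the policy lives in `features.data_retention_days`; the function names are hypothetical, and rejecting values such as 0 or non-integers is an assumption, since the rules above do not say how invalid values are handled.

```javascript
// Sketch only: resolve a tenant's retention policy into a cutoff date.
const DEFAULT_RETENTION_DAYS = 90;
const UNLIMITED_RETENTION = -1;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function resolveRetentionDays(tenant) {
  const raw = tenant?.features?.data_retention_days;
  if (raw === undefined || raw === null) return DEFAULT_RETENTION_DAYS;
  const days = Number(raw);
  // Assumption: anything other than -1 or a positive integer is rejected
  // rather than guessed at, so a bad value never deletes data by accident.
  if (!Number.isInteger(days) || (days < 1 && days !== UNLIMITED_RETENTION)) {
    throw new RangeError(`Invalid data_retention_days: ${raw}`);
  }
  return days;
}

// Returns null for unlimited retention (skip cleanup), otherwise the Date
// before which records are considered expired.
function cutoffDate(tenant, now = new Date()) {
  const days = resolveRetentionDays(tenant);
  if (days === UNLIMITED_RETENTION) return null;
  return new Date(now.getTime() - days * MS_PER_DAY);
}
```

Returning `null` for the unlimited case keeps the "skip this tenant" decision explicit at the call site instead of encoding it as a far-past date.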

🧹 Data Cleanup

  • Drone Detections: Historical detection records
  • Heartbeats: Device connectivity logs
  • Security Logs: Audit trail entries (if applicable)

📊 Monitoring & Health

  • Built-in health checks for Docker
  • Memory usage monitoring
  • Cleanup statistics tracking
  • Error logging with tenant context

Configuration

Environment Variables

# Database Connection
DB_HOST=postgres          # Database host
DB_PORT=5432             # Database port  
DB_NAME=drone_detection  # Database name
DB_USER=postgres         # Database user
DB_PASSWORD=password     # Database password

# Service Settings
NODE_ENV=production      # Environment mode
IMMEDIATE_CLEANUP=false  # Run cleanup on startup
LOG_LEVEL=info          # Logging level

Docker Compose Integration

data-retention:
  build:
    context: ./data-retention-service
  container_name: drone-detection-data-retention
  restart: unless-stopped
  environment:
    DB_HOST: postgres
    DB_PORT: 5432
    DB_NAME: drone_detection
    DB_USER: postgres
    DB_PASSWORD: your_secure_password
  depends_on:
    postgres:
      condition: service_healthy
  deploy:
    resources:
      limits:
        memory: 128M

Usage

Start with Docker Compose

# Start all services including data retention
docker-compose up -d

# Start only data retention service
docker-compose up -d data-retention

# View logs
docker-compose logs -f data-retention

Manual Container Build

# Build the container
cd data-retention-service
docker build -t data-retention-service .

# Run the container
docker run -d \
  --name data-retention \
  --env-file .env \
  --network drone-network \
  data-retention-service

Development Mode

# Install dependencies
npm install

# Run with immediate cleanup
IMMEDIATE_CLEANUP=true npm start

# Run in development mode
npm run dev

Logging Output

Startup

🗂️  Starting Data Retention Service...
📅 Environment: production
💾 Database: postgres:5432/drone_detection
✅ Database connection established
⏰ Scheduled cleanup: Daily at 2:00 AM UTC
✅ Data Retention Service started successfully

Cleanup Operation

🧹 Starting data retention cleanup...
⏰ Cleanup started at: 2024-09-23T02:00:00.123Z
🏢 Found 5 active tenants to process
🧹 Cleaning tenant acme-corp - removing data older than 90 days
✅ Tenant acme-corp: Deleted 1250 detections, 4500 heartbeats, 89 logs
⏭️  Skipping tenant enterprise-unlimited - unlimited retention
✅ Data retention cleanup completed
⏱️  Duration: 2347ms
📊 Deleted: 2100 detections, 8900 heartbeats, 156 logs

Health Monitoring

💚 Health Check - Uptime: 3600s, Memory: 45MB, Last Cleanup: 2024-09-23T02:00:00.123Z
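A health line in this shape could be assembled from Node's built-in `process.uptime()` and `process.memoryUsage()`. The sketch below is illustrative only: the function names are hypothetical, and both the choice of resident set size as the memory figure and the `never` placeholder are assumptions.

```javascript
// Sketch only: build a health line like the sample above.
function formatHealthLine({ uptimeSeconds, rssBytes, lastCleanup }) {
  const memoryMb = Math.round(rssBytes / (1024 * 1024));
  // Assumption: print "never" until the first cleanup has completed.
  const last = lastCleanup ? lastCleanup.toISOString() : 'never';
  return `💚 Health Check - Uptime: ${Math.floor(uptimeSeconds)}s, ` +
    `Memory: ${memoryMb}MB, Last Cleanup: ${last}`;
}

// Collects live values from the running Node process.
function currentHealth(lastCleanup = null) {
  return formatHealthLine({
    uptimeSeconds: process.uptime(),
    rssBytes: process.memoryUsage().rss, // assumption: RSS, not heap usage
    lastCleanup,
  });
}
```

Keeping the formatting separate from the data collection makes the line easy to check with fixed inputs.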

API Integration

The main backend provides endpoints to interact with retention policies:

# Get current tenant limits and retention info
GET /api/tenant/limits

# Preview what would be deleted
GET /api/tenant/data-retention/preview

Error Handling

Tenant-Level Errors

  • Service continues if individual tenant cleanup fails
  • Errors logged with tenant context
  • Failed tenants skipped, others processed normally
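The isolation described above amounts to wrapping each tenant's cleanup in its own try/catch. The following is a minimal sketch, assuming a hypothetical `cleanTenant` callback that resolves with per-category delete counts; it processes tenants sequentially for simplicity, whereas the real service may run some of them in parallel.

```javascript
// Sketch only: one failing tenant must not stop the others.
async function cleanupAllTenants(tenants, cleanTenant) {
  const totals = { detections: 0, heartbeats: 0, logs: 0 };
  const errors = [];
  for (const tenant of tenants) {
    try {
      const deleted = await cleanTenant(tenant);
      totals.detections += deleted.detections ?? 0;
      totals.heartbeats += deleted.heartbeats ?? 0;
      totals.logs += deleted.logs ?? 0;
    } catch (err) {
      // Record the failure with tenant context and move on.
      errors.push({ tenant: tenant.slug, message: err.message });
    }
  }
  return { totals, errors };
}
```

The returned `errors` array carries enough context to produce a per-tenant error summary after the run.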

Service-Level Errors

  • Database connection issues cause a service restart (the Compose example sets restart: unless-stopped)
  • Health checks detect and report issues
  • Graceful shutdown on container stop signals
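Docker sends SIGTERM when a container is stopped, so graceful shutdown usually means handling that signal. The sketch below shows one possible shape; `createShutdownHandler`, `stopScheduler`, and `closeDatabase` are hypothetical names, not the service's actual functions.

```javascript
// Sketch only: stop scheduling, close the database, then exit.
function createShutdownHandler({
  stopScheduler,
  closeDatabase,
  exit = (code) => process.exit(code),
}) {
  let shuttingDown = false;
  return async function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}, shutting down...`);
    try {
      stopScheduler();
      await closeDatabase();
      exit(0);
    } catch (err) {
      console.error('Error during shutdown:', err);
      exit(1);
    }
  };
}
```

The handler would then be registered for both signals, for example `process.on('SIGTERM', () => shutdown('SIGTERM'))` and likewise for `SIGINT`.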

Example Error Log

❌ Error cleaning tenant problematic-tenant: SequelizeTimeoutError: Query timeout
⚠️  Errors encountered: 1
   - problematic-tenant: Query timeout

Performance

Resource Usage

  • Memory: 64-128MB typical usage
  • CPU: Minimal, only during cleanup operations
  • Storage: Logs rotate automatically
  • Network: Database queries only

Cleanup Performance

  • Batch operations for efficiency
  • Indexed database queries on timestamp fields
  • Parallel tenant processing where possible
  • Configurable batch sizes for large datasets
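Batched deletion can be pictured as a loop that removes at most one batch per query until a short batch signals that nothing is left. This is a sketch under assumptions: `deleteBatch` is a hypothetical callback that deletes up to `limit` expired rows and resolves with the number actually deleted, and the default batch size is arbitrary.

```javascript
// Sketch only: delete expired rows in bounded batches.
async function deleteInBatches(deleteBatch, batchSize = 1000) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  let total = 0;
  for (;;) {
    const deleted = await deleteBatch(batchSize);
    total += deleted;
    // A short (or empty) batch means no expired rows remain.
    if (deleted < batchSize) return total;
  }
}
```

Bounding each delete keeps individual statements short, which in turn limits lock duration and the chance of a query timeout on large tenants.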

Security

Database Access

  • Read/write access only to required tables
  • Connection pooling with limits
  • Prepared statements prevent SQL injection
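To make the injection point concrete, the sketch below builds a parameterized delete in the `{ text, values }` shape accepted by node-postgres. It is illustrative only: the service appears to use an ORM, which binds parameters itself, and the table names, the `tenant_id` and `created_at` columns, and `buildDeleteQuery` are all assumptions.

```javascript
// Sketch only: values are bound as parameters, never concatenated.
// Identifiers cannot be bound, so table names come from a fixed allow-list.
// Table and column names here are hypothetical.
const CLEANUP_TABLES = new Set(['drone_detections', 'heartbeats', 'security_logs']);

function buildDeleteQuery(table, tenantId, cutoff) {
  if (!CLEANUP_TABLES.has(table)) {
    throw new Error(`Table not allowed: ${table}`);
  }
  if (!(cutoff instanceof Date) || Number.isNaN(cutoff.getTime())) {
    throw new TypeError('cutoff must be a valid Date');
  }
  return {
    text: `DELETE FROM ${table} WHERE tenant_id = $1 AND created_at < $2`,
    values: [tenantId, cutoff.toISOString()],
  };
}
```

Because the tenant ID and cutoff travel separately from the SQL text, a malicious value cannot change the structure of the statement.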

Container Security

  • Non-root user execution
  • Minimal base image (node:18-alpine)
  • No exposed ports
  • Isolated network access

Monitoring

Health Checks

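# Docker health check (docker exec takes the container name: data-retention for the
# manual docker run above, drone-detection-data-retention for the Compose example)
docker exec data-retention node healthcheck.js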

# Container status
docker-compose ps data-retention

# Service logs  
docker-compose logs -f data-retention

Metrics

  • Cleanup duration and frequency
  • Records deleted per tenant
  • Memory usage over time
  • Error rates and types

Troubleshooting

Common Issues

Service won't start

# Check database connectivity
docker-compose logs postgres
docker-compose logs data-retention

# Verify environment variables
docker-compose config

Cleanup not running

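# Check for a cron process inside the container
docker exec data-retention ps aux | grep cron

# If the schedule runs inside the Node process, confirm it from the startup log instead
docker-compose logs data-retention | grep "Scheduled cleanup"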

# Force immediate cleanup
docker exec data-retention node -e "
const service = require('./index.js');
service.performCleanup();
"

High memory usage

# Check current memory and CPU usage
docker stats data-retention

# Review tenant data volumes
docker exec data-retention node -e "
const { getModels } = require('./database');
// Check tenant data sizes
"

Configuration Validation

# Test database connection
docker exec data-retention node healthcheck.js

# Verify tenant policies
docker exec -it data-retention node -e "
const { getModels } = require('./database');
(async () => {
  const { Tenant } = await getModels();
  const tenants = await Tenant.findAll();
  console.log(tenants.map(t => ({
    slug: t.slug,
    retention: t.features?.data_retention_days
  })));
})();
"

Migration from Integrated Service

If upgrading from a version where data retention was part of the main backend:

  1. Deploy new container: Add data retention service to docker-compose.yml
  2. Verify operation: Check logs for successful startup and database connection
  3. Remove old code: No manual step is required, since the integrated service code is automatically disabled
  4. Monitor transition: Ensure cleanup operations continue normally

The service is designed to be backward compatible and will work with existing tenant configurations without changes.