- “Show me all data-load events from the last 24 hours”
- “Were there any failed migrations this week?”
## Best Practices
- **Use descriptive sources** — Include the pipeline name in the source (e.g. `//airflow/etl-pipeline`)
- **Track both success and failure** — Record events for both successful and failed jobs, using severity to distinguish them
- **Include row counts** — Add metrics like row counts or duration to the data payload for richer context
- **Combine with deployment tracking** — Use the Deployment Tracking pattern to record code deploys alongside data pipeline events for full operational visibility
- **Enable incident response** — When data pipelines fail, the Incident Response pattern helps AI quickly identify the root cause
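The first three practices can be sketched in one event payload. This is a minimal, hypothetical example: the `build_pipeline_event` helper, the field names, and the payload schema are assumptions for illustration, not the service's actual API. It shows a descriptive source containing the pipeline name, a severity that distinguishes success from failure, and row count and duration metrics in the data payload.

```python
import json
from datetime import datetime, timezone

def build_pipeline_event(pipeline: str, succeeded: bool,
                         row_count: int, duration_s: float) -> dict:
    """Build a pipeline event payload (hypothetical schema for illustration)."""
    return {
        # Descriptive source: include the pipeline name
        "source": f"//airflow/{pipeline}",
        # Use severity to distinguish successful and failed runs
        "severity": "info" if succeeded else "error",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Richer context: row counts and duration in the data payload
        "data": {
            "row_count": row_count,
            "duration_seconds": duration_s,
        },
    }

# Record a successful data-load run
event = build_pipeline_event("etl-pipeline", succeeded=True,
                             row_count=120_000, duration_s=42.5)
print(json.dumps(event, indent=2))
```

A failed run would use the same helper with `succeeded=False`, producing an `error`-severity event that queries like “Were there any failed migrations this week?” could then surface.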