Heartbeat Monitors

Monitor cron jobs, scheduled tasks, and batch processes with passive heartbeat monitoring.

Heartbeat monitors work differently from other monitor types. Instead of tracer checking your service, your service pings tracer. If the ping doesn't arrive within the expected time, an alert is triggered.

How It Works

  1. Create a heartbeat monitor with an expected ping interval
  2. Get a unique ping URL
  3. Your service calls this URL on each run
  4. If tracer doesn't receive a ping within the expected interval plus the grace period, it sends an alert

This is perfect for monitoring:

  • Cron jobs
  • Scheduled tasks
  • Batch processes
  • Background workers
  • Backup scripts

Creating a Heartbeat Monitor

Create Monitor

  1. Go to Uptime > New Monitor > Heartbeat
  2. Enter a name: "Daily Backup Job"
  3. Set the schedule

Configure Timing

Period: 24 hours    # How often should we receive a ping?
Grace: 30 minutes   # How long to wait before alerting?

Period: Expected time between pings
Grace: Extra time before alerting (accounts for delays)

Get Your Ping URL

After creation, you'll receive a unique URL:

https://ping.tracer/hb/abc123xyz

Add to Your Job

Add a ping to your script:

# At the end of your script
curl -fsS --retry 3 https://ping.tracer/hb/abc123xyz

Timing Configuration

Period

How often your job runs:

Period        Use Case
1 minute      Frequent health checks
5 minutes     Regular background tasks
1 hour        Hourly jobs
24 hours      Daily jobs
7 days        Weekly jobs

Grace Period

Extra time before alerting. Accounts for:

  • Job duration
  • Network latency
  • Small delays

Example: Job runs every hour, takes up to 10 minutes

  • Period: 1 hour
  • Grace: 15 minutes
  • Alert if no ping after 1 hour 15 minutes

Set the grace period to at least the maximum expected job duration, plus a buffer for network issues.
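The arithmetic in the example above can be sketched in Python. This is an illustrative model only: `alert_deadline` is a hypothetical helper, not part of tracer, and it assumes the deadline is measured from the last received ping.

```python
from datetime import datetime, timedelta


def alert_deadline(last_ping: datetime, period: timedelta, grace: timedelta) -> datetime:
    """Return the moment after which a missing ping would trigger an alert.

    Assumption: the deadline is the last ping time plus period plus grace.
    """
    return last_ping + period + grace


# Hourly job whose last ping arrived at 02:00, with a 15-minute grace period.
last_ping = datetime(2024, 1, 1, 2, 0)
deadline = alert_deadline(last_ping, timedelta(hours=1), timedelta(minutes=15))
print(deadline)  # 2024-01-01 03:15:00
```

With these values, an alert would be expected only if no ping has arrived by 03:15.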

Ping Methods

Simple Ping

Just hit the URL:

curl https://ping.tracer/hb/YOUR_TOKEN

Ping with Exit Code

Report job success/failure:

# Success (exit code 0)
curl https://ping.tracer/hb/YOUR_TOKEN/0

# Failure (any non-zero exit code)
curl https://ping.tracer/hb/YOUR_TOKEN/1

Ping with Output

Include job output for debugging:

curl -X POST https://ping.tracer/hb/YOUR_TOKEN \
  -d "Processed 1000 records in 45 seconds"

Start/Complete Pings

Track job duration:

# At job start
curl https://ping.tracer/hb/YOUR_TOKEN/start

# Your job runs here...

# At job end
curl https://ping.tracer/hb/YOUR_TOKEN
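In Python, the start/complete pattern fits naturally into a context manager. The sketch below is one possible shape, not an official client: `heartbeat` and the `send` callable are hypothetical names, and it combines the `/start` ping with the `/0` and `/1` exit-code pings described above. The demo passes a recording function as `send` so it runs without network access. In a real job, `send` would perform an HTTP GET and swallow its own errors.

```python
import time
from contextlib import contextmanager
from typing import Callable, List

HEARTBEAT_URL = "https://ping.tracer/hb/YOUR_TOKEN"


@contextmanager
def heartbeat(send: Callable[[str], None], base_url: str = HEARTBEAT_URL):
    """Ping /start on entry, then /0 on success or /1 if the body raises."""
    send(f"{base_url}/start")
    try:
        yield
    except Exception:
        send(f"{base_url}/1")
        raise  # Re-raise so the caller still sees the job failure
    else:
        send(f"{base_url}/0")


# Demo: record the URLs instead of making real HTTP calls.
sent: List[str] = []
with heartbeat(sent.append):
    time.sleep(0.01)  # Stand-in for the actual job
print(sent)
```

Wrapping the job body this way keeps the start and completion pings paired, so the duration shown in the history reflects the whole job.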

Integration Examples

Bash/Shell Script

#!/bin/bash

# Your job logic
./run-backup.sh
EXIT_CODE=$?

# Ping with exit code
curl -fsS --retry 3 "https://ping.tracer/hb/YOUR_TOKEN/$EXIT_CODE"

exit $EXIT_CODE

Cron Job

# Run backup daily at 2 AM, ping on completion
0 2 * * * /usr/local/bin/backup.sh && curl -fsS https://ping.tracer/hb/YOUR_TOKEN

Wrapper approach for better error reporting:

0 2 * * * /usr/local/bin/run-with-heartbeat.sh /usr/local/bin/backup.sh

The wrapper script:

#!/bin/bash
# run-with-heartbeat.sh
HEARTBEAT_URL="https://ping.tracer/hb/YOUR_TOKEN"

# Signal start
curl -fsS "$HEARTBEAT_URL/start"

# Run the actual command
"$@"
EXIT_CODE=$?

# Signal completion with exit code
curl -fsS "$HEARTBEAT_URL/$EXIT_CODE"

exit $EXIT_CODE

Python

import sys

import requests

HEARTBEAT_URL = "https://ping.tracer/hb/YOUR_TOKEN"

def ping_heartbeat(suffix=0, message=None):
    """Ping the heartbeat URL. Never raises, so a failed ping cannot fail the job."""
    try:
        url = f"{HEARTBEAT_URL}/{suffix}"
        if message:
            requests.post(url, data=message, timeout=10)
        else:
            requests.get(url, timeout=10)
    except requests.RequestException as e:
        print(f"Failed to ping heartbeat: {e}", file=sys.stderr)

# Signal start (goes through the same error handling and timeout)
ping_heartbeat("start")

try:
    # Your job logic (process_data is a placeholder for your own function)
    process_data()
    ping_heartbeat(0, "Processed successfully")
except Exception as e:
    ping_heartbeat(1, str(e))
    sys.exit(1)

Node.js

const https = require('https');

const HEARTBEAT_URL = 'https://ping.tracer/hb/YOUR_TOKEN';

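function pingHeartbeat(exitCode = 0) {
  // Always resolves, so a failed ping never fails the job
  return new Promise((resolve) => {
    https.get(`${HEARTBEAT_URL}/${exitCode}`, (res) => {
      res.resume(); // Discard the response body so the socket is released
      resolve();
    }).on('error', () => resolve());
  });
}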

async function main() {
  try {
    // Your job logic
    await processData();
    await pingHeartbeat(0);
  } catch (error) {
    console.error(error);
    await pingHeartbeat(1);
    process.exit(1);
  }
}

main();

Docker

# In your Dockerfile or entrypoint
CMD ./your-job.sh && curl -fsS https://ping.tracer/hb/YOUR_TOKEN

Or with wrapper script:

#!/bin/bash
# docker-entrypoint.sh

curl -fsS "https://ping.tracer/hb/$HEARTBEAT_TOKEN/start"

./your-job.sh
EXIT_CODE=$?

curl -fsS "https://ping.tracer/hb/$HEARTBEAT_TOKEN/$EXIT_CODE"

exit $EXIT_CODE

Kubernetes CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: backup-image
              command:
                - /bin/sh
                - -c
                - |
                  curl -fsS "$HEARTBEAT_URL/start"
                  ./backup.sh
                  EXIT_CODE=$?
                  curl -fsS "$HEARTBEAT_URL/$EXIT_CODE"
                  exit $EXIT_CODE
              env:
                - name: HEARTBEAT_URL
                  valueFrom:
                    secretKeyRef:
                      name: heartbeat-secrets
                      key: url
          restartPolicy: OnFailure

Alert States

State           Description
Healthy         Pings arriving on schedule
Grace Period    No ping yet, within grace
Down            No ping, grace period exceeded
Paused          Monitoring paused
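One way to read the states above is as a function of the time elapsed since the last ping. The following is a rough model under stated assumptions, not tracer's actual implementation: `monitor_state` is a hypothetical function, and the exact boundary handling (inclusive comparisons) is a guess.

```python
from datetime import datetime, timedelta


def monitor_state(
    now: datetime,
    last_ping: datetime,
    period: timedelta,
    grace: timedelta,
    paused: bool = False,
) -> str:
    """Classify a monitor by how long ago the last ping arrived."""
    if paused:
        return "Paused"
    elapsed = now - last_ping
    if elapsed <= period:
        return "Healthy"
    if elapsed <= period + grace:
        return "Grace Period"
    return "Down"


# Daily job with a 30-minute grace period, checked 24 h 10 min after the last ping.
last_ping = datetime(2024, 1, 1, 2, 0)
now = last_ping + timedelta(hours=24, minutes=10)
print(monitor_state(now, last_ping, timedelta(hours=24), timedelta(minutes=30)))
```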

Viewing History

The heartbeat history shows:

  • Ping timestamps
  • Exit codes
  • Job duration (if using start/complete)
  • Output messages
  • Gaps (missed pings)
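To make the idea of a gap concrete, the sketch below scans a list of ping timestamps and reports every interval longer than period plus grace. It is only an illustration: `find_gaps` is a hypothetical helper, the timestamps are made up, and how tracer itself computes gaps is not specified here.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_gaps(
    pings: List[datetime], period: timedelta, grace: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (previous_ping, next_ping) pairs spaced further apart than allowed."""
    allowed = period + grace
    ordered = sorted(pings)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > allowed]


# Daily job that skipped one run (no ping on day 2).
base = datetime(2024, 1, 1, 2, 0)
pings = [base, base + timedelta(days=1), base + timedelta(days=3)]
for start, end in find_gaps(pings, timedelta(hours=24), timedelta(minutes=30)):
    print(f"Gap between {start} and {end}")
```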

Best Practices

  1. Use appropriate grace periods - Too short causes false alarms
  2. Report exit codes - Know if job succeeded or failed
  3. Include output - Helps debugging failures
  4. Use start pings - Track job duration
  5. Handle ping failures gracefully - Don't fail job if ping fails
  6. Use retries - curl --retry 3 handles transient failures

Always handle ping failures gracefully. Your job should complete successfully even if the heartbeat ping fails.
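Practices 5 and 6 can be combined in a single helper. The sketch below uses only the Python standard library. `ping_with_retries` and its parameters are hypothetical names, and the retry count and delay are arbitrary choices rather than values recommended by tracer. The function returns a boolean instead of raising, so a failed ping never changes the outcome of the job.

```python
import sys
import time
import urllib.request


def ping_with_retries(
    url: str,
    attempts: int = 3,
    delay: float = 1.0,
    timeout: float = 10.0,
    opener=urllib.request.urlopen,
) -> bool:
    """Send a GET to the ping URL, retrying on failure. Never raises."""
    for attempt in range(1, attempts + 1):
        try:
            # urlopen raises URLError/HTTPError (both OSError subclasses) on failure
            with opener(url, timeout=timeout):
                return True
        except OSError as exc:
            print(f"Ping attempt {attempt} failed: {exc}", file=sys.stderr)
            if attempt < attempts:
                time.sleep(delay)
    return False
```

A job would call `ping_with_retries("https://ping.tracer/hb/YOUR_TOKEN/0")` at the end and ignore or log the return value. The `opener` parameter exists only so the function can be exercised without network access.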

Troubleshooting

Missed Pings

Problem: Alerts for missed pings even though the job ran

Solutions:

  • Increase grace period
  • Verify ping URL is correct
  • Check network connectivity
  • Ensure ping happens after job completes

False Positives

Problem: Alerts fire while a long-running job is still in progress

Solutions:

  • Use start/complete pings to track duration
  • Increase grace period
  • Check for job delays

Job Failed But No Alert

Problem: Job failed but heartbeat shows healthy

Cause: Ping succeeded even though job failed

Solution: Report exit code:

./job.sh
# $? holds the exit code of job.sh, so no other command may run in between
curl -fsS "https://ping.tracer/hb/YOUR_TOKEN/$?"