Overview

When something goes wrong, use the Logs and Metrics endpoints to diagnose the issue. This guide walks through common problems and how to debug them.

Getting Started with Logs

First, retrieve recent logs:
curl "https://trio.machinefi.com/logs?limit=50"
Filter by log level to find errors:
curl "https://trio.machinefi.com/logs?level=ERROR&limit=50"
Each log entry includes:
  • time: When it happened
  • level: DEBUG, INFO, WARNING, or ERROR
  • message: Details, prefixed with the first 8 characters of the job ID
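
Because every message carries the job ID prefix, log entries can be grouped per job before you read them. The following is a minimal sketch, not an official client. It assumes the entries have already been loaded as a list of dicts with the time, level, and message fields listed above (the exact response shape is an assumption), and that the prefix has the [JOB:a1b2c3d4] form used in the log examples in this guide. The name group_by_job is hypothetical.

```python
import re
from collections import defaultdict

# Matches a leading "[JOB:xxxxxxxx]" prefix and captures the ID part.
JOB_PREFIX = re.compile(r"^\[JOB:([^\]]+)\]")


def group_by_job(entries):
    """Group log entries by the job ID prefix found in their message.

    `entries` is assumed to be a list of dicts with "time", "level",
    and "message" keys. Entries without a job prefix are stored under None.
    """
    grouped = defaultdict(list)
    for entry in entries:
        match = JOB_PREFIX.match(entry.get("message", ""))
        job_prefix = match.group(1) if match else None
        grouped[job_prefix].append(entry)
    return dict(grouped)
```

Passing the parsed output of the /logs request to group_by_job gives one list per job, which makes it easier to follow a single job through the log.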

Common Issues & Solutions

Job Not Triggering

Symptom: The job is running, but the condition never triggers.
Debugging steps:
  1. Check logs for analysis results:
    curl "https://trio.machinefi.com/logs?level=INFO" | grep "Analysis result"
    
    Look for:
    [JOB:a1b2c3d4] Analysis result: triggered=false
    
  2. Review the explanation:
    [JOB:a1b2c3d4] Analysis result: triggered=false, explanation=The street appears empty
    
    If the explanation doesn’t match your expectations, your condition may be:
    • Too specific
    • Poorly worded
    • Not matching what’s actually in the stream
  3. Test with Check-Once:
    curl -X POST https://trio.machinefi.com/check-once \
      -H "Content-Type: application/json" \
      -d '{
        "stream_url": "https://youtube.com/watch?v=YOUR_STREAM",
        "condition": "Your condition here"
      }'
    
    Compare the explanation to what you see in the stream.
  4. Refine your condition:
    # Old: Too vague
    "Is there snow?"
    
    # New: More specific
    "Is it actively snowing? Look for falling white particles or fresh snow accumulation."
    
See also: Writing Good Conditions
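
To review many analysis results at once, the log messages can be parsed instead of read by eye. This is a rough sketch that assumes messages follow exactly the "Analysis result: triggered=..., explanation=..." format shown in the examples above; the function name parse_analysis is hypothetical, and the real message format may vary.

```python
import re

# Pattern based on the example log lines in this guide; the explanation part is optional.
ANALYSIS = re.compile(
    r"\[JOB:(?P<job>[^\]]+)\] Analysis result: "
    r"triggered=(?P<triggered>true|false)"
    r"(?:, explanation=(?P<explanation>.*))?"
)


def parse_analysis(message):
    """Return a dict with job, triggered, and explanation, or None if no match."""
    match = ANALYSIS.search(message)
    if match is None:
        return None
    return {
        "job": match["job"],
        "triggered": match["triggered"] == "true",
        "explanation": match["explanation"],  # None when the message has no explanation
    }
```

Collecting the explanation values of results where triggered is False shows what the model reports seeing, which is the text to compare against your condition wording.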

High Frame Skip Rate

Symptom: Most frames are being skipped by the pre-filter.
Check metrics:
curl https://trio.machinefi.com/metrics | jq '.frames'
Output:
{
  "captured": 450,
  "skipped_prefilter": 380
}
Skip rate: 380 / (450 + 380) ≈ 46%
Is this a problem?
  • 70%+ skip rate: Great! You’re watching a static scene (security cam). Pre-filter is saving money.
  • 20% or lower skip rate: The stream has a lot of motion. Consider whether this is normal for your scene.
  • Increasing over time: Could indicate the stream went offline or is now static.
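
The skip rate calculation above can be sketched as a small helper. It assumes the frames object has the captured and skipped_prefilter keys shown in the sample output and that the two counts do not overlap, as the formula above implies. The function names and the wording of the labels are hypothetical; the 70% and 20% cutoffs come from the list above, and the guide does not classify values in between.

```python
def skip_rate(frames):
    """Fraction of frames dropped by the pre-filter: skipped / (captured + skipped)."""
    captured = frames["captured"]
    skipped = frames["skipped_prefilter"]
    total = captured + skipped
    # Avoid dividing by zero before any frame has been seen.
    return skipped / total if total else 0.0


def describe_skip_rate(rate):
    """Map a skip rate to the rough interpretation given in this guide."""
    if rate >= 0.70:
        return "mostly static scene; pre-filter is saving money"
    if rate <= 0.20:
        return "lots of motion; check that this is expected"
    return "in between; no action suggested"


rate = skip_rate({"captured": 450, "skipped_prefilter": 380})
print(f"{rate:.0%}: {describe_skip_rate(rate)}")
```

Recording the rate at regular intervals also makes the "increasing over time" case visible, since a single reading cannot show a trend.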
To adjust:
  • Lower the motion threshold (if available in config): MOTION_THRESHOLD=0.01
  • Or accept it - high skip rates actually save you money!

Job Failing Immediately

Symptom: The job starts but fails immediately.
Check error logs:
curl "https://trio.machinefi.com/logs?level=ERROR"
Look for patterns like:
API Key Error:
[JOB:a1b2c3d4] Failed to call gemini: 401 Unauthorized
Fix: Check that GOOGLE_API_KEY is set and valid.
Stream Error:
[JOB:a1b2c3d4] Failed to open stream: Video unavailable
Fix: Verify that the URL is a live stream (not a regular video or VOD).
Rate Limit:
[JOB:a1b2c3d4] gemini: 429 Too Many Requests
Fix: Increase interval_seconds, add a fallback API key, or wait.
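
When there are many error lines, the three patterns above can be matched automatically. This is a hedged sketch: it relies only on the substrings that appear in the example messages above, so real messages with different wording fall into the "other" bucket. The names classify_error and FIXES are hypothetical.

```python
# Suggested fixes, taken from the three cases described in this section.
FIXES = {
    "api_key": "Check that GOOGLE_API_KEY is set and valid",
    "stream": "Verify that the URL is a live stream (not a regular video or VOD)",
    "rate_limit": "Increase interval_seconds, add a fallback API key, or wait",
    "other": "Read the full message; it does not match a known pattern",
}


def classify_error(message):
    """Return a category key for an error log message, based on known substrings."""
    if "401 Unauthorized" in message:
        return "api_key"
    if "429 Too Many Requests" in message:
        return "rate_limit"
    if "Failed to open stream" in message:
        return "stream"
    return "other"


line = "[JOB:a1b2c3d4] gemini: 429 Too Many Requests"
print(FIXES[classify_error(line)])
```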

Webhook Not Received

Symptom: The condition triggered, but the webhook didn’t arrive.
Check metrics:
curl https://trio.machinefi.com/metrics | jq '.webhooks'
Output:
{
  "sent": 28,
  "failed": 5
}
Failure rate: 5/28 ≈ 18%
Debugging:
  1. Check if your webhook is reachable:
    curl -X POST https://your-webhook-url.com/webhook \
      -H "Content-Type: application/json" \
      -d '{"test": "data"}'
    
    Should return 2xx status code.
  2. Check for timeouts: Webhooks must respond within 5 seconds. If your receiver takes longer:
    • Use background tasks
    • Return immediately, process async
  3. Check firewall: Verify your server allows incoming connections from Trio’s IP range.
  4. Check logs for delivery errors:
    curl "https://trio.machinefi.com/logs?level=ERROR" | grep webhook
    
See also: Webhooks & Notifications
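
The "return immediately, process async" advice in step 2 can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not Trio's recommended receiver: the payload is assumed to be JSON, and the names accept, handle_event, and serve are hypothetical. A web framework with background tasks would follow the same pattern of acknowledging first and doing the slow work elsewhere.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Payloads wait here until the worker thread picks them up.
events = queue.Queue()


def handle_event(payload):
    """Placeholder for slow work such as database writes or notifications."""
    print("processing", payload)


def worker():
    """Process queued payloads one at a time, outside the request path."""
    while True:
        payload = events.get()
        try:
            handle_event(payload)
        finally:
            events.task_done()


def accept(body):
    """Parse and enqueue a request body; return the HTTP status to send."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400
    events.put(payload)  # hand off only; no slow work here
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = accept(self.rfile.read(length))
        self.send_response(status)  # reply before any processing happens
        self.send_header("Content-Length", "0")
        self.end_headers()


def serve(port=8000):
    """Start the worker thread and block serving webhook requests."""
    threading.Thread(target=worker, daemon=True).start()
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver. The handler only parses and enqueues, so its response time does not depend on how long handle_event takes.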

Stream Goes Offline

Symptom: The job fails with a stream error.
Check logs:
curl "https://trio.machinefi.com/logs?level=ERROR" | grep -i stream
You’ll see:
[JOB:a1b2c3d4] Stream offline: URL no longer valid
Solutions:
  1. Verify stream is actually live:
    • Open the URL in a browser
    • Check if the stream is still broadcasting
  2. Use URL validation before starting jobs:
    curl "https://trio.machinefi.com/validate-url?url=YOUR_URL"
    
  3. Handle stream failures in your webhook:
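    # Use .get() so that a payload without "details" does not raise KeyError
    details = payload.get("details") or {}
    if payload.get("type") == "job_status" and details.get("reason") == "stream_offline":
        # Handle the offline stream; notify_user is your own notification helper
        notify_user("Stream went offline")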
    

High API Costs

Symptom: Bills are higher than expected.
Analyze metrics:
curl https://trio.machinefi.com/metrics | jq '.providers'
See: Cost Analysis Guide
Quick wins:
  1. Increase interval_seconds (60+ seconds)
  2. Ensure pre-filter is enabled
  3. Use a cheaper model (gemini-2.5-flash)
  4. Check for jobs left running 24/7 unintentionally
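
To get a feel for how the first two quick wins interact, here is a back-of-the-envelope estimate. It rests on assumptions that this guide does not confirm: that a job checks one frame every interval_seconds, that each frame not skipped by the pre-filter results in exactly one model call, and that the job runs all day. The function name is hypothetical, and actual billing depends on your provider's pricing.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimated_calls_per_day(interval_seconds, skip_rate=0.0):
    """Rough model calls per day for one job running around the clock.

    Assumes one frame per interval and one model call per non-skipped frame.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not 0.0 <= skip_rate <= 1.0:
        raise ValueError("skip_rate must be between 0 and 1")
    checks = SECONDS_PER_DAY / interval_seconds
    return checks * (1.0 - skip_rate)


# Compare a 10-second interval with a 60-second one, without and with pre-filtering.
for interval in (10, 60):
    for rate in (0.0, 0.46):
        calls = estimated_calls_per_day(interval, rate)
        print(f"interval={interval}s skip_rate={rate:.0%}: ~{calls:.0f} calls/day")
```

Under these assumptions, moving from a 10-second to a 60-second interval cuts the estimate by a factor of six, independent of the skip rate.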

Debug Workflow

When a job isn’t working:
  1. Check if it’s running:
    curl https://trio.machinefi.com/jobs/YOUR_JOB_ID
    
  2. Review recent logs:
    curl "https://trio.machinefi.com/logs?level=ERROR&limit=20"
    
  3. Check metrics for patterns:
    curl https://trio.machinefi.com/metrics
    
  4. Test condition independently:
    curl -X POST https://trio.machinefi.com/check-once \
      -H "Content-Type: application/json" \
      -d '{"stream_url": "...", "condition": "..."}'
    
  5. Refine and retry
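
Steps 1 to 3 of the workflow are plain GET requests, so they can be bundled into one script. This sketch uses only the endpoints shown above and the standard library. It assumes that each endpoint returns JSON and that no authentication header is required, as in the curl examples; the function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://trio.machinefi.com"


def debug_urls(job_id):
    """Return (label, url) pairs for workflow steps 1 to 3, in order."""
    return [
        ("job status", f"{BASE_URL}/jobs/{job_id}"),
        ("recent errors", f"{BASE_URL}/logs?level=ERROR&limit=20"),
        ("metrics", f"{BASE_URL}/metrics"),
    ]


def run_debug(job_id):
    """Fetch each URL and pretty-print the JSON response (performs network calls)."""
    for label, url in debug_urls(job_id):
        with urllib.request.urlopen(url, timeout=10) as response:
            body = json.load(response)
        print(f"== {label} ==")
        print(json.dumps(body, indent=2))
```

Calling run_debug("YOUR_JOB_ID") prints the three responses one after another; step 4 (check-once) is left out because it needs your stream URL and condition.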

Tips for Production

Monitor Continuously

Set up periodic health checks:
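import time

import httpx
import schedule

def check_health():
    response = httpx.get("https://trio.machinefi.com/healthz", timeout=10)
    if response.status_code != 200:
        alert("Trio API is down!")  # alert() is your own notification function

schedule.every(5).minutes.do(check_health)

# schedule only fires jobs when run_pending() is called, so poll it in a loop
while True:
    schedule.run_pending()
    time.sleep(1)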

Set Up Alerting

Alert on high error rates:
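import httpx

metrics = httpx.get("https://trio.machinefi.com/metrics").json()

# Alert if error rate > 5%
for provider, stats in metrics['providers'].items():
    if stats['calls'] == 0:
        continue  # No calls yet, so there is no meaningful error rate
    error_rate = stats['errors'] / stats['calls']
    if error_rate > 0.05:
        alert(f"{provider} error rate: {error_rate:.1%}")  # alert() is your own notification function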