Alerts
Define alert rules using GQL queries with threshold conditions. GoodLogs evaluates them every 60 seconds and notifies you via webhook, Slack, or Discord when metrics cross thresholds.
Default Rules on New Projects
Every newly created project is seeded with three baseline alerts, so basic monitoring is in place on day one with no configuration needed:
| Name | Condition | Severity | Cooldown |
|---|---|---|---|
| High error rate | error_count > 50 over 30 min | warning | 60 min |
| Critical error spike | error_count > 200 over 30 min | critical | 30 min |
| Log volume drop | log_volume < 1 over 15 min | warning | 60 min |
You can disable, tune, or delete them from Dashboard → Alerts.
Creating Alerts
Go to Alerts in the sidebar and click + New Alert. Write a GQL query with an alert condition, or use the AI button to describe what you want in plain English.
GQL Alert Syntax
Alert queries use the over:WINDOW OP THRESHOLD syntax at the end of the pipeline. This defines a rolling time window and a condition that triggers the alert.
# Basic: alert if more than 50 errors in 30 minutes
severity:error | count | over:30m > 50
# Any fatal errors in 5 minutes
severity:fatal | count | over:5m > 0
# Pattern match: payment failures
message:~"payment failed" | count | over:1h > 10
# Service-specific errors
severity:error service:billing | count | over:15m > 20
# Dead service (no events)
from:events | count | over:10m < 1
# 5xx errors
status_code:>=500 | count | over:5m > 50
# Latency degradation
| avg(duration_ms) | over:10m > 2000
# Signup drops below threshold
from:events event_name:signup | count | over:1h < 5
# Database timeout detection
message:~"database timeout" | count | over:5m > 0
| Operator | Meaning | Example |
|---|---|---|
| > | Greater than | over:30m > 100 |
| >= | Greater or equal | over:5m >= 1 |
| < | Less than (detect drops) | over:10m < 1 |
| <= | Less or equal | over:1h <= 5 |
| = | Equal to | over:30m = 0 |
| != | Not equal to | over:1h != 0 |
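To make the over:WINDOW OP THRESHOLD clause concrete, here is a minimal Python sketch that extracts the window, operator, and threshold from the end of a query and checks a value against them. It is an illustration only: the regular expression, the AlertCondition class, and parse_condition are hypothetical names, and the real GQL grammar may accept more than the m and h units shown in the examples above.

```python
import operator
import re
from dataclasses import dataclass

# Comparison operators from the table above, mapped to Python functions.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
}

# Hypothetical pattern for the trailing clause; the real grammar may differ.
# Two-character operators are listed first so ">=" is not read as ">".
CLAUSE = re.compile(r"over:(\d+)([mh])\s*(>=|<=|!=|>|<|=)\s*(-?\d+(?:\.\d+)?)\s*$")
UNIT_MINUTES = {"m": 1, "h": 60}


@dataclass
class AlertCondition:
    window_minutes: int
    op: str
    threshold: float

    def is_breached(self, value: float) -> bool:
        """Return True when the aggregated value satisfies the condition."""
        return OPS[self.op](value, self.threshold)


def parse_condition(query: str) -> AlertCondition:
    """Extract the window, operator, and threshold from the end of a query."""
    match = CLAUSE.search(query)
    if match is None:
        raise ValueError(f"no alert condition found in: {query!r}")
    amount, unit, op, threshold = match.groups()
    return AlertCondition(int(amount) * UNIT_MINUTES[unit], op, float(threshold))


cond = parse_condition("severity:error | count | over:30m > 50")
print(cond.window_minutes, cond.op, cond.threshold)
print(cond.is_breached(127))
```

Note that "less than" conditions such as over:10m < 1 are breached by low values, which is what makes them suitable for detecting drops.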
Supported Aggregations
Any GQL aggregate function can be used in the pipeline stage before the over: clause:
| count | over:30m > 50 # count of matching rows
| avg(duration_ms) | over:10m > 2000 # average of a field
| sum(amount) | over:1h > 10000 # sum of a field
| max(response_time) | over:5m > 5000 # max value
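The sketch below shows one way to think about what these aggregations compute: keep only the rows inside the rolling window, then reduce them with the chosen function. It is not GoodLogs' implementation. The row shape (dicts with a timestamp key), the aggregate_window helper, and the choice to return 0 for an empty window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

# Hypothetical reducers keyed by the GQL function name.
# Returning 0 for an empty window is an assumption, not documented behavior.
REDUCERS = {
    "avg": lambda values: mean(values) if values else 0,
    "sum": lambda values: sum(values),
    "max": lambda values: max(values) if values else 0,
}


def aggregate_window(rows, func, now, window_minutes, field=None):
    """Aggregate rows whose timestamp falls in (now - window, now]."""
    cutoff = now - timedelta(minutes=window_minutes)
    in_window = [r for r in rows if cutoff < r["timestamp"] <= now]
    if func == "count":
        return len(in_window)
    values = [r[field] for r in in_window if field in r]
    return REDUCERS[func](values)


now = datetime(2026, 5, 26, 14, 32, tzinfo=timezone.utc)
rows = [
    {"timestamp": now - timedelta(minutes=1), "duration_ms": 1000},
    {"timestamp": now - timedelta(minutes=5), "duration_ms": 3000},
    {"timestamp": now - timedelta(minutes=20), "duration_ms": 9000},
]

# Only the first two rows fall inside a 10-minute window.
print(aggregate_window(rows, "avg", now, 10, "duration_ms"))
print(aggregate_window(rows, "count", now, 30))
```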
AI-Powered Alert Creation
Click the AI button in the alert GQL bar to describe your alert condition in plain English. The AI generates a valid GQL alert query using your project's actual schema.
Example Prompts
"alert me if there are more than 100 errors in 30 minutes"
→ severity:error | count | over:30m > 100
"notify when fatal errors happen"
→ severity:fatal | count | over:5m > 0
"alert if payment failures exceed 10 per hour"
→ message:~"payment failed" | count | over:1h > 10
"warn if average response time goes above 2 seconds"
→ | avg(duration_ms) | over:10m > 2000
"alert when no events for 10 minutes"
→ from:events | count | over:10m < 1
"alert if 5xx errors exceed 50 in 5 minutes"
→ status_code:>=500 | count | over:5m > 50
The AI knows your schema — it uses your actual field names, event names, and property types when generating queries.
Tip: While typing in AI mode, autocomplete suggests your project's field names to help you reference them accurately in your description.
Quick-Start Examples
Click any example in the alert creation form to pre-fill the GQL bar:
| Template | What It Does |
|---|---|
| severity:error \| count \| over:30m > 50 | Error spike detection |
| severity:fatal \| count \| over:5m > 0 | Any fatal error |
| message:~timeout \| count \| over:15m > 20 | Timeout pattern |
| message:=~"payment.*failed" \| count \| over:1h > 10 | Regex pattern match |
| from:events \| count \| over:10m < 1 | Dead service detection |
Severity
Each alert has a severity that affects notification styling and urgency:
| Severity | Color | Use Case |
|---|---|---|
| info | 🔵 Blue | Informational — non-urgent notifications |
| warning | 🟡 Yellow | Warning — needs attention soon (default) |
| critical | 🔴 Red | Critical — immediate action required |
Notification Channels
Configure one or more notification channels per alert. Notifications fire on both trigger and resolve events.
Webhook
POST a JSON payload to any HTTP endpoint.
{
  "alert": "Error Spike",
  "metric": "error_count",
  "value": 127,
  "threshold": 50,
  "status": "triggered",
  "severity": "critical",
  "project_id": "uuid",
  "message": "Error Spike: error_count is 127 (threshold: 50)",
  "timestamp": "2026-05-26T14:32:00Z"
}
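On the receiving side, it can help to parse this payload into a typed object before acting on it. The following Python sketch assumes the payload has exactly the fields shown above; the AlertNotification class and parse_notification function are hypothetical names, not part of any GoodLogs SDK.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AlertNotification:
    alert: str
    metric: str
    value: float
    threshold: float
    status: str      # "triggered" or "resolved"
    severity: str    # "info", "warning", or "critical"
    project_id: str
    message: str
    timestamp: datetime


def parse_notification(raw_body: bytes) -> AlertNotification:
    """Parse a webhook body shaped like the example payload above."""
    data = json.loads(raw_body)
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    timestamp = datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00"))
    return AlertNotification(
        alert=data["alert"],
        metric=data["metric"],
        value=data["value"],
        threshold=data["threshold"],
        status=data["status"],
        severity=data["severity"],
        project_id=data["project_id"],
        message=data["message"],
        timestamp=timestamp,
    )


example = b"""{
  "alert": "Error Spike", "metric": "error_count", "value": 127,
  "threshold": 50, "status": "triggered", "severity": "critical",
  "project_id": "uuid",
  "message": "Error Spike: error_count is 127 (threshold: 50)",
  "timestamp": "2026-05-26T14:32:00Z"
}"""
notification = parse_notification(example)
print(notification.alert, notification.status, notification.timestamp.isoformat())
```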
Slack
Formatted message with severity emoji and metric details. Provide an incoming webhook URL.
🔴 *Alert TRIGGERED*: Error Spike
Error Spike: error_count is 127 (threshold: 50)
*Metric:* `error_count` | *Value:* `127` | *Threshold:* `50`
Discord
Color-coded rich embed (red for critical, yellow for warning, green for resolved).
Webhook Signing
Each notification channel supports an optional signing secret. When configured, GoodLogs computes an HMAC-SHA256 signature of the JSON payload and includes it in the x-goodlogs-signature header.
x-goodlogs-signature: sha256=a1b2c3d4e5f6...
Verifying Signatures
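On your server, compute the HMAC-SHA256 of the raw request body using your secret and compare it with the value of the x-goodlogs-signature header. The first example is Node.js (Express-style handler) and the second is Python (Flask-style handler):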
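const crypto = require('crypto');
const express = require('express');

// Returns true only if the header matches the HMAC-SHA256 of the raw body.
function verifySignature(rawBody, secret, signature) {
  const expected = Buffer.from(
    'sha256=' + crypto.createHmac('sha256', secret).update(rawBody).digest('hex')
  );
  const received = Buffer.from(signature || '');
  // timingSafeEqual throws if the lengths differ, so check them first.
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

// In your webhook handler:
// express.raw() keeps req.body as a Buffer of the exact bytes that were signed.
// Re-serializing a parsed body with JSON.stringify may not reproduce them.
app.post('/webhook', express.raw({ type: 'application/json' }), (req, res) => {
  const sig = req.headers['x-goodlogs-signature'];
  if (!verifySignature(req.body, WEBHOOK_SECRET, sig)) {
    return res.status(401).send('Invalid signature');
  }
  const alert = JSON.parse(req.body.toString('utf8'));
  // Process alert...
  res.sendStatus(200);
});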
import hashlib
import hmac

def verify_signature(body: bytes, secret: str, signature: str) -> bool:
    expected = 'sha256=' + hmac.new(
        secret.encode(), body, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature)

# In your webhook handler:
sig = request.headers.get('x-goodlogs-signature', '')
if not verify_signature(request.data, WEBHOOK_SECRET, sig):
    abort(401)
Warning: Always use constant-time comparison (timingSafeEqual / compare_digest) to prevent timing attacks.
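To test a handler locally without waiting for a real alert, you can produce a signature yourself. The Python sketch below assumes the header format shown above (sha256= followed by the hex digest of the raw body); sign_payload is a hypothetical helper and the secret is a placeholder. It also demonstrates why the raw bytes matter: re-serializing the same JSON with different whitespace yields a different signature.

```python
import hashlib
import hmac
import json


def sign_payload(body: bytes, secret: str) -> str:
    """Build the header value: 'sha256=' plus the hex HMAC-SHA256 of the body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(body: bytes, secret: str, signature: str) -> bool:
    """Constant-time comparison of the expected and received header values."""
    return hmac.compare_digest(sign_payload(body, secret), signature)


secret = "test-secret"  # placeholder; use your channel's signing secret
body = json.dumps({"alert": "Error Spike", "status": "triggered"}).encode()
header_value = sign_payload(body, secret)

# The exact bytes that were signed verify successfully.
print(verify_signature(body, secret, header_value))

# The same JSON re-serialized without spaces is a different byte string,
# so verification fails even though the data is logically identical.
reserialized = json.dumps(json.loads(body), separators=(",", ":")).encode()
print(verify_signature(reserialized, secret, header_value))
```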
Cooldown
Set a cooldown period (in minutes) to prevent the same alert from re-triggering too quickly. This avoids notification storms from flapping metrics.
Cooldown: 15 minutes
→ After triggering, the alert won't fire again for 15 minutes
even if the metric stays above the threshold.
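The cooldown rule can be sketched as a simple time comparison. This Python example is an illustration under assumptions: may_fire is a hypothetical name, and treating the exact end of the cooldown as eligible to fire again is a guess about boundary behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def may_fire(last_fired: Optional[datetime], now: datetime, cooldown_minutes: int) -> bool:
    """Return True if the alert is outside its cooldown and may notify again."""
    if last_fired is None:
        return True  # never fired before, so no cooldown applies
    return now - last_fired >= timedelta(minutes=cooldown_minutes)


fired_at = datetime(2026, 5, 26, 14, 32, tzinfo=timezone.utc)

# Five minutes after firing, a 15-minute cooldown still suppresses the alert.
print(may_fire(fired_at, fired_at + timedelta(minutes=5), 15))

# Twenty minutes after firing, the alert may fire again.
print(may_fire(fired_at, fired_at + timedelta(minutes=20), 15))
```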
Muting
Mute an alert to suppress notifications during maintenance windows. A muted alert is still evaluated and its status still updates, but no notifications are sent.
POST /api/orgs/:org/projects/:project/alerts/:id/mute
{ "minutes": 60 }
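A scripted maintenance window could build this request as follows. This is a sketch only: the base URL, the bearer-token Authorization header, and the build_mute_request helper are assumptions, since the authentication scheme is not described here. The request is constructed but not sent.

```python
import json
import urllib.request


def build_mute_request(base_url, org, project, alert_id, minutes, token):
    """Build (but do not send) a POST request to the mute endpoint."""
    url = f"{base_url}/api/orgs/{org}/projects/{project}/alerts/{alert_id}/mute"
    body = json.dumps({"minutes": minutes}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check how your API tokens are passed.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_mute_request(
    "https://goodlogs.example",  # placeholder host
    org="acme",
    project="web",
    alert_id="123",
    minutes=60,
    token="YOUR_API_TOKEN",
)
print(request.get_method(), request.full_url)
print(request.data.decode())

# To actually send it: urllib.request.urlopen(request)
```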
Message Templates
Customize notification messages with variable substitution:
🚨 {{name}}: {{metric}} is {{value}} (threshold: {{threshold}}, status: {{status}})
| Variable | Description |
|---|---|
| {{name}} | Alert rule name |
| {{metric}} | Metric being measured |
| {{value}} | Current metric value |
| {{threshold}} | Configured threshold |
| {{status}} | triggered or resolved |
| {{severity}} | info, warning, or critical |
| {{project_id}} | Project UUID |
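Variable substitution of this kind can be pictured with a few lines of Python. The sketch below is not GoodLogs' renderer: render_template is a hypothetical name, and leaving unknown placeholders untouched is an assumption about how unrecognized variables are handled.

```python
import re

# Matches {{name}} placeholders, allowing optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: dict) -> str:
    """Replace each {{variable}} with its value from `variables`."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: unknown variables are left as-is rather than blanked.
        if name not in variables:
            return match.group(0)
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


template = "🚨 {{name}}: {{metric}} is {{value}} (threshold: {{threshold}}, status: {{status}})"
rendered = render_template(
    template,
    {
        "name": "Error Spike",
        "metric": "error_count",
        "value": 127,
        "threshold": 50,
        "status": "triggered",
    },
)
print(rendered)
```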
Alert Lifecycle
- OK → TRIGGERED: metric crosses threshold → trigger notification
- TRIGGERED → OK: metric returns to normal → resolve notification (includes duration)
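The two transitions above form a small state machine, sketched here in Python. AlertState and step are hypothetical names; the event_type and duration_seconds keys mirror the alert_events fields documented for from:alert_events, but the event shape itself is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class AlertState:
    status: str = "ok"
    triggered_at: Optional[datetime] = None


def step(state: AlertState, breached: bool, now: datetime) -> Optional[dict]:
    """Advance the state; return an event on a transition, otherwise None."""
    if state.status == "ok" and breached:
        state.status = "triggered"
        state.triggered_at = now
        return {"event_type": "triggered"}
    if state.status == "triggered" and not breached:
        # The resolve event carries how long the alert was firing.
        duration = int((now - state.triggered_at).total_seconds())
        state.status = "ok"
        state.triggered_at = None
        return {"event_type": "resolved", "duration_seconds": duration}
    return None  # no change: still OK, or still triggered


state = AlertState()
start = datetime(2026, 5, 26, 14, 32, tzinfo=timezone.utc)
print(step(state, True, start))                          # OK -> TRIGGERED
print(step(state, True, start + timedelta(minutes=5)))   # still triggered
print(step(state, False, start + timedelta(minutes=10))) # TRIGGERED -> OK
```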
Querying Alerts with GQL
Use from:alerts to query alert rules and from:alert_events to query the alert timeline.
Alert Rules
# All alert rules
from:alerts
# Currently firing alerts
from:alerts status:triggered
# Critical alerts
from:alerts severity:critical
# Count alerts by status
from:alerts | count by status
Fields: name, metric, status (ok/triggered), severity, threshold, window_minutes, enabled, cooldown_minutes
Alert Timeline
# Recent triggers
from:alert_events event_type:triggered | last:7d
# Most triggered alerts
from:alert_events event_type:triggered | count by alert_name | top 10 | last:30d
# Alert frequency trend
from:alert_events | count | timeseries 1d | last:30d
# Average resolution time
from:alert_events event_type:resolved | avg(duration_seconds) | last:30d
Fields: event_type (triggered/resolved), metric, actual_value, threshold, alert_name, duration_seconds
Status Pages
Alert events power the Public Status Pages feature. When status pages are enabled on a project, the page automatically derives its state from open alert events:
- No open alerts → Operational (green)
- Open alert with error/fatal/crash/5xx metric → Outage (red)
- Any other open alert → Degraded (amber)
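These three rules can be expressed as a short function. The Python sketch below assumes that an "error/fatal/crash/5xx metric" means the metric name contains one of those words, which is a guess about the matching logic; page_status and OUTAGE_KEYWORDS are hypothetical names.

```python
from typing import List

# Assumption: a metric counts as outage-class if its name contains
# one of these substrings (case-insensitive).
OUTAGE_KEYWORDS = ("error", "fatal", "crash", "5xx")


def page_status(open_alert_metrics: List[str]) -> str:
    """Derive the status page state from the metrics of open alerts."""
    if not open_alert_metrics:
        return "operational"
    for metric in open_alert_metrics:
        if any(keyword in metric.lower() for keyword in OUTAGE_KEYWORDS):
            return "outage"
    return "degraded"


print(page_status([]))
print(page_status(["avg_duration_ms"]))
print(page_status(["avg_duration_ms", "error_count"]))
```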
The 90-day uptime bar and incident history are built from the alert_events timeline. See Status Pages for setup and configuration.