Alerts notify you when your queues experience issues like high failure rates, large backlogs, or slow processing. Stay informed about problems before they impact users.
Alerts are available on Pro and Enterprise plans. Upgrade your plan to enable alerts.
Alert Types
bullstudio supports six alert types to cover common queue monitoring scenarios:
Failure Rate
Triggers when the percentage of failed jobs exceeds a threshold.
| Setting | Description | Example |
|---|---|---|
| Threshold | Failure rate percentage to trigger alert | 10% |
| Time Window | Period to calculate failure rate | 15 minutes |
| Resolve Threshold | Rate to resolve alert | 5% |
Use case: Detect when your workers are experiencing higher-than-normal error rates.
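The trigger and resolve thresholds form a hysteresis band: the alert fires above the trigger threshold and clears only once the rate falls back to the resolve threshold. The Python sketch below illustrates that logic with the example values from the table. It is not bullstudio's implementation. `failure_rate` and `next_status` are hypothetical names, and the sketch assumes the rate is failed jobs divided by all finished jobs in the window, and that the alert resolves at or below the resolve threshold.

```python
def failure_rate(failed: int, completed: int) -> float:
    """Percentage of failed jobs among all finished jobs in the window."""
    total = failed + completed
    if total == 0:
        return 0.0  # assumption: an empty window counts as 0% failures
    return 100.0 * failed / total


def next_status(current: str, rate: float,
                threshold: float = 10.0,
                resolve_threshold: float = 5.0) -> str:
    """Return the new status ("OK" or "Triggered") for the measured rate."""
    if current == "OK" and rate > threshold:
        return "Triggered"
    if current == "Triggered" and rate <= resolve_threshold:
        return "OK"
    return current  # inside the band: keep the previous status


rate = failure_rate(failed=12, completed=88)   # 12.0
status = next_status("OK", rate)               # "Triggered"
status = next_status(status, 7.0)              # still "Triggered" (7% > 5%)
status = next_status(status, 4.0)              # "OK"
```

A rate of 7% leaves the status unchanged in either direction: it is too low to trigger a new alert and too high to resolve an existing one.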
Backlog Exceeded
Triggers when the number of waiting jobs exceeds a threshold.
| Setting | Description | Example |
|---|---|---|
| Threshold | Number of waiting jobs to trigger | 1000 |
| Resolve Threshold | Number to resolve alert | 500 |
Use case: Detect when jobs are piling up faster than workers can process them.
Processing Time (Average)
Triggers when average job processing time exceeds a threshold.
| Setting | Description | Example |
|---|---|---|
| Threshold | Average time in milliseconds | 5000ms |
| Time Window | Period to calculate average | 10 minutes |
| Resolve Threshold | Time to resolve alert | 3000ms |
Use case: Detect when jobs are taking longer than expected to process.
Processing Time (P95)
Triggers when the 95th percentile processing time exceeds a threshold.
| Setting | Description | Example |
|---|---|---|
| Threshold | P95 time in milliseconds | 10000ms |
| Time Window | Period to calculate P95 | 10 minutes |
| Resolve Threshold | Time to resolve alert | 7000ms |
Use case: Detect slow outlier jobs that the average hides and that may point to a problem with specific jobs.
Processing Time (P99)
Triggers when the 99th percentile processing time exceeds a threshold.
| Setting | Description | Example |
|---|---|---|
| Threshold | P99 time in milliseconds | 30000ms |
| Time Window | Period to calculate P99 | 10 minutes |
| Resolve Threshold | Time to resolve alert | 20000ms |
Use case: Catch the slowest jobs that may time out or cause problems.
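To see why the three processing-time alerts can disagree, the sketch below computes the average, P95, and P99 for one set of job durations. It uses the nearest-rank percentile method, which is an assumption: how bullstudio calculates percentiles is not stated here. The `percentile` helper and the sample data are hypothetical.

```python
def percentile(durations_ms: list, p: int) -> int:
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not durations_ms:
        raise ValueError("no samples in the window")
    ordered = sorted(durations_ms)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) in integer arithmetic
    return ordered[max(rank, 1) - 1]


# Hypothetical window: 90 fast jobs, 8 slow jobs, 2 very slow jobs.
durations = [200] * 90 + [6000] * 8 + [35000] * 2

average = sum(durations) / len(durations)  # 1360.0, below the 5000ms example threshold
p95 = percentile(durations, 95)            # 6000, below the 10000ms example threshold
p99 = percentile(durations, 99)            # 35000, above the 30000ms example threshold
```

With the example thresholds from the tables, only the P99 alert would trigger for this window, because the two very slow jobs barely move the average.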
Missing Workers
Triggers when no workers have processed jobs for a specified period.
| Setting | Description | Example |
|---|---|---|
| Grace Period | Time with no activity before alerting | 5 minutes |
Use case: Detect when all workers have crashed or stopped.
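The grace period check amounts to comparing the time since the last processed job with the configured period. A minimal sketch, assuming the last activity timestamp is available; `workers_missing` is a hypothetical name, not a bullstudio function.

```python
from datetime import datetime, timedelta


def workers_missing(last_activity: datetime, now: datetime,
                    grace_period: timedelta = timedelta(minutes=5)) -> bool:
    """True when no job has been processed for longer than the grace period."""
    return now - last_activity > grace_period


now = datetime(2024, 1, 1, 12, 0)
workers_missing(datetime(2024, 1, 1, 11, 57), now)  # False: 3 minutes of inactivity
workers_missing(datetime(2024, 1, 1, 11, 50), now)  # True: 10 minutes of inactivity
```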
Alert Statuses
| Status | Description |
|---|---|
| OK | Conditions are within normal thresholds |
| Triggered | Threshold exceeded, notification sent |
Creating an Alert
Navigate to Alerts
Go to your workspace and click Alerts in the sidebar.
Click Create Alert
Click the Create Alert button.
Configure the alert
Fill in the alert configuration:

| Field | Description |
|---|---|
| Name | Descriptive name for the alert |
| Description | Optional details about what this alert monitors |
| Connection | Which Redis connection to monitor |
| Queue | Which queue to monitor (or all queues) |
| Alert Type | Select from the available types |
| Configuration | Type-specific thresholds |
| Recipients | Email addresses for notifications |
| Cooldown | Minimum time between notifications |
Save
Click Save to create the alert. It will start monitoring immediately.
Alert Configuration
Thresholds
Set appropriate thresholds based on your normal operating parameters:
Normal failure rate: ~1%
→ Alert threshold: 5%
→ Resolve threshold: 2%
Normal processing time: ~500ms
→ Alert threshold: 2000ms
→ Resolve threshold: 1000ms
Start with higher thresholds and adjust down as you understand your baseline metrics.
Time Windows
Time windows determine how data is aggregated:
- Shorter windows (5-10 min): More responsive, may be noisy
- Longer windows (30-60 min): Smoother, may miss brief spikes
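The trade-off can be illustrated by computing the same failure rate over two window sizes. The sketch below is an assumption about how windowed aggregation might work, not bullstudio's implementation; `windowed_failure_rate` and the event data are hypothetical.

```python
from datetime import datetime, timedelta


def windowed_failure_rate(events, now, window):
    """Failure percentage over events that finished within `window` before `now`.

    `events` is a list of (finished_at, succeeded) tuples.
    """
    cutoff = now - window
    recent = [succeeded for finished_at, succeeded in events if finished_at >= cutoff]
    if not recent:
        return 0.0
    return 100.0 * recent.count(False) / len(recent)


now = datetime(2024, 1, 1, 12, 0)
# One job finished per minute over the last hour; the jobs 2, 3 and 4 minutes ago failed.
events = [(now - timedelta(minutes=m), m not in (2, 3, 4)) for m in range(1, 61)]

short = windowed_failure_rate(events, now, timedelta(minutes=5))   # 60.0
long = windowed_failure_rate(events, now, timedelta(minutes=60))   # 5.0
```

The brief burst of three failures dominates the 5-minute window but is diluted to 5% over 60 minutes, which would stay under a 10% threshold.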
Cooldown Period
The cooldown prevents alert fatigue:
- Minimum: 1 minute
- Maximum: 24 hours (1440 minutes)
- Default: 15 minutes
During cooldown, no new notifications are sent even if the alert remains triggered.
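A minimal sketch of the cooldown check, using the limits listed above. `should_notify` and the constant names are hypothetical, and the sketch assumes the cooldown is measured from the last notification sent.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_COOLDOWN = timedelta(minutes=1)
MAX_COOLDOWN = timedelta(hours=24)
DEFAULT_COOLDOWN = timedelta(minutes=15)


def should_notify(last_sent: Optional[datetime], now: datetime,
                  cooldown: timedelta = DEFAULT_COOLDOWN) -> bool:
    """True when a notification may be sent for an alert that is currently triggered."""
    if not MIN_COOLDOWN <= cooldown <= MAX_COOLDOWN:
        raise ValueError("cooldown must be between 1 minute and 24 hours")
    # Assumption: the first notification is always sent.
    return last_sent is None or now - last_sent >= cooldown


now = datetime(2024, 1, 1, 12, 0)
should_notify(None, now)                              # True: nothing sent yet
should_notify(datetime(2024, 1, 1, 11, 50), now)      # False: only 10 minutes ago
should_notify(datetime(2024, 1, 1, 11, 40), now)      # True: 20 minutes ago
```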
Recipients
Add one or more email addresses to receive notifications:
- A notification is sent when the alert triggers
- A resolution notification is sent when the alert returns to OK
Managing Alerts
Edit Alert
- Click on the alert in the list
- Click Edit
- Modify settings
- Click Save
Enable/Disable Alert
Toggle an alert without deleting it:
- Click on the alert
- Toggle the Enabled switch
- Disabled alerts are not evaluated and do not send notifications
Delete Alert
- Click on the alert
- Click Delete
- Confirm deletion
Deleting an alert removes all its history. Consider disabling instead if you may need it later.
Test Alert
Send a test notification to verify your configuration:
- Click on the alert
- Click Test
- A test notification will be sent to all recipients
Viewing Alert History
Each alert maintains a history of triggers:
- When the alert was triggered
- When it was resolved
- Duration of the incident
Use this to:
- Identify patterns in issues
- Track incident frequency
- Measure improvement over time
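If you record the trigger and resolve times from the history, the incident durations and their average are simple to compute. The sketch below assumes the history is available as pairs of timestamps; that format and the function names are hypothetical, not a bullstudio export.

```python
from datetime import datetime, timedelta


def incident_durations(history):
    """Durations of resolved incidents; `history` holds (triggered_at, resolved_at) pairs."""
    return [resolved - triggered
            for triggered, resolved in history
            if resolved is not None]  # skip incidents that are still open


def mean_duration(history):
    """Average duration of resolved incidents, or None if there are none."""
    durations = incident_durations(history)
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


history = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 10)),
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 9, 20)),
    (datetime(2024, 1, 3, 8, 0), None),  # still triggered
]
mean_duration(history)  # timedelta(minutes=15)
```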
Alert Best Practices
Start with Essential Alerts
Begin with these core alerts for most queues:
- Failure Rate: Catch bugs and external service issues
- Backlog Exceeded: Detect capacity problems
- Missing Workers: Know when workers are down
Set Meaningful Thresholds
Base thresholds on your actual metrics:
1. Review your dashboard for normal values
2. Set threshold at 2-3x normal
3. Set resolve threshold at 1.5x normal
4. Adjust based on false positives/negatives
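Steps 2 and 3 are plain multiplication on your baseline value. A small sketch with a hypothetical helper, `suggest_thresholds`, using 2x as the default trigger factor:

```python
def suggest_thresholds(baseline: float, trigger_factor: float = 2.0,
                       resolve_factor: float = 1.5) -> dict:
    """Starting thresholds derived from a baseline metric (same unit as the baseline)."""
    if resolve_factor >= trigger_factor:
        raise ValueError("resolve factor must be lower than trigger factor")
    return {
        "threshold": baseline * trigger_factor,
        "resolve_threshold": baseline * resolve_factor,
    }


# Baseline average processing time of 500ms.
suggest_thresholds(500)                      # {'threshold': 1000.0, 'resolve_threshold': 750.0}
suggest_thresholds(500, trigger_factor=3.0)  # {'threshold': 1500.0, 'resolve_threshold': 750.0}
```

Treat the result as a starting point only; step 4 still applies.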
Use Appropriate Time Windows
| Scenario | Recommended Window |
|---|---|
| High-volume queues (1000+ jobs/hour) | 5-10 minutes |
| Medium-volume queues (100-1000/hour) | 15-30 minutes |
| Low-volume queues (<100/hour) | 30-60 minutes |
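The table can be expressed as a simple lookup. `recommended_window_minutes` is a hypothetical helper. The table lists 1000 jobs/hour in both the high and medium rows; this sketch assigns that boundary to the high-volume row.

```python
from typing import Tuple


def recommended_window_minutes(jobs_per_hour: float) -> Tuple[int, int]:
    """Return the (min, max) recommended time window in minutes for a queue's volume."""
    if jobs_per_hour >= 1000:
        return (5, 10)    # high volume
    if jobs_per_hour >= 100:
        return (15, 30)   # medium volume
    return (30, 60)       # low volume


recommended_window_minutes(5000)  # (5, 10)
recommended_window_minutes(250)   # (15, 30)
recommended_window_minutes(20)    # (30, 60)
```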
Avoid Alert Fatigue
- Use resolve thresholds to prevent flapping
- Set reasonable cooldown periods
- Start with fewer, broader alerts
- Refine over time based on actual incidents
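To illustrate how a resolve threshold prevents flapping, the simulation below counts how often an alert would trigger for a metric that hovers around its threshold. It is a sketch of the general hysteresis idea, not bullstudio's evaluation logic; `count_triggers` and the sample values are hypothetical.

```python
def count_triggers(samples, threshold, resolve_threshold):
    """Count OK -> Triggered transitions over a series of metric samples."""
    status = "OK"
    triggers = 0
    for value in samples:
        if status == "OK" and value > threshold:
            status = "Triggered"
            triggers += 1
        elif status == "Triggered" and value <= resolve_threshold:
            status = "OK"
    return triggers


# Failure rate (%) hovering around a 10% threshold, then recovering.
samples = [9, 11, 9, 11, 9, 11, 4]

count_triggers(samples, threshold=10, resolve_threshold=10)  # 3: resolves and re-triggers each time
count_triggers(samples, threshold=10, resolve_threshold=5)   # 1: stays triggered until the rate drops to 4%
```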
Notification Content
Alert notifications include:
- Alert name and type
- Queue and connection name
- Current metric value vs threshold
- Link to the alert in bullstudio
- Timestamp