Alerts notify you when your queues experience issues like high failure rates, growing backlogs, or slow processing. This guide walks through setting up effective alerts for your production queues.
Alerts require a Pro or Enterprise plan. Upgrade your plan to enable alerts.

Prerequisites

Before setting up alerts, ensure you have:
  • A bullstudio account with a Pro or Enterprise plan
  • At least one Redis connection configured
  • Queues actively processing jobs (for meaningful thresholds)

Step 1: Understand Your Baseline

Before creating alerts, understand your normal operating metrics:
1. Open the Dashboard

Navigate to your workspace and select your connection.

2. Review metrics for 24-48 hours

Look at:
  • Normal failure rate (typically < 1% for healthy queues)
  • Average processing time
  • Typical queue depth (waiting jobs)
  • Throughput patterns

3. Note your thresholds

Write down values that would indicate a problem:
  • Failure rate: 2-3x your normal rate
  • Processing time: 2-3x your normal average
  • Backlog: 10x your normal waiting count
Spend time observing your metrics before setting thresholds. Alerts based on real data are more effective than guesses.
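
The rules of thumb above are plain arithmetic, so they can be sketched as a small helper. This is only an illustration: the `Baseline` and `SuggestedThresholds` shapes and the `suggestThresholds` function are hypothetical names, not part of bullstudio, and the sample baseline values are made up.

```typescript
// Hypothetical shapes: they only mirror numbers read off the dashboard.
interface Baseline {
  failureRatePct: number;  // normal failure rate, in percent
  avgProcessingMs: number; // normal average processing time
  waitingJobs: number;     // typical queue depth
}

interface SuggestedThresholds {
  failureRatePct: number;
  processingMs: number;
  backlogJobs: number;
}

// Applies the guidelines from this step: 2-3x the baseline for failure
// rate and processing time, 10x the baseline for backlog.
function suggestThresholds(baseline: Baseline, multiplier = 3): SuggestedThresholds {
  if (multiplier < 2 || multiplier > 3) {
    throw new RangeError("multiplier should stay within the 2-3x guideline");
  }
  return {
    failureRatePct: baseline.failureRatePct * multiplier,
    processingMs: baseline.avgProcessingMs * multiplier,
    backlogJobs: baseline.waitingJobs * 10,
  };
}

// Example baseline (made-up values): 0.5% failures, 800 ms average, 40 waiting jobs.
const suggested = suggestThresholds({ failureRatePct: 0.5, avgProcessingMs: 800, waitingJobs: 40 });
// suggested: failureRatePct 1.5, processingMs 2400, backlogJobs 400
```

Treat the result as a starting point and adjust it once you see how often the alert fires.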

Step 2: Create Your First Alert

Start with a Failure Rate alert, the most broadly useful type:
1. Navigate to Alerts

Go to your workspace and click Alerts in the sidebar.

2. Click Create Alert

Click the Create Alert button.

3. Configure the alert

Fill in the form:

| Field | Value | Notes |
|---|---|---|
| Name | High Failure Rate | Descriptive name |
| Connection | Your production connection | Select from dropdown |
| Queue | Your main queue | Or leave blank for all queues |
| Type | Failure Rate | |
| Threshold | 5% | Adjust based on your baseline |
| Time Window | 15 minutes | Period to calculate rate |
| Resolve Threshold | 2% | When to consider resolved |
| Recipients | your-email@example.com | Add team members |
| Cooldown | 15 minutes | Prevent spam |

4. Save

Click Save to create the alert.
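
To make the Threshold and Time Window fields concrete, here is a minimal sketch of a windowed failure rate. It assumes the rate is failed jobs divided by all jobs that finished inside the window; bullstudio's exact calculation is not described in this guide, and `FinishedJob` and `failureRatePct` are hypothetical names.

```typescript
// Hypothetical record of a finished job; not a bullstudio or BullMQ type.
interface FinishedJob {
  finishedAt: number; // epoch milliseconds
  failed: boolean;
}

// Failure rate (percent) among jobs that finished within the last `windowMs`.
// Returns 0 for an empty window so an idle queue does not look unhealthy.
function failureRatePct(jobs: FinishedJob[], nowMs: number, windowMs: number): number {
  const recent = jobs.filter((j) => j.finishedAt > nowMs - windowMs && j.finishedAt <= nowMs);
  if (recent.length === 0) return 0;
  const failures = recent.filter((j) => j.failed).length;
  return (failures / recent.length) * 100;
}

const MINUTE_MS = 60 * 1000;
const sampleNow = 60 * MINUTE_MS;
const sampleJobs: FinishedJob[] = [
  { finishedAt: sampleNow - 20 * MINUTE_MS, failed: true }, // outside the 15-minute window
  { finishedAt: sampleNow - 10 * MINUTE_MS, failed: true },
  { finishedAt: sampleNow - 5 * MINUTE_MS, failed: false },
  { finishedAt: sampleNow - 2 * MINUTE_MS, failed: false },
  { finishedAt: sampleNow - 1 * MINUTE_MS, failed: false },
];

// One failure among the four jobs inside the window gives 25%.
const windowRate = failureRatePct(sampleJobs, sampleNow, 15 * MINUTE_MS);
// Assumption: the alert fires when the rate reaches the 5% threshold.
const wouldFire = windowRate >= 5;
```

Note how the failure from 20 minutes ago no longer counts: a shorter window reacts faster but is computed from fewer jobs.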

Step 3: Add Essential Alerts

For comprehensive monitoring, set up these additional alerts:

Backlog Alert

Detects when jobs are piling up faster than workers can process them:
Name: Queue Backlog High
Type: Backlog Exceeded
Threshold: 1000 jobs (adjust based on your scale)
Resolve Threshold: 500 jobs
Set backlog thresholds based on your normal queue depth. A queue that typically has 10 waiting jobs is different from one with 1000.
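
The gap between Threshold and Resolve Threshold acts as hysteresis. The sketch below shows the idea with the 1000 / 500 values from this example; `AlertState` and `nextState` are hypothetical names, and the exact comparison bullstudio uses (inclusive or exclusive) is an assumption.

```typescript
type AlertState = "ok" | "firing";

// Hysteresis: fire at or above `threshold`, resolve only at or below
// `resolveThreshold`. Values in between keep the previous state.
function nextState(
  current: AlertState,
  value: number,
  threshold: number,
  resolveThreshold: number
): AlertState {
  if (current === "ok") {
    return value >= threshold ? "firing" : "ok";
  }
  return value <= resolveThreshold ? "ok" : "firing";
}

// Waiting-job counts sampled over time (made-up values).
const backlogSamples = [300, 1200, 900, 700, 450, 800];
let backlogState: AlertState = "ok";
const backlogHistory: AlertState[] = [];
for (const waiting of backlogSamples) {
  backlogState = nextState(backlogState, waiting, 1000, 500);
  backlogHistory.push(backlogState);
}
// backlogHistory: ok, firing, firing, firing, ok, ok
```

The readings of 900 and 700 keep the alert firing because the backlog has not yet dropped to 500, and the later reading of 800 does not re-fire it because it is still below 1000.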

Processing Time Alert

Detects performance degradation:
Name: Slow Processing
Type: Processing Time (Avg)
Threshold: 5000ms (adjust based on your baseline)
Time Window: 15 minutes
Resolve Threshold: 2000ms
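
As a rough illustration of what an average processing time over a window means, the sketch below assumes processing time is the difference between a job's `finishedOn` and `processedOn` timestamps (both exist on BullMQ jobs). Whether bullstudio computes the metric exactly this way is an assumption, and `JobTimestamps` and `avgProcessingMs` are hypothetical names.

```typescript
// Minimal stand-in for the two job timestamps used here (epoch milliseconds).
interface JobTimestamps {
  processedOn: number; // when a worker picked the job up
  finishedOn: number;  // when the job completed or failed
}

// Mean processing time (ms) of jobs that finished inside the window,
// or null when no job finished in that period.
function avgProcessingMs(timings: JobTimestamps[], nowMs: number, windowMs: number): number | null {
  const durations = timings
    .filter((t) => t.finishedOn > nowMs - windowMs && t.finishedOn <= nowMs)
    .map((t) => t.finishedOn - t.processedOn);
  if (durations.length === 0) return null;
  return durations.reduce((sum, d) => sum + d, 0) / durations.length;
}

const fifteenMinutesMs = 15 * 60 * 1000;
const evalTime = 10000000;
const windowAvg = avgProcessingMs(
  [
    { processedOn: evalTime - 9000, finishedOn: evalTime - 1000 },     // 8000 ms
    { processedOn: evalTime - 64000, finishedOn: evalTime - 60000 },   // 4000 ms
    { processedOn: evalTime - 126000, finishedOn: evalTime - 120000 }, // 6000 ms
  ],
  evalTime,
  fifteenMinutesMs
);
// windowAvg is 6000, above the 5000ms threshold in this example
```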

Missing Workers Alert

Detects when all workers have stopped:
Name: No Active Workers
Type: Missing Workers
Grace Period: 5 minutes
The Missing Workers alert is especially important for production. It catches scenarios where all workers have crashed.
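
One way to picture the Grace Period is as the time the worker count may stay at zero before the alert fires, so that a brief restart does not page anyone. The sketch below encodes that reading; it is an assumption about the semantics, and `WorkerSample` and `missingWorkersFiring` are hypothetical names.

```typescript
// One observation of how many workers were connected at a point in time.
interface WorkerSample {
  at: number;      // epoch milliseconds
  workers: number; // connected worker count
}

// Returns true if, after the last sample, the worker count has been zero
// for at least `graceMs`. Samples must be in chronological order.
function missingWorkersFiring(samples: WorkerSample[], graceMs: number): boolean {
  let zeroSince: number | null = null;
  let firing = false;
  for (const s of samples) {
    if (s.workers > 0) {
      // Any connected worker resets the clock.
      zeroSince = null;
      firing = false;
    } else {
      if (zeroSince === null) zeroSince = s.at;
      firing = s.at - zeroSince >= graceMs;
    }
  }
  return firing;
}

const MIN_MS = 60 * 1000;
const graceMs = 5 * MIN_MS;

// Workers gone for only 2 minutes: still within the grace period.
const shortOutage = missingWorkersFiring(
  [{ at: 0, workers: 2 }, { at: 1 * MIN_MS, workers: 0 }, { at: 3 * MIN_MS, workers: 0 }],
  graceMs
);
// Workers gone for 5 minutes: the alert fires.
const longOutage = missingWorkersFiring(
  [{ at: 0, workers: 2 }, { at: 1 * MIN_MS, workers: 0 }, { at: 6 * MIN_MS, workers: 0 }],
  graceMs
);
```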

Step 4: Test Your Alerts

Verify alerts are configured correctly:
1. Open alert details

Click on the alert you want to test.

2. Click Test

Click the Test button.

3. Check your email

Verify you received the test notification.

Alert Configuration Tips

Choosing Thresholds

| Scenario | Suggested Approach |
|---|---|
| High-volume queues | Use percentage-based thresholds, shorter time windows |
| Low-volume queues | Use longer time windows to avoid noise |
| Critical queues | Set tighter thresholds, shorter cooldowns |
| Background jobs | More relaxed thresholds, longer cooldowns |

Time Window Guidelines

| Queue Volume | Recommended Window |
|---|---|
| > 1000 jobs/hour | 5-10 minutes |
| 100-1000 jobs/hour | 15-30 minutes |
| < 100 jobs/hour | 30-60 minutes |
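
The reasoning behind these guidelines is sample size: the fewer jobs a window contains, the more a single failure moves the percentage. A small sketch, with hypothetical function names, that encodes the guideline ranges and shows the arithmetic:

```typescript
// Maps throughput to the [min, max] window in minutes from the guidelines.
// Exactly 100 and 1000 jobs/hour fall into the middle range.
function recommendedWindowMinutes(jobsPerHour: number): [number, number] {
  if (jobsPerHour > 1000) return [5, 10];
  if (jobsPerHour >= 100) return [15, 30];
  return [30, 60];
}

// How many percentage points one failed job adds to the failure rate,
// given the expected number of jobs inside the window.
function percentPerFailure(jobsPerHour: number, windowMinutes: number): number {
  const jobsInWindow = (jobsPerHour * windowMinutes) / 60;
  return 100 / jobsInWindow;
}

// At 50 jobs/hour, a 6-minute window holds 5 jobs: one failure reads as 20%.
const noisy = percentPerFailure(50, 6);
// The same queue with a 30-minute window holds 25 jobs: one failure reads as 4%.
const calmer = percentPerFailure(50, 30);
```

With a 5% threshold, the 6-minute window would fire on a single failed job, while the 30-minute window would need at least two.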

Cooldown Strategy

  • Critical alerts: 5-15 minutes
  • Standard alerts: 15-30 minutes
  • Low-priority alerts: 1-4 hours
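
A cooldown limits how often a notification repeats while the alert condition persists. The following is a minimal sketch of that behavior, assuming the cooldown is measured from the last notification sent; `shouldNotify` is a hypothetical helper, not a bullstudio function.

```typescript
// Decides whether a notification may be sent, given when the last one went out.
function shouldNotify(nowMs: number, lastSentMs: number | null, cooldownMs: number): boolean {
  return lastSentMs === null || nowMs - lastSentMs >= cooldownMs;
}

// The alert condition is checked every 5 minutes and stays true throughout.
const cooldownMs = 15 * 60 * 1000;
let lastSent: number | null = null;
const sentAtMinutes: number[] = [];
for (let minute = 0; minute <= 30; minute += 5) {
  const t = minute * 60 * 1000;
  if (shouldNotify(t, lastSent, cooldownMs)) {
    sentAtMinutes.push(minute);
    lastSent = t;
  }
}
// sentAtMinutes: 0, 15, 30 (three emails instead of seven)
```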

Common Alert Configurations

E-commerce Order Processing

Alert 1: Order Failures
- Type: Failure Rate
- Threshold: 2%
- Window: 10 minutes
- Cooldown: 5 minutes

Alert 2: Order Backlog
- Type: Backlog Exceeded
- Threshold: 500 orders
- Cooldown: 15 minutes

Alert 3: Slow Checkout
- Type: Processing Time (P95)
- Threshold: 10000ms
- Window: 10 minutes
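
The P95 type is useful when a few slow jobs matter more than the typical case. The sketch below uses the nearest-rank percentile definition to show how an average can look healthy while P95 does not. Which percentile method bullstudio uses is not stated here, so treat the method as an assumption; `percentile` is a hypothetical helper and the durations are made up.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// 18 typical checkouts at 2000 ms and two slow ones at 30000 ms.
const durationsMs: number[] = [];
for (let i = 0; i < 18; i++) durationsMs.push(2000);
durationsMs.push(30000, 30000);

const meanMs = durationsMs.reduce((sum, d) => sum + d, 0) / durationsMs.length; // 4800
const p95Ms = percentile(durationsMs, 95); // 30000
```

Here the mean of 4800 ms would pass a 5000ms average threshold, while the P95 of 30000 ms clearly exceeds the 10000ms threshold used for Slow Checkout.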

Email Notification System

Alert 1: Email Failures
- Type: Failure Rate
- Threshold: 5%
- Window: 15 minutes

Alert 2: Email Backlog
- Type: Backlog Exceeded
- Threshold: 10000 emails

Alert 3: Missing Senders
- Type: Missing Workers
- Grace Period: 10 minutes

Data Pipeline

Alert 1: Pipeline Failures
- Type: Failure Rate
- Threshold: 1%
- Window: 30 minutes

Alert 2: Slow Processing
- Type: Processing Time (Avg)
- Threshold: 60000ms
- Window: 30 minutes

Responding to Alerts

When you receive an alert:
1. Acknowledge

Note the alert and begin investigation.

2. Check the Dashboard

Open bullstudio and review metrics for the affected queue.

3. Investigate Jobs

Go to the Jobs page, filter by failed jobs, and examine error messages.

4. Take Action

  • Retry jobs if it was a transient issue
  • Fix bugs and deploy
  • Scale workers if backlogged
  • Check external dependencies

5. Document

Record what happened and how it was resolved for future reference.

Avoiding Alert Fatigue

  • Start with fewer alerts and add more as needed
  • Use resolve thresholds to prevent flapping
  • Set appropriate cooldown periods
  • Review and adjust thresholds monthly
  • Disable alerts for non-critical queues during known maintenance

Next Steps