Stop Catching Exceptions in Your Rebus Handlers: Your Safety Net Is a Trapdoor
There's a moment that happens on nearly every team that adopts Rebus. Errors show up in the logs. A few transactions get dropped. An engineer—experienced, conscientious, doing exactly what worked in their last ten projects—adds a try/catch to the handler. The errors in the log go quiet. The dropped transactions continue. Everyone assumes the problem is solved.
It isn't. It got worse.
If your team is skeptical of Rebus's retry approach, or if you've been burned by messages disappearing without explanation, this article is for you. We're going to walk through why letting exceptions bubble up is not an antipattern—it's the framework contract—and show you how to build genuinely resilient handlers without reintroducing the bugs you were trying to fix.
The code examples here apply to Rebus, but as we'll show, the same contract exists in MassTransit and NServiceBus. This isn't a Rebus quirk. It's the distributed messaging model.
First: Understanding the Middleware Pipeline
Before discussing what to do or not do, you need a mental model of how Rebus actually executes your handler. Without this, the "bubble up" guidance sounds arbitrary.
Rebus operates as an onion pipeline. Your handler is the innermost layer—the core. Wrapped around it are several middleware layers that execute before and after your code runs. One of those outer layers is the Retry Strategy.
Here's the execution order when a message arrives:
Transport Layer → Pulls the message from the queue (RabbitMQ / Azure Service Bus)
↓
Retry Middleware → Wraps execution in its own try/catch
↓
[Other Middleware] → Logging, unit of work, serialization, etc.
↓
Your Handler → Your code runs here
↑
Retry Middleware → If no exception: tells transport to ACK (delete) the message
→ If exception: increments retry count, schedules next attempt
This is key: the Retry Middleware is already catching exceptions for you. It knows whether your handler succeeded or failed based on one thing—whether an exception propagated back up to it.
When your handler completes without throwing, the retry middleware signals the transport to acknowledge and delete the message from the queue. Success.
When your handler throws, the retry middleware catches it, keeps the message on the queue, and schedules a retry.
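The whole contract fits in a few lines. Here's a simplified sketch of what a retry step does around your handler; the types and names (IncomingContext, Transport, and so on) are hypothetical illustrations, not Rebus's actual internals:

```csharp
// Simplified sketch of a retry middleware step (hypothetical types, not Rebus source)
public async Task Invoke(IncomingContext context, Func<Task> next)
{
    try
    {
        await next(); // runs the inner middleware layers and, ultimately, your handler

        // No exception made it back here, so the handler "succeeded":
        // tell the transport to ACK and delete the message.
        await context.Transport.Ack(context.Message);
    }
    catch (Exception exception)
    {
        // An exception propagated up: the message is NOT acknowledged,
        // so the broker keeps it and it will be redelivered.
        context.DeliveryCount++;

        if (context.DeliveryCount >= _maxDeliveryAttempts)
        {
            await context.Transport.MoveToErrorQueue(context.Message, exception);
        }
    }
}
```

Notice that the only input to this decision is whether `next()` threw. There is no other channel through which your handler can report failure.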
Now here's the problem.
The Trapdoor: What Happens When You Catch Inside the Handler
When you put a try/catch inside your handler and do not re-throw the exception, your handler returns normally. No exception reaches the retry middleware. As far as the framework is concerned, your handler succeeded.
The retry middleware dutifully signals the broker: delete this message.
The message is gone. Permanently.
// ❌ PATTERN TO AVOID: The "Silent Drop"
// This is the most dangerous pattern in Rebus.
public async Task Handle(OrderMessage message)
{
try
{
await _orderRepository.SaveAsync(message);
}
catch (Exception ex)
{
// You log it. You feel safe. The message is already deleted.
// There is no retry. There is no recovery. The data is gone.
_logger.LogError(ex, "Failed to save order {OrderId}", message.OrderId);
}
}
The log shows an error. The queue shows a success. The database has nothing. The engineer who wrote this code sees "ERROR" in Splunk and thinks: "Good thing I caught that." Meanwhile, the order never processed, and no one knows how to recover it.
This is what we mean when we call it a trapdoor. It looks like a floor. It isn't.
Why Engineers Reach for try/catch (And Why Those Reasons Don't Apply Here)
There are three legitimate instincts driving this pattern. None of them apply in a messaging context.
"I need to log the error before it disappears."
In a console application or a one-shot script, this is true. If you don't log it, it's gone. In Rebus, the framework logs failures for you, including the full stack trace, the message headers, the delivery count, and the source queue. If you're using Fleet Manager or a similar tool, failed messages are queryable and replayable. Your manual log is redundant at best, misleading at worst.
"I need to handle specific exceptions differently."
This is a valid need. We'll cover how to handle it safely in the patterns section. Hint: you can still catch—you just need to throw afterward.
"I need to prevent the handler from crashing."
Rebus handlers don't "crash" the application on unhandled exceptions. The exception is caught by the retry middleware, the message is retained, and the worker moves on to the next message. Letting the exception bubble up is exactly what keeps the application running and preserves the data.
"But Rebus Is the Only Framework That Does This"
It isn't. Every major .NET service bus uses the same contract: your handler must signal failure through an unhandled exception. Swallowing exceptions breaks them all.
| Framework | How Retries Are Triggered | What Happens if You Swallow the Exception |
|---|---|---|
| Rebus | Unhandled exception bubbles to SimpleRetryStrategy middleware | Message is ACKed and deleted |
| MassTransit | Unhandled exception bubbles to UseMessageRetry middleware | Message is ACKed and deleted |
| NServiceBus | Unhandled exception bubbles to the Recoverability pipeline | Message is ACKed and deleted |
The "bubble up" contract isn't a Rebus design choice you can debate. It's the fundamental model for how all these frameworks deliver durability guarantees. The framework cannot protect a message it doesn't know is in trouble.
If your team has experience with MassTransit or NServiceBus, point to this table. The critique of Rebus is actually a critique of the entire category.
Patterns to Avoid
❌ The Silent Drop
Already shown above. The most dangerous pattern. Logs the error, deletes the message, no recovery possible.
// ❌ AVOID
public async Task Handle(OrderMessage message)
{
try { await _repository.SaveAsync(message); }
catch (Exception ex)
{
_logger.LogError(ex, "Error"); // Message silently deleted after this
}
}
❌ The Manual Retry Loop
This pattern tries to reimplement retries inside the handler. It ties up a Rebus worker for the full duration of the loop, delaying every other message behind it, and if all attempts fail the handler still returns normally, so the message is deleted with no recovery.
// ❌ AVOID
public async Task Handle(OrderMessage message)
{
for (int i = 0; i < 5; i++)
{
try
{
await _repository.SaveAsync(message);
return; // "Success," message gets deleted
}
catch (Exception ex)
{
if (i == 4) _logger.LogError(ex, "Gave up"); // Message deleted, no recovery
await Task.Delay(1000);
}
}
}
❌ The Manual Dead Letter Table
Some engineers respond to dropped messages by building their own "bad message" table or queue. This creates a parallel, non-standard observability channel that doesn't integrate with any tooling and is quickly forgotten.
// ❌ AVOID
public async Task Handle(OrderMessage message)
{
try { await _repository.SaveAsync(message); }
catch (Exception ex)
{
// Builds a homemade dead-letter mechanism that no one monitors
await _deadLetterTable.InsertAsync(message, ex.Message);
}
}
❌ The Stack Trace Destroyer
If your team insists on catching and re-throwing, make sure they know the difference between throw; and throw ex;. Using throw ex; resets the stack trace origin to the throw statement itself, wiping out the line number where the actual failure occurred.
// ❌ AVOID: Destroys the original stack trace
catch (Exception ex)
{
_logger.LogError(ex, "Caught something");
throw ex; // Stack trace now points to THIS line, not where it actually failed
}
Patterns to Embrace
✅ Let It Fail (The Default)
The simplest, cleanest pattern. No try/catch needed. If _repository.SaveAsync throws, Rebus catches it, retains the message, and retries automatically. The full stack trace, message payload, and retry count are captured by the framework.
// ✅ PREFER: Clean handler, full retry protection
public async Task Handle(OrderMessage message)
{
await _repository.SaveAsync(message);
}
✅ Logging Scopes for Context (Without the Catch)
The most common reason engineers want a try/catch is to add business context to the log—"I want to know which OrderId failed." You can do this without catching anything using ILogger.BeginScope. The scope attaches properties to every log entry generated within the block, including the ones Rebus generates automatically on failure.
// ✅ PREFER: Context without swallowing the exception
public async Task Handle(OrderMessage message)
{
using (_logger.BeginScope(new Dictionary<string, object>
{
["OrderId"] = message.OrderId,
["CustomerId"] = message.CustomerId
}))
{
// If this fails, the Rebus error log automatically includes
// OrderId and CustomerId in its metadata. No catch needed.
await _repository.SaveAsync(message);
await _eventBus.PublishAsync(new OrderSavedEvent(message.OrderId));
}
}
If you're using Serilog, LogContext.PushProperty accomplishes the same thing and may feel more natural:
// ✅ PREFER (Serilog variant)
public async Task Handle(OrderMessage message)
{
using (LogContext.PushProperty("OrderId", message.OrderId))
using (LogContext.PushProperty("CustomerId", message.CustomerId))
{
await _repository.SaveAsync(message);
}
}
✅ Catch-and-Rethrow for Wrapping Context
If you need to add context to the exception itself—not just the log—you can catch, wrap, and rethrow. Use throw; (not throw ex;) to preserve the original stack trace.
// ✅ ACCEPTABLE: Wrap with context, preserve stack trace
public async Task Handle(OrderMessage message)
{
try
{
await _repository.SaveAsync(message);
}
catch (Exception ex)
{
// Adds domain context to the exception without destroying the trace
throw new OrderProcessingException(message.OrderId, "Failed to persist order", ex);
}
}
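OrderProcessingException is a domain exception you would define yourself; a minimal version (assuming OrderId is a string, purely for illustration) might look like:

```csharp
// Hypothetical domain exception. Passing the original exception as the
// inner exception is what preserves the full failure chain in the logs.
public class OrderProcessingException : Exception
{
    public string OrderId { get; }

    public OrderProcessingException(string orderId, string message, Exception inner)
        : base($"{message} (order {orderId})", inner)
    {
        OrderId = orderId;
    }
}
```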
✅ Exception Filtering for Specific Types
When you genuinely need different behavior for different exception types—such as skipping retries for validation errors—filter by type and let everything else bubble up untouched.
// ✅ ACCEPTABLE: Type-specific handling with explicit re-throw for unknowns
public async Task Handle(OrderMessage message)
{
try
{
await _orderService.ProcessAsync(message);
}
catch (ValidationException ex)
{
// Validation errors are not retriable. Log and discard intentionally.
_logger.LogWarning(ex, "Order {OrderId} failed validation, discarding", message.OrderId);
// No throw—this is an explicit, intentional discard for a known bad message
}
catch (Exception)
{
throw; // Everything else: let Rebus handle it
}
}
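A related C# trick worth knowing: exception filters (the `when` clause) run before the stack unwinds, so you can log at the point of failure without ever catching the exception. The helper method below is illustrative:

```csharp
// ✅ ACCEPTABLE: log via an exception filter without ever catching
public async Task Handle(OrderMessage message)
{
    try
    {
        await _orderService.ProcessAsync(message);
    }
    catch (Exception ex) when (LogAndContinueSearch(ex, message))
    {
        // Never reached: the filter always returns false, so nothing is caught
        // and the exception keeps bubbling up to Rebus untouched.
    }
}

private bool LogAndContinueSearch(Exception ex, OrderMessage message)
{
    _logger.LogError(ex, "Order {OrderId} failed", message.OrderId);
    return false; // false means "don't catch": the search for a handler continues
}
```

Because nothing is ever caught, the stack trace and the retry contract are both left fully intact.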
Advanced Recovery: Configuring the Safety Net
Now that your handlers are correctly signaling failures, you can configure Rebus to respond to those failures intelligently.
Exponential Backoff
The default retry behavior fires all attempts as fast as possible. For transient issues like a momentary database blip, this is often fine. But for scenarios where you expect the failure to resolve in seconds, exponential backoff gives the infrastructure time to recover without flooding the queue with attempts.
Configure.With(activator)
.Options(o =>
{
o.SimpleRetryStrategy(
maxDeliveryAttempts: 5,
secondLevelRetriesEnabled: false,
errorDetailsHeaderMaxLength: 10000
);
        // Caveat: SetBackoffTimes controls how long idle workers wait between
        // polls of an empty queue, not the spacing of redeliveries. For true
        // delays between retry attempts, use second-level retries (next
        // section), where the failed message is deferred with a delay.
o.SetBackoffTimes(
TimeSpan.FromSeconds(1),
TimeSpan.FromSeconds(2),
TimeSpan.FromSeconds(5),
TimeSpan.FromSeconds(10),
TimeSpan.FromSeconds(30)
);
})
.Start();
When to use backoff: When failures are likely transient—network timeouts, brief database unavailability, rate limiting from external APIs. Backoff gives the dependency time to recover between attempts without overwhelming it.
When not to use backoff: When the error is deterministic—bad data, a schema mismatch, a missing record. Backing off doesn't help if the tenth attempt has the same data as the first.
Second-Level Retries
Second-level retries (SLR) address the scenario that most engineers are actually afraid of: "What if the database is down for five minutes? I don't want to lose those messages."
With SLR enabled, a message that exhausts its immediate retries doesn't go directly to the error queue. Instead, Rebus dispatches it a second time, wrapped as IFailed&lt;TMessage&gt;, giving you a chance to defer it and replay it after a configurable delay. This is the framework-native solution to "database is down" scenarios, not manual retry loops inside your handler.
Configure.With(activator)
.Options(o =>
{
o.SimpleRetryStrategy(
maxDeliveryAttempts: 5, // Immediate retries before promoting to SLR
secondLevelRetriesEnabled: true
);
})
.Transport(t => /* your transport config */)
.Timeouts(t => t.StoreInMemory()) // Or use a persistent store for production
.Start();
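Once SLR is enabled, the second-level delivery arrives wrapped in IFailed&lt;TMessage&gt;, and a handler for that wrapper decides what happens next. Deferring the original transport message is the typical move; the five-minute delay below is an arbitrary choice for illustration:

```csharp
// Handles the second-level delivery after immediate retries are exhausted
public class OrderMessageFailedHandler : IHandleMessages<IFailed<OrderMessage>>
{
    readonly IBus _bus;

    public OrderMessageFailedHandler(IBus bus) => _bus = bus;

    public async Task Handle(IFailed<OrderMessage> failedMessage)
    {
        // Replay the original transport message after a delay.
        // If it keeps failing on replay, it eventually lands in the error queue.
        await _bus.Advanced.TransportMessage.Defer(TimeSpan.FromMinutes(5));
    }
}
```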
When to use SLR: Infrastructure outages with a known recovery window—database restarts, service deployments, dependency downtime measured in minutes. SLR is the right tool when immediate retries fail due to environmental conditions, not message-level problems.
When not to use SLR: When your immediate retries are already handling transient blips adequately. SLR adds complexity and requires a timeout store. Use it when you need recovery windows that span minutes or hours, not seconds.
Circuit Breaker
The circuit breaker pattern addresses a different problem: "What if the database is down for an hour? I don't want a thousand messages failing and clogging the error queue."
A circuit breaker monitors the failure rate and, when a threshold is exceeded, stops pulling new messages entirely—preventing a flood of failures during a known outage.
Rebus supports this via the Rebus.CircuitBreaker package:
Configure.With(activator)
    .Options(o =>
    {
        o.EnableCircuitBreaker(c =>
        {
            // Open the circuit if we see 10 SqlExceptions within 60 seconds;
            // after 30 seconds in the open state, let a single message through
            // to probe whether the dependency has recovered (half-open state)
            c.OpenOn<SqlException>(
                attempts: 10,
                trackingPeriodInSeconds: 60,
                halfOpenPeriodInSeconds: 30
            );

            // Also trip on general infrastructure timeouts
            c.OpenOn<TimeoutException>(
                attempts: 5,
                trackingPeriodInSeconds: 30,
                halfOpenPeriodInSeconds: 30
            );
        });
    })
    .Start();
When to use a circuit breaker: Extended infrastructure outages where continued processing would only generate noise and fill error queues. Also valuable when your downstream systems have rate limits or cost implications for failed calls.
Important: If you implement a Polly circuit breaker inside your handler (instead of at the Rebus configuration level), the BrokenCircuitException must still bubble up to Rebus. Catching and swallowing BrokenCircuitException is the same trapdoor problem—Rebus sees success and deletes the message.
// ✅ Polly circuit breaker inside handler - exception must still propagate
public async Task Handle(OrderMessage message)
{
// If the circuit is open, Polly throws BrokenCircuitException.
// Let it bubble up. Rebus will retain and retry the message.
await _circuitBreaker.ExecuteAsync(() => _externalService.CallAsync(message));
}
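For reference, the _circuitBreaker field used above could be a Polly policy along these lines (Polly v7 syntax; the exception types and thresholds are illustrative, not a recommendation):

```csharp
using Polly;
using Polly.CircuitBreaker;

// Break after 5 consecutive failures; stay open for 30 seconds before half-open
private static readonly AsyncCircuitBreakerPolicy _circuitBreaker =
    Policy
        .Handle<HttpRequestException>()
        .Or<TimeoutException>()
        .CircuitBreakerAsync(
            exceptionsAllowedBeforeBreaking: 5,
            durationOfBreak: TimeSpan.FromSeconds(30));
```

While the circuit is open, ExecuteAsync throws BrokenCircuitException immediately, which is exactly the exception you must let bubble up to Rebus.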
Putting It Together: A Decision Guide
| Scenario | Tool |
|---|---|
| Transient blips (sub-second) | Immediate retries (default) |
| Transient failures lasting seconds | Exponential backoff |
| Infrastructure outages lasting minutes | Second-level retries (SLR) |
| Extended outages, flood prevention | Circuit breaker |
| Non-retriable bad data | Intentional discard (explicit, documented) |
The Observability Payoff
When exceptions bubble up correctly, the framework produces something far more valuable than a log entry—it produces a replayable artifact.
Each failed message in the error queue carries metadata that no manual log can match:
| Header | Contents |
|---|---|
| rbs2-exception | Full stack trace |
| rbs2-error-details | Reason the last attempt failed |
| rbs2-source-queue | Origin queue for tracing |
| rbs2-delivery-count | Number of attempts made |
If you're using Rebus Fleet Manager, this metadata powers a UI where you can search failed messages by exception type, read the formatted stack trace, and replay messages directly to the source queue—after you've fixed the root cause.
A swallowed exception produces a log line. An unhandled exception produces a recoverable message. Only one of those lets you fix the problem and recover the data.
Summary
The try/catch instinct is deeply ingrained, and it's the right instinct in most programming contexts. Distributed messaging is the exception. In Rebus—and in MassTransit and NServiceBus—the exception is the signal. It's the mechanism by which the framework knows your handler needs help.
Catching and swallowing that signal doesn't protect your data. It destroys it while giving you the appearance of safety.
The safer path is counterintuitive until you internalize the middleware model:
- Don't catch to prevent errors. The framework catches them for you.
- Don't catch to add log context. Use logging scopes instead.
- Do catch when you need to filter exception types or wrap with domain context—but always re-throw.
- Configure backoff for transient failures.
- Configure SLR for infrastructure outages.
- Configure circuit breakers to prevent flood scenarios.
Once the exceptions are flowing correctly, the retry infrastructure becomes your most powerful reliability tool—and every failed message becomes something you can monitor, investigate, and recover without writing a line of custom recovery code.