Error Handling Chapter 03
How the 1Engage codebase handles, wraps, classifies, and propagates errors across layers.
On This Page
- 3.1 The error Interface in Go
- 3.2 Custom HTTP Error Type
- 3.3 Error Wrapping with %w
- 3.4 errors.As — Type-Based Error Handling
- 3.5 errors.Is — Sentinel Error Checking
- 3.6 Sentinel Errors with errors.New
- 3.7 PermanentError — Retry Classification
- 3.8 Meta API Error Classification
- 3.9 Validation Error Handling
- 3.10 Error Flow Diagram
3.1 The error Interface in Go
What
Go does not have exceptions, try/catch, or stack-unwinding. Instead, errors are ordinary values that implement a single-method interface:
type error interface {
Error() string
}
Why
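By convention, every function that can fail returns an error as its last return value, and the caller is expected to check it immediately. In a multi-value assignment the compiler rejects an err variable that is declared but never used, so discarding it requires an explicit _. A call whose results are dropped entirely (for example, a bare f.Close()) still compiles, which is why linters such as errcheck are commonly used to catch that case. Either way, error paths stay visible and explicit in every function.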
How
The standard pattern you see everywhere in the codebase:
result, err := doSomething()
if err != nil {
return fmt.Errorf("context about what failed: %w", err)
}
// use result
Any type that has an Error() string method satisfies the interface. The 1Engage codebase defines several custom error types that implement this interface to carry structured information (HTTP status codes, field-level validation details, retry classification).
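As a minimal sketch of a type satisfying the interface, the program below defines a hypothetical QuotaError (it is not part of the 1Engage codebase, and sendMessage is likewise invented for illustration). It shows a struct carrying structured fields while still being returned as a plain error.

```go
package main

import "fmt"

// QuotaError is a hypothetical custom error type used only for illustration.
type QuotaError struct {
	Tenant string
	Limit  int
}

// Error makes *QuotaError satisfy the built-in error interface.
func (e *QuotaError) Error() string {
	return fmt.Sprintf("tenant %s exceeded quota of %d messages", e.Tenant, e.Limit)
}

// sendMessage is a stand-in for any operation that can fail.
func sendMessage(tenant string, sent, limit int) error {
	if sent >= limit {
		return &QuotaError{Tenant: tenant, Limit: limit}
	}
	return nil
}

func main() {
	// The standard check: handle the error before using any result.
	if err := sendMessage("acme", 100, 100); err != nil {
		fmt.Println("send failed:", err)
	}
}
```

The caller only sees the error interface; the structured fields remain available to code that inspects the concrete type.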
3.2 Custom HTTP Error Type
What
A structured error type in pkg/shared/http/error.go that carries an HTTP status code, a machine-readable code, a human message, and optional field-level details.
// pkg/shared/http/error.go
type ErrorDetail struct {
Field string `json:"field,omitempty"`
Message string `json:"message"`
}
type Error struct {
Status int `json:"-"` // HTTP status (not serialized)
Code string `json:"code"` // Machine-readable: "BAD_REQUEST"
Message string `json:"message"` // Human-readable
Details []ErrorDetail `json:"details,omitempty"` // Per-field errors
}
func (e *Error) Error() string {
return e.Message
}
The Status field uses json:"-" so it is omitted from the JSON body; consumers see it only as the HTTP response status. The Error() method makes *Error satisfy Go's error interface, so it can be returned as a regular error value.
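To make the effect of the struct tags concrete, here is a small self-contained sketch that copies the two types from above and marshals one value. The encode helper is hypothetical and exists only for this demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ErrorDetail struct {
	Field   string `json:"field,omitempty"`
	Message string `json:"message"`
}

type Error struct {
	Status  int           `json:"-"`
	Code    string        `json:"code"`
	Message string        `json:"message"`
	Details []ErrorDetail `json:"details,omitempty"`
}

func (e *Error) Error() string { return e.Message }

// encode is a hypothetical helper that returns the JSON body for an error.
func encode(e *Error) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	e := &Error{Status: 404, Code: "NOT_FOUND", Message: "user not found"}
	// Status is skipped because of json:"-"; Details is skipped because it is empty.
	fmt.Println(encode(e)) // {"code":"NOT_FOUND","message":"user not found"}
}
```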
Factory Functions
Convenience constructors for common HTTP error responses:
func BadRequest(msg string) error {
return &Error{Status: 400, Code: "BAD_REQUEST", Message: msg}
}
func Unauthorized(msg string) error {
return &Error{Status: 401, Code: "UNAUTHORIZED", Message: msg}
}
func Forbidden(msg string) error {
return &Error{Status: 403, Code: "FORBIDDEN", Message: msg}
}
func NotFound(msg string) error {
return &Error{Status: 404, Code: "NOT_FOUND", Message: msg}
}
func Internal(err error) error {
return &Error{Status: 500, Code: "INTERNAL_ERROR", Message: "internal server error"}
}
func Conflic(msg string) error {
return &Error{Status: 409, Code: "CONFLICT", Message: msg}
}
Internal() accepts an error parameter (intended for logging; the excerpt above does not show a logging call) but always returns a generic message. This prevents leaking internal details to API consumers. Conflic (missing the final 't') is the actual function name in the codebase.
Why This Design
Service-layer code returns errors like httpx.NotFound("user not found"). The handler layer calls HandleError(w, err) which uses errors.As to extract the status code and serialize the response. This achieves separation of concerns—services decide what went wrong, the HTTP layer decides how to respond.
// In a service method:
func (s *UserService) GetByID(id string) (*User, error) {
user, err := s.repo.FindByID(id)
if err != nil {
return nil, httpx.NotFound("user not found") // returns error with Status=404
}
return user, nil
}
// In the handler — the handler never decides status codes:
func (h *UserHandler) Get(w http.ResponseWriter, r *http.Request) {
user, err := h.service.GetByID(id)
if err != nil {
httpx.HandleError(w, err) // automatically sends 404
return
}
httpx.SendSuccess(w, r, user)
}
3.3 Error Wrapping with %w
What
Go's fmt.Errorf with the %w verb wraps an existing error inside a new error, adding contextual information while preserving the original error for later inspection.
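A minimal sketch of what wrapping preserves: the program below builds the same two-layer message once with %w and once with %v, then asks errors.Is whether the root cause is still reachable. All function names and the errUnsupported value are hypothetical stand-ins, not code from the Kafka producer.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is a hypothetical root cause.
var errUnsupported = errors.New("json: unsupported type")

func marshalEvent() error {
	return fmt.Errorf("failed to marshal event: %w", errUnsupported)
}

// publishBatch wraps with %w, so the chain stays inspectable.
func publishBatch() error {
	if err := marshalEvent(); err != nil {
		return fmt.Errorf("failed to publish batch: %w", err)
	}
	return nil
}

// publishBatchFlattened uses %v, which copies the text but drops the chain.
func publishBatchFlattened() error {
	if err := marshalEvent(); err != nil {
		return fmt.Errorf("failed to publish batch: %v", err)
	}
	return nil
}

// chainIntact reports whether the root cause can still be found.
func chainIntact(err error) bool { return errors.Is(err, errUnsupported) }

func main() {
	w, v := publishBatch(), publishBatchFlattened()
	fmt.Println(w.Error() == v.Error())         // true: identical text
	fmt.Println(chainIntact(w), chainIntact(v)) // true false
}
```

The two errors print identically, which is why a %v slip is easy to miss in review; only the inspection result differs.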
Real Examples
// pkg/eventbus/kafka/producer.go
return fmt.Errorf("failed to marshal event: %w", err)
return fmt.Errorf("failed to publish event: %w", err)
return fmt.Errorf("failed to marshal event %s: %w", event.ID, err)
return fmt.Errorf("failed to publish batch: %w", err)
// pkg/auth/jwt.go
return TenantContext{}, fmt.Errorf("token validation failed: %w", err)
// pkg/eventbus/kafka/client.go
errs = append(errs, fmt.Errorf("producer close error: %w", err))
errs = append(errs, fmt.Errorf("consumer close error: %w", err))
return fmt.Errorf("failed to connect to Kafka: %w", err)
return fmt.Errorf("failed to get controller: %w", err)
// pkg/shared/helpers/uploadfile.go
return nil, fmt.Errorf("multiple failed upload index[%d] %s: %w", i, d.Name, err)
return fmt.Errorf("failed to delete file %s: %w", filepath, err)
Why
- Preserves the error chain: The original error is accessible via errors.Unwrap(), enabling errors.Is() and errors.As() to inspect wrapped errors at any depth.
- The prefix tells WHERE: The string before %w indicates the operation that failed ("failed to marshal event"), creating a stack-trace-like chain without actual stack traces.
- Contrast with %v: Using %v instead of %w would create a new error with the original's text but break the chain, so errors.Is and errors.As could no longer match it.
Each layer adds its context prefix, so a fully wrapped message reads:
"failed to publish batch: failed to marshal event: json: unsupported type"
3.4 errors.As — Type-Based Error Handling
What
errors.As walks the error chain (unwrapping at each level) and checks if any error in the chain matches the target type. If found, it assigns the matched error to the target variable.
HTTP Error Handling
From pkg/shared/http/error.go — the central error response function:
// pkg/shared/http/error.go
func HandleError(w http.ResponseWriter, err error) {
var httpErr *Error
if errors.As(err, &httpErr) {
// Found an *httpx.Error in the chain — use its status code
w.WriteHeader(httpErr.Status)
json.NewEncoder(w).Encode(httpErr)
return
}
// Unknown error type — default to 500
w.WriteHeader(http.StatusInternalServerError)
json.NewEncoder(w).Encode(&Error{
Code: "INTERNAL_ERROR",
Message: "internal server error",
})
}
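The behaviour of HandleError can be exercised without a running server. The sketch below copies a trimmed version of the Error type and HandleError into one file and drives it with net/http/httptest; the statusFor helper is hypothetical and exists only for this demonstration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type Error struct {
	Status  int    `json:"-"`
	Code    string `json:"code"`
	Message string `json:"message"`
}

func (e *Error) Error() string { return e.Message }

func NotFound(msg string) error {
	return &Error{Status: http.StatusNotFound, Code: "NOT_FOUND", Message: msg}
}

// HandleError mirrors the function shown above.
func HandleError(w http.ResponseWriter, err error) {
	var httpErr *Error
	if errors.As(err, &httpErr) {
		w.WriteHeader(httpErr.Status)
		json.NewEncoder(w).Encode(httpErr)
		return
	}
	w.WriteHeader(http.StatusInternalServerError)
	json.NewEncoder(w).Encode(&Error{Code: "INTERNAL_ERROR", Message: "internal server error"})
}

// statusFor is a hypothetical helper: it runs HandleError against an
// in-memory recorder and returns the status code that was written.
func statusFor(err error) int {
	rec := httptest.NewRecorder()
	HandleError(rec, err)
	return rec.Code
}

func main() {
	// An *Error wrapped one level deep is still found by errors.As.
	wrapped := fmt.Errorf("loading profile: %w", NotFound("user not found"))
	fmt.Println(statusFor(wrapped)) // 404

	// Any other error falls through to the generic 500 branch.
	fmt.Println(statusFor(errors.New("connection reset"))) // 500
}
```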
PostgreSQL Error Handling
From auth-service/internal/repository/user_repo.go — extracting database-specific error codes:
// auth-service/internal/repository/user_repo.go
func (r *UserRepository) Create(user *model.User) (*model.User, error) {
err := r.db.Transaction(func(tx *gorm.DB) error {
if err := tx.Create(&user).Error; err != nil {
var pgErr *pgconn.PgError
if errors.As(err, &pgErr) {
if pgErr.Code == "23503" { // caution: in PostgreSQL, 23503 is foreign_key_violation; unique_violation is 23505
return httpx.Conflic("email or phone already exists")
}
}
return httpx.BadRequest(err.Error())
}
return nil
})
// ...
}
Why errors.As Over Type Assertion
| Approach | Code | Works Through Wrapping? |
|---|---|---|
| Type assertion | err.(*pgconn.PgError) | No, fails if the error was wrapped with %w |
| errors.As | errors.As(err, &pgErr) | Yes, unwraps through any number of layers |
Because errors are frequently wrapped with fmt.Errorf("...: %w", err), the outer error is a different type. errors.As peels through each wrapping layer to find the target type.
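The difference in the table can be reproduced with a few lines. This sketch uses a hypothetical dbError type in place of *pgconn.PgError so that it runs with the standard library alone; the function names are also invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// dbError is a hypothetical stand-in for a driver error such as *pgconn.PgError.
type dbError struct{ Code string }

func (e *dbError) Error() string { return "db error " + e.Code }

// insertUser returns a driver error wrapped with context.
func insertUser() error {
	return fmt.Errorf("insert user: %w", &dbError{Code: "23505"})
}

// codeByAssertion only inspects the outermost error.
func codeByAssertion(err error) (string, bool) {
	if e, ok := err.(*dbError); ok {
		return e.Code, true
	}
	return "", false
}

// codeByAs walks the whole chain.
func codeByAs(err error) (string, bool) {
	var e *dbError
	if errors.As(err, &e) {
		return e.Code, true
	}
	return "", false
}

func main() {
	err := insertUser()
	_, okAssert := codeByAssertion(err)
	code, okAs := codeByAs(err)
	fmt.Println(okAssert, okAs, code) // false true 23505
}
```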
3.5 errors.Is — Sentinel Error Checking
What
errors.Is checks if any error in the chain matches a specific sentinel value. Like errors.As, it walks through wrapped errors.
GORM Record Not Found
This is the most common pattern in the codebase, appearing in nearly every repository:
// auth-service/internal/repository/user_repo.go
if errors.Is(err, gorm.ErrRecordNotFound) {
return nil, httpx.NotFound("user not found")
}
// auth-service/internal/repository/role_repo.go
if errors.Is(err, gorm.ErrRecordNotFound) {
return nil, httpx.NotFound("role not found")
}
// auth-service/internal/repository/auth_repo.go
if errors.Is(err, gorm.ErrRecordNotFound) {
return nil, httpx.NotFound("tenant not found")
}
// broadcast-service/internal/repository/broadcast_repo.go
if errors.Is(err, gorm.ErrRecordNotFound) {
return nil, httpx.NotFound("broadcast not found")
}
// admin-service/internal/service/tenant_service.go
if errors.Is(err, gorm.ErrRecordNotFound) {
return nil, httpx.NotFound("tenant not found")
}
Each call site translates gorm.ErrRecordNotFound into an httpx.NotFound error with a domain-specific message. This ensures a consistent 404 response regardless of which entity was missing.
Context Cancellation
From the Kafka consumer — distinguishing graceful shutdown from real errors:
// pkg/eventbus/kafka/consumer.go
// In the consume loop — a cancelled context is normal shutdown, not an error
if errors.Is(err, context.Canceled) {
return err
}
// When starting goroutines — don't propagate cancellation as a failure
go func(s *subscription) {
defer c.wg.Done()
err := c.consumeLoop(ctx, s)
if err != nil && !errors.Is(err, context.Canceled) {
errCh <- err // only real errors go to the error channel
}
}(sub)
Why errors.Is Over ==
| Approach | Code | Works Through Wrapping? |
|---|---|---|
| Direct comparison | err == gorm.ErrRecordNotFound | No, fails if the error was wrapped |
| errors.Is | errors.Is(err, gorm.ErrRecordNotFound) | Yes, unwraps through all layers |
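The same comparison applies to the context-cancellation check. In this sketch, consumeLoop is a hypothetical, heavily simplified loop that wraps ctx.Err() before returning it, which is enough to make a direct equality test miss the shutdown signal.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// consumeLoop is a hypothetical loop that blocks until ctx is cancelled
// and then returns the cancellation cause with added context.
func consumeLoop(ctx context.Context) error {
	<-ctx.Done()
	return fmt.Errorf("consume loop stopped: %w", ctx.Err())
}

// isShutdownByEquality compares only the outermost error value.
func isShutdownByEquality(err error) bool { return err == context.Canceled }

// isShutdown walks the chain, so wrapping does not hide the sentinel.
func isShutdown(err error) bool { return errors.Is(err, context.Canceled) }

// runCancelled runs the loop with an already-cancelled context.
func runCancelled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return consumeLoop(ctx)
}

func main() {
	err := runCancelled()
	fmt.Println(isShutdownByEquality(err), isShutdown(err)) // false true
}
```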
3.6 Sentinel Errors with errors.New
What
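Simple, standalone error values created with errors.New. These represent programming mistakes or API misuse, not runtime failures from external systems. Note that the examples below create the error inline at each return site, so callers cannot match them with errors.Is; a sentinel in the strict sense is a package-level variable (like gorm.ErrRecordNotFound) that callers compare against.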
// pkg/eventbus/kafka/consumer.go
func (c *Consumer) Subscribe(topic, groupID string, handler eventbus.Handler) error {
c.mu.Lock()
defer c.mu.Unlock()
if c.running {
return errors.New("cannot subscribe while consumer is running")
}
if groupID == "" {
return errors.New("group ID is required for subscription")
}
// ...
}
func (c *Consumer) Run(ctx context.Context) error {
c.mu.Lock()
if c.running {
c.mu.Unlock()
return errors.New("consumer is already running")
}
if len(c.subscriptions) == 0 {
c.mu.Unlock()
return errors.New("no subscriptions registered")
}
// ...
}
Why These Aren't Wrapped
- No originating error: These errors don't come from a failed operation—they're state validation checks. There's nothing to wrap.
- Programming errors: Calling Subscribe while the consumer is running is a bug in the calling code, not a runtime failure. The message is enough.
- Guard clauses: They protect invariants: no consumer should run without subscriptions, and no subscription should happen while running.
errors.New vs fmt.Errorf:
Use errors.New for errors that originate here (no cause). Use fmt.Errorf("...: %w", err) when wrapping an error from a called function.
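One consequence of where errors.New is called is worth seeing in code. The sketch below contrasts an error created inline with one exposed as a package-level variable; ErrAlreadyRunning and both run functions are hypothetical and not taken from the consumer package.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAlreadyRunning is a hypothetical package-level sentinel.
var ErrAlreadyRunning = errors.New("consumer is already running")

// runInline creates a fresh error value on every call.
func runInline() error { return errors.New("consumer is already running") }

// runSentinel returns the shared value, wrapped with context.
func runSentinel() error { return fmt.Errorf("run: %w", ErrAlreadyRunning) }

// matchesSentinel is how a caller would test for the condition.
func matchesSentinel(err error) bool { return errors.Is(err, ErrAlreadyRunning) }

func main() {
	// Same text, different identity: errors.Is compares values, not strings.
	fmt.Println(matchesSentinel(runInline()))   // false
	fmt.Println(matchesSentinel(runSentinel())) // true
}
```

Inline creation is reasonable when no caller needs to branch on the condition, which is the case for the guard clauses above.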
3.7 PermanentError — Retry Classification
What
A custom error type that signals "do not retry this operation." The codebase has two implementations of this concept for different contexts.
Implementation 1: Event Bus (pkg/eventbus/retry.go)
Used in the generic retry handler for event processing:
// pkg/eventbus/retry.go
type PermanentError struct {
Err error
}
func (e *PermanentError) Error() string {
return e.Err.Error()
}
func (e *PermanentError) Unwrap() error {
return e.Err // enables errors.Is/As through the chain
}
// Pointer receiver → checked with *PermanentError
func IsPermanentError(err error) bool {
_, ok := err.(*PermanentError)
return ok
}
Implementation 2: Helpers (pkg/shared/helpers/permanent_error.go)
Used for classifying external API errors (Meta/WhatsApp):
// pkg/shared/helpers/permanent_error.go
type PermanentError struct {
Err error
}
func (e PermanentError) Error() string { // VALUE receiver — not pointer
return e.Err.Error()
}
func NewPermanentError(err error) error {
return PermanentError{Err: err} // returns value, not pointer
}
// Value type assertion → checked with PermanentError (no pointer)
func IsPermanent(err error) bool {
_, ok := err.(PermanentError)
return ok
}
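The event bus version uses a pointer receiver (*PermanentError) and implements Unwrap(). The helpers version uses a value receiver (PermanentError) and does not implement Unwrap(). This means they are checked differently: err.(*PermanentError) vs err.(PermanentError). Both checks are plain type assertions, so neither detects a PermanentError that has been wrapped with fmt.Errorf("...: %w", err); the error must be returned unwrapped for the check to succeed.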
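The sketch below shows what happens when a permanent error picks up a context prefix on its way to the consumer. PermanentError, NewPermanentError, and IsPermanent mirror the helpers version; isPermanentAs is a hypothetical alternative based on errors.As and is not something the codebase is known to contain.

```go
package main

import (
	"errors"
	"fmt"
)

type PermanentError struct{ Err error }

func (e PermanentError) Error() string { return e.Err.Error() }

func NewPermanentError(err error) error { return PermanentError{Err: err} }

// IsPermanent mirrors the helpers check: a plain type assertion.
func IsPermanent(err error) bool {
	_, ok := err.(PermanentError)
	return ok
}

// isPermanentAs is a hypothetical variant that walks the error chain.
func isPermanentAs(err error) bool {
	var pe PermanentError
	return errors.As(err, &pe)
}

// directPermanent returns the permanent error as-is.
func directPermanent() error {
	return NewPermanentError(errors.New("invalid template ID"))
}

// wrappedPermanent adds a context prefix with %w.
func wrappedPermanent() error {
	return fmt.Errorf("send template: %w", directPermanent())
}

func main() {
	fmt.Println(IsPermanent(directPermanent()), isPermanentAs(directPermanent()))   // true true
	fmt.Println(IsPermanent(wrappedPermanent()), isPermanentAs(wrappedPermanent())) // false true
}
```

With the assertion-based check, a single well-meaning fmt.Errorf in a handler would turn a permanent failure into one that is retried.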
How It's Used in Kafka Consumer
The consumer loop uses the helpers version to decide whether to retry or skip a message:
// pkg/eventbus/kafka/consumer.go
if err := c.processMessage(ctx, sub, msg); err != nil {
// PERMANENT ERROR → stop retrying
if helpers.IsPermanent(err) {
slog.Error("Permanent error, skipping message",
"topic", sub.topic,
"partition", msg.Partition,
"offset", msg.Offset,
"error", err,
)
// Commit the offset so Kafka won't deliver this message again
sub.reader.CommitMessages(ctx, msg)
continue
}
// TRANSIENT ERROR → do NOT commit, Kafka will redeliver
slog.Warn("Transient error, will retry",
"topic", sub.topic,
"error", err,
)
}
Decision Table
| Error Type | Commit Offset? | Retry? | Example |
|---|---|---|---|
| PermanentError | Yes | No | Invalid template ID, bad request to Meta API |
| Transient error | No | Yes (auto) | Network timeout, rate limit, 503 from server |
| nil (success) | Yes | N/A | Message processed successfully |
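The decision table can be read as a pure function from an error to an action. The following sketch encodes it that way; the action type, its constants, and the two sample errors are hypothetical names introduced only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

type PermanentError struct{ Err error }

func (e PermanentError) Error() string { return e.Err.Error() }

func isPermanent(err error) bool {
	_, ok := err.(PermanentError)
	return ok
}

// action is a hypothetical label for what the consumer does next.
type action string

const (
	commit        action = "commit"          // success: advance the offset
	skipAndCommit action = "log and commit"  // permanent: never retry
	noCommit      action = "leave for retry" // transient: offset stays put
)

// decide maps a processing result onto a row of the decision table.
func decide(err error) action {
	switch {
	case err == nil:
		return commit
	case isPermanent(err):
		return skipAndCommit
	default:
		return noCommit
	}
}

// Sample errors, invented for the demonstration.
var (
	errBadTemplate error = PermanentError{Err: errors.New("invalid template ID")}
	errTimeout           = errors.New("network timeout")
)

func main() {
	fmt.Println(decide(nil))
	fmt.Println(decide(errBadTemplate))
	fmt.Println(decide(errTimeout))
}
```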
3.8 Meta API Error Classification
What
The HandleMetaAPIError function classifies HTTP responses from the Meta/WhatsApp Business API into permanent or retryable errors. This drives the retry behavior of the Kafka consumer.
// pkg/shared/helpers/permanent_error.go
type MetaErrorResponse struct {
Error struct {
Message string `json:"message"`
Type string `json:"type"`
Code int `json:"code"`
ErrorSubcode int `json:"error_subcode"`
FbTraceID string `json:"fbtrace_id"`
} `json:"error"`
}
func HandleMetaAPIError(statusCode int, respBody []byte) error {
// Success — no error
if statusCode >= 200 && statusCode < 300 {
return nil
}
// Parse Meta's error response body
var metaErr MetaErrorResponse
json.Unmarshal(respBody, &metaErr)
baseErr := fmt.Errorf(
"meta api error status=%d code=%d subcode=%d message=%s",
statusCode, metaErr.Error.Code,
metaErr.Error.ErrorSubcode, metaErr.Error.Message,
)
// ── HTTP-level classification ──
switch statusCode {
case 400, 401, 403, 404:
return NewPermanentError(baseErr) // client error → don't retry
case 429:
return baseErr // rate limited → retry later
case 500, 502, 503, 504:
return baseErr // server error → retry
}
// ── Meta error code classification ──
switch metaErr.Error.Code {
case 100: return NewPermanentError(baseErr) // invalid parameter
case 190: return NewPermanentError(baseErr) // invalid OAuth token
case 10, 200, 368:
return NewPermanentError(baseErr) // permission / blocked
}
// ── Subcode classification ──
switch metaErr.Error.ErrorSubcode {
case 33: return NewPermanentError(baseErr) // unsupported request
case 2388007: return NewPermanentError(baseErr) // template not found
}
// ── Fallback ──
if statusCode >= 400 && statusCode < 500 {
return NewPermanentError(baseErr) // unknown 4xx → permanent
}
return baseErr // everything else → retryable
}
Why This Three-Layer Classification
The Meta API uses a combination of HTTP status codes, error codes, and error subcodes. A single HTTP status isn't enough to determine retry strategy:
| Layer | Permanent (Don't Retry) | Retryable |
|---|---|---|
| HTTP Status | 400, 401, 403, 404 | 429 (rate limit), 500-504 |
| Meta Error Code | 100, 190, 10, 200, 368 | Other codes |
| Meta Subcode | 33, 2388007 | Other subcodes |
| Fallback | Any unknown 4xx | Everything else |
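When HandleMetaAPIError returns a PermanentError, the Kafka consumer commits the message offset (skipping it forever). When it returns a regular error, the consumer does NOT commit, so Kafka will redeliver the message for retry. Because the HTTP status switch runs first and returns for every status it lists, the Meta code and subcode checks only apply to statuses outside that list.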
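To see the precedence between the layers, the sketch below condenses the classification logic into a function named classify and runs a few inputs through it. classify and verdict are hypothetical names, and the sample status and body combinations are invented inputs rather than recorded Meta responses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type PermanentError struct{ Err error }

func (e PermanentError) Error() string { return e.Err.Error() }

type metaErrorResponse struct {
	Error struct {
		Message      string `json:"message"`
		Code         int    `json:"code"`
		ErrorSubcode int    `json:"error_subcode"`
	} `json:"error"`
}

// classify is a condensed copy of the HandleMetaAPIError logic shown above.
func classify(statusCode int, body []byte) error {
	if statusCode >= 200 && statusCode < 300 {
		return nil
	}
	var m metaErrorResponse
	_ = json.Unmarshal(body, &m) // a malformed body leaves the zero value
	base := fmt.Errorf("meta api error status=%d code=%d subcode=%d message=%s",
		statusCode, m.Error.Code, m.Error.ErrorSubcode, m.Error.Message)

	switch statusCode { // layer 1: HTTP status, returns early
	case 400, 401, 403, 404:
		return PermanentError{Err: base}
	case 429, 500, 502, 503, 504:
		return base
	}
	switch m.Error.Code { // layer 2: Meta error code
	case 100, 190, 10, 200, 368:
		return PermanentError{Err: base}
	}
	switch m.Error.ErrorSubcode { // layer 3: Meta subcode
	case 33, 2388007:
		return PermanentError{Err: base}
	}
	if statusCode >= 400 && statusCode < 500 { // fallback for unknown 4xx
		return PermanentError{Err: base}
	}
	return base
}

// verdict is a hypothetical helper that names the outcome.
func verdict(statusCode int, body string) string {
	err := classify(statusCode, []byte(body))
	if err == nil {
		return "ok"
	}
	if _, ok := err.(PermanentError); ok {
		return "permanent"
	}
	return "retry"
}

func main() {
	fmt.Println(verdict(200, `{}`))                       // ok
	fmt.Println(verdict(400, `{"error":{"code":100}}`))   // permanent
	fmt.Println(verdict(429, `{"error":{"code":190}}`))   // retry: status wins over code
	fmt.Println(verdict(507, `{"error":{"code":190}}`))   // permanent: code layer reached
	fmt.Println(verdict(418, ``))                         // permanent: unknown 4xx fallback
}
```

The 429 case is the one to notice: a token error reported alongside a rate-limit status is still treated as retryable, because the status switch returns before the code is examined.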
3.9 Validation Error Handling
What
Two functions handle validation errors from go-playground/validator, converting field-level validation failures into structured JSON responses with per-field detail messages.
ValidationError — Simple Response
// pkg/shared/http/error.go
func ValidationError(w http.ResponseWriter, err error) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
httpErr := &Error{
Status: http.StatusBadRequest,
Code: "VALIDATION_ERROR",
Message: "validation error",
Details: []ErrorDetail{},
}
if ve, ok := err.(validator.ValidationErrors); ok {
for _, e := range ve {
httpErr.Details = append(httpErr.Details, ErrorDetail{
Field: toSnakeCase(e.Field()), // PascalCase → snake_case
Message: getValidationErrorMessage(e.Tag(), e.Param()),
})
}
} else {
httpErr.Details = append(httpErr.Details, ErrorDetail{
Message: err.Error(),
})
}
json.NewEncoder(w).Encode(httpErr)
}
ValidationErrorWithMeta — With Request Meta
Adds request metadata (trace ID, timing) to the validation error response:
func ValidationErrorWithMeta(w http.ResponseWriter, r *http.Request, err error) {
// Same validation logic, but wraps in:
resp := ValidationErrorResponse{
Success: false,
Error: httpErr,
Meta: NewMeta(r.Context()), // adds request_id, timestamp, etc.
}
json.NewEncoder(w).Encode(resp)
}
Validation Tag Messages
The getValidationErrorMessage function translates validator tags into human-readable messages:
func getValidationErrorMessage(tag, param string) string {
switch tag {
case "required": return "required"
case "email": return "invalid email format"
case "min": return "must be at least " + param + " characters"
case "max": return "must be at most " + param + " characters"
case "url": return "invalid URL format"
case "uuid": return "invalid UUID format"
case "oneof": return "must be one of: " + param
case "numeric": return "must be numeric"
default: return tag
}
}
Example API Response
{
"code": "VALIDATION_ERROR",
"message": "validation error",
"details": [
{ "field": "email", "message": "invalid email format" },
{ "field": "password", "message": "must be at least 8 characters" },
{ "field": "role_id", "message": "invalid UUID format" }
]
}
Field names are converted from PascalCase (RoleID) to snake_case (role_id) via toSnakeCase(), matching the JSON convention used by the API.
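The source does not show how toSnakeCase is implemented. As a sketch under that caveat, the following hypothetical version produces the conversions described here, including treating a trailing acronym such as ID as a single word; the real helper may differ.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toSnakeCase is a hypothetical implementation, not the codebase's own.
// It inserts an underscore before an uppercase letter when that letter
// starts a new word: either the previous rune is lowercase or a digit,
// or the previous rune is uppercase and the next one is lowercase
// (the end of an acronym, as in "HTTPServer").
func toSnakeCase(s string) string {
	runes := []rune(s)
	var b strings.Builder
	for i, r := range runes {
		if unicode.IsUpper(r) && i > 0 {
			prev := runes[i-1]
			nextLower := i+1 < len(runes) && unicode.IsLower(runes[i+1])
			if unicode.IsLower(prev) || unicode.IsDigit(prev) ||
				(unicode.IsUpper(prev) && nextLower) {
				b.WriteByte('_')
			}
		}
		b.WriteRune(unicode.ToLower(r))
	}
	return b.String()
}

func main() {
	for _, name := range []string{"Email", "RoleID", "PhoneNumber", "HTTPServer"} {
		fmt.Printf("%s -> %s\n", name, toSnakeCase(name))
	}
}
```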
3.10 Error Flow Diagram
HTTP Request Error Propagation
Errors flow upward through layers, being wrapped, classified, or transformed at each boundary:
┌─────────────────────────────────────────────────────────────────────────┐
│ HTTP HANDLER │
│ │
│ func (h *Handler) Get(w http.ResponseWriter, r *http.Request) { │
│ user, err := h.service.GetByID(id) │
│ if err != nil { │
│ httpx.HandleError(w, err) ─────────────────┐ │
│ return │ │
│ } │ │
│ } │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ errors.As() │ │
│ │ *httpx.Error? │ │
│ └──┬───────────┬───┘ │
│ yes │ │ no │
│ ▼ ▼ │
│ Status from 500 Internal │
│ Error.Status Server Error │
└─────────────────────────────────────────────────────────────────────────┘
▲
│ returns error (e.g., httpx.NotFound)
│
┌─────────────────────────────────────────────────────────────────────────┐
│ SERVICE LAYER │
│ │
│ func (s *Service) GetByID(id string) (*User, error) { │
│ user, err := s.repo.FindByID(id) │
│ if err != nil { │
│ return nil, err // pass through httpx errors from repo │
│ } │
│ return user, nil │
│ } │
└─────────────────────────────────────────────────────────────────────────┘
▲
│ returns httpx.NotFound / httpx.Conflic
│
┌─────────────────────────────────────────────────────────────────────────┐
│ REPOSITORY LAYER │
│ │
│ func (r *Repo) FindByID(id string) (*User, error) { │
│ var user User │
│ err := r.db.First(&user, "id = ?", id).Error │
│ if err != nil { │
│ if errors.Is(err, gorm.ErrRecordNotFound) { │
│ return nil, httpx.NotFound("user not found") ◄── CLASSIFY│
│ } │
│ return nil, httpx.Internal(err) ◄── CLASSIFY │
│ } │
│ return &user, nil │
│ } │
│ │
│ func (r *Repo) Create(user *User) error { │
│ err := r.db.Create(user).Error │
│ if err != nil { │
│ var pgErr *pgconn.PgError │
│ if errors.As(err, &pgErr) { │
│ if pgErr.Code == "23503" { │
│ return httpx.Conflic("already exists") ◄── CLASSIFY│
│ } │
│ } │
│ return httpx.BadRequest(err.Error()) │
│ } │
│ return nil │
│ } │
└─────────────────────────────────────────────────────────────────────────┘
▲
│ raw errors (gorm.ErrRecordNotFound, *pgconn.PgError)
│
┌─────────────────────────────────────────────────────────────────────────┐
│ DATABASE / EXTERNAL SYSTEM │
│ │
│ gorm.ErrRecordNotFound *pgconn.PgError{Code: "23503"} │
│ connection errors constraint violations │
└─────────────────────────────────────────────────────────────────────────┘
Kafka Consumer Error Propagation
A separate flow for asynchronous event processing with retry classification:
┌──────────────────────────────────────────────────────────────────┐
│ KAFKA CONSUMER LOOP │
│ │
│ msg := reader.FetchMessage(ctx) │
│ err := processMessage(ctx, sub, msg) │
│ │
│ ┌─── err == nil? ────────────────────── YES ──► Commit offset │
│ │ │
│ NO │
│ │ │
│ ├─── helpers.IsPermanent(err)? ── YES ──► Log + Commit offset │
│ │ (skip forever) │
│ │ │
│ └─── transient error ────────────────── ► Do NOT commit │
│ (Kafka will redeliver)│
└──────────────────────────────────────────────────────────────────┘
▲
│ PermanentError or regular error
│
┌──────────────────────────────────────────────────────────────────┐
│ EVENT HANDLER / SERVICE │
│ │
│ err := callMetaAPI(ctx, payload) │
│ │
│ ┌─── err == nil? ───────── YES ──► return nil (success) │
│ │ │
│ └─── err != nil ────────────────► return err (may be │
│ PermanentError or regular) │
└──────────────────────────────────────────────────────────────────┘
▲
│ classified by HandleMetaAPIError
│
┌──────────────────────────────────────────────────────────────────┐
│ HandleMetaAPIError() │
│ │
│ HTTP 400/401/403/404 ──► NewPermanentError() (don't retry) │
│ HTTP 429 ──► regular error (retry later) │
│ HTTP 500/502/503/504 ──► regular error (retry) │
│ Meta code 100/190 ──► NewPermanentError() (don't retry) │
│ Unknown 4xx ──► NewPermanentError() (don't retry) │
│ Everything else ──► regular error (retry) │
└──────────────────────────────────────────────────────────────────┘
Summary of Error Patterns
| Pattern | Where Used | Purpose |
|---|---|---|
| httpx.NotFound() | Repositories, services | Classify error with HTTP status |
| fmt.Errorf("...: %w") | All layers | Wrap with context, preserve chain |
| errors.As() | Handlers, repositories | Extract typed error from chain |
| errors.Is() | Repositories, consumer | Check for specific sentinel error |
| errors.New() | Consumer guards | API misuse / programming errors |
| PermanentError | Event processing | Classify as non-retryable |
| HandleMetaAPIError() | Meta API client | Classify external API errors |
| ValidationError() | HTTP handlers | Field-level validation response |
| HandleError() | HTTP handlers | Central error → HTTP response |