Event Stream Schema Design for Partner Integrations

TL;DR: A specification for designing generic event schemas in MongoDB to handle diverse partner integration events with retry tracking.

Introduction

When building partner integration systems, events flow in both directions: inbound from partners and outbound to partners. This document specifies a generic schema that stores all events in a single MongoDB collection while staying flexible enough for diverse event types.

Schema Specification

{
  "_id": "62a9c38946e0fb00017f6bbb",
  "event_day_id": 2723,
  "event_version": "v2",
  "event_type": "OR_MGMT",
  "event_status": "FAILED",
  "event_direction": "IN",
  
  "entity_metadata": {
    "partnerId": "6231cfc8800d284c981b3146",
    "partnerName": "ExamplePartner",
    "entity_id": "order-12345",
    "entity_type": "Order",
    "resilienceAttemptCount": {
      "1": "1655292809853",
      "2": "1655292814919",
      "3": "1655292824964"
    }
  },
  
  "entity_data": {
    // Flexible JSON - varies by event type
  },
  
  "expire_at": "2022-06-15T11:34:45.102Z",
  "created": "1655292885101",
  "createdBy": "624acf127ad1c76470951dff",
  "lastUpdatedTime": "1655292885101"
}

Field Definitions

Core Event Fields

Field            Type     Description
event_day_id     Integer  Day identifier for partitioning (days since epoch)
event_version    String   Schema version for backwards compatibility
event_type       String   Domain + event category (e.g., OR_MGMT)
event_status     String   Processing status: PENDING, PROCESSING, SUCCESS, FAILED
event_direction  String   IN (from partner) or OUT (to partner)
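
The allowed values in the table above can be enforced before a document is written. The following is a minimal sketch, not part of the specification: the helper name validateCoreFields and its error messages are hypothetical, and it checks only the five core fields.

```javascript
// Allowed values taken from the Core Event Fields table.
const EVENT_STATUSES = ["PENDING", "PROCESSING", "SUCCESS", "FAILED"];
const EVENT_DIRECTIONS = ["IN", "OUT"];

// Hypothetical helper: returns a list of problems found in the core fields.
// An empty list means the core fields look valid.
function validateCoreFields(event) {
  const errors = [];
  if (!Number.isInteger(event.event_day_id)) {
    errors.push("event_day_id must be an integer");
  }
  for (const field of ["event_version", "event_type"]) {
    if (typeof event[field] !== "string" || event[field] === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  if (!EVENT_STATUSES.includes(event.event_status)) {
    errors.push(`event_status must be one of ${EVENT_STATUSES.join(", ")}`);
  }
  if (!EVENT_DIRECTIONS.includes(event.event_direction)) {
    errors.push(`event_direction must be one of ${EVENT_DIRECTIONS.join(", ")}`);
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a caller log every violation for a rejected partner event in one pass.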

Entity Metadata

Field                   Type    Description
partnerId               String  Unique partner identifier
partnerName             String  Human-readable partner name
entity_id               String  External/business identifier
entity_type             String  Entity category: Order, Sub, Prom
resilienceAttemptCount  Object  Retry attempt timestamps

Entity Data

The entity_data field is intentionally schemaless to accommodate varying event payloads:

"entity_data": {
  "orderId": 59734,
  "orderStatusLatest": "OR_PLC",
  "orderStatusLatestTimestamp": 1655292804,
  "orderLineItems": [...],
  "orderSummary": {...},
  "orderMetadata": {...}
}

Retry Tracking Pattern

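The resilienceAttemptCount object records each delivery attempt, keyed by attempt number, with the attempt's epoch-millisecond timestamp stored as a string:

"resilienceAttemptCount": {
  "1": "1655292809853",  // First attempt
  "2": "1655292814919",  // Retry 5 seconds after attempt 1
  "3": "1655292824964",  // Retry 10 seconds after attempt 2
  "4": "1655292845009",  // Retry 20 seconds after attempt 3
  "5": "1655292885057"   // Retry 40 seconds after attempt 4
}

The gap between consecutive timestamps doubles each time (5, 10, 20, 40 seconds), so the stored values can be checked against the expected exponential backoff schedule, and the number of keys gives the attempt count needed to enforce a retry limit.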
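
As an illustration of the backoff analysis, the gaps between attempts can be derived directly from the stored object. This is a sketch under assumptions: the function names attemptDelays and followsBackoff are hypothetical, and the 5000 ms base delay and the tolerance are inferred from the sample timestamps rather than stated by the specification.

```javascript
// Returns the gaps in milliseconds between consecutive attempts,
// ordered by attempt number.
function attemptDelays(resilienceAttemptCount) {
  const timestamps = Object.entries(resilienceAttemptCount)
    .sort(([a], [b]) => Number(a) - Number(b)) // keys are numeric strings
    .map(([, ts]) => Number(ts));              // values are epoch-ms strings
  const delays = [];
  for (let i = 1; i < timestamps.length; i++) {
    delays.push(timestamps[i] - timestamps[i - 1]);
  }
  return delays;
}

// True when delay i is within toleranceMs of baseMs * 2^i.
// baseMs and toleranceMs are assumptions chosen by the caller.
function followsBackoff(delays, baseMs, toleranceMs) {
  return delays.every((d, i) => Math.abs(d - baseMs * 2 ** i) <= toleranceMs);
}

const sample = {
  "1": "1655292809853",
  "2": "1655292814919",
  "3": "1655292824964",
  "4": "1655292845009",
  "5": "1655292885057",
};
const delays = attemptDelays(sample); // [5066, 10045, 20045, 40048]
const onSchedule = followsBackoff(delays, 5000, 100);
```

The tolerance absorbs scheduling jitter; the sample attempts drift by at most 66 ms from the nominal schedule.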

Indexing Strategy

Recommended indexes for common query patterns:

// Event retrieval by partner
db.partner_events.createIndex({ "entity_metadata.partnerId": 1, "created": -1 })

// Status-based queries
db.partner_events.createIndex({ "event_status": 1, "event_type": 1 })

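// TTL for automatic cleanup; expire_at must be stored as a BSON Date,
// because the TTL monitor skips documents whose indexed field is not a date
db.partner_events.createIndex({ "expire_at": 1 }, { expireAfterSeconds: 0 })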

// Entity lookup
db.partner_events.createIndex({ "entity_metadata.entity_id": 1, "entity_metadata.entity_type": 1 })

Query Patterns

Failed Events for Retry

// FAILED events that have not yet recorded a fifth attempt
db.partner_events.find({
  event_status: "FAILED",
  "entity_metadata.resilienceAttemptCount.5": { $exists: false }
})
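
Once a failed event has been selected by the query above, the next attempt needs to be recorded under the next key. The sketch below builds a $set update document for that purpose. The helper name nextAttemptUpdate is hypothetical, and the default cap of 5 is an assumption taken from the ".5" check in the query, not a value the specification fixes.

```javascript
// Assumption: the retry cap matches the ".5" existence check in the query.
const MAX_ATTEMPTS = 5;

// Hypothetical helper: returns a MongoDB update document that records the
// next attempt, or null when the cap has already been reached.
function nextAttemptUpdate(event, nowMs, maxAttempts = MAX_ATTEMPTS) {
  const attempts = event.entity_metadata.resilienceAttemptCount || {};
  const next = Object.keys(attempts).length + 1;
  if (next > maxAttempts) {
    return null; // no further retries allowed
  }
  const stamp = String(nowMs); // timestamps are stored as strings in the schema
  return {
    $set: {
      [`entity_metadata.resilienceAttemptCount.${next}`]: stamp,
      lastUpdatedTime: stamp,
    },
  };
}
```

A non-null result could be passed as the update argument to db.partner_events.updateOne together with a filter on _id; a null result signals that the event should be left in FAILED without another retry.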

Partner Event History

// Most recent 100 events for one partner (served by the partnerId + created index)
db.partner_events.find({
  "entity_metadata.partnerId": "6231cfc8800d284c981b3146"
}).sort({ created: -1 }).limit(100)

Best Practices

  1. Schema versioning: Always include event_version so consumers can handle older payloads during migrations
  2. TTL configuration: Store expire_at as a BSON Date set according to retention requirements; the TTL index ignores string values
  3. Idempotency: Use entity_id + event_type as the deduplication key
  4. Retry limits: Cap the number of entries in resilienceAttemptCount (5 in the retry query above) to prevent infinite retry loops
  5. Audit fields: Always populate created, createdBy, and lastUpdatedTime
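
Two of the practices above, idempotency and TTL configuration, can be sketched as small helpers. Both function names (dedupKey, buildExpireAt), the key format, and the 30-day retention used in the example are assumptions for illustration only.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical deduplication key built from entity_id + event_type.
// The "type:id" format is an assumption, not part of the schema.
function dedupKey(event) {
  return `${event.event_type}:${event.entity_metadata.entity_id}`;
}

// Returns a Date (not a string) so that a TTL index on expire_at can act on it.
// createdMs may be a number or the string form used by the "created" field.
function buildExpireAt(createdMs, retentionDays) {
  return new Date(Number(createdMs) + retentionDays * MS_PER_DAY);
}

const event = {
  event_type: "OR_MGMT",
  entity_metadata: { entity_id: "order-12345" },
  created: "1655292885101",
};
const key = dedupKey(event);                          // "OR_MGMT:order-12345"
const expireAt = buildExpireAt(event.created, 30);    // 30 days after creation
```

Before inserting, a writer could look up an existing event with the same key and skip the insert when one is found.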