MongoDB TTL Index and Storage Optimization

TL;DR: Use TTL indexes to expire transient NT documents automatically and reduce storage use in the MongoDB NT system.

Current Infrastructure Status

Metric           Limit    Current
Tier             M60      M30 - General
Storage          90 GB    37.8 GB
IOPS             3000     ~350
Connections      3000     ~600
Read/Write Ops   -        ~720 reads

MongoDB Version: 5.0.15
Auto-scaling: Scale out max: M60, Scale in min: M30

Problem Statement

The nt_data collection stores multiple entries for various NT types:

  • Site buzz NTs (not enabled in NT center)
  • Push NTs only
  • User-created discussions

Many of these entries are transient and don’t need long-term storage, consuming unnecessary disk space.

TTL Index Strategy

Proposed Expiration Rules

Entry Type          nt_center_enabled   TTL Duration
Push-only NTs       false               7 days
NT center entries   true                30 days (configurable)

Implementation Approach

  1. Create an expireAt field on each document
  2. Set expiration timestamp based on NT type during creation/update
  3. Create TTL index with expireAfterSeconds: 0 (uses the expireAt value directly)
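
The steps above can be sketched in plain JavaScript as follows. This is a minimal illustration, not the production implementation: the function name computeExpireAt and the DEFAULT_TTL_DAYS object are hypothetical, and the 30-day and 7-day values are taken from the proposed rules.

```javascript
// Hypothetical defaults mirroring the proposed rules (30 days / 7 days).
const DEFAULT_TTL_DAYS = { ntCenter: 30, pushOnly: 7 };
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the Date at which a document should expire.
// `nce` is the nt-center-enabled flag; `now` is injectable so the
// function can be tested with a fixed clock.
function computeExpireAt(nce, now = new Date(), ttlDays = DEFAULT_TTL_DAYS) {
    const days = nce ? ttlDays.ntCenter : ttlDays.pushOnly;
    return new Date(now.getTime() + days * MS_PER_DAY);
}

// Example: a push-only document created at a fixed time.
const doc = { nce: false, createdAt: new Date("2024-01-01T00:00:00Z") };
doc.expireAt = computeExpireAt(doc.nce, doc.createdAt);
console.log(doc.expireAt.toISOString()); // 2024-01-08T00:00:00.000Z
```

Because the index uses expireAfterSeconds: 0, the TTL monitor compares expireAt directly with the current time, so changing a TTL later only requires writing a different expireAt value, not rebuilding the index.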

Index Creation Commands

Primary TTL Index

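db.nt_data.createIndex(
    { "expireAt": 1 },
    { "expireAfterSeconds": 0 }
)

Note: On MongoDB 4.2 and later (this cluster runs 5.0.15) the background option is ignored, so it is omitted here. The build takes an exclusive lock only briefly at its start and end, but it may still take significant time on collections with 100M+ entries.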

Temporary Partial Index

For immediate cleanup of non-nt-center entries:

db.nt_data.createIndex(
    { "expiry": 1 },
    { 
        "expireAfterSeconds": 0, 
        "partialFilterExpression": { "nce": false }
    }
)
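
To confirm that the indexes were created with the intended options, the output of getIndexes() can be filtered for TTL entries. The helper below is a hedged sketch: describeTtlIndexes is a hypothetical name, and the sample array is hand-written to resemble getIndexes() output rather than copied from a real cluster.

```javascript
// Summarizes the TTL indexes found in an array shaped like the result
// of db.collection.getIndexes().
function describeTtlIndexes(indexes) {
    return indexes
        .filter((ix) => ix.expireAfterSeconds !== undefined)
        .map((ix) => ({
            name: ix.name,
            field: Object.keys(ix.key)[0],
            expireAfterSeconds: Number(ix.expireAfterSeconds),
            partial: ix.partialFilterExpression !== undefined,
        }));
}

// Illustrative input only; index names assume MongoDB's default naming.
const sampleIndexes = [
    { v: 2, key: { _id: 1 }, name: "_id_" },
    { v: 2, key: { expireAt: 1 }, name: "expireAt_1", expireAfterSeconds: 0 },
    { v: 2, key: { expiry: 1 }, name: "expiry_1", expireAfterSeconds: 0,
      partialFilterExpression: { nce: false } },
];
console.log(describeTtlIndexes(sampleIndexes));
```

In mongosh this would be called as describeTtlIndexes(db.nt_data.getIndexes()).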

Application Code Changes

Setting Expiration on Document Creation

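import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Date;

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.index.Indexed;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.core.mapping.Field;

@Document(collection = "nt_data")
public class NT {

    @Id
    private String id;

    // Persisted as "nce", the field name used by the queries in this document
    @Field("nce")
    private boolean ntCenterEnabled;

    // TTL index: the document is removed once expireAt is in the past
    @Indexed(expireAfter = "0s")
    private Date expireAt;

    // Other fields...

    // Picks the TTL by NT type and stores the absolute expiration time
    public void setExpiration(int ntTtlDays, int pushOnlyTtlDays) {
        int ttlDays = ntCenterEnabled ? ntTtlDays : pushOnlyTtlDays;
        this.expireAt = Date.from(
            Instant.now().plus(ttlDays, ChronoUnit.DAYS)
        );
    }
}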

Configuration Properties

nt:
  ttl:
    nt-center-days: 30
    push-only-days: 7

Index Analysis Considerations

Current Indexes to Review

Analyze existing indexes for:

  • Usage frequency
  • Query patterns
  • Redundancy with new TTL index

Index Maintenance

// Check index usage statistics
db.nt_data.aggregate([
    { $indexStats: {} }
])

// Identify rarely used indexes (fewer than 100 operations since the last server restart)
db.nt_data.aggregate([
    { $indexStats: {} },
    { $match: { "accesses.ops": { $lt: 100 } } }
])
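
The $indexStats output can also be post-processed so that indexes which must stay are never flagged. The sketch below assumes a hypothetical helper name (lowUsageIndexes) and a hypothetical keep list; the TTL index belongs on that list because it is needed for expiration regardless of how often queries use it.

```javascript
// Filters $indexStats results down to removal candidates.
// `stats` is the array produced by aggregate([{ $indexStats: {} }]).toArray().
// accesses.ops may arrive as a 64-bit integer type, so it is coerced with Number().
function lowUsageIndexes(stats, threshold = 100, keep = ["_id_", "expireAt_1"]) {
    return stats
        .filter((s) => !keep.includes(s.name))
        .filter((s) => Number(s.accesses.ops) < threshold)
        .map((s) => ({
            name: s.name,
            ops: Number(s.accesses.ops),
            since: s.accesses.since, // counters reset when the server restarts
        }));
}
```

In mongosh this would be called as lowUsageIndexes(db.nt_data.aggregate([{ $indexStats: {} }]).toArray()). A low count is only meaningful if accesses.since covers a representative period of traffic.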

Estimated Storage Impact

Before implementing, estimate the reduction:

// Count documents that would be expired
db.nt_data.countDocuments({
    "nce": false,
    "createdAt": { $lt: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) }
})

// Estimate size reduction in uncompressed BSON bytes (on-disk size differs because of WiredTiger compression)
db.nt_data.aggregate([
    { $match: { 
        "nce": false,
        "createdAt": { $lt: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) }
    }},
    { $group: { 
        _id: null, 
        totalSize: { $sum: { $bsonSize: "$$ROOT" } }
    }}
])
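
The totalSize value from the aggregation is a raw byte count, so a small conversion makes it easier to compare with the 37.8 GB currently in use. The following is a rough sketch: summarizeReduction is a hypothetical helper, 1 GB is treated as 1024^3 bytes, and the 5 GB input is an invented figure for illustration, not a measured result.

```javascript
const BYTES_PER_GB = 1024 ** 3;

// Converts an uncompressed BSON byte count into GB and expresses it
// as a share of the storage currently in use.
function summarizeReduction(totalSizeBytes, currentStorageGb) {
    const reclaimableGb = totalSizeBytes / BYTES_PER_GB;
    return {
        reclaimableGb: Number(reclaimableGb.toFixed(2)),
        percentOfCurrent: Number(((reclaimableGb / currentStorageGb) * 100).toFixed(1)),
    };
}

// Invented example input: 5 GB of expirable documents against 37.8 GB in use.
console.log(summarizeReduction(5 * BYTES_PER_GB, 37.8));
```

Since $bsonSize measures uncompressed documents while the storage metric reflects compressed on-disk data, the percentage should be read as an upper-bound indication rather than an exact forecast.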

Rollout Plan

Phase 1: Preparation

  1. Create the TTL index on staging
  2. Monitor index build progress
  3. Verify documents are expiring correctly
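
For step 2, index build progress can be read from the currentOp output. The sketch below wraps the filter that the MongoDB manual suggests for active indexing operations; indexBuildProgress is a hypothetical helper name, and it assumes the connected user is allowed to run currentOp on the cluster.

```javascript
// Lists in-progress index builds with a completion percentage where
// the server reports one. `db` is the mongosh database object.
function indexBuildProgress(db) {
    const res = db.adminCommand({
        currentOp: true,
        $or: [
            { op: "command", "command.createIndexes": { $exists: true } },
            { op: "none", msg: /^Index Build/ },
        ],
    });
    return (res.inprog || []).map((op) => {
        const total = op.progress ? Number(op.progress.total) : 0;
        return {
            opid: op.opid,
            msg: op.msg || "",
            // progress is absent during some build phases, so percent may be null
            percent: total > 0
                ? Math.round((Number(op.progress.done) / total) * 100)
                : null,
        };
    });
}
```

In mongosh, calling indexBuildProgress(db) repeatedly while the build runs gives a rough view of how far it has advanced; an empty array means no build is currently visible to this user.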

Phase 2: Production

  1. Create the TTL index on production (the background option is ignored on MongoDB 4.2+)
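  2. Monitor data size and disk usage (WiredTiger reuses space freed by deletes but does not return it to the OS until the collection is compacted)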
  3. Update application code to set expireAt on new documents

Phase 3: Cleanup

  1. Remove temporary partial index if used
  2. Document final storage metrics
  3. Configure alerting for storage thresholds
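
As an illustration of the threshold check behind step 3, the sketch below compares storage in use against the tier limit. The helper name storageAlert and the 80% default threshold are assumptions made for this example; the 37.8 GB and 90 GB figures come from the infrastructure table.

```javascript
// Reports storage utilization and whether it crosses the alert threshold.
function storageAlert(usedGb, limitGb, threshold = 0.8) {
    const ratio = usedGb / limitGb;
    return {
        percentUsed: Math.round(ratio * 1000) / 10, // one decimal place
        alert: ratio >= threshold,
    };
}

console.log(storageAlert(37.8, 90)); // { percentUsed: 42, alert: false }
```

In practice the threshold would be configured in the monitoring system itself; the function only makes the arithmetic explicit.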

Monitoring Queries

// Monitor TTL deletion rate
db.adminCommand({ serverStatus: 1 }).metrics.ttl

// Check collection statistics
db.nt_data.stats()

// View index sizes
db.nt_data.stats().indexSizes
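
The metrics.ttl counters (deletedDocuments and passes) are cumulative since server start, so a deletion rate requires two samples. The sketch below shows the arithmetic; ttlDeletionRate is a hypothetical helper, and the sample values in the usage line are invented.

```javascript
// Computes TTL activity between two samples of serverStatus().metrics.ttl.
// Counters may be 64-bit integer types, so they are coerced with Number().
function ttlDeletionRate(earlier, later, elapsedSeconds) {
    const deleted = Number(later.deletedDocuments) - Number(earlier.deletedDocuments);
    const passes = Number(later.passes) - Number(earlier.passes);
    return {
        deleted,
        passes,
        deletedPerSecond: elapsedSeconds > 0 ? deleted / elapsedSeconds : 0,
    };
}

// Invented samples taken 300 seconds apart.
console.log(ttlDeletionRate(
    { deletedDocuments: 1000, passes: 10 },
    { deletedDocuments: 7000, passes: 15 },
    300
));
```

The TTL monitor runs roughly once every 60 seconds, so samples taken several minutes apart give a steadier rate than back-to-back readings.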

Best Practices

  1. Schedule index builds on large collections for low-traffic windows; on MongoDB 4.2+ the background option is ignored and no longer needs to be set
  2. Test on staging first with representative data volumes
  3. Monitor disk I/O during TTL cleanup cycles
  4. Set appropriate TTL values based on business requirements
  5. Use partial indexes when only a subset of documents needs expiration