MongoDB Compact Operation Guide

Overview

MongoDB’s compact command rewrites and defragments collection data and indexes, reclaiming disk space from deleted or updated documents.

When to Run Compact

After bulk delete operations
When disk usage is significantly higher than actual data size
When query performance has degraded and storage fragmentation is the suspected cause
Before increasing disk allocation

Prerequisites

1. Disk Space Assessment

Current Status:

Database size: 102.5 GB
Disk capacity: 116 GB
Available space: 13.5 GB (insufficient)

Requirement: The compact operation needs temporary space for data rewriting. Calculate:

Required space = Collection size × (1 + fragmentation ratio + safety margin)

Recommendation: Increase disk to at least 150 GB before proceeding.
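As a rough worked example of the formula above, the sketch below plugs in the figures from "Current Status". The 25% fragmentation ratio and 20% safety margin are assumptions chosen for illustration, not measured values, and the database size is used as a worst-case stand-in for the collection size. The function name requiredSpaceGB is hypothetical.

```javascript
// Sizing formula: collection size x (1 + fragmentation ratio + safety margin).
function requiredSpaceGB(collectionSizeGB, fragmentationRatio, safetyMargin) {
  return collectionSizeGB * (1 + fragmentationRatio + safetyMargin);
}

const diskCapacityGB = 116;   // from "Current Status"
const databaseSizeGB = 102.5; // from "Current Status", worst-case collection size

// Assumed values: 25% fragmentation, 20% safety margin.
const neededGB = requiredSpaceGB(databaseSizeGB, 0.25, 0.20);

console.log("Estimated space needed: " + neededGB.toFixed(1) + " GB");
console.log("Current disk is large enough: " + (diskCapacityGB >= neededGB));
```

With these assumed ratios the estimate comes to roughly 148.6 GB, which is in line with the 150 GB recommendation. Different ratios will give a different target.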

2. Backup

Always create a backup before running compact:

# Using mongodump
mongodump --uri="mongodb://user:pass@host:port/db_name" \
  --out=/backup/pre-compact-$(date +%Y%m%d)

# Or create a disk snapshot if you are using a cloud provider

3. Trial Run in Non-Production

Execute the compact operation in a development/staging environment first and document:

Duration
Resource utilization
Any errors encountered

Running Compact

Basic Syntax

// Connect to MongoDB
use target_database

// Compact a collection
db.runCommand({ compact: "collection_name" })

With Force Option

// Force compact to run on the primary of a replica set (use cautiously)
db.runCommand({ 
    compact: "collection_name",
    force: true 
})

Compact All Collections

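// Compact every collection, skipping views (compact does not apply to them)
db.getCollectionInfos({ type: "collection" }).forEach(function(info) {
    print("Compacting: " + info.name);
    try {
        printjson(db.runCommand({ compact: info.name }));
    } catch (e) {
        // mongosh throws on a failed command; log it and continue
        print("Failed: " + info.name + " (" + e.message + ")");
    }
});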

Monitoring During Compact

Watch Progress

// Check current operations
db.currentOp({ "command.compact": { $exists: true } })

Monitor Disk Usage

# Watch disk space during operation
watch -n 5 'df -h /data/db'

Check Collection Stats

// Before compact
db.collection_name.stats()

// Note: size, storageSize, and totalIndexSize (plus the per-index indexSizes)
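To judge whether compact is likely to be worthwhile, it can help to look at how much of the collection file WiredTiger already reports as reusable. The sketch below assumes the WiredTiger storage engine and reads the "file bytes available for reuse" counter from the block-manager section of the stats output. The helper name reusableSpace and the sample numbers are hypothetical; in mongosh the input would be the result of db.collection_name.stats().

```javascript
// Estimate how much of a collection's file is reported as reusable.
// `stats` is an object shaped like the output of db.collection_name.stats().
function reusableSpace(stats) {
  const blockManager = (stats.wiredTiger || {})["block-manager"] || {};
  const reusableBytes = blockManager["file bytes available for reuse"] || 0;
  const ratio = stats.storageSize > 0 ? reusableBytes / stats.storageSize : 0;
  return { reusableBytes, ratio };
}

// Hypothetical stats document with made-up numbers.
const sampleStats = {
  size: 6 * 1024 ** 3,
  storageSize: 10 * 1024 ** 3,
  wiredTiger: {
    "block-manager": { "file bytes available for reuse": 4 * 1024 ** 3 },
  },
};

const reusable = reusableSpace(sampleStats);
console.log(
  "Reusable: " + (reusable.reusableBytes / 1024 ** 3).toFixed(2) + " GB (" +
  (reusable.ratio * 100).toFixed(1) + "% of storageSize)"
);
```

A small ratio suggests compact will reclaim little and may not justify the maintenance window.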

Post-Compact Verification

Compare Storage Statistics

// Check improved storage utilization
var stats = db.collection_name.stats();

print("Data Size: " + (stats.size / 1024 / 1024).toFixed(2) + " MB");
print("Storage Size: " + (stats.storageSize / 1024 / 1024).toFixed(2) + " MB");
print("Index Size: " + (stats.totalIndexSize / 1024 / 1024).toFixed(2) + " MB");
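The printout above only shows the post-compact state. To quantify the gain, the stats captured before the run can be compared with the stats captured after it. The following is a minimal sketch: compareStorage is a hypothetical helper, and the two stats objects contain made-up values standing in for real db.collection_name.stats() output.

```javascript
// Compare storageSize before and after compact.
function compareStorage(before, after) {
  const reclaimedBytes = before.storageSize - after.storageSize;
  const percent =
    before.storageSize > 0 ? (reclaimedBytes / before.storageSize) * 100 : 0;
  return { reclaimedMB: reclaimedBytes / 1024 / 1024, percent };
}

// Hypothetical snapshots taken before and after the compact run.
const statsBefore = { storageSize: 8 * 1024 ** 3 };
const statsAfter = { storageSize: 6 * 1024 ** 3 };

const delta = compareStorage(statsBefore, statsAfter);
console.log(
  "Reclaimed: " + delta.reclaimedMB.toFixed(2) + " MB (" +
  delta.percent.toFixed(1) + "% of the original storage size)"
);
```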

Verify Query Performance

Run representative queries and compare execution times:

// Example query with explain
db.collection_name.find({ 
    status: "active",
    created_at: { $gte: ISODate("2023-01-01") }
}).explain("executionStats")

Impact Considerations

What Happens During Compact

Aspect	Impact
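Collection Access	Before MongoDB 4.4: all operations on the database are blocked. From 4.4: reads and writes continue, but dropping the collection and creating or dropping its indexes are blocked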
Replica Set	Other nodes remain available
Disk I/O	High disk activity
Duration	Proportional to collection size

Best Practices

Schedule During Low Traffic
- Run during maintenance windows
- Notify stakeholders of potential latency

Run on Secondaries First

// On secondary node
rs.secondaryOk()
db.runCommand({ compact: "collection_name" })

Monitor Replication Lag
```
rs.printSecondaryReplicationInfo()
```

Automation Script

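#!/bin/bash
# compact_collection.sh: log collection stats before and after a compact run
# Usage: ./compact_collection.sh <collection_name>
set -euo pipefail

MONGO_URI="mongodb://user:pass@host:port/dbname"
# Abort with a usage message if no collection name is given
COLLECTION="${1:?Usage: $0 <collection_name>}"
LOG_FILE="/var/log/mongo-compact-$(date +%Y%m%d).log"

echo "Starting compact for $COLLECTION at $(date)" >> "$LOG_FILE"

# Get pre-compact stats
mongosh "$MONGO_URI" --quiet --eval "JSON.stringify(db.getCollection('$COLLECTION').stats())" >> "$LOG_FILE"

# Run compact (the script stops here if the command fails)
mongosh "$MONGO_URI" --quiet --eval "JSON.stringify(db.runCommand({compact: '$COLLECTION'}))" >> "$LOG_FILE"

# Get post-compact stats
mongosh "$MONGO_URI" --quiet --eval "JSON.stringify(db.getCollection('$COLLECTION').stats())" >> "$LOG_FILE"

echo "Completed compact for $COLLECTION at $(date)" >> "$LOG_FILE"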

Alternative: Rolling Compact for Replica Sets

For minimal downtime, perform rolling compaction:

Compact each secondary node
Step down primary
Compact former primary (now secondary)

// Step down primary (on primary node)
rs.stepDown(300)  // 5-minute stepdown
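The node order for a rolling compact can be derived from the replica set status rather than tracked by hand. The sketch below is one possible way to do it: rollingCompactOrder is a hypothetical helper that takes an object shaped like the output of rs.status() and returns secondaries first and the primary last. The host names are invented for illustration.

```javascript
// Order members for a rolling compact: all secondaries, then the primary.
// Members in other states (for example arbiters) are left out.
function rollingCompactOrder(status) {
  const namesInState = (state) =>
    status.members.filter((m) => m.stateStr === state).map((m) => m.name);
  return namesInState("SECONDARY").concat(namesInState("PRIMARY"));
}

// Hypothetical replica set snapshot; in mongosh this would be rs.status().
const rsStatusExample = {
  members: [
    { name: "node-a:27017", stateStr: "PRIMARY" },
    { name: "node-b:27017", stateStr: "SECONDARY" },
    { name: "node-c:27017", stateStr: "SECONDARY" },
    { name: "node-d:27017", stateStr: "ARBITER" },
  ],
};

console.log(rollingCompactOrder(rsStatusExample));
```

The primary appears last because it must be stepped down before it is compacted as a secondary.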

Troubleshooting

"Not enough disk space"

# Check actual disk usage
du -sh /data/db/*

# Temporarily clean up logs or old backups

Operation Running Too Long

// Check if compact is still running
db.currentOp({ "command.compact": { $exists: true } })

// If needed, kill the operation
db.killOp(<opid>)
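The opid placeholder above has to be read from the currentOp output. As a sketch, the hypothetical helper compactOpIds below pulls the opids of running compact commands out of an object shaped like the result of db.currentOp(). The sample operations and opids are made up.

```javascript
// Extract the opids of in-progress compact commands from a currentOp result.
function compactOpIds(currentOpResult) {
  return currentOpResult.inprog
    .filter((op) => op.command && op.command.compact !== undefined)
    .map((op) => op.opid);
}

// Hypothetical currentOp output with made-up opids.
const sampleOps = {
  inprog: [
    { opid: 1201, command: { find: "orders" } },
    { opid: 1342, command: { compact: "orders" } },
  ],
};

console.log(compactOpIds(sampleOps));
// In mongosh, each returned opid could then be passed to db.killOp(opid).
```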

Replication Lag After Compact

Monitor and wait for secondaries to catch up before proceeding to next node:

// Check replication lag
rs.printSecondaryReplicationInfo()
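The "wait until caught up" decision can also be expressed as a check on the replica set status. The sketch below computes each secondary's lag from the optimeDate fields of an rs.status()-shaped object and compares it with a threshold. The helper names, the 10-second threshold, and the sample timestamps are assumptions for illustration.

```javascript
// Lag of each secondary behind the primary, in seconds.
function secondaryLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) throw new Error("no primary found in replica set status");
  const lags = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      // Subtracting two Date objects yields milliseconds.
      lags[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return lags;
}

// True when every secondary is within maxLagSeconds of the primary.
function safeToProceed(status, maxLagSeconds) {
  return Object.values(secondaryLagSeconds(status)).every(
    (lag) => lag <= maxLagSeconds
  );
}

// Hypothetical status snapshot; in mongosh this would be rs.status().
const lagStatusExample = {
  members: [
    { name: "node-a:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:00:30Z") },
    { name: "node-b:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:28Z") },
    { name: "node-c:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:05Z") },
  ],
};

console.log(secondaryLagSeconds(lagStatusExample));
console.log("Safe to proceed:", safeToProceed(lagStatusExample, 10));
```

In this made-up snapshot one secondary is 25 seconds behind, so the check reports that it is not yet safe to move on to the next node.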
