Content Profanity Detection System

TL;DR: N-gram based profanity detection for user-generated content with Redis-backed word management

Overview

A profanity detection system that matches n-grams (phrases of 1 up to GRAM_SIZE words) to filter inappropriate content in user-generated posts and messages.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Content Moderation Pipeline                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐    ┌──────────────┐    ┌────────────────┐   │
│  │   Content    │───▶│   Profanity  │───▶│   Decision     │   │
│  │   Input      │    │   Detector   │    │   Engine       │   │
│  └──────────────┘    └──────────────┘    └────────────────┘   │
│                             │                     │            │
│                             ▼                     ▼            │
│                      ┌──────────────┐    ┌────────────────┐   │
│                      │    Redis     │    │   Response     │   │
│                      │  Word Cache  │    │   (allow/flag) │   │
│                      └──────────────┘    └────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Configuration

Remote Configuration

# Remote config variables
GRAM_SIZE: 3              # Maximum n-gram size to check
CONFIDENCE_THRESHOLD: 0.8  # Matching confidence threshold
CONTENT_TYPES:
  - P2P                   # Peer-to-peer messages
  - POSTS                 # Community posts
  - COMMENTS              # Post comments
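These values could be surfaced to the service through environment variables, as the Docker configuration below does for GRAM_SIZE. A minimal sketch, assuming hypothetical variable names (`PROFANITY_CONFIDENCE_THRESHOLD` and `PROFANITY_CONTENT_TYPES` are not defined anywhere in this document):

```python
import os

def load_remote_config() -> dict:
    """Read the remote-config values from environment variables,
    falling back to the defaults documented above."""
    return {
        "gram_size": int(os.environ.get("PROFANITY_GRAM_SIZE", "3")),
        "confidence_threshold": float(
            os.environ.get("PROFANITY_CONFIDENCE_THRESHOLD", "0.8")),
        "content_types": os.environ.get(
            "PROFANITY_CONTENT_TYPES", "P2P,POSTS,COMMENTS").split(","),
    }
```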

N-gram Detection Algorithm

from typing import List, Set, Tuple
import re

class ProfanityDetector:
    def __init__(self, gram_size: int = 3):
        self.gram_size = gram_size
        self.profanity_cache: Set[str] = set()
        
    def load_words(self, redis_client):
        """Load single words and multi-word phrases from Redis."""
        raw = redis_client.sunion("profanity:words:single",
                                  "profanity:words:ngram")
        # redis-py returns bytes unless decode_responses=True is set on the client
        self.profanity_cache = {w.decode() if isinstance(w, bytes) else w
                                for w in raw}
        
    def generate_ngrams(self, text: str) -> List[str]:
        """Generate n-grams from 1 to gram_size"""
        # Normalize text
        words = re.sub(r'[^\w\s]', '', text.lower()).split()
        
        ngrams = []
        for n in range(1, min(self.gram_size + 1, len(words) + 1)):
            for i in range(len(words) - n + 1):
                ngram = ' '.join(words[i:i+n])
                ngrams.append(ngram)
                
        return ngrams
    
    def detect(self, text: str) -> Tuple[bool, List[str]]:
        """Detect profanity in text"""
        ngrams = self.generate_ngrams(text)
        
        found_profanity = []
        for ngram in ngrams:
            if ngram in self.profanity_cache:
                found_profanity.append(ngram)
                
        return (len(found_profanity) > 0, found_profanity)
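The generate-then-look-up flow can be exercised standalone; `generate_ngrams` here is a free-function restatement of the method above, using the same normalization regex:

```python
import re

def generate_ngrams(text: str, gram_size: int = 3) -> list:
    """All n-grams of the normalized text, for n = 1 .. gram_size."""
    words = re.sub(r'[^\w\s]', '', text.lower()).split()
    return [' '.join(words[i:i + n])
            for n in range(1, min(gram_size, len(words)) + 1)
            for i in range(len(words) - n + 1)]

# A six-word sentence yields 6 unigrams + 5 bigrams + 4 trigrams = 15 n-grams,
# one of which ("bad phrase here") would match a multi-word dictionary entry.
ngrams = generate_ngrams("this is bad phrase here today")
matches = [g for g in ngrams if g in {'bad phrase here'}]
```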

API Endpoints

Add Word to Dictionary

curl --location --request PUT \
  'http://internal-api:40063/word/' \
  --header 'x-access-token: YOUR_ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "word": "inappropriate_word",
    "contentType": "P2P"
  }'

Check Content

curl --location --request POST \
  'http://internal-api:40063/check/' \
  --header 'x-access-token: YOUR_ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "content": "Text to check for profanity",
    "contentType": "POSTS"
  }'

Response Format

Clean content:

{
  "status": "SUCCESS",
  "data": {
    "is_profane": false,
    "matched_words": [],
    "confidence": 0.0
  }
}

Profane content:

{
  "status": "SUCCESS",
  "data": {
    "is_profane": true,
    "matched_words": ["word1", "multi word phrase"],
    "confidence": 0.95
  }
}
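A caller could map this response to an allow/flag decision client-side. The decision rule below is an assumption (the Decision Engine may apply CONFIDENCE_THRESHOLD server-side), and `decide` is a hypothetical helper:

```python
import json

CONFIDENCE_THRESHOLD = 0.8  # mirrors the remote-config value

def decide(raw_response: str) -> str:
    """Map a /check/ response body to a moderation decision:
    flag only when a match clears the confidence threshold."""
    data = json.loads(raw_response)["data"]
    if data["is_profane"] and data["confidence"] >= CONFIDENCE_THRESHOLD:
        return "flag"
    return "allow"
```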

Word Management

Redis Data Structure

# Single words
profanity:words:single -> SET of single words

# Multi-word phrases (n-grams)
profanity:words:ngram -> SET of multi-word phrases

# Content type specific
profanity:words:P2P -> SET of P2P specific words
profanity:words:POSTS -> SET of post specific words

Adding Words Programmatically

import redis

class WordManager:
    def __init__(self, redis_host: str, redis_port: int):
        # decode_responses=True so Redis returns str rather than bytes
        self.redis = redis.Redis(host=redis_host, port=redis_port,
                                 decode_responses=True)
        
    def add_word(self, word: str, content_type: str = None):
        """Add word to profanity dictionary"""
        word = word.lower().strip()
        
        # Add to general set
        if ' ' in word:
            self.redis.sadd('profanity:words:ngram', word)
        else:
            self.redis.sadd('profanity:words:single', word)
            
        # Add to content-type specific set if provided
        if content_type:
            self.redis.sadd(f'profanity:words:{content_type}', word)
            
    def remove_word(self, word: str):
        """Remove word from all dictionaries"""
        word = word.lower().strip()
        
        # Remove from all sets
        for key in self.redis.scan_iter('profanity:words:*'):
            self.redis.srem(key, word)
            
    def list_words(self, content_type: str = None) -> list:
        """List all words, optionally filtered by content type"""
        if content_type:
            return list(self.redis.smembers(f'profanity:words:{content_type}'))
        
        words = set()
        words.update(self.redis.smembers('profanity:words:single'))
        words.update(self.redis.smembers('profanity:words:ngram'))
        return list(words)
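The routing rule in add_word (single words vs. n-gram phrases, plus an optional content-type set) can be isolated as a pure function; `target_keys` is a hypothetical helper restating that logic for illustration:

```python
def target_keys(word: str, content_type: str = None) -> list:
    """Return the Redis set keys a word belongs in, mirroring
    WordManager.add_word: phrases with spaces go to the ngram set,
    everything else to the single-word set."""
    word = word.lower().strip()
    keys = ['profanity:words:ngram' if ' ' in word
            else 'profanity:words:single']
    if content_type:
        keys.append(f'profanity:words:{content_type}')
    return keys
```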

Service Implementation

Spring Boot Service

import java.util.*;
import java.util.stream.Collectors;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class ProfanityService {
    
    private final RedisTemplate<String, String> redisTemplate;
    private final int gramSize;
    
    public ProfanityService(
            RedisTemplate<String, String> redisTemplate,
            @Value("${profanity.gram-size:3}") int gramSize) {
        this.redisTemplate = redisTemplate;
        this.gramSize = gramSize;
    }
    
    public ProfanityCheckResult check(String content, String contentType) {
        Set<String> allWords = loadWords(contentType);
        List<String> ngrams = generateNgrams(content.toLowerCase());
        
        List<String> matchedWords = ngrams.stream()
            .filter(allWords::contains)
            .collect(Collectors.toList());
            
        return ProfanityCheckResult.builder()
            .isProfane(!matchedWords.isEmpty())
            .matchedWords(matchedWords)
            .confidence(calculateConfidence(matchedWords, content))
            .build();
    }
    
    private Set<String> loadWords(String contentType) {
        Set<String> words = new HashSet<>();
        
        // Load single words
        Set<String> single = redisTemplate.opsForSet()
            .members("profanity:words:single");
        if (single != null) words.addAll(single);
        
        // Load n-grams
        Set<String> ngrams = redisTemplate.opsForSet()
            .members("profanity:words:ngram");
        if (ngrams != null) words.addAll(ngrams);
        
        // Load content-type specific
        if (contentType != null) {
            Set<String> typeSpecific = redisTemplate.opsForSet()
                .members("profanity:words:" + contentType);
            if (typeSpecific != null) words.addAll(typeSpecific);
        }
        
        return words;
    }
    
    private List<String> generateNgrams(String text) {
        String[] words = text.replaceAll("[^\\w\\s]", "").split("\\s+");
        List<String> ngrams = new ArrayList<>();
        
        for (int n = 1; n <= Math.min(gramSize, words.length); n++) {
            for (int i = 0; i <= words.length - n; i++) {
                String ngram = String.join(" ", 
                    Arrays.copyOfRange(words, i, i + n));
                ngrams.add(ngram);
            }
        }
        
        return ngrams;
    }
    
    private double calculateConfidence(List<String> matched, String content) {
        if (matched.isEmpty()) return 0.0;
        
        int totalChars = content.length();
        int matchedChars = matched.stream()
            .mapToInt(String::length)
            .sum();
            
        return Math.min(1.0, (double) matchedChars / totalChars * 2);
    }
}
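The confidence formula above is the ratio of matched characters to total content characters, doubled and capped at 1.0. A Python restatement with a worked example (assuming the same inputs as the Java method):

```python
def calculate_confidence(matched: list, content: str) -> float:
    """Equivalent of the Java calculateConfidence: doubled
    matched-to-total character ratio, capped at 1.0."""
    if not matched:
        return 0.0
    matched_chars = sum(len(m) for m in matched)
    return min(1.0, matched_chars / len(content) * 2)

# "bad" (3 chars) in "this is bad" (11 chars) -> min(1.0, 3/11 * 2) ~= 0.545
score = calculate_confidence(["bad"], "this is bad")
```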

Deployment

Docker Configuration

version: '3.8'
services:
  profanity-service:
    image: profanity-service:1.1.3
    ports:
      - "40063:8080"
    environment:
      - SPRING_REDIS_HOST=redis
      - SPRING_REDIS_PORT=6379
      - PROFANITY_GRAM_SIZE=3
    depends_on:
      - redis
      
  redis:
    image: redis:6-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
      
volumes:
  redis-data:

Testing

Unit Tests

import pytest
from profanity_detector import ProfanityDetector

class TestProfanityDetector:
    
    def test_single_word_detection(self):
        detector = ProfanityDetector(gram_size=3)
        detector.profanity_cache = {'badword'}
        
        is_profane, matches = detector.detect("this contains badword here")
        
        assert is_profane
        assert 'badword' in matches
        
    def test_ngram_detection(self):
        detector = ProfanityDetector(gram_size=3)
        detector.profanity_cache = {'bad phrase here'}
        
        is_profane, matches = detector.detect("this is bad phrase here today")
        
        assert is_profane
        assert 'bad phrase here' in matches
        
    def test_clean_content(self):
        detector = ProfanityDetector(gram_size=3)
        detector.profanity_cache = {'badword'}
        
        is_profane, matches = detector.detect("this is clean content")
        
        assert not is_profane
        assert len(matches) == 0
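One normalization edge case worth a dedicated test: the regex strips punctuation without inserting spaces, so hyphenated input collapses into a single token ("Bad-Word" becomes "badword", not "bad word"). The function below restates the detector's normalization step in isolation:

```python
import re

def normalize(text: str) -> list:
    """Normalization used by generate_ngrams: lowercase,
    strip punctuation, split on whitespace."""
    return re.sub(r'[^\w\s]', '', text.lower()).split()
```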

Best Practices

  1. Regular dictionary updates - Review and update profanity lists periodically
  2. Context awareness - Consider content type when filtering
  3. False positive handling - Allow appeals for incorrectly flagged content
  4. Audit logging - Log all detection decisions for review
  5. Performance optimization - Cache word lists, use efficient data structures
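For practice 5, one approach is to cache the word set in the service process and re-load it on a timer, so words added via the /word/ API are picked up without a restart. A minimal sketch; `RefreshingWordCache` is hypothetical, and the loader callable (e.g. a wrapper around Redis SMEMBERS) is supplied by the caller:

```python
import threading

class RefreshingWordCache:
    """Hold the profanity word set in memory and re-load it
    from a caller-supplied loader at a fixed interval."""

    def __init__(self, loader, interval_seconds: float = 60.0):
        self._loader = loader
        self._interval = interval_seconds
        self.words = frozenset(loader())  # initial synchronous load
        self._timer = None

    def start(self):
        """Schedule the next background refresh."""
        self._timer = threading.Timer(self._interval, self._refresh)
        self._timer.daemon = True
        self._timer.start()

    def _refresh(self):
        self.words = frozenset(self._loader())
        self.start()  # reschedule

    def stop(self):
        if self._timer:
            self._timer.cancel()
```

A frozenset swap keeps reads lock-free: lookups always see either the old or the new complete set.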

Acknowledgements
  • Aravind — Model assessment and evaluation