Log, Monitor, Cluster, Restore and Deploy

TL;DR: Broad-based infrastructure framework for production that establishes logging, monitoring, clustering, restoration and deployment standards.

This document establishes a logging, monitoring, clustering, restoration and deployment framework to be adopted across all application layers. The framework is required to improve the ability to trace and isolate issues within the application and thus enhance supportability.

Logs

This chapter outlines proposed coding etiquette and techniques to capture adequate information about rogue routines.

API and Underlying Layers

Component            Configuration
Preferred framework  Hybrid (log4j2 and customized exception hierarchy handler)
Configuration file   Server.utilities/src/main/resources/log4j2.xml
Factory              com.hs.organisation.utilities.fabricator.LogFabricator
Exception hierarchy  com.hs.organisation.models.common.error.*

Rules of Engagement

Exception Hierarchy

Exception Type                Description
OrganisationException         Hierarchical checked exception for business case oddities
OrganisationRuntimeException  Hierarchical unchecked exception for technical/library failures
ResponseCode                  Repository of all available error state codes with customized messages
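
The hierarchy above could look roughly like the following. This is a minimal sketch, not the actual contents of com.hs.organisation.models.common.error: the enum constants, numeric codes, messages, constructor signatures and accessor names are all assumptions made for illustration.

```java
// Hypothetical sketch: codes, messages and constructor shapes are assumptions,
// not the real classes in com.hs.organisation.models.common.error.

/** Repository of error state codes with customized messages. */
enum ResponseCode {
    INVALID_INPUT(4001, "Input failed business validation"),    // hypothetical code
    STORE_UNAVAILABLE(5001, "Backing store is not reachable");   // hypothetical code

    private final int code;
    private final String message;

    ResponseCode(int code, String message) {
        this.code = code;
        this.message = message;
    }

    int code() { return code; }
    String message() { return message; }
}

/** Checked exception for business-case oddities; keeps the cause chain intact. */
class OrganisationException extends Exception {
    private final ResponseCode responseCode;

    OrganisationException(ResponseCode responseCode, String detail, Throwable cause) {
        super(responseCode.message() + ": " + detail, cause);
        this.responseCode = responseCode;
    }

    ResponseCode responseCode() { return responseCode; }
}

/** Unchecked exception for technical/library failures; keeps the cause chain intact. */
class OrganisationRuntimeException extends RuntimeException {
    private final ResponseCode responseCode;

    OrganisationRuntimeException(ResponseCode responseCode, String detail, Throwable cause) {
        super(responseCode.message() + ": " + detail, cause);
        this.responseCode = responseCode;
    }

    ResponseCode responseCode() { return responseCode; }
}
```

Carrying a ResponseCode on both exception types is one way to let the service layer map any failure to a stable error state without parsing messages.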

Precepts

  1. Service Layer - Highest level API responsible for handling exceptions and logging
  2. API Layer - Aggregates utilitarian libraries, captures business-case oddities
  3. Rivets - Utilitarian libraries capturing technical runtime failures
  4. Peripherals - Independent projects amalgamating rules for service, API, and rivets

Tenets

  • Capture input parameters and identifiable information as part of exception logging
  • Every method body must be wrapped in a try-catch block that translates unexpected failures into OrganisationRuntimeException
  • Service and peripherals are responsible for logging to file system
  • API and peripherals capture business-case oddities (translated to OrganisationException)
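
The tenets can be sketched across the three layers as follows. This is a simplified illustration under stated assumptions: the exception classes are minimal stand-ins for the real hierarchy, java.util.logging stands in for the log4j2 logger that LogFabricator would supply, and the method names (parseQuantity, reserve, reserveService) are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class TenetsSketch {
    // Minimal stand-ins for the document's exception types; the real ones may differ.
    static class OrganisationException extends Exception {
        OrganisationException(String message) { super(message); }
    }

    static class OrganisationRuntimeException extends RuntimeException {
        OrganisationRuntimeException(String message, Throwable cause) { super(message, cause); }
    }

    // Assumption: java.util.logging replaces the log4j2 logger to keep the sketch dependency-free.
    private static final Logger LOG = Logger.getLogger(TenetsSketch.class.getName());

    /** Rivet: technical failures are translated to OrganisationRuntimeException. */
    static int parseQuantity(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (RuntimeException e) {
            // Capture the input parameter so the failure can be traced later.
            throw new OrganisationRuntimeException("parseQuantity failed for raw=" + raw, e);
        }
    }

    /** API layer: business-case oddities become OrganisationException. */
    static int reserve(String sku, String rawQuantity) throws OrganisationException {
        int quantity = parseQuantity(rawQuantity);
        if (quantity <= 0) {
            throw new OrganisationException(
                    "Quantity must be positive: sku=" + sku + ", quantity=" + quantity);
        }
        return quantity;
    }

    /** Service layer: handles both exception kinds and is the only place that logs. */
    static String reserveService(String sku, String rawQuantity) {
        try {
            return "RESERVED:" + reserve(sku, rawQuantity);
        } catch (OrganisationException e) {
            LOG.log(Level.WARNING, "Business rule rejected request, sku=" + sku, e);
            return "REJECTED";
        } catch (OrganisationRuntimeException e) {
            LOG.log(Level.SEVERE, "Technical failure, sku=" + sku, e);
            return "FAILED";
        }
    }
}
```

Logging only at the service layer avoids the same stack trace being written once per layer as the exception propagates upward.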

Monitor

The monitoring infrastructure uses Zabbix for JMX and application monitoring.

Installation - Server

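# Install the Zabbix repository package, then the server and web frontend
rpm -ivh http://repo.zabbix.com/zabbix/3.0/rhel/7/x86_64/zabbix-release-3.0-1.el7.noarch.rpm
yum install zabbix-server-mysql zabbix-web-mysql

# Create database (run these two statements inside the mysql client as a privileged user;
# replace 'password' with a real secret)
create database zabbix character set utf8 collate utf8_bin;
grant all privileges on zabbix.* to zabbix@localhost identified by 'password';

# Import schema (back in the shell; prompts for the zabbix user's password)
cd /usr/share/doc/zabbix-server-mysql-3.0.4
zcat create.sql.gz | mysql -uzabbix -p zabbix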

Consumers

Tomcat JMX

CATALINA_OPTS="-Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.port=9000 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false"
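
To check that the JMX port is reachable before wiring it into Zabbix, a small client along the following lines may help. It is a sketch that assumes the options above (port 9000, no SSL, no authentication); the class and method names are hypothetical, and only standard JDK JMX classes are used.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class TomcatJmxProbe {
    /** Builds the RMI-based JMX service URL for a host and port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Connects without credentials (matches authenticate=false) and returns used heap bytes. */
    static long usedHeapBytes(String host, int port) throws IOException {
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            // Proxy for the remote JVM's standard memory MXBean.
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            return memory.getHeapMemoryUsage().getUsed();
        }
    }
}
```

Calling TomcatJmxProbe.usedHeapBytes("localhost", 9000) against a Tomcat started with the options above should return the heap currently in use; an IOException usually means the port is closed or blocked.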

Cassandra JMX

Cassandra exposes JMX on port 7199 by default. Configure templates for:

  • Cluster health
  • Node status
  • Compaction metrics
  • Read/Write latencies
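
As a starting point for those templates, the sketch below lists MBean names that Cassandra commonly exposes for the items above. The item keys are hypothetical, and the MBean names should be verified against the Cassandra version in use (for example with jconsole) before they are relied on.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class CassandraJmxItems {
    /** Maps a hypothetical template item key to the MBean it would read. */
    static Map<String, ObjectName> items() {
        Map<String, String> names = new LinkedHashMap<>();
        // Node status and cluster health: attributes such as LiveNodes and UnreachableNodes.
        names.put("node_status", "org.apache.cassandra.db:type=StorageService");
        // Compaction backlog.
        names.put("compaction_pending",
                "org.apache.cassandra.metrics:type=Compaction,name=PendingTasks");
        // Coordinator-level read and write latencies.
        names.put("read_latency",
                "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency");
        names.put("write_latency",
                "org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency");

        Map<String, ObjectName> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : names.entrySet()) {
            try {
                result.put(entry.getKey(), new ObjectName(entry.getValue()));
            } catch (MalformedObjectNameException e) {
                throw new IllegalArgumentException("Bad MBean name: " + entry.getValue(), e);
            }
        }
        return result;
    }
}
```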

Neo4j Monitoring

Enable JMX and configure metrics export:

metrics.enabled=true
metrics.neo4j.enabled=true
metrics.jvm.enabled=true

Cluster

Neo4j Causal Clustering Setup

# Core server configuration
dbms.mode=CORE
causal_clustering.minimum_core_cluster_size_at_formation=3
causal_clustering.initial_discovery_members=core1:5000,core2:5000,core3:5000
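
The choice of three core members follows from majority-based consensus: a core cluster of 2F + 1 members can tolerate F failures. The small helper below works through that arithmetic; the class and method names are hypothetical.

```java
final class CoreClusterMath {
    /** Number of core members that can fail while a majority still remains. */
    static int toleratedFaults(int coreCount) {
        if (coreCount < 1) {
            throw new IllegalArgumentException("coreCount must be at least 1");
        }
        return (coreCount - 1) / 2;
    }

    /** Smallest number of core members that forms a majority. */
    static int majority(int coreCount) {
        if (coreCount < 1) {
            throw new IllegalArgumentException("coreCount must be at least 1");
        }
        return coreCount / 2 + 1;
    }
}
```

With the three cores configured above, two members form a majority and one failure is tolerated; moving to five cores raises the tolerance to two failures.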

Cassandra Clustering Setup

cluster_name: 'Production Cluster'
num_tokens: 256
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "seed1,seed2"

Restore

Master Store Restoration

Migratory Snapshot

nodetool snapshot -t migration_backup keyspace_name

Migratory Restore

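# Stop Cassandra
# Clear the commitlog and saved_caches directories
# Copy the snapshot files into the table's data directory
# Restart Cassandra
# Load the copied SSTables into the running node (repeat per table)
nodetool refresh keyspace_name table_name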

Graph Restoration (Neo4j)

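# Dump (offline: stop Neo4j before running this)
neo4j-admin dump --database=neo4j --to=/backup/neo4j-backup.dump

# Restore (Neo4j must be stopped; add --force to overwrite an existing database)
neo4j-admin load --database=neo4j --from=/backup/neo4j-backup.dump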

Deploy

Deployment Strategy

  1. Phase One: Infrastructure provisioning
  2. Phase Two: Application deployment
  3. Phase Three: Configuration and validation

Nomenclature

Environment  Prefix  Example
Production   prod-   prod-api-01
Staging      stg-    stg-api-01
Development  dev-    dev-api-01
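
The naming scheme can be generated rather than typed by hand. The sketch below assumes the pattern prefix, role, two-digit index that the examples suggest; the class, enum and method names are hypothetical.

```java
import java.util.Locale;

final class HostNames {
    /** Environments and their prefixes (without the trailing hyphen) from the nomenclature table. */
    enum Environment {
        PRODUCTION("prod"), STAGING("stg"), DEVELOPMENT("dev");

        private final String prefix;

        Environment(String prefix) { this.prefix = prefix; }

        String prefix() { return prefix; }
    }

    /** Builds a host name such as prod-api-01 from environment, role and 1-based index. */
    static String hostName(Environment environment, String role, int index) {
        if (index < 1 || index > 99) {
            throw new IllegalArgumentException("index must be between 1 and 99");
        }
        // Locale.ROOT keeps the zero-padded digits ASCII regardless of the JVM's default locale.
        return String.format(Locale.ROOT, "%s-%s-%02d", environment.prefix(), role, index);
    }
}
```

For example, HostNames.hostName(HostNames.Environment.STAGING, "neo4j", 3) yields stg-neo4j-03.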

Sample Host Configuration

hosts:
  api_servers:
    - prod-api-01
    - prod-api-02
  graph_servers:
    - prod-neo4j-01
    - prod-neo4j-02
    - prod-neo4j-03
  cache_servers:
    - prod-cassandra-01
    - prod-cassandra-02
    - prod-cassandra-03

This document serves as a comprehensive reference for infrastructure operations. Specific implementation details should be adapted based on environment requirements.