
High Availability Deployment

Deploy TigerAccess in a highly available configuration with automatic failover, multi-region support, and a 99.99% uptime SLA.

Overview

A production-grade TigerAccess deployment includes:

  • Multiple Auth Instances: 3+ auth servers for consensus
  • Scalable Proxies: Horizontally scaled proxy tier
  • Replicated Storage: PostgreSQL with streaming replication
  • Multi-Region: Active-passive or active-active deployment

Prerequisites

  • Minimum 6 servers (3 auth, 2 proxy, 1 load balancer)
  • PostgreSQL cluster with replication (or managed service like RDS)
  • Redis cluster or managed service (ElastiCache)
  • Shared storage (S3) for session recordings
  • Load balancer (ALB, NLB, or HAProxy)

HA Architecture

Recommended production architecture:

Load Balancer (ALB/NLB)
  └── Proxy Tier (2+ instances, auto-scaling)
        └── Auth Cluster (3+ instances, etcd consensus)
              ├── PostgreSQL Primary (with 2+ replicas)
              └── Redis Cluster (3+ nodes)

Auth Service Cluster

Configure etcd Backend

# /etc/tigeraccess/config.yaml on each auth server
auth:
  cluster_name: "production"
  # Use etcd for distributed state
  storage:
    type: "etcd"
    peers:
      - "https://auth1.example.com:2379"
      - "https://auth2.example.com:2379"
      - "https://auth3.example.com:2379"
    # mTLS for etcd
    tls_ca_file: "/etc/tigeraccess/etcd-ca.pem"
    tls_cert_file: "/etc/tigeraccess/etcd-cert.pem"
    tls_key_file: "/etc/tigeraccess/etcd-key.pem"
  # External PostgreSQL
  audit:
    type: "postgres"
    conn_string: "postgres://user:pass@postgres.example.com:5432/tigeraccess"
  # Cache configuration
  cache:
    enabled: true
    type: "redis"
    addresses:
      - "redis1.example.com:6379"
      - "redis2.example.com:6379"
      - "redis3.example.com:6379"

Start Auth Cluster

# On each auth server
sudo tigeraccess start --config=/etc/tigeraccess/config.yaml --roles=auth

# Verify cluster health
tac status --cluster
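
In production the service should be supervised rather than started by hand. A minimal systemd unit sketch (unit name and binary path are assumptions; adjust to your install):

# /etc/systemd/system/tigeraccess.service (illustrative)
[Unit]
Description=TigerAccess Auth Service
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/tigeraccess start --config=/etc/tigeraccess/config.yaml --roles=auth
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Enable it on each auth server with sudo systemctl enable --now tigeraccess.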

Proxy Service Cluster

Configure Proxy Instances

# /etc/tigeraccess/config.yaml on each proxy
proxy:
  enabled: true
  public_addr: "proxy.example.com:3023"
  # Connect to auth cluster
  auth_servers:
    - "auth1.example.com:3025"
    - "auth2.example.com:3025"
    - "auth3.example.com:3025"
  # Session recording to S3
  recording:
    enabled: true
    mode: "proxy-async"
    storage:
      type: "s3"
      bucket: "tigeraccess-recordings"
      region: "us-east-1"
  # Enable all protocols
  ssh:
    enabled: true
    listen_addr: "0.0.0.0:3023"
  kubernetes:
    enabled: true
    listen_addr: "0.0.0.0:3026"
  database:
    enabled: true
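
With proxy-async recording, each proxy instance needs write access to the recordings bucket. A minimal IAM policy sketch (bucket name matches the config above; scope it down further where possible):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::tigeraccess-recordings",
        "arn:aws:s3:::tigeraccess-recordings/*"
      ]
    }
  ]
}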

Auto-Scaling Configuration

# AWS Auto Scaling Group example
{
  "AutoScalingGroupName": "tigeraccess-proxy",
  "MinSize": 2,
  "MaxSize": 10,
  "DesiredCapacity": 3,
  "HealthCheckType": "ELB",
  "HealthCheckGracePeriod": 300,
  "TargetGroupARNs": ["arn:aws:elasticloadbalancing:..."]
}
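
Assuming the JSON above is saved as asg.json (it also needs a LaunchTemplate, omitted here for brevity), the group can be created with the AWS CLI:

# Create the proxy auto-scaling group from the JSON definition
aws autoscaling create-auto-scaling-group --cli-input-json file://asg.json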

Database High Availability

PostgreSQL Replication

# Primary server postgresql.conf
wal_level = replica
max_wal_senders = 5
max_replication_slots = 5
synchronous_commit = on
# A bare list is shorthand for FIRST 1 (standby1, standby2)
synchronous_standby_names = 'standby1,standby2'

# Standby server recovery.conf (PostgreSQL 11 and earlier)
standby_mode = on
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
trigger_file = '/tmp/postgresql.trigger'

# On PostgreSQL 12+ recovery.conf is gone: put primary_conninfo in
# postgresql.conf, create an empty standby.signal file in the data
# directory, and promote with pg_ctl promote instead of a trigger file.
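
Once the standbys are attached, verify replication from the primary:

-- Run on the primary: both standbys should show state = 'streaming'
SELECT client_addr, state, sync_state FROM pg_stat_replication;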

Managed Database Services

Recommended for production:

  • AWS RDS PostgreSQL: Multi-AZ deployment with automated failover
  • Google Cloud SQL: Regional availability with read replicas
  • Azure Database for PostgreSQL: Zone-redundant HA
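
With a managed service, only the connection string changes. For example, pointing the audit backend at an RDS endpoint with TLS verification (hostname illustrative):

# Audit backend against RDS (hostname illustrative)
audit:
  type: "postgres"
  conn_string: "postgres://user:pass@tigeraccess.cluster-abc123.us-east-1.rds.amazonaws.com:5432/tigeraccess?sslmode=verify-full"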

Redis High Availability

# Redis Cluster mode
auth:
  cache:
    type: "redis"
    cluster_mode: true
    addresses:
      - "redis-node-1:6379"
      - "redis-node-2:6379"
      - "redis-node-3:6379"
      - "redis-node-4:6379"
      - "redis-node-5:6379"
      - "redis-node-6:6379"

Load Balancing

AWS Network Load Balancer

SSH and other raw TCP protocols need an NLB; ALBs only terminate HTTP/HTTPS, so reserve an ALB (or the HAProxy web frontend below) for the web UI.

# NLB Target Group for the SSH proxy
{
  "Protocol": "TCP",
  "Port": 3023,
  "VpcId": "vpc-xxx",
  "HealthCheckEnabled": true,
  "HealthCheckProtocol": "TCP",
  "HealthCheckPort": "3023",
  "HealthCheckIntervalSeconds": 30,
  "HealthyThresholdCount": 2,
  "UnhealthyThresholdCount": 2
}

HAProxy Configuration

# /etc/haproxy/haproxy.cfg
frontend ssh_proxy
    bind *:3023
    mode tcp
    default_backend tigeraccess_proxies

backend tigeraccess_proxies
    mode tcp
    balance roundrobin
    option tcp-check
    server proxy1 proxy1.example.com:3023 check
    server proxy2 proxy2.example.com:3023 check
    server proxy3 proxy3.example.com:3023 check

frontend web_ui
    bind *:443 ssl crt /etc/ssl/certs/tigeraccess.pem
    mode http
    default_backend tigeraccess_web

backend tigeraccess_web
    mode http
    balance leastconn
    option httpchk GET /webapi/ping
    server proxy1 proxy1.example.com:3080 check
    server proxy2 proxy2.example.com:3080 check
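
Validate the configuration before reloading so a typo doesn't drop live sessions:

# Syntax-check, then reload without killing existing TCP connections
haproxy -c -f /etc/haproxy/haproxy.cfg && sudo systemctl reload haproxy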

Monitoring & Alerting

Prometheus Metrics

# Enable Prometheus endpoint
auth:
  metrics:
    enabled: true
    listen_addr: "0.0.0.0:9090"

# Key metrics to monitor:
# - tigeraccess_auth_cluster_health
# - tigeraccess_active_connections
# - tigeraccess_failed_logins
# - tigeraccess_session_duration_seconds
# - tigeraccess_certificate_expiry_seconds
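
On the Prometheus side, a matching scrape job (targets illustrative; the job name must match the alert rules below):

# prometheus.yml
scrape_configs:
  - job_name: "tigeraccess-auth"
    static_configs:
      - targets:
          - "auth1.example.com:9090"
          - "auth2.example.com:9090"
          - "auth3.example.com:9090"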

Health Checks

# Check auth service health
curl https://auth.example.com:3025/healthz

# Check proxy health
curl https://proxy.example.com:3080/webapi/ping

# Check cluster status
tac status --cluster

Alert Rules

# Prometheus alert rules
groups:
  - name: tigeraccess
    rules:
      - alert: AuthServiceDown
        expr: up{job="tigeraccess-auth"} == 0
        for: 1m
        annotations:
          summary: "TigerAccess auth service is down"

      - alert: ProxyHighConnections
        expr: tigeraccess_active_connections > 1000
        for: 5m
        annotations:
          summary: "High number of active connections"
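
Since the metrics list above includes certificate expiry, an expiry alert is a cheap safeguard; appended to the rules list above (threshold illustrative):

      - alert: CertificateExpiringSoon
        expr: tigeraccess_certificate_expiry_seconds < 604800
        for: 30m
        annotations:
          summary: "TigerAccess certificate expires in under 7 days"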

Disaster Recovery

Backup Strategy

# Backup auth configuration
tac backup create --output=/backups/tigeraccess-$(date +%Y%m%d).tar.gz

# Automated daily backups via crontab entry (runs at 02:00)
0 2 * * * /usr/local/bin/tac backup create --output=/backups/daily-$(date +\%Y\%m\%d).tar.gz
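
Local backups don't survive the loss of the host; shipping them off-box is worth a second cron entry (bucket name illustrative):

# Copy the nightly backup to S3 an hour after it is written
0 3 * * * aws s3 cp /backups/daily-$(date +\%Y\%m\%d).tar.gz s3://tigeraccess-backups/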

Multi-Region Deployment

# Active-Passive configuration
# Primary Region (us-east-1)
auth:
  cluster_name: "production-primary"
  replication:
    mode: "primary"
    peers:
      - "https://auth-dr.us-west-2.example.com:3025"

# DR Region (us-west-2)
auth:
  cluster_name: "production-dr"
  replication:
    mode: "secondary"
    primary: "https://auth.us-east-1.example.com:3025"

Failover Procedure

# Promote DR cluster to primary
tac cluster promote --cluster=production-dr

# Update DNS to point to DR region
# Update load balancer health checks
# Verify all services are operational
tac status --cluster
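
The DNS step can be scripted; with Route 53, for example (zone ID and record values illustrative):

# Repoint the proxy record at the DR region
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "proxy.example.com",
        "Type": "CNAME",
        "TTL": 60,
        "ResourceRecords": [{"Value": "proxy.us-west-2.example.com"}]
      }
    }]
  }'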
