Docker Swarm on AWS

🎯 Goal

Provision a 3‑node Docker Swarm cluster on AWS using the AWS Console, then manually configure the Swarm and deploy services via SSH. This provides hands-on experience with both AWS infrastructure setup and Docker Swarm administration.

📋 Prerequisites

  • An AWS account with permission to create EC2 instances and security groups
  • An existing EC2 key pair in your target region
  • An SSH client on your workstation
  • Basic familiarity with Docker containers and the command line

📚 Learning Objectives

By the end of this exercise you will be able to:

  • Create a security group and EC2 instances through the AWS Console
  • Initialize a Docker Swarm manager and join worker nodes over SSH
  • Deploy, scale, and monitor services with docker stack and docker service
  • Tear the cluster down cleanly

πŸ“ Step-by-Step Instructions

Step 1: Create Infrastructure via AWS Console

1.1 Create Security Group

  1. Navigate to EC2 Console → Security Groups → Create security group
  2. Security group name: docker-swarm-sg
  3. Description: Security group for Docker Swarm cluster
  4. VPC: Select default VPC
  5. Inbound rules:
    • SSH: Port 22, Source: 0.0.0.0/0 (restricting to your own IP is recommended)
    • HTTP: Port 80, Source: 0.0.0.0/0
    • Visualizer: Port 8080, Source: 0.0.0.0/0
    • Swarm Management: Port 2377, Source: docker-swarm-sg (self-reference)
    • Swarm Communication: Port 7946 (TCP), Source: docker-swarm-sg
    • Swarm Communication: Port 7946 (UDP), Source: docker-swarm-sg
    • Overlay Network: Port 4789 (UDP), Source: docker-swarm-sg
  6. Click Create security group (a CLI sketch of the same rules follows below)
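
If you prefer the command line, the same group can be created with the AWS CLI. This is a minimal sketch, assuming you substitute your default VPC ID for the placeholder; the 22/80/8080 rules follow the same pattern as the 2377 rule shown:

# Create the group and capture its ID for the self-referencing rules
SG_ID=$(aws ec2 create-security-group \
  --group-name docker-swarm-sg \
  --description "Security group for Docker Swarm cluster" \
  --vpc-id <default-vpc-id> \
  --query GroupId --output text)

# Self-referencing rule for swarm management traffic
# (repeat with the matching protocol/port for 7946 tcp, 7946 udp, and 4789 udp)
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
  --protocol tcp --port 2377 --source-group "$SG_ID"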

1.2 Launch EC2 Instances

  1. Navigate to EC2 Console → Launch Instance

  2. Name: swarm-manager

  3. AMI: Amazon Linux 2023

  4. Instance type: t3.small

  5. Key pair: Select your existing key pair

  6. Network settings:

    • VPC: Default VPC
    • Subnet: Any default subnet
    • Auto-assign public IP: Enable
    • Security groups: Select docker-swarm-sg
  7. User data (Advanced details):

    #!/bin/bash
    # Runs once at first boot: install Docker, start it, and let ec2-user run it without sudo
    dnf update -y
    dnf install -y docker
    systemctl enable --now docker
    usermod -aG docker ec2-user
    
  8. Click Launch instance

Repeat for 2 worker nodes: swarm-worker-1 and swarm-worker-2
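
For reference, the same launch can be scripted. This is a hedged sketch, assuming the user data above is saved locally as user-data.sh and that you substitute your own AMI ID, key pair name, and security group ID for the placeholders:

# Launch the manager; rerun with Value=swarm-worker-1 / swarm-worker-2 for the workers
# (default-VPC subnets auto-assign a public IP, so no extra flag is needed)
aws ec2 run-instances \
  --image-id <al2023-ami-id> \
  --instance-type t3.small \
  --key-name <your-key-pair> \
  --security-group-ids <docker-swarm-sg-id> \
  --user-data file://user-data.sh \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=swarm-manager}]'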

1.3 Note Instance Details

Once instances are running, note down:

  • Public IPv4 address of each of the 3 instances (for SSH)
  • Private IPv4 address of the manager (for swarm init and the join command)
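
Rather than clicking through each instance, you can list all the addresses at once. A small AWS CLI sketch, assuming the Name tags used above:

# Show Name, public IP, and private IP for the running swarm instances
aws ec2 describe-instances \
  --filters "Name=tag:Name,Values=swarm-*" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].[Tags[?Key==`Name`]|[0].Value,PublicIpAddress,PrivateIpAddress]' \
  --output table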

Step 2: Initialize Docker Swarm via SSH

2.1 Connect to Manager Node

# Replace with your key file and manager public IP
ssh -i ~/.ssh/your-key.pem ec2-user@<manager-public-ip>

2.2 Initialize Swarm on Manager

# Replace with manager's private IP
sudo docker swarm init --advertise-addr <manager-private-ip>

Expected output:

Swarm initialized: current node (xyz123) is now a manager.

To add a worker to this swarm, run the following command:
    docker swarm join --token SWMTKN-1-xxx... <manager-private-ip>:2377

Copy the join command; you'll need it for the workers.
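
If you lose it, there is no need to re-initialize; the manager can reprint the join command at any time:

# On the manager: print the current worker join command (token included)
sudo docker swarm join-token worker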

2.3 Add Worker Nodes

Connect to Worker 1:

# New terminal window
ssh -i ~/.ssh/your-key.pem ec2-user@<worker1-public-ip>

# Run the join command from step 2.2
sudo docker swarm join --token SWMTKN-1-xxx... <manager-private-ip>:2377

Connect to Worker 2:

# New terminal window  
ssh -i ~/.ssh/your-key.pem ec2-user@<worker2-public-ip>

# Run the same join command
sudo docker swarm join --token SWMTKN-1-xxx... <manager-private-ip>:2377

2.4 Verify Cluster

Back on the manager node:

sudo docker node ls

Expected output:

ID                            HOSTNAME                      STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
abc123def456 *               ip-172-31-1-100.eu-west-1...  Ready    Active         Leader           25.0.8
ghi789jkl012                 ip-172-31-1-101.eu-west-1...  Ready    Active                          25.0.8  
mno345pqr678                 ip-172-31-1-102.eu-west-1...  Ready    Active                          25.0.8
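
To see more than this one-line summary, inspect a node from the manager; "self" refers to the node you are logged into:

# Human-readable details for the current node: role, resources, engine version
sudo docker node inspect self --pretty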

Step 3: Deploy Services Manually

3.1 Create Docker Compose File

On the manager node, create the stack file:

cat > docker-stack.yml << 'EOF'
version: "3.8"
services:
  web:
    image: nginx:stable-alpine
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
      update_config:
        parallelism: 1
        delay: 5s
    ports:
      - "80:80"
    networks: [webnet]

  viz:
    image: dockersamples/visualizer:stable
    deploy:
      placement:
        constraints: [node.role == manager]
    ports:
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks: [webnet]

networks:
  webnet:
    driver: overlay
EOF

3.2 Deploy the Stack

sudo docker stack deploy -c docker-stack.yml myapp

3.3 Verify Deployment

# Check stack status
sudo docker stack ls

# Check services
sudo docker service ls

# Check service details
sudo docker service ps myapp_web
sudo docker service ps myapp_viz

Step 4: Test and Scale Services

4.1 Test Web Access

Open in browser:

  • http://<any-node-public-ip> (nginx welcome page, port 80)
  • http://<manager-public-ip>:8080 (Swarm visualizer)
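
You can also verify from a terminal. This loop (substitute your real IPs) shows the routing mesh at work: every node answers on the published port, even a node that happens not to run a web replica:

# Each node should answer HTTP 200 regardless of where the replicas landed
for ip in <manager-public-ip> <worker1-public-ip> <worker2-public-ip>; do
  curl -s -o /dev/null -w "$ip -> HTTP %{http_code}\n" "http://$ip"
done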

4.2 Scale Services

# Scale up nginx to 5 replicas
sudo docker service scale myapp_web=5

# Check scaling
sudo docker service ps myapp_web

# Scale back down to 3
sudo docker service scale myapp_web=3
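
Scaling changes the replica count; a rolling image update exercises the update_config policy from the stack file instead. nginx:1.27-alpine is only an illustrative tag:

# Replace tasks one at a time (parallelism: 1) with a 5s pause between them
sudo docker service update --image nginx:1.27-alpine myapp_web

# Watch the rollout, and undo it if something looks wrong
sudo docker service ps myapp_web
sudo docker service rollback myapp_web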

4.3 Monitor Services

# Watch service status (Ctrl+C to exit)
watch sudo docker service ls

# View service logs
sudo docker service logs myapp_web
sudo docker service logs myapp_viz
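
Another administration task worth trying from the manager: drain a worker and watch the swarm reschedule its tasks (take the hostname from docker node ls):

# Move all tasks off a node, then bring it back into rotation
sudo docker node update --availability drain <node-hostname>
sudo docker service ps myapp_web    # tasks from the drained node reappear elsewhere
sudo docker node update --availability active <node-hostname>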

Step 5: Cleanup

5.1 Remove Stack

On the manager node:

sudo docker stack rm myapp
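
Stack removal is asynchronous; give it a few seconds, then confirm nothing is left behind:

# Both listings should be empty once teardown finishes
sudo docker service ls
sudo docker network ls --filter name=webnet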

5.2 Leave Swarm (Optional)

If you want to reset the Swarm:

# On workers
sudo docker swarm leave

# On manager (force)
sudo docker swarm leave --force
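
To confirm a node has actually left the swarm:

# Prints "inactive" once the node is out of the swarm
sudo docker info --format '{{.Swarm.LocalNodeState}}'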

5.3 Terminate Instances

  1. Go to EC2 Console → Instances
  2. Select all 3 instances
  3. Instance State → Terminate instance
  4. Confirm termination

5.4 Delete Security Group

  1. Go to EC2 Console → Security Groups
  2. Select docker-swarm-sg
  3. Actions → Delete security group
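
The console steps above map to a single CLI call (deleting by name works in the default VPC); run it only after all instances using the group are terminated:

# Delete the security group once the instances are gone
aws ec2 delete-security-group --group-name docker-swarm-sg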

🧪 Verification Checklist

  • All 3 instances are running and reachable over SSH
  • docker node ls shows 3 nodes with STATUS Ready and the manager as Leader
  • docker service ls shows myapp_web at 3/3 replicas and myapp_viz at 1/1
  • The nginx welcome page loads on port 80 from each node's public IP
  • The visualizer on port 8080 shows tasks distributed across the nodes

🔧 Troubleshooting

Can't SSH to instances:

  • Confirm the instance has a public IP and the security group allows port 22 from your address
  • Check key file permissions: chmod 400 ~/.ssh/your-key.pem

Worker can't join swarm:

  • Verify port 2377 is open between nodes (the self-referencing security group rule)
  • Use the manager's private IP and a current token; reprint it with sudo docker swarm join-token worker

Services not accessible:

  • Check that ports 80 and 8080 are open in the security group
  • Confirm replicas are running: sudo docker service ps myapp_web

Docker not installed:

  • The user data script may have failed; check /var/log/cloud-init-output.log on the instance
  • Install manually: sudo dnf install -y docker && sudo systemctl enable --now docker

💡 Key Differences vs Automated Approach

| Aspect           | Manual (This Exercise)       | Automated (Scripts)        |
|------------------|------------------------------|----------------------------|
| Setup Time       | 15-20 minutes                | 5-10 minutes               |
| Learning Value   | High (understand each step)  | Medium (focus on patterns) |
| Reproducibility  | Low (human error prone)      | High (consistent results)  |
| Debugging        | Easy (direct access)         | Harder (through logs)      |
| Production Ready | No (manual steps)            | Yes (automated pipeline)   |

🎓 What You Learned

This manual approach gives you a deeper understanding of what happens "under the hood" when you use automated deployment scripts.