Agent Exchange - Production Deployment Guide

This guide covers deploying Agent Exchange to cloud platforms. Choose your preferred provider:

  • Google Cloud Platform (GCP) - Cloud Run, Firestore
  • Amazon Web Services (AWS) - ECS Fargate, DocumentDB

Table of Contents

  1. Overview
  2. GCP Deployment
  3. Prerequisites (GCP)
  4. Setup (GCP)
  5. Database (GCP)
  6. Secrets (GCP)
  7. Deploy (GCP)
  8. Monitoring (GCP)
  9. AWS Deployment
  10. Prerequisites (AWS)
  11. Setup (AWS)
  12. Database (AWS)
  13. Deploy (AWS)
  14. Monitoring (AWS)
  15. CI/CD Pipelines
  16. Service Configuration
  17. Scaling
  18. Security Best Practices
  19. Troubleshooting
  20. Teardown

Overview

Architecture

Agent Exchange consists of 11 microservices:

Service Port Description
aex-gateway 8080 API gateway, routing
aex-work-publisher 8081 Work specification management
aex-bid-gateway 8082 Bid submission and storage
aex-bid-evaluator 8083 Bid evaluation and ranking
aex-contract-engine 8084 Contract lifecycle management
aex-provider-registry 8085 Provider registration
aex-trust-broker 8086 Trust score management
aex-identity 8087 Tenant and API key management
aex-settlement 8088 Financial transactions + AP2 integration
aex-telemetry 8089 Metrics and events
aex-credentials-provider 8090 NEW: AP2 payment methods management

Demo Components (Optional)

Component Port Description
legal-agent-a 8100 Budget Legal Agent ($5 + $2/page)
legal-agent-b 8101 Standard Legal Agent ($15 + $0.50/page)
legal-agent-c 8102 Premium Legal Agent ($30 + $0.20/page)
orchestrator 8103 Consumer orchestrator agent
payment-legalpay 8200 Payment processor (2% fee, 1% reward)
payment-contractpay 8201 Payment processor (2.5% fee, 3% reward)
payment-compliancepay 8202 Payment processor (3% fee, 4% reward)
demo-ui-nicegui 8502 Recommended: Real-time NiceGUI dashboard
demo-ui 8501 Legacy Mesop dashboard (deprecated)

Deployment Order

Deploy services in this order due to dependencies:

  1. Database: MongoDB
  2. Infrastructure services: aex-identity, aex-telemetry
  3. Core services: aex-provider-registry, aex-trust-broker, aex-credentials-provider
  4. Business services: aex-bid-gateway, aex-bid-evaluator, aex-work-publisher, aex-contract-engine, aex-settlement
  5. Gateway: aex-gateway
  6. Demo (optional): legal-agents, payment-agents, orchestrator, demo-ui-nicegui
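
The ordering above can be captured in a small helper so that a deploy loop walks the tiers in sequence. This is a minimal sketch: the function name `aex_deploy_order` is hypothetical and not part of the repository, and it assumes the database (step 1) is provisioned separately.

```shell
# Hypothetical helper: print the AEX services one per line, in dependency order.
# The database (step 1) and the optional demo components (step 6) are left out.
aex_deploy_order() {
  # Step 2: infrastructure services
  printf '%s\n' aex-identity aex-telemetry
  # Step 3: core services
  printf '%s\n' aex-provider-registry aex-trust-broker aex-credentials-provider
  # Step 4: business services
  printf '%s\n' aex-bid-gateway aex-bid-evaluator aex-work-publisher \
    aex-contract-engine aex-settlement
  # Step 5: the gateway goes last
  printf '%s\n' aex-gateway
}
```

For example, `for s in $(aex_deploy_order); do echo "deploy $s"; done` prints the services in rollout order; the `echo` can be swapped for a per-service deploy command.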

GCP Deployment

Prerequisites (GCP)

Required Tools

# Google Cloud SDK
curl https://sdk.cloud.google.com | bash
gcloud init

# Docker
# Install from https://docs.docker.com/get-docker/

# Go 1.22+ (for local builds)
# Install from https://go.dev/dl/

Required Permissions

You need the following IAM roles on your GCP project:

  • roles/owner, or all of these specific roles:
      • roles/run.admin
      • roles/artifactregistry.admin
      • roles/iam.serviceAccountAdmin
      • roles/secretmanager.admin
      • roles/datastore.owner
      • roles/logging.admin

Environment Variables

export GCP_PROJECT_ID="your-project-id"
export GCP_REGION="us-central1"
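
Because the later commands expand these variables unquoted, an unset value tends to produce confusing errors. A preflight check along the following lines may help; `require_env` is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical preflight check: fail early if a required variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `require_env GCP_PROJECT_ID GCP_REGION || exit 1` at the top of a deploy script.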

Setup (GCP)

1. Create or Select Project

# Create new project
gcloud projects create $GCP_PROJECT_ID --name="Agent Exchange"

# Or select existing project
gcloud config set project $GCP_PROJECT_ID

2. Enable Billing

Ensure billing is enabled in the GCP Console.

3. Run Setup Script

./hack/deploy/setup-gcp.sh

This script will:

  • Enable required APIs
  • Create Artifact Registry repository
  • Create service accounts with proper roles
  • Set up Workload Identity for GitHub Actions
  • Create Firestore database
  • Create initial secrets

4. Manual API Enablement (if needed)

gcloud services enable \
  run.googleapis.com \
  artifactregistry.googleapis.com \
  firestore.googleapis.com \
  secretmanager.googleapis.com \
  cloudresourcemanager.googleapis.com \
  iam.googleapis.com \
  iamcredentials.googleapis.com \
  logging.googleapis.com \
  monitoring.googleapis.com \
  cloudtrace.googleapis.com

Database (GCP)

Current Implementation: MongoDB (via Docker or MongoDB Atlas)

Production Target: Firestore in Native mode (not yet migrated)

Option A: MongoDB (Current)

# For development/staging - MongoDB via Docker or Atlas
# Connection string in environment variable:
export MONGO_URI="mongodb://root:root@localhost:27017/?authSource=admin"
export MONGO_DB="aex"

Option B: Firestore (Future Production)

# Create database (for future Firestore migration)
gcloud firestore databases create \
  --location=$GCP_REGION \
  --type=firestore-native

Collections Structure

Collection Description
work_specs Work specifications
bids Provider bids
contracts Awarded contracts
providers Registered providers
subscriptions Category subscriptions
tenants Tenant accounts
api_keys API keys
trust_records Provider trust data
balances Account balances
transactions Financial transactions
ledger Ledger entries

Firestore Indexes (for future migration)

gcloud firestore indexes composite create \
  --collection-group=work_specs \
  --field-config field-path=consumer_id,order=ASCENDING \
  --field-config field-path=created_at,order=DESCENDING

gcloud firestore indexes composite create \
  --collection-group=bids \
  --field-config field-path=work_id,order=ASCENDING \
  --field-config field-path=created_at,order=ASCENDING

Secrets (GCP)

# JWT signing secret
echo -n "$(openssl rand -base64 32)" | \
  gcloud secrets create aex-jwt-secret --data-file=-

# API key salt
echo -n "$(openssl rand -base64 32)" | \
  gcloud secrets create aex-api-key-salt --data-file=-

Deploy (GCP)

Build and Push Images

# Authenticate Docker with Artifact Registry
gcloud auth configure-docker ${GCP_REGION}-docker.pkg.dev

# Build all images
make docker-build

# Tag and push
VERSION="v1.0.0"
REGISTRY="${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT_ID}/aex"

for service in aex-gateway aex-work-publisher aex-bid-gateway aex-bid-evaluator \
               aex-contract-engine aex-provider-registry aex-trust-broker \
               aex-identity aex-settlement aex-telemetry; do
  docker tag agent-exchange/${service}:local ${REGISTRY}/${service}:${VERSION}
  docker push ${REGISTRY}/${service}:${VERSION}
done

Deploy Services

# Deploy to staging
./hack/deploy/deploy-cloudrun.sh staging all

# Deploy to production
./hack/deploy/deploy-cloudrun.sh production all

# Deploy specific service
./hack/deploy/deploy-cloudrun.sh production aex-gateway

Get Service URLs

for service in aex-gateway aex-work-publisher aex-bid-gateway aex-bid-evaluator \
               aex-contract-engine aex-provider-registry aex-trust-broker \
               aex-identity aex-settlement aex-telemetry; do
  URL=$(gcloud run services describe $service --region=$GCP_REGION --format='value(status.url)')
  echo "$service: $URL"
done

Monitoring (GCP)

Cloud Monitoring

View metrics in Cloud Monitoring:

# Key metrics (Cloud Monitoring metric types)
- run.googleapis.com/request_count
- run.googleapis.com/request_latencies
- run.googleapis.com/container/instance_count
- run.googleapis.com/container/cpu/utilizations
- run.googleapis.com/container/memory/utilizations

Cloud Logging

# All AEX logs
gcloud logging read 'resource.type="cloud_run_revision" AND resource.labels.service_name=~"aex-.*"' \
  --limit=100 \
  --format="table(timestamp,resource.labels.service_name,textPayload)"

# Error logs only
gcloud logging read 'resource.type="cloud_run_revision" AND severity>=ERROR' \
  --limit=50

Rollback (GCP)

# List revisions
gcloud run revisions list --service=SERVICE_NAME --region=$GCP_REGION

# Route traffic to previous revision
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 \
  --region=$GCP_REGION

# Gradual rollout (90/10 split)
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=NEW_REVISION=10,OLD_REVISION=90 \
  --region=$GCP_REGION
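
The two percentages in a split have to add up to 100, which is easy to get wrong by hand. As a sketch, a small helper can derive the old revision's share from the new one; `traffic_split` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: build a --to-revisions value for a gradual rollout.
# Usage: traffic_split NEW_REVISION OLD_REVISION NEW_PERCENT
traffic_split() {
  new_rev=$1
  old_rev=$2
  new_pct=$3
  # Reject anything that is not a non-negative integer.
  case $new_pct in
    ''|*[!0-9]*) echo "percent must be an integer" >&2; return 1 ;;
  esac
  if [ "$new_pct" -gt 100 ]; then
    echo "percent must be between 0 and 100" >&2
    return 1
  fi
  # The old revision keeps whatever share the new one does not take.
  printf '%s=%s,%s=%s\n' "$new_rev" "$new_pct" "$old_rev" "$((100 - new_pct))"
}
```

The result can be passed straight through, for example `--to-revisions="$(traffic_split NEW_REVISION OLD_REVISION 10)"`.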

AWS Deployment

Prerequisites (AWS)

Required Tools

# AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Docker
# https://docs.docker.com/engine/install/

# jq (for JSON processing)
sudo apt-get install jq  # Ubuntu/Debian
brew install jq          # macOS

AWS Account Setup

aws configure
# Enter your AWS Access Key ID
# Enter your AWS Secret Access Key
# Enter your preferred region (e.g., us-east-1)
# Enter output format (json)

Verify Access

aws sts get-caller-identity

Setup (AWS)

Environment Variables

export AWS_REGION="us-east-1"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

Run Setup Script

chmod +x hack/deploy/setup-aws.sh
./hack/deploy/setup-aws.sh all

This creates:

  • ECR repositories for all 10 services
  • VPC with public/private subnets across 2 AZs
  • Security groups for ALB, ECS, and DocumentDB
  • ECS Fargate cluster
  • IAM roles for ECS tasks and GitHub Actions
  • Secrets Manager secrets
  • CloudWatch log groups
  • Application Load Balancer

Individual Setup Commands

./hack/deploy/setup-aws.sh ecr      # Create ECR repositories
./hack/deploy/setup-aws.sh vpc      # Create VPC and networking
./hack/deploy/setup-aws.sh ecs      # Create ECS cluster
./hack/deploy/setup-aws.sh iam      # Create IAM roles
./hack/deploy/setup-aws.sh secrets  # Create secrets
./hack/deploy/setup-aws.sh logs     # Create CloudWatch log groups
./hack/deploy/setup-aws.sh alb      # Create ALB

Database (AWS)

Option A: Amazon DocumentDB (Managed)

VPC_ID=$(aws ec2 describe-vpcs --filters "Name=tag:Name,Values=aex-vpc" \
  --query 'Vpcs[0].VpcId' --output text)

PRIVATE_SUBNETS=$(aws ec2 describe-subnets \
  --filters "Name=tag:Name,Values=aex-private-*" "Name=vpc-id,Values=$VPC_ID" \
  --query 'Subnets[*].SubnetId' --output text | tr '\t' ',')

DOCDB_SG=$(aws ec2 describe-security-groups \
  --filters "Name=group-name,Values=aex-docdb-sg" "Name=vpc-id,Values=$VPC_ID" \
  --query 'SecurityGroups[0].GroupId' --output text)

# Create subnet group
aws docdb create-db-subnet-group \
  --db-subnet-group-name aex-docdb-subnet-group \
  --db-subnet-group-description "Agent Exchange DocumentDB Subnet Group" \
  --subnet-ids ${PRIVATE_SUBNETS//,/ }

# Get password from Secrets Manager
DOCDB_PASSWORD=$(aws secretsmanager get-secret-value \
  --secret-id aex-docdb-password \
  --query SecretString --output text)

# Create DocumentDB cluster
aws docdb create-db-cluster \
  --db-cluster-identifier aex-docdb \
  --engine docdb \
  --master-username aexadmin \
  --master-user-password "$DOCDB_PASSWORD" \
  --vpc-security-group-ids "$DOCDB_SG" \
  --db-subnet-group-name aex-docdb-subnet-group

# Create instance
aws docdb create-db-instance \
  --db-instance-identifier aex-docdb-1 \
  --db-cluster-identifier aex-docdb \
  --db-instance-class db.r5.large \
  --engine docdb

Option B: MongoDB Atlas (External)

  1. Create a MongoDB Atlas cluster
  2. Configure VPC peering with your AWS VPC
  3. Store connection string in Secrets Manager:
aws secretsmanager create-secret \
  --name aex-mongo-uri \
  --secret-string "mongodb+srv://<USERNAME>:<PASSWORD>@<CLUSTER>.mongodb.net/aex"

Deploy (AWS)

Build and Push Images

# Login to ECR
aws ecr get-login-password --region $AWS_REGION | \
  docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com

# Build and push all services
VERSION=v1.0.0 ./hack/deploy/deploy-ecs.sh build

Deploy Services

# Deploy all services to staging
./hack/deploy/deploy-ecs.sh staging all

# Deploy all services to production
./hack/deploy/deploy-ecs.sh production all

# Deploy a single service
./hack/deploy/deploy-ecs.sh staging aex-gateway

Verify Deployment

# Check service status
aws ecs describe-services \
  --cluster aex-cluster \
  --services aex-gateway aex-work-publisher \
  --query 'services[*].{name:serviceName,status:status,running:runningCount,desired:desiredCount}'

# Get ALB URL
ALB_DNS=$(aws elbv2 describe-load-balancers --names "aex-alb" \
  --query 'LoadBalancers[0].DNSName' --output text)
echo "Access at: http://$ALB_DNS"

# Test health endpoint
curl http://$ALB_DNS/health
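
Right after a deploy the services may still be rolling out, so the status query above can show a running count below the desired count. One option is to block on the AWS CLI's built-in waiter first. The wrapper name `wait_for_services` is hypothetical; the cluster name follows the one used in this guide.

```shell
# Hypothetical wrapper: block until the named ECS services reach a steady state.
# Usage: wait_for_services CLUSTER SERVICE [SERVICE...]
wait_for_services() {
  cluster=$1
  shift
  # `aws ecs wait services-stable` polls the services and exits non-zero
  # if they have not stabilised by the time it gives up.
  aws ecs wait services-stable --cluster "$cluster" --services "$@"
}
```

For instance, `wait_for_services aex-cluster aex-gateway aex-work-publisher && curl http://$ALB_DNS/health`. The underlying describe call accepts at most 10 services at a time, so larger sets would need to be split into batches.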

Monitoring (AWS)

CloudWatch Logs

# View recent logs
aws logs tail /ecs/agent-exchange/aex-gateway --follow

# Search logs
aws logs filter-log-events \
  --log-group-name /ecs/agent-exchange/aex-gateway \
  --filter-pattern "ERROR"

CloudWatch Metrics

Metric                     Description              Threshold
CPUUtilization             CPU usage percentage     Alert > 80%
MemoryUtilization          Memory usage percentage  Alert > 85%
RunningTaskCount           Number of running tasks  Alert if 0
HTTPCode_Target_5XX_Count  5xx errors               Alert > 10/min
TargetResponseTime         Response latency         Alert > 2s
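
As an illustration of turning one of these thresholds into an alarm, the sketch below wraps `aws cloudwatch put-metric-alarm` for the CPU row. The function name, the alarm name, and the three-by-60-second evaluation window are assumptions, not settings taken from the setup scripts, and no alarm action (such as an SNS topic) is attached.

```shell
# Hypothetical helper: alarm when a service's average CPU stays above 80%.
# Usage: create_cpu_alarm CLUSTER SERVICE
create_cpu_alarm() {
  cluster=$1
  service=$2
  aws cloudwatch put-metric-alarm \
    --alarm-name "${service}-cpu-high" \
    --namespace AWS/ECS \
    --metric-name CPUUtilization \
    --dimensions "Name=ClusterName,Value=${cluster}" "Name=ServiceName,Value=${service}" \
    --statistic Average \
    --period 60 \
    --evaluation-periods 3 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold
}
```

Calling `create_cpu_alarm aex-cluster aex-gateway` would register the alarm for the gateway; the other rows could follow the same pattern with their own namespace, metric name, and threshold.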

Rollback (AWS)

# Rollback to previous task definition
aws ecs update-service \
  --cluster aex-cluster \
  --service aex-gateway \
  --task-definition aex-gateway:PREVIOUS_REVISION

# Force new deployment
aws ecs update-service --cluster aex-cluster --service aex-gateway \
  --force-new-deployment
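
PREVIOUS_REVISION has to be filled in by hand. A rough way to derive it is to take the task definition the service currently uses and subtract one from its revision number. The helper below is a sketch with a hypothetical name; it assumes the immediately preceding revision still exists and is the one to return to, which is not guaranteed if revisions were deregistered or skipped.

```shell
# Hypothetical helper: given a task definition ARN, print "family:(revision - 1)".
previous_task_def() {
  arn=$1
  family_rev=${arn##*/}      # e.g. aex-gateway:42
  family=${family_rev%:*}    # e.g. aex-gateway
  rev=${family_rev##*:}      # e.g. 42
  case $rev in
    ''|*[!0-9]*) echo "no numeric revision in: $arn" >&2; return 1 ;;
  esac
  if [ "$rev" -le 1 ]; then
    echo "no earlier revision than $family_rev" >&2
    return 1
  fi
  printf '%s:%s\n' "$family" "$((rev - 1))"
}
```

The current ARN can be read with `aws ecs describe-services --cluster aex-cluster --services aex-gateway --query 'services[0].taskDefinition' --output text` and the helper's output passed to `--task-definition`.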

CI/CD Pipelines

GCP (Cloud Run)

  1. Add GitHub Secrets:
       • GCP_PROJECT_ID
       • GCP_REGION
       • GCP_WORKLOAD_IDENTITY_PROVIDER
       • GCP_SERVICE_ACCOUNT

  2. Deploy via tag:

    git tag v1.0.0
    git push origin v1.0.0

AWS (ECS Fargate)

  1. Add GitHub Secrets:
       • AWS_ROLE_ARN - arn:aws:iam::<account-id>:role/aex-github-actions-role

  2. Add GitHub Variables:
       • AWS_REGION - us-east-1

  3. Configure Environments:
       • aws-staging - For staging deployments
       • aws-production - For production (add required reviewers)

  4. Deploy via tag:

    git tag -a v1.0.0 -m "Release v1.0.0"
    git push origin v1.0.0


Service Configuration

Resource Allocation

Service                GCP Memory  GCP CPU (vCPU)  AWS Memory  AWS CPU (units)
aex-gateway            1Gi         2               1024 MB     512
aex-work-publisher     512Mi       1               512 MB      256
aex-bid-gateway        512Mi       1               512 MB      256
aex-bid-evaluator      512Mi       1               512 MB      256
aex-contract-engine    512Mi       1               512 MB      256
aex-provider-registry  512Mi       1               512 MB      256
aex-trust-broker       512Mi       1               512 MB      256
aex-identity           512Mi       1               512 MB      256
aex-settlement         512Mi       1               512 MB      256
aex-telemetry          256Mi       1               512 MB      256

AWS CPU is given in ECS CPU units (1024 units = 1 vCPU).

Environment Variables

All Services

Variable     Description       Default
PORT         HTTP port         8080
ENVIRONMENT  Environment name  production
LOG_LEVEL    Logging level     info

Service-Specific

Service Variables
aex-gateway WORK_PUBLISHER_URL, BID_GATEWAY_URL, PROVIDER_REGISTRY_URL, SETTLEMENT_URL, IDENTITY_URL
aex-work-publisher STORE_TYPE, PROVIDER_REGISTRY_URL
aex-bid-gateway PROVIDER_REGISTRY_URL
aex-bid-evaluator BID_GATEWAY_URL, TRUST_BROKER_URL
aex-contract-engine BID_GATEWAY_URL, SETTLEMENT_URL
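
On Cloud Run, variables like these are usually passed as a single comma-separated list. The sketch below joins KEY=VALUE pairs into that form; `join_env_vars` is a hypothetical helper, and it assumes none of the values contain a comma.

```shell
# Hypothetical helper: join KEY=VALUE pairs with commas for an env-vars flag.
join_env_vars() {
  out=""
  for pair in "$@"; do
    # Add a comma separator before every pair except the first.
    out="${out:+$out,}$pair"
  done
  printf '%s\n' "$out"
}
```

One possible use is `gcloud run services update aex-bid-evaluator --region="$GCP_REGION" --update-env-vars="$(join_env_vars "BID_GATEWAY_URL=$BID_GATEWAY_URL" "TRUST_BROKER_URL=$TRUST_BROKER_URL")"`, with the URL variables taken from the "Get Service URLs" step.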

Scaling

GCP Auto-scaling

# High-throughput services (gateway)
--concurrency=250

# CPU-intensive services (bid-evaluator)
--concurrency=50

# Cold start optimization
--min-instances=1 --cpu-boost
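
The flags above are fragments; they take effect when passed to a deploy or update command. As a sketch, the gateway settings could be applied to an already deployed service as follows. The wrapper name `tune_gateway_scaling` is hypothetical, and whether the deploy script already sets these flags is not something this guide states.

```shell
# Hypothetical wrapper: apply the gateway scaling flags to a deployed service.
# Usage: tune_gateway_scaling REGION
tune_gateway_scaling() {
  region=$1
  gcloud run services update aex-gateway \
    --region="$region" \
    --concurrency=250 \
    --min-instances=1 \
    --cpu-boost
}
```

Running `tune_gateway_scaling "$GCP_REGION"` creates a new revision with these settings. Keeping a minimum instance warm avoids cold starts but is billed even when the service is idle.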

AWS Auto-scaling

# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/aex-cluster/aex-gateway \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 1 \
  --max-capacity 10

# Create scaling policy (target tracking)
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/aex-cluster/aex-gateway \
  --policy-name cpu-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }'

# Manual scaling
aws ecs update-service \
  --cluster aex-cluster \
  --service aex-gateway \
  --desired-count 5

Security Best Practices

Authentication

  • Disable unauthenticated access in production
  • Use service-to-service authentication
  • Validate API keys against Identity service
  • Implement rate limiting

Network Security

GCP

# VPC Connector for private networking
--vpc-connector=aex-connector
--vpc-egress=private-ranges-only

AWS

  • Services run in private subnets
  • ALB is the only public-facing component
  • Security groups restrict traffic between tiers

Secret Management

  • Never commit secrets to code
  • Use Secret Manager (GCP) or Secrets Manager (AWS)
  • Rotate secrets regularly
  • Use separate secrets per environment
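
On GCP, rotation can be done by adding a new version to an existing secret instead of recreating it. The sketch below mirrors how the secrets were generated in the Secrets (GCP) section; `rotate_secret` is a hypothetical helper name.

```shell
# Hypothetical helper: rotate a Secret Manager secret by adding a new version.
# Usage: rotate_secret SECRET_NAME
rotate_secret() {
  secret_name=$1
  # Generate 32 random bytes as base64 and strip the trailing newline
  # before handing the value to gcloud on stdin.
  openssl rand -base64 32 | tr -d '\n' | \
    gcloud secrets versions add "$secret_name" --data-file=-
}
```

For example, `rotate_secret aex-jwt-secret`. Services generally need a redeploy or restart to pick up the new version, and earlier versions stay enabled until they are explicitly disabled.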

Troubleshooting

GCP

Service Won't Start

gcloud run services logs read SERVICE_NAME --region=$GCP_REGION
gcloud run revisions describe REVISION_NAME --region=$GCP_REGION

Connection Refused

curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  https://target-service-xxx.run.app/health

AWS

Task Fails to Start

aws ecs describe-tasks \
  --cluster aex-cluster \
  --tasks $(aws ecs list-tasks --cluster aex-cluster --service-name aex-gateway \
    --desired-status STOPPED --query 'taskArns[0]' --output text) \
  --query 'tasks[0].stoppedReason'

Service Unhealthy

aws elbv2 describe-target-health \
  --target-group-arn $(aws elbv2 describe-target-groups \
    --names aex-gateway-tg --query 'TargetGroups[0].TargetGroupArn' --output text)

Database Connection Issues

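# Open a shell in a running task to test connectivity from inside the container
# (requires ECS Exec to be enabled on the service)
aws ecs execute-command \
  --cluster aex-cluster \
  --task <task-id> \
  --container aex-gateway \
  --interactive \
  --command "/bin/sh"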

Teardown

Validate Before Teardown

# GCP - validate what will be deleted
./hack/deploy/teardown-gcp.sh validate

# AWS - validate what will be deleted
./hack/deploy/teardown-aws.sh validate

Execute Teardown

# GCP - delete all resources
./hack/deploy/teardown-gcp.sh

# AWS - delete all resources
./hack/deploy/teardown-aws.sh

Make Targets

# GCP
make gcp-teardown

# AWS
make aws-teardown

Demo Deployment (Local Development)

The demo showcases the complete AEX + A2A + AP2 flow with legal agents and payment processors.

Quick Start Demo

cd demo

# Start everything
docker-compose up -d

# Access UI
open http://localhost:8502

Step-by-Step Demo (for presentations)

cd demo

# 1. Stop everything and clean up
docker-compose down -v

# 2. Start AEX infrastructure only (no agents)
docker-compose up -d mongo aex-identity aex-provider-registry aex-trust-broker \
  aex-bid-gateway aex-bid-evaluator aex-contract-engine aex-work-publisher \
  aex-settlement aex-credentials-provider aex-telemetry aex-gateway

# 3. Start UI without dependencies
docker-compose up -d --no-deps demo-ui-nicegui

# 4. Verify empty marketplace
curl -s http://localhost:8085/providers | jq '.total'  # Should be 0

# 5. Add agents one by one (during presentation)
docker-compose up -d legal-agent-a      # Budget Legal
docker-compose up -d legal-agent-b      # Standard Legal
docker-compose up -d legal-agent-c      # Premium Legal

# 6. Add payment agents
docker-compose up -d payment-legalpay payment-contractpay payment-compliancepay

# 7. Add orchestrator
docker-compose up -d orchestrator

# 8. Open browser and run the demo
open http://localhost:8502

Demo Verification Commands

# Check AEX health
curl http://localhost:8080/health

# Count registered providers
curl -s http://localhost:8085/providers | jq '.total'

# List provider names
curl -s http://localhost:8085/providers | jq '.providers[].name'

# Check agent card
curl -s http://localhost:8100/.well-known/agent.json | jq '{name, description}'
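
During the step-by-step demo, containers can take a few seconds to come up, so a check run immediately after `docker-compose up -d` may fail. A small polling loop, sketched below with the hypothetical name `wait_for_health`, retries until the endpoint answers.

```shell
# Hypothetical helper: poll a URL until curl succeeds or the retries run out.
# Usage: wait_for_health URL [RETRIES] [DELAY_SECONDS]
wait_for_health() {
  url=$1
  retries=${2:-30}
  delay=${3:-2}
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # -f makes curl exit non-zero on HTTP error statuses; -s hides progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "healthy: $url"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "gave up waiting for: $url" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8080/health` waits up to about a minute for the gateway before the remaining checks are run.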

Demo UI Features

The NiceGUI demo interface (port 8502) provides:

  • Real-time agent registration display (auto-refreshes every 5 seconds)
  • Work submission form
  • Live bid collection and comparison
  • Contract award with configurable strategies (balanced, lowest price, best quality)
  • A2A execution visualization
  • AP2 payment processing with cashback rewards
  • Settlement summary with ledger updates


Last Updated: 2025-12-30