Overview and Sizing Guidelines

Proper capacity planning ensures your integrations have sufficient resources to handle expected workloads while maintaining performance SLAs. This page provides sizing guidelines, resource estimation methods, and scaling recommendations.

Key metrics for sizing

| Metric | Description | How to Measure |
| --- | --- | --- |
| Requests per second (RPS) | Expected peak throughput | Load testing with bal test or tools like k6 and JMeter |
| Response latency (p95) | Target 95th percentile response time | Performance testing under load |
| Concurrent connections | Maximum simultaneous client connections | Connection pool configuration |
| Message size | Average request/response payload size | Log analysis or API analytics |
| Integration complexity | Number of downstream calls per request | Code analysis |
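
To make the p95 metric concrete, the following is a minimal sketch of computing a percentile from raw latency samples with the nearest-rank method. The function name and the sample values are hypothetical; load-testing tools such as k6 report this figure directly and may interpolate differently, so treat it as an illustration rather than the method any particular tool uses.

```python
import math


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical response times in milliseconds from a single test run.
latencies_ms = [12, 15, 11, 14, 240, 13, 16, 12, 18, 95]

print(percentile(latencies_ms, 50))  # median
print(percentile(latencies_ms, 95))  # tail latency to compare against the SLA
```

With only ten samples the p95 is simply the slowest request, which is why percentile targets are best evaluated over a run long enough to collect thousands of requests.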

Resource estimation

CPU

| Workload | vCPUs per Instance | Notes |
| --- | --- | --- |
| Simple passthrough (< 2 downstream calls) | 0.25 - 0.5 | Mostly I/O bound |
| Moderate transformation (2-5 downstream calls) | 0.5 - 1.0 | Some CPU for data mapping |
| Complex orchestration (5+ calls, heavy transformation) | 1.0 - 2.0 | CPU-intensive processing |
| AI/ML inference integration | 2.0 - 4.0 | Depends on model complexity |

Memory

| Workload | Memory per Instance | Notes |
| --- | --- | --- |
| Lightweight service (JVM) | 256 - 512 MB | Minimal heap usage |
| Standard service (JVM) | 512 MB - 1 GB | Typical integration workload |
| High-throughput service (JVM) | 1 - 2 GB | Large payloads, many connections |
| GraalVM native image | 64 - 256 MB | Significantly lower footprint |

Instance count

Estimate the number of instances based on throughput requirements:

instances = ceil(peak_RPS / RPS_per_instance) + buffer_instances

Example: If each instance handles 500 RPS and your peak is 2000 RPS:

instances = ceil(2000 / 500) + 1 = 5 instances

Always add at least one buffer instance for rolling updates and failover.
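
The formula can be sketched as a small helper. The function and parameter names are illustrative and not part of any tool; RPS_per_instance is assumed to come from your own load tests.

```python
import math


def required_instances(peak_rps, rps_per_instance, buffer_instances=1):
    """instances = ceil(peak_RPS / RPS_per_instance) + buffer_instances"""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    if peak_rps < 0 or buffer_instances < 0:
        raise ValueError("peak_rps and buffer_instances must not be negative")
    return math.ceil(peak_rps / rps_per_instance) + buffer_instances


# The worked example: 2000 RPS peak, 500 RPS per instance, one buffer instance.
print(required_instances(2000, 500))

# A peak that does not divide evenly rounds up before the buffer is added.
print(required_instances(2100, 500))
```

Rounding up before adding the buffer means a partially loaded instance still counts as a full one, so the buffer instance stays free for rolling updates and failover.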

Sizing by deployment target

Kubernetes

resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "1000m"

Configure Horizontal Pod Autoscaler (HPA):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
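
To reason about how this HPA reacts to load, the sketch below applies the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas * current utilization / target utilization), taking the largest result across metrics and clamping it to minReplicas and maxReplicas. Utilization is measured against the container's resource requests, not its limits. This is a simplification: the function names are hypothetical, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_for_metric(current_replicas, current_utilization, target_utilization):
    """Replica count one metric asks for, before min/max clamping."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


def hpa_replicas(current_replicas, cpu_pct, memory_pct,
                 cpu_target=70, memory_target=80,
                 min_replicas=2, max_replicas=10):
    """Approximate the replica count for the HPA manifest above."""
    by_cpu = desired_for_metric(current_replicas, cpu_pct, cpu_target)
    by_memory = desired_for_metric(current_replicas, memory_pct, memory_target)
    # With several metrics, the largest proposed replica count wins.
    wanted = max(by_cpu, by_memory)
    return max(min_replicas, min(max_replicas, wanted))


# 4 pods averaging 90% CPU and 60% memory of their requests.
print(hpa_replicas(4, cpu_pct=90, memory_pct=60))

# A spike that would ask for more than maxReplicas is capped.
print(hpa_replicas(8, cpu_pct=140, memory_pct=40))
```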

Virtual machines

| Deployment Size | VM Spec | Instances | Handles |
| --- | --- | --- | --- |
| Small | 2 vCPU, 4 GB RAM | 2 | Up to 500 RPS |
| Medium | 4 vCPU, 8 GB RAM | 3-4 | Up to 2000 RPS |
| Large | 8 vCPU, 16 GB RAM | 4-8 | Up to 10,000 RPS |

Serverless (AWS Lambda)

| Setting | Recommendation |
| --- | --- |
| Memory | 512 MB - 1 GB (JVM), 256 MB (GraalVM native) |
| Timeout | 30 seconds as a starting point (AWS Lambda's own default is 3 seconds); adjust per use case |
| Provisioned Concurrency | Set to expected minimum concurrent executions |
| Reserved Concurrency | Set to protect downstream systems |

Connection pool sizing

Database connection pools

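import ballerinax/mysql;
import ballerinax/mysql.driver as _;

final mysql:Client dbClient = check new (
    host = "db.example.com",
    port = 3306,
    user = "svc_user",
    password = "password", // placeholder; supply real credentials through configuration
    database = "orders",
    connectionPool = {
        maxOpenConnections: 25,      // upper bound on connections held by this instance
        maxConnectionLifeTime: 1800, // seconds before a pooled connection is retired
        minIdleConnections: 5        // idle connections kept ready for reuse
    }
);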

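Rule of thumb: number_of_instances * maxOpenConnections <= database_max_connections. Because maxOpenConnections applies to each instance, size it against the largest instance count you can scale to (for example, the HPA maxReplicas), not the current count.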
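
A minimal sketch of this check follows. The function names, the reserved-connection allowance, and the database limit of 151 are assumptions chosen for illustration; substitute the limit configured on your own database.

```python
def fits_database_limit(instance_count, max_open_connections, database_max_connections):
    """True when every instance can fill its pool without exceeding the database limit."""
    return instance_count * max_open_connections <= database_max_connections


def max_pool_size_per_instance(database_max_connections, instance_count,
                               reserved_connections=0):
    """Largest per-instance pool that still fits, after reserving some connections
    (for example, for admin sessions or other applications)."""
    if instance_count <= 0:
        raise ValueError("instance_count must be positive")
    usable = database_max_connections - reserved_connections
    return max(usable // instance_count, 0)


DB_MAX_CONNECTIONS = 151  # assumed limit for this example

# 5 instances with maxOpenConnections = 25 need up to 125 connections.
print(fits_database_limit(5, 25, DB_MAX_CONNECTIONS))

# Scaling out to 10 instances with the same pool size would need 250.
print(fits_database_limit(10, 25, DB_MAX_CONNECTIONS))

# Pool size that stays safe at 10 instances while keeping 10 connections in reserve.
print(max_pool_size_per_instance(DB_MAX_CONNECTIONS, 10, reserved_connections=10))
```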

HTTP client connection pools

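import ballerina/http;

final http:Client apiClient = check new ("https://api.example.com", {
    httpVersion: http:HTTP_1_1,
    poolConfig: {
        maxActiveConnections: 50, // upper bound on connections open at once
        maxIdleConnections: 10,   // idle connections kept for reuse
        waitTime: 30              // seconds to wait for a free connection
    },
    timeout: 30 // seconds to wait for a response
});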
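
One way to sanity-check a value such as maxActiveConnections is Little's law, which estimates the number of in-flight requests as throughput multiplied by latency. This is a general queueing rule of thumb rather than something the HTTP client prescribes, and the function name and input figures below are hypothetical.

```python
import math


def estimated_in_flight_calls(rps_per_instance, calls_per_request, avg_latency_ms):
    """Little's law: concurrency = arrival rate * time in system.

    Estimates how many downstream calls one instance has open at once.
    """
    if min(rps_per_instance, calls_per_request, avg_latency_ms) < 0:
        raise ValueError("inputs must not be negative")
    return math.ceil(rps_per_instance * calls_per_request * avg_latency_ms / 1000)


MAX_ACTIVE_CONNECTIONS = 50  # value from the pool configuration above

# Assumed load: 500 RPS per instance, one downstream call each, 80 ms average latency.
in_flight = estimated_in_flight_calls(500, 1, 80)
print(in_flight, in_flight <= MAX_ACTIVE_CONNECTIONS)

# If the downstream slows to 200 ms, the same traffic needs more connections
# than the pool allows, so callers start waiting for a free one.
slow = estimated_in_flight_calls(500, 1, 200)
print(slow, slow <= MAX_ACTIVE_CONNECTIONS)
```

The estimate uses average latency, so it understates demand during latency spikes; leaving headroom above the computed figure is a reasonable precaution.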

Load testing

Validate your capacity plan with load testing before going to production:

# Using k6
k6 run --vus 100 --duration 5m load-test.js

# Using Apache Bench
ab -n 10000 -c 100 http://localhost:9090/api/orders

Key results to collect

| Metric | Target |
| --- | --- |
| Throughput (RPS) | Meets or exceeds peak estimate |
| p95 Latency | Under SLA threshold |
| Error Rate | Under 0.1% |
| CPU Utilization | Under 70% at peak |
| Memory Utilization | Under 80% at peak |
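
These targets can be turned into an automated pass/fail check after each test run. The sketch below assumes the results have already been gathered into a dictionary; the key names, the sample figures, and the 300 ms SLA are hypothetical.

```python
def failed_targets(results, peak_rps_estimate, p95_sla_ms):
    """Return the names of the targets from the table above that a run missed."""
    checks = {
        "throughput": results["rps"] >= peak_rps_estimate,
        "p95_latency": results["p95_ms"] < p95_sla_ms,
        "error_rate": results["error_rate"] < 0.001,  # under 0.1%
        "cpu": results["cpu_pct"] < 70,
        "memory": results["memory_pct"] < 80,
    }
    return [name for name, passed in checks.items() if not passed]


# Hypothetical figures from one load-test run.
run = {
    "rps": 2150,
    "p95_ms": 180,
    "error_rate": 0.0004,
    "cpu_pct": 76,
    "memory_pct": 61,
}

# CPU is above 70% at peak, so this run reports one failed target.
print(failed_targets(run, peak_rps_estimate=2000, p95_sla_ms=300))
```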

Scaling strategies

| Strategy | When to Use |
| --- | --- |
| Vertical scaling | Single-instance workloads, quick fix |
| Horizontal scaling | Stateless services, high availability |
| Auto-scaling (HPA) | Variable traffic patterns |
| Event-driven scaling (KEDA) | Queue/event-driven workloads |
