t3.medium

AWS t3.medium Specifications:

  • vCPUs: 2
  • Memory: 4GB RAM
  • Network Performance: Up to 5 Gigabit

DeepIntShield Configuration:

  • Buffer Size: 15,000
  • Initial Pool Size: 10,000
  • Test Load: 5,000 requests per second (RPS)

| Metric | Value | Notes |
| --- | --- | --- |
| Success Rate | 100.00% | Perfect reliability under high load |
| Average Request Size | 0.13 KB | Lightweight request payload |
| Average Response Size | 1.37 KB | Standard response size for testing |
| Average Latency | 2.12s | Total end-to-end response time |
| Peak Memory Usage | 1,312.79 MB | ~33% of available 4GB RAM |
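The payload sizes and memory figure above can be put in context with a little arithmetic. The following is a minimal sketch, not part of the benchmark tooling: the function names are hypothetical, 1 KB is assumed to be 1,024 bytes, and payload throughput ignores HTTP headers and TLS framing.

```python
# Figures taken from the benchmark results above.
RPS = 5_000
REQUEST_KB = 0.13
RESPONSE_KB = 1.37
PEAK_MEMORY_MB = 1_312.79


def payload_throughput_mbit_per_s(rps, request_kb, response_kb):
    """Combined request + response payload rate in megabits per second.

    Assumption: 1 KB = 1,024 bytes; protocol overhead is not counted.
    """
    kb_per_second = rps * (request_kb + response_kb)
    return kb_per_second * 1024 * 8 / 1_000_000


def memory_share_percent(used_mb, total_mb):
    """Peak memory as a percentage of the instance's RAM."""
    return 100.0 * used_mb / total_mb


throughput = payload_throughput_mbit_per_s(RPS, REQUEST_KB, RESPONSE_KB)
print(f"Payload throughput: {throughput:.2f} Mbit/s")

# The share depends on whether 4GB is read as 4,000 MB or 4,096 MB.
for total_mb in (4_000, 4_096):
    share = memory_share_percent(PEAK_MEMORY_MB, total_mb)
    print(f"Memory share of {total_mb} MB: {share:.1f}%")
```

Under these assumptions the payload traffic is roughly 61 Mbit/s, far below the instance's "Up to 5 Gigabit" network rating, and peak memory is about 32-33% of RAM depending on how 4GB is converted to MB.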

Most of the end-to-end latency (1.56s of 2.12s) is the upstream provider API call - the gateway itself adds only microseconds.

| Component | Latency | Notes |
| --- | --- | --- |
| Upstream provider call | 1.56s | The actual model API request (unavoidable in any setup) |
| DeepIntShield overhead | 59 µs | Added latency from the gateway |

DeepIntShield’s Total Overhead: 59 µs*

*Excludes the provider API call and JSON serialization, which are required in any implementation
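To see how small the gateway's share is, the latency figures above can be expressed as percentages of the end-to-end time. This is an illustrative sketch using only the numbers reported on this page; the names are hypothetical, and the remaining time is left unattributed because the source does not break it down further.

```python
# Latency figures reported above, converted to seconds.
END_TO_END_S = 2.12
UPSTREAM_S = 1.56
GATEWAY_OVERHEAD_S = 59e-6  # 59 microseconds


def share_percent(part_s, total_s):
    """Return part_s as a percentage of total_s."""
    return 100.0 * part_s / total_s


upstream_share = share_percent(UPSTREAM_S, END_TO_END_S)
gateway_share = share_percent(GATEWAY_OVERHEAD_S, END_TO_END_S)

# Time not covered by either reported component; the page does not
# itemize it, so no cause is assigned here.
unattributed_s = END_TO_END_S - UPSTREAM_S - GATEWAY_OVERHEAD_S

print(f"Upstream provider call: {upstream_share:.1f}% of end-to-end latency")
print(f"Gateway overhead:       {gateway_share:.4f}% of end-to-end latency")
print(f"Unattributed remainder: {unattributed_s:.3f}s")
```

With these inputs the upstream call accounts for about 73.6% of the average latency, while the 59 µs gateway overhead is under 0.01%.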


  1. Perfect Reliability: 100% success rate even at 5,000 RPS
  2. Memory Efficiency: Uses only 33% of available RAM (1,312.79 MB / 4GB)
  3. Minimal Overhead: Just 59 µs of added latency per request
  4. Cost-effective: Sustains a 5,000 RPS workload on a low-cost instance
  • Memory Usage: 1,312.79 MB at peak, about a third of the instance's 4GB RAM
  • CPU Performance: Handled the 5,000 RPS test load on 2 vCPUs
  • Stability: No failed requests or throughput degradation under sustained load

Based on test results, these configurations work well:

{
  "client": {
    "initial_pool_size": 10000,
    "buffer_size": 15000
  }
}
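One way to sanity-check these values against a workload is Little's law, which estimates the average number of requests in flight as arrival rate multiplied by average latency. The sketch below applies it to the benchmark figures. Whether `initial_pool_size` and `buffer_size` correspond directly to in-flight requests is an assumption made here for illustration; the source does not say how DeepIntShield uses them.

```python
def in_flight_requests(arrival_rate_rps, avg_latency_s):
    """Little's law: average concurrency = arrival rate * average latency."""
    return arrival_rate_rps * avg_latency_s


# Benchmark figures from this page.
TEST_LOAD_RPS = 5_000
AVG_LATENCY_S = 2.12

# Configuration values from the JSON above.
INITIAL_POOL_SIZE = 10_000
BUFFER_SIZE = 15_000

concurrency = in_flight_requests(TEST_LOAD_RPS, AVG_LATENCY_S)
print(f"Estimated requests in flight: {concurrency:,.0f}")

# Assumption: the buffer should be able to hold the estimated concurrency.
print(f"Fits within buffer_size: {concurrency <= BUFFER_SIZE}")
print(f"Exceeds initial_pool_size: {concurrency > INITIAL_POOL_SIZE}")
```

For the tested load this gives roughly 10,600 requests in flight, which is in the same range as the pool and buffer sizes used in the benchmark. Substituting your own RPS and latency gives a rough starting point before tuning.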

For Lower Memory Usage:

  • Reduce initial_pool_size to 7,500-8,000
  • Decrease buffer_size to 12,000-13,000
  • Trade-off: Slightly higher latency

For Better Performance:

  • Increase initial_pool_size to 12,000-13,000
  • Increase buffer_size to 17,000-18,000
  • Trade-off: Higher memory usage (monitor RAM limits)
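The tuning options above can be kept side by side as named profiles and rendered into the same JSON shape as the tested configuration. This is a sketch only: the profile names and the `render_config` helper are hypothetical, and the values chosen for the two alternative profiles are single points picked from the suggested ranges, not separately benchmarked settings.

```python
import json

# Only the "baseline" profile was benchmarked on this page.
PROFILES = {
    "baseline": {"initial_pool_size": 10_000, "buffer_size": 15_000},
    # Upper ends of the "lower memory" ranges (7,500-8,000 / 12,000-13,000).
    "lower_memory": {"initial_pool_size": 8_000, "buffer_size": 13_000},
    # Lower ends of the "better performance" ranges (12,000-13,000 / 17,000-18,000).
    "better_performance": {"initial_pool_size": 12_000, "buffer_size": 17_000},
}


def render_config(profile_name):
    """Return the JSON configuration text for a named profile."""
    client = PROFILES[profile_name]
    return json.dumps({"client": client}, indent=2)


print(render_config("lower_memory"))
```

Starting from the nearer end of each range keeps each change small, so its effect on latency and memory can be measured before moving further.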

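| Metric | t3.medium | t3.xlarge | Difference (t3.medium relative to t3.xlarge) |
| --- | --- | --- | --- |
| DeepIntShield Overhead | 59 µs | 11 µs | About 5.4x higher (t3.xlarge is 81% lower) |
| Average Latency | 2.12s | 1.61s | +32% slower (t3.xlarge is 24% lower) |
| Memory Usage | 1,312.79 MB | 3,340.44 MB | -61% usage |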

Key Insights:

  • t3.medium uses 61% less memory than t3.xlarge
  • The trade-off for the lower cost is higher gateway overhead (59 µs vs. 11 µs) and higher average latency (2.12s vs. 1.61s)
  • Gateway overhead stays in the microsecond range on both instances
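Percentage differences depend on which instance is taken as the reference, so the same pair of numbers can yield quite different figures. The sketch below computes the change in both directions from the values in the comparison table; `percent_change` is a hypothetical helper name.

```python
def percent_change(value, reference):
    """Percentage change of value relative to reference."""
    return 100.0 * (value - reference) / reference


# (t3.medium, t3.xlarge) pairs from the comparison table.
METRICS = {
    "DeepIntShield overhead (µs)": (59.0, 11.0),
    "Average latency (s)": (2.12, 1.61),
    "Memory usage (MB)": (1312.79, 3340.44),
}

for name, (medium, xlarge) in METRICS.items():
    medium_vs_xlarge = percent_change(medium, xlarge)
    xlarge_vs_medium = percent_change(xlarge, medium)
    print(
        f"{name}: t3.medium is {medium_vs_xlarge:+.0f}% vs t3.xlarge; "
        f"t3.xlarge is {xlarge_vs_medium:+.0f}% vs t3.medium"
    )
```

For gateway overhead, 59 µs is about 436% higher than 11 µs, while 11 µs is about 81% lower than 59 µs. For average latency the two directions give +32% and -24%, and for memory t3.medium uses about 61% less.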

When to upgrade to t3.xlarge:

  • Sustained load approaches 4,000+ RPS

  • Average latency climbs or success rate drops below 100%

  • Memory usage approaches 75% of available RAM

Next Steps:

  • Run Your Own Benchmarks to test with your specific workload

  • Compare with t3.xlarge for performance scaling analysis
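The upgrade criteria listed under "When to upgrade to t3.xlarge" can be expressed as a simple check over observed metrics. This is a minimal sketch under stated assumptions: the `Observation` type and `upgrade_reasons` function are hypothetical, "latency climbs" is interpreted as exceeding the benchmarked 2.12s average by more than 10%, and 4GB is taken as 4,096 MB. Neither the 10% margin nor the MB conversion comes from the source.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Metrics observed on a running t3.medium deployment."""
    rps: float
    success_rate_percent: float
    avg_latency_s: float
    memory_used_mb: float


def upgrade_reasons(
    obs,
    baseline_latency_s=2.12,   # benchmarked average latency
    latency_margin=1.10,       # assumption: 10% above baseline counts as "climbing"
    memory_total_mb=4096.0,    # assumption: 4GB = 4,096 MB
):
    """Return the list of upgrade criteria that the observation meets."""
    reasons = []
    if obs.rps >= 4_000:
        reasons.append("sustained load at or above 4,000 RPS")
    if obs.avg_latency_s > baseline_latency_s * latency_margin:
        reasons.append("average latency above baseline")
    if obs.success_rate_percent < 100.0:
        reasons.append("success rate below 100%")
    if obs.memory_used_mb >= 0.75 * memory_total_mb:
        reasons.append("memory usage at or above 75% of RAM")
    return reasons


sample = Observation(rps=2_500, success_rate_percent=100.0,
                     avg_latency_s=2.0, memory_used_mb=3_200)
print(upgrade_reasons(sample))
```

An empty list means none of the criteria are met. The sample above trips only the memory criterion, since 3,200 MB is above 75% of 4,096 MB.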