Some Problems Only Appear After 24 Hours
Memory leaks, connection leaks, and gradual performance degradation only become visible under extended load. Our soak testing runs your system at sustained load for 8 to 72 hours to surface problems that standard load tests miss entirely.
You might be experiencing...
Soak testing — also called endurance testing — exposes failure modes that standard load tests miss because they run too briefly. A memory leak that accumulates at 50 MB per hour is invisible in a 30-minute test but manifests as an out-of-memory crash after 20 hours of production traffic. A connection pool leak that grows by 2 connections per minute exhausts a 500-connection pool in just over four hours. Thread pool saturation from unreleased goroutines or threads builds over days. These are operational problems masquerading as infrastructure problems.
The time-series analysis of soak test metrics is what distinguishes a meaningful soak test from simply running load for a long time. We look for upward-trending metrics that do not stabilise: heap size that grows linearly with runtime, file descriptor counts that increase without ceiling, database connection counts that ratchet upward with each traffic spike. Every identified trend comes with a rate-of-change measurement and a projected time to failure.
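The rate-of-change measurement and time-to-failure projection described above can be sketched in a few lines. This is an illustrative example, not our production tooling; the heap samples and the 2 GiB ceiling are made up to show the arithmetic.

```python
# Sketch: fit a least-squares slope to sampled heap sizes, then project
# when the trend would hit a memory ceiling. Sample data is illustrative.

def linear_slope(times, values):
    """Ordinary least-squares slope of values over times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var

def projected_time_to_failure(times, values, ceiling):
    """Hours until the metric reaches `ceiling`, if the linear trend holds."""
    slope = linear_slope(times, values)  # units per hour
    if slope <= 0:
        return None  # stable or shrinking: no projected failure
    return (ceiling - values[-1]) / slope

# Heap size in MB sampled hourly: leaking at roughly 50 MB/hour.
hours = [0, 1, 2, 3, 4]
heap_mb = [400, 452, 498, 551, 600]
ttf = projected_time_to_failure(hours, heap_mb, ceiling=2048)
```

A metric whose fitted slope is indistinguishable from zero has stabilised; a persistent positive slope gets a projected time to failure in the report.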
Root cause identification for memory leaks requires correlating the metric trend with code paths: which allocation call sites are producing objects that are not being garbage collected? Heap profilers — JVM heap dumps, Go pprof memory profiles, Python memory_profiler — taken at intervals during the soak test reveal the answer. We pair memory profiling with soak load to surface the specific code responsible for the leak.
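As a concrete illustration of interval snapshotting, here is a minimal sketch using Python's standard-library `tracemalloc` (an alternative to the profilers named above). The simulated leak is a stand-in for an hour of soak traffic between two snapshots.

```python
# Sketch: take heap snapshots at intervals and diff them to rank the
# allocation call sites whose memory keeps growing. The leaking list
# below simulates an application-level leak.
import tracemalloc

tracemalloc.start(25)  # keep up to 25 stack frames per allocation

leak = []  # stand-in for a leaking structure in the application

before = tracemalloc.take_snapshot()
leak.extend(bytearray(1024) for _ in range(1000))  # simulate leaked buffers
after = tracemalloc.take_snapshot()

# Rank allocation sites by growth between the two snapshots.
growth = after.compare_to(before, "lineno")
for stat in growth[:3]:
    print(stat)  # file/line, size increase, count increase
```

In a real engagement the snapshots are hours apart, and the top entries of the diff point directly at the code path that is not releasing memory.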
Engagement Phases
Soak Test Design & Instrumentation
We design the soak test scenario at your target sustained load level (typically 70–80% of peak capacity). We configure comprehensive memory and resource monitoring: JVM heap, goroutine counts, file descriptors, database connection counts, Redis connection counts, and application-level metrics. We set up time-series dashboards for all key metrics.
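The resource monitoring configured in this phase can be approximated with a small sampler. This is a hypothetical, Linux-centric sketch using only the standard library; real engagements feed these samples into a time-series dashboard rather than a Python list.

```python
# Sketch: sample process memory and open file descriptors at a fixed
# interval, timestamped so trends can be plotted over the soak run.
import os
import resource
import time

def sample_resources():
    usage = resource.getrusage(resource.RUSAGE_SELF)
    try:
        fd_count = len(os.listdir("/proc/self/fd"))  # Linux only
    except FileNotFoundError:
        fd_count = -1  # platform without /proc
    return {
        "timestamp": time.time(),
        "max_rss_kb": usage.ru_maxrss,  # KiB on Linux, bytes on macOS
        "open_fds": fd_count,
    }

def soak_monitor(interval_seconds=300, samples=None):
    """Yield resource samples at a fixed interval; run forever if samples is None."""
    count = 0
    while samples is None or count < samples:
        yield sample_resources()
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_seconds)

# Example: three samples one second apart (5-minute intervals in a real run).
series = list(soak_monitor(interval_seconds=1, samples=3))
```

The same pattern extends to application-level metrics (connection pool size, queue depth) by swapping in the relevant gauge for each sample.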
Extended Load Execution
We run sustained load at the target level for the agreed duration. We monitor memory and resource metrics at 5-minute intervals, looking for upward trends that indicate leaks. We analyse application logs for increasing error rates, timeout patterns, and garbage collection pressure. We record the precise timing of any metric that begins trending upward.
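Timing when a metric begins trending upward can be sketched as a rolling-window slope check: scan the sample series and report the first window whose slope exceeds a noise threshold. The window size, threshold, and connection-count series below are illustrative, not fixed parameters of our process.

```python
# Sketch: find the first sliding window of samples whose least-squares
# slope exceeds a noise threshold, marking the onset of an upward trend.

def window_slope(values):
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var

def trend_onset(samples, window=6, threshold=1.0):
    """Index of the first sample where the rolling slope exceeds threshold."""
    for start in range(len(samples) - window + 1):
        if window_slope(samples[start:start + window]) > threshold:
            return start
    return None

# Connection count: flat for ten samples, then it starts climbing.
conns = [100, 101, 100, 99, 100, 101, 100, 100, 101, 100,
         103, 106, 110, 113, 117, 120]
onset = trend_onset(conns, window=6, threshold=1.0)
```

Cross-referencing the onset index against the load profile and deployment log is what turns "this metric is growing" into "this metric started growing when X happened".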
Analysis & Remediation
We produce a soak test analysis report identifying all metrics that trended upward, their rate of change, and the projected time to failure. For identified leaks, we provide root cause analysis with recommended code or configuration fixes. We pair with your engineering team to implement and validate the highest-priority fixes.
Deliverables
Before & After
| Metric | Before | After |
|---|---|---|
| Memory leaks identified | Unknown | 3 found at 10 MB/hr |
| Connection leaks | Unknown | 2 found, fixed |
| Time to failure | Unknown | 72 hrs without restart |
Tools We Use
Frequently Asked Questions
How do you run a 72-hour test without it consuming your full attention?
We configure automated monitoring with alert thresholds that notify us if any metric exceeds expected bounds during the test run. We check in at regular intervals (morning and evening) and review the time-series data. The load generation runs autonomously — k6 and Locust are designed for extended runs. We are available for immediate response if an alert fires.
What is the difference between a memory leak and normal memory growth?
Normal memory growth stabilises: the application allocates memory for caches and in-flight requests, reaches a steady state, and garbage collects appropriately. A memory leak grows continuously without stabilising — the trend line has a positive slope that does not flatten. We use rate-of-change analysis to distinguish between the two, and we look for correlation with request count to identify whether memory growth is proportional to traffic.
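The distinction drawn above can be sketched as a stabilisation check: warm-up growth flattens, so the slope over the tail of the series drops toward zero, while a leak keeps a positive slope throughout. The threshold and both series are illustrative.

```python
# Sketch: classify a memory series as a leak or normal warm-up by
# checking whether its final third still trends upward.

def slope(values):
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var

def classify(series, flat_threshold=0.5):
    """'leak' if the last third of the series still trends upward, else 'stable'."""
    tail = series[-(len(series) // 3):]
    return "leak" if slope(tail) > flat_threshold else "stable"

warmup = [100, 180, 240, 280, 300, 305, 304, 306, 305, 306, 305, 306]
leaking = [100, 110, 121, 130, 141, 150, 161, 170, 181, 190, 201, 210]
```

Correlating the series against request count, as described above, then separates traffic-proportional growth from growth that continues even when load is steady.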
We restart our application nightly — does that mean soak testing doesn't apply?
Nightly restarts are a common workaround for memory or connection leaks — which means the leak is present and you are managing it operationally rather than fixing it. Soak testing identifies the specific leak so it can be fixed, eliminating the need for scheduled restarts. Planned restarts also create brief downtime windows and complicate deployment scheduling.
Know Your Scaling Ceiling
Book a free 30-minute capacity scope call with our load testing engineers. We review your architecture, traffic expectations, and upcoming scaling events — and scope the load test that will give you the data you need.
Talk to an Expert