Concurrent Requests Support for ESP32-S3 Provider
Authored by Antigravity. Date: 2026-03-08
1. Goal (What)
Enable the ESP32-S3 HTTP server to gracefully handle multiple concurrent web clients. Currently, if one browser connects, it consumes all available server sockets with "keep-alive" connections and blocks the static_file_handler via a single shared scratch buffer. The goal is to allow up to 5-10 clients (e.g., PCs, tablets, phones) on the local network to open the dashboard simultaneously without hanging, and to add a frontend safeguard (a loading spinner) to improve the user experience during slow network responses.
2. Rationale (Why)
The default ESP-IDF HTTP server (esp_http_server) is configured for minimal resource usage:
- `max_open_sockets = 7`: A single modern browser tries to open up to 6 connections simultaneously to fetch HTML, CSS, JS, and API data.
- `lru_purge_enable = false`: When a browser finishes loading, it keeps those 6 sockets open (keep-alive) for future requests. If a second device tries to connect, it is rejected because the server has no free sockets, even though the first device's sockets are idle.
Furthermore, the current static_file_handler relies on a single shared rest_context->scratch buffer allocated globally. If the server is modified to multiplex handlers concurrently, this shared buffer would be overwritten by competing requests, causing data corruption in the served files.
3. Chosen Approach (How)
3.1 Backend Configuration (ESP-IDF)
Instead of implementing complex multi-threading (spawning multiple FreeRTOS worker tasks), we will leverage the HTTP server's built-in event loop multiplexing by tuning its configuration:
- Increase LwIP Socket Limit: `LWIP_MAX_SOCKETS` is set to `32` in `sdkconfig.defaults`.
- Increase HTTP Socket Limit: Set `config.max_open_sockets = 24`. This deliberately reserves 8 of the 32 LwIP sockets for LwIP internals and outbound connections, guaranteeing the network stack always has headroom to accept a TCP handshake from a new client.
- Enable Stale Socket Purging: Set `config.lru_purge_enable = true`. This is the critical fix: when the 24-socket limit is reached and a new device attempts to connect, the server intentionally drops the oldest idle keep-alive socket to make room, allowing the new device to load the page seamlessly.
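The configuration above maps to a small change at server startup. The following is a sketch only; the field names (`max_open_sockets`, `lru_purge_enable`) and macros are ESP-IDF's `esp_http_server` API, while the surrounding startup code and handler registration are elided:

```c
#include "esp_http_server.h"

httpd_handle_t server = NULL;
httpd_config_t config = HTTPD_DEFAULT_CONFIG();

config.max_open_sockets = 24;   /* 32 LwIP sockets minus 8 reserved for the stack */
config.lru_purge_enable = true; /* evict the oldest idle keep-alive socket when full */

ESP_ERROR_CHECK(httpd_start(&server, &config));
/* ... register URI handlers here ... */
```

The matching `sdkconfig.defaults` entry is `CONFIG_LWIP_MAX_SOCKETS=32`.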
3.2 Backend Scratch Buffer Pooling
To safely support multiplexed file serving without heavy malloc/free overhead on every request, we will replace the single shared scratch buffer with a Static Shared Buffer Pool:
- We allocate a global struct with a fixed array of `MAX_SCRATCH_BUFFERS = 10` slots.
- When `static_file_handler` begins, it requests an available chunk from the pool, allocating a 4 KB chunk on the heap only the first time that slot is used.
- When the handler finishes, the chunk is marked as available, yielding it for the next request.
- This provides isolation between up to 10 actively transmitting connections while minimizing heap fragmentation compared to per-request `malloc`s.
3.3 Frontend Safety (Loading Spinner)
Even with backend improvements, network latency or heavy load might cause delays. We will implement a global request tracker to improve perceived performance:
- A new Svelte writable store, `pendingRequests`, will track the count of active API calls.
- `api.js` will wrap the native `fetch` in a `trackedFetch` function that increments/decrements this store.
- A new `<Spinner />` component will be rendered at the root (`App.svelte`). It will overlay the screen when `$pendingRequests > 0`, optionally with a small delay (e.g., 300 ms) to prevent flashing on fast requests.
4. Design Decisions & Trade-offs
| Approach | Pros | Cons | Decision |
|---|---|---|---|
| True Multi-Threading (Multiple Worker Tasks) | Can process files fully in parallel on both cores. | High memory overhead for stack space per task; over-engineered for simple static file serving. | Rejected. Relying on the event loop's multiplexing is sufficient for local network use cases. |
| Per-Request `malloc`/`free` | Simplest way to isolate scratch buffers. | High heap fragmentation risk; computationally expensive on every HTTP request. | Rejected. |
| Fixed Pool (10 buffers) | Low overhead; memory footprint only grows organically to the maximum concurrent need limit (10 * 4KB = 40KB) and stabilizes. | Strict limit on how many connections can be actively transmitting data at the exact same millisecond. | Selected. Best balance of performance and memory safety. |
5. Potential Future Improvements
- If the scratch buffer pool grows too large during an unexpected spike, we could implement a cleanup routine that periodically frees idle chunks back down to a baseline size when the server is idle.
- If true parallel processing is needed later, the HTTP server's `config.core_id` and `async_workers` could be utilized, but this requires ensuring all API handlers are perfectly thread-safe.