Concurrent Requests Support for ESP32-S3 Provider

Authored by: Antigravity
Date: 2026-03-08

1. Goal (What)

Enable the ESP32-S3 HTTP server to handle multiple concurrent web clients gracefully. Currently, a single browser can occupy nearly all available server sockets with "keep-alive" connections, and static_file_handler depends on a single shared scratch buffer. The goal is to let 5-10 clients (e.g., PCs, tablets, phones) on the local network open the dashboard simultaneously without hanging, and to add a frontend safeguard (a loading spinner) that improves the user experience during slow network responses.

2. Rationale (Why)

The default ESP-IDF HTTP server (esp_http_server) is configured for minimal resource usage:

max_open_sockets = 7: A single modern browser tries to open up to 6 connections simultaneously to fetch HTML, CSS, JS, and API data.
lru_purge_enable = false: When a browser finishes loading, it keeps those 6 sockets open (Keep-Alive) for future requests. If a second device tries to connect, it is rejected because the server has no free sockets, even though the first device's sockets are idle.

Furthermore, the current static_file_handler relies on a single shared rest_context->scratch buffer allocated globally. If the server is modified to multiplex handlers concurrently, this shared buffer would be overwritten by competing requests, causing data corruption in the served files.

3. Chosen Approach (How)

3.1 Backend Configuration (ESP-IDF)

Instead of implementing complex multi-threading (spawning multiple FreeRTOS worker tasks), we will leverage the HTTP server's built-in event loop multiplexing by tuning its configuration:

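Increase Socket Limit: Set config.max_open_sockets = 10 to provide more headroom for initial connections. The value cannot exceed CONFIG_LWIP_MAX_SOCKETS minus the 3 sockets the HTTP server reserves for internal use, so CONFIG_LWIP_MAX_SOCKETS must be raised to at least 13 for this setting to be accepted.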
Enable Stale Socket Purging: Set config.lru_purge_enable = true. This is the critical fix. When the socket limit is reached and a new device attempts to connect, the server closes the least recently used connection to make room, so the new device can load the page.
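
The two settings above could be applied roughly as follows. This is a minimal sketch, not the project's actual startup code: the function name start_dashboard_server and the log tag are hypothetical, and it assumes CONFIG_LWIP_MAX_SOCKETS has already been raised in menuconfig so that 10 client sockets plus the server's internal sockets fit.

```c
#include "esp_err.h"
#include "esp_http_server.h"
#include "esp_log.h"

static const char *TAG = "http_cfg"; /* hypothetical log tag */

/* Hypothetical helper: start the HTTP server with the tuned socket settings. */
httpd_handle_t start_dashboard_server(void)
{
    httpd_config_t config = HTTPD_DEFAULT_CONFIG();

    /* More headroom for browsers that open several connections at once.
     * Assumes CONFIG_LWIP_MAX_SOCKETS is large enough for this value. */
    config.max_open_sockets = 10;

    /* When the limit is hit, close the least recently used connection
     * instead of rejecting the new client. */
    config.lru_purge_enable = true;

    httpd_handle_t server = NULL;
    esp_err_t err = httpd_start(&server, &config);
    if (err != ESP_OK) {
        ESP_LOGE(TAG, "httpd_start failed: %s", esp_err_to_name(err));
        return NULL;
    }
    return server;
}
```

If httpd_start returns an error with these values, the socket limit in the lwIP configuration is the first thing to check.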

3.2 Backend Scratch Buffer Pooling

To safely support multiplexed file serving without heavy malloc/free overhead on every request, we will replace the single shared scratch buffer with a dynamically growing Shared Buffer Pool:

We will allocate a global pool of scratch memory chunks.
When static_file_handler begins, it will request an available chunk from the pool.
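If all chunks are currently in use by other concurrent requests, the pool will use realloc to expand its capacity and create a new chunk. Chunks already handed out must not move when the pool grows, so only the pool's bookkeeping table should be reallocated, not the buffers themselves.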
When the handler finishes, the chunk is marked as available, yielding it for the next request.
This provides isolation between concurrent connections while minimizing heap fragmentation compared to per-request mallocs.
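
The pool behaviour described above can be sketched in portable C. All names here (scratch_pool_t, scratch_pool_acquire, scratch_pool_release, SCRATCH_CHUNK_SIZE) and the 4096-byte chunk size are assumptions for illustration, not the project's actual code. The sketch grows a table of slots with realloc and allocates each chunk separately, so buffers already handed out stay at the same address when the pool expands. It contains no locking; if handlers ever run on more than one task, acquire and release would need to be guarded by a mutex.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SCRATCH_CHUNK_SIZE 4096 /* assumed chunk size */

/* One pool entry: a separately allocated buffer plus its usage flag. */
typedef struct {
    char *buf;
    bool in_use;
} scratch_slot_t;

/* The pool is a growable table of slots; zero-initialize before first use. */
typedef struct {
    scratch_slot_t *slots;
    size_t count;
} scratch_pool_t;

/* Return a free chunk, growing the pool by one chunk if none is free.
 * Returns NULL if memory cannot be allocated. */
char *scratch_pool_acquire(scratch_pool_t *pool)
{
    for (size_t i = 0; i < pool->count; i++) {
        if (!pool->slots[i].in_use) {
            pool->slots[i].in_use = true;
            return pool->slots[i].buf;
        }
    }

    char *buf = malloc(SCRATCH_CHUNK_SIZE);
    if (buf == NULL) {
        return NULL;
    }

    /* Only the slot table is reallocated; existing chunks do not move. */
    scratch_slot_t *grown =
        realloc(pool->slots, (pool->count + 1) * sizeof(scratch_slot_t));
    if (grown == NULL) {
        free(buf);
        return NULL;
    }

    pool->slots = grown;
    pool->slots[pool->count].buf = buf;
    pool->slots[pool->count].in_use = true;
    pool->count++;
    return buf;
}

/* Mark a previously acquired chunk as available again. */
void scratch_pool_release(scratch_pool_t *pool, char *buf)
{
    for (size_t i = 0; i < pool->count; i++) {
        if (pool->slots[i].buf == buf) {
            pool->slots[i].in_use = false;
            return;
        }
    }
}
```

In a handler, the pattern would be to acquire a chunk at the start, fall back to an error response if NULL is returned, and release the chunk on every exit path.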

3.3 Frontend Safety (Loading Spinner)

Even with backend improvements, network latency or heavy load might cause delays. We will implement a global request tracker to improve perceived performance:

A new Svelte writable store pendingRequests will track the count of active API calls.
api.js will wrap the native fetch in a trackedFetch function that increments/decrements this store.
A new <Spinner /> component will be rendered at the root (App.svelte). It will overlay the screen when $pendingRequests > 0, optionally with a small delay (e.g., 300ms) to prevent flashing on fast requests.

4. Design Decisions & Trade-offs

| Approach | Pros | Cons | Decision |
| --- | --- | --- | --- |
| True Multi-Threading (Multiple Worker Tasks) | Can process files fully in parallel on both cores. | High memory overhead for stack space per task; over-engineered for simple static file serving. | Rejected. Relying on the event loop's multiplexing is sufficient for local network use cases. |
| Per-Request `malloc` / `free` | Simplest way to isolate scratch buffers. | High heap fragmentation risk; computationally expensive on every HTTP request. | Rejected. |
| Dynamically Resizing Pool (`realloc`) | Low overhead; memory footprint only grows organically to the maximum concurrent need and stabilizes. | Slightly more complex to implement the pool state management. | Selected. Best balance of performance and memory safety. |

5. Potential Future Improvements

If the realloc pool grows too large during an unexpected spike, we could implement a cleanup routine that periodically shrinks the pool back to a baseline size when the server is idle.
If true parallel processing is needed later, the HTTP server's config.core_id and async_workers could be used, but this requires every API handler to be thread-safe.
