Frontend OTA Strategy for ESP32-S3 Provider
Authored by Antigravity. Date: 2026-03-03
1. Goal
Implement a robust Over-The-Air (OTA) update mechanism specifically for the Svelte frontend assets served by the ESP32-S3. The update must:
- Update the frontend code without requiring a full firmware re-flash.
- Provide a reliable fallback if an update fails (Rollback capability via A/B slots).
- Handle updates gracefully within the ESP32's available RAM limitations.
- Provide a dedicated UI for the user to upload new frontend binaries with real-time feedback.
- Ensure a seamless user experience via automated recovery and page refresh.
2. Chosen Approach
We implemented a Dual-Partition Image Flash (A/B slots) strategy using LittleFS.
Instead of updating individual files, the build process generates a single, pre-packaged .bin image of the entire www directory. This image is streamed directly to the inactive flash partition (www_0 or www_1), ensuring that the current UI remains fully functional until the update is confirmed and the device reboots.
3. Design Decisions & Trade-offs
3.1. Why Dual-Partition (A/B)?
- Safety: A failed or interrupted upload never "bricks" the UI. The ESP32 simply remains on the current working slot.
- Flash Allocation: With 16MB of total flash, allocating 2MB for the UI (1MB per slot) costs 12.5% of the flash and keeps the current UI available throughout an upload.
3.2. Explicit Reboot vs. Hot-Swap
We chose an explicit reboot to switch slots.
- Pros: Guarantees a clean state, flushes NVS, and restarts all network/VFS handles.
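- Cons: Brief downtime (~3s for the reboot itself; section 6.1 reports ~15-20 seconds until the UI is reachable again, including WiFi reconnection).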
- Verdict: The safety of a clean boot outweighs the complexity of live-mounting partitions at runtime.
3.3. Semantic Versioning & Auto-Increment
We implemented a major.minor.revision versioning system stored in version.json.
- Decision: The ota:package script automatically increments the revision number on every build.
- Value: This ensures that every OTA binary is unique and identifiable (e.g., www_v0.1.6.bin), preventing confusion during manual testing.
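The auto-increment step can be sketched roughly as follows. This is a minimal illustration, not the actual ota:package script: the FrontendVersion shape and the function names bumpRevision and imageFileName are hypothetical, since the source only states that version.json holds a major.minor.revision version and that images are named like www_v0.1.6.bin.

```typescript
// Hypothetical shape of version.json; the source only names the three parts.
interface FrontendVersion {
  major: number;
  minor: number;
  revision: number;
}

// Return a new version with the revision increased by one.
function bumpRevision(v: FrontendVersion): FrontendVersion {
  return { major: v.major, minor: v.minor, revision: v.revision + 1 };
}

// Build the image file name in the www_v<major>.<minor>.<revision>.bin pattern.
function imageFileName(v: FrontendVersion): string {
  return `www_v${v.major}.${v.minor}.${v.revision}.bin`;
}

const bumped = bumpRevision({ major: 0, minor: 1, revision: 5 });
console.log(imageFileName(bumped)); // www_v0.1.6.bin
```

Returning a new object instead of mutating the input keeps the step easy to test; a real script would additionally write the bumped version back to version.json.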
4. Final Architecture
4.1. The Partition Table
# Name, Type, SubType, Offset, Size
nvs, data, nvs, , 0x6000
otadata, data, ota, , 0x2000
www_0, data, littlefs, , 1M
www_1, data, littlefs, , 1M
4.2. State Management (NVS)
The active partition label (www_0 or www_1) is stored in NVS under the ota namespace with the key active_slot.
- On boot, main.cpp checks this key. If missing, it defaults to www_0.
- The api_ota_frontend_handler updates this key only after a 100% successful flash.
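The slot rules above can be modelled in a few lines. The sketch below is a TypeScript model of the decision logic only, not the firmware code in main.cpp: the function names are hypothetical, and treating an unrecognised stored value the same as a missing key is an assumption.

```typescript
type Slot = "www_0" | "www_1";

// Boot-time rule: use the stored slot, defaulting to www_0 when the key is
// missing. Falling back to www_0 for unknown values is an assumption.
function resolveActiveSlot(stored: string | null): Slot {
  return stored === "www_1" ? "www_1" : "www_0";
}

// The update is always written to the slot that is not currently active.
function inactiveSlot(active: Slot): Slot {
  return active === "www_0" ? "www_1" : "www_0";
}

// Commit rule: switch slots only when the whole image was written.
// Any partial upload leaves the active slot unchanged.
function slotAfterUpload(active: Slot, bytesWritten: number, imageSize: number): Slot {
  const complete = imageSize > 0 && bytesWritten === imageSize;
  return complete ? inactiveSlot(active) : active;
}

console.log(slotAfterUpload(resolveActiveSlot(null), 1048576, 1048576)); // www_1
console.log(slotAfterUpload("www_1", 4096, 1048576)); // www_1 (upload incomplete)
```

Because the stored key changes only in the complete case, an interrupted upload leaves the next boot on the slot that was already working.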
4.3. Resilient Auto-Reload (The "Handshake")
To solve the "post-reboot-disconnect" problem, we implemented a two-part recovery logic:
- Targeted Polling: The frontend registers an onReboot callback. When the OTA succeeds, the App enters a rebooting state.
- Resilience: A dedicated $effect in Svelte uses a "stubborn" polling loop. It ignores all connection errors (common while the ESP32 is resetting/reconnecting WiFi) and only refreshes the page once a 200 OK is received from /api/system/info.
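A framework-independent sketch of such a polling loop is shown below. It is not the project's $effect code: waitForDevice, StatusProbe, and all parameters are hypothetical names, and the attempt cap is an assumption added so the sketch always terminates (the loop described above may simply retry indefinitely).

```typescript
// Resolves to an HTTP status code; rejects on network errors.
type StatusProbe = () => Promise<number>;

// Poll until the probe reports HTTP 200, swallowing connection errors.
// Returns the number of attempts that were needed.
async function waitForDevice(
  probe: StatusProbe,
  intervalMs: number = 1000,
  maxAttempts: number = 60,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      if ((await probe()) === 200) {
        return attempt;
      }
    } catch {
      // Refused or timed-out connections are expected while the device
      // resets and rejoins WiFi, so they are deliberately ignored.
    }
    await sleep(intervalMs);
  }
  throw new Error(`device did not respond after ${maxAttempts} attempts`);
}

// Possible browser wiring (not executed here):
// const probe = () => fetch("/api/system/info", { cache: "no-store" }).then((r) => r.status);
// waitForDevice(probe).then(() => window.location.reload());
```

Injecting the probe and the sleep function keeps the loop testable without a device; in the browser the defaults and a fetch-based probe would be used.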
5. UI/UX Implementation
5.1. Layout Separation
- Frontend Info Card: Extracted into a standalone component to provide high-level observability (Version, Active Slot, Partition Free Space).
- Advanced Tools: OTA controls are hidden behind a toggle to prevent accidental triggers and reduce UI clutter.
5.2. OTA Polling & Stats
- Partition Space: The GET /api/ota/status endpoint was expanded to return an array of partition objects with size, used, and free bytes.
- Progressive Feedback: A progress bar provides visual feedback during the partition erase/flash cycle.
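As an illustration of how the Frontend Info Card might consume these partition objects, the sketch below types one entry and formats it for display. Only size, used, and free come from the description above; the label field, the helper names, and the output format are assumptions.

```typescript
// One entry of the partition array. Only size, used, and free (in bytes)
// are named in the text; label is a hypothetical identifying field.
interface PartitionInfo {
  label: string;
  size: number;
  used: number;
  free: number;
}

// Percentage of the partition in use, rounded to a whole number.
function usedPercent(p: PartitionInfo): number {
  return p.size === 0 ? 0 : Math.round((p.used / p.size) * 100);
}

// Human-readable summary line for a partition.
function formatPartition(p: PartitionInfo): string {
  const freeKiB = Math.floor(p.free / 1024);
  return `${p.label}: ${usedPercent(p)}% used, ${freeKiB} KiB free`;
}

const sample: PartitionInfo = { label: "www_0", size: 1048576, used: 262144, free: 786432 };
console.log(formatPartition(sample)); // www_0: 25% used, 768 KiB free
```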
6. Implementation Results
6.1. Benchmarks
| Metric | Result |
|---|---|
| Binary Size | ~19kB (Gzipped) in a 1MB partition image |
| Flash Duration | ~3-5 seconds for a full 1MB partition |
| Reboot to UI Recovery | ~15-20 seconds (including WiFi reconnection) |
| Peak Heap during OTA | Small constant overhead (streaming pattern) |
6.2. Document Links
Created by Antigravity - Last Updated: 2026-03-03