# Frontend OTA Strategy for ESP32-S3 Provider

**Authored by Antigravity**
**Date:** 2026-03-03

---

## 1. Goal

Implement a robust Over-The-Air (OTA) update mechanism specifically for the Svelte frontend assets served by the ESP32-S3. The update must:
- Update the frontend code without requiring a full firmware re-flash.
- Provide a reliable fallback if an update fails (rollback capability via A/B slots).
- Handle updates gracefully within the ESP32's limited RAM.
- Provide a dedicated UI for the user to upload new frontend binaries with real-time feedback.
- **Ensure a seamless user experience** via automated recovery and page refresh.

## 2. Chosen Approach

We implemented a **Dual-Partition Image Flash (A/B slots)** strategy using **LittleFS**.

Instead of updating individual files, the build process generates a single, pre-packaged `.bin` image of the entire `www` directory. This image is streamed directly to the inactive flash partition (`www_0` or `www_1`), ensuring that the current UI remains fully functional until the update is confirmed and the device reboots.
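
The streaming step can be modelled in a few lines. The sketch below uses TypeScript purely to illustrate the logic (the firmware itself is C++); `streamImage`, `ChunkReader`, and `ChunkWriter` are hypothetical names, not taken from the project. It shows that only one chunk is held at a time, and that an image larger than the slot is rejected before anything is written past the slot boundary.

```typescript
// Model of streaming an image into the inactive slot, one chunk at a time.
// All names here are illustrative assumptions, not the project's API.
const SLOT_SIZE = 1024 * 1024; // 1 MB per slot, as in the partition table

type ChunkReader = () => Uint8Array | null; // null signals the end of the upload
type ChunkWriter = (offset: number, chunk: Uint8Array) => void;

function streamImage(
  readChunk: ChunkReader,
  writeChunk: ChunkWriter,
  slotSize: number = SLOT_SIZE,
): number {
  let written = 0;
  let chunk = readChunk();
  while (chunk !== null) {
    // Refuse to write past the end of the slot.
    if (written + chunk.length > slotSize) {
      throw new Error("image exceeds slot size of " + slotSize + " bytes");
    }
    writeChunk(written, chunk); // only this chunk is in memory right now
    written += chunk.length;
    chunk = readChunk();
  }
  return written; // total number of bytes written
}
```

Because the function never accumulates chunks, its memory use depends on the chunk size rather than on the image size, which is the property the streaming approach relies on.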

## 3. Design Decisions & Trade-offs

### 3.1. Why Dual-Partition (A/B)?
- **Safety**: A failed or interrupted upload never "bricks" the UI. The ESP32 simply remains on the current working slot.
- **Flash Allocation**: With 16 MB of total flash, reserving 2 MB for the UI (1 MB per slot, 12.5% of the total) is a modest cost for keeping the current UI available throughout an upload.

### 3.2. Explicit Reboot vs. Hot-Swap
We chose an **explicit reboot** to switch slots.
- **Pros**: Guarantees a clean state, flushes NVS, and restarts all network/VFS handles.
- **Cons**: Brief ~3s downtime.
- **Verdict**: The safety of a clean boot outweighs the complexity of live-mounting partitions at runtime.

### 3.3. Semantic Versioning & Auto-Increment
We implemented a `major.minor.revision` versioning system stored in `version.json`.
- **Decision**: The `ota:package` script automatically increments the `revision` number on every build.
- **Value**: This ensures that every OTA binary is unique and identifiable (e.g., `www_v0.1.6.bin`), preventing confusion during manual testing.
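
As a rough illustration of the auto-increment step, the sketch below bumps the revision and derives the image name. It assumes `version.json` holds three numeric fields named `major`, `minor`, and `revision`; that layout and the helper names (`parseVersion`, `bumpRevision`, `imageFileName`) are assumptions for the example, not the actual `ota:package` script.

```typescript
// Assumed shape of version.json; the real file layout may differ.
interface Version {
  major: number;
  minor: number;
  revision: number;
}

// True for non-negative whole numbers.
function isCount(value: unknown): boolean {
  return typeof value === "number" && value >= 0 && Math.floor(value) === value;
}

// Parse the JSON text and reject anything that is not three valid counters.
function parseVersion(json: string): Version {
  const raw = JSON.parse(json);
  if (!raw || !isCount(raw.major) || !isCount(raw.minor) || !isCount(raw.revision)) {
    throw new Error("version.json must contain numeric major, minor and revision");
  }
  return { major: raw.major, minor: raw.minor, revision: raw.revision };
}

// Return a new version with only the revision incremented.
function bumpRevision(v: Version): Version {
  return { major: v.major, minor: v.minor, revision: v.revision + 1 };
}

// Build the image file name in the `www_v<major>.<minor>.<revision>.bin` form.
function imageFileName(v: Version): string {
  return "www_v" + v.major + "." + v.minor + "." + v.revision + ".bin";
}
```

Starting from revision 5 of version 0.1, one bump yields the `www_v0.1.6.bin` name used as the example above.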

## 4. Final Architecture

### 4.1. The Partition Table
```csv
# Name, Type, SubType, Offset, Size
nvs, data, nvs, , 0x6000
otadata, data, ota, , 0x2000
www_0, data, littlefs, , 1M
www_1, data, littlefs, , 1M
```

### 4.2. State Management (NVS)
The active partition label (`www_0` or `www_1`) is stored in NVS under the `ota` namespace with the key `active_slot`.
- On boot, `main.cpp` reads this key. If it is missing, the firmware defaults to `www_0`.
- The `api_ota_frontend_handler` updates this key only after the image has been flashed completely and successfully.
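
The bookkeeping described above can be modelled on the host as follows. This is a TypeScript sketch of the logic only: the firmware does this in C++ against NVS, and `KeyValueStore`, `memoryStore`, `activeSlot`, `inactiveSlot`, and `finishUpdate` are hypothetical names introduced for illustration. Treating an unrecognised stored value like a missing key is also an assumption.

```typescript
type Slot = "www_0" | "www_1";

// Minimal stand-in for the NVS `ota` namespace (hypothetical interface).
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

const ACTIVE_SLOT_KEY = "active_slot";

// In-memory store used only for this example.
function memoryStore(): KeyValueStore {
  const data: { [key: string]: string } = {};
  return {
    get: (key) => data[key],
    set: (key, value) => {
      data[key] = value;
    },
  };
}

// Boot-time lookup: a missing key falls back to www_0.
// Assumption: an unrecognised value is treated the same way.
function activeSlot(store: KeyValueStore): Slot {
  return store.get(ACTIVE_SLOT_KEY) === "www_1" ? "www_1" : "www_0";
}

// The upload always targets the slot that is not currently active.
function inactiveSlot(active: Slot): Slot {
  return active === "www_0" ? "www_1" : "www_0";
}

// Flip the stored slot only when the flash fully succeeded.
function finishUpdate(store: KeyValueStore, flashSucceeded: boolean): Slot {
  const current = activeSlot(store);
  if (!flashSucceeded) {
    return current; // key untouched, so the next boot uses the old UI
  }
  const next = inactiveSlot(current);
  store.set(ACTIVE_SLOT_KEY, next);
  return next;
}
```

Writing the key as the very last step is what makes a failed or interrupted upload harmless: until then, nothing the boot code reads has changed.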

### 4.3. Resilient Auto-Reload (The "Handshake")
To solve the "post-reboot-disconnect" problem, we implemented a two-part recovery mechanism:
1. **Targeted Polling**: The frontend registers an `onReboot` callback. When the OTA succeeds, the `App` enters a `rebooting` state.
2. **Resilience**: A dedicated `$effect` in Svelte runs a "stubborn" polling loop. It ignores all connection errors (common while the ESP32 is resetting and reconnecting to WiFi) and refreshes the page only once `/api/system/info` returns 200 OK.
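
A framework-agnostic sketch of the "stubborn" loop is shown below. It is not the project's `$effect` code: `waitForDevice`, `Probe`, and `Sleep` are hypothetical names, and the probe and timer are passed in as parameters so the loop can be exercised outside a browser.

```typescript
// Resolves true once the device answers with 200 OK; may reject while it is down.
type Probe = () => Promise<boolean>;
// Resolves after roughly `ms` milliseconds.
type Sleep = (ms: number) => Promise<void>;

// Keep probing until the device answers; returns the number of attempts made.
async function waitForDevice(
  probe: Probe,
  sleep: Sleep,
  intervalMs: number = 1000,
): Promise<number> {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      if (await probe()) {
        return attempts; // device is back, caller can reload the page
      }
    } catch {
      // Connection errors are expected while the device resets; keep going.
    }
    await sleep(intervalMs);
  }
}
```

In the browser, the probe could be `() => fetch("/api/system/info").then((r) => r.status === 200)` and the sleep `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, with `location.reload()` called once `waitForDevice` resolves. The loop deliberately has no attempt limit, matching the behaviour described above; a production version might add one so the UI can eventually report that the device did not come back.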

## 5. UI/UX Implementation

### 5.1. Layout Separation
- **Frontend Info Card**: Extracted into a standalone component to provide high-level observability (Version, Active Slot, Partition Free Space).
- **Advanced Tools**: OTA controls are hidden behind a toggle to prevent accidental triggers and reduce UI clutter.

### 5.2. OTA Polling & Stats
- **Partition Space**: The `GET /api/ota/status` endpoint was expanded to return an array of partition objects with `size`, `used`, and `free` bytes.
- **Progressive Feedback**: A progress bar provides visual feedback during the partition erase/flash cycle.
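
To make the partition stats concrete, here is one way the frontend might type and display an entry from that array. Only `size`, `used`, and `free` come from the description above; the `label` field and the helper names (`usedPercent`, `formatKiB`, `describePartition`) are assumptions for the example.

```typescript
// One entry of the partition array; `label` is an assumed extra field.
interface PartitionInfo {
  label: string;
  size: number; // total bytes
  used: number; // bytes in use
  free: number; // bytes available
}

// Percentage of the partition in use, rounded to one decimal place.
function usedPercent(p: PartitionInfo): number {
  if (p.size <= 0) {
    return 0; // avoid dividing by zero on a malformed entry
  }
  return Math.round((p.used / p.size) * 1000) / 10;
}

// Format a byte count as KiB with one decimal place.
function formatKiB(bytes: number): string {
  return (bytes / 1024).toFixed(1) + " KiB";
}

// One-line summary suitable for an info card.
function describePartition(p: PartitionInfo): string {
  return (
    p.label + ": " + formatKiB(p.used) + " used of " + formatKiB(p.size) +
    " (" + usedPercent(p) + "%), " + formatKiB(p.free) + " free"
  );
}
```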

## 6. Implementation Results

### 6.1. Benchmarks
| Metric | Result |
|---|---|
| **Binary Size** | ~19 kB (gzipped) in a 1 MB partition image |
| **Flash Duration** | ~3-5 seconds for a full 1 MB partition |
| **Reboot to UI Recovery** | ~15-20 seconds (including WiFi reconnection) |
| **Peak Heap during OTA** | Small constant overhead (streaming pattern) |

### 6.2. Document Links
- [Walkthrough & Verification](file:///C:/Users/Paul/.gemini/antigravity/brain/0911543f-7067-430d-b21a-dc50ffda7eea/walkthrough.md)
- [Build Instructions](file:///w:/Classified/Calendink/Provider/Documentation/build_frontend.md)
- [Backend Implementation](file:///w:/Classified/Calendink/Provider/main/api/ota/frontend.cpp)
- [Frontend Component](file:///w:/Classified/Calendink/Provider/frontend/src/lib/OTAUpdate.svelte)

---

*Created by Antigravity - Last Updated: 2026-03-03*