From ad65bf520b64c6c3eafa6508b6ce37598615b5a8 Mon Sep 17 00:00:00 2001
From: Patedam
Date: Tue, 3 Mar 2026 21:10:36 -0500
Subject: [PATCH] tdd for firmware ota

---
 Provider/tdd/firmware_ota.md | 72 ++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)
 create mode 100644 Provider/tdd/firmware_ota.md

diff --git a/Provider/tdd/firmware_ota.md b/Provider/tdd/firmware_ota.md
new file mode 100644
index 0000000..cceb184
--- /dev/null
+++ b/Provider/tdd/firmware_ota.md
@@ -0,0 +1,72 @@
+# Firmware OTA Strategy for ESP32-S3 Provider
+
+**Authored by Antigravity**
+**Date:** 2026-03-03
+
+---
+
+## 1. Goal
+
+Implement a robust Over-The-Air (OTA) update mechanism specifically for the main firmware of the ESP32-S3. The update must:
+- Update the core application logic without requiring a physical USB connection.
+- Provide a reliable fallback if an update fails (Rollback capability via A/B slots).
+- Provide a permanent "factory" fallback as an extreme safety measure.
+- Integrate seamlessly with the existing Svelte frontend UI for a push-based update experience.
+- Maintain a clear versioning scheme visible to the user.
+
+## 2. Chosen Approach
+
+We implemented a **Dual-Partition Image Flash (A/B slots) with Factory Fallback** strategy using ESP-IDF's native OTA mechanisms.
+
+The build process generates a single `.bin` firmware image. This image is uploaded via the frontend UI and streamed directly to the inactive OTA flash partition (`ota_0` or `ota_1`). Upon successful transfer and validation, the bootloader is instructed to boot from the new partition on the next restart.
+
+## 3. Design Decisions & Trade-offs
+
+### 3.1. Why Dual-Partition (A/B) with Factory?
+- **Safety**: A failed or interrupted upload never "bricks" the device.
+- **Factory Fallback**: By maintaining a dedicated 2MB `factory` partition alongside the two 2MB OTA partitions (`ota_0`, `ota_1`), we ensure that even if both OTA slots are irrecoverably corrupted, the device can always boot into a known-good state. This requires an initial USB flash to set up but provides maximum long-term reliability.
+- **Storage Allocation**: With 16MB of total flash on the ESP32-S3, dedicating 6MB to application code (3x 2MB) is a worthwhile trade-off for extreme resilience, while still leaving ample room for the frontend (`www_0`, `www_1`) and NVS.
+
+### 3.2. Automatic App Rollback
+We rely on ESP-IDF's built-in "App Rollback" feature.
+- **The Mechanism**: When the ESP32 boots a newly OTA-flashed firmware, the image is marked as "Pending Verify". If the application crashes, resets, or fails to explicitly mark itself as "valid" during this initial boot, the bootloader automatically reverts to the previous working partition on the subsequent boot.
+- **Validation Point**: We consider the firmware "valid" (and cancel the pending rollback) only after it successfully establishes a network connection (Ethernet or WiFi). This ensures that a bad update won't permanently disconnect the device from future OTA attempts.
+
+### 3.3. Push vs. Pull Updates
+- **Decision**: We implemented a "Push" mechanism where the user manually uploads the `.bin` file via the web UI.
+- **Rationale**: This matches the existing frontend OTA workflow and is simpler to implement initially. It avoids the need for external update servers, manifest files, and polling mechanisms. A "Pull" mechanism can be added later if fleet management becomes a requirement.
+
+### 3.4. Versioning Strategy
+- **Decision**: We extract the firmware version directly from the `esp_app_desc_t` structure that ESP-IDF embeds in the firmware image.
+- **Rationale**: This ensures the version reported by the API (`GET /api/system/info`) is exactly the version compiled by CMake (`PROJECT_VER`), eliminating the risk of manual mismatches or external version files getting out of sync.
+
+## 4. Final Architecture
+
+### 4.1. The Partition Table
+```csv
+# Name, Type, SubType, Offset, Size
+nvs, data, nvs, 0x9000, 0x6000
+otadata, data, ota, , 0x2000
+phy_init, data, phy, , 0x1000
+factory, app, factory, , 2M
+ota_0, app, ota_0, , 2M
+ota_1, app, ota_1, , 2M
+www_0, data, littlefs, , 1M
+www_1, data, littlefs, , 1M
+```
+
+### 4.2. Backend Components
+- `main/api/ota/firmware.cpp`: The endpoint (`POST /api/ota/firmware`) handling the streaming ingestion of the `.bin` file using standard ESP-IDF `esp_ota` functions.
+- `main/api/system/system.cpp`: The endpoint querying `esp_app_get_description()` to expose the unified version payload to the frontend.
+- `main/main.cpp`: The orchestrator that calls `esp_ota_mark_app_valid_cancel_rollback()` post-network connection.
+
+### 4.3. UI/UX Implementation
+- The Frontend OTA update component (`OTAUpdate.svelte`) is expanded to include a parallel "Firmware Update" section.
+- This UI section handles file selection, upload progress visualization, and system reboot confirmation, providing parity with the existing frontend update UX.
+
+## 5. Summary
+
+We use **ESP-IDF's native OTA APIs** with a **Factory + Dual A/B Partition** layout for maximum reliability. The system leverages **Automatic App Rollback** to prevent network lockouts from bad firmware. Versioning is natively controlled via the **CMake build descriptions**, and the entire update process is driven centrally from the **Svelte Frontend UI** via a Push-based REST endpoint.
+
+---
+*Created by Antigravity - Last Updated: 2026-03-03*
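
Review note on the partition table in section 4.1 and the storage claim in section 3.1: the blank offsets can be resolved by hand to confirm that the layout fits in the 16MB flash. The following is a minimal host-side sketch, not part of the patch. The `Part` struct and the helper names are hypothetical, and the alignment rule (0x10000 for `app` entries, 0x1000 for `data` entries, each blank offset placed after the previous entry) is an assumption about how the ESP-IDF partition tool fills in blank offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical row type for this sketch; it is not an ESP-IDF type.
struct Part {
    const char* name;
    bool is_app;           // assumption: app partitions are aligned to 0x10000
    std::uint32_t offset;  // 0 stands for a blank (auto-assigned) offset
    std::uint32_t size;
};

constexpr std::uint32_t MiB = 1024u * 1024u;
constexpr std::uint32_t kFlashSize = 16u * MiB;  // 16MB flash, per section 3.1

// Rows copied from the table in section 4.1.
constexpr Part kTable[] = {
    {"nvs",      false, 0x9000, 0x6000},
    {"otadata",  false, 0,      0x2000},
    {"phy_init", false, 0,      0x1000},
    {"factory",  true,  0,      2 * MiB},
    {"ota_0",    true,  0,      2 * MiB},
    {"ota_1",    true,  0,      2 * MiB},
    {"www_0",    false, 0,      1 * MiB},
    {"www_1",    false, 0,      1 * MiB},
};
constexpr std::size_t kCount = sizeof(kTable) / sizeof(kTable[0]);

// Round v up to the next multiple of a.
constexpr std::uint32_t align_up(std::uint32_t v, std::uint32_t a) {
    return (v + a - 1) / a * a;
}

// Start offset of entry `index`, walking the table from the top.
constexpr std::uint32_t resolved_offset(std::size_t index) {
    std::uint32_t cursor = 0;
    std::uint32_t start = 0;
    for (std::size_t i = 0; i <= index; ++i) {
        const Part& p = kTable[i];
        start = (p.offset != 0)
                    ? p.offset
                    : align_up(cursor, p.is_app ? 0x10000u : 0x1000u);
        cursor = start + p.size;
    }
    return start;
}

// First byte past the last partition.
constexpr std::uint32_t layout_end() {
    return resolved_offset(kCount - 1) + kTable[kCount - 1].size;
}

// Compile-time check that the layout fits in flash.
static_assert(layout_end() <= kFlashSize, "partition table exceeds flash");
```

Under these assumptions `factory` starts at 0x20000 and the layout ends at 0x820000, so a little under 8MB of the 16MB flash remains unallocated.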