Firmware OTA Strategy for ESP32-S3 Provider
Authored by Antigravity Date: 2026-03-03
1. Goal
Implement a robust Over-The-Air (OTA) update mechanism specifically for the main firmware of the ESP32-S3. The update must:
- Update the core application logic without requiring a physical USB connection.
- Provide a reliable fallback if an update fails (Rollback capability via A/B slots).
- Provide a permanent "factory" fallback as an extreme safety measure.
- Integrate seamlessly with the existing Svelte frontend UI for a push-based update experience.
- Maintain a clear versioning scheme visible to the user.
2. Chosen Approach
We implemented a Dual-Partition Image Flash (A/B slots) with Factory Fallback strategy using ESP-IDF's native OTA mechanisms.
The build process generates a single .bin firmware image. This image is uploaded via the frontend UI and streamed directly to the inactive OTA flash partition (ota_0 or ota_1). Upon successful transfer and validation, the bootloader is instructed to boot from the new partition on the next restart.
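As a rough illustration of how the inactive slot is chosen, the host-side sketch below models the round-robin rule that esp_ota_get_next_update_partition() is documented to follow when called with NULL. The Slot enum and next_update_slot() are hypothetical names used only for this example; they are not part of the firmware.

```cpp
#include <cstdint>

// Host-side model of the three app slots in this design (illustrative names).
enum class Slot : std::uint8_t { Factory, Ota0, Ota1 };

// Round-robin choice of the slot to overwrite, given the slot that is
// currently running. On the device this decision is made by
// esp_ota_get_next_update_partition(NULL); the factory slot is never a target.
constexpr Slot next_update_slot(Slot running) {
    switch (running) {
        case Slot::Ota0:    return Slot::Ota1;
        case Slot::Ota1:    return Slot::Ota0;
        case Slot::Factory: return Slot::Ota0;  // first OTA slot in the table
    }
    return Slot::Ota0;  // not reached for valid enum values
}
```

With the factory image running (the state after the initial USB flash), the first upload lands in ota_0. After that the two OTA slots alternate, and the factory image stays untouched.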
3. Design Decisions & Trade-offs
3.1. Why Dual-Partition (A/B) with Factory?
- Safety: A failed or interrupted upload never "bricks" the device.
- Factory Fallback: By maintaining a dedicated 2MB factory partition alongside the two 2MB OTA partitions (ota_0, ota_1), we ensure that even if both OTA slots are irrecoverably corrupted, the device can always boot into a known-good state. This requires an initial USB flash to set up but provides maximum long-term reliability.
- Storage Allocation: With 16MB of total flash on the ESP32-S3, dedicating 6MB to application code (3x 2MB) is a worthwhile trade-off for extreme resilience, while still leaving ample room for the frontend (www_0, www_1) and NVS.
3.2. Automatic App Rollback
We rely on ESP-IDF's built-in "App Rollback" feature (CONFIG_BOOTLOADER_APP_ROLLBACK_ENABLE).
- The Mechanism: When the bootloader first boots a newly OTA-flashed firmware, it marks that image as "Pending Verify". If the application crashes, resets, or otherwise fails to explicitly mark itself as "valid" during this initial boot, the bootloader automatically reverts to the previous working partition on the subsequent boot.
- Validation Point: We consider the firmware "valid" (and cancel the pending rollback) only after it successfully establishes a network connection (Ethernet or WiFi). This ensures that a bad update won't permanently disconnect the device from future OTA attempts.
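The rollback behaviour can be pictured as a small state machine. The sketch below is a simplified host-side model, not firmware code: ImgState, BootResult, on_reset() and on_network_up() are hypothetical names, and the states stand in for ESP-IDF's ESP_OTA_IMG_NEW, ESP_OTA_IMG_PENDING_VERIFY, ESP_OTA_IMG_VALID and ESP_OTA_IMG_ABORTED.

```cpp
// Simplified model of the per-slot OTA state that ESP-IDF keeps in otadata.
enum class ImgState { New, PendingVerify, Valid, Aborted };

struct BootResult {
    ImgState state;   // state recorded for the slot after the bootloader ran
    bool boots_slot;  // true if the bootloader starts this slot
};

// What the bootloader does when it considers a slot at reset.
constexpr BootResult on_reset(ImgState s) {
    switch (s) {
        // First boot after an OTA: run the image, but on probation.
        case ImgState::New:           return {ImgState::PendingVerify, true};
        // The previous boot never confirmed itself: give up on this slot.
        case ImgState::PendingVerify: return {ImgState::Aborted, false};
        case ImgState::Valid:         return {ImgState::Valid, true};
        case ImgState::Aborted:       return {ImgState::Aborted, false};
    }
    return {ImgState::Aborted, false};  // not reached for valid enum values
}

// Stand-in for calling esp_ota_mark_app_valid_cancel_rollback() once the
// network connection is up.
constexpr ImgState on_network_up(ImgState s) {
    return s == ImgState::PendingVerify ? ImgState::Valid : s;
}
```

In this model, a new image that reaches on_network_up() before the next reset keeps booting, while one that resets first is marked aborted and skipped, which is the point at which the bootloader falls back to the previous slot.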
3.3. Push vs. Pull Updates
- Decision: We implemented a "Push" mechanism where the user manually uploads the .bin file via the web UI.
- Rationale: This matches the existing frontend OTA workflow and is simpler to implement initially. It avoids the need for external update servers, manifest files, and polling mechanisms. A "Pull" mechanism can be added later if fleet management becomes a requirement.
3.4. Versioning Strategy
- Decision: We read the firmware version directly from the esp_app_desc_t structure that ESP-IDF embeds in every application image.
- Rationale: This ensures the version reported by the API (GET /api/system/info) is exactly the version compiled by CMake (PROJECT_VER), eliminating the risk of manual mismatches or external version files getting out of sync.
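The esp_app_desc_t structure that esp_app_get_description() returns at runtime is also stored near the start of the .bin itself, right after the 24-byte image header and the 8-byte header of the first segment. The host-side sketch below shows how the version field could be read from the first bytes of an image under that layout. PeekedVersion and peek_version() are hypothetical helpers for illustration; the firmware described here reads the version at runtime instead.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kImageHeaderSize = 24;    // sizeof(esp_image_header_t)
constexpr std::size_t kSegmentHeaderSize = 8;   // sizeof(esp_image_segment_header_t)
constexpr std::size_t kAppDescOffset = kImageHeaderSize + kSegmentHeaderSize;
constexpr std::size_t kVersionOffsetInDesc = 16;  // after magic_word, secure_version, reserv1[2]
constexpr std::size_t kVersionSize = 32;          // char version[32]
constexpr std::uint8_t kImageMagic = 0xE9;        // first byte of an ESP app image
constexpr std::uint32_t kAppDescMagic = 0xABCD5432u;  // ESP_APP_DESC_MAGIC_WORD

struct PeekedVersion {
    bool ok;                      // false if the bytes do not look like an app image
    char text[kVersionSize + 1];  // always NUL-terminated
};

// Reads the version string from the first bytes of a firmware image.
constexpr PeekedVersion peek_version(const std::uint8_t* chunk, std::size_t len) {
    PeekedVersion out{};
    if (len < kAppDescOffset + kVersionOffsetInDesc + kVersionSize) return out;
    if (chunk[0] != kImageMagic) return out;

    // The descriptor magic is stored little-endian.
    std::uint32_t magic = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        magic |= static_cast<std::uint32_t>(chunk[kAppDescOffset + i]) << (8 * i);
    }
    if (magic != kAppDescMagic) return out;

    for (std::size_t i = 0; i < kVersionSize; ++i) {
        out.text[i] = static_cast<char>(chunk[kAppDescOffset + kVersionOffsetInDesc + i]);
    }
    out.ok = true;
    return out;
}
```

Because the string comes from the same structure in both cases, a version peeked from the file this way should match what GET /api/system/info reports once that image is running.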
4. Final Architecture
4.1. The Partition Table
# Name,   Type, SubType,  Offset, Size
nvs,      data, nvs,      0x9000, 0x6000
otadata,  data, ota,      ,       0x2000
phy_init, data, phy,      ,       0x1000
factory,  app,  factory,  ,       2M
ota_0,    app,  ota_0,    ,       2M
ota_1,    app,  ota_1,    ,       2M
www_0,    data, littlefs, ,       1M
www_1,    data, littlefs, ,       1M
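Most offsets in the table are left blank, so the partition tool fills them in. The sketch below works through that arithmetic on the host, assuming the documented ESP-IDF rules that app partitions are aligned to 0x10000 and data partitions to 0x1000. Part, kTable, resolved_offset() and table_end() are hypothetical names for this example only.

```cpp
#include <cstddef>
#include <cstdint>

struct Part {
    const char* name;
    bool is_app;           // app partitions need 64 KiB alignment
    std::uint32_t offset;  // 0 means "blank, place automatically"
    std::uint32_t size;
};

constexpr std::uint32_t kMiB = 1024u * 1024u;

// Same rows as the partition table above.
constexpr Part kTable[] = {
    {"nvs",      false, 0x9000, 0x6000},
    {"otadata",  false, 0,      0x2000},
    {"phy_init", false, 0,      0x1000},
    {"factory",  true,  0,      2 * kMiB},
    {"ota_0",    true,  0,      2 * kMiB},
    {"ota_1",    true,  0,      2 * kMiB},
    {"www_0",    false, 0,      1 * kMiB},
    {"www_1",    false, 0,      1 * kMiB},
};
constexpr std::size_t kCount = sizeof(kTable) / sizeof(kTable[0]);

constexpr std::uint32_t align_up(std::uint32_t value, std::uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Start offset of entry `index` once blank offsets are filled in.
constexpr std::uint32_t resolved_offset(std::size_t index) {
    std::uint32_t start = 0;
    std::uint32_t end = 0;
    for (std::size_t i = 0; i <= index && i < kCount; ++i) {
        const std::uint32_t alignment = kTable[i].is_app ? 0x10000u : 0x1000u;
        start = kTable[i].offset != 0 ? kTable[i].offset : align_up(end, alignment);
        end = start + kTable[i].size;
    }
    return start;
}

// First free byte after the last partition.
constexpr std::uint32_t table_end() {
    return resolved_offset(kCount - 1) + kTable[kCount - 1].size;
}
```

Under these assumptions the factory partition starts at 0x20000 (the 64 KiB alignment leaves a gap after phy_init, which ends at 0x12000), and the last partition ends at 0x820000, which is 8.125 MiB of the 16MB flash.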
4.2. Backend Components
- main/api/ota/firmware.cpp: The endpoint (POST /api/ota/firmware) handling the streaming ingestion of the .bin file using standard ESP-IDF esp_ota functions.
- main/api/system/system.cpp: The endpoint querying esp_app_get_description() to expose the unified version payload to the frontend.
- main/main.cpp: The orchestrator that calls esp_ota_mark_app_valid_cancel_rollback() once the network connection is established.
4.3. UI/UX Implementation
- The Frontend OTA update component (OTAUpdate.svelte) is expanded to include a parallel "Firmware Update" section.
- This UI section handles file selection, upload progress visualization, and system reboot confirmation, providing parity with the existing frontend update UX.
5. Summary
We use ESP-IDF's native OTA APIs with a Factory + Dual A/B Partition layout for maximum reliability. The system leverages Automatic App Rollback to prevent network lockouts from bad firmware. Versioning is natively controlled via the CMake build descriptions, and the entire update process is driven centrally from the Svelte Frontend UI via a Push-based REST endpoint.
Created by Antigravity - Last Updated: 2026-03-03