From 6d12940205b147468b5f03e06bd7c32f01a69285 Mon Sep 17 00:00:00 2001
From: marthsincemelee
Date: Sun, 10 May 2026 15:36:12 +0200
Subject: [PATCH] docs(home-assistant): implementation plan for ZBT-2 Thread + OTBR

Task-by-task plan covering: revert of prior ZHA commit, unstable OTBR
module import, OTBR enablement against the ZBT-2, firmware flash via
universal-silabs-flasher, rebuild on jupiter, and end-to-end smoke
test through the HA UI. Designed for execution via
superpowers:subagent-driven-development or superpowers:executing-plans,
with operator handoffs marked explicitly (per the 'no SSH' workflow
rule).

Co-Authored-By: Claude Opus 4.7
---
 .../plans/2026-05-10-zbt2-thread-otbr.md | 555 ++++++++++++++++++
 1 file changed, 555 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-05-10-zbt2-thread-otbr.md

diff --git a/docs/superpowers/plans/2026-05-10-zbt2-thread-otbr.md b/docs/superpowers/plans/2026-05-10-zbt2-thread-otbr.md
new file mode 100644
index 0000000..2b2c114
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-10-zbt2-thread-otbr.md
+# ZBT-2 Thread + OTBR Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Run the Home Assistant Connect ZBT-2 as an OpenThread Border Router on `jupiter`, fully integrated with the existing native `services.home-assistant` + `services.matter-server` stack so Matter-over-Thread devices commission through the dongle.
+
+**Architecture:** Single NixOS module file (`modules/environments/home-assistant/default.nix`) is edited to import the `services.openthread-border-router` module from `nixos-unstable` (not yet in 25.11 stable), enable it against the ZBT-2's `/dev/serial/by-id/...` path, and add HA's `otbr` + `thread` extra components.
+The previous ZHA-direction commit on this branch is reverted first. The dongle is one-time-flashed from Zigbee NCP firmware to OpenThread RCP firmware via `universal-silabs-flasher` outside the NixOS lifecycle (per design decision: option B, CLI-only).
+
+**Tech Stack:** Nix flakes (flake-parts), NixOS 25.11 stable + nixos-unstable, `services.openthread-border-router`, `services.home-assistant`, `services.matter-server`, `python313Packages.universal-silabs-flasher`.
+
+**Spec:** [`docs/superpowers/specs/2026-05-10-zbt2-thread-otbr-design.md`](../specs/2026-05-10-zbt2-thread-otbr-design.md) — read this before starting.
+
+**User feedback rules in force:**
+- Never commit to `master`; this branch is `feature/ha-zbt-2-thread`. Final merge happens at the end via PR or operator-driven merge.
+- Do not SSH to `jupiter`. All commands targeting jupiter are operator handoffs — present the command, the user runs it and pastes output back.
+
+---
+
+## File Map
+
+| Action | File | Responsibility |
+|--------|------|----------------|
+| Modify | `modules/environments/home-assistant/default.nix` | Import unstable OTBR module; enable OTBR for the ZBT-2; add `otbr` + `thread` HA components |
+| Create (auto) | _(no new files)_ | All work fits in the one module |
+
+The `git revert` of `e8d09f4` automatically un-modifies the same file (drops `"zha"` and the `dialout` line). No host-level (`machines/jupiter/`) changes; no flake-level changes (the existing `_module.args.self = self;` in `machines/configuration.nix:21` already exposes `self.inputs.nixpkgs-unstable` to every module).
+
+---
+
+## Validation Approach (instead of unit tests)
+
+This is a NixOS configuration change; there's no test framework. We use `nix eval` against `nixosConfigurations.jupiter.config.*` as the equivalent of unit tests — assert option resolution **before** the change (red), then **after** the change (green).
+Functional / smoke tests happen post-`nixos-rebuild` on jupiter via systemctl, mDNS, and the HA UI.
+
+All `nix eval` commands run on the dev Mac. All `systemctl` / `journalctl` / `nixos-rebuild` commands run on jupiter (operator handoff).
+
+---
+
+### Task 1: Revert the prior ZHA commit
+
+**Files:**
+- Modify: `modules/environments/home-assistant/default.nix` (via `git revert`)
+
+- [ ] **Step 1: Verify pre-state**
+
+On dev Mac, in the repo root:
+
+```bash
+git log --oneline -3
+```
+
+Expected: `dbeda27` (design spec) on top of `e8d09f4` (the ZHA commit) on top of `098e632`.
+
+Also confirm current `extraComponents` includes `"zha"`:
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents
+```
+
+Expected: `["matter","mobile_app","zha"]`
+
+- [ ] **Step 2: Revert**
+
+```bash
+git revert --no-edit e8d09f4
+```
+
+Expected: revert commit created cleanly (no merge conflicts), single file changed.
+
+- [ ] **Step 3: Verify post-state**
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents
+```
+
+Expected: `["matter","mobile_app"]` — `zha` is gone.
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.users.users.hass.extraGroups
+```
+
+Expected: `[]` — `dialout` is gone.
+
+```bash
+git log --oneline -4
+```
+
+Expected: revert commit on top of `dbeda27` on top of `e8d09f4`.
+
+(No explicit `git commit` step — `git revert` produced its own commit.)
+
+---
+
+### Task 2: Wire the unstable OTBR module import (still disabled)
+
+This task gets the module into scope so options become available, but leaves `services.openthread-border-router.enable = false` (the default). The point is to confirm the import path works before adding device-specific config.
+
+**Files:**
+- Modify: `modules/environments/home-assistant/default.nix`
+
+- [ ] **Step 1: Write the failing eval check**
+
+On dev Mac:
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.options.services.openthread-border-router.enable.description 2>&1 | head -3
+```
+
+Expected: error containing `attribute 'openthread-border-router' missing` or similar — the option doesn't exist yet because the module isn't imported.
+
+- [ ] **Step 2: Add `self` to the module's argument list and add the `imports` block**
+
+Current header (`modules/environments/home-assistant/default.nix` lines 1–11):
+
+```nix
+# manages home automations
+{
+  config,
+  lib,
+  pkgs,
+  ...
+}:
+let
+  cfg = config.my.profiles.home-assistant;
+  hostName = config.networking.hostName;
+in
+```
+
+Replace lines 1–11 with:
+
+```nix
+# manages home automations
+{
+  config,
+  lib,
+  pkgs,
+  self,
+  ...
+}:
+let
+  cfg = config.my.profiles.home-assistant;
+  hostName = config.networking.hostName;
+in
+```
+
+Then, immediately after the opening brace on line 13 of the modified file (the new header is one line longer, so the brace moves down from line 12; i.e. at the top of the attribute set body, before `options.my.profiles.home-assistant`), add:
+
+```nix
+  imports = [
+    # services.openthread-border-router isn't in nixos-25.11; pull from
+    # nixpkgs-unstable. Package comes from the existing unstable overlay.
+    "${self.inputs.nixpkgs-unstable}/nixos/modules/services/home-automation/openthread-border-router.nix"
+  ];
+
+```
+
+- [ ] **Step 3: Re-run the eval check**
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.options.services.openthread-border-router.enable.description 2>&1 | head -3
+```
+
+Expected: a JSON string describing the option (e.g. `"Whether to enable the OpenThread Border Router."`).
+
+- [ ] **Step 4: Verify the service is currently disabled**
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable
+```
+
+Expected: `false`.
+
+- [ ] **Step 5: Verify whole config still evaluates**
+
+```bash
+nix eval .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath
+```
+
+Expected: a `/nix/store/...drv` path. Pre-existing trace warnings (the `*.service ordered after network-online.target` ones) are fine; no errors.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add modules/environments/home-assistant/default.nix
+git commit -m "$(cat <<'EOF'
+feat(home-assistant): import openthread-border-router module from unstable
+
+Pulls the services.openthread-border-router NixOS module directly from
+nixpkgs-unstable since it isn't in 25.11 yet. Service stays disabled
+in this commit; configuration follows.
+
+Co-Authored-By: Claude Opus 4.7
+EOF
+)"
+```
+
+---
+
+### Task 3: Operator handoff — get the ZBT-2 device path from jupiter
+
+This task has no code. It collects the runtime parameter (USB serial number) that Task 4 needs.
+
+**Files:** _(none)_
+
+- [ ] **Step 1: Hand off**
+
+Tell the operator:
+
+> "Plug the ZBT-2 into a USB-2 port on jupiter (it's still on stock Zigbee firmware — that's fine for this step). Then run `ls -l /dev/serial/by-id/` on jupiter and paste the full output back. We're after the line that contains `Nabu_Casa_Home_Assistant_Connect_ZBT-2`."
+
+- [ ] **Step 2: Wait for the operator's pasted output**
+
+Expected shape: a line like
+`lrwxrwxrwx 1 root root 13 May 10 14:30 usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_-if00 -> ../../ttyACM0`
+
+- [ ] **Step 3: Record the by-id path**
+
+Capture the value `/dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_-if00` for use in Task 4. Use the **by-id** path (not `/dev/ttyACM0`) so USB renumbering can't break OTBR.
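+
+The by-id rule in Step 3 could also be checked at evaluation time. The following is a minimal sketch under stated assumptions: `isByIdPath` is a hypothetical helper (it is not part of this plan or of the OTBR module), and the serial `AB12CD34` is only the illustrative value used in Task 4.
+
+```nix
+let
+  # Hypothetical helper: true only for stable /dev/serial/by-id/ paths.
+  # builtins.match returns null unless the regex matches the whole string.
+  isByIdPath = path: builtins.match "/dev/serial/by-id/.+" path != null;
+in
+{
+  # Illustrative inputs; the real path comes from the operator's pasted output.
+  byId = isByIdPath "/dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_AB12CD34-if00";
+  tty = isByIdPath "/dev/ttyACM0";
+}
+```
+
+Saved to a file and evaluated with `nix-instantiate --eval --strict`, this should report `byId` as true and `tty` as false. The same predicate could back a NixOS `assertions` entry if a build-time guard is wanted.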
+
+---
+
+### Task 4: Enable OTBR + add HA otbr/thread components
+
+**Files:**
+- Modify: `modules/environments/home-assistant/default.nix`
+
+- [ ] **Step 1: Write the failing eval checks**
+
+On dev Mac:
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable
+```
+
+Expected: `false` (still disabled from Task 2).
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents
+```
+
+Expected: `["matter","mobile_app"]` — no `otbr`, no `thread` yet.
+
+- [ ] **Step 2: Add `"otbr"` and `"thread"` to `extraComponents`**
+
+In `modules/environments/home-assistant/default.nix`, locate the `extraComponents` list (currently `[ "matter" "mobile_app" ]`) and replace it with:
+
+```nix
+      extraComponents = [
+        "matter"
+        "mobile_app"
+        "otbr"
+        "thread"
+      ];
+```
+
+- [ ] **Step 3: Add the `services.openthread-border-router` block**
+
+In the same file, **after** the `services.home-assistant.config = { ... };` block and **before** `my.homepage.services`, add:
+
+```nix
+    services.openthread-border-router = {
+      enable = true;
+      package = pkgs.unstable.openthread-border-router;
+      openFirewall = true;
+      backboneInterfaces = [ "enp3s0" ];
+      radio.device = "";
+    };
+
+```
+
+Replace `` with the literal string captured in Task 3 step 3 (e.g. `"/dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_AB12CD34-if00"`).
+
+- [ ] **Step 4: Run the green eval checks**
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents
+```
+
+Expected: `["matter","mobile_app","otbr","thread"]`.
+
+```bash
+nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable
+```
+
+Expected: `true`.
+
+```bash
+nix eval --raw .#nixosConfigurations.jupiter.config.services.openthread-border-router.radio.url
+```
+
+Expected: a string like `spinel+hdlc+uart:///dev/serial/by-id/usb-Nabu_Casa_..._ZBT-2_-if00?uart-baudrate=115200` (the module composes this from `radio.device` automatically).
+
+- [ ] **Step 5: Full eval — system derivation must evaluate**
+
+```bash
+nix eval .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath
+```
+
+Expected: a `/nix/store/...drv` path with no eval errors.
+
+- [ ] **Step 6: `nix flake check` for good measure**
+
+```bash
+nix flake check
+```
+
+Expected: no errors. (Same pre-existing trace warnings as before are acceptable.)
+
+- [ ] **Step 7: Commit**
+
+```bash
+git add modules/environments/home-assistant/default.nix
+git commit -m "$(cat <<'EOF'
+feat(home-assistant): enable OTBR for ZBT-2 + add HA otbr/thread components
+
+Brings up otbr-agent against the ZBT-2 over Spinel/UART, opens the
+REST API on :8081, and wires HA's otbr + thread integrations so
+Matter-over-Thread devices can commission through the existing
+matter-server.
+
+Co-Authored-By: Claude Opus 4.7
+EOF
+)"
+```
+
+---
+
+### Task 5: Operator handoff — flash OpenThread RCP firmware on the dongle
+
+The dongle is currently running Zigbee NCP firmware and won't speak Spinel until reflashed. This must happen **before** Task 6's rebuild (otherwise `otbr-agent` will try to talk to a Zigbee-firmware dongle and fail).
+
+**Files:** _(none on dev Mac)_
+
+- [ ] **Step 1: Hand off — fetch firmware**
+
+Tell the operator:
+
+> "On any machine with a browser: download the latest **ZBT-2 OpenThread RCP** `.gbl` from . The asset name will look like `ot-rcp-zbt-2-.gbl`. Get it onto jupiter — `scp` it over, or just `curl` from jupiter's shell. Confirm by running `ls ~/ot-rcp-zbt-2-*.gbl` on jupiter and pasting the result."
+
+- [ ] **Step 2: Wait for confirmation**
+
+Expected: a single matching path, e.g. `/home/finn/ot-rcp-zbt-2-2025.10.0.gbl`.
+
+- [ ] **Step 3: Hand off — flash**
+
+Tell the operator:
+
+> "On jupiter, run (substituting the actual by-id path from Task 3 and the actual `.gbl` filename):
+>
+> ```bash
+> nix shell nixpkgs#python313Packages.universal-silabs-flasher -c \
+>   universal-silabs-flasher \
+>     --device /dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_-if00 \
+>     flash --firmware ~/ot-rcp-zbt-2-.gbl
+> ```
+>
+> Paste the full output. Expected duration: ~30 seconds. The tool detects the running firmware, drops the dongle into bootloader mode, writes the `.gbl`, and reboots back to RCP."
+
+- [ ] **Step 4: Verify the flash succeeded**
+
+Expected output ends with something like `Firmware update complete` (or equivalent success message). If the tool reports CRC failure / partial write — re-run; the bootloader stays addressable.
+
+If the flasher rejects this invocation (universal-silabs-flasher's CLI has changed across versions), have the operator check `universal-silabs-flasher --help` and adapt, though the `flash --firmware ` form has been stable since 1.0.x.
+
+---
+
+### Task 6: Operator handoff — `nixos-rebuild switch` on jupiter
+
+**Files:** _(none on dev Mac)_
+
+- [ ] **Step 1: Push the branch so jupiter can fetch it**
+
+On dev Mac:
+
+```bash
+git push -u origin feature/ha-zbt-2-thread
+```
+
+(If the operator pulls via a different mechanism — local checkout, fileshare — adapt accordingly. The standard pattern in this repo is `git pull` on jupiter.)
+
+- [ ] **Step 2: Hand off — pull + rebuild**
+
+Tell the operator:
+
+> "On jupiter:
+>
+> ```bash
+> cd ~/development/nixos # or wherever the flake lives on jupiter
+> git fetch origin
+> git checkout feature/ha-zbt-2-thread
+> sudo nixos-rebuild switch --flake .#jupiter
+> ```
+>
+> Paste the tail of the output (everything from the first `building ...` line onward). Expected: build completes, switch to the new generation, no errors."
+
+- [ ] **Step 3: Verify the switch succeeded**
+
+If the operator's pasted output includes `error:` or the switch failed mid-activation, **stop here**. Common failure: option name mismatch with whatever version of nixos-unstable is locked in the flake. Fix on dev Mac, push, ask operator to pull + rebuild again.
+
+If the rebuild succeeded, proceed to Task 7.
+
+---
+
+### Task 7: Operator handoff — service-level verification on jupiter
+
+**Files:** _(none)_
+
+- [ ] **Step 1: Hand off — service health**
+
+Tell the operator:
+
+> "On jupiter, run each command and paste output:
+>
+> ```bash
+> systemctl status otbr-agent.service --no-pager
+> journalctl -u otbr-agent.service -n 50 --no-pager
+> ip link show wpan0
+> ```"
+
+- [ ] **Step 2: Verify**
+
+Expected:
+- `systemctl status` reports `active (running)`.
+- `journalctl` shows OTBR startup messages, no repeated restart loops.
+- `ip link show wpan0` shows the interface exists; state DOWN is correct (HA hasn't formed a network yet).
+
+If `otbr-agent` is in a restart loop with `Failed to open device`: device path mismatch. Re-check Task 3's path.
+
+- [ ] **Step 3: Hand off — mDNS publication**
+
+Tell the operator:
+
+> "On jupiter:
+>
+> ```bash
+> avahi-browse -r -t _meshcop._udp
+> ```"
+
+Expected: one entry whose hostname matches jupiter, advertising port 8081.
+
+If empty: `backboneInterfaces` is probably wrong. Have the operator run `ip link show` on jupiter and paste the output; pick the actual primary LAN interface, update `backboneInterfaces`, and rebuild again.
+
+- [ ] **Step 4: Hand off — REST API reachability**
+
+Tell the operator:
+
+> "On jupiter:
+>
+> ```bash
+> curl -s http://127.0.0.1:8081/node/state
+> ```"
+
+Expected: a JSON state string, most likely `"disabled"` (HA hasn't formed a network yet).
+
+If connection refused: OTBR isn't actually listening — re-check `journalctl`.
+
+---
+
+### Task 8: Operator handoff — HA UI smoke test
+
+**Files:** _(none)_
+
+- [ ] **Step 1: Hand off — confirm discovery**
+
+Tell the operator:
+
+> "Open `http://jupiter:8123` in a browser. Go to **Settings → Devices & Services**. Within ~30s of the rebuild, you should see **'Open Thread Border Router'** under 'Discovered'. Click **Configure**. Let HA form a new Thread network (or import an existing dataset if you have one). Tell me when that's done — and paste any errors if it doesn't work."
+
+- [ ] **Step 2: Wait for confirmation**
+
+Expected: HA reports the Thread network is formed; the OTBR integration appears under 'Configured'.
+
+If discovery doesn't happen: cross-check with Task 7 step 3 (`avahi-browse`). HA reads from the system's avahi cache.
+
+- [ ] **Step 3: Hand off — Matter-over-Thread pairing**
+
+Tell the operator:
+
+> "Pick one Matter-over-Thread device. Use the HA Companion app, scan its Matter QR code, and follow the prompts. Tell me when it's paired — or paste any errors. Pairing should complete in 30–90s."
+
+- [ ] **Step 4: Wait for confirmation**
+
+Expected: device appears under both Matter and Thread integrations in HA, and is controllable from the dashboard.
+
+If pairing times out: see "Failure modes" table in the spec — most likely the Thread mesh prefix isn't routed back to the LAN. Operator runs `nft list ruleset` and `ip -6 route` on jupiter; debug from there.
+
+---
+
+### Task 9: Merge to master
+
+**Files:** _(none)_
+
+- [ ] **Step 1: Final branch state**
+
+On dev Mac:
+
+```bash
+git log --oneline master..feature/ha-zbt-2-thread
+```
+
+Expected commits, listed here oldest to newest (`git log` itself prints them newest first):
+1. `e8d09f4` — original ZHA commit
+2. `dbeda27` — design spec
+3. `` — Revert "feat(home-assistant): enable ZHA for ZBT-2 Zigbee dongle"
+4. `` — feat(home-assistant): import openthread-border-router module from unstable
+5. 
+   `` — feat(home-assistant): enable OTBR for ZBT-2 + add HA otbr/thread components
+
+That's a fine history to merge as-is (the ZHA→revert pair is honest about the pivot).
+
+- [ ] **Step 2: Hand off — merge**
+
+The user runs the merge themselves (per repo policy: never commit to master without explicit consent). Tell the operator:
+
+> "If the smoke tests in Task 8 worked, merge with:
+>
+> ```bash
+> git switch master
+> git merge --no-ff feature/ha-zbt-2-thread
+> git push origin master
+> ```
+>
+> Or open a merge request / PR if you prefer review first."
+
+- [ ] **Step 3: Optional cleanup**
+
+After merge:
+
+```bash
+git branch -d feature/ha-zbt-2-thread
+git push origin --delete feature/ha-zbt-2-thread
+```
+
+---
+
+## Self-Review
+
+**Spec coverage:**
+- Goals (4 bullets) → Tasks 2 (OTBR module wiring), 4 (OTBR enable + HA components), 5 (firmware flash), 8 (Matter-over-Thread smoke test) ✓
+- Non-goals → respected; no multipan, no auto-flash, no fallback paths ✓
+- Architecture diagram → Task 4 produces the wiring shown; Tasks 6–8 verify it ✓
+- File changes (one module) → Tasks 1, 2, 4 ✓
+- Reverts of prior ZHA commit → Task 1 ✓
+- Operator workflow steps 0–7 → Tasks 1, 2, 3, 4, 5, 6, 7, 8 ✓
+- Verification (eval-only / service-level / functional) → Tasks 2/4/6/7/8 ✓
+- Failure-mode table → referenced in Tasks 6, 7, 8 for triage ✓
+
+**Placeholder scan:**
+- `` in Task 4 step 3 is intentional — it's a runtime parameter the operator fills in, captured in Task 3.
+- ``, `` in shell commands are intentional placeholders for operator substitution.
+- No "TBD", "TODO", "implement later", or vague "handle errors" steps.
+
+**Type / name consistency:**
+- `services.openthread-border-router` used consistently (matches the unstable module's option path).
+- `pkgs.unstable.openthread-border-router` matches the overlay (`machines/configuration.nix:11`).
+- `extraComponents` strings (`"otbr"`, `"thread"`) match HA Core integration names.
+- `radio.device` ↔ `radio.url` relationship documented (module composes `url` from `device`).
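+
+The `radio.device` to `radio.url` relationship noted above can be sketched as a small standalone expression. This is an illustration only: `mkRadioUrl` is a hypothetical helper, the baud-rate default is taken from the expected URL in Task 4 step 4, the serial `AB12CD34` is the example value from Task 4 step 3, and the unstable module's actual implementation may differ.
+
+```nix
+let
+  # Hypothetical helper mirroring the URL shape expected in Task 4 step 4;
+  # not copied from the nixpkgs module.
+  mkRadioUrl = { device, baudRate ? 115200 }:
+    "spinel+hdlc+uart://${device}?uart-baudrate=${toString baudRate}";
+in
+mkRadioUrl {
+  # Illustrative path; substitute the by-id path captured in Task 3.
+  device = "/dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_AB12CD34-if00";
+}
+```
+
+Because the device path is absolute, the scheme's `//` and the path's leading `/` combine into the triple slash seen in the expected `radio.url` value.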