nixos/docs/superpowers/plans/2026-05-10-zbt2-thread-otbr.md
marthsincemelee 6d12940205 docs(home-assistant): implementation plan for ZBT-2 Thread + OTBR
Task-by-task plan covering: revert of prior ZHA commit, unstable
OTBR module import, OTBR enablement against the ZBT-2, firmware
flash via universal-silabs-flasher, rebuild on jupiter, and
end-to-end smoke test through the HA UI.

Designed for execution via superpowers:subagent-driven-development
or superpowers:executing-plans, with operator handoffs marked
explicitly (per the 'no SSH' workflow rule).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 15:36:12 +02:00


ZBT-2 Thread + OTBR Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Run the Home Assistant Connect ZBT-2 as an OpenThread Border Router on jupiter, fully integrated with the existing native services.home-assistant + services.matter-server stack so Matter-over-Thread devices commission through the dongle.

Architecture: Single NixOS module file (modules/environments/home-assistant/default.nix) is edited to import the services.openthread-border-router module from nixos-unstable (not yet in 25.11 stable), enable it against the ZBT-2's /dev/serial/by-id/... path, and add HA's otbr + thread extra components. The previous ZHA-direction commit on this branch is reverted first. The dongle is one-time-flashed from Zigbee NCP firmware to OpenThread RCP firmware via universal-silabs-flasher outside the NixOS lifecycle (per design decision: option B, CLI-only).

Tech Stack: Nix flakes (flake-parts), NixOS 25.11 stable + nixos-unstable, services.openthread-border-router, services.home-assistant, services.matter-server, python313Packages.universal-silabs-flasher.

Spec: docs/superpowers/specs/2026-05-10-zbt2-thread-otbr-design.md — read this before starting.

User feedback rules in force:

  • Never commit to master; this branch is feature/ha-zbt-2-thread. Final merge happens at the end via PR or operator-driven merge.
  • Do not SSH to jupiter. All commands targeting jupiter are operator handoffs — present the command, the user runs it and pastes output back.

File Map

| Action | File | Responsibility |
|---|---|---|
| Modify | modules/environments/home-assistant/default.nix | Import unstable OTBR module; enable OTBR for the ZBT-2; add otbr + thread HA components |
| Create (auto) | (no new files) | All work fits in the one module |

The git revert of e8d09f4 automatically un-modifies the same file (drops "zha" and the dialout line). No host-level (machines/jupiter/) changes; no flake-level changes (the existing _module.args.self = self; in machines/configuration.nix:21 already exposes self.inputs.nixpkgs-unstable to every module).


Validation Approach (instead of unit tests)

This is a NixOS configuration change; there's no test framework. We use nix eval against nixosConfigurations.jupiter.config.* as the equivalent of unit tests — assert option resolution before the change (red), then after the change (green). Functional / smoke tests happen post-nixos-rebuild on jupiter via systemctl, mDNS, and the HA UI.

All nix eval commands run on the dev Mac. All systemctl / journalctl / nixos-rebuild commands run on jupiter (operator handoff).
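
The red/green pattern above can be wrapped in a small shell helper so each check prints PASS or FAIL instead of relying on reading JSON by eye. This is a minimal sketch: the assert_eval name and its output format are illustrative assumptions, not something that exists in the repo. The comparison is a plain string match, which works because nix eval --json prints compact JSON.

```shell
# Hypothetical helper (not part of the repo): compare an evaluated value
# against the value the plan lists under "Expected" and report the result.
assert_eval() {
  label=$1     # human-readable name of the check
  expected=$2  # JSON text the plan expects
  actual=$3    # output captured from `nix eval --json ...`
  if [ "$actual" = "$expected" ]; then
    printf 'PASS %s\n' "$label"
  else
    printf 'FAIL %s: expected %s, got %s\n' "$label" "$expected" "$actual"
    return 1
  fi
}

# Example use on the dev Mac (needs the flake, so it is left commented out):
# assert_eval extraComponents '["matter","mobile_app"]' \
#   "$(nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents)"
```

A FAIL before the change and a PASS after it correspond to the red and green states described above.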


Task 1: Revert the prior ZHA commit

Files:

  • Modify: modules/environments/home-assistant/default.nix (via git revert)

  • Step 1: Verify pre-state

On dev Mac, in the repo root:

git log --oneline -3

Expected: dbeda27 (design spec) on top of e8d09f4 (the ZHA commit) on top of 098e632.

Also confirm current extraComponents includes "zha":

nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app","zha"]

  • Step 2: Revert
git revert --no-edit e8d09f4

Expected: revert commit created cleanly (no merge conflicts), single file changed.

  • Step 3: Verify post-state
nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app"] (zha is gone).

nix eval --json .#nixosConfigurations.jupiter.config.users.users.hass.extraGroups

Expected: [] (dialout is gone).

git log --oneline -4

Expected: revert commit on top of dbeda27 on top of e8d09f4.

(No explicit git commit step — git revert produced its own commit.)


Task 2: Wire the unstable OTBR module import (still disabled)

This task gets the module into scope so options become available, but leaves services.openthread-border-router.enable = false (the default). The point is to confirm the import path works before adding device-specific config.

Files:

  • Modify: modules/environments/home-assistant/default.nix

  • Step 1: Write the failing eval check

On dev Mac:

nix eval --json .#nixosConfigurations.jupiter.options.services.openthread-border-router.enable.description 2>&1 | head -3

Expected: error containing attribute 'openthread-border-router' missing or similar — the option doesn't exist yet because the module isn't imported.

  • Step 2: Add self to the module's argument list and add the imports block

Current header (modules/environments/home-assistant/default.nix, lines 1-11):

# manages home automations 
{
  config,
  lib,
  pkgs,
  ...
}:
let
  cfg = config.my.profiles.home-assistant;
  hostName = config.networking.hostName;
in

Replace lines 1-11 with:

# manages home automations
{
  config,
  lib,
  pkgs,
  self,
  ...
}:
let
  cfg = config.my.profiles.home-assistant;
  hostName = config.networking.hostName;
in

Then, immediately after the opening brace on line 12 of the modified file (i.e. at the top of the attribute set body, before options.my.profiles.home-assistant), add:

  imports = [
    # services.openthread-border-router isn't in nixos-25.11; pull from
    # nixpkgs-unstable. Package comes from the existing unstable overlay.
    "${self.inputs.nixpkgs-unstable}/nixos/modules/services/home-automation/openthread-border-router.nix"
  ];

  • Step 3: Re-run the eval check
nix eval --json .#nixosConfigurations.jupiter.options.services.openthread-border-router.enable.description 2>&1 | head -3

Expected: a JSON string describing the option (e.g. "Whether to enable the OpenThread Border Router.").

  • Step 4: Verify the service is currently disabled
nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable

Expected: false.

  • Step 5: Verify whole config still evaluates
nix eval .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath

Expected: a /nix/store/...drv path. Pre-existing trace warnings (the *.service ordered after network-online.target ones) are fine; no errors.
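
The "whole config still evaluates" check can be made mechanical with a tiny predicate. This is a sketch under assumptions: is_drv_path is a hypothetical name, and the usage line passes --raw so the path arrives without the surrounding quotes that plain nix eval prints.

```shell
# Hypothetical check (not part of the repo): succeed only if the argument
# looks like a Nix store derivation path.
is_drv_path() {
  case $1 in
    /nix/store/*.drv) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use on the dev Mac (needs the flake, so it is left commented out):
# drv=$(nix eval --raw .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath) \
#   && is_drv_path "$drv" && echo "eval OK: $drv"
```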

  • Step 6: Commit
git add modules/environments/home-assistant/default.nix
git commit -m "$(cat <<'EOF'
feat(home-assistant): import openthread-border-router module from unstable

Pulls the services.openthread-border-router NixOS module directly from
nixpkgs-unstable since it isn't in 25.11 yet. Service stays disabled
in this commit; configuration follows.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF
)"

Task 3: Operator handoff — get the ZBT-2 device path from jupiter

This task has no code. It collects the runtime parameter (USB serial number) that Task 4 needs.

Files: (none)

  • Step 1: Hand off

Tell the operator:

"Plug the ZBT-2 into a USB-2 port on jupiter (it's still on stock Zigbee firmware — that's fine for this step). Then run ls -l /dev/serial/by-id/ on jupiter and paste the full output back. We're after the line that contains Nabu_Casa_Home_Assistant_Connect_ZBT-2."

  • Step 2: Wait for the operator's pasted output

Expected shape: a line like lrwxrwxrwx 1 root root 13 May 10 14:30 usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_<serial-string>-if00 -> ../../ttyACM0

  • Step 3: Record the by-id path

Capture the value /dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_<serial-string>-if00 for use in Task 4. Use the by-id path (not /dev/ttyACM0) so USB renumbering can't break OTBR.
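
Extracting the path from the operator's pasted listing can be sketched as a small filter. The function name zbt2_by_id_path is hypothetical; the only things it relies on are the ls -l line shape shown in Step 2 and the Nabu_Casa_Home_Assistant_Connect_ZBT-2 marker.

```shell
# Hypothetical filter (not part of the repo): read `ls -l /dev/serial/by-id/`
# output on stdin and print the full by-id path of the ZBT-2 entry.
zbt2_by_id_path() {
  awk '/Nabu_Casa_Home_Assistant_Connect_ZBT-2/ {
    # The symlink name is the field immediately before the "->" arrow.
    for (i = 2; i <= NF; i++) {
      if ($i == "->") {
        print "/dev/serial/by-id/" $(i - 1)
        exit
      }
    }
  }'
}

# Example: paste the operator's output into a file and run
#   zbt2_by_id_path < pasted-ls-output.txt
```

If the listing contains no ZBT-2 entry the function prints nothing, which is the cue to ask the operator to re-seat the dongle and rerun the command.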


Task 4: Enable OTBR + add HA otbr/thread components

Files:

  • Modify: modules/environments/home-assistant/default.nix

  • Step 1: Write the failing eval checks

On dev Mac:

nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable

Expected: false (still disabled from Task 2).

nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app"] — no otbr, no thread yet.

  • Step 2: Add "otbr" and "thread" to extraComponents

In modules/environments/home-assistant/default.nix, locate the extraComponents list (currently [ "matter" "mobile_app" ]) and replace it with:

      extraComponents = [
        "matter"
        "mobile_app"
        "otbr"
        "thread"
      ];
  • Step 3: Add the services.openthread-border-router block

In the same file, after the services.home-assistant.config = { ... }; block and before my.homepage.services, add:

    services.openthread-border-router = {
      enable = true;
      package = pkgs.unstable.openthread-border-router;
      openFirewall = true;
      backboneInterfaces = [ "enp3s0" ];
      radio.device = "<PASTE-BY-ID-PATH-FROM-TASK-3>";
    };

Replace <PASTE-BY-ID-PATH-FROM-TASK-3> with the literal string captured in Task 3 step 3 (e.g. "/dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_AB12CD34-if00").

  • Step 4: Run the green eval checks
nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app","otbr","thread"].

nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable

Expected: true.

nix eval --raw .#nixosConfigurations.jupiter.config.services.openthread-border-router.radio.url

Expected: a string like spinel+hdlc+uart:///dev/serial/by-id/usb-Nabu_Casa_..._ZBT-2_<serial>-if00?uart-baudrate=115200 (the module composes this from radio.device automatically).
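
To compare the evaluated radio.url against something concrete, the expected string can be built from the by-id path. This sketch only mirrors the URL shape quoted above; it is not the module's implementation, the expected_radio_url name is hypothetical, and the 115200 default is taken from the example string rather than verified against the module.

```shell
# Hypothetical helper (not part of the repo): build the URL shape the plan
# expects from a device path and an optional baud rate.
expected_radio_url() {
  device=$1
  baud=${2:-115200}  # assumption: default taken from the plan's example string
  printf 'spinel+hdlc+uart://%s?uart-baudrate=%s\n' "$device" "$baud"
}

# Example use on the dev Mac (needs the flake, so it is left commented out):
# [ "$(nix eval --raw .#nixosConfigurations.jupiter.config.services.openthread-border-router.radio.url)" \
#   = "$(expected_radio_url /dev/serial/by-id/<by-id-name-from-task-3>)" ]
```

If the module emits extra query parameters, the comparison will fail even though the configuration is fine, so treat a mismatch as a prompt to read the actual value rather than as an error.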

  • Step 5: Full eval — system derivation must build
nix eval .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath

Expected: a /nix/store/...drv path with no eval errors.

  • Step 6: nix flake check for good measure
nix flake check

Expected: no errors. (Same pre-existing trace warnings as before are acceptable.)

  • Step 7: Commit
git add modules/environments/home-assistant/default.nix
git commit -m "$(cat <<'EOF'
feat(home-assistant): enable OTBR for ZBT-2 + add HA otbr/thread components

Brings up otbr-agent against the ZBT-2 over Spinel/UART, opens the
REST API on :8081, and wires HA's otbr + thread integrations so
Matter-over-Thread devices can commission through the existing
matter-server.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF
)"

Task 5: Operator handoff — flash OpenThread RCP firmware on the dongle

The dongle is currently running Zigbee NCP firmware and won't speak Spinel until reflashed. This must happen before Task 6's rebuild (otherwise otbr-agent will try to talk to a Zigbee-firmware dongle and fail).

Files: (none on dev Mac)

  • Step 1: Hand off — fetch firmware

Tell the operator:

"On any machine with a browser: download the latest ZBT-2 OpenThread RCP .gbl from https://github.com/NabuCasa/silabs-firmware-builder/releases. The asset name will look like ot-rcp-zbt-2-<version>.gbl. Get it onto jupiter — scp it over, or just curl from jupiter's shell. Confirm by running ls ~/ot-rcp-zbt-2-*.gbl on jupiter and pasting the result."

  • Step 2: Wait for confirmation

Expected: a single matching path, e.g. /home/finn/ot-rcp-zbt-2-2025.10.0.gbl.
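
The "single matching path" condition can be checked explicitly, which guards against an older download sitting next to the new one. A minimal sketch, assuming the asset name pattern quoted in Step 1; single_gbl is a hypothetical name.

```shell
# Hypothetical check (not part of the repo): print the firmware path if exactly
# one ot-rcp-zbt-2-*.gbl file exists in the given directory, otherwise fail.
single_gbl() {
  dir=$1
  # Expand the glob into the positional parameters; an unmatched glob stays
  # as the literal pattern, which the -e test below rejects.
  set -- "$dir"/ot-rcp-zbt-2-*.gbl
  if [ "$#" -eq 1 ] && [ -e "$1" ]; then
    printf '%s\n' "$1"
  else
    return 1
  fi
}

# Example on jupiter: single_gbl "$HOME"
```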

  • Step 3: Hand off — flash

Tell the operator:

"On jupiter, run (substituting the actual by-id path from Task 3 and the actual .gbl filename):

nix shell nixpkgs#python313Packages.universal-silabs-flasher -c \
  universal-silabs-flasher \
    --device /dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_<serial>-if00 \
    flash --firmware ~/ot-rcp-zbt-2-<version>.gbl

Paste the full output. Expected duration: ~30 seconds. The tool detects the running firmware, drops the dongle into bootloader mode, writes the .gbl, and reboots back to RCP."

  • Step 4: Verify the flash succeeded

Expected output ends with something like Firmware update complete (or equivalent success message). If the tool reports CRC failure / partial write — re-run; the bootloader stays addressable.

If the command fails with a usage error (universal-silabs-flasher's CLI has changed across versions), have the operator check universal-silabs-flasher --help and adapt; the flash --firmware <path> form has been stable since 1.0.x.


Task 6: Operator handoff — nixos-rebuild switch on jupiter

Files: (none on dev Mac)

  • Step 1: Push the branch so jupiter can fetch it

On dev Mac:

git push -u origin feature/ha-zbt-2-thread

(If the operator pulls via a different mechanism — local checkout, fileshare — adapt accordingly. The standard pattern in this repo is git pull on jupiter.)

  • Step 2: Hand off — pull + rebuild

Tell the operator:

"On jupiter:

cd ~/development/nixos      # or wherever the flake lives on jupiter
git fetch origin
git checkout feature/ha-zbt-2-thread
sudo nixos-rebuild switch --flake .#jupiter

Paste the tail of the output (everything from the first building ... line onward). Expected: build completes, switch to the new generation, no errors."

  • Step 3: Verify the switch succeeded

If the operator's pasted output includes error: or the switch failed mid-activation, stop here. Common failure: option name mismatch with whatever version of nixos-unstable is locked in the flake. Fix on dev Mac, push, ask operator to pull + rebuild again.

If the rebuild succeeded, proceed to Task 7.
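
Scanning a long pasted rebuild log for the error: marker is easy to get wrong by eye. A minimal sketch of that scan, with rebuild_errors as a hypothetical name:

```shell
# Hypothetical filter (not part of the repo): read pasted nixos-rebuild output
# on stdin and print every line containing "error:" with its line number.
# Exit status is 0 when at least one such line exists, 1 when none do.
rebuild_errors() {
  grep -n 'error:'
}

# Example: rebuild_errors < pasted-rebuild-output.txt && echo "stop: fix before Task 7"
```

An empty result only means the marker was not found; a switch that failed mid-activation can still show up as a failed unit in the tail of the output, so that tail deserves a read as well.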


Task 7: Operator handoff — service-level verification on jupiter

Files: (none)

  • Step 1: Hand off — service health

Tell the operator:

"On jupiter, run each command and paste the output:"

systemctl status otbr-agent.service --no-pager
journalctl -u otbr-agent.service -n 50 --no-pager
ip link show wpan0
  • Step 2: Verify

Expected:

  • systemctl status reports active (running).
  • journalctl shows OTBR startup messages, no repeated restart loops.
  • ip link show wpan0 shows the interface exists; state DOWN is correct (HA hasn't formed a network yet).

If otbr-agent is in a restart loop and logs Failed to open device, the device path is wrong. Re-check the path captured in Task 3.
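
The first verification bullet can be reduced to a classification of the pasted systemctl status output. This is a sketch: classify_otbr_status is a hypothetical name, and it only inspects the Active: line that systemd prints.

```shell
# Hypothetical classifier (not part of the repo): read `systemctl status`
# output on stdin and print a one-word verdict based on the "Active:" line.
classify_otbr_status() {
  status=$(cat)
  case $status in
    *'Active: active (running)'*)          echo running ;;
    *'Active: activating (auto-restart)'*) echo restart-loop ;;
    *'Active: failed'*)                    echo failed ;;
    *)                                     echo unknown ;;
  esac
}

# Example: classify_otbr_status < pasted-status.txt
```

A restart-loop or failed verdict is the point at which the journalctl output becomes the thing to read.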

  • Step 3: Hand off — mDNS publication

Tell the operator:

"On jupiter, run:"

avahi-browse -r -t _meshcop._udp

Expected: one entry whose hostname matches jupiter, advertising port 8081.

If empty: backboneInterfaces is probably wrong. Have the operator run ip link show on jupiter and paste the output; pick the actual primary LAN interface, update backboneInterfaces, and rebuild again.

  • Step 4: Hand off — REST API reachability

Tell the operator:

"On jupiter, run:"

curl -s http://127.0.0.1:8081/node/state

Expected: a JSON state string, most likely "disabled" (HA hasn't formed a network yet).

If connection refused: OTBR isn't actually listening — re-check journalctl.
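
The state string returned by /node/state is one of the OpenThread device roles, so the pasted value can be mapped to a short interpretation. A minimal sketch; describe_node_state and the wording of the messages are illustrative assumptions.

```shell
# Hypothetical helper (not part of the repo): interpret the JSON string that
# the REST endpoint returns, e.g. "disabled" (with or without the quotes).
describe_node_state() {
  state=$(printf '%s' "$1" | tr -d '"[:space:]')
  case $state in
    disabled)            echo 'Thread stack is off; expected until HA forms a network' ;;
    detached)            echo 'Thread stack is on but not attached to a network' ;;
    child|router|leader) echo "attached to a Thread network as $state" ;;
    *)                   echo "unrecognised state: $state"; return 1 ;;
  esac
}

# Example on jupiter:
# describe_node_state "$(curl -s http://127.0.0.1:8081/node/state)"
```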


Task 8: Operator handoff — HA UI smoke test

Files: (none)

  • Step 1: Hand off — confirm discovery

Tell the operator:

"Open http://jupiter:8123 in a browser. Go to Settings → Devices & Services. Within ~30s of the rebuild, you should see 'Open Thread Border Router' under 'Discovered'. Click Configure. Let HA form a new Thread network (or import existing dataset if you have one). Tell me when that's done — and paste any errors if it doesn't work."

  • Step 2: Wait for confirmation

Expected: HA reports the Thread network is formed; the OTBR integration appears under 'Configured'.

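If discovery doesn't happen: cross-check with Task 7 step 3 (avahi-browse). HA discovers the border router from the same _meshcop._udp mDNS advertisement, so if the service is not being advertised on the LAN, HA cannot find it.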

  • Step 3: Hand off — Matter-over-Thread pairing

Tell the operator:

"Pick one Matter-over-Thread device. Use the HA Companion app, scan its Matter QR code, and follow the prompts. Tell me when it's paired, or paste any errors. Pairing should complete in 30-90s."

  • Step 4: Wait for confirmation

Expected: device appears under both Matter and Thread integrations in HA, and is controllable from the dashboard.

If pairing times out: see "Failure modes" table in the spec — most likely Thread mesh prefix isn't routed back to LAN. Operator runs nft list ruleset and ip -6 route on jupiter; debug from there.
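
When triaging a pairing timeout, a first pass over the pasted ip -6 route output is to see whether any route points at the Thread interface at all. A minimal sketch, assuming the interface is named wpan0 as in Task 7; wpan0_routes is a hypothetical name.

```shell
# Hypothetical filter (not part of the repo): read `ip -6 route` output on
# stdin and print only the routes that go through the wpan0 interface.
# Exit status is 1 when no such route exists.
wpan0_routes() {
  grep -E '(^| )dev wpan0( |$)'
}

# Example: wpan0_routes < pasted-ip6-route.txt || echo "no IPv6 route via wpan0"
```

An empty result narrows the problem to the border router side; a populated result shifts attention to forwarding and the nft ruleset.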


Task 9: Merge to master

Files: (none)

  • Step 1: Final branch state

On dev Mac:

git log --oneline master..feature/ha-zbt-2-thread

Expected commits, listed here oldest to newest (git log prints them in the reverse order, newest first):

  1. e8d09f4 — original ZHA commit
  2. dbeda27 — design spec
  3. <revert hash> — Revert "feat(home-assistant): enable ZHA for ZBT-2 Zigbee dongle"
  4. <task-2 hash> — feat(home-assistant): import openthread-border-router module from unstable
  5. <task-4 hash> — feat(home-assistant): enable OTBR for ZBT-2 + add HA otbr/thread components

That's a fine history to merge as-is (the ZHA→revert pair is honest about the pivot).

  • Step 2: Hand off — merge

The user runs the merge themselves (per repo policy: never commit to master without explicit consent). Tell the operator:

"If the smoke tests in Task 8 worked, merge with:

git switch master
git merge --no-ff feature/ha-zbt-2-thread
git push origin master

Or open a merge request / PR if you prefer review first."

  • Step 3: Optional cleanup

After merge:

git branch -d feature/ha-zbt-2-thread
git push origin --delete feature/ha-zbt-2-thread

Self-Review

Spec coverage:

  • Goals (4 bullets) → Tasks 2 (OTBR module wiring), 4 (OTBR enable + HA components), 5 (firmware flash), 8 (Matter-over-Thread smoke test) ✓
  • Non-goals → respected; no multipan, no auto-flash, no fallback paths ✓
  • Architecture diagram → Task 4 produces the wiring shown; Tasks 6-8 verify it ✓
  • File changes (one module) → Tasks 1, 2, 4 ✓
  • Reverts of prior ZHA commit → Task 1 ✓
  • Operator workflow steps 0-7 → Tasks 1, 2, 3, 4, 5, 6, 7, 8 ✓
  • Verification (eval-only / service-level / functional) → Tasks 2/4/6/7/8 ✓
  • Failure-mode table → referenced in Tasks 6, 7, 8 for triage ✓

Placeholder scan:

  • <PASTE-BY-ID-PATH-FROM-TASK-3> in Task 4 step 3 is intentional — it's a runtime parameter the operator fills in, captured in Task 3.
  • <serial>, <version> in shell commands are intentional placeholders for operator substitution.
  • No "TBD", "TODO", "implement later", or vague "handle errors" steps.

Type / name consistency:

  • services.openthread-border-router used consistently (matches the unstable module's option path).
  • pkgs.unstable.openthread-border-router matches the overlay (machines/configuration.nix:11).
  • extraComponents strings ("otbr", "thread") match HA Core integration names.
  • radio.device → radio.url relationship documented (module composes url from device).