nixos/docs/superpowers/plans/2026-05-10-zbt2-thread-otbr.md
marthsincemelee 311e358d88 docs(plan): correct Task 2 scope — specialArgs needed for self in imports
The original plan claimed no flake-level changes were needed because
machines/configuration.nix:21 already passes `_module.args.self = self;`.
That's only true for `config`-time evaluation; `imports` are collected
before `config` is available, so referencing `self` in `imports` causes
infinite recursion. Fix: promote `self` to `specialArgs` on each
nixosSystem call. The implementer of Task 2 caught this on first
dispatch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 15:52:20 +02:00


ZBT-2 Thread + OTBR Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Run the Home Assistant Connect ZBT-2 as an OpenThread Border Router on jupiter, fully integrated with the existing native services.home-assistant + services.matter-server stack so Matter-over-Thread devices commission through the dongle.

Architecture: Single NixOS module file (modules/environments/home-assistant/default.nix) is edited to import the services.openthread-border-router module from nixos-unstable (not yet in 25.11 stable), enable it against the ZBT-2's /dev/serial/by-id/... path, and add HA's otbr + thread extra components. The previous ZHA-direction commit on this branch is reverted first. The dongle is one-time-flashed from Zigbee NCP firmware to OpenThread RCP firmware via universal-silabs-flasher outside the NixOS lifecycle (per design decision: option B, CLI-only).

Tech Stack: Nix flakes (flake-parts), NixOS 25.11 stable + nixos-unstable, services.openthread-border-router, services.home-assistant, services.matter-server, python313Packages.universal-silabs-flasher.

Spec: docs/superpowers/specs/2026-05-10-zbt2-thread-otbr-design.md — read this before starting.

User feedback rules in force:

  • Never commit to master; this branch is feature/ha-zbt-2-thread. Final merge happens at the end via PR or operator-driven merge.
  • Do not SSH to jupiter. All commands targeting jupiter are operator handoffs — present the command, the user runs it and pastes output back.

File Map

| Action | File | Responsibility |
| --- | --- | --- |
| Modify | modules/environments/home-assistant/default.nix | Import unstable OTBR module; enable OTBR for the ZBT-2; add otbr + thread HA components |
| Modify | machines/configuration.nix | Pass self via specialArgs so it's available during NixOS module imports evaluation (not just config) |
| Create (auto) | (no new files) | All work fits in the two modules |

The git revert of e8d09f4 automatically un-modifies the home-assistant module (drops "zha" and the dialout line). No host-level (machines/jupiter/) changes.

Why the flake-level edit is needed: the existing _module.args.self = self; in machines/configuration.nix:21 makes self available in module bodies (option definitions, config blocks). It does not make self available during imports evaluation — _module.args is resolved from config, but imports are collected before config is evaluated, so self in imports causes an infinite recursion error. Promoting self to specialArgs short-circuits that and is the conventional fix.


Validation Approach (instead of unit tests)

This is a NixOS configuration change; there's no test framework. We use nix eval against nixosConfigurations.jupiter.config.* as the equivalent of unit tests — assert option resolution before the change (red), then after the change (green). Functional / smoke tests happen post-nixos-rebuild on jupiter via systemctl, mDNS, and the HA UI.

All nix eval commands run on the dev Mac. All systemctl / journalctl / nixos-rebuild commands run on jupiter (operator handoff).
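
The red/green checks in the tasks below all share one shape: run a command, then compare its output with an expected JSON literal. A minimal sketch of a wrapper for that pattern follows. `assert_json` is a hypothetical helper name (it is not part of the repo), and the comparison is a plain string match, so it assumes `nix eval --json` prints compact JSON exactly as written in the "Expected" lines of this plan.

```shell
# assert_json EXPECTED CMD [ARGS...]
# Hypothetical helper (not part of the repo): runs CMD and compares its
# stdout, as a plain string, with EXPECTED.
assert_json() {
  expected="$1"
  shift
  # A non-zero exit from CMD (e.g. an eval error) is reported separately.
  if ! actual="$("$@" 2>/dev/null)"; then
    echo "FAIL: command exited non-zero: $*"
    return 2
  fi
  if [ "$actual" = "$expected" ]; then
    echo "ok: $actual"
  else
    echo "FAIL: expected $expected, got $actual"
    return 1
  fi
}
```

Possible usage on the dev Mac: `assert_json '["matter","mobile_app","zha"]' nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents`.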


Task 1: Revert the prior ZHA commit

Files:

  • Modify: modules/environments/home-assistant/default.nix (via git revert)

  • Step 1: Verify pre-state

On dev Mac, in the repo root:

git log --oneline -3

Expected: dbeda27 (design spec) on top of e8d09f4 (the ZHA commit) on top of 098e632.

Also confirm current extraComponents includes "zha":

nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app","zha"]

  • Step 2: Revert
git revert --no-edit e8d09f4

Expected: revert commit created cleanly (no merge conflicts), single file changed.

  • Step 3: Verify post-state
nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app"]; zha is gone.

nix eval --json .#nixosConfigurations.jupiter.config.users.users.hass.extraGroups

Expected: []; dialout is gone.

git log --oneline -4

Expected: revert commit on top of dbeda27 on top of e8d09f4.

(No explicit git commit step — git revert produced its own commit.)


Task 2: Wire the unstable OTBR module import (still disabled)

This task gets the module into scope so options become available, but leaves services.openthread-border-router.enable = false (the default). The point is to confirm the import path works before adding device-specific config.

Files:

  • Modify: machines/configuration.nix (add specialArgs = { inherit self; }; to each nixosSystem call)

  • Modify: modules/environments/home-assistant/default.nix

  • Step 1: Write the failing eval check

On dev Mac:

nix eval --json .#nixosConfigurations.jupiter.options.services.openthread-border-router.enable.description 2>&1 | head -3

Expected: error containing attribute 'openthread-border-router' missing or similar — the option doesn't exist yet because the module isn't imported.

  • Step 1a: Promote self to specialArgs in machines/configuration.nix

self must be reachable during imports evaluation (not just config evaluation). The existing _module.args.self = self; only covers config-time access. Edit each nixosSystem call (jupiter and mibook) to add specialArgs.

Current shape (lines 50-56 and 57-63):

jupiter = nixosSystem {
  system = "x86_64-linux";
  modules = defaultModules ++ [
    # nixos-hardware.nixosModules.bmax-b7-power
    ./jupiter/configuration.nix
  ];
};
mibook = nixosSystem {
  system = "x86_64-linux";
  modules = defaultModules ++ [
    # nixos-hardware.nixosModules.mibook
    ./mibook/configuration.nix
  ];
};

Add specialArgs = { inherit self; }; to each:

jupiter = nixosSystem {
  system = "x86_64-linux";
  specialArgs = { inherit self; };
  modules = defaultModules ++ [
    # nixos-hardware.nixosModules.bmax-b7-power
    ./jupiter/configuration.nix
  ];
};
mibook = nixosSystem {
  system = "x86_64-linux";
  specialArgs = { inherit self; };
  modules = defaultModules ++ [
    # nixos-hardware.nixosModules.mibook
    ./mibook/configuration.nix
  ];
};
  • Step 2: Add self to the module's argument list and add the imports block

Current header (modules/environments/home-assistant/default.nix lines 1-11):

# manages home automations 
{
  config,
  lib,
  pkgs,
  ...
}:
let
  cfg = config.my.profiles.home-assistant;
  hostName = config.networking.hostName;
in

Replace lines 1-11 with:

# manages home automations
{
  config,
  lib,
  pkgs,
  self,
  ...
}:
let
  cfg = config.my.profiles.home-assistant;
  hostName = config.networking.hostName;
in

Then, immediately after the opening brace on line 12 of the modified file (i.e. at the top of the attribute set body, before options.my.profiles.home-assistant), add:

  imports = [
    # services.openthread-border-router isn't in nixos-25.11; pull from
    # nixpkgs-unstable. Package comes from the existing unstable overlay.
    "${self.inputs.nixpkgs-unstable}/nixos/modules/services/home-automation/openthread-border-router.nix"
  ];

  • Step 3: Re-run the eval check
nix eval --json .#nixosConfigurations.jupiter.options.services.openthread-border-router.enable.description 2>&1 | head -3

Expected: a JSON string describing the option (e.g. "Whether to enable the OpenThread Border Router.").

  • Step 4: Verify the service is currently disabled
nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable

Expected: false.

  • Step 5: Verify whole config still evaluates
nix eval .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath

Expected: a /nix/store/...drv path. Pre-existing trace warnings (the *.service ordered after network-online.target ones) are fine; no errors.

  • Step 6: Commit
git add machines/configuration.nix modules/environments/home-assistant/default.nix
git commit -m "$(cat <<'EOF'
feat(home-assistant): import openthread-border-router module from unstable

Pulls the services.openthread-border-router NixOS module directly from
nixpkgs-unstable since it isn't in 25.11 yet. Service stays disabled
in this commit; configuration follows.

Also promotes `self` from `_module.args` to `specialArgs` in
machines/configuration.nix, since `imports` are evaluated before
`config` and so can't reach `_module.args.self`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF
)"

Task 3: Operator handoff — get the ZBT-2 device path from jupiter

This task has no code. It collects the runtime parameter (USB serial number) that Task 4 needs.

Files: (none)

  • Step 1: Hand off

Tell the operator:

"Plug the ZBT-2 into a USB-2 port on jupiter (it's still on stock Zigbee firmware — that's fine for this step). Then run ls -l /dev/serial/by-id/ on jupiter and paste the full output back. We're after the line that contains Nabu_Casa_Home_Assistant_Connect_ZBT-2."

  • Step 2: Wait for the operator's pasted output

Expected shape: a line like lrwxrwxrwx 1 root root 13 May 10 14:30 usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_<serial-string>-if00 -> ../../ttyACM0

  • Step 3: Record the by-id path

Capture the value /dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_<serial-string>-if00 for use in Task 4. Use the by-id path (not /dev/ttyACM0) so USB renumbering can't break OTBR.
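
Copying the path by hand from the pasted listing is easy to get wrong, so the extraction can be sketched as a small filter. This assumes the listing has the `ls -l` shape shown in Step 2 (symlink name immediately before `->`); `zbt2_by_id_path` is a hypothetical helper name.

```shell
# zbt2_by_id_path: reads `ls -l /dev/serial/by-id/` output on stdin and
# prints the full by-id path of the first ZBT-2 entry (nothing if absent).
zbt2_by_id_path() {
  awk '/Nabu_Casa_Home_Assistant_Connect_ZBT-2/ {
    # The symlink name is the field immediately before "->".
    for (i = 2; i <= NF; i++) {
      if ($i == "->") { print "/dev/serial/by-id/" $(i - 1); exit }
    }
  }'
}
```

Possible usage: save the operator's paste to a file and run `zbt2_by_id_path < pasted-output.txt` (the file name is illustrative). Empty output means the dongle was not listed.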


Task 4: Enable OTBR + add HA otbr/thread components

Files:

  • Modify: modules/environments/home-assistant/default.nix

  • Step 1: Write the failing eval checks

On dev Mac:

nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable

Expected: false (still disabled from Task 2).

nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app"] — no otbr, no thread yet.

  • Step 2: Add "otbr" and "thread" to extraComponents

In modules/environments/home-assistant/default.nix, locate the extraComponents list (currently [ "matter" "mobile_app" ]) and replace it with:

      extraComponents = [
        "matter"
        "mobile_app"
        "otbr"
        "thread"
      ];
  • Step 3: Add the services.openthread-border-router block

In the same file, after the services.home-assistant.config = { ... }; block and before my.homepage.services, add:

    services.openthread-border-router = {
      enable = true;
      package = pkgs.unstable.openthread-border-router;
      openFirewall = true;
      backboneInterfaces = [ "enp3s0" ];
      radio.device = "<PASTE-BY-ID-PATH-FROM-TASK-3>";
    };

Replace <PASTE-BY-ID-PATH-FROM-TASK-3> with the literal string captured in Task 3 step 3 (e.g. "/dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_AB12CD34-if00").

  • Step 4: Run the green eval checks
nix eval --json .#nixosConfigurations.jupiter.config.services.home-assistant.extraComponents

Expected: ["matter","mobile_app","otbr","thread"].

nix eval --json .#nixosConfigurations.jupiter.config.services.openthread-border-router.enable

Expected: true.

nix eval --raw .#nixosConfigurations.jupiter.config.services.openthread-border-router.radio.url

Expected: a string like spinel+hdlc+uart:///dev/serial/by-id/usb-Nabu_Casa_..._ZBT-2_<serial>-if00?uart-baudrate=115200 (the module composes this from radio.device automatically).
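
To compare the evaluated value against something concrete rather than by eye, the expected URL can be composed from the Task 3 path. This is a sketch under assumptions: the format and the 115200 default are copied from the example string above, not from the module source, and `expected_radio_url` is a hypothetical helper name. If the locked module version adds further query parameters, the strings will differ.

```shell
# expected_radio_url DEVICE [BAUD]
# Builds the Spinel URL in the shape this plan expects the module to produce.
expected_radio_url() {
  device="$1"
  baud="${2:-115200}"   # default taken from the plan's example URL (assumption)
  printf 'spinel+hdlc+uart://%s?uart-baudrate=%s\n' "$device" "$baud"
}
```

Possible usage: `[ "$(nix eval --raw .#nixosConfigurations.jupiter.config.services.openthread-border-router.radio.url)" = "$(expected_radio_url "<by-id path from Task 3>")" ]`.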

  • Step 5: Full eval — system derivation must build
nix eval .#nixosConfigurations.jupiter.config.system.build.toplevel.drvPath

Expected: a /nix/store/...drv path with no eval errors.

  • Step 6: nix flake check for good measure
nix flake check

Expected: no errors. (Same pre-existing trace warnings as before are acceptable.)

  • Step 7: Commit
git add modules/environments/home-assistant/default.nix
git commit -m "$(cat <<'EOF'
feat(home-assistant): enable OTBR for ZBT-2 + add HA otbr/thread components

Brings up otbr-agent against the ZBT-2 over Spinel/UART, opens the
REST API on :8081, and wires HA's otbr + thread integrations so
Matter-over-Thread devices can commission through the existing
matter-server.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF
)"

Task 5: Operator handoff — flash OpenThread RCP firmware on the dongle

The dongle is currently running Zigbee NCP firmware and won't speak Spinel until reflashed. This must happen before Task 6's rebuild (otherwise otbr-agent will try to talk to a Zigbee-firmware dongle and fail).

Files: (none on dev Mac)

  • Step 1: Hand off — fetch firmware

Tell the operator:

"On any machine with a browser: download the latest ZBT-2 OpenThread RCP .gbl from https://github.com/NabuCasa/silabs-firmware-builder/releases. The asset name will look like ot-rcp-zbt-2-<version>.gbl. Get it onto jupiter — scp it over, or just curl from jupiter's shell. Confirm by running ls ~/ot-rcp-zbt-2-*.gbl on jupiter and pasting the result."

  • Step 2: Wait for confirmation

Expected: a single matching path, e.g. /home/finn/ot-rcp-zbt-2-2025.10.0.gbl.
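
The flash command in Step 3 takes one firmware file, so it helps to rule out zero or several matches before handing it over. A minimal sketch of that check, assuming the asset naming described in Step 1; `single_gbl` is a hypothetical helper name.

```shell
# single_gbl DIR
# Prints the one ot-rcp-zbt-2-*.gbl file in DIR; fails on zero or several.
single_gbl() {
  # Re-use the function's positional parameters to hold the glob matches.
  set -- "$1"/ot-rcp-zbt-2-*.gbl
  if [ "$#" -eq 1 ] && [ -f "$1" ]; then
    printf '%s\n' "$1"
  else
    echo "expected exactly one ot-rcp-zbt-2-*.gbl" >&2
    return 1
  fi
}
```

Possible usage on jupiter: `single_gbl "$HOME"`.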

  • Step 3: Hand off — flash

Tell the operator:

"On jupiter, run (substituting the actual by-id path from Task 3 and the actual .gbl filename):

nix shell nixpkgs#python313Packages.universal-silabs-flasher -c \
  universal-silabs-flasher \
    --device /dev/serial/by-id/usb-Nabu_Casa_Home_Assistant_Connect_ZBT-2_<serial>-if00 \
    flash --firmware ~/ot-rcp-zbt-2-<version>.gbl

Paste the full output. Expected duration: ~30 seconds. The tool detects the running firmware, drops the dongle into bootloader mode, writes the .gbl, and reboots back to RCP."

  • Step 4: Verify the flash succeeded

Expected output ends with something like Firmware update complete (or equivalent success message). If the tool reports CRC failure / partial write — re-run; the bootloader stays addressable.

If the flash subcommand is rejected or --help shows different syntax (universal-silabs-flasher's CLI has changed across versions), have the operator adapt the invocation to what universal-silabs-flasher --help lists; the flash --firmware <path> form has been stable since 1.0.x.


Task 6: Operator handoff — nixos-rebuild switch on jupiter

Files: (none on dev Mac)

  • Step 1: Push the branch so jupiter can fetch it

On dev Mac:

git push -u origin feature/ha-zbt-2-thread

(If the operator pulls via a different mechanism — local checkout, fileshare — adapt accordingly. The standard pattern in this repo is git pull on jupiter.)

  • Step 2: Hand off — pull + rebuild

Tell the operator:

"On jupiter:

cd ~/development/nixos      # or wherever the flake lives on jupiter
git fetch origin
git checkout feature/ha-zbt-2-thread
sudo nixos-rebuild switch --flake .#jupiter

Paste the tail of the output (everything from the first building ... line onward). Expected: build completes, switch to the new generation, no errors."

  • Step 3: Verify the switch succeeded

If the operator's pasted output includes error: or the switch failed mid-activation, stop here. Common failure: option name mismatch with whatever version of nixos-unstable is locked in the flake. Fix on dev Mac, push, ask operator to pull + rebuild again.

If the rebuild succeeded, proceed to Task 7.


Task 7: Operator handoff — service-level verification on jupiter

Files: (none)

  • Step 1: Hand off — service health

Tell the operator:

"On jupiter, run each command and paste output:

systemctl status otbr-agent.service --no-pager
journalctl -u otbr-agent.service -n 50 --no-pager
ip link show wpan0
"
  • Step 2: Verify

Expected:

  • systemctl status reports active (running).
  • journalctl shows OTBR startup messages, no repeated restart loops.
  • ip link show wpan0 shows the interface exists; state DOWN is correct (HA hasn't formed a network yet).

If otbr-agent is in a restart loop and logs Failed to open device, the device path does not match. Re-check Task 3's path.
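
"No repeated restart loops" can be made less subjective by looking at systemd's restart counter instead of reading the journal by eye. The sketch below only classifies a number; the threshold of 3 is an arbitrary illustrative default, and `restart_verdict` is a hypothetical helper name.

```shell
# restart_verdict COUNT [THRESHOLD]
# Classifies a unit's restart count; returns non-zero when it looks like a loop.
restart_verdict() {
  count="$1"
  threshold="${2:-3}"   # illustrative default, not taken from the spec
  if [ "$count" -ge "$threshold" ]; then
    echo "restart loop suspected ($count restarts)"
    return 1
  fi
  echo "stable ($count restarts)"
}
```

Possible usage on jupiter: `restart_verdict "$(systemctl show -p NRestarts --value otbr-agent.service)"`.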

  • Step 3: Hand off — mDNS publication

Tell the operator:

"On jupiter:

avahi-browse -r -t _meshcop._udp
"

Expected: one entry whose hostname matches jupiter, advertising port 8081.

If empty: backboneInterfaces is wrong. Ask the operator to run ip link show on jupiter and paste the output; pick the actual primary LAN interface, update backboneInterfaces, and rebuild again.
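
One way to narrow down the primary LAN interface is to take the device that carries the default route. That is an assumption (a host with several uplinks may need a manual choice), and `default_route_iface` is a hypothetical helper name. The sketch parses the output of `ip route show default`.

```shell
# default_route_iface: reads `ip route show default` output on stdin and
# prints the interface named after "dev" on the first default route.
default_route_iface() {
  awk '$1 == "default" {
    for (i = 2; i < NF; i++) {
      if ($i == "dev") { print $(i + 1); exit }
    }
  }'
}
```

Possible usage on jupiter: `ip route show default | default_route_iface`. The printed name is the candidate value for backboneInterfaces.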

  • Step 4: Hand off — REST API reachability

Tell the operator:

"On jupiter:

curl -s http://127.0.0.1:8081/node/state
"

Expected: a JSON state string, most likely "disabled" (HA hasn't formed a network yet).

If connection refused: OTBR isn't actually listening — re-check journalctl.
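
For triage it can help to translate the returned state into what it means for the next task. The sketch below assumes the response is a bare JSON string, as described above, and that its value is one of OpenThread's device roles (disabled, detached, child, router, leader); `describe_node_state` is a hypothetical helper name.

```shell
# describe_node_state RESPONSE
# Maps the /node/state response body to a short human-readable verdict.
describe_node_state() {
  # Strip the JSON quotes and any surrounding whitespace.
  state="$(printf '%s' "$1" | tr -d '"[:space:]')"
  case "$state" in
    disabled) echo "Thread stack not started; no network formed yet" ;;
    detached) echo "Thread stack running but not attached to a network" ;;
    child|router|leader) echo "attached to a Thread network as $state" ;;
    *) echo "unrecognized state: $state"; return 1 ;;
  esac
}
```

Possible usage on jupiter: `describe_node_state "$(curl -s http://127.0.0.1:8081/node/state)"`. Before Task 8 the first branch is the expected one.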


Task 8: Operator handoff — HA UI smoke test

Files: (none)

  • Step 1: Hand off — confirm discovery

Tell the operator:

"Open http://jupiter:8123 in a browser. Go to Settings → Devices & Services. Within ~30s of the rebuild, you should see 'Open Thread Border Router' under 'Discovered'. Click Configure. Let HA form a new Thread network (or import existing dataset if you have one). Tell me when that's done — and paste any errors if it doesn't work."

  • Step 2: Wait for confirmation

Expected: HA reports the Thread network is formed; the OTBR integration appears under 'Configured'.

If discovery doesn't happen: cross-check with Task 7 step 3 (avahi-browse). HA reads from the system's avahi cache.

  • Step 3: Hand off — Matter-over-Thread pairing

Tell the operator:

"Pick one Matter-over-Thread device. Use the HA Companion app, scan its Matter QR code, and follow the prompts. Tell me when it's paired, or paste any errors. Pairing should complete in 30-90s."

  • Step 4: Wait for confirmation

Expected: device appears under both Matter and Thread integrations in HA, and is controllable from the dashboard.

If pairing times out: see "Failure modes" table in the spec — most likely Thread mesh prefix isn't routed back to LAN. Operator runs nft list ruleset and ip -6 route on jupiter; debug from there.


Task 9: Merge to master

Files: (none)

  • Step 1: Final branch state

On dev Mac:

git log --oneline master..feature/ha-zbt-2-thread

Expected (in chronological order from oldest to newest):

  1. e8d09f4 — original ZHA commit
  2. dbeda27 — design spec
  3. <revert hash> — Revert "feat(home-assistant): enable ZHA for ZBT-2 Zigbee dongle"
  4. <task-2 hash> — feat(home-assistant): import openthread-border-router module from unstable
  5. <task-4 hash> — feat(home-assistant): enable OTBR for ZBT-2 + add HA otbr/thread components

That's a fine history to merge as-is (the ZHA→revert pair is honest about the pivot).

  • Step 2: Hand off — merge

The user runs the merge themselves (per repo policy: never commit to master without explicit consent). Tell the operator:

"If the smoke tests in Task 8 worked, merge with:

git switch master
git merge --no-ff feature/ha-zbt-2-thread
git push origin master

Or open a merge request / PR if you prefer review first."

  • Step 3: Optional cleanup

After merge:

git branch -d feature/ha-zbt-2-thread
git push origin --delete feature/ha-zbt-2-thread

Self-Review

Spec coverage:

  • Goals (4 bullets) → Tasks 2 (OTBR module wiring), 4 (OTBR enable + HA components), 5 (firmware flash), 8 (Matter-over-Thread smoke test) ✓
  • Non-goals → respected; no multipan, no auto-flash, no fallback paths ✓
  • Architecture diagram → Task 4 produces the wiring shown; Tasks 6-8 verify it ✓
  • File changes (one module) → Tasks 1, 2, 4 ✓
  • Reverts of prior ZHA commit → Task 1 ✓
  • Operator workflow steps 0-7 → Tasks 1, 2, 3, 4, 5, 6, 7, 8 ✓
  • Verification (eval-only / service-level / functional) → Tasks 2/4/6/7/8 ✓
  • Failure-mode table → referenced in Tasks 6, 7, 8 for triage ✓

Placeholder scan:

  • <PASTE-BY-ID-PATH-FROM-TASK-3> in Task 4 step 3 is intentional — it's a runtime parameter the operator fills in, captured in Task 3.
  • <serial>, <version> in shell commands are intentional placeholders for operator substitution.
  • No "TBD", "TODO", "implement later", or vague "handle errors" steps.

Type / name consistency:

  • services.openthread-border-router used consistently (matches the unstable module's option path).
  • pkgs.unstable.openthread-border-router matches the overlay (machines/configuration.nix:11).
  • extraComponents strings ("otbr", "thread") match HA Core integration names.
  • radio.device → radio.url relationship documented (module composes url from device).