Troubleshooting¶

Use this page for failed commands, runtime connection issues, stale values, or browser/runtime surfaces that do not match the expected state. For product questions, use the FAQ.

First three things to try¶

run build/validate first
confirm the config files and transport choice are the ones you think they are
reduce the system to the smallest local path that still reproduces the issue

Problem routing¶

If the problem sounds like...	Go to
runtime-to-runtime communication	Program / Communication / Runtime To Runtime
hardware or fieldbus issues	Program / I/O And Hardware
editor/runtime panel issues	Program / PLC Programming / Debugging
HMI issues	Run / Operator HMI
runtime-cloud issues	Run / Runtime Cloud

Common symptoms¶

Symptom	First check	Then go to
no `Structured Text:` commands in VS Code	extension enabled and command palette opened with `Ctrl/Cmd+Shift+P`	Installation
build fails before runtime starts	diagnostics and config files	Build, Validate, Test
runtime panel cannot connect	control endpoint, local process, and port	Debugging And Runtime Panel
values stay stale	I/O mapping and driver choice	I/O Binding
`/hmi` opens but values are wrong	runtime freshness and descriptor bindings	HMI And Web UI
remote node cannot pair	discovery, pairing, and firewall boundary	Discovery And Pairing

Common first checks¶

run diagnostics: Agent Quickstart or Build, Validate, Test
inspect config files: Config reference
verify transport/protocol choice: Protocol Matrix
verify runtime control/reload path: Compile, Validate, Reload

Runtime Cloud Troubleshooting¶

Use this runbook when runtime cloud behavior is not as expected.

1) First-Tier Checks¶

trust-runtime ctl --project <project> status
trust-runtime ctl --project <project> config.get
curl -s http://<host>:<port>/api/runtime-cloud/state | jq

Confirm:

runtime is healthy
config/profile matches intent
runtime cloud state/topology is populated
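
The three commands above can be chained so that a failure stops the sequence early. The following is a minimal sketch, not part of the tooling: the function name `first_tier_checks` is hypothetical, and it assumes `trust-runtime`, `curl`, and `jq` are on `PATH`. It only verifies that each command succeeds and that the state endpoint returns parseable, non-null JSON; judging whether the runtime is healthy and the topology is populated still requires reading the output.

```shell
# Hypothetical helper: run the first-tier checks in order, stop at the first failure.
# Arguments: PROJECT HOST PORT (the same placeholders used in the commands above).
first_tier_checks() {
  local project="$1" host="$2" port="$3"

  trust-runtime ctl --project "$project" status \
    || { echo "status check failed" >&2; return 1; }

  trust-runtime ctl --project "$project" config.get \
    || { echo "config.get failed" >&2; return 1; }

  # curl -f exits non-zero on HTTP errors; jq -e exits non-zero when the
  # response is empty, null, false, or not valid JSON.
  curl -sf "http://${host}:${port}/api/runtime-cloud/state" | jq -e . \
    || { echo "runtime cloud state is missing or not valid JSON" >&2; return 1; }

  echo "first-tier checks passed"
}
```

Call it as `first_tier_checks <project> <host> <port>` and review the printed status, config, and state before moving on to the symptom matrix.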

HardRT truth (non-negotiable):

T0 HardRT is same-host only.
Generic IP mesh (T1/T2/T3) is non-HardRT by design and must never be treated as a deterministic fallback for T0.

2) Symptom Matrix¶

Symptom	Likely Cause	Deterministic Fix
Peer runtime not visible	discovery disabled or blocked	enable `[runtime.discovery]`, verify network/mDNS, check service name uniqueness
Peer visible but stale/degraded	mesh metadata/path unavailable	verify `[runtime.mesh]` listen/auth/tls and peer reachability
`not_configured` deny on preflight	profile preconditions not met	for `plant`/`wan`, enforce `runtime.web.auth = "token"` and `runtime.web.tls = true`
`permission_denied` on cross-site write	WAN allowlist missing	add `runtime.cloud.wan.allow_write` rule for action+target and retry dry-run preflight
`acl_denied_cfg_write`	actor role too low for config write	use actor with required role (`operator`/`admin` by policy)
`contract_violation` with T0 + mesh/IP route text	attempted non-HardRT fallback	re-bind handles to `T0_HardRt`; keep mesh path for ops/diag only
revision/etag conflict	concurrent config write	re-fetch config snapshot, rebase change, retry
dispatch lacks `audit_id`	outdated runtime or failed dispatch path	verify runtime version and inspect per-target result
UI shows healthy when peer unreachable	stale transition not applied yet	re-check `/api/runtime-cloud/state` status and mesh liveliness metadata

3) Profile-Specific Checks¶

`dev`¶

runtime.cloud.profile = "dev"
local auth/TLS flexibility is expected

`plant`¶

requires token auth and TLS
denies remote dispatch when target secure metadata is missing

`wan`¶

all plant requirements, plus cross-runtime write default-deny
explicit allowlist required for write actions
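
The `plant` and `wan` preconditions can be checked mechanically before attempting a dispatch. The sketch below is illustrative only: the function name is hypothetical, and it assumes the configuration is available as JSON with nested keys mirroring `runtime.cloud.profile`, `runtime.web.auth`, and `runtime.web.tls`. The actual output shape of `config.get` may differ, so adjust the `jq` paths accordingly.

```shell
# Hypothetical helper: read a JSON config on stdin and verify the
# token-auth and TLS preconditions for the plant and wan profiles.
# The nested JSON layout is an assumption, not a documented format.
check_profile_preconditions() {
  local cfg profile auth tls
  cfg="$(cat)"
  profile="$(printf '%s' "$cfg" | jq -r '.runtime.cloud.profile // "unset"')"
  auth="$(printf '%s' "$cfg" | jq -r '.runtime.web.auth // "unset"')"
  tls="$(printf '%s' "$cfg" | jq -r '.runtime.web.tls // false')"

  case "$profile" in
    dev)
      ;;  # local auth/TLS flexibility is expected
    plant|wan)
      if [ "$auth" != "token" ] || [ "$tls" != "true" ]; then
        echo "profile $profile requires runtime.web.auth = \"token\" and runtime.web.tls = true" >&2
        return 1
      fi
      ;;
    *)
      echo "unrecognized or missing profile: $profile" >&2
      return 2
      ;;
  esac
  echo "profile $profile preconditions satisfied"
}
```

This covers only the auth and TLS preconditions. The `wan` allowlist for write actions is not inspected here and still needs a dry-run preflight.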

4) Preflight Before Dispatch¶

Use /api/runtime-cloud/actions/preflight and inspect:

allowed
denial_code
denial_reason
per-target state (reachable, stale, partitioned)

Do not skip preflight for cross-site or multi-target writes.
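
A script can refuse to dispatch unless preflight allowed the action. The sketch below assumes that `allowed`, `denial_code`, and `denial_reason` are top-level keys of the preflight response; the function name `preflight_gate` is hypothetical and `jq` must be installed. The request method and payload for the preflight call are not covered here.

```shell
# Hypothetical helper: read a preflight response (JSON) on stdin.
# Returns 0 only when .allowed is true; otherwise prints the denial details.
# Treating the three fields as top-level keys is an assumption.
preflight_gate() {
  local response allowed code reason
  response="$(cat)"
  allowed="$(printf '%s' "$response" | jq -r '.allowed // false')"

  if [ "$allowed" = "true" ]; then
    echo "preflight allowed"
    return 0
  fi

  code="$(printf '%s' "$response" | jq -r '.denial_code // "unknown"')"
  reason="$(printf '%s' "$response" | jq -r '.denial_reason // "no reason given"')"
  printf 'preflight denied: %s (%s)\n' "$code" "$reason" >&2
  return 1
}
```

Pipe the saved preflight response into it, for example `preflight_gate < preflight.json && <dispatch command>`, where `preflight.json` is a hypothetical file name. Per-target state (reachable, stale, partitioned) is not evaluated by this sketch and should still be reviewed for multi-target writes.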

5) Reliability and Stability Gates¶

cargo test -p trust-runtime --test runtime_reliability
./scripts/runtime_load_test.sh tests/fixtures/runtime_reliability_bundle
./scripts/runtime_mesh_tls_stability_gate.sh --iterations 3
./scripts/runtime_comms_conformance_gate.sh
./scripts/check_zenoh_baseline.sh
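
To keep a log per gate and stop at the first failure, the commands above can be wrapped as follows. This is a sketch under assumptions: `run_gate`, `run_all_gates`, the gate labels, and the `GATE_LOG_DIR` variable are hypothetical names, and stopping on the first failure is a design choice rather than a documented requirement.

```shell
# Hypothetical helper: run one gate, save its combined output, report the result.
# Usage: run_gate NAME COMMAND [ARGS...]
run_gate() {
  local name="$1"; shift
  local log_dir="${GATE_LOG_DIR:-target/gate-artifacts/latest}"
  mkdir -p "$log_dir"
  if "$@" >"$log_dir/$name.log" 2>&1; then
    echo "PASS $name"
  else
    echo "FAIL $name (see $log_dir/$name.log)" >&2
    return 1
  fi
}

# Run the gates listed above in order; && stops the chain at the first failure.
run_all_gates() {
  run_gate reliability cargo test -p trust-runtime --test runtime_reliability &&
  run_gate load ./scripts/runtime_load_test.sh tests/fixtures/runtime_reliability_bundle &&
  run_gate mesh_tls ./scripts/runtime_mesh_tls_stability_gate.sh --iterations 3 &&
  run_gate comms ./scripts/runtime_comms_conformance_gate.sh &&
  run_gate zenoh_baseline ./scripts/check_zenoh_baseline.sh
}
```

Run `run_all_gates` from the repository root so the relative script paths resolve.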

6) Evidence Capture Pattern¶

Store:

request payload (request_id retained)
preflight/dispatch responses
relevant /api/runtime-cloud/state snapshot
runtime logs and gate logs

Recommended location:

target/gate-artifacts/<timestamp>/...
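
One way to follow this pattern is to create the timestamped directory once and copy every artifact into it. The helper names below are hypothetical, and the UTC timestamp format is an assumption, since the runbook does not prescribe one.

```shell
# Hypothetical helper: create target/gate-artifacts/<timestamp> and print its path.
# The optional first argument overrides the root directory.
new_evidence_dir() {
  local root="${1:-target/gate-artifacts}"
  local dir
  dir="$root/$(date -u +%Y%m%dT%H%M%SZ)"   # assumed format, e.g. 20250101T120000Z
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Hypothetical helper: copy one or more files into the evidence directory.
# Usage: save_evidence DIR FILE...
save_evidence() {
  local dir="$1"; shift
  cp -- "$@" "$dir"/
}

# Example usage (file names are placeholders):
#   dir="$(new_evidence_dir)"
#   curl -s "http://<host>:<port>/api/runtime-cloud/state" > "$dir/state.json"
#   save_evidence "$dir" request.json preflight.json dispatch.json runtime.log
```

Keeping the request payload, responses, state snapshot, and logs in one directory makes it easier to correlate them by `request_id` later.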

7) Escalation Criteria¶

Escalate immediately if:

HA split-brain safeguards fail
cross-runtime writes bypass expected deny rules
realtime/T0 behavior appears to fall back to mesh/IP
audit correlation (request_id/audit_id) is missing for protected actions