# Engine — WinDivert + SOCKS5 transparent proxy for Discord

**Status**: design accepted 2026-05-01.

**Replaces**: stub `StartEngine`/`StopEngine` in `internal/gui/app.go` that just toggle a flag.

**Implements**: Phase 2 from `docs/planning/cuddly-baking-taco.md`.

## Why

The checker proves the upstream SOCKS5 proxy works. The engine is what actually routes Discord's traffic through it. Without the engine, every diagnostic in the world is theatre — the GUI just sits there saying "Active" while Discord still talks direct to discord.com.

Phase 2 turns that "Active" state into reality: kernel-level packet capture (WinDivert), NAT-style TCP redirect to a loopback listener, SOCKS5 UDP ASSOCIATE for voice, and a polished lifecycle so the user can install once, click "autostart at login", and forget the thing exists until Discord stops working — at which point the tray icon turns yellow and explains why.

## Architecture decisions (locked-in 2026-05-01)

| # | Decision | Rationale |
|---|---|---|
| **A** | GUI-only single-process; no Windows service | Friends-and-family Windows-PC, Discord runs only when user is logged in. Service mode is overengineering for v1; can be added in v0.4 if a power user asks. |
| **B1** | UAC prompt at every launch; no scheduled-task trampoline | User chose simplicity over polish. Each `drover.exe` invocation re-elevates if not admin. Autostart via `HKCU\...\Run` triggers the same prompt at login. |
| **C1** | No DPI bypass (no fake QUIC injection) | Start with the simplest pipeline that works. If a friend reports voice not working on a DPI-active provider, add C2/C3 in v0.4. |
| **D1** | Window X = hide-to-tray + first-time toast; quit only via tray menu | Industry-standard (Steam, Discord, Telegram). One-shot toast prevents the "where did it go?" surprise.
| **E3** | Contextual recovery: driver-loss → 1 reopen retry → fail-stop; proxy-loss → infinite exp-backoff (Reconnecting state); panic → fail-stop with crash dump; sleep/resume → graceful pause/resume | Different failure classes need different responses. Aggressive auto-restart on every error masks bugs; honest fail-stop on every error annoys the user during transient network blips. |

## High-level architecture

```
┌─────────────────────────────────────┐
│ drover.exe (single binary)          │
│                                     │
│ ┌──────────────┐ ┌──────────────┐   │
│ │  Wails GUI   │ │   systray    │   │
│ └──────┬───────┘ └──────┬───────┘   │
│        └───────┬────────┘           │
│      ┌─────────▼──────────┐         │
│      │       Engine       │         │
│      │   state machine    │         │
│      │ Idle / Starting /  │         │
│      │ Active / Reconn /  │         │
│      │ Failed             │         │
│      └─────────┬──────────┘         │
│      ┌─────────┼─────────────┐      │
│      ▼         ▼             ▼      │
│  ┌──────┐  ┌────────┐  ┌──────────┐ │
│  │divert│  │redirect│  │ procscan │ │
│  │ pkt  │  │ TCP+UDP│  │ (2s tick)│ │
│  └───┬──┘  └───┬────┘  └─────┬────┘ │
│      ▼         ▼             │      │
│  WinDivert   socks5          │      │
│    .sys      client          │      │
└──────────────────────────────┼──────┘
                               │
┌────────────┐   ┌─────────────▼───┐
│   kernel   │   │ upstream SOCKS5 │
│ packet cap │   │    (mihomo)     │
└────────────┘   └─────────────────┘
```

## File layout

```
cmd/drover/
  main.go                  existing — extend with engine startup, single-instance check
  uac_windows.go           new — IsAdmin, ReElevate
  console_windows.go       existing
  autoupdate_windows.go    existing
internal/engine/
  engine.go                new — orchestration, state machine, lifecycle
  state.go                 new — Idle/Starting/Active/Reconnecting/Failed enum + transitions
  recovery.go              new — failure classifier → action mapper
  health.go                new — heartbeat timer, traffic detector
  power_windows.go         new — WM_POWERBROADCAST listener (sleep/resume)
internal/divert/
  divert.go                new — WinDivert handle wrapper
  filter.go                new — filter expression builder
  packet.go                new — IPv4 + TCP/UDP parse + checksum recompute
  installer.go             new — extract embedded WinDivert.sys/.dll on first run
  divert_arm64.go          new — stub returning "ARM64 not supported"
internal/socks5/           NEW — production client (separate from internal/checker/socks5.go)
  client.go                new — TCP CONNECT + greet/auth
  udp.go                   new — UDP ASSOCIATE + encapsulate/decapsulate
  pool.go                  new — control-channel pool (deferred to P2.5 if needed)
internal/redirect/
  tcp.go                   new — NAT-loopback redirect listener + per-flow pump
  udp.go                   new — per-flow UDP tracker + encap/decap
internal/procscan/
  procscan.go              new — Toolhelp32 snapshot, periodic PID resolver
internal/tray/
  tray.go                  new — getlantern/systray icon + menu
  icons.go                 new — embed idle/active/reconnecting/error ICOs
internal/autostart/
  autostart_windows.go     new — HKCU\...\Run registry toggle
internal/single/
  single_windows.go        new — named mutex + activation pipe
internal/config/
  config.go                new — TOML schema + defaults
  loader.go                new — load/save with file lock
  watcher.go               new — fsnotify hot-reload
internal/gui/
  app.go                   existing — extend with engine bindings
  frontend/...             existing — wire engine controls + autostart checkbox
third_party/windivert/     existing — WinDivert64.sys, WinDivert.dll, LICENSE-LGPL
third_party/icons/         new — tray/{idle,active,reconnecting,error}.ico
```

## Engine state machine

```
               ┌────────┐
               │  Idle  │ ◄────────────────── (initial)
               └────┬───┘
                    │ user clicks "Start engine"
                    ▼
              ┌────────────┐
       ┌──────│  Starting  │── any error ──────┐
       │      └─────┬──────┘                   │
       │            │ all checks ok            │
       │            ▼                          │
       │      ┌────────────┐                   │
       │      │   Active   │ ◄─── recover ───┐ │
       │      └─────┬──────┘                 │ │
       │            │ proxy lost / SOCKS5    │ │
       │            │ control channels died  │ │
       │            ▼                        │ │
       │      ┌─────────────┐                │ │
       │      │Reconnecting │── 5 min cap ───┐ │
       │      └─────┬───────┘                │ │
       │            │ recovered              │ │
       │            ▼                        │ │
       │      back to Active                 │ │
       │                                     │ │
       │ Stop button ─►──────────────────────┤ │
       │                                     ▼ ▼
       │                                  ┌────────┐
       └──── Stop ───────────────────────►│ Failed │
                                          └────┬───┘
                                               │ user clicks Retry
                                               ▼
                                      (back to Starting)
```

States visible to GUI as `EngineStatus`:

- `Idle` — engine off, tray icon grey, GUI shows "Start" button
- `Starting` — handle being opened, procscan running, health-check; tray yellow with spin
- `Active` — packets flowing; tray green; live stats
updating
- `Reconnecting` — proxy unreachable, exponential backoff in progress; tray yellow; "Reconnecting (3rd attempt)"
- `Failed` — driver lost twice OR panic OR Reconnecting hit 5 min cap. Tray red. GUI shows error message + Retry button.

## E3 recovery rules (failure classifier)

```go
// internal/engine/recovery.go

type FailureClass int

const (
	ClassDriverLost       FailureClass = iota // WinDivert handle invalid, ERROR_INVALID_HANDLE on Recv
	ClassDriverGone                           // WinDivertOpen returns ERROR_FILE_NOT_FOUND or similar
	ClassProxyUnreachable                     // SOCKS5 control TCP connection rejected/timeout
	ClassPanic                                // recover() in goroutine
	ClassSleep                                // WM_POWERBROADCAST suspend
	ClassResume                               // WM_POWERBROADCAST resume
	ClassFatal                                // anything we can't classify
)

type Action int

const (
	ActionRetryOnce  Action = iota // sleep 2s, reopen, if fails again → Failed
	ActionExpBackoff               // 1s → 5s → 30s cap, infinite, max 5min cumulative
	ActionFailStop                 // straight to Failed, write crash dump
	ActionPause                    // drain in-flight, close sockets, transition to Reconnecting
	ActionResume                   // wait 5s, reopen handle, transition to Active
)

func ClassifyFailure(err error, class FailureClass) Action
```

| Class | Action | UI feedback |
|---|---|---|
| `DriverLost` | RetryOnce | Status="reopening driver" |
| `DriverGone` | FailStop | "Driver missing — reinstall Drover" |
| `ProxyUnreachable` | ExpBackoff | "Reconnecting (Nth attempt)…" |
| `Panic` | FailStop | "Engine crashed — log saved to %PROGRAMDATA%\\Drover\\logs\\crash-*.txt" |
| `Sleep` | Pause | "Paused (system sleep)" |
| `Resume` | Resume | "Resuming…" then back to Active |

**Health-check before Start engine**: GUI's Start button first runs `internal/checker.Run` with a reduced subset (tcp + greet + udp tests, 2s budget, no voice-quality). If any fails, the engine doesn't start and the GUI shows what failed. Prevents the "I clicked Start but Discord still doesn't work" mystery.

**Heartbeat timer**: every 5s, sample `(rxBytes_now - rxBytes_5sAgo) > 0`.
If false for 30s while Active and procscan reports Discord PIDs > 0, set status=`Active (no traffic)` (informational sub-state, tray green→yellow but state machine stays in Active). User sees this and can investigate (Discord might just be idle).

**Crash dumps**: panic recover in any engine goroutine writes `%PROGRAMDATA%\Drover\logs\crash-YYYYMMDD-HHMMSS.txt` with full stack + goroutine dump + version. Then transitions to Failed.

## WinDivert layer

### Filter expression (rebuilt on PID list change)

```
outbound and (tcp or udp) and ip
and (processId == 12345 or processId == 67890 or ...)
and processId != <own_pid>
and ip.DstAddr != <upstream_proxy_ip>
and not (ip.DstAddr >= 224.0.0.0 and ip.DstAddr <= 239.255.255.255)
and not (ip.DstAddr >= 127.0.0.0 and ip.DstAddr <= 127.255.255.255)
and not (ip.DstAddr >= 169.254.0.0 and ip.DstAddr <= 169.254.255.255)
```

Notes:

- `ip` (IPv4) only — no `ipv6` clause. Discord client falls back to v4 in ~150ms via Happy Eyeballs.
- `processId != own_pid` is critical — without it our own SOCKS5 traffic to upstream gets caught and infinite-looped.
- Multicast/loopback/link-local explicitly excluded (Discord never talks to those, but extra safety).

If the upstream proxy IP cannot be resolved at engine start, we fail-stop with a clear message — we cannot build a correct filter without it.

### Library choice

Use `github.com/imgk/divert-go` v0.1.0 (existing dep proposal — verify it is still maintained when implementing P2.1). If unmaintained / broken, write thin syscall bindings directly — WinDivert C API is small (~6 functions used).

### Driver lifecycle

1. **First run**: extract embedded `WinDivert64.sys` + `WinDivert.dll` from Go `embed.FS` into `%PROGRAMDATA%\Drover\windivert\`. SHA256-verify against expected hashes (compiled in at build time).
2. **Open handle**: `WinDivertOpen(filter, layer=NETWORK, priority=0, flags=0)`. The driver auto-installs as a Windows service named "WinDivert" on first open.
3. **Driver remains installed across reboots** — we don't uninstall on Stop. Uninstaller (Inno Setup) explicitly does `sc stop WinDivert && sc delete WinDivert` on uninstall.

### Driver edge cases (D-series in matrix)

- **D-1: not installed** → embedded copy + auto-install on WinDivertOpen.
- **D-2: old v1.x** (zapret legacy) → `WinDivertOpen` returns `ERROR_DRIVER_FAILED_PRIOR_UNLOAD`. Detect: query service "WinDivert" via `OpenServiceW` + `QueryServiceStatusEx` to read binary path → check version resource. Show "Outdated WinDivert detected from another tool. Stop the other tool and reboot."
- **D-3: corrupted .sys** → SHA256 mismatch on extract. Reinstall path (delete + recopy + retry).
- **D-4: AV quarantine** → embedded bytes don't match expected → show specific error: "Antivirus may have quarantined WinDivert64.sys. Add `%PROGRAMDATA%\Drover\` to your AV exclusions and restart Drover."
- **D-5: reboot pending** → install successful but service not started → show "Reboot required to activate driver" with no retry button.
- **D-7: ARM64** → `runtime.GOARCH` check at startup; on ARM64 show "Drover requires x86-64 Windows. WinDivert does not support ARM64."

## TCP redirect (NAT-loopback)

### Mechanism

1. On engine start, bind a TCP listener on `127.0.0.1:0` (OS picks unused port). Save the port number.
2. WinDivert sees a new SYN from `Discord.exe → real_target_ip:real_target_port`. Engine:
   a. Modifies the IP header: `dst_addr = 127.0.0.1`, `dst_port = listener_port`. Stores mapping `(src_port → real_target_ip:port)` in a `sync.Map` with TTL 30 min.
   b. Recomputes IP + TCP checksums.
   c. Reinjects via `WinDivertSend` with direction=outbound. The kernel routes to loopback because dst is now 127.0.0.1.
3. Listener `accept()` returns a conn from `127.0.0.1:src_port`. Engine looks up mapping by `src_port`, finds real_target.
4. Engine opens fresh SOCKS5 control TCP to upstream, does greet + (auth if config) + CONNECT to real_target_ip:port.
5. Once SOCKS5 returns REP=00, `io.Copy` pumps bytes both directions until EOF on either side.
6. Conn close → drop mapping.

### TCP edge cases

- **T-1: listener bind fails** → fail-stop "could not bind loopback listener". Should never happen (random unused port).
- **T-2: 100+ concurrent flows** — sync.Map scales fine. Bound only by Discord's TCP usage (typically 50).
- **T-3: TCP retransmits** — handled by OS at both sides of the loopback.
- **T-4: IPv6** — dropped at filter level. Discord falls back to v4.
- **T-5: half-closed** — `io.Copy` returns on EOF in one direction; we close the other side via `defer conn.Close()`.
- **T-6: mapping leak** if conn never properly closes — TTL 30min sweeper goroutine deletes stale entries.

## UDP redirect (SOCKS5 UDP ASSOCIATE)

### Mechanism

1. WinDivert sees outbound UDP from `Discord.exe:src_port → real_target_ip:port`. Engine:
   a. Looks up mapping by `(src_ip, src_port, real_target_ip, real_target_port)`. If absent:
   b. **Open new SOCKS5 control TCP** to upstream. Greet + (auth) + UDP ASSOCIATE.
   c. Receive relay endpoint `(relay_ip, relay_port)` — if BND.ADDR is `0.0.0.0` substitute `upstream_proxy_ip`.
   d. Open client-side UDP socket on `127.0.0.1:0`. Save mapping `flow_id → {control_tcp, relay, client_udp}`.
2. **Outbound packet path**: encap with SOCKS5 UDP header `00 00 | 00 | ATYP=01 | DST_IP(4) | DST_PORT(2) | DATA`. Send via `client_udp.WriteTo(packet, relay)`. Don't reinject the original packet — drop it (we sent the encapsulated version through the relay).
3. **Inbound packet path** (separate goroutine per flow): `client_udp.ReadFrom(buf)` → strip 10-byte SOCKS5 header → fabricate an IPv4+UDP packet with `src=real_target_ip:port, dst=Discord_src_ip:src_port`, recompute checksums → `WinDivertSend` direction=inbound. Discord sees a normal reply from real_target.
4. Idle TTL 5 min: any flow with no packets for 5 min → close control_tcp + client_udp + remove mapping.
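The 10-byte header from step 2 and its removal in step 3 can be sketched as a pair of helpers. This is a minimal illustration that assumes IPv4 only (ATYP=01), as the rest of this section does; the names `EncapUDP` and `DecapUDP` are hypothetical and not part of the planned `internal/socks5` API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"net/netip"
)

// socks5UDPHeaderLen is RSV(2) + FRAG(1) + ATYP(1) + IPv4(4) + PORT(2).
const socks5UDPHeaderLen = 10

// EncapUDP prepends the SOCKS5 UDP request header (RFC 1928, section 7)
// for an IPv4 destination. Hypothetical helper name, not from the spec.
func EncapUDP(dst netip.AddrPort, payload []byte) ([]byte, error) {
	if !dst.Addr().Is4() {
		return nil, errors.New("only IPv4 destinations (ATYP=01) are handled")
	}
	out := make([]byte, socks5UDPHeaderLen, socks5UDPHeaderLen+len(payload))
	// out[0:2] is RSV = 00 00 and out[2] is FRAG = 00; both are already zero.
	out[3] = 0x01 // ATYP = IPv4
	ip := dst.Addr().As4()
	copy(out[4:8], ip[:])
	binary.BigEndian.PutUint16(out[8:10], dst.Port())
	return append(out, payload...), nil
}

// DecapUDP strips the header from a datagram received from the relay and
// returns the original sender plus the payload (a sub-slice of pkt).
func DecapUDP(pkt []byte) (netip.AddrPort, []byte, error) {
	if len(pkt) < socks5UDPHeaderLen {
		return netip.AddrPort{}, nil, errors.New("datagram shorter than SOCKS5 UDP header")
	}
	if pkt[2] != 0x00 {
		return netip.AddrPort{}, nil, errors.New("fragmented datagram (FRAG != 0) dropped")
	}
	if pkt[3] != 0x01 {
		return netip.AddrPort{}, nil, errors.New("unsupported ATYP, only IPv4 is handled")
	}
	var ip [4]byte
	copy(ip[:], pkt[4:8])
	src := netip.AddrPortFrom(netip.AddrFrom4(ip), binary.BigEndian.Uint16(pkt[8:10]))
	return src, pkt[socks5UDPHeaderLen:], nil
}
```

Passing a datagram through `EncapUDP` and then `DecapUDP` should return the same address and payload, which is the round-trip property the `socks5/udp` unit test is meant to check.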
### UDP edge cases

- **U-1**: each flow gets its own control TCP. No pool in v1 (overhead is ~5KB per flow, fine for ~10 active flows).
- **U-2: idle leak** → 5min TTL.
- **U-3: Discord changes voice region** mid-call → old flow goes idle (5min TTL), new flow opens. Brief glitch.
- **U-4: UDP fragments** → FRAG support is optional in SOCKS5 (RFC 1928) and we do not implement it. Drop. Discord packets are typically <1500 bytes; fragmentation rare.
- **U-5: control TCP dies** → next packet detects via `Write` error → close mapping → next-next packet opens fresh control. Audio glitch ~2-3s.

## Process scanning

### Mechanism

`internal/procscan` runs every 2 seconds:

1. `CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0)` → enumerate via `Process32First`/`Process32Next`. Microseconds.
2. Filter by `szExeFile` against config `targets.processes` (case-insensitive on Windows).
3. Diff vs previous PID set. If different → notify engine to rebuild filter expression and reopen WinDivert handle.

### Race: Discord starts up to 2s before procscan catches it

Mitigation: at engine `Start`, do **synchronous initial scan** before opening WinDivert handle. After that, the periodic 2s tick handles ongoing changes.

### Process edge cases

- **P-1: Discord PID changes** → 2s scan + 50ms reopen gap with direct traffic. Acceptable.
- **P-2: multiple Discord variants**: default config includes `Discord.exe`, `DiscordCanary.exe`, `DiscordPTB.exe`, `Update.exe`. Vesktop **opt-in** via config (not default).
- **P-3: Update.exe** (Discord's updater) included in default — it downloads patches via HTTP and we want those proxied too.
- **P-5: PID re-use** (Discord exits, Chrome takes the PID before next scan) → 2s window where Chrome packets get proxied. Cosmetic, low-impact.

## Self-loop protection

The engine itself opens TCP/UDP connections to the upstream proxy. Without protection, the WinDivert filter would catch our own packets, encapsulate them in another SOCKS5 layer, infinite loop in seconds.

Three layers of defense:

1. `processId != own_pid` in the filter expression.
2. `ip.DstAddr != <upstream_proxy_ip>` (resolved once at engine start; if upstream uses DDNS we re-resolve every 30s of failed reconnects).
3. Listener and SOCKS5 client always bind to `127.0.0.1` — even if filter leaks, loopback traffic is excluded by `not (ip.DstAddr >= 127.0.0.0 ...)`.

## UAC + autostart (B1)

### Elevation

`cmd/drover/main.go` startup sequence:

```go
func main() {
	// 1. AttachConsole for CLI compatibility (existing)
	attachConsole()

	// 2. Single-instance check (mutex). If second instance, send "show" to first and exit.
	if !single.AcquireMutex() {
		single.ActivateExistingInstance()
		os.Exit(0)
	}

	// 3. Parse Cobra commands. CLI sub-commands like `--check` and `--version` don't need admin
	// and can run as user. The default GUI mode requires admin for WinDivert.
	if cmdNeedsAdmin() && !uac.IsAdmin() {
		uac.ReElevate(os.Args[1:]) // ShellExecute("runas", ...) + exit
		os.Exit(0)
	}

	// 4. Auto-update check (existing). Replace exe + relaunch if needed.
	autoUpdateOnStartup()

	// 5. Boot Wails GUI + engine.
	gui.Run(Version)
}
```

`uac.ReElevate` uses `ShellExecuteW` with `lpVerb="runas"`. If user cancels UAC, `ShellExecute` returns `SE_ERR_ACCESSDENIED` → we exit cleanly without an error dialog (the user already saw their cancel intent).

### Autostart

Implemented via `HKCU\Software\Microsoft\Windows\CurrentVersion\Run\DroverGo`:

- Value type: REG_SZ, value: full path to `drover.exe` with no args
- Set on toggle ON, deleted on toggle OFF
- GUI Settings tab has a checkbox "Запускать при входе в Windows" ("Launch at Windows sign-in") that reads/writes this key

**Edge case A-5**: User disables autostart via Task Manager → Startup Apps. Windows writes a `Disabled` mark in `HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\StartupApproved\Run`. On GUI mount we check both keys; if Disabled → checkbox shown unchecked (user wins).

**Edge case A-6**: Stale path (drover.exe was moved).
On every launch we re-write the key value to `os.Executable()` if autostart is enabled. Self-healing.

## Tray + window (D1)

### Tray icon (4 ICO files embedded)

| State | Icon | When shown |
|---|---|---|
| `idle` | grey | Engine not running |
| `active` | green | Engine running, traffic flowing |
| `reconnecting` | yellow | Reconnecting state OR no-traffic-detected |
| `error` | red | Failed state |

### Tray menu (right-click)

```
[●] Active · 2h 14m · ↑ 142 KB/s ↓ 1.2 MB/s    [disabled status row, dynamic]
─────────────────────────────────────
[⏸] Stop proxying                              [primary action, contextual]
[🔍] Run check                                  [opens window + auto-runs check]
─────────────────────────────────────
[🪟] Show window                                [hidden when window is visible]
[📁] Open log file
─────────────────────────────────────
[🔄] Check for updates
[ℹ] About
─────────────────────────────────────
[✕] Quit
```

The status row is updated every 1s while engine is running.

### Click behaviors

- Single-click tray icon → toggle window visibility
- Double-click tray icon → open window (no toggle, always show)
- X on window title bar → hide to tray (D1)
  - First-time only: toast "Drover свёрнут в трей. Engine продолжает работать. Закрыть полностью — через меню трея → Quit." (in English: "Drover is minimized to the tray. The engine keeps running. To close it completely, use the tray menu → Quit."). Track via `config.ui.shown_tray_toast = true`.
- Quit from tray menu → graceful engine stop → exit cleanly

### Library

`github.com/getlantern/systray`. Stable on Win10/11 modulo the explorer-restart edge case which the library handles internally.

## Single-instance enforcement

Mutex name: `Global\DroverGoInstance-<installID>` where `installID = SHA256(os.Executable())[:16]`. This way:

- Installed copy at `C:\Program Files\Drover\drover.exe` and a portable copy at `D:\portable\drover.exe` get different mutexes — both can run.
- Two simultaneous launches of the same install fight over the mutex; second loses.

Activation pipe: `\\.\pipe\drover-gui-`. Second instance opens it, writes `{"action":"show"}`, closes.
First instance's listener goroutine pops the window to foreground.

If first instance crashes without cleanup → mutex disappears at process death (kernel handle table cleanup). Next launch acquires normally.

## Sleep/resume handling

`WM_POWERBROADCAST` listener via Windows message loop in a dedicated goroutine. Uses `RegisterPowerSettingNotification` for fine-grained events.

| Event | Action |
|---|---|
| `PBT_APMSUSPEND` | Engine: drain in-flight packets (give 200ms), close all SOCKS5 control TCPs, close WinDivert handle, set status="paused (sleep)" |
| `PBT_APMRESUMEAUTOMATIC` or `PBT_APMRESUMESUSPEND` | Wait 5s for network reconnect (poll `GetIpForwardTable2` for default route presence), reopen WinDivert handle, run health-check, transition Active |

## Stats counters

Atomic counters in `internal/engine/stats.go`:

- `bytesIn uint64` — bytes received from upstream (decapsulated UDP + TCP `io.Copy` returns)
- `bytesOut uint64` — bytes sent to upstream
- `tcpFlowsActive int32` — current count of open TCP redirects
- `udpFlowsActive int32` — current count of open UDP flows
- `startedAt time.Time` — engine start time (for uptime)

Per-flow counters discarded on flow close (no aggregation needed for v1).

Tray status row updates from these every 1s. GUI live stats panel does the same via Wails event `stats:update` (existing path).

Lifetime totals persisted to `%PROGRAMDATA%\Drover\stats.json` every 60s and on Stop.

## Config schema (TOML)

`%APPDATA%\Drover\config.toml`:

```toml
# Drover-Go config — auto-managed by GUI; manual edits hot-reload via fsnotify.
version = 1

[proxy]
host = "95.165.72.59"
port = 12334
auth = false
login = ""
password = ""
udp_associate_timeout = "5s"
tcp_connect_timeout = "10s"

[targets]
processes = ["Discord.exe", "DiscordCanary.exe", "DiscordPTB.exe", "Update.exe"]
include_vesktop = false

[skip]
# CIDR ranges to never proxy. Local + link-local always implicitly skipped at filter level.
extra_skip_cidrs = []
multicast = true

[ui]
log_level = "info"
log_max_mb = 10
log_backups = 3
tray_icon = true
auto_start = false        # mirror of HKCU\...\Run
shown_tray_toast = false  # one-shot first-close toast tracking
theme = "dark"            # dark | light | auto

[update]
check_on_startup = true
forgejo_repo = "git.okcu.io/root/drover-go"

[engine]
heartbeat_interval = "5s"
no_traffic_warn_after = "30s"
reconnect_backoff_initial = "1s"
reconnect_backoff_max = "30s"
reconnect_total_cap = "5m"
```

Edge cases:

- **M-4 corrupted TOML** → log warning + use defaults + GUI shows banner "Config error line N — running with defaults".
- **M-7 hot-reload** → fsnotify on the file. On change: re-parse → if proxy section changed → engine restart (Stop → wait clean → Start). Other sections apply live.
- **Config migration** v1→v2 handled by `version` field; missing version assumes 1.

## Edge case matrix (full)

This is the master list. Every row must have a corresponding test or explicit "verified manually" note in the implementation plan.
| # | Edge case | Mitigation | Test |
|---|---|---|---|
| **D-1** | WinDivert.sys not installed | Embed binary, copy to %PROGRAMDATA%, WinDivertOpen auto-loads | manual: clean Win11 VM |
| **D-2** | Old WinDivert v1.x present (zapret legacy) | Service version query → "remove old version first" error | manual: install zapret first, verify error |
| **D-3** | Driver corrupted | SHA256 verify on extract → reinstall flow with progress | unit test: SHA256 mismatch path |
| **D-4** | AV quarantines our embedded .sys | Specific AV-friendly error message + README link | manual: Defender enabled + first run |
| **D-5** | Reboot pending after install | Show "Reboot to activate driver" | manual: trigger via DISM |
| **D-7** | ARM64 Windows | Detect at startup, refuse install | unit: GOARCH=arm64 build returns expected error |
| **P-1** | Discord PID changes | 2s procscan + filter rebuild | integration: kill+restart Discord, verify continuity |
| **P-3** | Update.exe traffic | Default list includes it | integration: trigger Discord update, verify Update.exe traffic proxied |
| **P-5** | PID re-use | Cosmetic 2s window | accept |
| **L-1** | Self-loop (drover's own SOCKS5 traffic) | Filter excludes own_pid + upstream IP | unit: filter expression builder verifies own PID in output |
| **T-4** | IPv6 Discord targets | Drop at filter level; Happy Eyeballs falls back | manual: verify with `netsh interface ipv6 set route ::/0 disabled` |
| **T-6** | TCP mapping leak | 30min TTL cleanup | unit: TTL sweeper test |
| **U-2** | Idle UDP flow leak | 5min TTL cleanup | unit: TTL sweeper test |
| **U-4** | UDP fragments | Drop (FRAG is optional in SOCKS5 and not implemented) | accept (rare) |
| **A-1** | User non-admin | UAC re-launch on startup | manual: standard user account |
| **A-2** | UAC cancelled | Clean exit, no error dialog | manual: cancel UAC prompt |
| **A-3** | UAC at every login (autostart) | Accepted per B1 | document in README |
| **A-5** | Autostart disabled via Task Manager | Detect StartupApproved key, sync GUI checkbox | unit: registry mock |
| **TR-1** | Tray icon disappears on explorer.exe restart | systray library handles re-attach | manual: kill+restart explorer.exe |
| **TR-3** | First-time tray toast | Track `ui.shown_tray_toast` in config | unit: config writer |
| **SI-1** | Mutex collision portable vs installed | installID = SHA256(exe path)[:16] | unit: two paths → two mutexes |
| **SI-3** | First instance crashed without cleanup | Kernel cleans mutex on process death | manual: kill -9 first, launch second |
| **SR-1** | System sleep | WM_POWERBROADCAST listener → graceful pause | manual: trigger sleep on test machine |
| **SR-2** | System resume | Wait 5s network → reopen handle → resume | manual: wake from sleep |
| **UP-1** | Auto-update during active engine | Graceful shutdown → replace exe → relaunch with prior state | manual: stage v0.1 → v0.2 update during voice call |
| **M-1** | VPN concurrent | WinDivert captures before VPN encapsulation; SOCKS5 traffic to the upstream IP is expected | manual: with WireGuard + Drover both active |
| **M-4** | Config corrupted | Use defaults + warning banner | unit: malformed TOML → defaults applied |
| **M-5** | Proxy IP changed (DDNS) | Re-resolve hostname every 30s of failed reconnect | unit: hostname resolver retry |
| **M-7** | Hot-reload config | fsnotify → engine restart | integration: edit TOML, observe restart |

## Out of scope (Phase 3+)

- DPI bypass / fake QUIC injection (decision **C1**) — add as opt-in toggle in v0.4 if needed
- Windows service mode (decision **A**) — add for power users in v0.4 if requested
- IPv6 SOCKS5 ATYP=04 — add when we hit a v6-only proxy
- ARM64 Windows — add when WinDivert ships ARM64 driver (waiting on basil00 upstream)
- Multi-user PC scenarios — single-user assumption baked in
- Vesktop default-on — stays opt-in via `targets.include_vesktop = true`
- Custom DNS resolver / DNS-over-proxy — out of scope; DNS goes direct, document in README

## Phase 2 milestones

Each
milestone is a separate `writing-plans` invocation followed by `subagent-driven-development` execution.

### P2.1 — TCP-only MVP (3-4 days)

**Scope**: WinDivert handle, filter expression, packet parser, TCP NAT-loopback redirect, SOCKS5 client (TCP CONNECT only), procscan, self-loop protection, basic engine state machine (Idle/Starting/Active/Failed without Reconnecting yet).

**Acceptance**:

- Run drover.exe on Win11 with admin
- Discord chat + Discord API requests routed through SOCKS5 (verify via tcpdump on mihomo: should see TCP CONNECT to discord.com:443 from upstream IP)
- Voice does NOT yet work (UDP path absent) — documented expectation
- Stop button cleanly closes everything in <500ms
- Driver remains installed after exit (verify `sc query WinDivert`)
- No self-loop infinite traffic (verify: bytes in == bytes out, not exponentially growing)

### P2.2 — UDP voice (3-4 days)

**Scope**: SOCKS5 UDP ASSOCIATE primitives (production-grade, not the diagnostic-only fork in checker), UDP flow tracker, packet encap/decap, IPv4-fabrication-and-reinject for inbound path.

**Acceptance**:

- Voice call in Discord through proxy works without audible degradation
- Up to 4 simultaneous voice calls (ish) work without flow leakage
- Idle voice flow cleanup at 5min TTL (verified via debug log)
- Mid-call proxy disconnect → flow drops → re-opens within 2s on next outbound packet → ~2-3s audible glitch
- No memory leak after 1h voice call (RSS stable ±5MB)

### P2.3 — E3 recovery + sleep/resume (2 days)

**Scope**: failure classifier, contextual retry policies, Reconnecting state, exponential backoff, WM_POWERBROADCAST listener, heartbeat health-check.
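The exponential backoff named in this scope follows the schedule from the E3 rules (1s → 5s → 30s, at most 5 min in total). A minimal sketch, assuming that only the waiting time counts toward the 5 min budget; `Backoff`, `NewBackoff`, `Next` and `Reset` are hypothetical names, not the planned `internal/engine` API:

```go
package main

import "time"

// Backoff hands out reconnect delays for the Reconnecting state.
// Hypothetical type; the defaults mirror the [engine] config values.
type Backoff struct {
	Steps    []time.Duration // delay ladder; the last entry repeats
	TotalCap time.Duration   // cumulative waiting budget before giving up
	attempt  int
	spent    time.Duration
}

// NewBackoff returns the 1s → 5s → 30s ladder with a 5 min total cap.
func NewBackoff() *Backoff {
	return &Backoff{
		Steps:    []time.Duration{time.Second, 5 * time.Second, 30 * time.Second},
		TotalCap: 5 * time.Minute,
	}
}

// Next returns the delay to wait before the next reconnect attempt.
// ok is false once waiting that long would exceed TotalCap, which is the
// point where the engine would transition to Failed.
func (b *Backoff) Next() (delay time.Duration, ok bool) {
	i := b.attempt
	if i >= len(b.Steps) {
		i = len(b.Steps) - 1
	}
	d := b.Steps[i]
	if b.spent+d > b.TotalCap {
		return 0, false
	}
	b.attempt++
	b.spent += d
	return d, true
}

// Reset is called after a successful reconnect (back to Active).
func (b *Backoff) Reset() {
	b.attempt = 0
	b.spent = 0
}
```

With these defaults the ladder yields 11 delays (1s, 5s, then nine waits of 30s, 276s in total) before `Next` reports that the cap is reached.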
**Acceptance**:

- Stop mihomo on LXC 102 mid-session → engine transitions Active → Reconnecting → Active when mihomo back up (within 30s of recovery)
- Trigger machine sleep mid-voice-call → engine pauses gracefully → wake → engine resumes within 10s after network up → voice continues (Discord client itself reconnects)
- WinDivert handle externally killed (`sc stop WinDivert && sc start WinDivert`) → engine reopens once → if second kill within 30s → Failed with crash log
- Heartbeat detects "no traffic" while Discord open and idle → tray turns yellow with "no traffic" tooltip → no Failed transition

### P2.4 — Tray + autostart + engine UI (2-3 days)

**Scope**: getlantern/systray integration, 4 ICO icons, tray menu (D1 + first-time toast), autostart checkbox in GUI Settings tab, Start/Stop buttons in main window wired to engine, status indicator with state machine awareness, single-instance enforcement.

**Acceptance**:

- Toggle autostart on → reboot → drover launches at login (after UAC accept)
- X on window → first-time toast → second X → silent hide
- Start button only enabled when checker passed (or in Failed state with Retry)
- Tray icon updates within 200ms of state change
- Two simultaneous launches → second activates first's window and exits silently
- Status row in tray menu updates every 1s while Active

### P2.5 — Polish (2-3 days)

**Scope**: crash dumps, config hot-reload via fsnotify, AV-friendly error messages, all remaining edge cases from matrix, README troubleshooting, install/uninstall verification on clean Win11 VM.

**Acceptance**:

- Every edge case in the matrix has either a passing test or a verified manual reproduction note in `docs/testing/p2-edge-cases.md`
- Install on clean Win11 VM, run for 1 hour without intervention, no errors
- Uninstall via Apps & Features removes everything except optionally-kept config (asked at uninstall)
- README has SmartScreen + AV troubleshooting sections with screenshots

**Total**: ~12-16 days to v1.0.0.
## Testing strategy

### Unit tests (per-package)

- `divert/filter`: filter expression builder produces expected strings for various PID lists
- `divert/packet`: parse + serialize + checksum recompute is round-trip identity
- `engine/recovery`: failure classifier returns expected Action for each FailureClass
- `socks5/udp`: encap/decap round-trip
- `procscan`: snapshot diffing, mocked toolhelp32
- `autostart`: registry read/write/disabled-detection (with mock registry)
- `single`: mutex acquire + release lifecycle
- `config`: defaults applied, malformed TOML → defaults + warning, version migration

### Integration tests (each milestone has its own)

- `engine_test.go`: mock WinDivert + mock SOCKS5 server in-process, exercise full pipeline
- `redirect_test.go`: spin up TCP listener, fake Discord client, fake SOCKS5 server, verify bytes flow

### Manual test plan (per milestone, in `docs/testing/p2--manual.md`)

Each manual test case is a numbered step-by-step with expected outcome. Run on clean Win11 VM snapshot before each milestone tag.

### End-to-end (manual, before v1.0.0)

Full user journey in `docs/testing/p2-e2e.md`:

1. Download installer from Forgejo release
2. Install via setup.exe (UAC prompt)
3. First launch: configure proxy, run check, click Start
4. Run Discord, place voice call → verify routing via mihomo logs
5. Toggle autostart on
6. Reboot → verify drover starts at login (UAC accept)
7. Sleep + wake cycle → verify continuity
8. Stop mihomo → verify Reconnecting state → restart mihomo → verify recovery
9. Quit via tray menu → verify clean shutdown
10. Uninstall → verify cleanup

## Open questions / assumptions to validate during P2.1

1. **`imgk/divert-go` v0.1.0 still works with WinDivert v2.2.2?** If not, switch to direct syscall bindings. Verify in P2.1 day 1.
2. **Filter expression length limit** — WinDivert filter expressions have a max length.
   With 4 Discord PIDs + own PID + upstream IP exclusion + multicast we should be well under, but if user adds 10+ Vesktop variants we might hit it. Verify and document limit during P2.1.
3. **`WinDivertSend` for inbound packets we synthesize** — does the kernel correctly route a fabricated `dst=Discord_IP, src=real_target_IP` packet back to Discord's socket? Most divert-based tools do this; verify in P2.2 day 1 with a tracer.
4. **Embedded ICO size on disk** — 4 icons × ~5KB = 20KB. Negligible.

## Files to read before implementation

- `imgk/shadow/pkg/divert/` — opens handle + read packets pattern (downloaded already)
- `imgk/divert-go` README + `addr.go` — API surface
- `runetfreedom/force-proxy/proxy.cpp` — correct SOCKS5 UDP ASSOCIATE flow (local at `/tmp/drover-cmp/force-proxy/`)
- `wailsapp/wails/v2/examples/react` — Wails patterns for Engine bindings
- This spec.
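For open question 2, a small builder makes it easy to measure how long the filter gets as the PID list grows. The sketch below assembles the expression from the "Filter expression" section; `BuildFilter` is a hypothetical name, PIDs are sorted so that an unchanged PID set always yields the same string, and whether WinDivert accepts every field used here at the chosen layer still has to be verified in P2.1.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
	"strings"
)

// BuildFilter assembles the filter text described in the spec.
// Hypothetical helper: the clause order and exclusion ranges are copied
// from the "Filter expression" section, nothing else is assumed.
func BuildFilter(pids []uint32, ownPID uint32, upstream string) (string, error) {
	if len(pids) == 0 {
		return "", fmt.Errorf("no target PIDs")
	}
	addr, err := netip.ParseAddr(upstream)
	if err != nil {
		return "", fmt.Errorf("upstream address: %w", err)
	}
	if !addr.Is4() {
		return "", fmt.Errorf("upstream %s is not IPv4", addr)
	}

	// Sort a copy so the caller's slice is untouched and output is stable.
	sorted := append([]uint32(nil), pids...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	terms := make([]string, len(sorted))
	for i, pid := range sorted {
		terms[i] = fmt.Sprintf("processId == %d", pid)
	}

	parts := []string{
		"outbound and (tcp or udp) and ip",
		"(" + strings.Join(terms, " or ") + ")",
		fmt.Sprintf("processId != %d", ownPID),
		fmt.Sprintf("ip.DstAddr != %s", addr),
		"not (ip.DstAddr >= 224.0.0.0 and ip.DstAddr <= 239.255.255.255)",
		"not (ip.DstAddr >= 127.0.0.0 and ip.DstAddr <= 127.255.255.255)",
		"not (ip.DstAddr >= 169.254.0.0 and ip.DstAddr <= 169.254.255.255)",
	}
	return strings.Join(parts, " and "), nil
}
```

Calling `len` on the returned string gives the figure to compare against whatever limit P2.1 documents.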