ea4202d4a3
Old single-shot stun test only proved one UDP packet round-tripped
through the relay. To predict whether voice will actually work the
checker now does two stronger tests:
- voice-quality: 30-packet STUN burst with loss/jitter/p50 metrics,
with a "warn" tier between hard pass and hard fail.
- voice-srv: concurrent DNS resolve + SOCKS5 TCP probe to a list of
Discord voice region hostnames; passes if any region is reachable.
Adds StatusWarn ("soft pass — show hint anyway") so the GUI can
distinguish "voice will work but glitchy" from green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Checker — 7-step SOCKS5 diagnostic

**Status**: design accepted 2026-05-01.

**Replaces**: stub `RunCheck` in `internal/gui/app.go` that emits fake events.

## Why

The Wails GUI exposes a "Check connection" button that the user presses
before turning the engine on. Today it walks through a hard-coded scenario
in Go, returning bogus metrics. The user can't tell whether their proxy
is alive, supports UDP, or whether Discord blocks it. We need an honest
diagnostic that tells the user exactly which capability of their SOCKS5
proxy works and which doesn't, with hex-level evidence on failure.

## API surface

```go
// internal/checker/checker.go
package checker

import (
	"context"
	"time"
)

type Status string

const (
	StatusRunning Status = "running"
	StatusPassed  Status = "passed"
	StatusFailed  Status = "failed"
	StatusSkipped Status = "skipped"
)

type Result struct {
	ID     string `json:"id"`
	Status Status `json:"status"`
	Metric string `json:"metric,omitempty"`
	Error  string `json:"error,omitempty"`
	Hint   string `json:"hint,omitempty"`
	RawHex string `json:"raw_hex,omitempty"`
	// NOTE: encoding/json marshals time.Duration as integer nanoseconds;
	// convert before emitting if the frontend expects milliseconds.
	Duration time.Duration `json:"duration_ms"`
	Attempt  int           `json:"attempt"`
}

type Config struct {
	ProxyHost     string
	ProxyPort     int
	UseAuth       bool
	ProxyLogin    string
	ProxyPassword string

	PerTestTimeout time.Duration
	MaxRetries     int
	RetryBackoff   time.Duration

	DiscordGateway string
	DiscordAPI     string
	StunServer     string

	// voice-quality burst tuning
	VoiceBurstCount    int           // default 30
	VoiceBurstInterval time.Duration // default 20ms

	// voice-srv probe: empty list means "use the built-in default
	// (russia/russia2/frankfurt/europe/singapore/japan/us-east/us-west/
	// brazil/india/hongkong/southkorea/sydney/southafrica/dubai/atlanta).discord.media"
	VoiceServerHostnames []string
}

// StatusWarn is a "soft pass": the test technically succeeded but
// the user should know about a degradation (e.g. voice quality at the
// upper end of acceptable). Frontend renders it like StatusPassed but
// keeps the Hint visible.
const StatusWarn Status = "warn"

// Run streams Results to the returned channel and closes it when finished
// or when ctx is cancelled. The first event for each test is Status=running;
// the next is the final state (passed/warn/failed/skipped). On retry, another
// running+final pair is emitted with Attempt > 1.
func Run(ctx context.Context, cfg Config) <-chan Result
```

Defaults applied when zero values are passed: PerTestTimeout=5s, MaxRetries=1,
RetryBackoff=500ms, DiscordGateway="gateway.discord.gg:443",
DiscordAPI="https://discord.com/api/v9/gateway",
StunServer="stun.l.google.com:19302", VoiceBurstCount=30,
VoiceBurstInterval=20ms.
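
One way the zero-value defaults could be applied is sketched below. The helper name `withDefaults` is hypothetical, and the `Config` here is a trimmed stand-in that carries only the fields with documented defaults.

```go
package main

import "time"

// Config is a trimmed stand-in for checker.Config: only the fields
// that have documented defaults are reproduced here.
type Config struct {
	PerTestTimeout     time.Duration
	MaxRetries         int
	RetryBackoff       time.Duration
	DiscordGateway     string
	DiscordAPI         string
	StunServer         string
	VoiceBurstCount    int
	VoiceBurstInterval time.Duration
}

// withDefaults (hypothetical helper) returns a copy of cfg in which every
// zero-valued field is replaced by its documented default.
func withDefaults(cfg Config) Config {
	if cfg.PerTestTimeout == 0 {
		cfg.PerTestTimeout = 5 * time.Second
	}
	if cfg.MaxRetries == 0 {
		cfg.MaxRetries = 1
	}
	if cfg.RetryBackoff == 0 {
		cfg.RetryBackoff = 500 * time.Millisecond
	}
	if cfg.DiscordGateway == "" {
		cfg.DiscordGateway = "gateway.discord.gg:443"
	}
	if cfg.DiscordAPI == "" {
		cfg.DiscordAPI = "https://discord.com/api/v9/gateway"
	}
	if cfg.StunServer == "" {
		cfg.StunServer = "stun.l.google.com:19302"
	}
	if cfg.VoiceBurstCount == 0 {
		cfg.VoiceBurstCount = 30
	}
	if cfg.VoiceBurstInterval == 0 {
		cfg.VoiceBurstInterval = 20 * time.Millisecond
	}
	return cfg
}
```

A side effect of keying defaults on zero values is that a caller cannot ask for `MaxRetries=0` (no retry) through this path; if that ever matters, a pointer or a separate flag would be needed.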
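
## The seven tests

Tests run sequentially, and each one reuses sockets opened by earlier tests when sensible. The table has eight rows because `auth` is emitted only when UseAuth=true; a run without authentication has seven steps.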

| ID | What it does | Considered failed when | Skip rule |
|----|--------------|------------------------|-----------|
| `tcp` | `net.DialTimeout("tcp", host:port)` | dial error | never |
| `greet` | Sends the SOCKS5 client greeting `05 02 00 02` (or `05 01 00` if UseAuth=false). Reads 2 bytes. Pass = `05 00` (no auth) or `05 02` (auth required). Fail on `05 FF`, anything else, or a short read. | proxy returned non-SOCKS5 / refused all auth methods | skipped if `tcp` failed |
| `auth` | Only emitted when UseAuth=true. RFC 1929 sub-negotiation: `01 LEN_LOGIN LOGIN LEN_PASS PASS`. Reads 2 bytes, expects `01 00`. | bad credentials (status byte != `00`) / short read | not in test list when UseAuth=false; skipped if `greet` failed |
| `connect` | SOCKS5 CONNECT to `gateway.discord.gg:443` (ATYP=03 domain). Reads 10 bytes. Pass = REP=0x00. | REP != 0 (0x05 = connection refused, etc.) / timeout | skipped if `greet`/`auth` failed |
| `udp` | UDP ASSOCIATE: opens a **second** TCP control channel, redoes greeting+auth there, sends `05 03 00 01 00000000 0000`, reads the 10-byte reply. Pass = REP=0x00 + valid relay endpoint in BND.ADDR/BND.PORT. | REP=0x07 (cmd unsupported), other REP, short read | skipped if `greet` failed |
| `voice-quality` | Through the relay: send `VoiceBurstCount` (default 30) STUN binding requests to `cfg.StunServer`, spaced `VoiceBurstInterval` (default 20ms) apart. Listen until `last_send + 1.5*PerTestTimeout`. Compute `loss%`, `jitter` (mean absolute difference between consecutive inter-arrival gaps, a simplified take on RFC 3550), and `p50 RTT`. Metric = `"loss=2% jitter=14ms p50=42ms"`. **Pass** = loss ≤ 5% AND jitter ≤ 30ms AND p50 ≤ 250ms. **Warn-pass** (Status=`warn`, Hint set) = loss ≤ 15% AND jitter ≤ 60ms AND p50 ≤ 400ms: voice will work with audible glitches. **Fail** = anything worse. | loss > 15% OR jitter > 60ms OR p50 > 400ms OR no replies at all | skipped if `udp` failed |
| `voice-srv` | Probe Discord voice servers. Concurrently DNS-resolve a hardcoded list of `<region>.discord.media` hostnames (`russia`, `russia2`, `frankfurt`, `europe`, `singapore`, `japan`, `us-east`, `us-west`, `brazil`, `india`, `hongkong`, `southkorea`, `sydney`, `southafrica`, `dubai`, `atlanta`) using the OS resolver, 2s budget. For every resolved hostname: SOCKS5 CONNECT through the proxy to `host:443` with a 1s dial timeout, run concurrently on a small worker pool (8). Metric = `"<N> regions reachable: russia, frankfurt, europe"` (top 3). **Pass** = ≥ 1 region reachable. **Warn-pass** = 0 reachable but ≥ 1 resolved (the proxy filters Discord media IPs even though DNS works); the Hint warns that voice may not work despite checks 1-5 passing. **Fail** = 0 hostnames resolved at all (DNS broken or Discord changed naming). | 0 hostnames resolved at all | skipped if `connect` failed |
| `api` | TCP CONNECT through the proxy to `discord.com:443`, then a tiny HTTPS GET `/api/v9/gateway`. Pass = HTTP 200 or 401 (Discord returns 401 when unauthenticated, which still proves reachability). | non-200/401 / TLS handshake failed / connect refused | skipped if `connect` failed |
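
The `voice-quality` arithmetic can be sketched as follows. This is an illustration under assumptions, not the checker's actual code: the names `sample`, `burstStats`, `summarize`, `classify` and `metric` are hypothetical, p50 is taken as the upper median, and the warn tier is assumed to require p50 ≤ 400ms (inferred from the failure column).

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

const ms = time.Millisecond

// sample is one STUN reply: when it arrived (offset from burst start)
// and the round-trip time of the request it answers.
type sample struct {
	Arrival time.Duration
	RTT     time.Duration
}

type burstStats struct {
	Replies int
	LossPct float64
	Jitter  time.Duration
	P50     time.Duration
}

// summarize computes loss, jitter and p50 for a burst of `sent` requests.
// replies must be in arrival order.
func summarize(sent int, replies []sample) burstStats {
	s := burstStats{Replies: len(replies), LossPct: 100}
	if sent <= 0 || len(replies) == 0 {
		return s
	}
	s.LossPct = float64(sent-len(replies)) * 100 / float64(sent)

	// p50: sort a copy of the RTTs and take the (upper) median.
	rtts := make([]time.Duration, len(replies))
	for i, r := range replies {
		rtts[i] = r.RTT
	}
	sort.Slice(rtts, func(i, j int) bool { return rtts[i] < rtts[j] })
	s.P50 = rtts[len(rtts)/2]

	// Jitter: mean |gap[i] - gap[i-1]| over consecutive inter-arrival gaps.
	var sum time.Duration
	n := 0
	for i := 2; i < len(replies); i++ {
		prev := replies[i-1].Arrival - replies[i-2].Arrival
		cur := replies[i].Arrival - replies[i-1].Arrival
		d := cur - prev
		if d < 0 {
			d = -d
		}
		sum += d
		n++
	}
	if n > 0 {
		s.Jitter = sum / time.Duration(n)
	}
	return s
}

// classify maps the metrics onto the three tiers from the table.
func classify(s burstStats) string {
	switch {
	case s.Replies == 0:
		return "failed"
	case s.LossPct <= 5 && s.Jitter <= 30*ms && s.P50 <= 250*ms:
		return "passed"
	case s.LossPct <= 15 && s.Jitter <= 60*ms && s.P50 <= 400*ms: // assumed p50 bound
		return "warn"
	default:
		return "failed"
	}
}

// metric renders the stats in the "loss=2% jitter=14ms p50=42ms" shape.
func metric(s burstStats) string {
	return fmt.Sprintf("loss=%.0f%% jitter=%dms p50=%dms",
		s.LossPct, s.Jitter.Milliseconds(), s.P50.Milliseconds())
}
```

For example, replies arriving at 0, 20, 50 and 60 ms have gaps of 20, 30 and 10 ms, so the gap differences are 10 and 20 ms and the jitter comes out at 15 ms.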

For each failure, the `Hint` field carries a Russian explanation (the GUI is
RU-localized) and `RawHex` carries the first 32 bytes of any unexpected
response (for the expand-debug section in the UI).
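
As an illustration of how `RawHex` might be filled, the sketch below validates a greeting reply and returns the hex evidence on failure. `rawHex` and `checkGreetReply` are hypothetical names, and lowercase hex without separators is an assumed encoding.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// maxRawHexBytes is the evidence cap described above.
const maxRawHexBytes = 32

// rawHex hex-encodes at most the first 32 bytes of an unexpected response.
func rawHex(resp []byte) string {
	if len(resp) > maxRawHexBytes {
		resp = resp[:maxRawHexBytes]
	}
	return hex.EncodeToString(resp)
}

// checkGreetReply validates the bytes read after the client greeting.
// On success it returns the selected method (0x00 or 0x02) and no evidence;
// on failure it returns an error plus the hex dump for Result.RawHex.
func checkGreetReply(resp []byte) (method byte, raw string, err error) {
	switch {
	case len(resp) < 2:
		return 0, rawHex(resp), errors.New("greet: short read")
	case resp[0] != 0x05:
		return 0, rawHex(resp), fmt.Errorf("greet: not a SOCKS5 reply (version byte 0x%02x)", resp[0])
	case resp[1] == 0x00 || resp[1] == 0x02:
		return resp[1], "", nil
	case resp[1] == 0xFF:
		return 0, rawHex(resp), errors.New("greet: proxy refused all offered auth methods")
	default:
		return 0, rawHex(resp), fmt.Errorf("greet: unexpected method 0x%02x", resp[1])
	}
}
```

A proxy port that actually speaks HTTP would fail the version-byte check, and the dump would start with `48545450` ("HTTP"), which is the kind of evidence the debug section is meant to surface.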

## Cancel & retry

- `ctx` is honoured at every blocking call (Dial uses DialContext, reads
  use SetDeadline derived from PerTestTimeout). On cancel, the current test
  emits a final `failed` result with Error="cancelled", each remaining test
  gets a single `skipped` event, and then the channel closes.
- Auto-retry once on transient errors:
  - timeout (`net.Error.Timeout()`)
  - "connection reset by peer"
  - DNS temporary failure
- NOT retried (likely user-config error or hard failure):
  - connection refused
  - bad credentials (RFC 1929 auth status != 0x00) or REP=0x02 (connection not allowed by ruleset)
  - REP=0x07 (cmd unsupported)
  - HTTP 4xx/5xx other than 401 on `api`
- Between attempts: sleep `RetryBackoff`.
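
The transient/permanent split for network-level errors could look roughly like the sketch below. It covers only the `net`-level cases from the list (protocol-level results such as REP codes or HTTP statuses would be classified by the tests themselves). The `verdict` type and the sample error values are illustrative assumptions.

```go
package main

import (
	"errors"
	"net"
	"os"
	"syscall"
)

type verdict int

const (
	permanent verdict = iota // do not retry
	transient                // retry once after RetryBackoff
)

// classify decides whether a network error is worth one retry.
func classify(err error) verdict {
	if err == nil {
		return permanent // nothing to retry
	}
	// DNS errors: retry only temporary failures and timeouts.
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		if dnsErr.IsTemporary || dnsErr.IsTimeout {
			return transient
		}
		return permanent
	}
	// Connection refused points at a wrong host/port: hard failure.
	if errors.Is(err, syscall.ECONNREFUSED) {
		return permanent
	}
	// "connection reset by peer" is often a passing hiccup.
	if errors.Is(err, syscall.ECONNRESET) {
		return transient
	}
	// Deadline expiry from SetDeadline or a dial timeout.
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return transient
	}
	return permanent
}

// Sample errors shaped like what the net package returns; handy for a table test.
var (
	errTimeout = &net.OpError{Op: "read", Net: "tcp", Err: os.ErrDeadlineExceeded}
	errReset   = &net.OpError{Op: "read", Net: "tcp", Err: os.NewSyscallError("read", syscall.ECONNRESET)}
	errRefused = &net.OpError{Op: "dial", Net: "tcp", Err: os.NewSyscallError("connect", syscall.ECONNREFUSED)}
	errDNSTemp = &net.DNSError{Err: "server misbehaving", Name: "example.invalid", IsTemporary: true}
	errDNSGone = &net.DNSError{Err: "no such host", Name: "example.invalid", IsNotFound: true}
)
```

Checking `errors.Is` against the errno values rather than matching error strings keeps the classification independent of OS-specific message wording.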

## Wails integration

`internal/gui/app.go::RunCheck(cfg Config)` becomes:

```go
func (a *App) RunCheck(cfg Config) {
	ctx, cancel := context.WithCancel(a.ctx)
	a.muCheck.Lock()
	a.cancelCheck = cancel
	a.muCheck.Unlock()

	go func() {
		defer cancel() // release the context once the run finishes
		ck := mapToCheckerConfig(cfg)
		// Last final status per test ID; a retry overwrites the earlier
		// attempt so a test is never counted twice.
		final := make(map[string]checker.Status)
		for r := range checker.Run(ctx, ck) {
			runtime.EventsEmit(a.ctx, "check:result", r)
			if r.Status != checker.StatusRunning {
				final[r.ID] = r.Status
			}
		}
		var passed, failed int
		for _, s := range final {
			switch s {
			case checker.StatusPassed, checker.StatusWarn: // warn is a soft pass
				passed++
			case checker.StatusFailed:
				failed++
			}
		}
		runtime.EventsEmit(a.ctx, "check:done", map[string]int{
			"total": passed + failed, "passed": passed, "failed": failed,
		})
	}()
}

func (a *App) CancelCheck() {
	a.muCheck.Lock()
	if a.cancelCheck != nil {
		a.cancelCheck()
	}
	a.muCheck.Unlock()
}
```

A new `CancelCheck` binding lets the GUI's Cancel button stop a running
diagnostic. The frontend's `useDrover` hook gets a `cancelCheck()`
callback that calls it.

## Testing

- Unit tests for each test function with a fake SOCKS5 server (`net.Listen`,
  hand-rolled byte responses), covering the happy path, every documented
  failure mode, and malformed responses (truncated, wrong protocol, garbage).
- The STUN test uses a real `pion/stun` server in-process on a
  `net.ListenPacket("udp", ...)` socket (`net.Listen` does not accept UDP
  networks).
- Discord-API and `connect` tests use the same fake SOCKS5 server tunneling
  to `httptest.NewTLSServer` and `net.Listen("tcp")`.
- One end-to-end test against a real `mihomo` instance is documented in
  `docs/testing/checker-e2e.md` but is not part of `go test ./...` (it
  requires network).
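
A fake SOCKS5 server of the kind described in the first bullet might be as small as the sketch below. `startFakeSOCKS5` and `greet` are hypothetical helpers: the server consumes the client greeting and answers with whatever canned bytes the test supplies, which makes truncated or garbage replies easy to stage.

```go
package main

import (
	"io"
	"net"
	"time"
)

const ioTimeout = 2 * time.Second

// startFakeSOCKS5 listens on a loopback port. For every connection it reads
// the client greeting (VER, NMETHODS, METHODS...), writes the canned reply,
// and closes. stop shuts the listener down.
func startFakeSOCKS5(reply []byte) (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go func(c net.Conn) {
				defer c.Close()
				c.SetDeadline(time.Now().Add(ioTimeout))
				hdr := make([]byte, 2)
				if _, err := io.ReadFull(c, hdr); err != nil {
					return
				}
				if _, err := io.ReadFull(c, make([]byte, int(hdr[1]))); err != nil {
					return
				}
				c.Write(reply)
			}(conn)
		}
	}()
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// greet is the client side of the `greet` step: it sends the greeting and
// returns whatever part of the 2-byte reply arrived, plus any read error.
func greet(addr string, useAuth bool) ([]byte, error) {
	conn, err := net.DialTimeout("tcp", addr, ioTimeout)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	conn.SetDeadline(time.Now().Add(ioTimeout))

	msg := []byte{0x05, 0x01, 0x00} // offer "no auth" only
	if useAuth {
		msg = []byte{0x05, 0x02, 0x00, 0x02} // offer "no auth" and user/pass
	}
	if _, err := conn.Write(msg); err != nil {
		return nil, err
	}
	resp := make([]byte, 2)
	n, err := io.ReadFull(conn, resp)
	return resp[:n], err
}
```

Starting the server with `[]byte{0x05}` as the reply exercises the short-read path: `io.ReadFull` returns one byte together with `io.ErrUnexpectedEOF`.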

## Files

```
internal/checker/
  checker.go        ─ public API: Run, Result, Config
  socks5.go         ─ greeting, auth, CONNECT, UDP ASSOCIATE primitives
  stun.go           ─ STUN binding-request encode/decode (no library;
                      we already vendor enough; ~80 LOC)
  retry.go          ─ classify(err) -> transient | permanent
  hints.go          ─ map test failure → user hint (RU)
  checker_test.go   ─ Run-level integration with fake server
  socks5_test.go    ─ per-primitive table tests
  stun_test.go      ─ encode/decode + RTT mock
```
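
For a sense of why `stun.go` can stay library-free, here is a minimal sketch of the encode/decode it would need. It follows the STUN header layout (type, length, magic cookie `0x2112A442`, 12-byte transaction ID); the function names are hypothetical and attributes in the response are not parsed.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
)

const (
	stunBindingRequest = 0x0001
	stunBindingSuccess = 0x0101
	stunMagicCookie    = 0x2112A442
	stunHeaderLen      = 20
)

// newBindingRequest builds an attribute-less Binding Request and returns the
// random transaction ID so the reply can be matched to it (needed for RTT).
func newBindingRequest() (pkt []byte, txID [12]byte, err error) {
	if _, err = rand.Read(txID[:]); err != nil {
		return nil, txID, err
	}
	pkt = make([]byte, stunHeaderLen)
	binary.BigEndian.PutUint16(pkt[0:2], stunBindingRequest)
	binary.BigEndian.PutUint16(pkt[2:4], 0) // message length: no attributes
	binary.BigEndian.PutUint32(pkt[4:8], stunMagicCookie)
	copy(pkt[8:20], txID[:])
	return pkt, txID, nil
}

// parseBindingResponse checks that pkt is a well-formed Binding Success
// Response and returns its transaction ID.
func parseBindingResponse(pkt []byte) (txID [12]byte, err error) {
	if len(pkt) < stunHeaderLen {
		return txID, errors.New("stun: short packet")
	}
	if binary.BigEndian.Uint32(pkt[4:8]) != stunMagicCookie {
		return txID, errors.New("stun: bad magic cookie")
	}
	if binary.BigEndian.Uint16(pkt[0:2]) != stunBindingSuccess {
		return txID, errors.New("stun: not a binding success response")
	}
	if int(binary.BigEndian.Uint16(pkt[2:4])) != len(pkt)-stunHeaderLen {
		return txID, errors.New("stun: length field does not match payload")
	}
	copy(txID[:], pkt[8:20])
	return txID, nil
}
```

When sent through the proxy, each datagram would additionally need the SOCKS5 UDP request header in front of it; that wrapping belongs with the UDP ASSOCIATE primitives and is left out here.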

`internal/gui/app.go` gets `RunCheck` rewritten and a new `CancelCheck`
method. The fake SCENARIOS path in app.go is removed.

## Out of scope (future work)

- IPv6 SOCKS5 ATYP=04. Discord today is IPv4; we'll add it when we hit a
  proxy that's v6-only.
- Parallel test execution (e.g. running `connect` and `udp` simultaneously
  on separate sessions). Sequential is clearer for the UI; we'll revisit
  if total runtime exceeds 10s on common networks.
- TLS certificate pinning on `api`. The `tls.Config` is default, which is
  fine for a reachability check.