Lilith Version 2 - tests and benchmarks

The design tests cover chat history, plus the Kokoro TTS live and sanitizer suites. The recorded console benchmarks below list execution time and model name for Version 2, including TTS-enabled runs.

Bench machine

Current system specs

CPU: AMD Ryzen 5 4500 6-Core Processor
GPU: NVIDIA GeForce RTX 3060 12GB
RAM: 32 GB
OS: Windows 11 (10.0.26200)
Ollama: 127.0.0.1:11434
Recorded: 2026-05-21 11:10

Validation

Recorded console sessions

Use these sessions as a reference when validating spoken-reply latency on your own hardware. To reproduce them, run `DesignTests/run-design-tests.py --live` locally once Ollama and the Kokoro assets are installed.
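
Before a live run it can help to confirm that Ollama is listening and to time the whole run for comparison with the recorded sessions. The following is a minimal sketch, not part of the Lilith test suite: the helper names `ollama_reachable`, `live_test_command`, and `run_timed` are hypothetical, and the host and port are assumed to match the address listed for the bench machine.

```python
import socket
import subprocess
import sys
import time

# Assumption: Ollama listens on the address listed for the bench machine.
OLLAMA_HOST = "127.0.0.1"
OLLAMA_PORT = 11434


def ollama_reachable(host=OLLAMA_HOST, port=OLLAMA_PORT, timeout=2.0):
    """Return True if a TCP connection to the Ollama endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def live_test_command(python=sys.executable):
    """Build the command line for the live design tests (path taken from the text)."""
    return [python, "DesignTests/run-design-tests.py", "--live"]


def run_timed(command):
    """Run a command and return (exit code, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command)
    return completed.returncode, time.perf_counter() - start
```

A typical use would be to call `ollama_reachable()` first and, only if it returns True, pass `live_test_command()` to `run_timed` from the repository root. The elapsed time covers the whole test run, so it is only a coarse check against the per-reply figures in the recorded sessions.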