Lilith Version 8 - tests and benchmarks
Design tests cover memory, workspace, tool calling, TTS, and history. Add self-improvement validation on your machine before promoting sandbox builds.
Bench machine
Current system specs
CPU: AMD Ryzen 5 4500 6-Core Processor
GPU: NVIDIA GeForce RTX 3060
RAM: 32 GB
OS: Windows 11 (10.0.26200)
Ollama: 127.0.0.1:11434
Recorded: 2026-05-25 12:00
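Before running tests or benchmarks it can help to confirm that Ollama is actually listening at the address listed above. The following is a minimal sketch, not part of Lilith itself: the helper name `ollama_version` and the 2-second timeout are assumptions made for illustration, and it relies only on Ollama's `/api/version` endpoint.

```python
import json
import urllib.request
from typing import Optional

# Address taken from the bench-machine specs; adjust it if your setup differs.
OLLAMA_BASE = "http://127.0.0.1:11434"


def ollama_version(base_url: str = OLLAMA_BASE, timeout: float = 2.0) -> Optional[str]:
    """Return the Ollama server version string, or None if it cannot be reached.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.load(resp).get("version")
    except (OSError, ValueError, AttributeError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a body that is not valid JSON;
        # AttributeError covers valid JSON that is not an object.
        return None


if __name__ == "__main__":
    version = ollama_version()
    if version:
        print(f"Ollama reachable, version {version}")
    else:
        print("Ollama not reachable at", OLLAMA_BASE)
```

A `None` result means the server is down or bound to a different address, so model-dependent tests would likely fail for reasons unrelated to the sandbox edit.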
Validation
Recorded console sessions
Run `MemorySystemTests` and `ToolCallingTests` in `DesignTests/` after each sandbox edit. Self-improvement promotes a sandbox build only after `self_improve_verify_sandbox_tool` reports success.
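The promotion rule can be read as a simple gate. The sketch below is one possible reading, not Lilith's actual implementation: `SandboxReport` and `may_promote` are hypothetical names, and treating the two design-test suites as part of the same gate as the verify step is an assumption. How the suites are run and how `self_improve_verify_sandbox_tool` reports its result are not specified here, so both are modelled as plain booleans.

```python
from dataclasses import dataclass, field
from typing import Dict

# Suites named in the text; both are assumed to be required for promotion.
REQUIRED_SUITES = ("MemorySystemTests", "ToolCallingTests")


@dataclass
class SandboxReport:
    """Hypothetical record of one validation run on a sandbox build."""
    suite_passed: Dict[str, bool] = field(default_factory=dict)  # suite name -> passed?
    verify_succeeded: bool = False  # outcome reported by self_improve_verify_sandbox_tool


def may_promote(report: SandboxReport) -> bool:
    """Allow promotion only if every required suite passed and verification succeeded."""
    # A suite missing from the report counts as a failure.
    suites_ok = all(report.suite_passed.get(name, False) for name in REQUIRED_SUITES)
    return suites_ok and report.verify_succeeded
```

A suite that was never run counts as failed in this sketch, so an incomplete validation run cannot promote a build by accident.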