Lilith Version 8 - tests and benchmarks

The design tests cover memory, workspace, tool calling, TTS, and history. Run self-improvement validation on your own machine before promoting sandbox builds.

Bench machine

Current system specs

CPU: AMD Ryzen 5 4500 6-Core Processor
GPU: NVIDIA GeForce RTX 3060
RAM: 32 GB
OS: Windows 11 (10.0.26200)
Ollama: 127.0.0.1:11434
Recorded: 2026-05-25 12:00
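
Before comparing results against this bench machine, it can help to confirm that an Ollama server is answering on the address listed in the specs. The following is a minimal sketch, not part of Lilith: the helper name `ollama_version` is hypothetical, and it assumes the standard Ollama `GET /api/version` endpoint on the default address shown in the specs.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def ollama_version(base_url: str = "http://127.0.0.1:11434",
                   timeout: float = 2.0) -> Optional[str]:
    """Return the Ollama server version string, or None if it cannot be reached.

    Hypothetical helper for illustration; base_url defaults to the address
    listed in the bench machine specs.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            # The endpoint answers with a small JSON object such as {"version": "..."}.
            return json.load(resp).get("version")
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply all count as "not reachable".
        return None


if __name__ == "__main__":
    version = ollama_version()
    print(f"Ollama {version}" if version else "Ollama not reachable at 127.0.0.1:11434")
```

Returning `None` instead of raising keeps the check usable as a simple precondition before a test run.
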
Validation

Recorded console sessions

Run `MemorySystemTests` and `ToolCallingTests` in `DesignTests/` after any sandbox edit. Self-improvement promotes a sandbox build only after `self_improve_verify_sandbox_tool` reports success.
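
The promotion rule above amounts to a gate: every check must pass, in order, before a sandbox build is promoted. The sketch below illustrates that idea only. The function `may_promote` is hypothetical, and the three lambdas are stand-ins for however `MemorySystemTests`, `ToolCallingTests`, and `self_improve_verify_sandbox_tool` are actually invoked in your setup; their real interfaces are not shown here.

```python
from typing import Callable, Iterable, List, Tuple

Check = Tuple[str, Callable[[], bool]]


def may_promote(checks: Iterable[Check]) -> Tuple[bool, List[str]]:
    """Run named checks in order and stop at the first failure.

    Returns (True, names_that_passed) when every check succeeds, otherwise
    (False, names_that_passed + ["FAILED: <name>"]).
    """
    log: List[str] = []
    for name, run in checks:
        if not run():
            log.append(f"FAILED: {name}")
            return False, log
        log.append(name)
    return True, log


# Stand-in checks (assumptions): replace each lambda with a real call that
# returns True on success.
checks = [
    ("MemorySystemTests", lambda: True),
    ("ToolCallingTests", lambda: True),
    ("self_improve_verify_sandbox_tool", lambda: True),
]

ok, log = may_promote(checks)
print("promote" if ok else "keep in sandbox", log)
```

Stopping at the first failure means the verification step never runs against a build that has already failed a design test.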