Medium severity · ChatDev tester agent, code execution
During the tester phase, or when code from WareHouse is run manually, generated programs fail with compilation or runtime errors such as ModuleNotFoundError, NameError, or ImportError. In other cases the code compiles but fails functional tests or crashes at run time (e.g., because pygame is missing). The tester reports the bugs, but the repair iterations may not fully resolve them. [ChatDev Paper](https://arxiv.org/html/2307.07924v5)
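To make the symptom concrete, the following is a minimal sketch of how a failing script could be run and its error class extracted from the traceback. It is not ChatDev's tester implementation; the function name `classify_failure`, the scratch file name `main.py`, and the 30-second timeout are all assumptions chosen for illustration.

```python
import os
import re
import subprocess
import sys
import tempfile
from typing import Optional

# Matches the exception name at the start of a traceback's final line,
# e.g. "ModuleNotFoundError: No module named 'pygame'".
_ERROR_LINE = re.compile(r"^([A-Za-z_][\w.]*(?:Error|Exception))\b", re.MULTILINE)


def classify_failure(source: str, timeout: float = 30.0) -> Optional[str]:
    """Run `source` in a fresh interpreter and return the exception name.

    Hypothetical helper: returns None when the script exits cleanly, the
    last exception name found on stderr otherwise, or "UnknownError" when
    the process failed without a recognizable traceback.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "main.py")  # assumed entry-point name
        with open(script, "w", encoding="utf-8") as handle:
            handle.write(source)
        result = subprocess.run(
            [sys.executable, script],
            capture_output=True,
            text=True,
            timeout=timeout,
            cwd=workdir,
        )
    if result.returncode == 0:
        return None
    names = _ERROR_LINE.findall(result.stderr)
    # The last match is the exception that actually terminated the script.
    return names[-1] if names else "UnknownError"
```

Calling `classify_failure` on each generated entry point gives a quick tally of which error classes dominate a given run, which can then be compared against the categories named above.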
Root cause
LLMs often omit basic code elements such as import statements during generation (ModuleNotFoundError 45.76%; NameError and ImportError 15.25% each). Unclear requirements lead to placeholders or unimplemented methods, and limited context leads to hallucinated or incomplete code. [ChatDev Paper](https://arxiv.org/html/2307.07924v5)
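Because the most frequent error class concerns modules that cannot be found, one possible pre-flight check is to parse the generated source and list imports that do not resolve in the current environment. This is a sketch under stated assumptions, not part of ChatDev: `find_missing_imports` and its `local_modules` parameter are hypothetical names, and the check covers only top-level package names.

```python
import ast
import importlib.util
from typing import Iterable, List


def find_missing_imports(source: str, local_modules: Iterable[str] = ()) -> List[str]:
    """Return top-level modules imported by `source` that cannot be resolved.

    Hypothetical helper. `local_modules` names sibling files of the
    generated project that should not be reported as missing.
    """
    known_local = set(local_modules)
    imported = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) point at project files; skip them.
            if node.level == 0 and node.module:
                imported.add(node.module.split(".")[0])

    missing = []
    for name in sorted(imported - known_local):
        try:
            spec = importlib.util.find_spec(name)
        except ValueError:
            # Raised when an already-loaded module has no __spec__ set.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

A static check like this only catches unresolved imports. A name used without any import statement still surfaces as a NameError at run time, so it complements rather than replaces executing the code.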
Tags: ChatDev, tester, ModuleNotFoundError, NameError, ImportError, code-generation, LLM-hallucination