Tag: TerminalBench

Terminal-Bench 2.0 launches alongside Harbor, a new framework for testing agents in containers

The developers of Terminal-Bench, a benchmark suite for evaluating the performance of autonomous AI agents on real-world terminal-based...

By saad