POLAR-Bench: A Diagnostic Benchmark for Privacy-Utility Trade-offs in LLM Agents
Researchers introduced POLAR-Bench, a benchmark for evaluating privacy-utility trade-offs in large language model agents.
The benchmark shows that smaller models are weaker at protecting private data, which points to significant data leakage risks.
This record is extracted from a published AI Today issue and tied to the original source URL. Treat the original source as the authoritative evidence for this summary.