[2605.19433] Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation
Researchers proposed MOTAB, a new method for improving reasoning distillation in large language models.
MOTAB mitigates dual exposure biases and improves performance on reasoning tasks by about 3%.
This record is extracted from a published AI Today issue and tied to the original source URL. Treat the source as the record of evidence for the summary.