
[2605.19433] Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation

Researchers propose MOTAB, a new method for improving reasoning distillation in large language models.

Why this matters

MOTAB targets the dual exposure biases that arise in LLM reasoning distillation, and it reportedly improves performance on reasoning tasks by about 3%.

Source check

This record is extracted from a published AI Today issue and tied to the original source URL. Treat the original source as the authoritative evidence for this summary.
