Artificial Intelligence

173 readers

13 users here now

Chat about and share AI stuff

founded 2 years ago

MODERATORS

Pokey@lemmy.sdf.org

Zoom researchers detail a “chain of draft” method to let LLMs accurately solve reasoning problems with as little as 7.6% of the tokens used by current methods. (arxiv.org)

submitted 14 hours ago by Tea@programming.dev to c/artificialintelligence@lemmy.sdf.org

2 comments fedilink hide all child comments

Large Language Models (LLMs) have demonstrated remarkable performance in solving complex reasoning tasks through mechanisms like Chain-of-Thought (CoT) prompting, which emphasizes verbose, step-by-step reasoning. However, humans typically employ a more efficient strategy: drafting concise intermediate thoughts that capture only essential information. In this work, we propose Chain of Draft (CoD), a novel paradigm inspired by human cognitive processes, where LLMs generate minimalistic yet informative intermediate reasoning outputs while solving tasks. By reducing verbosity and focusing on critical insights, CoD matches or surpasses CoT in accuracy while using as little as only 7.6% of the tokens, significantly reducing cost and latency across various reasoning tasks.

top 2 comments

sorted by: hot top controversial new old

[–] TxTechnician@lemmy.ml 2 points 6 hours ago

Answer the question directly. Do not return any
preamble, explanation, or reasoning.

Chain-of-Thought
Think step by step to answer the following question.
Return the answer at the end of the response after a
separator ####.

Chain-of-Draft
Think step by step, but only keep a minimum draft for
each thinking step, with 5 words at most. Return the
answer at the end of the response after a separator
 ####.

Thats interesting. Good tip.

https://arxiv.org/pdf/2502.18600

[–] oktoberpaard@feddit.nl 2 points 13 hours ago* (last edited 13 hours ago)

Looking at their repo, they’ve tested this with LLM models that have not been trained to generate chain of thought outputs, by varying the system prompts. It’s therefore more of a proof of concept, but I can imagine that if you train a model to do this natively it could work.

Using the same prompt with QwQ made no difference for me (the chain of thought was still very long and quite verbose), while using it with Qwen2.5 Coder made the output extremely terse and not very useful for open-ended questions.