LLM Decoding Algorithm

arxiv.org
https://arxiv.org › abs
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Mar 18, 2025 · Inference scaling empowers LLMs with unprecedented reasoning ability, with reinforcement learning as the core technique to elicit complex reasoning. However, key technical …
arxiv.org
https://arxiv.org › abs
[2502.16982] Muon is Scalable for LLM Training
Feb 24, 2025 · Recently, the Muon optimizer based on matrix orthogonalization has demonstrated strong results in training small-scale language models, but the scalability to larger models has not …
arxiv.org
https://arxiv.org › pdf
[PDF]
Understanding LLMs: A Comprehensive Overview from Training to ...
This requirement emphasizes the need for researchers developing LLMs to possess significant engineering capabilities in addressing the challenges encountered during LLM development. …
arxiv.org
https://arxiv.org › pdf
[PDF]
Large Language Model based Multi-Agents: A Survey of Progress ...
Timely survey papers systematically summarize the progress of LLM-based agents, as seen in works [Xi et al., 2023; Wang et al., 2023b]. Based on the inspiring capabilities of the single LLM-based agent, …
arxiv.org
https://arxiv.org › abs
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment
Feb 14, 2025 · Despite notable advancements in Multimodal Large Language Models (MLLMs), most state-of-the-art models have not undergone thorough alignment with human preferences. This gap …
arxiv.org
https://arxiv.org › abs
[2412.04315] Densing Law of LLMs
Dec 5, 2024 · To calculate the capacity density of a given target LLM, we first introduce a set of reference models and develop a scaling law to predict the downstream performance of these …
arxiv.org
https://arxiv.org › abs
[2501.01005] FlashInfer: Efficient and Customizable Attention ...
Jan 2, 2025 · Transformers, driven by attention mechanisms, form the foundation of large language models (LLMs). As these models scale up, efficient GPU attention kernels become essential for high …

