  1. DAPO: An Open-Source LLM Reinforcement Learning System at Scale

    Mar 18, 2025 · Inference scaling empowers LLMs with unprecedented reasoning ability, with reinforcement learning as the core technique to elicit complex reasoning. However, key technical …

  2. [2502.16982] Muon is Scalable for LLM Training

    Feb 24, 2025 · Recently, the Muon optimizer based on matrix orthogonalization has demonstrated strong results in training small-scale language models, but the scalability to larger models has not …

  3. This requirement emphasizes the need for researchers developing LLMs to possess significant engineering capabilities in addressing the challenges encountered during LLM development. …

  4. Timely survey papers systematically summarize the progress of LLM-based agents, as seen in works [Xi et al., 2023; Wang et al., 2023b]. Based on the inspiring capabilities of the single LLM-based agent, …

  5. MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

    Feb 14, 2025 · Despite notable advancements in Multimodal Large Language Models (MLLMs), most state-of-the-art models have not undergone thorough alignment with human preferences. This gap …

  6. [2412.04315] Densing Law of LLMs

    Dec 5, 2024 · To calculate the capacity density of a given target LLM, we first introduce a set of reference models and develop a scaling law to predict the downstream performance of these …

  7. [2501.01005] FlashInfer: Efficient and Customizable Attention ...

    Jan 2, 2025 · Transformers, driven by attention mechanisms, form the foundation of large language models (LLMs). As these models scale up, efficient GPU attention kernels become essential for high …