-
Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing
Paper • 2512.23611 • Published • 7 -
Context as a Tool: Context Management for Long-Horizon SWE-Agents
Paper • 2512.22087 • Published • 4 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 64 -
Very Large-Scale Multi-Agent Simulation in AgentScope
Paper • 2407.17789 • Published • 42
Collections
Discover the best community collections!
Collections including paper arxiv:2601.16443
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 19 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 104 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 43
-
The Art of Scaling Reinforcement Learning Compute for LLMs
Paper • 2510.13786 • Published • 33 -
Attention Is All You Need for KV Cache in Diffusion LLMs
Paper • 2510.14973 • Published • 43 -
BitNet Distillation
Paper • 2510.13998 • Published • 61 -
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
Paper • 2510.19430 • Published • 53
-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 94 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 42 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
-
obiwan96/endless-terminals
Updated • 85.9k • 2 -
obiwan96/qwen3-8b-openthinker-sft-endless-terminals
Text Generation • 8B • Updated • 666 -
obiwan96/qwen-2.5-7b-instruct-endless-terminals
Text Generation • 8B • Updated • 9 -
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 19
-
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Paper • 2510.03222 • Published • 76 -
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 112 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 516 -
Multi-Agent Tool-Integrated Policy Optimization
Paper • 2510.04678 • Published • 31
-
Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing
Paper • 2512.23611 • Published • 7 -
Context as a Tool: Context Management for Long-Horizon SWE-Agents
Paper • 2512.22087 • Published • 4 -
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 64 -
Very Large-Scale Multi-Agent Simulation in AgentScope
Paper • 2407.17789 • Published • 42
-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 94 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 42 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 19 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 104 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 43
-
obiwan96/endless-terminals
Updated • 85.9k • 2 -
obiwan96/qwen3-8b-openthinker-sft-endless-terminals
Text Generation • 8B • Updated • 666 -
obiwan96/qwen-2.5-7b-instruct-endless-terminals
Text Generation • 8B • Updated • 9 -
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 19
-
The Art of Scaling Reinforcement Learning Compute for LLMs
Paper • 2510.13786 • Published • 33 -
Attention Is All You Need for KV Cache in Diffusion LLMs
Paper • 2510.14973 • Published • 43 -
BitNet Distillation
Paper • 2510.13998 • Published • 61 -
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
Paper • 2510.19430 • Published • 53
-
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Paper • 2510.03222 • Published • 76 -
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 112 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 516 -
Multi-Agent Tool-Integrated Policy Optimization
Paper • 2510.04678 • Published • 31