This paper introduces ASearcher, an open-source project for large-scale RL training of search agents. Our key contributions include:
- Scalable fully asynchronous RL training that enables long-horizon search while maintaining high training efficiency.
- A prompt-based LLM agent that autonomously synthesizes high-quality, challenging question-answer (QA) pairs, creating a large-scale QA dataset.
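
To illustrate the first contribution, here is a minimal sketch of the general idea behind asynchronous RL training: rollout workers and the trainer are decoupled by a queue, so a long search episode does not block the rest of a batch. This is not ASearcher's implementation. The names `rollout_worker`, `trainer`, and `run`, the turn counts, and the sleep-based timing are all hypothetical stand-ins for real generation, tool calls, and policy updates.

```python
import asyncio
import random


async def rollout_worker(worker_id, queue, num_episodes, rng):
    """Simulate a search agent whose episodes vary in length (hypothetical)."""
    for episode in range(num_episodes):
        num_turns = rng.randint(1, 8)  # long-horizon episodes take more turns
        # Stand-in for LLM generation plus search-tool calls.
        await asyncio.sleep(0.001 * num_turns)
        await queue.put({"worker": worker_id, "episode": episode, "turns": num_turns})


async def trainer(queue, batch_size, total):
    """Form batches from whichever trajectories finish first."""
    batches, batch = [], []
    for _ in range(total):
        batch.append(await queue.get())
        if len(batch) == batch_size:
            # A real trainer would run a policy update on this batch here.
            batches.append(batch)
            batch = []
    if batch:  # keep any leftover partial batch
        batches.append(batch)
    return batches


async def run(num_workers=4, episodes_per_worker=5, batch_size=4, seed=0):
    rng = random.Random(seed)
    queue = asyncio.Queue()
    total = num_workers * episodes_per_worker
    workers = [
        asyncio.create_task(rollout_worker(i, queue, episodes_per_worker, rng))
        for i in range(num_workers)
    ]
    # The trainer consumes while workers are still producing.
    batches = await trainer(queue, batch_size, total)
    await asyncio.gather(*workers)
    return batches


if __name__ == "__main__":
    for i, batch in enumerate(asyncio.run(run())):
        print(i, [t["turns"] for t in batch])
```

In a synchronous setup, each batch would wait for its slowest trajectory. Here a batch is formed as soon as enough trajectories have finished, which is the property that keeps training efficient when some episodes are much longer than others.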
Model (based on QwQ): https://huggingface.co/inclusionAI/ASearcher-Local-14B