LLMsMedium impactFor DevarXiv LLMs · June 12, 2026

AdaSR: Adaptive Streaming Reasoning with Hierarchical Relative Policy Optimization

AdaSR introduces an adaptive streaming reasoning framework for large models that reason during input stream and finalize decisions with a new policy optimization method, improving reasoning accuracy and efficiency.
Signal strength3.4/5·arXiv LLMs

AdaSR introduces an adaptive streaming reasoning framework for large models that reason during input stream and finalize decisions with a new policy optimization method, improving reasoning accuracy and efficiency.

TL;DR

AdaSR introduces an adaptive streaming reasoning framework for large models that reason during input stream and finalize decisions with a new policy optimization method, improving reasoning accuracy and efficiency.

What happened

Researchers proposed AdaSR, which enables large reasoning models to perform reasoning dynamically during streaming inputs and final deliberation once complete, optimized via Hierarchical Relative Policy Optimization (HRPO) to balance accuracy, computation, and latency. Code is publicly available.

Why it matters

This approach addresses limitations of traditional static read-then-think paradigms for dynamic streaming data, allowing more flexible, latency-aware reasoning that better fits real-world continuous input scenarios like audio and video streams.

Generating deep dive...

AI-powered analysis takes a few seconds