arXiv:2510.14703v2 Announce Type: replace
Abstract: Large language models (LLMs) excel at function calling, but inference scaling has been explored mainly for unstructured generation. We propose an inference-scaling framework for structured outputs that combines fine-grained beam search with ToolPRM, a process reward model that scores each intra-call decision (function-name selection and argument filling). We build the first fine-grained intra-call supervision dataset via function masking, rollout collection, and step-level annotation. ToolPRM outperforms outcome-level and coarse-grained reward models in predictive accuracy and yields consistent test-time gains on multiple function-calling benchmarks. We further show that structured generation follows an "explore more but retain less" principle, since early JSON errors are unrecoverable.
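The mechanism the abstract describes, beam search over intra-call steps guided by a step-level reward, can be sketched in a few lines. Everything below is hypothetical: the step options, the toy scoring rule, and the function names are stand-ins for the learned ToolPRM and a real function-calling decoder, not the paper's implementation.

```python
import heapq

# Hypothetical decision steps for one structured call:
# first the function name, then each argument assignment.
STEP_OPTIONS = [
    ["get_weather", "get_stock"],            # step 0: function name
    ['"city": "Paris"', '"city": "Tokyo"'],  # step 1: first argument
    ['"unit": "C"', '"unit": "F"'],          # step 2: second argument
]

def toy_prm(prefix):
    """Stand-in for ToolPRM: score a partial call (list of completed steps).

    Toy rule: fraction of steps matching one 'correct' call. A real PRM
    is a learned model over the partial structured output.
    """
    target = ["get_weather", '"city": "Paris"', '"unit": "C"']
    return sum(a == b for a, b in zip(prefix, target)) / len(target)

def beam_search(beam_width=2, expand_k=2):
    beams = [([], 0.0)]  # (prefix of steps, PRM score)
    for options in STEP_OPTIONS:
        candidates = []
        for prefix, _ in beams:
            # "Explore more": expand several candidates per beam ...
            for opt in options[:expand_k]:
                new = prefix + [opt]
                candidates.append((new, toy_prm(new)))
        # "... but retain less": prune aggressively, since a malformed
        # early step (e.g. a broken JSON key) can never be repaired later.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[1])
    return beams

best_prefix, best_score = beam_search()[0]
```

In this toy run the top beam assembles the full target call with score 1.0; the asymmetry the paper reports corresponds to choosing a large `expand_k` relative to `beam_width`.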
Differential acceptance of a national digital health platform among community and frontline health workers in Côte d'Ivoire: a cross-sectional study
Introduction: Mobile-based digital health solutions are critical technologies that play a significant role in improving the quality of healthcare services. Côte d'Ivoire is digitizing its community-based