While Large Language Model (LLM) agents demonstrate proficiency on static benchmarks, their deployment in real-world scenarios is hindered by constantly evolving user queries, tool sets, and interaction dynamics. To address this generalization gap, we formalize OpenAgent (Tool-Use Agent in Open-World), a problem setting characterized by distributional shifts across four dimensions: query, action, observation, and domain.
We construct a controlled sandbox environment in which we define fine-grained environmental shifts across a four-tier hierarchy: Perception, Interaction, Reasoning, and Internalization. Our extensive experiments yield several key insights, demonstrating that agents trained via both Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) suffer varying degrees of performance degradation when confronting open environmental shifts.
Building on these insights, we propose Perturbation-Augmented Fine-Tuning (PAFT), a perturbation-based intervention for SFT that lays the foundation for enhancing agent robustness and utility in realistic environments.
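This excerpt does not detail how PAFT constructs its training data, so the following is a minimal sketch of one plausible realization: mixing clean SFT examples with schema-level perturbations so that deployment-time shifts are no longer entirely out-of-distribution. The `ToolCall` structure and the perturbation operators here are illustrative assumptions, not the authors' implementation.

```python
import random
from dataclasses import dataclass, replace

# Illustrative stand-in for one tool-use SFT example; the real PAFT
# data format is not specified in this excerpt.
@dataclass(frozen=True)
class ToolCall:
    tool_name: str
    params: dict

def rename_tool(call: ToolCall) -> ToolCall:
    # Hypothetical perturbation: expose the same tool under an aliased
    # name, simulating schema drift between training and deployment.
    return replace(call, tool_name=f"{call.tool_name}_v2")

def drop_one_param(call: ToolCall) -> ToolCall:
    # Hypothetical perturbation: remove one parameter to mimic a
    # narrower tool signature at inference time.
    if not call.params:
        return call
    key = random.choice(list(call.params))
    return replace(call, params={k: v for k, v in call.params.items() if k != key})

PERTURBATIONS = [rename_tool, drop_one_param]

def augment_sft_data(calls: list[ToolCall], p: float = 0.3) -> list[ToolCall]:
    """Mix clean and perturbed examples, the core idea behind
    perturbation-augmented fine-tuning."""
    augmented = []
    for call in calls:
        augmented.append(call)  # always keep the clean example
        if random.random() < p:
            augmented.append(random.choice(PERTURBATIONS)(call))
    return augmented
```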
Under the prevailing static-world assumption, where the distribution of tools, schemas, and interaction logic remains consistent between training and inference, both the Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) paradigms achieve stable, continuous performance gains, eventually converging on near-perfect success rates. However, this stability is largely an artifact of the closed-set nature of current benchmarks.
Real-world deployment is fundamentally non-stationary. To rigorously address this generalization gap, we formally define OpenAgent (Tool-Use Agent in Open-World), characterizing distributional shifts across User Queries, Tool Sets, and Interaction Dynamics.
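One natural way to state the setting formally (our notation, not taken from the paper): let $q$, $a$, $o$, and $d$ denote the query, action, observation, and domain dimensions from the abstract. OpenAgent drops the static-world assumption that training and deployment share a joint distribution,

$$
P_{\text{train}}(q, a, o, d) \;\neq\; P_{\text{deploy}}(q, a, o, d),
$$

so success rates measured under $P_{\text{train}}$ need not transfer to deployment.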
We establish a sandbox environment for "controlled probing," allowing us to systematically inject open-world perturbations across a comprehensive four-tier diagnostic framework: Perception, Interaction, Reasoning, and Internalization.
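The excerpt does not specify which perturbations populate each tier, so the mapping below is purely illustrative; the tier names come from the paper, while the operator names and the `sample_shift` helper are our own assumptions.

```python
import random

# Illustrative mapping from diagnostic tier to candidate perturbations.
# The four tier names are the paper's; the concrete perturbations are
# assumptions about what each tier could cover.
TIER_PERTURBATIONS: dict[str, list[str]] = {
    "Perception":      ["paraphrase_query", "reorder_tool_list"],
    "Interaction":     ["rename_tool", "change_param_schema"],
    "Reasoning":       ["add_distractor_tools", "require_extra_step"],
    "Internalization": ["swap_domain", "invert_default_convention"],
}

def sample_shift(tier: str, rng: random.Random) -> str:
    """Pick one perturbation to inject for a sandbox episode in the given tier."""
    return rng.choice(TIER_PERTURBATIONS[tier])

if __name__ == "__main__":
    rng = random.Random(0)
    for tier in TIER_PERTURBATIONS:
        print(tier, "->", sample_shift(tier, rng))
```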
Our exhaustive evaluations across the four tiers reveal varying degrees of generalization and adaptability in SFT and RL models under open-world settings, identifying critical failure modes in current training paradigms.
@inproceedings{wu2026openagent,
title = {Can Agents Generalize to the Open World? Unveiling the Fragility of Static Training in Tool Use},
author = {Wu, Weiming and Lv, Song-Lin and Zhu, Rui and Cheng, Zi-Jian and Guo, Lan-Zhe},
booktitle = {Proceedings of the 43rd International Conference on Machine Learning},
year = {2026}
}