Announcement_7
We released AgentVista, the first benchmark for multimodal agents on realistic, ultra-challenging visual scenarios with long-horizon hybrid tool use.