MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
Shilong Li*, Xingyuan Bu*, Wenjie Wang, Jiaheng Liu, et al. "MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents." arXiv 2025.
Yancheng He*, Shilong Li*, Jiaheng Liu*, et al. "Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?" ACL 2025.
Yancheng He*, Shilong Li*, Jiaheng Liu*, et al. "Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models." ACL 2025.
Shilong Li*, Yancheng He*, Hui Huang, et al. "2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision." NAACL Findings 2025.
Shilong Li*, Yancheng He*, Hangyu Guo*, Xingyuan Bu*, et al. "GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models." EMNLP Findings 2024.
Shilong Li*, Ge Bai*, Zhang Zhang*, et al. "Fusion Makes Perfection: An Efficient Multi-Grained Matching Approach for Zero-Shot Relation Extraction." NAACL 2024.