Evaluation Benchmarks
#####################

Evaluation Benchmarks
=====================

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/Standards/*

Datasets: Agent
===============

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/DS_Agents/*

Datasets: QA
============

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/DS_QAs/*

Datasets: Coding
================

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/DS_Codes/*

Datasets: Long Context
======================

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/DS_LongCtxs/*

Datasets: Math
==============

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/DS_Maths/*

Datasets: Images
================

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/DS_Images/*

Datasets
========

.. toctree::
   :maxdepth: 1
   :glob:

   Benchmarkings/Datasets/*