Machine Learning
ML Vision
- 1506.02640_You Only Look Once: Unified, Real-Time Object Detection
- 1612.08242_YOLO9000: Better, Faster, Stronger
- 1804.02767_YOLOv3: An Incremental Improvement
- 2004.10934_YOLOv4: Optimal Speed and Accuracy of Object Detection
- 2205.00159_SVTR: Scene Text Recognition with a Single Visual Model
- 2207.02696_YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
- 2303.05499_Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
- 2304.08485_Visual Instruction Tuning
- 2402.13616_YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
- 2405.14458_YOLOv10: Real-Time End-to-End Object Detection
- 2411.15858_SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition
ML
- 2112.09332_WebGPT: Browser-assisted question-answering with human feedback
- 2203.11147_GopherCite: Teaching language models to support answers with verified quotes
- 2304.09848_Generative_Search: Evaluating Verifiability in Generative Search Engines
- 2305.14251_FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
- 2305.14627_ALCE: Enabling Large Language Models to Generate Text with Citations
- 2307.02185_Citation: A Key to Building Responsible and Accountable Large Language Models
- 2307.16883_HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution