2405.04434_DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

  • arXiv: https://arxiv.org/abs/2405.04434

  • Hugging Face paper: https://huggingface.co/papers/2405.04434

  • GitHub: https://github.com/deepseek-ai/DeepSeek-V2