Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Griffin