RecurrentGemma

RecurrentGemma is an open model based on Griffin, a hybrid model architecture that mixes gated linear recurrences with local sliding window attention.
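
To make the core idea concrete, here is a minimal, illustrative sketch of a gated linear recurrence in JAX. It is not the Griffin block itself: the gating scheme, shapes, and function names are assumptions made for illustration, and the real model combines such recurrences with local sliding window attention and learned projections.

```python
# Minimal sketch (not the Griffin implementation) of a gated linear
# recurrence: h_t = a_t * h_{t-1} + b_t * x_t, where the gates a_t and b_t
# would normally be computed from the input. The carried state has a fixed
# size, so memory does not grow with sequence length. All shapes and the
# gating scheme here are illustrative assumptions.
import jax
import jax.numpy as jnp


def gated_linear_recurrence(x, a, b):
    """x, a, b: arrays of shape (seq_len, hidden); returns all hidden states."""

    def step(h_prev, inputs):
        x_t, a_t, b_t = inputs
        h_t = a_t * h_prev + b_t * x_t  # element-wise gated update
        return h_t, h_t

    h0 = jnp.zeros_like(x[0])
    _, hs = jax.lax.scan(step, h0, (x, a, b))
    return hs


key = jax.random.PRNGKey(0)
seq_len, hidden = 8, 4
x = jax.random.normal(key, (seq_len, hidden))
# Gates in (0, 1), e.g. produced by a sigmoid over learned projections.
a = jax.nn.sigmoid(jax.random.normal(key, (seq_len, hidden)))
b = 1.0 - a
print(gated_linear_recurrence(x, a, b).shape)  # (8, 4)
```

Because the carried state has a fixed size, the memory needed while processing or generating a sequence does not grow with its length, which is what drives the advantages listed below.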

Like Gemma, RecurrentGemma is well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. However, RecurrentGemma's unique architecture offers the following additional advantages:

  • Reduced memory usage

    Lower memory requirements allow for the generation of longer samples on devices with limited memory, such as single GPUs or CPUs (see the usage sketch after this list).
  • Higher throughput

    RecurrentGemma can perform inference at significantly higher batch sizes, meaning it can generate substantially more tokens per second, especially when generating long sequences.
  • High performance

    RecurrentGemma matches Gemma's performance while requiring less memory and achieving faster inference.
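
As a rough illustration of how this plays out in practice, the sketch below loads a RecurrentGemma checkpoint through the Hugging Face Transformers library and generates a long completion on a single accelerator. The checkpoint name, generation settings, and the use of Transformers (rather than the reference JAX or PyTorch notebooks linked below) are assumptions for illustration; consult the model card and Kaggle page for the officially supported workflows.

```python
# Illustrative sketch (assumed checkpoint name and settings): generate a long
# completion with a RecurrentGemma model via the Hugging Face Transformers API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/recurrentgemma-2b-it"  # assumed; check Kaggle / model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain why recurrent architectures can be memory-efficient."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The fixed-size recurrent state (plus a bounded local-attention window) keeps
# memory use roughly flat as max_new_tokens grows, enabling long generations
# on devices with limited memory.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same property is what allows the batch size in a loop like this to be raised further than with a comparable attention-only model, which is where the throughput advantage comes from.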

More resources

View the model card

RecurrentGemma's model card contains detailed information about the model, its implementation, evaluations, usage and limitations, and more.

View on Kaggle

View more code, Colab notebooks, information, and discussions about RecurrentGemma on Kaggle.

Run on GitHub

Run example Colab notebooks for JAX and PyTorch on GitHub.
