
Models

llama-4-scout-17b-16e-instruct
Text Generation · Meta

Meta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

  • Function calling
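
Calling one of these text-generation models from a Worker is a single binding invocation. The sketch below is a minimal example, assuming a Workers AI binding named AI configured in wrangler; the model ID follows the usual `@cf/<provider>/<model>` convention and the `{ response }` output shape is the common one for non-streaming text models, so double-check both against the model page.

```ts
// Minimal Worker sketch: run a chat-style prompt against llama-4-scout.
// Assumes an AI binding named `AI` in wrangler configuration; the binding is
// typed loosely here to keep the sketch self-contained.
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export default {
  async fetch(_req: Request, env: { AI: AiBinding }): Promise<Response> {
    const result = await env.AI.run("@cf/meta/llama-4-scout-17b-16e-instruct", {
      messages: [
        { role: "system", content: "You are a concise assistant." },
        { role: "user", content: "Explain mixture-of-experts in two sentences." },
      ],
    });
    // Most text-generation models return `{ response: string }` when not streaming.
    return Response.json(result);
  },
};
```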

llama-3.3-70b-instruct-fp8-fast
Text Generation · Meta

Llama 3.3 70B quantized to fp8 precision, optimized to be faster.

  • Batch
  • Function calling
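
Models tagged with Function calling accept a `tools` array alongside the messages and can answer with structured tool calls instead of free text. The sketch below is illustrative only: the `get_weather` tool is hypothetical, and the exact field the model uses to return tool calls should be confirmed on the model page.

```ts
// Sketch: function calling with a tools array (tool name and schema are hypothetical).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function askWithTools(env: { AI: AiBinding }) {
  const result = await env.AI.run("@cf/meta/llama-3.3-70b-instruct-fp8-fast", {
    messages: [{ role: "user", content: "What is the weather in Tokyo?" }],
    tools: [
      {
        name: "get_weather", // hypothetical tool
        description: "Look up current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    ],
  });
  // If the model decides to call a tool, the call (name + arguments) is expected
  // in the response instead of plain text.
  return result;
}
```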

llama-3.1-8b-instruct-fast
Text Generation · Meta

[Fast version] The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

gemma-3-12b-it
Text Generation · Google

Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Gemma 3 models are multimodal, handling text and image input and generating text output, with a large 128K context window, multilingual support in over 140 languages, and availability in more sizes than previous versions.

  • LoRA

mistral-small-3.1-24b-instruct
Text Generation · MistralAI

Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.

  • Function calling

qwq-32b
Text Generation · Qwen

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, achieves significantly better performance on downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, capable of competitive performance against state-of-the-art reasoning models such as DeepSeek-R1 and o1-mini.

  • LoRA

qwen2.5-coder-32b-instruct
Text Generation · Qwen

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder covers six mainstream model sizes (0.5, 1.5, 3, 7, 14, and 32 billion parameters) to meet the needs of different developers, and brings significant improvements over CodeQwen1.5.

  • LoRA

bge-reranker-base
Text Classification · baai

Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. You can obtain a relevance score by passing a query and a passage to the reranker, and the score can be mapped to a float value in [0, 1] with a sigmoid function.
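
A minimal reranking sketch under the description above: the query and candidate passages go in, per-passage relevance scores come out. The input field names (`query`, `contexts`) and the shape of the score list are assumptions; verify them against the bge-reranker-base model page.

```ts
// Sketch: score candidate passages against a query with the reranker.
// Field names below (`query`, `contexts`) are assumptions, not a confirmed schema.
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function rerank(env: { AI: AiBinding }, query: string, passages: string[]) {
  const result = await env.AI.run("@cf/baai/bge-reranker-base", {
    query,
    contexts: passages.map((text) => ({ text })),
  });
  // Expected (assumed) output: one relevance score per passage, higher = more relevant.
  return result;
}
```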

llama-guard-3-8b
Text Generation · Meta

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and LLM responses (response classification). It acts as an LLM: it generates text indicating whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.

  • LoRA
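
Since Llama Guard reports its verdict as generated text, a safety check can wrap an ordinary chat call. This is a rough sketch: it assumes the model accepts a `messages` array and that the generated text starts with "safe" or "unsafe"; take the exact input schema and output format from the model page.

```ts
// Sketch: classify a user prompt with Llama Guard and branch on the verdict.
// Assumes a messages-style input and a text verdict beginning with "safe"/"unsafe".
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function isPromptSafe(env: { AI: AiBinding }, userPrompt: string) {
  const result = await env.AI.run("@cf/meta/llama-guard-3-8b", {
    messages: [{ role: "user", content: userPrompt }],
  });
  const verdict = String(result?.response ?? "").trim().toLowerCase();
  // "unsafe" verdicts are expected to be followed by the violated category codes.
  return !verdict.startsWith("unsafe");
}
```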

deepseek-r1-distill-qwen-32b
Text Generation · DeepSeek

DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

llama-3.2-1b-instruct
Text Generation · Meta

The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.

llama-3.2-3b-instruct
Text Generation · Meta

The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.

llama-3.2-11b-vision-instruct
Text Generation · Meta

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

  • LoRA

flux-1-schnell
Text-to-Image · Black Forest Labs

FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.
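
A text-to-image call is a prompt in, an image out. The sketch below assumes FLUX.1 [schnell] returns the generated image as a base64-encoded string in the response (the `image` field name and the content type are assumptions); other image models on the platform may instead stream raw bytes.

```ts
// Sketch: generate an image from a prompt with flux-1-schnell.
// Assumes a base64 `image` field in the JSON response; content type may differ.
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export default {
  async fetch(_req: Request, env: { AI: AiBinding }): Promise<Response> {
    const result = await env.AI.run("@cf/black-forest-labs/flux-1-schnell", {
      prompt: "a lighthouse on a cliff at sunset, watercolor",
    });
    const bytes = Uint8Array.from(atob(result.image), (c) => c.charCodeAt(0));
    return new Response(bytes, { headers: { "content-type": "image/png" } });
  },
};
```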

llama-3.1-8b-instruct-awq
Text Generation · Meta

Quantized (int4) generative text model with 8 billion parameters from Meta.

llama-3.1-8b-instruct-fp8
Text Generation · Meta

Llama 3.1 8B quantized to FP8 precision.

melotts
Text-to-Speech · myshell-ai

MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai.

llama-3.1-8b-instruct
Text Generation · Meta

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.

bge-m3
Text Embeddings · baai

Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.

  • Batch
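
Embedding models take a string or an array of strings and return one vector per input. A minimal sketch, assuming the usual `text` input field and a `data` array of vectors in the response:

```ts
// Sketch: embed a batch of strings with bge-m3 (input/output shapes assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function embed(env: { AI: AiBinding }, texts: string[]) {
  const result = await env.AI.run("@cf/baai/bge-m3", { text: texts });
  // Assumed response shape: { shape: [n, dims], data: number[][] }, one vector per input.
  return result.data as number[][];
}
```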

meta-llama-3-8b-instruct
Text Generation · meta-llama

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.

whisper-large-v3-turbo
Automatic Speech Recognition · OpenAI

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.
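
Speech-to-text calls send audio and get back a transcript. In the sketch below the audio is passed as an array of bytes and the transcript is read from a `text` field; whether a given Whisper variant expects raw bytes or base64-encoded audio differs per model, so treat the encoding here as an assumption.

```ts
// Sketch: transcribe an uploaded audio file with a Whisper model.
// The audio encoding (byte array) and the `text` output field are assumptions.
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export default {
  async fetch(req: Request, env: { AI: AiBinding }): Promise<Response> {
    const audio = new Uint8Array(await req.arrayBuffer()); // POST body = audio file
    const result = await env.AI.run("@cf/openai/whisper-large-v3-turbo", {
      audio: [...audio],
    });
    return Response.json({ transcript: result.text });
  },
};
```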

llama-3-8b-instruct-awq
Text Generation · Meta

Quantized (int4) generative text model with 8 billion parameters from Meta.

llava-1.5-7b-hf (Beta)
Image-to-Text · llava-hf

LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.
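
Image-to-text models pair an image with an optional prompt and return a description. A rough sketch, assuming the image is sent as a byte array and the answer comes back as a description string (both assumptions):

```ts
// Sketch: caption an uploaded image with LLaVA (input/output field names assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export default {
  async fetch(req: Request, env: { AI: AiBinding }): Promise<Response> {
    const image = new Uint8Array(await req.arrayBuffer()); // POST body = image file
    const result = await env.AI.run("@cf/llava-hf/llava-1.5-7b-hf", {
      image: [...image],
      prompt: "Describe this image in one sentence.",
      max_tokens: 128,
    });
    return Response.json(result); // assumed to contain a description string
  },
};
```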

una-cybertron-7b-v2-bf16 (Beta)
Text Generation · fblgit

Cybertron 7B v2 is a 7B MistralAI-based model, the best in its series. It was trained with SFT, DPO, and UNA (Unified Neural Alignment) on multiple datasets.

whisper-tiny-en (Beta)
Automatic Speech Recognition · OpenAI

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model, which was trained on the task of speech recognition.

llama-3-8b-instruct
Text Generation · Meta

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.

mistral-7b-instruct-v0.2 (Beta)
Text Generation · MistralAI

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.2. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1: a 32k context window (vs 8k in v0.1), rope-theta = 1e6, and no Sliding-Window Attention.

  • LoRA

gemma-7b-it-lora (Beta)
Text Generation · Google

This is a Gemma-7B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

  • LoRA
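
The LoRA-dedicated base models are meant to be run with an adapter applied at inference time. The sketch below only illustrates that pattern: the adapter name `my-finetune` is hypothetical, and the `raw`/`lora` parameters shown are assumptions to be checked against the Workers AI fine-tuning docs.

```ts
// Sketch: run the Gemma LoRA base model with a (hypothetical) adapter applied.
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function runWithAdapter(env: { AI: AiBinding }, prompt: string) {
  return env.AI.run("@cf/google/gemma-7b-it-lora", {
    prompt,              // raw prompt; apply the model's chat template yourself
    raw: true,           // assumption: skip automatic prompt templating
    lora: "my-finetune", // hypothetical adapter name uploaded to your account
  });
}
```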

gemma-2b-it-lora (Beta)
Text Generation · Google

This is a Gemma-2B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

  • LoRA

llama-2-7b-chat-hf-lora (Beta)
Text Generation · meta-llama

This is a Llama 2 base model that Cloudflare dedicates for inference with LoRA adapters. Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.

  • LoRA

gemma-7b-it (Beta)
Text Generation · Google

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.

  • LoRA

starling-lm-7b-beta (Beta)
Text Generation · nexusflow

Starling-LM-7B-beta is an open large language model (LLM) trained by Reinforcement Learning from AI Feedback (RLAIF). It is trained from Openchat-3.5-0106 with the Nexusflow/Starling-RM-34B reward model and the policy optimization method from Fine-Tuning Language Models from Human Preferences (PPO).

hermes-2-pro-mistral-7b (Beta)
Text Generation · nousresearch

Hermes 2 Pro on Mistral 7B is the new flagship 7B Hermes. Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house.

  • Function calling

mistral-7b-instruct-v0.2-lora (Beta)
Text Generation · MistralAI

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.

  • LoRA

qwen1.5-1.8b-chat (Beta)
Text Generation · Qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud.

uform-gen2-qwen-500m (Beta)
Image-to-Text · unum

UForm-Gen is a small generative vision-language model primarily designed for image captioning and visual question answering. The model was pre-trained on an internal image-captioning dataset and fine-tuned on public instruction datasets: SVIT, LVIS, and VQA datasets.

bart-large-cnn (Beta)
Summarization · facebook

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. You can use this model for text summarization.
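
Summarization is a single call: text in, summary out. A minimal sketch, assuming `input_text`/`max_length` as input fields and a `summary` field in the response:

```ts
// Sketch: summarize a block of text with bart-large-cnn (field names assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function summarize(env: { AI: AiBinding }, article: string) {
  const result = await env.AI.run("@cf/facebook/bart-large-cnn", {
    input_text: article,
    max_length: 512,
  });
  return result.summary as string;
}
```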

phi-2 (Beta)
Text Generation · Microsoft

Phi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from multiple passes on a mixture of synthetic and Web datasets for NLP and coding.

tinyllama-1.1b-chat-v1.0 (Beta)
Text Generation · tinyllama

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. This is the chat model finetuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.

qwen1.5-14b-chat-awq (Beta)
Text Generation · Qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.

qwen1.5-7b-chat-awq (Beta)
Text Generation · Qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.

qwen1.5-0.5b-chat (Beta)
Text Generation · Qwen

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud.

discolm-german-7b-v1-awq (Beta)
Text Generation · thebloke

DiscoLM German 7b is a Mistral-based large language model with a focus on German-language applications. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.

falcon-7b-instruct (Beta)
Text Generation · tiiuae

Falcon-7B-Instruct is a 7B-parameter causal decoder-only model built by TII based on Falcon-7B and finetuned on a mixture of chat/instruct datasets.

openchat-3.5-0106 (Beta)
Text Generation · openchat

OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning.

sqlcoder-7b-2 (Beta)
Text Generation · defog

This model is intended to be used by non-technical users to understand data inside their SQL databases.
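
Text-to-SQL models like this one are driven entirely by the prompt: you include the table schema and the natural-language question, and the model answers with a SQL query. The prompt template below is illustrative rather than the model's official format.

```ts
// Sketch: ask sqlcoder-7b-2 to write a query for a given schema.
// The prompt layout is illustrative; consult the model card for its preferred format.
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function textToSql(env: { AI: AiBinding }, question: string) {
  const schema =
    "CREATE TABLE orders (id INT, customer TEXT, total NUMERIC, created_at DATE);";
  const result = await env.AI.run("@cf/defog/sqlcoder-7b-2", {
    prompt: `### Task\nWrite a SQL query that answers: ${question}\n\n### Schema\n${schema}\n\n### SQL\n`,
  });
  return result.response as string;
}
```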

deepseek-math-7b-instruct (Beta)
Text Generation · DeepSeek

DeepSeekMath-Instruct 7B is a mathematically instructed tuning model derived from DeepSeekMath-Base 7B. DeepSeekMath is initialized with DeepSeek-Coder-v1.5 7B and continues pre-training on math-related tokens sourced from Common Crawl, together with natural language and code data, for 500B tokens.

detr-resnet-50 (Beta)
Object Detection · facebook

DEtection TRansformer (DETR) model trained end-to-end on COCO 2017 object detection (118k annotated images).
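
Object detection takes an image and returns labeled boxes with confidence scores. A sketch under assumed field names (byte-array `image` in, an array of label/score/box objects out):

```ts
// Sketch: detect objects in an uploaded image with detr-resnet-50 (shapes assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export default {
  async fetch(req: Request, env: { AI: AiBinding }): Promise<Response> {
    const image = new Uint8Array(await req.arrayBuffer()); // POST body = image file
    const detections = await env.AI.run("@cf/facebook/detr-resnet-50", {
      image: [...image],
    });
    // Assumed output: [{ label, score, box: { xmin, ymin, xmax, ymax } }, ...]
    return Response.json(detections);
  },
};
```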

stable-diffusion-xl-lightning (Beta)
Text-to-Image · bytedance

SDXL-Lightning is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.

dreamshaper-8-lcm (Beta)
Text-to-Image · lykon

Stable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range.

stable-diffusion-v1-5-img2img (Beta)
Text-to-Image · runwayml

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images. Img2img generates a new image from an input image with Stable Diffusion.

stable-diffusion-v1-5-inpainting (Beta)
Text-to-Image · runwayml

Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
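
Inpainting takes the original image, a mask marking the region to repaint, and a prompt describing what should appear there. The sketch below assumes both image and mask are passed as byte arrays; the exact parameter names and the response format (raw image bytes vs JSON) should be checked on the model page.

```ts
// Sketch: inpaint a masked region of an image (parameter names assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function inpaint(
  env: { AI: AiBinding },
  image: Uint8Array, // source image bytes
  mask: Uint8Array,  // mask bytes marking the region to repaint (assumed convention)
  prompt: string,
) {
  return env.AI.run("@cf/runwayml/stable-diffusion-v1-5-inpainting", {
    prompt,
    image: [...image],
    mask: [...mask],
  });
}
```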

deepseek-coder-6.7b-instruct-awq (Beta)
Text Generation · thebloke

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese.

deepseek-coder-6.7b-base-awq (Beta)
Text Generation · thebloke

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese.

llamaguard-7b-awq (Beta)
Text Generation · thebloke

Llama Guard is a model for classifying the safety of LLM prompts and responses, using a taxonomy of safety risks.

neural-chat-7b-v3-1-awq (Beta)
Text Generation · thebloke

This model is a 7B-parameter LLM fine-tuned from mistralai/Mistral-7B-v0.1 on the open-source Open-Orca/SlimOrca dataset, using the Intel Gaudi 2 processor.

openhermes-2.5-mistral-7b-awq (Beta)
Text Generation · thebloke

OpenHermes 2.5 Mistral 7B is a state-of-the-art Mistral fine-tune and a continuation of the OpenHermes 2 model, trained on additional code datasets.

llama-2-13b-chat-awq (Beta)
Text Generation · thebloke

Llama 2 13B Chat AWQ is an efficient, accurate and blazing-fast low-bit weight quantized Llama 2 variant.

mistral-7b-instruct-v0.1-awq (Beta)
Text Generation · thebloke

Mistral 7B Instruct v0.1 AWQ is an efficient, accurate and blazing-fast low-bit weight quantized Mistral variant.

zephyr-7b-beta-awq (Beta)
Text Generation · thebloke

Zephyr 7B Beta AWQ is an efficient, accurate and blazing-fast low-bit weight quantized Zephyr model variant.

stable-diffusion-xl-base-1.0 (Beta)
Text-to-Image · Stability.ai

Diffusion-based text-to-image generative model by Stability AI. Generates and modifies images based on text prompts.

bge-large-en-v1.5
Text Embeddings · baai

BAAI general embedding (Large) model that transforms any given text into a 1024-dimensional vector.

  • Batch

bge-small-en-v1.5
Text Embeddings · baai

BAAI general embedding (Small) model that transforms any given text into a 384-dimensional vector.

  • Batch

llama-2-7b-chat-fp16
Text Generation · Meta

Full precision (fp16) generative text model with 7 billion parameters from Meta.

mistral-7b-instruct-v0.1
Text Generation · MistralAI

Instruct fine-tuned version of the Mistral-7b generative text model with 7 billion parameters.

  • LoRA

bge-base-en-v1.5
Text Embeddings · baai

BAAI general embedding (Base) model that transforms any given text into a 768-dimensional vector.

  • Batch

distilbert-sst-2-int8
Text Classification · HuggingFace

Distilled BERT model that was finetuned on SST-2 for sentiment classification.
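
Sentiment classification returns a score per label. A minimal sketch, assuming a `text` input and a list of label/score pairs (POSITIVE/NEGATIVE) in the response:

```ts
// Sketch: classify sentiment with distilbert-sst-2-int8 (output shape assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function sentiment(env: { AI: AiBinding }, text: string) {
  const result = await env.AI.run("@cf/huggingface/distilbert-sst-2-int8", { text });
  // Assumed output: [{ label: "POSITIVE" | "NEGATIVE", score: number }, ...]
  return result;
}
```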

llama-2-7b-chat-int8
Text Generation · Meta

Quantized (int8) generative text model with 7 billion parameters from Meta.

m2m100-1.2b
Translation · Meta

Multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation.

  • Batch
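
Translation calls name the source and target languages alongside the text. A minimal sketch, assuming `text`/`source_lang`/`target_lang` inputs and a `translated_text` output field:

```ts
// Sketch: translate a string with m2m100-1.2b (field names assumed).
type AiBinding = { run(model: string, inputs: Record<string, unknown>): Promise<any> };

export async function translate(env: { AI: AiBinding }, text: string) {
  const result = await env.AI.run("@cf/meta/m2m100-1.2b", {
    text,
    source_lang: "english",
    target_lang: "french",
  });
  return result.translated_text as string;
}
```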

resnet-50
Image Classification · Microsoft

A 50-layer-deep image classification CNN trained on more than 1M images from ImageNet.

whisper
Automatic Speech Recognition · OpenAI

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

llama-3.1-70b-instruct
Text Generation · Meta

The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.