Inference Endpoints (dedicated) documentation

Supported Transformers & Diffusers Tasks

Inference Endpoints offers out-of-the-box support for Machine Learning tasks from the following libraries:

  • Transformers
  • Sentence-Transformers
  • Diffusers (for the Text To Image task)

Below is a table of the tasks that Hugging Face manages and supports for Inference Endpoints. These tasks don't require any code or a "custom container" to deploy an Endpoint. If you want to customize any of the tasks below, or want to write your own custom task, check out the "Create your own inference handler" section for more information.

Most of the tasks below use the Transformers pipeline object, and the pipeline documentation describes the additional parameters that can be sent to the endpoint.

| Task | Framework | Out-of-the-box support |
| --- | --- | --- |
| Text To Image | Diffusers | ✅ |
| Text Classification | Transformers | ✅ |
| Zero Shot Classification | Transformers | ✅ |
| Token Classification | Transformers | ✅ |
| Question Answering | Transformers | ✅ |
| Fill Mask | Transformers | ✅ |
| Summarization | Transformers | ✅ |
| Translation | Transformers | ✅ |
| Text to Text Generation | Transformers | ✅ |
| Text Generation | Transformers | ✅ |
| Feature Extraction | Transformers | ✅ |
| Sentence Embeddings | Sentence Transformers | ✅ |
| Sentence Similarity | Sentence Transformers | ✅ |
| Ranking | Sentence Transformers | ✅ |
| Image Classification | Transformers | ✅ |
| Automatic Speech Recognition | Transformers | ✅ |
| Audio Classification | Transformers | ✅ |
| Object Detection | Transformers | ✅ |
| Image Segmentation | Transformers | ✅ |
| Table Question Answering | Transformers | ✅ |
| Conversational | Transformers | ✅ |
| Custom | Custom | ✅ |
| Visual Question Answering | Transformers | ❌ |
| Zero Shot Image Classification | Transformers | ❌ |

Example Request payloads

See the following request examples for some of the tasks:

Custom Handler

{
  "inputs": "This is a sample input",
  "moreData": 1,
  "customTask": true
}
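
On the server side, a custom handler receives this request body as a Python dictionary. The following is a minimal sketch, assuming the `EndpointHandler` class convention from the custom handler guide. How `moreData` and `customTask` are used here is purely illustrative, since those keys mean whatever your own handler decides.

```python
from typing import Any, Dict, List


class EndpointHandler:
    """Illustrative handler; the field handling below is hypothetical."""

    def __init__(self, path: str = "") -> None:
        # A real handler would typically load model weights from `path` here.
        self.path = path

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # "inputs" carries the main payload; any other keys are custom fields.
        inputs = data.get("inputs", "")
        more_data = data.get("moreData", 0)
        custom_task = data.get("customTask", False)
        return [{"echo": inputs, "moreData": more_data, "customTask": custom_task}]


handler = EndpointHandler()
result = handler({"inputs": "This is a sample input", "moreData": 1, "customTask": True})
print(result)
```

Reading optional keys with `data.get(...)` and a default keeps the handler tolerant of requests that only send `inputs`.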

Text Classification

For additional parameters, see this reference.

Classifying a single text

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Classifying a text pair

{
  "inputs": {
    "text": "This sound track was beautiful!",
    "text_pair": "It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
  }
}
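
The JSON payloads on this page can be sent from any HTTP client. The sketch below builds such a request with Python's standard library, assuming the endpoint accepts a JSON body with a bearer token in the `Authorization` header. `ENDPOINT_URL`, `HF_TOKEN`, and `build_json_request` are hypothetical names, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute your own endpoint URL and access token.
ENDPOINT_URL = "https://example.invalid/"
HF_TOKEN = "hf_xxx"


def build_json_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {HF_TOKEN}",
        },
    )


request = build_json_request(
    {
        "inputs": {
            "text": "This sound track was beautiful!",
            "text_pair": "It paints the scenery in your mind so well.",
        }
    }
)
# To send it: urllib.request.urlopen(request).read()
```

The same helper works for the other JSON payloads in this section, because only the dictionary passed in changes.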

Zero Shot Classification

For additional parameters, see this reference.

{
  "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
  "parameters": {
    "candidate_labels": ["refund", "legal", "faq"]
  }
}

Token Classification

For additional parameters, see this reference.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Question Answering

For additional parameters, see this reference.

{
  "inputs": {
    "question": "What is used for inference?",
    "context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference."
  }
}

Fill Mask

For additional parameters, see this reference.

{
  "inputs": "This sound track was <mask>! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Summarization

For additional parameters, see this reference.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Translation

For additional parameters, see this reference.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Text to Text Generation

For additional parameters, see this reference.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Text Generation

For additional parameters, see this reference.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Feature Extraction

For additional parameters, see this reference.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Sentence Embeddings

If using a TEI container, see this reference for additional parameters.

{
  "inputs": "This sound track was beautiful! It paints the scenery in your mind so well I would recommend it even to people who hate vid. game music!"
}

Sentence Similarity

{
  "inputs": {
    "sentences": ["This sound track was beautiful!", "It paints the scenery in your mind so well"],
    "source_sentence": "What a wonderful day to listen to music"
  }
}

Ranking

{
  "inputs": ["This sound track was beautiful!", "It paints the scenery in your mind so well"]
}

Image Classification

Image Classification can receive JSON payloads or binary image data directly.

JSON

{
  "inputs": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgI"
}

Binary

curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: image/jpg' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@test.jpg'
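
For the JSON variant, the `/9j/...` prefix in the sample suggests that `inputs` is the base64 encoding of the image file's bytes. The sketch below works under that assumption. `image_bytes_to_payload` is a hypothetical helper name, and the first bytes of a JPEG header stand in for a real file.

```python
import base64
import json


def image_bytes_to_payload(image_bytes: bytes) -> str:
    """Return a JSON body whose "inputs" field is the base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"inputs": encoded})


# In practice you would read the bytes from a file: open("test.jpg", "rb").read()
# Here the first bytes of a JPEG header stand in for a real image.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])
body = image_bytes_to_payload(jpeg_header)
print(body)
```

The JSON form is convenient when the image travels alongside other fields. The binary form avoids the size overhead of base64.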

Automatic Speech Recognition

Automatic Speech Recognition can receive JSON payloads or binary audio data directly. For additional parameters, see this reference.

JSON

{
  "inputs": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgI"
}

Binary

curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: audio/x-flac' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@sample.flac'
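
The binary curl call above can be sketched in Python as follows. This assumes only what the curl command shows: a POST whose body is the raw file bytes, a matching `Content-Type`, and a bearer token. `ENDPOINT_URL`, `HF_TOKEN`, and `build_binary_request` are hypothetical names, and the request is built but not sent.

```python
import urllib.request

# Hypothetical placeholders: substitute your own endpoint URL and access token.
ENDPOINT_URL = "https://example.invalid/"
HF_TOKEN = "hf_xxx"


def build_binary_request(data: bytes, content_type: str) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is raw file bytes."""
    return urllib.request.Request(
        ENDPOINT_URL,
        data=data,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": f"Bearer {HF_TOKEN}",
        },
    )


# In practice: audio_bytes = open("sample.flac", "rb").read()
audio_bytes = b"fLaC"  # stand-in bytes so the sketch runs without a file
request = build_binary_request(audio_bytes, "audio/x-flac")
# To send it: urllib.request.urlopen(request).read()
```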

Audio Classification

Audio Classification can receive JSON payloads or binary audio data directly. For additional parameters, see this reference.

JSON

{
  "inputs": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgI"
}

Binary

curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: audio/x-flac' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@sample.flac'

Object Detection

Object Detection can receive JSON payloads or binary image data directly. For additional parameters, see this reference.

JSON

{
  "inputs": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgI"
}

Binary

curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: image/jpg' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@test.jpg'

Image Segmentation

Image Segmentation can receive JSON payloads or binary image data directly. For additional parameters, see this reference.

JSON

{
  "inputs": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgI"
}

Binary

curl --request POST \
  --url https://{ENDPOINT}/ \
  --header 'Content-Type: image/jpg' \
  --header 'Authorization: Bearer {HF_TOKEN}' \
  --data-binary '@test.jpg'

Table Question Answering

For additional parameters, see this reference.

{
  "inputs": {
    "query": "How many stars does the transformers repository have?",
    "table": {
      "Repository": ["Transformers", "Datasets", "Tokenizers"],
      "Stars": ["36542", "4512", "3934"],
      "Contributors": ["651", "77", "34"],
      "Programming language": ["Python", "Python", "Rust, Python and NodeJS"]
    }
  }
}
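
The `table` field above is column-oriented, and every cell is a string. If your data starts out as a list of row records, a small conversion like the one sketched below produces that shape. `rows_to_table` is a hypothetical helper. It assumes every row has the same keys, and converting values to strings simply mirrors the sample payload.

```python
from typing import Any, Dict, List


def rows_to_table(rows: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Convert row records into a column-oriented table of strings."""
    table: Dict[str, List[str]] = {}
    for row in rows:
        for column, value in row.items():
            # Cells are stringified to match the sample payload.
            table.setdefault(column, []).append(str(value))
    return table


rows = [
    {"Repository": "Transformers", "Stars": 36542},
    {"Repository": "Datasets", "Stars": 4512},
    {"Repository": "Tokenizers", "Stars": 3934},
]
payload = {
    "inputs": {
        "query": "How many stars does the transformers repository have?",
        "table": rows_to_table(rows),
    }
}
print(payload)
```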

Conversational

For additional parameters, see this reference.

{
  "inputs": [
    {
      "role": "user",
      "content": "Which movie is the best ?"
    },
    {
      "role": "assistant",
      "content": "It's Die Hard for sure."
    },
    {
      "role": "user",
      "content": "Can you explain why?"
    }
  ]
}
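
The sample payload carries the earlier turns along with the newest user message, which suggests the client resends the conversation history on each request. Under that assumption, a client could accumulate messages as sketched below. `add_turn` is a hypothetical helper name.

```python
import json
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended."""
    return history + [{"role": role, "content": content}]


history: List[Message] = []
history = add_turn(history, "user", "Which movie is the best?")
history = add_turn(history, "assistant", "It's Die Hard for sure.")
history = add_turn(history, "user", "Can you explain why?")

# The request body wraps the whole history in "inputs".
body = json.dumps({"inputs": history})
print(body)
```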

Text To Image

{
  "inputs": "realistic render portrait realistic render portrait of group of flying blue whales towards the moon, intricate, toy, sci - fi, extremely detailed, digital painting, sculpted in zbrush, artstation, concept art, smooth, sharp focus, illustration, chiaroscuro lighting, golden ratio, incredible art by artgerm and greg rutkowski and alphonse mucha and simon stalenhag"
}

For text-to-image models, note that currently your model repo needs to be a diffusers model with the full weights in it (i.e., not just a LoRA).

Additional parameters

You can add additional parameters that are supported by the pipelines API from Transformers.

For example, if you have a text-generation pipeline, you can provide generation keyword arguments such as repetition_penalty or max_length in the parameters object:

{
  "inputs": "Hugging Face, the winner of VentureBeat's Innovation in Natural Language Process/Understanding Award for 2021, is looking to level the playing field. The team, launched by Clément Delangue and Julien Chaumond in 2016, was recognized for its work in democratizing NLP, the global market value for which is expected to hit $35.1 billion by 2026. This week, Google's former head of Ethical AI Margaret Mitchell joined the team.",
  "parameters": {
    "repetition_penalty": 4.0,
    "max_length": 128
  }
}
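
When building such payloads in client code, it can help to keep `inputs` and `parameters` separate until the request is serialized. The sketch below shows one way to do that. `with_parameters` is a hypothetical helper, and which parameter names a given pipeline accepts depends on the task.

```python
import json
from typing import Any, Dict


def with_parameters(inputs: Any, **parameters: Any) -> Dict[str, Any]:
    """Build a payload, adding "parameters" only when some are given."""
    payload: Dict[str, Any] = {"inputs": inputs}
    if parameters:
        payload["parameters"] = parameters
    return payload


payload = with_parameters(
    "Hugging Face is looking to level the playing field.",
    repetition_penalty=4.0,
    max_length=128,
)
print(json.dumps(payload, indent=2))
```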