Llama from Meta is an example of which type of transformer?

Study for the Hugging Face Agent Certification. Prepare with interactive quizzes and multiple-choice questions, complete with explanations and hints. Ace your exam!

Multiple Choice

Llama from Meta is an example of which type of transformer?

Explanation:
Decoder-only transformer design is what Llama uses. In this setup, the model stacks transformer blocks that use masked self-attention, so each token can only attend to previous tokens when predicting the next one. There is no separate encoder that processes an input sequence to produce representations for a decoder. That would be encoder-based transformers, like BERT, which are designed for understanding tasks rather than autoregressive generation. Seq2Seq models pair an encoder with a decoder to map inputs to outputs, which isn’t how Llama is structured. Because it’s a single stack of decoder blocks optimized for autoregressive text generation, it’s best described as a decoder-based transformer.

Decoder-only transformer design is what Llama uses. In this setup, the model stacks transformer blocks that use masked self-attention, so each token can only attend to previous tokens when predicting the next one. There is no separate encoder that processes an input sequence to produce representations for a decoder. That would be encoder-based transformers, like BERT, which are designed for understanding tasks rather than autoregressive generation. Seq2Seq models pair an encoder with a decoder to map inputs to outputs, which isn’t how Llama is structured. Because it’s a single stack of decoder blocks optimized for autoregressive text generation, it’s best described as a decoder-based transformer.

Subscribe

Get the latest from Passetra

You can unsubscribe at any time. Read our privacy policy