Which describes how LoRA modifies transformer layers during training?

Explanation:
LoRA (Low-Rank Adaptation) fine-tunes transformer layers by adding small, trainable low-rank matrices alongside selected weight matrices, typically the attention projections, while keeping the original pre-trained weights frozen. The idea is to learn a small set of additional parameters that adjust how the layer behaves for a new task, rather than updating the entire large model.
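As a rough illustration of the idea above, here is a minimal pure-Python sketch of one adapted linear layer. It assumes the common LoRA formulation, where the frozen weight W is combined with a low-rank update (alpha / r) * B @ A, with B initialized to zero and A to small random values. The class name `LoRALinear`, the helper `matvec`, and the toy 3x4 weight are hypothetical and chosen only for this example.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, r, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # pre-trained weight: never updated during fine-tuning
        # A has shape r x d_in (trainable, small random init)
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        # B has shape d_out x r (trainable, zero init so the update starts at 0)
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        update = matvec(self.B, matvec(self.A, x))  # low-rank adapter path
        return [b + self.scale * u for b, u in zip(base, update)]

# Toy 3x4 "pre-trained" weight (illustrative values only)
W = [[1.0, 2.0, 0.0, -1.0],
     [0.5, 0.0, 1.0, 3.0],
     [2.0, -1.0, 0.0, 1.0]]
layer = LoRALinear(W, r=2, alpha=2.0)
x = [1.0, 0.0, -1.0, 2.0]

base_out = matvec(W, x)
lora_out = layer.forward(x)
print(base_out == lora_out)  # True: with B at zero, the adapter changes nothing yet
```

Because B starts at zero, the adapted layer initially reproduces the pre-trained layer exactly; only as A and B are trained does its output drift from the base output.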

During training, only these adapter parameters receive gradient updates; the base weights stay unchanged, so the knowledge learned during pre-training remains intact. Because each adapter is a pair of low-rank matrices, it adds far fewer trainable parameters than full fine-tuning would update. This makes fine-tuning more parameter-efficient: gradients and optimizer state are kept only for the adapters, which reduces training memory, although the forward pass still runs through the full model.
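To make the parameter savings concrete, the following sketch counts trainable parameters for a single weight matrix. The dimensions (a 4096 x 4096 projection) and the rank r = 8 are assumed values picked for illustration, not figures from the question itself.

```python
def full_params(d_in, d_out):
    """Trainable parameters when the whole d_out x d_in weight is updated."""
    return d_in * d_out

def lora_params(d_in, d_out, r):
    """Trainable parameters for the pair A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)

d_in = d_out = 4096  # assumed layer width
r = 8                # assumed LoRA rank

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, r)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
# full: 16,777,216  lora: 65,536  ratio: 0.3906%
```

Under these assumptions the adapter trains roughly 0.4% as many parameters as updating the full matrix would.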

This approach isn’t about removing parts of the model or shrinking its size. It’s about augmenting the existing layers with small, trainable modules that capture task-specific adaptations while preserving the backbone’s original capabilities. You can then reuse the same backbone across multiple tasks by swapping in different adapters.
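The adapter swapping described above can be sketched as follows. This is a toy example under stated assumptions: the task names `summarize` and `translate`, the dictionary layout, and the helpers `matmul` and `merged_weight` are all hypothetical, and the 2x2 weight stands in for a real backbone matrix.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def merged_weight(W, adapter):
    """Return W + scale * B @ A as a new matrix, leaving W untouched."""
    delta = matmul(adapter["B"], adapter["A"])
    s = adapter["scale"]
    return [[w + s * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# One shared, frozen backbone weight (toy 2x2 identity)
W = [[1.0, 0.0],
     [0.0, 1.0]]

# One small rank-1 adapter per task (hypothetical tasks and values)
adapters = {
    "summarize": {"B": [[1.0], [0.0]], "A": [[0.0, 2.0]], "scale": 0.5},
    "translate": {"B": [[0.0], [1.0]], "A": [[4.0, 0.0]], "scale": 0.5},
}

W_sum = merged_weight(W, adapters["summarize"])
W_tr = merged_weight(W, adapters["translate"])
print(W_sum)  # [[1.0, 1.0], [0.0, 1.0]]
print(W_tr)   # [[1.0, 0.0], [2.0, 1.0]]
print(W)      # [[1.0, 0.0], [0.0, 1.0]]  (backbone unchanged)
```

Each task stores only its own small A and B; the backbone weight is shared and never modified, so switching tasks means choosing a different adapter rather than loading a different model.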
