Low-Rank Adaptation

Low-Rank Adaptation (or LoRA) is an approach to model fine-tuning that reduces the number of parameters that are updated: the pretrained weights are kept frozen, and only a small set of newly added parameters is trained. It is based on the empirical observation that the update to any given weight matrix (for example, one representing a fully-connected layer) has a low 'intrinsic rank' (the rank of a matrix being the maximum number of linearly independent rows or columns). The weight update ∆W is therefore represented as the product BA: if W is a d×k matrix, then B is d×r and A is r×k, so the number of columns of B and the number of rows of A are both set to a target low rank r, with r much smaller than d and k. A code sketch of this idea follows below.
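The following is a minimal PyTorch sketch of how a low-rank update can be attached to a single linear layer. The class name LoRALinear, the layer sizes, and the default rank r=8 are illustrative assumptions, not part of the original formulation; the zero initialization of B and the alpha/r scaling follow the conventions described in the referenced paper.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Linear layer with a frozen weight W plus a trainable low-rank update BA."""

        def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
            super().__init__()
            # Pretrained weight W (out_features x in_features); frozen during fine-tuning.
            self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
            # Low-rank factors: delta_W = B @ A has rank at most r.
            # A is (r x in_features) and B is (out_features x r), so r is the number of
            # rows of A and the number of columns of B.
            self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: update starts at 0
            self.scaling = alpha / r

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # y = x W^T + scaling * x (BA)^T, computed without forming BA explicitly.
            base = x @ self.weight.T
            update = (x @ self.A.T) @ self.B.T
            return base + self.scaling * update

    # Only A and B receive gradients, so the trainable parameters per layer drop from
    # out_features * in_features to r * (in_features + out_features).
    layer = LoRALinear(in_features=768, out_features=768, r=8)
    x = torch.randn(4, 768)
    y = layer(x)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)

After fine-tuning, the product BA can be added back into the frozen weight, so inference incurs no extra cost compared with the original layer.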
Related concepts:
Model Fine-Tuning
External reference:
LoRA: Low-Rank Adaptation of Large Language Models (Hu et al., 2021): https://arxiv.org/abs/2106.09685