speechbrain.nnet.adapters module

The SpeechBrain implementation of various pre-trained model adapters, e.g. LoRA and Houlsby.

Authors

Titouan Parcollet 2024
Peter Plantinga 2024

Summary

Classes

`AdaptedModel`	Given any torch model (e.g. asr_brain.modules.Transformer) and an adapter class (e.g. HoulsbyAdapter), this class replaces the target layers with the new adapter class while preserving their parameters.
`HoulsbyAdapterLinear`	Implements the Houlsby adapter described in "Parameter-Efficient Transfer Learning for NLP" https://arxiv.org/abs/1902.00751.
`LoRA`	Implements the LoRA adapter described in "LoRA: Low-Rank Adaptation of Large Language Models" https://arxiv.org/abs/2106.09685.

Functions

`is_layer_adaptable`	Checks whether a layer is among the layers to be adapted.
`replace_module`	Replaces a layer with a new module based on a parent assignment.

Reference

class speechbrain.nnet.adapters.AdaptedModel(model_to_adapt: Module, adapter_class: Module, all_linear: bool = False, all_conv: bool = False, target_layers: list = [], unfrozen_layers: list = [], adapter_kwargs: dict = {}, manual_adapter_insertion: bool = False)[source]

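Bases: Module

Given any torch model (e.g. asr_brain.modules.Transformer) and an adapter class (e.g. HoulsbyAdapter), this class replaces the target layers with the new adapter class while preserving their parameters.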

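Parameters:

model_to_adapt (nn.Module) – The base PyTorch model to add adapters to.
adapter_class (class) – An (uninitialized) adapter from this SpeechBrain library.
all_linear (bool) – Whether to add the adapter to all linear layers (default: False).
all_conv (bool) – Whether to add the adapter to all convolutional layers (default: False).
target_layers (list of str) – Names of the modules in the given model that should be replaced. Supports Unix shell-style wildcards (*, ?, [seq], [!seq]) via fnmatch.
unfrozen_layers (list of str) – Names of the layers to unfreeze during training. Supports Unix shell-style wildcards (*, ?, [seq], [!seq]) via fnmatch.
adapter_kwargs (dict) – Keyword arguments that should be passed to the adapter.
manual_adapter_insertion (bool) – The default value (False) inserts the adapters at initialization. In some cases, however, it is better to wait until the parameters have been loaded before inserting the adapters, for example when pretrained parameters need to be loaded. In that case, set this to True and call insert_adapters manually once the parameters are loaded.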
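
The wildcard behaviour of target_layers and unfrozen_layers can be explored with Python's standard fnmatch module on its own. The sketch below is not SpeechBrain code: the module names and the `select` helper are hypothetical, chosen only to show which names each pattern would pick up.

```python
from fnmatch import fnmatch

# Hypothetical module names, such as model.named_modules() might yield.
module_names = [
    "layer1",
    "layer2",
    "layer3",
    "encoder.attn.q_proj",
    "encoder.attn.k_proj",
]


def select(names, patterns):
    """Return the names matched by at least one shell-style pattern."""
    return [n for n in names if any(fnmatch(n, p) for p in patterns)]


print(select(module_names, ["layer[13]"]))      # ['layer1', 'layer3']
print(select(module_names, ["layer[!2]"]))      # ['layer1', 'layer3']
print(select(module_names, ["*.attn.?_proj"]))  # ['encoder.attn.q_proj', 'encoder.attn.k_proj']
```

Note that fnmatch does not treat "." specially, so "*" can span several levels of a dotted module name.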

Example

>>> import torch
>>> from collections import OrderedDict
>>> from speechbrain.nnet.adapters import AdaptedModel, LoRA
>>> model = torch.nn.Sequential(
...   OrderedDict([
...     ("layer1", torch.nn.Linear(10, 20)),
...     ("layer2", torch.nn.Linear(20, 20)),
...     ("layer3", torch.nn.Linear(20, 10)),
...   ])
... )
>>> lora_model = AdaptedModel(
...   model_to_adapt=model,
...   adapter_class=LoRA,
...   target_layers=["layer[13]"],
...   unfrozen_layers=["layer2"],
...   adapter_kwargs={"rank": 2},
... )
>>> lora_model
AdaptedModel(
  (adapted_model): Sequential(
    (layer1): LoRA(
      (pretrained_module): Linear(in_features=10, out_features=20, bias=True)
      (adapter_down_proj): Linear(in_features=10, out_features=2, bias=False)
      (adapter_up_proj): Linear(in_features=2, out_features=20, bias=False)
    )
    (layer2): Linear(in_features=20, out_features=20, bias=True)
    (layer3): LoRA(
      (pretrained_module): Linear(in_features=20, out_features=10, bias=True)
      (adapter_down_proj): Linear(in_features=20, out_features=2, bias=False)
      (adapter_up_proj): Linear(in_features=2, out_features=10, bias=False)
    )
  )
)

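insert_adapters()[source]: If this is done in __init__, it conflicts with Pretrainer. Make sure this function is called exactly once before training. See __init__.manual_adapter_insertion.

forward(*args, **kwargs)[source]: Passes the arguments to the adapted model.

saver(path)[source]: Saves only the trainable parameters.

loader(path, end_of_epoch)[source]: Loads the base model plus the trained parameters.

parameter_transfer(path)[source]: Avoids the warnings caused by loading only the trained parameters.

__getattr__(item)[source]: Overrides getattr to pass item accesses to the pre-adapted model.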
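
The attribute forwarding described for __getattr__ is a general Python pattern. The sketch below shows it with plain classes; `Inner` and `Wrapper` are hypothetical stand-ins and not the SpeechBrain implementation, which additionally has to cooperate with torch.nn.Module's own attribute handling.

```python
class Inner:
    """Hypothetical stand-in for the model being adapted."""

    def __init__(self):
        self.hidden_size = 256

    def encode(self, x):
        return [2 * v for v in x]


class Wrapper:
    """Forwards unknown attribute lookups to the wrapped object."""

    def __init__(self, inner):
        self.inner = inner

    def __getattr__(self, item):
        # __getattr__ only runs after normal lookup fails, so attributes
        # defined on the wrapper itself still take precedence.
        if item == "inner":
            # Guard against infinite recursion if `inner` is not set yet.
            raise AttributeError(item)
        return getattr(self.inner, item)


wrapped = Wrapper(Inner())
print(wrapped.hidden_size)        # 256
print(wrapped.encode([1, 2, 3]))  # [2, 4, 6]
```

With this pattern, code written against the original object keeps working when it is handed the wrapper instead.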

speechbrain.nnet.adapters.is_layer_adaptable(name, module, all_linear, all_conv, target_layers)[source]

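Checks whether a layer is among the layers to be adapted.

Parameters:

name (str) – The name of the module to check.
module (torch.nn.Module) – The module to check.
all_linear (bool) – Whether all linear layers should be adapted.
all_conv (bool) – Whether all convolutional layers should be adapted.
target_layers (str or list of str) – See add_adapters_to_model.

Returns:

Whether the layer should be adapted.

Return type:

bool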

speechbrain.nnet.adapters.replace_module(model: Module, name: str, new_module: Module)[source]

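Replaces a layer with a new module based on a parent assignment. This is used to replace layers with an adapter layer that wraps the original layer, so the old parameters are preserved and new parameters are added.

Parameters:

model (nn.Module) – The model containing the module to be replaced.
name (str) – The name of the target module to replace.
new_module (nn.Module) – The new module, made of the old parameters plus the new ones.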
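
"Parent assignment" means the replacement is done by looking up the parent of the target and setting the child attribute on it. The following is a plain-Python analogue of that idea rather than the library's code: `Node`, `Wrapping` and `replace_child` are hypothetical names, and ordinary objects stand in for torch modules.

```python
class Node:
    """Minimal stand-in for a module holding named children as attributes."""

    def __init__(self, label, **children):
        self.label = label
        for name, child in children.items():
            setattr(self, name, child)


class Wrapping(Node):
    """Stand-in adapter that keeps a reference to the original child."""

    def __init__(self, original):
        super().__init__("adapter", original=original)


def replace_child(root, dotted_name, new_child):
    """Replace the object at `dotted_name` by assigning on its parent."""
    parent_path, _, target = dotted_name.rpartition(".")
    parent = root
    # Walk down to the parent; an empty parent_path means `root` is the parent.
    for part in filter(None, parent_path.split(".")):
        parent = getattr(parent, part)
    setattr(parent, target, new_child)


root = Node("root", encoder=Node("encoder", proj=Node("proj")))
old = root.encoder.proj
replace_child(root, "encoder.proj", Wrapping(old))

print(root.encoder.proj.label)            # adapter
print(root.encoder.proj.original is old)  # True
```

Because the wrapper holds the original object, nothing from the old layer is lost. For a real torch model, the parent could be resolved with torch.nn.Module.get_submodule instead of the manual walk.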

class speechbrain.nnet.adapters.HoulsbyAdapterLinear(target_linear, projection_size, activation=<class 'speechbrain.nnet.activations.Swish'>, bias=True)[source]

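Bases: Module

This class implements the Houlsby adapter described in "Parameter-Efficient Transfer Learning for NLP" https://arxiv.org/abs/1902.00751.

Parameters:

target_linear (nn.Module) – The pretrained Linear module that will be wrapped by this adapter.
projection_size (int) – Size of the projection layer (usually small).
activation (nn.Module) – The activation function. Default is Swish.
bias (bool) – Whether to use biases in the linear projections.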
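
The bottleneck structure from the paper (down-projection, non-linearity, up-projection, plus a skip connection) can be sketched on a single feature vector in pure Python. This is an illustration only: it assumes the adapter acts on the output h of the wrapped linear layer and uses Swish with beta = 1, and all helper names and toy weights are hypothetical.

```python
import math


def linear(weight, bias, x):
    """y = W x + b for one vector x; weight is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def swish(values):
    """Swish with beta = 1: t * sigmoid(t)."""
    return [t / (1.0 + math.exp(-t)) for t in values]


def bottleneck_adapter(h, w_down, b_down, w_up, b_up):
    """h + up(swish(down(h))): a residual bottleneck."""
    z = swish(linear(w_down, b_down, h))  # project d -> m
    delta = linear(w_up, b_up, z)         # project m -> d
    return [a + b for a, b in zip(h, delta)]


d, m = 4, 2  # feature size and projection_size (toy values)
h = [0.5, -1.0, 2.0, 0.0]
w_down = [[0.1] * d for _ in range(m)]
b_down = [0.0] * m

# With a zero up-projection the adapter is an exact identity on h.
zero_up = bottleneck_adapter(h, w_down, b_down, [[0.0] * m for _ in range(d)], [0.0] * d)
print(zero_up == h)  # True

# A non-zero up-projection adds a correction on top of h.
out = bottleneck_adapter(h, w_down, b_down, [[0.2] * m for _ in range(d)], [0.0] * d)
print([round(v, 4) for v in out])
```

The skip connection is what lets the adapted layer start out close to the pretrained one. Per adapted layer, the two projections add 2·d·m + d + m parameters when biases are used.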

Example

>>> import torch
>>> from torch import nn
>>> from speechbrain.nnet.adapters import HoulsbyAdapterLinear
>>> x = torch.rand((8, 60, 64))
>>> base_linear = nn.Linear(64, 64)
>>> adapt = HoulsbyAdapterLinear(base_linear, 8)
>>> output = adapt(x)
>>> output.shape
torch.Size([8, 60, 64])

forward(x: Tensor)[source]

Applies the HoulsbyAdapter to an input tensor x.

Parameters: x (torch.Tensor) – Input tensor to the adapter module. Shape: [B, Time, X]
Returns: The linear outputs

class speechbrain.nnet.adapters.LoRA(target_module, rank=16, alpha=1.0)[source]

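Bases: Module

This class implements the LoRA adapter described in "LoRA: Low-Rank Adaptation of Large Language Models" https://arxiv.org/abs/2106.09685.

Parameters:

target_module (nn.Module) – The pretrained layer module that will be wrapped by this adapter. Works with nn.Linear and nn.Conv.
rank (int) – Size or rank of the projection layer (usually small).
alpha (float) – Value used to control the scaling in LoRA. Default is 1.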
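
To get a feel for how rank and alpha interact, the arithmetic can be worked through without torch. The sketch below assumes the paper's convention of scaling the low-rank update by alpha / rank and bias-free projections, as in the printed AdaptedModel example above; `lora_stats` is a hypothetical helper, not part of SpeechBrain.

```python
def lora_stats(in_features, out_features, rank, alpha=1.0):
    """Scaling factor and parameter counts for one adapted linear layer."""
    # Pretrained weight matrix, kept frozen (bias excluded).
    frozen = in_features * out_features
    # Down-projection (in -> rank) plus up-projection (rank -> out), no bias.
    added = rank * (in_features + out_features)
    return {"scaling": alpha / rank, "frozen": frozen, "added": added}


# Shapes from the AdaptedModel example: Linear(10, 20) with rank 2.
print(lora_stats(10, 20, rank=2))
# {'scaling': 0.5, 'frozen': 200, 'added': 60}

# A 64 x 64 layer with the default rank of 16.
print(lora_stats(64, 64, rank=16))
# {'scaling': 0.0625, 'frozen': 4096, 'added': 2048}
```

The added parameter count grows linearly with rank, so once rank approaches the layer width the adapter is no longer smaller than the weight matrix it wraps.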

Example

>>> import torch
>>> from torch import nn
>>> from speechbrain.nnet.adapters import LoRA
>>> x = torch.rand((8, 60, 64))
>>> base_linear = nn.Linear(64, 64)
>>> adapt = LoRA(base_linear, 64, 4)
>>> output = adapt(x)
>>> output.shape
torch.Size([8, 60, 64])

forward(x: Tensor)[source]

Applies the LoRA adapter.

Parameters: x (torch.Tensor) – Input tensor to the adapter module.
Returns: The linear outputs