speechbrain.inference.text 模块

指定用于文本处理模块的推理接口。

作者

Aku Rouhe 2021
Peter Plantinga 2021
Loren Lugosch 2020
Mirco Ravanelli 2020
Titouan Parcollet 2021
Abdel Heba 2021
Andreas Nautsch 2022, 2023
Pooneh Mousavi 2023
Sylvain de Langen 2023
Adel Moumen 2023
Pradnya Kandarkar 2023

摘要

类

`GPTResponseGenerator`	一个即开即用的响应生成模型
`GraphemeToPhoneme`	一个预训练的模型实现，用于接受原始自然语言文本作为输入并执行 Grapheme-to-Phoneme (G2P) 转换
`Llama2ResponseGenerator`	一个即开即用的响应生成模型
`ResponseGenerator`	一个即开即用的响应生成模型

参考

类 speechbrain.inference.text.GraphemeToPhoneme(*args, **kwargs)[source]

基类： Pretrained, EncodeDecodePipelineMixin

一个预训练的模型实现，用于接受原始自然语言文本作为输入并执行 Grapheme-to-Phoneme (G2P) 转换

参数：

*args (元组)
**kwargs (dict) – 参数被转发到 Pretrained 父类。

示例

>>> text = ("English is tough. It can be understood "
...         "through thorough thought though")
>>> from speechbrain.inference.text import GraphemeToPhoneme
>>> tmpdir = getfixture('tmpdir')
>>> g2p = GraphemeToPhoneme.from_hparams('path/to/model', savedir=tmpdir)
>>> phonemes = g2p.g2p(text)

INPUT_STATIC_KEYS = ['txt']

OUTPUT_KEYS = ['phonemes']

property phonemes: 返回可用的音素

property language: 返回此模型可用的语言

g2p(text)[source]

执行字素到音素的转换

参数：: text (str 或 list[str]) – 要编码为音素的单个字符串 - 或字符串序列
返回值:: result – 如果提供了单个示例，则返回值为一个音素列表
返回类型:: list

load_dependencies()[source]: 加载任何相关的模型依赖项

__call__(text)[source]

一个便捷的可调用包装器 - 与 G2P 相同

参数：: text (str 或 list[str]) – 要编码为音素的单个字符串 - 或字符串序列
返回值:: result – 如果提供了单个示例，则返回值为一个音素列表
返回类型:: list

forward(noisy, lengths=None)[source]: 对噪声输入运行增强

class speechbrain.inference.text.ResponseGenerator(*args, **kwargs)[source]

基类: Pretrained

一个即开即用的响应生成模型

此类可用于根据用户输入生成和继续对话。给定的 YAML 必须包含 *_NEEDED[] 列表中指定的字段。它需要与 custom.py 一起使用，以加载带有添加的标记（如 bos、eos 和说话者标记）的扩展模型。

参数：

*args (元组)
**kwargs (dict) – 参数被转发到 Pretrained 父类。

MODULES_NEEDED = ['model']

generate_response(turn)[source]

根据用户输入完成对话。 :param turn: 用户输入，即对话的最后一轮。 :type turn: str

返回值:: 基于对话历史为用户输入生成的响应。
返回类型:: response

prepare_input()[source]: 用户应根据自己的任务修改此函数。

generate()[source]: 用户应根据自己的任务修改此函数。

class speechbrain.inference.text.GPTResponseGenerator(*args, **kwargs)[source]

基类: ResponseGenerator

一个即开即用的响应生成模型

此类可用于根据用户输入生成和继续对话。给定的 YAML 必须包含 *_NEEDED[] 列表中指定的字段。它需要与 custom.py 一起使用，以加载带有添加的标记（如 bos、eos 和说话者标记）的扩展 GPT 模型。

参数：

*args (元组)
**kwargs (dict) – 参数被转发到 Pretrained 父类。

示例

>>> from speechbrain.inference.text import GPTResponseGenerator

>>> tmpdir = getfixture("tmpdir")
>>> res_gen_model = GPTResponseGenerator.from_hparams(source="speechbrain/MultiWOZ-GPT-Response_Generation",
... pymodule_file="custom.py")
>>> response = res_gen_model.generate_response("I want to book a table for dinner")

generate(inputs)[source]

根据用户输入完成对话。

参数：: inputs (tuple) – history_bos，它是经过分词的历史+输入值，并在每一轮之前附加了适当的说话者标记；以及 history_token_type，它根据说出该标记的人（用户或系统）确定每个标记的类型。
返回值:: 基于对话历史为用户输入生成的假设。
返回类型:: response

prepare_input()[source]

将用户输入和先前的历史转换为 GPT 模型可接受的格式。: 它附加所有先前的历史和输入，并根据 max_history 值对其进行截断。然后对输入进行分词，并生成额外的输入，用于确定每个标记的类型（系统或用户）。

返回值:

history_bos (torch.Tensor) – 经过分词的历史+输入值，并在每一轮之前附加了适当的说话者标记。
history_token_type (torch.LongTensor) – 根据说出该标记的人（用户或系统）确定的每个标记的类型。

class speechbrain.inference.text.Llama2ResponseGenerator(*args, **kwargs)[source]

基类: ResponseGenerator

一个即开即用的响应生成模型

此类可用于根据用户输入生成和继续对话。给定的 YAML 必须包含 *_NEEDED[] 列表中指定的字段。它需要与 custom.py 一起使用，以加载带有添加的标记（如 bos、eos 和说话者标记）的扩展 Llama2 模型。

参数：

*args (元组)
**kwargs (dict) – 参数被转发到 Pretrained 父类。

示例

>>> from speechbrain.inference.text import Llama2ResponseGenerator

>>> tmpdir = getfixture("tmpdir")
>>> res_gen_model = Llama2ResponseGenerator.from_hparams(source="speechbrain/MultiWOZ-Llama2-Response_Generation",
... pymodule_file="custom.py")
>>> response = res_gen_model.generate_response("I want to book a table for dinner")

generate(inputs)[source]

根据用户输入完成对话。 :param inputs: 要传递给 llama2 模型进行生成的提示输入。 :type inputs: prompt_bos

返回值:: 基于对话历史为用户输入生成的假设。
返回类型:: response

prepare_input()[source]

将用户输入和先前的历史转换为 Llama2 模型可接受的格式。: 它附加所有先前的历史和输入，并根据 max_history 值对其进行截断。然后对输入进行分词并添加提示。

返回值:: prompt_bos – 经过分词的历史+输入值，并带有适当的提示。
返回类型:: torch.Tensor