What testing coverage approaches are needed?

  1. Dependencies: version control (checking commit ID dates)
    see: requirements.txt
    run: find . -name "*.txt" | grep extra

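  2. Docstring tests: documented function signatures
    (for functions intended to be called externally)

  3. Unit tests for each function-critical code block

  4. Integration tests for vanilla experiments, covering use cases on a generic-task basis

  5. Regression tests: stable interfaces and their refactoring

  6. Linters for automatically checking and correcting Python and YAML code style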
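
As a minimal sketch of the docstring tests in item 2, the hypothetical function below (`clip_samples`, not part of SpeechBrain) carries an example in its docstring that doubles as a test. The function name, its parameters, and the sample values are assumptions made for illustration.

```python
def clip_samples(samples, max_abs=1.0):
    """Clip each sample to the range [-max_abs, max_abs].

    Arguments
    ---------
    samples : list of float
        Input values (hypothetical audio samples).
    max_abs : float
        Largest absolute value allowed in the output.

    Returns
    -------
    list of float
        The clipped values, in the original order.

    Example
    -------
    >>> clip_samples([0.5, -2.0, 3.0])
    [0.5, -1.0, 1.0]
    """
    # Clamp from above first, then from below.
    return [max(-max_abs, min(max_abs, s)) for s in samples]
```

Saving this in a module and running `python -m doctest` on the file, or `pytest --doctest-modules` on its directory, executes the `>>>` line and compares the result with the line that follows it.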

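How to contribute?

  1. Ask your questions and join the conversation in Discussions

  2. Report bugs or request features in Issues

  3. Contribute pull requests

  4. Release pretrained models through SpeechBrain
    e.g., link your HuggingFace account to SpeechBrain to host your model card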

GitHub workflows: a configuration-based strategy

The workflow configurations are located in .github/workflows
(every configuration creates a one-off ubuntu-latest environment)


Info: although our PyTorch requirements are

torch>=1.9.0
torchaudio>=0.9.0

our tests cover only one PyTorch version, the latest.
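
Because only the latest PyTorch is tested, it can help to verify locally that an installed version at least satisfies the stated minimum. The following is a minimal sketch under stated assumptions: `parse_release` and `meets_minimum` are hypothetical helpers, not SpeechBrain functions, and pre-release tags such as `rc1` are simply ignored. Versions are compared as integer tuples because plain string comparison would rank "1.13.0" below "1.9.0".

```python
import re


def parse_release(version):
    """Return the leading numeric part of a version string as a tuple of ints.

    Suffixes such as "+cu117" or "rc1" are ignored (a simplification).
    """
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed, minimum):
    """Check whether `installed` satisfies a `>=minimum` requirement."""
    have, need = parse_release(installed), parse_release(minimum)
    # Pad the shorter tuple with zeros so that "1.9" equals "1.9.0".
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# Hypothetical installed versions, for illustration only.
print(meets_minimum("1.13.0+cu117", "1.9.0"))  # True
print(meets_minimum("0.8.1", "0.9.0"))         # False
```

In a real environment the first argument could be taken from `torch.__version__` and `torchaudio.__version__`.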

pre-commit.yml

SpeechBrain pre-commit / pre-commit (pull_request)

  • python-version: '3.8'

  • runs the pre-commit action, configured in .pre-commit-config.yaml

    • hook: https://github.com/pre-commit/pre-commit-hooks
      trailing-whitespace
      end-of-file-fixer
      requirements-txt-fixer
      mixed-line-ending
      check-added-large-files

    • hook: https://github.com/psf/black
      black
      click

    • hook: https://gitlab.com/pycqa/flake8.git
      flake8; see: .flake8

    • hook: https://adrienverge.net/yamllint/
      yamllint; see: .yamllint.yaml

pythonapp.yml

SpeechBrain toolkit CI / Tests (3.7) (pull_request)
SpeechBrain toolkit CI / Tests (3.8) (pull_request)
SpeechBrain toolkit CI / Tests (3.9) (pull_request)

  • python-version: [3.7, 3.8, 3.9]

  • creates a fresh environment

    sudo apt-get install -y libsndfile1
    pip install -r requirements.txt
    pip install --editable .
    pip install ctc-segmentation
    
  • runs the PyTest checks
    see: pytest.ini - files: test_*.py; check_*.py; example_*.py, plus norecursedirs
    see: conftest.py - prepares test item collection and direct discovery

    # excerpts
    parser.addoption("--device", action="store", default="cpu")
    ...
    try:
      import numba  # noqa: F401
    except ModuleNotFoundError:
      collect_ignore.append("speechbrain/nnet/loss/transducer_loss.py")
    ...
    
    • a. hook: consistency tests with pytest
      pytest tests/consistency

    • b. hook: unit tests with pytest
      pytest tests/unittests

    • c. hook: doctests with pytest
      pytest --doctest-modules speechbrain

    • d. hook: integration tests with pytest
      pytest tests/integration
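
The file patterns listed for pytest.ini above can be illustrated with a small sketch. It assumes that each pattern is compared with a file's base name using shell-style globbing; the helper `is_collected` and the example paths are hypothetical and only serve to show which file names the three patterns would select.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns as listed for pytest.ini in this section.
PYTHON_FILES = ("test_*.py", "check_*.py", "example_*.py")


def is_collected(path, patterns=PYTHON_FILES):
    """Return True if the file's base name matches any of the patterns."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


# Hypothetical paths, for illustration only.
for path in (
    "tests/unittests/test_core.py",
    "tools/check_links.py",
    "speechbrain/core.py",
):
    print(path, is_collected(path))
```

A file such as `speechbrain/core.py` does not match any of these patterns, which is why its docstring examples are reached through the separate `--doctest-modules` run in step c.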

verify-docs-gen.yml [I.2.a]

Verify docs generation / docs (pull_request)

  • python-version: '3.8'

  • creates a fresh environment

    pip install -r requirements.txt
    pip install --editable .
    pip install -r docs/docs-requirements.txt
    
  • generates the docs

    cd docs
    make html
    
  • compare: .readthedocs.yaml - python version: 3.8

newtag.yml

Drafts a release when a new tag is pushed

release.yml

Publishes to PyPI

  • python-version: 3.8

  • action: checks out the main branch

  • creates: pypa/build for building a binary wheel and a source tarball

  • action: publishes to PyPI via pypa/gh-action-pypi-publish@master
    which implies the use of

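The versions of the tools and hooks used in these checks are controlled through lint-requirements.txt, which is a nested dependency of requirements.txt. With each major SpeechBrain release, the version of every hook should be updated, while also keeping dependencies consistent across source files, tests, and builds (including running the spell check).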

Note: PyTorch statement on Python versions (as of 2022-11-09)

It is recommended that you use Python 3.6, 3.7, or 3.8

PyTest for reporting code coverage

How can you tell how test coverage changes for open PRs that are waiting to be merged?
(CPU-only snippet)

# Example: install more dependencies to avoid ignoring modules
sudo apt install -y libsndfile1
pip install ctc_segmentation

# install coverage
pip install pytest-cov

# run the test (w/ duration reporting)
pytest --durations=0 --cov=speechbrain --cov-context=test --doctest-modules speechbrain tests --ignore=speechbrain/nnet/loss/transducer_loss.py

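Example: after collecting 459 test items, 4481 of 16782 statements are reported as "missing" (73% coverage).

However, not all Python code in the core modules needs to be covered; so far, only consistency is ensured.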


Further reading
pytest and coverage - https://breadcrumbscollector.tech/how-to-use-code-coverage-in-python-with-pytest/ (pointer by @Adel-Moumen)


pytest --durations=0 --cov=speechbrain --cov-context=test --doctest-modules speechbrain tests --ignore=speechbrain/nnet/loss/transducer_loss.py

---------- coverage: platform linux, python 3.9.12-final-0 -----------
Name                                                      Stmts   Miss  Cover
-----------------------------------------------------------------------------
speechbrain/alignment/aligner.py                            380     61    84%
speechbrain/alignment/ctc_segmentation.py                   189     10    95%
speechbrain/core.py                                         424    155    63% <== < 80%
speechbrain/dataio/batch.py                                  99      8    92%
speechbrain/dataio/dataio.py                                279     50    82%
speechbrain/dataio/dataloader.py                            140     25    82%
speechbrain/dataio/dataset.py                               100      8    92%
speechbrain/dataio/encoder.py                               328     46    86%
speechbrain/dataio/iterators.py                              80     62    22% <== < 80%
speechbrain/dataio/legacy.py                                121     41    66% <== < 80%
speechbrain/dataio/preprocess.py                             22      4    82%
speechbrain/dataio/sampler.py                               224     61    73% <== < 80%
speechbrain/dataio/wer.py                                    63     54    14% <== < 80%
speechbrain/decoders/ctc.py                                 111     89    20% <== < 80%
speechbrain/decoders/seq2seq.py                             370     46    88%
speechbrain/decoders/transducer.py                          133     64    52% <== < 80%
speechbrain/lm/arpa.py                                       77      3    96%
speechbrain/lm/counting.py                                   37      4    89%
speechbrain/lm/ngram.py                                      36      1    97%
speechbrain/lobes/augment.py                                154     55    64% <== < 80%
speechbrain/lobes/beamform_multimic.py                       20     14    30% <== < 80%
speechbrain/lobes/features.py                                96      9    91%
speechbrain/lobes/models/CRDNN.py                            52     12    77% <== < 80%
speechbrain/lobes/models/ContextNet.py                       83      3    96%
speechbrain/lobes/models/ECAPA_TDNN.py                      157      7    96%
speechbrain/lobes/models/HifiGAN.py                         321    146    55% <== < 80%
speechbrain/lobes/models/MetricGAN.py                        74     29    61% <== < 80%
speechbrain/lobes/models/Tacotron2.py                       364     66    82%
speechbrain/lobes/models/conv_tasnet.py                     121      6    95%
speechbrain/lobes/models/dual_path.py                       357     55    85%
speechbrain/lobes/models/fairseq_wav2vec.py                  93     93     0% <== < 80%
speechbrain/lobes/models/g2p/dataio.py                      136    107    21% <== < 80%
speechbrain/lobes/models/g2p/homograph.py                   118     20    83%
speechbrain/lobes/models/g2p/model.py                       132    109    17% <== < 80%
speechbrain/lobes/models/huggingface_wav2vec.py             145     47    68% <== < 80%
speechbrain/lobes/models/resepformer.py                     180     21    88%
speechbrain/lobes/models/segan_model.py                     102     88    14% <== < 80%
speechbrain/lobes/models/transformer/Conformer.py           111      7    94%
speechbrain/lobes/models/transformer/Transformer.py         180     22    88%
speechbrain/lobes/models/transformer/TransformerASR.py       92     28    70% <== < 80%
speechbrain/lobes/models/transformer/TransformerLM.py        47      5    89%
speechbrain/lobes/models/transformer/TransformerSE.py        20      2    90%
speechbrain/lobes/models/transformer/TransformerST.py        81     60    26% <== < 80%
speechbrain/lobes/models/wav2vec.py                         123     55    55% <== < 80%
speechbrain/nnet/CNN.py                                     417     56    87%
speechbrain/nnet/RNN.py                                     471     51    89%
speechbrain/nnet/activations.py                              39      1    97%
speechbrain/nnet/attention.py                               234     44    81%
speechbrain/nnet/complex_networks/c_CNN.py                  130     23    82%
speechbrain/nnet/complex_networks/c_RNN.py                  374     67    82%
speechbrain/nnet/complex_networks/c_normalization.py        277     68    75% <== < 80%
speechbrain/nnet/complex_networks/c_ops.py                  108     40    63% <== < 80%
speechbrain/nnet/containers.py                              139     14    90%
speechbrain/nnet/linear.py                                   27      1    96%
speechbrain/nnet/loss/si_snr_loss.py                         20     16    20% <== < 80%
speechbrain/nnet/loss/stoi_loss.py                           81      1    99%
speechbrain/nnet/loss/transducer_loss.py                    136    136     0% <== < 80%
speechbrain/nnet/losses.py                                  323    112    65% <== < 80%
speechbrain/nnet/normalization.py                           142      6    96%
speechbrain/nnet/pooling.py                                 156     31    80%
speechbrain/nnet/quantisers.py                               47      2    96%
speechbrain/nnet/quaternion_networks/q_CNN.py               150     25    83%
speechbrain/nnet/quaternion_networks/q_RNN.py               370     59    84%
speechbrain/nnet/quaternion_networks/q_linear.py             50     11    78% <== < 80%
speechbrain/nnet/quaternion_networks/q_normalization.py      44      4    91%
speechbrain/nnet/quaternion_networks/q_ops.py               229    122    47% <== < 80%
speechbrain/nnet/schedulers.py                              363    103    72% <== < 80%
speechbrain/nnet/transducer/transducer_joint.py              33      5    85%
speechbrain/pretrained/fetching.py                           48      6    88%
speechbrain/pretrained/interfaces.py                        786    338    57% <== < 80%
speechbrain/pretrained/training.py                           33     28    15% <== < 80%
speechbrain/processing/PLDA_LDA.py                          345     96    72% <== < 80%
speechbrain/processing/decomposition.py                     102      8    92%
speechbrain/processing/diarization.py                       319    157    51% <== < 80%
speechbrain/processing/features.py                          359     75    79% <== < 80%
speechbrain/processing/multi_mic.py                         345      2    99%
speechbrain/processing/signal_processing.py                 166     39    77% <== < 80%
speechbrain/processing/speech_augmentation.py               386     34    91%
speechbrain/tokenizers/SentencePiece.py                     181     74    59% <== < 80%
speechbrain/utils/Accuracy.py                                24     17    29% <== < 80%
speechbrain/utils/DER.py                                     44     33    25% <== < 80%
speechbrain/utils/bleu.py                                    50     43    14% <== < 80%
speechbrain/utils/callchains.py                              28      5    82%
speechbrain/utils/checkpoints.py                            294     52    82%
speechbrain/utils/data_pipeline.py                          181     15    92%
speechbrain/utils/data_utils.py                             197     77    61% <== < 80%
speechbrain/utils/depgraph.py                                82      1    99%
speechbrain/utils/distributed.py                             61     37    39% <== < 80%
speechbrain/utils/edit_distance.py                          180     50    72% <== < 80%
speechbrain/utils/epoch_loop.py                              55     22    60% <== < 80%
speechbrain/utils/hparams.py                                  2      1    50% <== < 80%
speechbrain/utils/hpopt.py                                  134     41    69% <== < 80%
speechbrain/utils/logger.py                                  73     45    38% <== < 80%
speechbrain/utils/metric_stats.py                           285     48    83%
speechbrain/utils/parameter_transfer.py                      87     17    80%
speechbrain/utils/profiling.py                              191     54    72% <== < 80%
speechbrain/utils/superpowers.py                             20      6    70% <== < 80%
speechbrain/utils/text_to_sequence.py                        77     22    71% <== < 80%
speechbrain/utils/torch_audio_backend.py                      9      2    78% <== < 80%
speechbrain/utils/train_logger.py                           150    113    25% <== < 80%
speechbrain/wordemb/transformer.py                           90     67    26% <== < 80%
-----------------------------------------------------------------------------
TOTAL                                                     16782   4481    73%
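
The `<== < 80%` markers in the report above can be reproduced from the raw counts. The sketch below is an illustration only: `below_threshold` is a hypothetical helper, the 80% threshold is taken from the markers, and the percentage is recomputed as (Stmts - Miss) / Stmts rather than read from the Cover column. The sample rows are copied from the report.

```python
import re

# Matches report rows such as "speechbrain/core.py   424   155   63%".
ROW = re.compile(r"^(?P<name>\S+)\s+(?P<stmts>\d+)\s+(?P<miss>\d+)\s+(?P<cover>\d+)%")


def below_threshold(report, threshold=80):
    """Return (name, percent) pairs for modules covered below `threshold` percent."""
    flagged = []
    for line in report.splitlines():
        match = ROW.match(line.strip())
        # Skip headers, separator lines, and the TOTAL row.
        if match is None or match["name"] == "TOTAL":
            continue
        stmts, miss = int(match["stmts"]), int(match["miss"])
        percent = 100 * (stmts - miss) / stmts if stmts else 100.0
        if percent < threshold:
            flagged.append((match["name"], round(percent)))
    return flagged


sample = """\
speechbrain/core.py                                         424    155    63%
speechbrain/dataio/batch.py                                  99      8    92%
speechbrain/processing/features.py                          359     75    79%
TOTAL                                                     16782   4481    73%
"""
print(below_threshold(sample))
# [('speechbrain/core.py', 63), ('speechbrain/processing/features.py', 79)]
```

The same arithmetic applied to the TOTAL row gives (16782 - 4481) / 16782, or about 73%, which matches the reported overall coverage.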