1/README.md
2025-04-18 19:56:58 +08:00

# Introduction
Environment:
- Anaconda 3
- Python 3.8
- PyTorch 1.13.1
- Windows 10 or Ubuntu 18.04
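# Features
1. Supported models: EcapaTdnn, TDNN, Res2Net, ResNetSE
2. Supported pooling layers: AttentiveStatsPool (ASP), SelfAttentivePooling (SAP), TemporalStatisticsPooling (TSP), TemporalAveragePooling (TAP)
3. Supported loss functions: AAMLoss, AMLoss, ARMLoss, CELoss
4. Supported preprocessing methods: MelSpectrogram, Spectrogram, MFCC
## Installation
- First, install the GPU build of PyTorch. Skip this step if it is already installed.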
```shell
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
```
- Install the mvector library with pip:
```shell
python -m pip install mvector -U -i https://pypi.tuna.tsinghua.edu.cn/simple
```
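To confirm that the installation steps above took effect, a quick check like the following may help. It is a minimal sketch using only the standard library. The import names `torch`, `torchaudio`, and `mvector` are assumptions based on the packages installed above, and `check_packages` is a hypothetical helper, not part of this project.

```python
import importlib.util
from importlib import metadata


def check_packages(names):
    """Return {name: version or None} for each package import name."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = None  # not importable in this environment
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under this name.
            report[name] = "unknown"
    return report


if __name__ == "__main__":
    # Assumed import names for the packages installed above.
    for pkg, version in check_packages(["torch", "torchaudio", "mvector"]).items():
        print(f"{pkg}: {version if version else 'NOT INSTALLED'}")
```

If any package is reported as not installed, re-run the corresponding install command inside the active conda environment.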
# Usage Guide
## 1. Environment Setup
### 1.1 Install Dependencies
```shell
# Create a conda environment (optional)
conda create -n voiceprint python=3.8
conda activate voiceprint
# Install the project dependencies
pip install -r requirements.txt
```
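### 1.2 Prepare Audio Data
- Store enrollment audio in the `audio_db/` directory (16 kHz, mono WAV format recommended).
- Storing test audio in the `test_audio/` directory is recommended.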
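
Files can be checked against the recommended 16 kHz mono format before enrollment. The following is a minimal sketch using Python's standard `wave` module, which reads only uncompressed PCM WAV files. `check_wav` is a hypothetical helper, not part of this project.

```python
import glob
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of format problems for a WAV file (empty if it matches)."""
    problems = []
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            channels = wav.getnchannels()
    except wave.Error as exc:
        # Raised for files that are not PCM WAV (e.g. compressed formats).
        return [f"not a readable PCM WAV file: {exc}"]
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"{channels} channel(s), expected {expected_channels}")
    return problems


if __name__ == "__main__":
    # Report every enrollment file that deviates from the recommendation.
    for wav_path in sorted(glob.glob("audio_db/*.wav")):
        for problem in check_wav(wav_path):
            print(f"{wav_path}: {problem}")
```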
## 2. Core Features
### 2.1 Train the Voiceprint Model
```shell
python train.py \
--config_path configs/ecapa_tdnn.yml \
--augmentation_config configs/augmentation.json \
--save_dir models/
```
### 2.2 Register a Voiceprint
```python
from mvector import MVector
mvector = MVector()
mvector.register_user(name="user1", audio_path="audio_db/user1.wav")
```
### 2.3 Real-Time Voiceprint Recognition
```shell
python infer_recognition.py \
--model_path models/ecapa_tdnn.pth \
--audio_path test_audio/unknown.wav
```
### 2.4 Voiceprint Comparison
```shell
python infer_contrast.py \
--audio1 audio_db/user1.wav \
--audio2 test_audio/sample.wav \
--threshold 0.7
```
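
The `--threshold` value is presumably compared against a similarity score between the two voiceprint embeddings. The following is a minimal sketch of that decision, assuming cosine similarity as the metric. Cosine similarity is a common choice for speaker verification, but this README does not confirm which metric `infer_contrast.py` uses. The function names and embedding values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb1, emb2, threshold=0.7):
    """Accept the pair as the same speaker if similarity reaches the threshold."""
    return cosine_similarity(emb1, emb2) >= threshold


if __name__ == "__main__":
    # Hypothetical 4-dimensional embeddings; real ones are much longer.
    enrolled = [0.12, -0.48, 0.33, 0.80]
    probe = [0.10, -0.50, 0.30, 0.78]
    print(same_speaker(enrolled, probe, threshold=0.7))
```

Raising the threshold reduces false accepts at the cost of more false rejects, so the value is best tuned on held-out pairs.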
## 3. Noise Reduction Preprocessing
```python
from Reduction_Noise import NoiseReducer
reducer = NoiseReducer("Reduction_Noise/pytorch_model.bin")
clean_audio = reducer.process("noisy_audio.wav")
```
## 4. Model Evaluation
```shell
python eval.py \
--model_path models/ecapa_tdnn.pth \
--test_csv eval_samples.csv \
--batch_size 32
```
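
This README does not state which metric `eval.py` reports. Speaker verification systems are commonly evaluated with the equal error rate (EER), the operating point where the false accept rate equals the false reject rate. The following is a minimal sketch of an approximate EER computation over hypothetical similarity scores. It is not taken from this project's code.

```python
def equal_error_rate(scores, labels):
    """Approximate EER from similarity scores and labels (1 = same speaker).

    Returns (eer, threshold) at the candidate threshold where the false
    accept rate and false reject rate are closest.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pair")

    best_gap, eer, best_threshold = float("inf"), None, None
    for t in sorted(set(scores)):
        # A pair is accepted when its score is >= t.
        far = sum(s >= t for s in negatives) / len(negatives)
        frr = sum(s < t for s in positives) / len(positives)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer, best_threshold = gap, (far + frr) / 2, t
    return eer, best_threshold


if __name__ == "__main__":
    # Hypothetical scores for two same-speaker and two different-speaker pairs.
    eer, threshold = equal_error_rate([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
    print(f"EER={eer:.2f} at threshold={threshold}")
```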