Inconsistent librosa versions PyTorch/SpeechSynthesis/All and CUDA-Optimized/FastSpeech #1369

Open
@xvdp

Description

PyTorch/SpeechSynthesis/All and CUDA-Optimized/FastSpeech

librosa is used throughout the audio projects, although only a few of its functions are called. The requirements files refer to different versions, and not all of the calling code is consistent with the versions 'required'.

The main change in librosa > 0.7 is that many of the functions require keyword arguments; typically the only positional argument allowed is the data.
e.g. librosa.core.resample(y: 'np.ndarray', *, orig_sr: 'float', target_sr: 'float', ...)

  1. The PyTorch/SpeechSynthesis/ requirements files ask for different versions:
  • PyTorch/SpeechSynthesis/Tacotron2/requirements.txt: librosa (no version pinned)
  • PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp/src/trt/requirements.txt: librosa==0.7.0
  • PyTorch/SpeechSynthesis/HiFiGAN/requirements.txt: librosa==0.9.0
  • PyTorch/SpeechSynthesis/FastPitch/requirements.txt: librosa==0.9.0
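
A quick way to see the mismatch is to scan the checkout for librosa lines in the requirements files. The following is a minimal sketch using only the standard library; the helper names `find_librosa_pins` and `is_consistent` are made up for illustration and are not part of the repository.

```python
import re
from pathlib import Path

# Matches a requirement line for librosa, e.g. "librosa", "librosa==0.9.0",
# "librosa >=0.8  # comment". The lookahead rejects names like "librosa-extra".
_LIBROSA_LINE = re.compile(r"^\s*librosa(?![\w.-])\s*([^#;\s]*)", re.IGNORECASE)


def find_librosa_pins(root):
    """Map each requirements*.txt under root to its librosa version specifier."""
    root = Path(root)
    pins = {}
    for req in sorted(root.rglob("requirements*.txt")):
        text = req.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            match = _LIBROSA_LINE.match(line)
            if match:
                # An empty specifier means the file asks for librosa without a pin.
                pins[req.relative_to(root).as_posix()] = match.group(1) or "(unpinned)"
    return pins


def is_consistent(pins):
    """True when every file that mentions librosa uses the same specifier."""
    return len(set(pins.values())) <= 1
```

Running `find_librosa_pins("PyTorch/SpeechSynthesis")` from the repository root should list the files above with their differing specifiers.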

For consistency they should all require the same version. All but one of the functions listed below can run on librosa 0.10.

  2. In the projects that require the newer librosa, some files still use the old positional syntax:
  • PyTorch/SpeechSynthesis/FastPitch/hifigan/data_function.py line 72:
    librosa_mel_fn(sampling_rate, n_fft, num_mels, fmin, fmax)
  • PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai/client/speech_ai_demo/utils/jasper/speech_utils.py lines 386 & 389:
    samples = librosa.core.resample(samples, sample_rate, target_sr)
    librosa.effects.trim(samples, trim_db)
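
Call sites like these can be located mechanically. Below is a rough sketch that uses the standard `ast` module to flag calls to a watched set of function names that pass more than one positional argument. The watch list and the "one positional argument" threshold are assumptions taken from the description above, not something librosa provides, and matching is by name only, so unrelated functions with the same name would also be flagged.

```python
import ast

# Function names mentioned in this issue (assumption: extend as needed).
WATCHED = {"resample", "trim", "pad_center", "librosa_mel_fn", "mel"}


def positional_librosa_calls(source, watched=WATCHED, max_positional=1):
    """Return sorted (line, name, n_positional) for suspicious calls in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both attribute calls (librosa.core.resample) and bare names.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name in watched and len(node.args) > max_positional:
            hits.append((node.lineno, name, len(node.args)))
    return sorted(hits)
```

Each hit would then be rewritten with keywords, for example `librosa.core.resample(samples, orig_sr=sample_rate, target_sr=target_sr)`, following the signature quoted earlier.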

  • CUDA-Optimized/FastSpeech/generate.py uses the deprecated librosa.output.write_wav(path, wav, hp.sr); see librosa/librosa#1062
  • CUDA-Optimized/FastSpeech/tacotron2/audio_processing.py line 82: win_sq = librosa_util.pad_center(win_sq, n_fft)
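
For the write_wav case, the usual substitution is `soundfile.write(path, wav, hp.sr)`. To illustrate what a drop-in replacement with the same argument order could look like without extra dependencies, here is a standard-library stand-in. It is only a sketch: it assumes mono float samples in [-1, 1] and always writes 16-bit PCM, which is not necessarily what the original function produced.

```python
import struct
import wave


def write_wav(path, samples, sr):
    """Write mono float samples in [-1, 1] to path as a 16-bit PCM WAV file."""
    frames = bytearray()
    for x in samples:
        # Clip to the valid range before scaling to the int16 range.
        clipped = max(-1.0, min(1.0, float(x)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)       # mono (assumption)
        wf.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
        wf.setframerate(int(sr))
        wf.writeframes(bytes(frames))
```

The pad_center call on line 82 would only need its second argument named, i.e. `librosa_util.pad_center(win_sq, size=n_fft)`.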

Several of those calls will fail on the newer librosa versions. It is simple enough to clean up the code.

Environment
  • Driver Version: 535.129.03
  • NVIDIA GeForce RTX 3080
  • GitHub repository cloned into the docker image nvidia/cuda:12.1.0-devel-ubuntu22.04
