Recently, the channel Jarods Journeys released a tutorial on how to train Index TTS 2 for new languages.
Honestly, following his tutorial, I was able to achieve decent results with a Spanish model that he trained. I found the phase where the BPE tokenizer is extended interesting; it is similar to the scripts by @IIEleven11 in AllTalk.
On first listen, the quality is superior to XTTS (which, as I understand it, it is partially based on), but it still struggles with accent marks (tildes) and with some difficult letters, syllables, and words in Spanish. This could stem from either the BPE tokenizer or the lack of official pre-trained models.
This project really has potential as a successor to XTTS (especially in the customization of different emotions), but as long as there are no strong pre-trained models, that potential remains limited.
I'm leaving this message here in case the project evolves and @erew123 is interested in creating an inference and training GUI inspired by AllTalk, which is fantastic.
Jarod's pre-trained Spanish model