AWAS666@alien.top to LocalLLaMA • StyleTTS 2 - Closes gap further on TTS quality + Voice generation from samples
1 year ago
Very fast: RTF is below 0.1, so processing takes less than a tenth of the spoken duration, i.e. more than 10x faster than real time.
On CPU, btw.
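For anyone unsure what RTF means here: real-time factor is just processing time divided by the duration of the audio produced, so anything below 1 is faster than real time. A rough sketch of the arithmetic follows. The function names and the example numbers are made up for illustration and are not measurements or code from StyleTTS 2.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the synthesized audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup(rtf: float) -> float:
    """How many times faster than real time a given RTF is."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical example: 1 second of compute for 10 seconds of speech.
rtf = real_time_factor(processing_seconds=1.0, audio_seconds=10.0)
print(f"RTF: {rtf:.2f}, speedup: {speedup(rtf):.1f}x")
```

To measure it yourself, you would time the synthesis call and divide by the length of the output audio (number of samples divided by the sample rate).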
Yep, it is. Takes around 4 hours on a 3090.
That's a fine-tune, with around an hour's worth of data.