Did you know I can change my voice? No? Try listening to these samples:
Impressed? I guess I should give dubbing movies a try 😉
These “modulations” are powered by a service from Modulate.ai, a company based in Cambridge, Massachusetts. They use a combination of machine learning, signal processing, and hard-won intuition to develop real-time “voice skins”. Now that AI has started getting “creative”, I thought of writing a series of blog posts on the topic. I call it “Attack of the Fakes”, as this is what I see happening in the very near future.
Those who have been following Generative Adversarial Networks (GANs) can already guess that adversarial training is involved. The Modulate team has detailed their approach in this excellent blog post. They have built a new kind of voice conversion architecture: parallel sample generation, adversarially trained for voice conversion and inspired by the Parallel WaveNet architecture. They secured a $2M seed round just last week, yet there is already a lot of buzz and excitement around their technology. To read more, you can also go to this post on MIT Technology Review, which featured them last week.
With this technology you can already imagine yourself impersonating _ANYBODY_, and so it is understandably giving activists nightmares. But the same is true of many recent developments in AI, and the Modulate team is very well aware of the potential for misuse. Given that humanity has overcome so many challenges, I am sure this is a problem that will be solved 🙂
Btw, I have more “voice skins”. Listen to me as “Evan” and “Katie” below:
DO try it out yourself at modulate.ai/convert – it's fun 🙂