OpenAI unveils new voice cloning AI bot that can say anything after sampling 15-sec clip

OpenAI, the renowned artificial intelligence research organization, has recently unveiled Voice Engine, a groundbreaking voice-cloning technology capable of emulating any speaker by analyzing a mere 15-second audio sample.

Promising “natural-sounding speech” with dynamic and realistic voices, the ChatGPT-maker’s new model builds on OpenAI’s existing text-to-speech API and has been in development since 2022. The company already uses a version of the technology to power the preset voices in its current text-to-speech API and Read Aloud feature, and has published samples on its official blog that closely resemble authentic voices.

While OpenAI envisions numerous beneficial applications for Voice Engine, such as reading assistance, language translation, and support for individuals with speech impairments, the company is aware that the technology could be misused.

The specter of deepfake manipulation looms large, raising privacy and ethical concerns. Consequently, OpenAI says Voice Engine is not yet ready for widespread release, citing the need to address serious privacy concerns before proceeding with a full-scale rollout.

Acknowledging the significant risks associated with this technology, particularly during an election year, OpenAI underscores its commitment to soliciting feedback from various stakeholders spanning government, media, entertainment, education, and civil society. All preview testers have consented to adhere to OpenAI’s usage policies, which prohibit the impersonation of individuals without their explicit consent or legal authorization.

Moreover, users of Voice Engine must disclose to their audience that the voices they hear are AI-generated. OpenAI has implemented additional safety measures, including watermarking to trace the origin of audio and proactive monitoring of how the system is used. Upon official release, a “no-go voice list” will be implemented to detect and prevent the creation of AI-generated voices that resemble prominent figures.

The launch of OpenAI’s Sora, and now Voice Engine, comes shortly before one of the most hotly contested elections in the US. Naturally, political analysts and technology pundits are highly concerned about the misuse of these new AI tools.

As for pricing and availability, OpenAI has remained tight-lipped. However, unconfirmed pricing data suggests Voice Engine may undercut competitors in the market. Speculation points to a cost of $15 per one million characters, roughly equivalent to 162,500 words, which would position Voice Engine as a cost-effective option for audiobook production.

Additionally, OpenAI hints at an “HD” version that costs twice as much, although specifics regarding its functionality remain undisclosed.
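To put the speculated figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the $15 rate, the 162,500-word equivalence, and the doubled “HD” rate are the unconfirmed numbers quoted above, and the function name and the 90,000-word book length are hypothetical values chosen for illustration.

```python
# Speculated figures quoted in the article, not confirmed OpenAI pricing.
STANDARD_USD_PER_MILLION_CHARS = 15.0
# The "HD" version is said to cost double the standard rate.
HD_USD_PER_MILLION_CHARS = 2 * STANDARD_USD_PER_MILLION_CHARS
# The article's rough equivalence: one million characters is about 162,500 words.
WORDS_PER_MILLION_CHARS = 162_500


def estimate_cost_usd(word_count: int, usd_per_million_chars: float) -> float:
    """Estimate the narration cost for a text of `word_count` words."""
    # Convert words to characters using the article's ratio (about 6.15 per word).
    chars = word_count * (1_000_000 / WORDS_PER_MILLION_CHARS)
    return chars / 1_000_000 * usd_per_million_chars


if __name__ == "__main__":
    book_words = 90_000  # hypothetical length of a novel
    standard = estimate_cost_usd(book_words, STANDARD_USD_PER_MILLION_CHARS)
    hd = estimate_cost_usd(book_words, HD_USD_PER_MILLION_CHARS)
    print(f"standard: ${standard:.2f}")
    print(f"HD:       ${hd:.2f}")
```

Under these assumptions, a 90,000-word book would cost about $8.31 to narrate at the standard rate and about $16.62 at the “HD” rate.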

In parallel with this announcement, OpenAI is reportedly working with Microsoft on an AI supercomputer dubbed “Stargate,” with projected costs reaching $100 billion, according to The Information. Together, these moves underscore OpenAI’s continued push to advance artificial intelligence in collaboration with industry leaders.
