AI Voice Tools Enter a New Era of Risk

Mark Reddish
August 6, 2024

On July 30, OpenAI began rolling out Advanced Voice Mode, which offers natural, real-time conversations that sense and respond to emotions. AI-generated voice technology isn’t new, but this degree of sophistication, with remarkably human “pacing, intonation, even fake breathing,” could mark a new era for AI’s benefits and risks.

Speech-based interaction adds layers of complexity to the already intricate landscape of AI. The potential benefits, from education to companionship to productivity across a variety of tasks, are exciting, but human-quality speech heightens the risk that AI will be used for fraud, misinformation, and manipulation.

Our brains process information from speech and text differently. Research has shown that hearing information, rather than reading it, can make the source seem more intelligent and credible, and can create stronger social bonds. The voice-based AI threats we’ve already witnessed, like tricking voters into staying home or duping a parent into paying a ransom, might soon seem quaint compared to the harms from increasingly sophisticated, compelling, dangerous, and accessible AI voice tools.

The government is not equipped to keep up with these evolving threats. Yes, we have laws against fraud, extortion, etc. And we’re seeing efforts to adapt laws and regulations, such as the Federal Communications Commission’s new proposal to protect consumers from the abuse of AI in robocalls and robotexts. But AI capabilities are changing too quickly, and the risks are too significant to be sufficiently managed with current legal and regulatory frameworks. 

Part of the challenge with advanced AI systems is that the guardrails intended to prevent misuse can be circumvented (sometimes quite easily), and even the systems’ creators cannot fully understand or explain the systems’ behavior. AI developers and researchers are working to improve explainability and safety, but right now the situation is reactive (safety measures usually aren’t developed until after innocent bystanders are harmed) and voluntary (companies aren’t required to develop any safety measures at all). At some point, reasonable minds will agree that a voluntary, reactive approach to safety isn’t enough. If we aren’t at that point with AI’s capabilities already, we likely will be soon. Mitigating the threats from AI requires mandatory mechanisms to prevent harm and better tools for the government to respond when harm occurs.

The Center for AI Policy has drafted comprehensive model legislation, as well as a 2024 Action Plan that prioritizes targeted measures for AI safety through cybersecurity standards, emergency preparedness, and whistleblower protections. Acting on any of these proposals would be helpful, but fundamentally, what’s needed is recognition that neither government nor industry should rely on its ability to deal with AI’s problems after they arise. 

A safety-oriented approach is part and parcel of maintaining America’s leadership in AI. As the U.S. Artificial Intelligence Safety Institute puts it, “Safety breeds trust, trust provides confidence in adoption, and adoption accelerates innovation.” As we move into this new era, with AI voice or whatever enhancement comes next, safety and trust are essential.

The future of AI is talking to us. Are we ready to listen?
