AI voice assistants evolve, promising deeper interactions

The landscape of generative AI is shifting, with tech giants betting on advanced voice assistants as the next frontier.

Google’s recent launch of Gemini Live for Android users marks a significant milestone in this AI marathon, closely following OpenAI’s development of ChatGPT’s Advanced Voice Mode. These next-generation voice assistants represent a leap forward from their predecessors like Apple’s Siri and Amazon’s Alexa.

“Google’s Gemini Live focuses on seamless integration with existing ecosystems and devices, while OpenAI’s GPT-4 emphasizes human-like conversation with a low millisecond response delay,” says Stephen Kowski, Field CTO at SlashNext Email Security+. “Both push boundaries in emotional recognition, contextual understanding and handling interruptions.”

Google’s Gemini Live, available to Gemini Advanced subscribers for $20 per month, aims to become a digital sidekick rather than a simple voice app. It promises deep integration with Google’s ecosystem, allowing users to interact with apps like Gmail, Calendar and Maps through natural conversation. Similarly, OpenAI’s Advanced Voice Mode, currently in alpha testing, boasts human-like interactions and demonstrated musical abilities in earlier versions.

Meanwhile, Apple is gearing up to release a generative AI-powered upgrade to Siri with iOS 18 this fall, promising more natural and contextually relevant interactions. Amazon, too, is reportedly developing a subscription-based, AI-enhanced version of Alexa to compete in this evolving market. And IBM recently introduced new features for its watsonx Assistant that leverage large speech models (LSMs) to enhance speech recognition in phone channels. These advancements, which IBM claims outperform OpenAI’s Whisper model in specific customer service scenarios, aim to transform call center operations by offering more natural and accurate voice interactions.

This push towards more sophisticated voice AI reflects a broader industry trend. Tech companies are betting that voice will become a primary interface for AI interactions, offering a more natural and intuitive way for users to access the power of large language models in their daily lives.

As these assistants become more capable and integrated into our routines, they promise to revolutionize our interactions with technology. From managing schedules and summarizing emails to providing on-the-fly information about locations or videos, these AI companions aim to blend seamlessly into our digital experiences.

However, this rapid advancement raises important questions about privacy, data collection and the ethical implications of increasingly human-like AI interactions. Kowski notes, “As AI voice assistants become more integrated, concerns arise around data collection, storage and potential misuse of personal information. There are also ethical considerations regarding consent, transparency about AI interactions and the potential for manipulation or misinformation.”

IBM is a leading global provider of hybrid cloud, AI and business services, helping clients in more than 175 countries capitalize on insights from their data, streamline business processes, reduce costs and gain the competitive edge in their industries. Nearly 3,000 government and corporate entities in critical infrastructure areas such as financial services, telecommunications and healthcare rely on IBM's hybrid cloud platform and Red Hat OpenShift to effect their digital transformations quickly, efficiently, and securely. IBM's breakthrough innovations in AI, quantum computing, industry-specific cloud solutions and business services deliver open and flexible options to our clients. All of this is backed by IBM's legendary commitment to trust, transparency, responsibility, inclusivity, and service.

For more information, visit: www.ibm.com.
