OpenAI has introduced long-awaited enhancements to its well-known chatbot, ChatGPT, that enable it to communicate through images and voice. The release marks a significant milestone in OpenAI's goal of achieving artificial general intelligence capable of interpreting and analyzing information from various sources, not just text.

Iran Press/Sci & Tech: "We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about," OpenAI said in its official blog post.

OpenAI said the new ChatGPT Plus will include voice chat powered by a novel text-to-speech model capable of mimicking human voices, and the ability to discuss images thanks to integration with the company’s image generation models. The new features appear to be part of what is known as GPT Vision (or GPT-V, which is often confused with a theoretical GPT-5) and represent key components of the enhanced multimodal version of GPT-4 that OpenAI teased earlier this year.

This upgrade comes right after OpenAI unveiled DALL-E 3, its most advanced text-to-image generator yet. Hailed as "insane" by early testers for its quality and accuracy, DALL-E 3 can create high-fidelity images from text prompts while understanding complex context and concepts expressed in natural language. It will be built into ChatGPT Plus, the subscription service that offers a version of ChatGPT powered by GPT-4.

The integration of DALL-E 3 and conversational voice chat signals OpenAI’s push towards AI assistants that can perceive the world more like humans do, with multiple senses. According to the company: “Voice and image give you more ways to use ChatGPT in your life. Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it.”

