ChatGPT can now see through your phone’s camera and screen — one of its most impressive features yet
- OpenAI launched its widely anticipated video feature for ChatGPT’s Advanced Voice Mode.
- It allows users to incorporate live video and screen sharing into conversations with ChatGPT.
- ChatGPT can interpret emotions, assist with homework, and provide real-time visual context.
ChatGPT’s Advanced Voice Mode can now help provide real-time design tips for your home, assistance with math homework, or instant replies to your texts from the Messages app.
After teasing the public with a glimpse of the chatbot’s ability to “reason across” vision along with text and audio during OpenAI’s Spring Update in May, the company finally launched the feature on Thursday as part of day six of OpenAI’s “Shipmas.”
“We are so excited to start the rollout of video and screen share in Advanced Voice today,” the company said in the livestream on Thursday. “We know this is a long time coming.”
OpenAI initially said the voice and video features would be rolling out in the weeks after its Spring Update. However, Advanced Voice Mode didn’t end up launching to users until September, and the video mode didn’t come out until this week.
The new capabilities help provide more depth in conversations with ChatGPT by adding “realtime visual context” with live video and screen sharing. Users can access the live video by selecting the Advanced Voice Mode icon in the ChatGPT app and then choosing the video button on the bottom far left.
In the livestream demonstration on Thursday, ChatGPT helped an OpenAI employee make pour-over coffee. The chatbot noticed details like what the employee was wearing and then walked him through the steps of making the drink, elaborating on certain parts of the process when asked. The chatbot also gave him feedback on his technique.
To share your screen with ChatGPT, hit the drop-down menu and select “Share Screen.” In the “Shipmas” demo, ChatGPT could identify that the user was in the Messages app, understand the message sent, and then help formulate a response after the user asked.
During the company’s Spring Update, OpenAI showed off some other uses of the video mode. The chatbot was able to interpret emotions based on facial expressions and also demonstrated its ability to act as a tutor. OpenAI Research Lead Barret Zoph walked through an equation on a whiteboard (3x+1=4) and ChatGPT provided him with hints to find the value of x.
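For reference, the whiteboard equation from that demo has a one-line solution (worked out here, not shown in the demo itself):

```latex
3x + 1 = 4 \implies 3x = 3 \implies x = 1
```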
The feature had a couple of stumbles during the Spring Update demonstration, like referring to one of the employees as a “wooden surface” or trying to solve a math problem before it was shown.
Now that it’s out, we decided to give the feature a whirl — and so far, it seems pretty impressive.
We showed the chatbot an office plant and asked it to tell us about it, assess whether it was healthy, and explain what the watering schedule should look like. The chatbot accurately described the browning and drying on the leaf tips and identified the plant as an aloe vera, which matched its appearance.
The new video feature will be rolling out this week in the latest version of the ChatGPT mobile app to Team users and most Plus and Pro users. The feature isn’t yet available in the EU, Switzerland, Iceland, Norway, or Liechtenstein, but OpenAI said it will bring it to those regions as soon as possible.