Google Lens can now answer questions about videos | TechCrunch
Google is upgrading its visual search app, Lens, with the ability to answer questions about your surroundings in near real time.
English-speaking Android and iOS users with the Google app installed can now start capturing a video via Lens and ask questions about objects of interest in the video.
Lou Wang, director of product management for Lens, said the feature uses a “customized” Gemini model to make sense of the video and pertinent questions. Gemini is Google’s family of AI models and powers a number of products across the company’s portfolio.
“Let’s say you want to learn more about some interesting fish,” Wang said in a press briefing. “[Lens will] produce an overview that explains why they’re swimming in a circle, along with more resources and helpful information.”
To access Lens’ new video analysis feature, you must sign up for Google’s Search Labs program, as well as opt in to the “AI Overviews and more” experimental features in Labs. In the Google app, holding your smartphone’s shutter button activates Lens’ video-capturing mode.
Ask a question while recording a video, and Lens will link out to an answer supplied by AI Overviews, the feature in Google Search that uses AI to summarize information from around the web.
According to Wang, Lens uses AI to determine which frames in a video are most “interesting” and salient — and above all, relevant to the question being asked — and uses these to “ground” the answer from AI Overviews.
“All this comes from an observation of how people are trying to use things like Lens right now,” Wang said. “If you lower the barrier of asking these questions and helping people satisfy their curiosity, people are going to pick this up pretty naturally.”
The launch of video for Lens comes on the heels of a similar feature Meta previewed last month for its AR glasses, Ray-Ban Meta. Meta plans to bring real-time AI video capabilities to the glasses, letting wearers ask questions about what’s around them (e.g., “What type of flower is this?”).
OpenAI has also teased a feature that lets its Advanced Voice Mode tool understand videos. Eventually, Advanced Voice Mode — a premium ChatGPT feature — will be able to analyze videos in real time and take context into account as it answers you.
Google has beaten both companies to the punch, it seems — minus the fact that Lens is asynchronous (you can’t chat with it in real time), and assuming that the video feature works as advertised. We weren’t shown a live demo during the press briefing, and Google has a history of overpromising when it comes to its AI’s capabilities.
Aside from video analysis, Lens can also now search with images and text in one go. English-speaking users, including those not enrolled in Labs, can launch the Google app and hold the shutter button to take a photo, then ask a question by speaking out loud.
Finally, Lens is getting new e-commerce-specific functionality.
Starting today, when Lens on Android or iOS recognizes a product, it’ll display information about it, including the price and deals, brand, reviews, and stock. Product identification works on uploaded and newly snapped photos (but not videos), and for now it is limited to select countries and certain shopping categories, including electronics, toys, and beauty.
“Let’s say you saw a backpack, and you like it,” Wang said. “You can use Lens to identify that product and you’ll be able to instantly see details you might be wondering about.”
There’s an advertising component to this, too. The results page for Lens-identified products will also show “relevant” shopping ads with options and prices, Google says.
Why stick ads in Lens? Because roughly 4 billion Lens searches each month are related to shopping, per Google. For a tech giant whose lifeblood is advertising, it’s simply too lucrative an opportunity to pass up.