I Tried the iPhone 16's New Visual Intelligence, and It Feels Like the Future
As I walked past a Japanese tea shop in New York’s Bowery Market, all I had to do was point my iPhone at the storefront and hold a button on the side of my phone to see its hours, photos from customers and options to call the shop or place an order.
Apple’s new Visual Intelligence tool, which will be available for the iPhone 16 lineup, is meant to cut out the middle step of unlocking your phone, opening Google or ChatGPT, and typing in a query or uploading a photo to get an answer. An early version of the feature is available as part of Apple’s iOS 18.2 developer beta, which launched to those in the program on Wednesday.
Although the version I tried is an early preview meant for developers rather than general users, it gave me a sense of how Visual Intelligence works and what it could bring to the iPhone experience. In my brief time testing this very early version, I found that it works best for quickly pulling up information about points of interest on the fly. While it can be convenient, I also imagine it will take consumers time to embrace the feature once it launches since it represents a new way of thinking about how we operate our phones.
Still, it hints at a future in which we don’t have to open as many apps to get things done on our mobile devices, and that’s promising.
But I’ll have more to say about it once I’ve spent more time with it and after the final version launches.
How Visual Intelligence works
Visual Intelligence relies on the new Camera Control button on the iPhone 16, 16 Plus, 16 Pro and 16 Pro Max. Just press and hold the button, and you’ll see a prompt explaining what Visual Intelligence is and informing you that images aren’t stored on your iPhone or shared with Apple.
When the Visual Intelligence interface is open, just tap the camera shutter button to take a photo. From there, you can tap an on-screen button to ask ChatGPT about the image, or you can press the search button to launch a Google search. You can choose to use ChatGPT with or without an account; requests will remain anonymous and won’t be used to train ChatGPT’s model if you don’t sign in.
In the current version of Visual Intelligence, there’s also an option to report a concern by pressing the icon that looks like three dots. If you want to get rid of the image and take a different one instead, you can tap the X icon where the shutter button is usually located on screen.
Beyond using Google or ChatGPT, the iPhone will also surface certain options based on what you’re pointing the camera at, such as store hours if you’re pointing it at a shop or restaurant.
What it’s like to use it
During the short time I’ve spent with Visual Intelligence so far, I’ve used it to learn about restaurants and stores, ask questions about video games and more.
While it’s a quick and convenient way to access ChatGPT or Google, what’s most interesting to me is the way it can identify restaurants and stores. So far, this worked best when pointing the camera at a storefront rather than a sign or banner.
For example, when I scanned the exterior of Kettl, the Japanese tea shop I mentioned earlier, Visual Intelligence automatically pulled up helpful information such as photos of the various drinks. It reacted similarly when I snapped a picture of a vintage video game store near my office. After I hit the shutter button, Visual Intelligence displayed the name of the shop along with photos of the inside, a link to visit its website and the option to call the store.
Once I went inside, I used Visual Intelligence to ask ChatGPT for game recommendations based on titles in the store and to learn more about consoles and games in the shop. Its answers were pretty spot on, although it’s always worth remembering that chatbots like ChatGPT aren’t always accurate.
When I asked ChatGPT for games similar to the Persona Dancing series after taking a picture of the games on a shelf, it suggested other titles that are also music- and story-driven. That seems like a sensible answer since the Persona Dancing games are rhythm-based spin-offs of the popular Persona Japanese role-playing games. To find out that the Game Boy Color launched in 1998, all I had to do was snap a quick photo and ask when it was released. (For what it’s worth, I got similar results when asking the same questions in the ChatGPT app.)
Although I’ve been enjoying experimenting with Visual Intelligence so far, I feel like it would be far more useful when traveling. Being able to just point my iPhone at a landmark, store or restaurant to learn more about it would have come in handy during my trips to France and Scotland earlier this year. In a city I’m already familiar with, I don’t often find myself in urgent need of more information about nearby locations.
Visual Intelligence and the future of phones
It feels impossible not to compare Visual Intelligence to Google Lens, which also lets you learn about the world around you by using your phone’s camera instead of typing in a search term. In its current form (which, again, is an early preview meant for developers), Visual Intelligence almost feels like a dedicated Google Lens/ChatGPT button.
That can make it feel like it’s not new or different, considering Google Lens has existed for years. But the fact that this type of functionality is so important it’s getting its own button on the newest iPhone is telling. It indicates that Apple believes there may be a better way to search for and get things done on our phones.
Apple is far from alone in that conviction; Google, OpenAI, Qualcomm and startups like Rabbit all believe AI can put the camera to use in new ways on our mobile devices by making it more of a discovery tool than just a means of capturing photos. At its annual Snapdragon Summit this week, Qualcomm showed off a virtual assistant concept that uses the camera to do things like split the bill three ways at a restaurant based on a photo of the receipt.
The trick is getting regular people to adopt it. Even if it’s faster and more efficient, I’m betting muscle memory may prevent many from ditching the old ways of tapping and swiping in favor of snapping photos.
Building new habits takes time. But Visual Intelligence is only in an early preview stage, so there’s much more to come.