Microsoft brings out a small language model that can look at pictures

Published

May 21, 2024

Admin

Phi-3-vision is a multimodal model, meaning it can read both text and images, and is best suited to mobile devices. Microsoft says Phi-3-vision, now available in preview, is a 4.2 billion parameter model (parameters are the values a model learns during training, and the count is a rough measure of its size and complexity) that can handle general visual reasoning tasks such as answering questions about charts or images.

But Phi-3-vision is far smaller than other image-focused AI models like OpenAI’s DALL-E or Stability AI’s Stable Diffusion. Unlike those models, Phi-3-vision doesn’t generate images, but it can understand what’s in an image and analyze it for a user.

Microsoft announced Phi-3 in April with the release of Phi-3-mini, the smallest Phi-3 model at 3.8 billion parameters. The Phi-3 family has two other members: Phi-3-small (7 billion parameters) and Phi-3-medium (14 billion parameters).

AI model developers have been releasing small, lightweight models like Phi-3 as demand grows for more cost-effective, less compute-intensive AI services. Small models can power AI features on devices like phones and laptops without taking up too much memory. Microsoft has already released other small models in addition to Phi-3 and its predecessor, Phi-2. Its math problem-solving model, Orca-Math, reportedly answers math questions better than larger counterparts such as Google’s Gemini Pro.
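To give a rough sense of why parameter counts matter for phones and laptops, the back-of-the-envelope sketch below multiplies each model's parameter count by an assumed storage size per parameter. The parameter counts are the ones cited in this article. The two storage formats (16-bit and 4-bit weights) are illustrative assumptions, not figures from Microsoft, and the estimate covers the weights only, not the extra memory a model needs while running.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts (in billions) come from the article.
PARAMS_BILLIONS = {
    "Phi-3-mini": 3.8,
    "Phi-3-vision": 4.2,
    "Phi-3-small": 7.0,
    "Phi-3-medium": 14.0,
}

# Assumed storage formats (not from the article): 16-bit and 4-bit weights.
BYTES_PER_PARAM = {"fp16": 2.0, "int4": 0.5}


def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10**9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billions * bytes_per_param


def estimate_all() -> dict:
    """Return {model: {format: gigabytes}} for every model and format above."""
    return {
        name: {fmt: round(weight_gb(p, b), 2) for fmt, b in BYTES_PER_PARAM.items()}
        for name, p in PARAMS_BILLIONS.items()
    }


if __name__ == "__main__":
    for name, sizes in estimate_all().items():
        line = ", ".join(f"{fmt}: ~{gb} GB" for fmt, gb in sizes.items())
        print(f"{name}: {line}")
```

Under these assumptions, the estimate scales linearly with parameter count, so Phi-3-medium's weights come out at several times the size of Phi-3-mini's in either format.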

Phi-3-vision is now available in preview. The other members of the Phi-3 family (Phi-3-mini, Phi-3-small, and Phi-3-medium) are available through Azure’s model library.
