Tech

New Open-Source Champion Reflection 70B Outperforms GPT-4o and Claude Sonnet 3.5

Published

3 months ago

September 6, 2024

Admin

New Open-Source Champion Reflection 70B Outperforms GPT-4o and Claude Sonnet 3.5

Matt Shumer, co-founder and CEO of AI writing startup HyperWrite recently launched a new model called Reflection 70B.

I’m excited to announce Reflection 70B, the world’s top open-source model.

Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.

405B coming next week – we expect it to be the best model in the world.

Built w/ @GlaiveAI.

Read on ⬇️: pic.twitter.com/kZPW1plJuo

— Matt Shumer (@mattshumer_) September 5, 2024

The model has emerged as a leading open-source language model, outperforming top closed-source models like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5. The model, developed using a novel technique called Reflection-Tuning, showcases significant improvements in benchmark tests, including MMLU, MATH, IFEval, and GSM8K.

The Reflection-Tuning technique allows Reflection 70B to detect and correct its own mistakes before finalising an answer. This advancement aims to address the common issue of model hallucinations and improve reasoning accuracy.

The model outputs its internal reasoning in tags and final answers in tags, with additional tags used for correcting any detected errors.

Currently, Reflection 70B holds the top position in several benchmarks and demonstrates superior performance over GPT-4o and Llama 3.1 405B. The upcoming Reflection 405B model, expected next week, is anticipated to further elevate the standard for LLMs globally.

This is second model this week outperforming GPT-4o and Claude Sonnet 3.5

Alibaba recently released Qwen2-VL, the latest model in its vision-language series. The new model can chat via camera, play card games, and control mobile phones and robots by acting as an agent. It is available in three versions: the open source 2 billon and 7 billion models, and the more advanced 72 billion model, accessible using API.

The advanced 72 billion model of Qwen2-VL achieved SOTA visual understanding across 20 benchmarks. “Overall, our 72B model showcases top-tier performance across most metrics, often surpassing even closed-source models like GPT-4o and Claude 3.5-Sonnet,”said the company in a blog post, saying that it demonstrates a significant edge in document understanding.

Crunchbase News Today

New Open-Source Champion Reflection 70B Outperforms GPT-4o and Claude Sonnet 3.5

Tech

New Open-Source Champion Reflection 70B Outperforms GPT-4o and Claude Sonnet 3.5

Here’s what Pa Gov. learned from small businesses at a West Philly barbershop

What’s Next for Airbnb – Skift Travel Podcast

World Day of Remembrance 2024 – Bicycle Coalition of Greater Philadelphia

Bentley on Track to be the First Fair Trade-Certified U.S. Business University

Perspectives in Artificial Intelligence: Creating jobs, not replacing them | Marquette Today

Olivia Rodrigo’s Most Recent No. 1 Hit Returns In Spectacular Fashion

DineDrinkCLE talks The W Sports Bar, West Side Market holiday happens, Abundance Culinary, more

CIMMYT at the World Agri-Tech Innovation Summit 2024

Insider notebook: Michigan’s secret week with Bryce Underwood, Big Ten coach predicts Ohio State-Indiana score

Tiny Nick’s Gambling Picks: 11/22 – Zone Coverage