Connect with us

Tech

OpenAI brings fine-tuning to GPT-4o with 1M free tokens per day through Sept. 23

Published

on

OpenAI brings fine-tuning to GPT-4o with 1M free tokens per day through Sept. 23

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


OpenAI today announced that it is allowing third-party software developers to fine-tune — or modify the behavior of — custom versions of its signature new large multimodal model (LMM), GPT-4o, making it more suitable for the needs of their application or organization.

Whether it’s adjusting the tone, following specific instructions, or improving accuracy in technical tasks, fine-tuning enables significant enhancements with even small datasets.

Developers interested in the new capability can visit OpenAI’s fine-tuning dashboard, click “create,” and select gpt-4o-2024-08-06 from the base model dropdown menu.

The news comes less than a month after the company made it possible for developers to fine-tune the model’s smaller, faster, cheaper variant, GPT-4o mini — which is however, less powerful than the full GPT-4o.

“From coding to creative writing, fine-tuning can have a large impact on model performance across a variety of domains,” state OpenAI technical staff members John Allard and Steven Heidel in a blog post on the official company website. “This is just the start—we’ll continue to invest in expanding our model customization options for developers.”

Free tokens offered now through September 23

The company notes that developers can achieve strong results with as few as a few dozen examples in their training data.

To kick off the new feature, OpenAI is offering up to 1 million tokens per day for free to use on fine-tuning GPT-4o for any third-party organization (customer) now through September 23, 2024.

Tokens refer to the numerical representations of letter combinations, numbers, and words that represent underlying concepts learned by an LLM or LMM.

As such, they effectively function like an AI model’s “native language” and are the measurement used by OpenAI and other model providers to determine how much information a model is ingesting (input) or providing (output). In order to fine-tune an LLM or LMM such as GPT-4o as a developer/customer, you need to convert the data relevant to your organization, team, or individual use case into tokens that it can understand, that is, tokenize it, which OpenAI’s fine-tuning tools provide.

However, this comes at a cost: ordinarily it will cost $25 per 1 million tokens to fine-tune GPT-4o, while running the inference/production model of your fine-tuned version costs $3.75 per million input tokens and $15 per million output tokens.

For those working with the smaller GPT-4o mini model, 2 million free training tokens are available daily until September 23.

This offering extends to all developers on paid usage tiers, ensuring broad access to fine-tuning capabilities.

The move to offer free tokens comes as OpenAI faces steep competition in price from other proprietary providers such as Google and Anthropic, as well as from open-source models such as the newly unveiled Hermes 3 from Nous Research, a variant of Meta’s Llama 3.1.

However, with OpenAI and other closed/proprietary models, developers don’t have to worry about hosting the model inference or training it on their servers — they can use OpenAI’s for those purposes, or link their own preferred servers to OpenAI’s API.

Success stories highlight fine-tuning potential

The launch of GPT-4o fine-tuning follows extensive testing with select partners, demonstrating the potential of custom-tuned models across various domains.

Cosine, an AI software engineering firm, has leveraged fine-tuning to achieve state-of-the-art (SOTA) results of 43.8% on the SWE-bench benchmark with its autonomous AI engineer agent Genie — the highest of any AI model or product publicly declared to datre.

Another standout case is Distyl, an AI solutions partner to Fortune 500 companies, whose fine-tuned GPT-4o ranked first on the BIRD-SQL benchmark, achieving an execution accuracy of 71.83%.

The model excelled in tasks such as query reformulation, intent classification, chain-of-thought reasoning, and self-correction, particularly in SQL generation.

Emphasizing safety and data privacy even as it’s used to fine-tune new models

OpenAI has reinforced that safety and data privacy remain top priorities, even as they expand customization options for developers.

Fine-tuned models allow full control over business data, with no risk of inputs or outputs being used to train other models.

Additionally, the company has implemented layered safety mitigations, including automated evaluations and usage monitoring, to ensure that applications adhere to OpenAI’s usage policies.

Yet research has shown that fine-tuning models can cause them to deviate from their guardrails and safeguards, and reduce their overall performance. Whether organizations believe it is worth the risk is up to them — however, clearly OpenAI thinks it is and is encouraging them to consider fine-tuning as a good option.

Indeed, when announcing new fine-tuning tools for developers back in April — such as epoch-based checkpoint creation — OpenAI stated at that time that  “We believe that in the future, the vast majority of organizations will develop customized models that are personalized to their industry, business, or use case.”

The release of new GPT-4o fine tuning capabilities today underscores OpenAI’s ongoing commitment to that vision: a world in which every org has its own custom AI model.

Continue Reading