OpenAI has released o3-mini, a new AI model in its reasoning series that focuses on STEM capabilities, especially coding, math, and science.
The AI firm announced the model in a blog post:
OpenAI o3-mini is our first small reasoning model that supports highly requested developer features including function calling, Structured Outputs, and developer messages, making it production-ready out of the gate. Like OpenAI o1-mini and OpenAI o1-preview, o3-mini will support streaming. Also, developers can choose between three reasoning effort options—low, medium, and high—to optimize for their specific use cases. This flexibility allows o3-mini to “think harder” when tackling complex challenges or prioritize speed when latency is a concern. o3-mini does not support vision capabilities, so developers should continue using OpenAI o1 for visual reasoning tasks. o3-mini is rolling out in the Chat Completions API, Assistants API, and Batch API starting today to select developers in API usage tiers 3-5.
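For developers, a request to o3-mini through the Chat Completions API might look roughly like the sketch below, using OpenAI's Python SDK. The reasoning_effort values and the developer-message role follow the features described in the announcement; the prompt contents and client setup are illustrative, not taken from OpenAI's post.

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

# Raise the reasoning effort so the model "thinks harder" on a tougher problem;
# "low", "medium", and "high" are the three options described in the announcement.
response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="high",
    messages=[
        {"role": "developer", "content": "You are a concise math tutor."},
        {"role": "user", "content": "Prove that the sum of two even integers is even."},
    ],
)

print(response.choices[0].message.content)
```

Dropping reasoning_effort back to "low" or "medium" trades some depth of reasoning for lower latency, which is the speed-versus-accuracy knob the announcement highlights.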
OpenAI says its o1 model remains its flagship reasoning model, but o3-mini provides a specialized experience for those who need it.
In ChatGPT, o3-mini uses medium reasoning effort to provide a balanced trade-off between speed and accuracy. All paid users will also have the option of selecting o3-mini-high in the model picker for a higher-intelligence version that takes a little longer to generate responses. Pro users will have unlimited access to both o3-mini and o3-mini-high.
Interestingly, the o3-mini model outperforms o1 in some situations, especially within the STEM arena.
Similar to its OpenAI o1 predecessor, OpenAI o3-mini has been optimized for STEM reasoning. o3-mini with medium reasoning effort matches o1’s performance in math, coding, and science, while delivering faster responses. Evaluations by expert testers showed that o3-mini produces more accurate and clearer answers, with stronger reasoning abilities, than OpenAI o1-mini. Testers preferred o3-mini’s responses to o1-mini 56% of the time and observed a 39% reduction in major errors on difficult real-world questions. With medium reasoning effort, o3-mini matches the performance of o1 on some of the most challenging reasoning and intelligence evaluations including AIME and GPQA.
OpenAI also touts the speed and efficiency of the o3-mini model.
With intelligence comparable to OpenAI o1, OpenAI o3-mini delivers faster performance and improved efficiency. Beyond the STEM evaluations highlighted above, o3-mini demonstrates superior results in additional math and factuality evaluations with medium reasoning effort. In A/B testing, o3-mini delivered responses 24% faster than o1-mini, with an average response time of 7.7 seconds compared to 10.16 seconds.
The o3-mini model continues OpenAI’s efforts to offer a variety of AI models tuned to specific tasks and uses.