Last Friday OpenAI wrapped up its festive “12 Days of OpenAI” (a.k.a. “Shipmas”), delivering a sleigh-load of updates, tools, and capabilities. To help digest the whirlwind, I’ve summarized the eight most significant announcements below, along with my personal insights on each.
Super Models
OpenAI o1: Better, Smarter, and More Accessible
OpenAI released the production version of its o1 reasoning model, which boasts 34% fewer errors and significantly better performance on math and coding tasks compared to the preview version. It’s now available to all paying ChatGPT users, with usage limits due to its computational intensity. I’ve used o1 and it’s really great - it produced a piece of code that worked on the first try, while both GPT-4o and Claude 3.5 Sonnet failed to provide me with the correct code even after several iterations.
OpenAI also introduced a new subscription tier called “ChatGPT Pro”, which costs $200 a month (compared to $20 for ChatGPT Plus). This tier provides unlimited access to GPT-4o, o1, and Advanced Voice Mode, and limited access to a new mode of o1 called “o1 pro mode”, which uses more compute than the regular o1 to “think harder” and provide even better answers. While the performance gap between o1 and o1 pro mode is minimal (see the graph above), a ChatGPT Pro subscription is worthwhile for heavy users who need unlimited access to GPT-4o and the regular o1 model. There is nothing more annoying than paying for a premium service (ChatGPT Plus) and then getting blocked for 3 hours after you’ve reached your usage limits…
OpenAI o3: A Leap Towards AGI
The highlight of Shipmas came on the last day, when OpenAI announced o3, their next-generation reasoning model. o3 delivers unprecedented performance in reasoning and problem-solving tasks, and excels in handling highly complex multi-step problems across fields such as advanced mathematics and scientific research.
OpenAI's o3 achieved a groundbreaking 25.2% success rate on the new FrontierMath benchmark, which includes exceptionally challenging problems requiring extensive logical reasoning, while all previous AI models scored below 2%. o3 also demonstrated remarkable improvements in coding tests, with a 22.8-point gain over o1 on the SWE-Bench Verified benchmark and a 2727 Codeforces rating, which would place o3 at around #175 among human competitive programmers on the planet…
o3 also aced the ARC-AGI benchmark, scoring 75.7% in “low-compute” mode and 87.5% in “high-compute” mode. Note that in “high-compute” mode, o3 consumes thousands of dollars’ worth of compute per task… The high score on ARC-AGI doesn’t mean that o3 has reached AGI, since it still fails on some of the easy questions, but it is significantly higher than any other model: o1-preview scored 13%, and the release version of o1 reached 31%.
Finally Video
Sora: A Dream for Creatives
OpenAI’s long-awaited text-to-video model Sora, announced back in February, was finally released this month. After a few post-launch hiccups where users couldn’t register for the service, it is now available to all paying ChatGPT users. The web-based interface at Sora.com has a built-in editor with some nice features - trimming and extending a video, re-mixing it by describing changes, blending it with another video, and creating seamless video loops.
In the 10 months from announcement to release, several competitors launched their own solutions - Runway, Luma Labs, Pika, Kling, Minimax, and others. Sora’s video quality is certainly superior to most of these, but a day after its release Google made a significant move by releasing Veo 2, its own text-to-video model, which clearly beats Sora in side-by-side comparisons. I’m sure OpenAI is already working on Sora 2, and 2025 will definitely be the year when text-to-video matures and reaches the quality of current text-to-image models.
Real-time Video in Advanced Voice Mode
Six months ago I was amazed by two competing demos from OpenAI and Google, which featured users talking to their AI chatbots while showing them a live video feed from the phone’s camera. Now both companies have launched this capability. While Google’s version is currently available only in Google AI Studio, and is not yet part of the Gemini chatbot, OpenAI integrated it directly into ChatGPT as part of Advanced Voice Mode, and it is available on the web and in the desktop and mobile apps. You can share live video not only from the camera, but also from your screen - so you can talk to ChatGPT about websites or apps that you use on your computer or phone. I’ve noticed, however, that the video capability sometimes disappears from Advanced Voice Mode, and other users have reported similar issues - I believe this is due to the load on OpenAI’s servers, since analyzing real-time video takes a lot of GPU resources.
But honestly, I haven’t found a killer use case for this feature. Obviously, it can help people who are visually impaired. But what can you do beyond that? What’s the added value of using the real-time video mode over snapping a photo or screenshot and uploading it to ChatGPT? If you find one, let me know in the comments.
Apple Integrations
ChatGPT Meets Apple Intelligence
Apple announced a few months ago that ChatGPT would be integrated into iOS and macOS, so this one wasn’t really a surprise, but OpenAI elegantly bundled it into their “12 Days of OpenAI” announcement streak. The integration lets you use ChatGPT in Apple’s writing tools to create text or images, and talk to it about places and objects you see using your phone’s camera (similar to OpenAI’s real-time video in Advanced Voice Mode, discussed above).
Enhanced macOS App Connectivity
ChatGPT’s “work with apps” feature was launched a few weeks ago - it lets the ChatGPT desktop app on Mac read text from other apps, which it can then process, enhance, debug, etc. Now OpenAI is expanding “work with apps” to support many more apps, such as Apple Notes, Notion, Quip, Xcode, VS Code, PyCharm, TextEdit, Terminal, and more. Another update is that “work with apps” now supports Advanced Voice Mode, so you can talk to ChatGPT and request edits while viewing text in the supported apps.
Personally, I don’t think this feature is very useful, because ChatGPT copies the text from the app but does not insert the updated text back into it - you need to copy and paste manually.
Power Tools
Canvas: Co-writing with AI
The “Canvas” feature in ChatGPT, which is essentially a built-in text editor, was initially launched in beta and is now fully available to all ChatGPT users. The feature has been significantly enhanced since the beta, and is now activated automatically when ChatGPT creates or edits a long piece of text. Support for Canvas has been extended to Custom GPTs as well, so your custom chatbots will also activate and use this feature when needed.
I find this feature very useful, as it lets me control which parts of the text are modified or enhanced by ChatGPT and which parts are left untouched. The editor has some glitches (for example, cutting paragraphs and pasting them somewhere else is quite buggy), but it’s still a great feature, and the undo/redo controls combined with version history really help a lot.
Another interesting update to Canvas is code execution. Since editing and debugging code is one of the main uses for Canvas, OpenAI has added a complete Python environment that lets you run and debug your code directly within the Canvas. This is currently limited to text-based outputs, and not all libraries are supported, but it is yet another step towards making ChatGPT a complete environment for code development.
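To give a feel for what runs well in this environment, here’s the kind of self-contained, print-only snippet you can execute and debug inside Canvas. This is my own illustrative example, not code from OpenAI’s announcement - anything that sticks to the standard library and prints its results should behave similarly:

```python
# A small, standard-library-only script of the kind you can run
# and debug directly inside Canvas: all output is plain text.

def moving_average(values, window):
    """Return the simple moving average of `values` over `window` samples."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

prices = [100, 102, 101, 105, 110, 108]
averages = moving_average(prices, 3)
print("3-day moving average:", [round(a, 2) for a in averages])
# Prints: 3-day moving average: [101.0, 102.67, 105.33, 107.67]
```

Since the output is text-based for now, print-style debugging like this is the sweet spot; code that depends on unsupported libraries or graphical output won’t run yet.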
Projects: Organize Your Chats
ChatGPT’s chat history was always a mess - an endless chronological list of every session you’ve ever had with the tool, with search functionality added only recently. Now the new Projects feature enables you to group chats together and define system prompts and knowledge files that apply across all the chats in a Project - very similar to Claude’s Projects feature. Projects are particularly beneficial for recurring tasks and deliverables, as they maintain continuity and context across related work.
Summary
I’ve summarized below my take on these recent announcements, along with some recommendations on which tool to use when.
Get a ChatGPT Plus subscription. For just $20 a month you get 5x the usage of the free plan, plus access to all the goodies: ChatGPT Search, Code Interpreter, building Custom GPTs, Advanced Voice Mode with video, the o1 model, and more. If you’re a REALLY heavy user, consider the $200-a-month unlimited ChatGPT Pro tier, but don’t do it just for o1 pro mode - the added value over o1 doesn’t justify the price.
Use the o1 model as your first choice for any task involving complex problem solving, coding, algorithm development, or technical research. Use GPT-4o for creating content and for accessing ChatGPT’s tools such as Search, Canvas, and image creation. Be on the lookout for o3 (or its younger brother o3-mini) for even better coding, math, and reasoning.
Sora is great for basic text-to-video generation, and included at no extra cost if you already have a paid ChatGPT account. But for more professional work, check out Pika Labs’ “Ingredients” feature, which lets you combine your own images, objects, and backgrounds into the generated videos.
Real-time video in Advanced Voice Mode: amazing demo, but solid day-to-day use cases are still missing. This might change when it’s integrated into a wearable device such as glasses (and Google is already testing such a product).
Apple Intelligence is mainly about ease of use and integration - you can do everything it offers, and more, with the AI chatbot apps (ChatGPT, Gemini, Claude, Copilot). ChatGPT’s “work with apps” on macOS will only become truly useful once it gains the ability to insert text from ChatGPT back into the app.
Canvas is great for co-writing with AI - I even used it to review this blog post!
That wraps up my take on the OpenAI December announcements. Can’t wait to see what they have lined up for us in 2025!