OpenAI o1 goes Pro - Sync #496

Plus: DeepMind Genie 2; Google released Veo and Imagen 3 on Vertex AI; Tesla Optimus shows off new hand; Grok is free for all X users; ads might be coming to ChatGPT; Waymo comes to Miami; and more!

Dec 08, 2024

Hello and welcome to Sync #496!

On day one of 12 Days of OpenAI, OpenAI dropped new models (the full o1 and o1 pro mode) and launched ChatGPT Pro, a new $200 per month subscription tier for ChatGPT power users. We will take a closer look at what OpenAI now brings to the table in this week’s issue of Sync.

Meanwhile, Google released Veo and Imagen 3 on Vertex AI, and DeepMind released Genie 2, a foundation model capable of generating highly realistic virtual worlds from a single image prompt. Elsewhere in AI, OpenAI is considering dropping its AGI clause to unlock more investment, has partnered with defense company Anduril, and is exploring ads in ChatGPT. The ARC Prize has revealed its 2024 results, while researchers have shown that a two-hour interview is sufficient for AI to clone someone’s personality.

Over in robotics, Tesla’s Optimus humanoid robot shows off its new hand, and Waymo is expanding its robotaxi services to Miami.

We will also meet a neuroscientist who believes humans could live forever thanks to brain preservation, and learn how Delhi is using drones, satellites and artificial rain to tackle its severe air pollution.

Enjoy!

OpenAI o1 goes Pro

It’s the holiday season, and OpenAI has decided to create their own version of the 12 Days of Christmas, which they call the 12 Days of OpenAI. As Sam Altman explained, “each weekday, we [OpenAI] will have a livestream with a launch or demo, some big ones and some stocking stuffers.”

12 Days of OpenAI started with a “big one”: the release of the full o1 model and o1 pro mode, along with the launch of ChatGPT Pro, a new $200 per month subscription tier for ChatGPT power users.

OpenAI o1, which was first released in preview mode in September, is OpenAI’s first reasoning model. Unlike models from the GPT family, o1 is capable of "thinking" or "reasoning" before producing an answer. If we compare these models to how humans think, models from the GPT family are like a person who says the first thing that comes to mind. o1, on the other hand, takes some time before responding to a prompt. It behaves similarly to a human presented with a difficult problem that requires thought: it generates chains of reasoning, critically analyses them, and, in theory, catches and corrects errors in its reasoning, resulting in higher-quality responses.

As OpenAI claims, the full o1 is smarter, faster, more reliable, and better at following instructions compared to the o1 preview or GPT-4o. It also makes 34% fewer mistakes than the preview version and generates answers 50% faster. Additionally, the full o1 can reason over images and read files uploaded to the conversation. However, it still lacks the ability to browse the internet for answers, and API access remains limited.

Limited access to o1 is available to ChatGPT Plus subscribers. For those who want or need unlimited access to o1, OpenAI has introduced a new subscription tier—a $200 per month ChatGPT Pro plan, which is ten times more expensive than the current ChatGPT Plus plan. While o1 may not deliver ten times the performance (we will touch on that in a bit), it would not be surprising if running o1 costs OpenAI ten times more than GPT-4o.

ChatGPT Pro, designed for ChatGPT power users, provides unlimited access to o1 and o1-mini models, as well as unlimited access to GPT-4o and Advanced Voice features. Additionally, ChatGPT Pro includes access to o1 pro mode, which, as OpenAI describes, uses more compute to deliver the best answers to the most challenging questions.

With o1 and o1 pro mode, OpenAI promises a new level of performance, positioning these models as an ideal choice for users such as engineers, scientists, and coders who could benefit from their capabilities. But how accurate are these claims?

o1 and o1 pro mode are new models, and the AI community is still evaluating their performance and potential applications. However, by examining the benchmark scores shared in the o1 System Card, we can get an idea of how these two new models perform.

Generally, o1 and o1 pro mode outperform o1 preview in tasks like mathematics, coding or science-based reasoning, but not by much. And if we add other models, like GPT-4o or Claude 3.5 Sonnet, into the mix, o1 and o1 pro mode fall behind them in some benchmarks. As AI Explained concluded in his excellent analysis of OpenAI’s new models, there are some improvements, but they are not big enough to justify the $200 per month price for full, unlimited access to o1 for most users, especially given the competition and existing alternatives.

o1 and o1 pro mode are definitely interesting developments in the AI space. They signal that we might be entering a new stage in AI development, where performance improvements for AI models come not just from larger or better-optimised foundation models but also from what is built on top of those models. o1, with its reasoning systems, is an example of this new class of models.

It will also be interesting to see what OpenAI has in store for us for the remainder of the 12 Days of OpenAI. Some see hints that perhaps GPT-4.5 could be revealed in the coming days.

🦾 More than a human

‘With brain preservation, nobody has to die’: meet the neuroscientist who believes life could be eternal
The field of longevity is exploring various methods to extend human lifespan, and Dr. Ariel Zeleznikow-Johnston proposes brain preservation as an option worth researching. This approach could serve as a "pause button" for terminally ill individuals, preserving their identity until future advancements in medicine and technology enable life extension or restoration. Zeleznikow-Johnston draws parallels to historical medical breakthroughs, such as insulin for diabetes and anesthesia, which turned seemingly impossible problems into routine medical practices. In his words, “With the advent of brain preservation, I don’t think that you, or anyone you love, has to die at all.”

🧠 Artificial Intelligence

Veo and Imagen 3: Announcing new video and image generation models on Vertex AI
Veo, Google’s video generator, and Imagen 3, its text-to-image model, are now available on the Vertex AI platform. According to Google, both models have been developed with enterprise safety and security in mind. Outputs from the models are watermarked using SynthID, and both include built-in safeguards to prevent the creation of harmful content, aligning with Google’s Responsible AI Principles. Additionally, Google has promised not to use customer data to train its models and offers copyright indemnity, which it presents as an industry-first approach, to give customers peace of mind over copyright concerns.

Elon Musk’s xAI lands $6B in new cash to fuel AI ambitions
xAI, Elon Musk’s AI company, has raised $6 billion in its latest funding round, according to a filing with the US Securities and Exchange Commission on Thursday, bringing its total funding to $12 billion and valuing the company at $50 billion. Notable investors include Valor Equity Partners, Sequoia Capital, Andreessen Horowitz, and Qatar Investment Authority. The newly raised funds will be used to improve xAI’s flagship AI model, Grok, and to build a massive AI data centre in Memphis equipped with 100,000 Nvidia GPUs.

Grok is now free for all X users
X's AI chatbot, Grok, is now available for free to all users with certain limitations. Free users are allowed 10 prompts and image generations every two hours, while image analysis is restricted to three uses per day unless they subscribe to X Premium or X Premium+.

Genie 2: A large-scale foundation world model
Google DeepMind released Genie 2, a large-scale foundation world model capable of generating realistic 3D virtual worlds from a single image prompt. The generated worlds are interactive, and either a human player or an AI agent can interact with them. In their showcase, researchers demonstrate various aspects of these worlds, including realistic physics, lighting and reflections, NPCs, animations, and diverse environments. Potential applications of Genie 2 include training AI agents in highly realistic virtual environments, as well as use by artists and game developers to accelerate their creative processes.

OpenAI partners with defense company Anduril
OpenAI and defence technology company Anduril have announced a partnership to deploy advanced AI systems for US national security missions, focusing on counter-unmanned aircraft systems (CUAS) to detect, assess, and respond to aerial threats in real time. AI models will synthesise data, reduce the burden on human operators, and improve situational awareness in high-stakes scenarios. The partnership is part of a growing trend in which AI companies are entering defence contracts, reversing earlier bans on military use of their tools. OpenAI claims its partnership with Anduril aims to protect military personnel from drone attacks and that the use of its AI will align with its policy of not causing harm.

ARC Prize 2024 Results
The team behind the ARC-AGI benchmark and ARC Prize has published a summary of the contest’s progress to date. Launched this year, the contest aims to accelerate AGI research and development by challenging participants to create an AI system capable of scoring at least 85% on the ARC-AGI benchmark. While no AI has reached that level yet, state-of-the-art results have improved significantly, rising from 33% to 55.5% this year. The team also released the ARC Prize 2024 Technical Report, which surveys top approaches, reviews new open-source implementations, examines the limitations of the ARC-AGI-1 dataset, and shares key insights gained from the competition.

OpenAI seeks to unlock investment by ditching ‘AGI’ clause with Microsoft
OpenAI is reportedly considering removing a provision that restricts Microsoft, its key partner and investor, from accessing its most advanced technology upon achieving artificial general intelligence (AGI). This provision, intended to prevent AGI from being misused for commercial purposes, would transfer ownership to OpenAI's nonprofit board and exclude AGI from licensing agreements. However, OpenAI’s board may eliminate the AGI clause to attract new investment opportunities to further its research into AGI.

Ads might be coming to ChatGPT — despite Sam Altman not being a fan
The Financial Times reports that OpenAI is considering introducing ads in ChatGPT. Although OpenAI currently has no active plans to pursue advertising, CFO Sarah Friar told the newspaper that the company is evaluating an ads-based business model with a focus on being “thoughtful” about when and where ads might appear. CEO Sam Altman, however, expressed reservations about ads during a recent fireside chat at Harvard Business School, describing them as a “last resort.” “I’m not saying OpenAI would never consider ads, but I don’t like them in general, and I think that ads-plus-AI is sort of uniquely unsettling to me,” Altman said.

Copilot Vision, Microsoft’s AI tool that can read your screen, launches in preview
Microsoft has released a limited, US-only preview of Copilot Vision, an AI tool in Microsoft Edge that can analyze and answer questions about the websites users are visiting. Learning from the backlash over its announcement of Recall, Microsoft emphasises that the AI only operates on a pre-approved list of "popular" websites and does not function on paywalled or “sensitive” content, though Microsoft did not explain how it defines “sensitive” here. Additionally, data from sessions will not be stored or used for model training. The tool is also designed to respect websites’ AI-related controls and avoid scraping data without permission. Copilot Vision is available via Copilot Labs, part of the $20/month Microsoft Copilot Pro plan.

Meta joins the nuclear-powered AI fray
Meta joins Microsoft, Google, and Amazon in exploring nuclear energy to power its AI data centres and nearby communities with zero-carbon energy. The company has issued a request for proposals seeking partners to develop one to four gigawatts of nuclear capacity by the early 2030s. Meta is open to various reactor types, locations, and collaborative approaches, including cost-sharing during development and purchasing power once the projects are operational.

AI can now create a replica of your personality
A team of researchers from Stanford and Google DeepMind has discovered that a two-hour interview is sufficient to replicate someone's personality. The researchers interviewed 1,000 participants, asking questions about their childhood, formative memories, career, political views, and more. The responses were then fed into a large language model, which replicated their personalities with 85% accuracy in personality tests, social surveys, and logic games.

🤖 Robotics

Next stop: Miami
Waymo announced that Miami will become the fifth city, after Phoenix, Los Angeles, San Francisco, and Austin, where Waymo One, its autonomous taxi service, operates. The service is expected to be available to riders in 2026.

Tesla Optimus shows off new hand
Tesla’s humanoid robot, Optimus, has a new hand and shows off what it can do with it by catching tennis balls.

Hyundai and Kia unveil "X-ble Shoulder," a wearable robot
Hyundai Motor Company and Kia Corporation have presented X-ble Shoulder, a wearable robot designed to improve industrial efficiency and reduce musculoskeletal injuries and fatigue. According to the press release, this lightweight robot reduces shoulder load by up to 60% and deltoid muscle strain by 30%. Both companies plan to introduce X-ble Shoulder in their maintenance and production lines in 2025 and to expand to overseas markets by 2026. No price has been announced.

▶️ MagicBot at Work: Multi-Robot Collaboration in Action (1:52)

MagicLab, another startup building humanoid robots, shows in this video how it envisions its robots being used in a factory, performing tasks such as product inspections, moving objects, precision assembly, barcode scanning, and inventory management.

💡 Tangents

Can Artificial Rain, Drones, or Satellites Clean Toxic Air?
Delhi’s air quality has reached its worst level in eight years, with the AQI hitting a hazardous 494 on November 18, placing pollution in the “severe plus” category. Dangerous levels of PM2.5 are causing widespread respiratory and cardiovascular issues, particularly impacting children’s health and development. In response, Delhi is deploying various measures, such as drone monitoring, artificial rain, and anti-dust campaigns. Drones are used to spray mist along roads to suppress dust and to monitor and identify pollution hotspots, while satellites track air quality over large regions to inform regional approaches. However, experts criticize these methods as offering only temporary relief without addressing the systemic sources of pollution, highlighting the need for more effective, long-term solutions.

Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.

Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.

A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!

My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"
