Llama 3 is out - Weekly News Roundup - Issue #463
Plus: brand-new, all-electric Atlas; AI Index Report 2024; Microsoft pitched GenAI tools to US military; Humane AI Pin reviews are in; debunking Devin; and more!
Hello and welcome to Weekly News Roundup Issue #463. This was another eventful week in tech with a lot going on.
Meta released their latest open large language model, Llama 3, which will be the main focus of this issue. Apart from that, Google announced new Gemini-powered products at Google Cloud Next 2024. The Humane AI Pin is out, and the reviews are not good. Boston Dynamics surprised everyone by retiring the hydraulic-powered Atlas and revealing the new, all-electric Atlas.
This news and more will be covered in this week’s roundup. I hope you enjoy it!
Meta has released Llama 3, a family of open-weights large language models and the successor to Llama 2. Llama 3 comes in three sizes: the 8B and 70B parameter models are available now, while the largest model in the Llama 3 family, the 400B+ model, is still being trained and will be released later.
In terms of performance, the Llama 3 models do not disappoint, at least according to the benchmark results published by Meta. These benchmarks show Llama 3 to be one of the best, if not the best, open large language models. Llama 3 8B surpasses Google's Gemma 7B and Mistral 7B models, while Llama 3 70B is on par with, if not slightly better than, Google's Gemini Pro 1.5 and Anthropic's Claude 3 Sonnet. LMSYS Chatbot Arena, which compares large language models based on human preferences, lists Llama 3 8B in the same neighbourhood as Mixtral 8x22B, Mistral Medium and Command R models, all of which are much bigger than Llama 3 8B. Meanwhile, Llama 3 70B is ranked 7th, alongside models such as GPT-4, Claude 3 Sonnet, and Command R+.
In the post announcing the new models, Meta stated that the Llama 3 models were pretrained on over 15 trillion tokens collected from publicly available sources. The training dataset is seven times larger than that used for Llama 2 and includes four times more code, so the quality of the answers should be higher.
The overall picture that emerges from these benchmarks is that Llama 3 8B is the best model in its category of small models, while Llama 3 70B is the best open language model in the GPT-3.5 category and overall the best open-weights model. And there is still the 400B+ model to be released, which might be aiming for the top of the leaderboards. Meta gave us a sneak peek into the performance level of the still-in-training Llama 3 400B+ model, and the numbers they report are roughly at the same level as the best models from OpenAI, Google, and Anthropic.
However, let’s keep in mind that these are numbers published by Meta. Many developers are now downloading and testing the new Llama 3 models to see what they are actually capable of and how they compare to other open and proprietary models. We should have a better picture once the results from those experiments are published.
Meta promises to release updated versions of the Llama 3 models in the near future. Currently, the Llama 3 models are text-only, with an 8,192-token context window. Meta plans to add multimodality, the ability to converse in multiple languages (Llama 3 currently only supports English), a much longer context window, and stronger overall capabilities, including improved reasoning and the ability to execute multi-step plans.
The Llama 3 8B and 70B models are available on Hugging Face and will soon arrive on AWS, Databricks, Google Cloud, Microsoft Azure, NVIDIA NIM, and Snowflake. Llama 3 can also be downloaded directly from Meta, but that method requires signing in. If you want to run Llama 3 on your own computer, the easiest way to get it and start experimenting is to use a tool such as Ollama.
But if you don’t want to download and set up Llama 3 on your computer, you can try it out in Meta AI, Meta’s new AI assistant (assuming you are in one of the supported countries). Llama 3 will also be used internally by Meta to power new AI features coming soon to Facebook, Messenger, WhatsApp and Instagram, exposing billions of people to AI chatbots and generative AI tools.
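For those who do go the local route with Ollama, here is a minimal sketch of what talking to the model from Python could look like. It assumes Ollama is installed and running on its default local port, and that the model has already been pulled under the name `llama3`. The helper names `build_request` and `ask` are made up for this illustration and are not part of any library.

```python
import json
import urllib.request

# Assumptions: a local Ollama server on its default port, and a model
# pulled beforehand under the name "llama3" (e.g. `ollama pull llama3`).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming text generation request for the local server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "llama3") -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Example call (needs the local server to be running):
# print(ask("Explain what an open-weights model is in one sentence."))
```

Setting `stream` to `False` asks for a single JSON reply rather than a stream of partial chunks, which keeps the example short.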
In conclusion, Meta, the unlikely champion of open-weight models, has delivered two models offering top performance in their respective categories. That is good news for the AI and open-source communities, which now have access to two more very capable models to experiment with and to build new AI-powered applications on. But the release of Llama 3 has also shown that the gaps between top models have become smaller in 2024. We are no longer in a situation where GPT-4 enjoys a massive advantage. The competitors have caught up, and free models with open weights offering similar levels of performance are emerging.
Now let’s see what OpenAI is going to bring to the table.
If you enjoy this post, please click the ❤️ button or share it.
Do you like my work? Consider becoming a paying subscriber to support it
For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is a generous support towards the work put into this newsletter.
Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.
🦾 More than a human
Pea-sized brain implant could treat depression and more
Engineers have created a pea-sized implantable brain stimulator that can wirelessly stimulate specific brain areas to treat psychiatric and neurological disorders, such as drug-resistant depression. Designed to sit beneath the skin on top of the skull, the device utilizes magnetoelectric power transfer technology, enabling it to operate without internal batteries and eliminating the need for invasive surgeries associated with traditional brain stimulation devices. Having been successfully tested in humans and animals, researchers are now pursuing FDA approval to expand clinical trials.
What Neuralink Is Missing
This article points out an important thing about brain-computer interfaces (BCIs) that could be missed in the excitement surrounding this technology. It’s not only about making a safe, functional device and obtaining approval from regulators like the FDA. It’s also about convincing insurance companies that BCIs are worth the expense and making these devices affordable and accessible for those who need them most.
Rejuvenating the Blood Cell Population
Scientists have discovered a link between the ratio of certain types of blood cells and aging. After restoring the ratio in older mice to resemble the blood composition of younger mice, they found that these older mice were better at fighting infections and had reduced inflammation levels. The next question is whether this method could also work in humans.
🧠 Artificial Intelligence
Microsoft Pitched OpenAI’s DALL-E as Battlefield Tool for U.S. Military
The Intercept reveals that Microsoft approached the US Department of Defense proposing the use of OpenAI’s text-to-image generator, DALL·E, amongst other OpenAI models, to help build software to execute military operations. The article shares the slide deck Microsoft used to pitch how generative AI tools like GPT-4, DALL·E or Codex could be used by the US military in various applications, such as generating training data or image analysis.
Humane AI Pin reviews are in… and they are not positive
The Humane AI Pin, one of the first devices in the new class built around AI, is now available, and so are the reviews. The feedback is not good. MKBHD called the Humane AI Pin the worst product he has ever reviewed. The Verge said the device, which costs $700 plus an additional $25 per month subscription, just does not work, while Wired concludes that the Humane AI Pin is “too bare-bones and not all that useful”. The Rabbit R1 is still set to be released soon, so it will be interesting to see what it brings to the table, but I'm not getting my hopes up.
Google announced a suite of new AI tools and services at the Google Cloud Next 2024 event, many of which are powered by Gemini 1.5 Pro, now in public preview. Also, Imagen 2.0 is now generally available in Vertex AI. Gemini for Google Workspace introduces a bunch of tools aiming to increase productivity. One of those newly introduced tools is Google Vids, an AI video creation tool designed to make video production much easier. For engineers and developers, Gemini will power new coding assistance tools as well as tools for creating cloud infrastructure and improving cybersecurity. Google also announced new AI-optimized hardware for Google Cloud, including machines equipped with Nvidia’s Blackwell GPUs, coming in early 2025. What piqued my interest is Vertex AI Agent Builder, a tool for creating AI agents for various tasks, from interacting with customers to automating jobs. The full, one-hour-long opening keynote is available on YouTube. Google also published a good summary on the Google Cloud blog.
Artificial Intelligence Index Report 2024
The seventh edition of the AI Index report is out. It is one of the best overviews of the AI landscape, covering essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development. The full 500-page report is available here. If you don’t have time to read it, the authors also published 10 top takeaways summarising the report.
Microsoft and G42 partner to accelerate AI innovation in UAE and beyond
Microsoft announced a $1.5 billion investment into G42, an AI company based in Abu Dhabi, United Arab Emirates. The investment gives Microsoft a minority stake in the company. In a statement announcing the investment, Microsoft said this strategic investment will “strengthen the two companies’ collaboration on bringing the latest Microsoft AI technologies and skilling initiatives to the UAE and other countries around the world”. Part of that plan is a $1 billion fund to boost AI skills in the region.
DeepMind CEO Says Google Will Spend More Than $100 Billion on AI
Microsoft is heavily investing in AI across all fronts. The company has invested over $13 billion in OpenAI, is funnelling billions into other AI startups, and is in the process of building Stargate—a $100 billion supercomputer for AI. When asked about the Stargate supercomputer, DeepMind CEO Demis Hassabis responded by saying that Google will outspend Microsoft, investing more than $100 billion in various AI projects.
Texas is replacing thousands of human exam graders with AI
The state of Texas is testing an automated scoring engine to grade open-ended questions on the State of Texas Assessments of Academic Readiness (STAAR) exams. The Texas Education Agency (TEA) expects the new AI system to save $15-20 million per year by reducing the need for temporary human scorers, whose number will be cut from 6,000 to under 2,000. To catch any issues with the system, a quarter of all computer-graded results will be rescored by humans. However, some educators do not share the TEA's optimism about the AI scoring system.
▶️ Debunking Devin: "First AI Software Engineer" Upwork lie exposed! (25:15)
A month ago, Cognition Labs released Devin, the 'First AI Software Engineer.' The company claimed that Devin had successfully passed technical tests from leading AI companies and had completed real jobs on Upwork. This video focuses on the latter claim, exposing in detail how the video showing Devin performing the task was manipulated to make the AI appear more capable than it actually is.
🤖 Robotics
This week, Boston Dynamics bid farewell to the hydraulic-powered Atlas and celebrated its 11 years of falls, jumps, dances and everything else that Atlas achieved. A day later, Boston Dynamics unveiled the brand-new, all-electric Atlas. The way the new Atlas introduced itself to the world might have come across as creepy, but what Boston Dynamics showed is way ahead of everyone else.
Boston Dynamics’ Robert Playter on the New Atlas
IEEE Spectrum sat down with Boston Dynamics CEO Robert Playter to ask some questions about the recently unveiled all-electric Atlas. Playter shared insights on the development of the new Atlas and the transition from a hydraulic-powered humanoid robot to an all-electric machine. He also discussed what’s next for the new Atlas, including plans for the robot's commercialisation.
▶️ LASSIE - a robot-dog for the Moon (5:15)
Researchers from NASA and a couple of US universities experiment with the idea of using robot dogs to explore the surface of the Moon. While no robot dog is scheduled to go to the Moon in the near future, researchers are currently testing how these robots handle difficult terrain on Earth in challenging environments that, in one way or another, mimic what such a robot might encounter on the Moon's surface.
▶️ Making a Dog-Sized Furby Robot (and taking it on a walk) (23:01)
We live in a future where anybody can buy a robot dog and turn it into a Furby monstrosity. I love it.
Tiny AI-trained robots demonstrate remarkable soccer skills
These tiny robots, standing just 5 cm tall, have learned to play football (or soccer) all on their own. Using deep reinforcement learning, researchers from DeepMind had the robots figure out how to stand up, run and kick the ball in a simulated environment. After countless rounds of trial and error, the robots learned to play football quite well in one-versus-one games.
🧬 Biotechnology
The Wizardry and Prophecy of DNA Printing
This article proposes an interesting take on the field of DNA printing and synthetic biology in general. In any fast-moving, high-impact field, people can be grouped into two classes - wizards (people who encourage ingenuity and experimentation even if risky) and prophets (people who warn us about the unintended consequences or misuse of new technologies). But to create the biopunk future, the article proposes that we need a third class - paladins, or people who hear prophetic fears yet adopt a wizardly mindset to fiercely innovate protective technology.
💡Tangents
When Bacteria Are Beautiful
Can bacteria be beautiful? According to microbiologist Tal Danino, the answer is yes. To prove it, he has been photographing various bacterial strains. The result is a series of photographs highlighting the beauty of the shapes these tiny organisms create.
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"