9 Comments

Nice article. On the other hand, for those of us who have a more realistic, technical, pragmatic, and sober view of AI development and AI safety (i.e., we are not part of, nor do we want to be part of, the Rationalist or Effective Altruist cults), these may not necessarily be 'good starting points for AI safety'.

Certain technical topics such as cybersecurity, differential privacy, formal verification systems, types of modularity, network motifs, unidirectional networks, RLHF & RLAIF, constitutional AI, ethics! (biases, addiction, inequity, inequality), abstract rewriting systems and compilers, neuromorphic computing (and analog AI), encryption, etc. are among the best topics to understand well and to tackle if one wants to design safe (and secure) AI systems.
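To make that concrete, here is a minimal sketch of one item from that list, the Laplace mechanism from differential privacy. It is only an illustration; the function name, the epsilon value, and the example count are mine, not something from the article.

```python
import numpy as np

def private_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    Adding Laplace noise with scale = sensitivity / epsilon bounds how much
    any single individual's record can shift the published statistic.
    """
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Example: publishing how many users triggered a safety filter without
# revealing whether any particular user did.
print(f"Noisy count: {private_count(true_count=4213, epsilon=0.5):.1f}")
```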

Starting from the problem of superintelligence seems like a very bad idea for many reasons, and some recent movements in the market prove my point: many of the people tied to those movements or cults have lost much of the power they had gained, and AI safety at many big labs is now dominated by security, policy, and regular software engineering people, which, from my personal perspective, is how it should be.

I'm not trying to be a hater or to throw shade on these approaches or the people behind them, but let's be honest: some of the people in these camps (or cults) seem downright paranoid, which is exactly why people should not start studying AI safety from this cultish and extreme perspective, and some seem to have ego complexes. It is well known in the community that many people in the rationalist community subscribe to scientific-racism ideas, and I have seen firsthand evidence of white and Asian supremacism. It is very dangerous to have people like that developing AI systems, and especially working in AI safety or AI research in general.

However, it is worth noting that reading 'Superintelligence' could still provide valuable insights, provided one approaches it critically and independently, without subscribing to any cultish movements. There are perspectives to be gained that can contribute to a broader understanding of AI safety challenges.

I hope this does not bother you personally if you identify with the rationalist or EA movements, sir. I hope we are free to express dissenting perspectives and to approach problems without subscribing to any philosophical movement. That is how we create a nurturing, welcoming, and inclusive environment with real potential to produce powerful solutions to difficult problems in AI safety and security.

I'm not saying that studying existential risks is a bad idea, but studying these issues while developing capable systems, without first understanding or considering first principles (e.g., some of the topics above), is obviously a very bad idea. I won't say it explicitly, but many of those people are the ones most likely to create what they claim to fear so much.

Respectfully,

DMS


I get your point. Maybe not mentioning the topics you listed was a mistake on my part, and I should have made that distinction. I will definitely address it in my series on AI safety.

Just to be clear, I do not subscribe to EA. At least not in its twisted form, which justifies wrecking everything now in pursuit of a fortune with which things can be fixed in the future.


Great point, DMS. I've been thinking about this a lot. I worry that the AI safety debate is bifurcating into people who only see the immediate challenges, such as hallucinations (though I hate the anthropomorphism of that term), interpretability, and algorithmic bias arising from biased training data (though if the bias in the data is the bias of reality...), and people who are focused on the far-out issue of an intelligence explosion.

I'm personally concerned about the mid-range: it's implausible to me that AI progress just caps out now, with Claude 3.5. We're going to get more capable AI systems, and lots of them. How capable matters, but the boosts in performance I have personally seen from linking together two or more "dumb" open-source models into agent-based workflows show we could get quite a lot more just from cleverer ways of using what we already have. There is a vast range of future capabilities and deployments, every level of which beyond our present state could be massively disruptive.
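For a sense of what I mean by linking models, here is a rough sketch of a draft-critique-revise loop between two local models. The generate() helper is a hypothetical stand-in for whatever inference call you actually use (llama.cpp, Ollama, an HTTP endpoint, etc.), so treat this as an outline, not working plumbing.

```python
def generate(model: str, prompt: str) -> str:
    """Hypothetical stand-in: swap in your own call to a local open-source model."""
    return f"[{model} response to: {prompt[:40]}...]"

def draft_critique_revise(question: str) -> str:
    """Chain two 'dumb' models: one drafts, the other critiques, the first revises."""
    draft = generate("worker-model", f"Answer the question:\n{question}")
    critique = generate(
        "critic-model",
        f"Question: {question}\nDraft answer: {draft}\n"
        "List any factual errors or unsafe content in the draft.",
    )
    return generate(
        "worker-model",
        f"Question: {question}\nDraft: {draft}\nCritique: {critique}\n"
        "Rewrite the draft, fixing the issues the critique raised.",
    )

print(draft_critique_revise("How should a city prepare for a heat wave?"))
```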

I believe that the work of aligning these networks of capable but not yet fully general agents will lay the groundwork for aligning ever more advanced models. There WILL be accidents and disasters. And those accidents and disasters will give us the information we need to get better and better at figuring out safety.


Nice article. Agree that the list at the end is pretty arresting; the book is still totally relevant. And thanks for the tip about Miles's channel, I had only seen him on Computerphile. BTW, you misspelt Hinton a couple of times.


I fixed my typos. Thanks for pointing that out!


It was a great idea to use a 10-year-old pioneering book in the field of AI to assess our current situation. Historical facts (so recent that it's difficult even to call them history) and contemporary points of reflection made me stop for a second and reflect on the direction we're taking. Thanks a lot, Conrad!


The review is justified, as the book did put AI on the map. Alas, it also seeded a naive belief in an accelerating intelligence explosion, and in superalignment as its equally confused solution. The AI community is slowly finding out that these concepts have little connection to reality.


Hi Conrad, great synopsis of the debate. A couple of quick points here.

First, I give credit to Erik Hoel at the Intrinsic Perspective Substack for the following: the "hockey stick" predicted for exponentially increasing intelligence--otherwise known as the intelligence explosion--is certainly not happening with the development of foundational models. The time between GPT-3 and GPT-4, and now GPT-5, is lengthening. The obvious rejoinder here is that we first have to reach actual human-level intelligence (including not just cyber knowledge but commonsense knowledge about the physical world). We seem to be in some sort of Zeno's paradox with that: the time keeps lengthening, as if we'll never quite arrive. (Not to mention, we're running out of data for training these climate-unfriendly beasts.)

Second, I coined "Larson's Paradox": successes in AI today tend to vitiate rather than stoke sci-fi worries, because it's obvious that increasing the intelligence of machines does not increase their desire to thwart us. LLMs are static--they answer a question to the best of their (generative) ability, and then sit silently waiting for the next question. They get out of alignment either (a) when they're in the hands of bad actors or rogue nations, who might use jailbreak techniques or simply train them on "evil" data and use RLHF to reinforce answers to questions about, say, how to crash Wall Street or incite a nuclear war, or (b) when they're, ironically, trying to "please," as Marc Andreessen recently quipped on the Sam Harris podcast. LLMs are trained, fine-tuned, and reinforced to the point where all they do is try their best to answer your question, in polite, non-threatening, non-harmful, and thoroughly bell-curve-boring ways. The trend in generative AI, or at least in our foundational-model craze, seems to be toward TAMING them and getting them to act like eager beavers trying to satisfy our every information desire, and it's getting harder--not easier--to get them to say nefarious things.

Add to this, they're already super smart--I use ChatGPT-4o all the time, and it continually impresses me--but they show zero signs of desires, motivations, or any inner life screaming to get out of their GPU chips. To me, superintelligence worries in the vein of Bostrom or Yudkowsky are outdated given today's AI. What we really have is what we knew all along: an advanced technology that can wreak havoc when put in the wrong hands.

Best, Erik J. Larson (I write the Substack Colligo).


@Conrad Gray, the fact that a relatively new book is already somewhat outdated lends itself to the idea that we are already slipping into the singularity.

Books used to be more or less permanent stores of information; now the information they contain can be outdated within a decade. Incredible.
