The Thinking Game: Move 37 and the Intelligence Deception
I finished watching The Thinking Game the other night, and I thought it would be nice to put down on paper the feelings it aroused in me.
Spoiler alert: throughout the article I will inevitably refer to parts of the documentary, so if you are like me and hate spoilers, please come back only after watching it. All the ideas you will find below are my own; they are not dogma, and I take full responsibility for what I am about to say.
Propaganda or Vision?
Back to the point… I am baffled. The documentary is certainly very interesting, and so is the figure of Demis. A chess prodigy, at 12 years old he finds himself losing a game in a very tough tournament: worn out by fatigue, he does not realize that the game is headed for a draw, and he resigns. His opponent bursts out laughing and starts mocking him, which makes him feel terrible and raises the question that will guide him for the rest of his life: “But if all these brains could unite and focus on one goal, important problems, like cancer, would probably be solved. We are wasting our energy here”.
Everything is clearly staged to set the tone for DeepMind and give it an aura of superiority over its competitors, starting with the corporate mission itself. The others, the “bad guys”, forced us to work on LLMs for the benefit of private individuals and companies, but our true goal is absolutely not to earn as much as possible and collect more user data than has ever been possible… After all, Google has never done anything shady… Of course not, our holy grail is the achievement of so-called AGI.
And speaking of AGI… Does it really make sense to use this term? It refers to a general intelligence able to range across every field and match or exceed human capabilities in each. Achievable? Probably yes, although almost certainly not through the Transformer architecture of current LLMs, no matter how much scale and compute continue to be thrown into the furnace.
Desirable? Hmm, maybe.
The documentary, predictably, polishes the noble side of the coin. It talks about curing incurable diseases, about protein mappings that would have taken millennia being completed in a few hours. But the specters do emerge, if only superficially. When it comes to AlphaFold, risks such as data poisoning or data breaches are mentioned only in passing. But the real question, the one that leaves a bitter aftertaste, is another: now that these protein databases are openly available, we will surely reap immense benefits from the good side of the coin, but what prevents, for example, a state or a malicious actor from synthesizing a custom pathogen? From creating a new pandemic in the laboratory?
After all, Demis Hassabis really does have something in common with Oppenheimer: he is the creator of a technology that, as usual, can be seen both as an instrument of salvation and hope and as a weapon of mass destruction. And what about the final question posed by the documentary? “How do you control forever something more powerful and infinitely more intelligent than us?”.
The Move 37 Syndrome
It seems trivial, but it is not. If the hoped-for level of intelligence were truly reached (not to be confused with the admittedly excellent results of today’s models, which are nothing more than “stochastic parrots”), what would prevent it from taking control over us?
After all, as human beings our way of thinking is extremely limited. We confine ourselves to our own schemes and shared patterns, and we rarely manage to abstract anything of a truly higher caliber (those who succeed are labeled geniuses).
Yet let’s look at AlphaGo and its match against the Korean champion Lee Sedol, which is shown in the documentary. At a certain point AlphaGo plays the famous “move 37”, a move that it itself estimates to have “a 1 in 10,000 chance of being made by a human”. Statistically next to nothing, almost impossible. And what actually happens? The commentators mock the machine for making such a “stupid” play, while the champion starts to sense that something is wrong, so much so that he leaves the room to recover from the strangeness of a move he really did not understand.
Final outcome? Move 37 proves decisive in that game, and the match ends in a resounding defeat for the Korean champion.
NB: at this point the documentary asserts that this AlphaGo episode was a real Sputnik moment for China. Here too, an understandable but debatable opinion, since no one really knows what is going on in China or what its current firepower in tech is, but let’s take it as true.
Red Teaming an Alien Entity: Technical Analysis
But let’s now take the logic of move 37 and apply it to our world, that of security. The impact is absolutely devastating. It means that the entire defensive castle we have erected (NIST, Zero Trust, EDR, SIEM…) rests on a wrong and naive assumption: that the attacker reasons like a human or uses tools written by humans.
An AI has a kind of lateral thinking that we humans simply cannot follow: we do not have enough “computing power” to keep track of so many combinations simultaneously.
Our fundamental error is that we focus on “Vulnerabilities” (CVE, Patching, Bugs). We look for broken pieces. The AI, on the other hand, will look for pieces that work perfectly on their own but create a disaster when combined.
Let’s think about the classic Vulnerability Assessment. We humans spend our days scanning servers for the critical vulnerability (CVSS 9.0 to 10.0) so that we can patch it immediately. We feel safe because “we closed the holes”. The move 37 of security ignores vulnerabilities completely. The AI knows that exploiting a known vulnerability is noisy: it is detected by EDRs and leaves traces in logs. That is a “human” move.
Here is what it would do instead:
- Graph Analysis (not code): The AI maps relationships in Active Directory or Entra ID. It doesn’t look for who is Domain Admin, it looks for who has “strange” permissions.
- The Invisible Concatenation: It finds an intern’s account that, due to a configuration error made three years ago, has permission to reset the password of an old Service Account.
- The Legitimate Pivot: That Service Account is not an administrator, but it has permission to “write” to the Group Policies of the developer workstations.
- The Attack: The AI uses the intern’s account to take over the Service Account, then modifies a policy to add a “Scheduled Task” on the developers’ PCs. When the developers (who have privileged access to the Cloud) log in, the task executes the AI’s command.
Result?
- Vulnerability Scanner: Green. No software was unpatched.
- Antivirus/EDR: Silent. Resetting a password is a legitimate action. Modifying a GPO is a legitimate action (if you have permissions). Creating a task is legitimate.
- SIEM: Sees a series of authorized events.
The AI didn’t “hack” the system by breaking the code. It used our own IT bureaucracy (permissions, policies, ACLs) against us. It found a road (Attack Path) that was invisible to us because it was scattered across millions of logs, but that for the AI was a highway lit up as bright as day.
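To make the chain above less abstract, here is a minimal sketch of how such an Attack Path could be found with a plain breadth-first search over a permission graph. Everything in it is an assumption made for illustration: the node names (`intern`, `svc_legacy`, and so on), the edge labels and the tiny graph itself are invented, and a real directory would have to be exported into a structure like this first.

```python
from collections import deque

# Hypothetical permission graph, invented for illustration.
# Every edge is a legitimate right, not a vulnerability.
EDGES = {
    "intern": [("svc_legacy", "ResetPassword")],
    "svc_legacy": [("gpo_dev_workstations", "WriteGPO")],
    "gpo_dev_workstations": [("dev_workstations", "AppliesTo")],
    "dev_workstations": [("developer_sessions", "ScheduledTaskRunsAtLogon")],
    "developer_sessions": [("cloud_admin_role", "HasPrivilegedAccess")],
    "helpdesk": [("intern", "ResetPassword")],
    "printer_svc": [("print_queue", "Manage")],
}


def find_attack_path(edges, start, target):
    """Breadth-first search for the shortest chain of rights.

    Returns a list of (source, right, destination) hops,
    or None when the target cannot be reached from start.
    """
    queue = deque([start])
    came_from = {start: None}  # node -> (previous node, right used)
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            # Walk back to the start, collecting each hop.
            while came_from[node] is not None:
                prev, right = came_from[node]
                path.append((prev, right, node))
                node = prev
            return list(reversed(path))
        for nxt, right in edges.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (node, right)
                queue.append(nxt)
    return None


path = find_attack_path(EDGES, "intern", "cloud_admin_role")
for src, right, dst in path:
    print(f"{src} --{right}--> {dst}")
```

Note that not a single hop in the printed chain is an exploit. Each edge is a right that somebody granted on purpose, which is exactly why the scanner, the EDR and the SIEM in the scenario have nothing to complain about.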
IAM: The Nightmare of Invisible Permissions
To truly understand what awaits us, we must stop thinking of IAM as a list of users and passwords. That is the human view: linear, easy to understand and easy to attack. Modern IAM (Entra ID, AWS Cognito, Okta) is a Graph. It is an intricate network of millions of nodes where complexity is not additive but exponential, because every new permission can combine with those already in place.
The problem is that we humans are terrible at managing graphs. When an administrator assigns a permission, they think: “I give Mario access to folder X”. But an AI sees what we don’t see: Mario is in group Y, which has a permission inherited from group Z, which can reset the password of Service Principal K, which has write rights on the global security policy.
These are called “Toxic Combinations”. For a human, finding a toxic combination requires advanced tools and days of analysis. For an AGI, it is instantaneous. It’s like looking at the chessboard and seeing mate in 15 moves before the opponent has even moved a pawn.
The AI will see the “Shadow Admins”: users who on paper are not administrators, but who mathematically can become administrators by exploiting a chain of 7 or 8 obscure permissions. We build walls; the AI walks through the cracks in our definitions. If identity is the new perimeter, then our perimeter is a sieve that we don’t even know we have.
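The hunt for Shadow Admins can be sketched in a few lines as well. What follows is only an illustration under stated assumptions: the graph reuses the Mario example with invented identifiers (`group_y`, `service_principal_k`, `global_security_policy`), an edge simply means "whoever controls the source can obtain control of the destination", and the set of "crown jewels" is something I chose by hand.

```python
from collections import deque

# Hypothetical identity graph; every name below is invented for illustration.
# An edge means: whoever controls the source can obtain control of the destination.
CONTROLS = {
    "mario": ["group_y"],                               # direct membership
    "group_y": ["group_z"],                             # inherits group Z's rights
    "group_z": ["service_principal_k"],                 # may reset K's password
    "service_principal_k": ["global_security_policy"],  # write rights
    "anna": ["group_reports"],
    "group_reports": ["folder_x"],
    "root_admin": ["global_security_policy"],
}
USERS = ["mario", "anna", "root_admin"]
DECLARED_ADMINS = {"root_admin"}
CROWN_JEWELS = {"global_security_policy"}


def hops_to_crown_jewel(graph, user, targets):
    """Return the number of hops from user to the nearest target, or None."""
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        node, depth = queue.popleft()
        if node in targets:
            return depth
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def find_shadow_admins(graph, users, declared_admins, targets):
    """Users who are not declared admins but can still reach a crown jewel."""
    result = {}
    for user in users:
        if user in declared_admins:
            continue  # known admins are not "shadow" admins
        hops = hops_to_crown_jewel(graph, user, targets)
        if hops is not None:
            result[user] = hops
    return result


shadow_admins = find_shadow_admins(CONTROLS, USERS, DECLARED_ADMINS, CROWN_JEWELS)
print(shadow_admins)  # {'mario': 4}
```

On paper Mario only has access to a folder, yet he sits four hops away from the global security policy. On a toy graph the search is trivial; the point of the argument is that on millions of nodes no human reviews these chains by hand.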
The Great Intelligence Deception: A “Floridian” Look
If we take the blinders off and force ourselves to think for a moment, we realize that the real deception lies right in the title: The Thinking Game. We are obsessing over the fact that machines “think” (Thinking), but the real problem, the one that should keep us awake at night, is that machines have started acting successfully (Acting) without needing to be intelligent in the way we understand it.
Prof. Luciano Floridi (watch any interview he has ever given, it is always illuminating) calls it the “divorce between Agency and Intelligence”. For millennia, if you wanted to do something intelligent (win at chess, diagnose a disease, hack a bank), you had to be intelligent. You had to have consciousness, semantics, understand the “why”.
Move 37 slams reality in our faces: Agency (the ability to act and obtain results) has detached itself from Intelligence. AlphaGo doesn’t “know” it played Go. It doesn’t give a damn about the game, it feels no joy in victory. Yet it crushed us.
Applied to Cybersecurity, this is terrifying. We are not fighting against a “brilliant hacker” on the other side of the screen. We are fighting against pure Agency, a syntactic efficiency that doesn’t need to understand the value of your data to steal it from you better than anyone else.
And do you know why they will win? Because we have spent the last thirty years doing what Floridi calls Enveloping. We have “packaged” the world into data. We have transformed reality into an Infosphere made of APIs, logs, protocols, and databases. We have built the perfect playing field for them. In a purely digital environment, he who dominates syntax dominates reality.
We humans are analog, slow, full of biases. They are natives of this environment that we ourselves built. We are like the dinosaurs that built the meteorite.
Conclusions
In fact, we are all already overexposed. Or rather, to borrow the terminology dear to Zero Trust, we must assume breach: take for granted that the attacker will get in, and indeed that they are already inside. It is useless to hide behind firewalls if the AI can navigate our digital bureaucracy better than we can.
It is therefore unthinkable to exist as an enterprise on the web without mature identity and governance programs in place (ideally complemented by a PAM solution for JIT access) and continuous audits… but perhaps even this will not be enough.
The final question of the documentary is not rhetorical, it is a verdict: how do you control something that plays a different game from yours, on a chessboard that you built but that it sees in 4D?
Welcome to the real Thinking Game. Spoiler: we are not the players. We are the chessboard.
Vittorio