| ◈ Science | In Expanding de Sitter Space, Quantum Mechanics Gets Even More Elusive | 12 min |
| ◈ Science | An AI Rediscovered the Standard Model — and Predicted the Top Quark | Takeaway |
| ⬡ AI & Product | Agentic Engineering Patterns | 10 min |
| ◉ Wildcard | Behavioral Economics of AI: Do LLMs Inherit Our Irrationality? | Takeaway |
Physicists can make quantum mechanics behave in collapsing or static universes. But our universe is expanding — pushed apart by dark energy into a shape called de Sitter space — and there the theory falls apart in one paradox after another. This piece walks through why the accelerating expansion creates horizons that destroy the mathematical machinery physicists rely on, and what a new generation of theorists is trying to do about it. If you care about the deep tension between quantum theory and cosmology, this is the sharpest recent treatment at a non-specialist level.
Read at Quanta →

A group of researchers has built ALBERT, a neuro-symbolic framework that tries to discover particle physics theories directly from experimental data. The system generates candidate theories as tokenized sequences of symmetries, particles, and interactions under a rule-based grammar — sidestepping the hallucination problem of pure LLMs — then evaluates each candidate against data using proper radiative corrections and χ² fits.
The proof of concept is striking: trained only on legacy LEP data (which contains no direct evidence of the top quark), a 25-million-parameter transformer rediscovered the Standard Model gauge structure and autonomously inferred that a sixth quark was necessary, predicting its mass at 178.9 ± 5.0 GeV — consistent with the measured value of ~172.5 GeV at the LHC. It's not the first symbolic regression applied to physics, but the combination of reinforcement learning, first-principles constraints, and the scale of the theory space explored is genuinely new.
The real question is whether this approach can find something physicists haven't already found. But as a demonstration that AI can navigate the space of gauge theories and land on the right one from data alone, it's a remarkable result.

Read the preprint →
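To make the generate-and-score loop concrete, here is a minimal Python sketch under heavy simplification: candidates are token tuples drawn from a tiny made-up grammar, invalid ones are filtered by a toy rule, and the survivors are ranked by a χ² fit against mock data. Every token name, constraint, and number below is invented for illustration; ALBERT's actual grammar, transformer proposals, reinforcement learning, and radiative-correction fits are far richer.

```python
import itertools
import random

# Hypothetical vocabulary: a gauge-group token plus a quark-count token.
GAUGE_GROUPS = ["SU(2)xU(1)", "SU(3)xSU(2)xU(1)"]
QUARK_COUNTS = [4, 5, 6]

def is_valid(candidate):
    """Toy stand-in for the rule-based grammar: only even quark counts allowed,
    loosely mimicking consistency constraints on candidate theories."""
    gauge, n_quarks = candidate
    return gauge in GAUGE_GROUPS and n_quarks % 2 == 0

def predict_observable(candidate):
    """Toy 'theory prediction' for one pseudo-observable; the real system runs
    full electroweak fits with radiative corrections instead."""
    gauge, n_quarks = candidate
    base = 1.0 if gauge == "SU(3)xSU(2)xU(1)" else 0.8
    return base + 0.05 * n_quarks

def chi_squared(candidate, data, sigma):
    """Chi-squared of the candidate's prediction against mock measurements."""
    pred = predict_observable(candidate)
    return sum(((x - pred) / sigma) ** 2 for x in data)

# Mock "data" centred on the value the six-quark, SM-like candidate predicts.
random.seed(0)
mock_data = [random.gauss(1.3, 0.02) for _ in range(20)]

candidates = [c for c in itertools.product(GAUGE_GROUPS, QUARK_COUNTS) if is_valid(c)]
best = min(candidates, key=lambda c: chi_squared(c, mock_data, sigma=0.02))
print("best-fit candidate:", best)   # ('SU(3)xSU(2)xU(1)', 6) for this mock data
```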
Willison draws a sharp line between "vibe coding" (prompt-and-pray, no accountability) and what he calls "agentic engineering" — where experienced developers use coding agents like Claude Code and Codex to accelerate their work while staying responsible for the output. The key insight: the practices that make agents produce better results are the same practices that already define good engineering — automated tests, clean documentation, CI/CD, well-factored code. The agents don't replace discipline; they reward it. A useful framework for anyone navigating the shift from AI-as-autocomplete to AI-as-collaborator.
Read at simonwillison.net →

Bini, Cong, Huang, and Jin ran the classic behavioral economics experiments — the ones designed to expose human cognitive biases — on every major LLM family (GPT, Claude, Gemini, Llama) across model versions and scales. The headline finding splits neatly in two: on preference-based tasks (risk aversion, loss aversion, framing effects), larger and more advanced models become more human-like — which is to say, more biased. On belief-based tasks (probability estimation, Bayesian updating), the bigger models trend toward rationality.
This is a genuinely interesting divergence. It suggests that RLHF and scale are pulling models toward human preference patterns (including the irrational ones) while simultaneously improving their raw reasoning. The practical implication: if you're using LLMs for decisions that involve preferences or values, you may be importing biases you didn't intend to. The paper also shows that simply prompting models to "be rational" partially corrects the biases — which is both reassuring and slightly unsettling.

Read at NBER →
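To make "belief-based task" concrete, here is a small sketch of the kind of item and scoring this line of work tends to use: the Bayes-correct posterior for a classic base-rate problem, with a model's stated probability judged by its distance from that benchmark. The numbers, the placeholder model_answer, and the scoring rule are all illustrative, not taken from the paper.

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' rule."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Classic base-rate setup: 1% prevalence, 90% sensitivity, 9% false-positive rate.
normative = bayes_posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.09)
print(f"Bayes-correct answer: {normative:.3f}")       # ~0.092

# A base-rate-neglecting answer anchors on test accuracy instead of prevalence.
model_answer = 0.85        # placeholder for a probability elicited from an LLM
print(f"distance from the rational benchmark: {abs(model_answer - normative):.3f}")
```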
For years, the excess gamma-ray emission from the Milky Way's center has been a tantalizing hint of dark matter that kept getting explained away by astrophysical backgrounds — primarily millisecond pulsars. New high-resolution simulations published in PRL show that the dark matter distribution in the inner galaxy isn't spherical but flattened and asymmetrical, and when you account for that realistic geometry, the annihilation signal matches the observed excess more cleanly than previous models suggested. It's not a detection — the systematics are still brutal — but the dark matter interpretation just got harder to dismiss. The field is watching for what the next round of Fermi data and the upcoming LuSEE-Night lunar observatory will add.
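Why the halo's shape matters to the signal, in one toy calculation: annihilation emission scales with the square of the dark matter density integrated along each line of sight, so a flattened halo produces a flattened emission map on the sky. The sketch below compares an in-plane and an out-of-plane sight line for a spherical versus a flattened toy profile; the density profile, scale radius, flattening, and viewing directions are all made up for illustration and are not from the PRL paper.

```python
import numpy as np

SUN_GC_DISTANCE = 8.2   # Sun-to-Galactic-centre distance in kpc (round number)
SCALE_RADIUS = 20.0     # toy halo scale radius in kpc

def density(x, y, z, q=1.0):
    """Toy NFW-like density on an ellipsoidal radius; q < 1 flattens the halo
    along z, the axis perpendicular to the Galactic plane. Arbitrary units."""
    m = np.sqrt(x**2 + y**2 + (z / q)**2) + 1e-3          # soften the centre
    return 1.0 / (m * (1.0 + m / SCALE_RADIUS)**2)

def los_emission(lon_deg, lat_deg, q=1.0, n=4000, l_max=30.0):
    """Integral of density^2 along the sight line toward Galactic (lon, lat).
    Annihilation flux scales with this, so it sets the predicted sky morphology."""
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    s = np.linspace(1e-3, l_max, n)                        # kpc along the sight line
    x = SUN_GC_DISTANCE - s * np.cos(lat) * np.cos(lon)    # Galactocentric coordinates
    y = s * np.cos(lat) * np.sin(lon)
    z = s * np.sin(lat)
    return np.sum(density(x, y, z, q)**2) * (s[1] - s[0])  # simple Riemann sum

for q, label in [(1.0, "spherical"), (0.6, "flattened")]:
    in_plane = los_emission(lon_deg=5.0, lat_deg=0.0, q=q)
    off_plane = los_emission(lon_deg=0.0, lat_deg=5.0, q=q)
    print(f"{label:9s} ratio (l=5°,b=0°)/(l=0°,b=5°) = {in_plane / off_plane:.2f}")
# The spherical halo gives 1.00; the flattened halo is brighter along the plane,
# i.e. the predicted excess is squashed rather than round.
```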
DeepMind's latest attempt to put rigor behind "progress toward AGI" borrows from decades of cognitive science to define a 10-ability taxonomy — perception, generation, attention, learning, memory, reasoning, metacognition, and others. It's a more principled framework than the usual benchmark-chasing, drawing on how psychologists actually decompose human intelligence rather than whatever task happens to have a leaderboard. Whether any taxonomy can capture what matters about general intelligence is debatable, but this one at least asks the right questions. They've also launched a Kaggle hackathon around it, which should produce some interesting data.