Recent posts:
-
A World of Auctions
The auction is a core element of free markets: a procedure for allocating scarce resources efficiently. Auctions have multiple social benefits: they help deliver an item to the person who values it most; through bid prices, they reveal underlying demand; and they generate revenue for sellers. They also play a big part in our world today. Radio spectrum, electricity markets, carbon and pollution permits, treasury bonds, fishing quotas and natural resources, online advertising, airport slots and transport rights, art and collectibles are all allocated through auctions.
-
Generative Models: Flow Matching
Flow matching grew out of two earlier ideas. Normalizing flows showed how to gradually transform simple noise into complex data, while optimal transport studied how to move one distribution into another along efficient paths. Flow matching takes inspiration from both: it learns smooth velocity fields that carry noise toward data without the heavy math of exact Jacobians or transport costs. This simplicity and flexibility helped it quickly gain attention as an alternative to diffusion models. Let's see what it's all about.
-
Can LLMs Learn to Reason End-to-End?
I was recently thinking about reasoning models and why RL methods like GRPO have become so prominent in that context. Previously, people finetuned LLMs on step-by-step reasoning traces, yet this can hardly be called "learning to reason", since the model only learns to reproduce the reasoning patterns of some other expert. With RL for LLMs the paradigm has shifted, and the model can now, in principle, discover the right reasoning patterns for the task. Still, there are many nuances. Here we consider a simple question: can LLMs learn the best reasoning patterns in an end-to-end manner? Answering it will show precisely how different model designs and training paradigms fit together, like pieces of a puzzle.
-
Differentiable Simulation for Search
Differentiable simulation is the idea of modeling an environment's dynamics as a differentiable function. This lets us embed the environment's rollout directly into the computation graph of another module and train the whole system end-to-end with gradients. The property has found applications in many domains. In physics, robotics, and autonomous vehicles, the dynamics often represent kinematic equations and laws of motion or energy transfer. In graphics, it is often the rendering process that is differentiable, and differentiable rendering is at the heart of techniques like neural radiance fields and Gaussian splatting. We've previously explored how differentiable simulation can be used for control and world modeling, yet one key piece is missing to complete the trilogy: test-time search. Here we'll see how differentiable simulation can be used to design fast, efficient procedures that search for the best action sequence at test time.
-
Episodes From The Land of Ice and Fire
We've always wanted to take a trip to Iceland. We've talked about it for at least five years now. Plans got serious last year, but we didn't manage to get organized in time. This year it finally happened, and it was spectacular. Definitely an experience of a lifetime. How can this little island in the middle of the ocean pack in so much scenery, biodiversity, and grandeur?
Every subset containing fewer than half of the vertices has a proportionally large edge boundary.