Recent posts:
-
Data-Driven Structure From Motion
Structure from motion (SfM) has come a long way. It started back in the 1950s, when researchers painstakingly derived 3D structures from pairs of aerial photographs through precise geometric reasoning. The eight-point algorithm in 1981 was one of the first applications of mathematical rigour to the problem. In the late 1990s RANSAC enabled robust estimation in the presence of outliers, while bundle adjustment greatly improved reconstruction accuracy by jointly refining the camera parameters and the scene geometry. Over the following decade these methods were scaled up and turned into products. Since then, deep learning has become the norm, shifting the focus from explicit geometric modeling toward data-driven learning.
-
Attempts to Solve a Market
I was recently tinkering with some fairly realistic oligopolistic market simulations. Unlike textbook settings, where the market is commonly assumed to match all buyers to all sellers simultaneously, my simulation involved non-clearing markets, sequential search, and various computational constraints on the market participants. Solving such a market becomes quite hard: analytic solutions are out of the question, and one typically has to resort to numerical methods. So I had the beautiful idea of trying out multi-agent RL for finding the equilibria. It turned out to be a very nice bridge between the two disciplines, one providing the problem setting and the other providing the tool to solve it.
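To give a flavour of the approach, here is a minimal sketch, not the actual simulation from the post: two independent epsilon-greedy Q-learners repeatedly set prices in a toy duopoly with a shared demand curve, and their greedy prices should settle near a mutual best response. All names and parameters are illustrative.

```python
import numpy as np

# Toy duopoly, purely illustrative: two sellers repeatedly pick a price from a
# small grid; demand is split in favour of the cheaper seller; each agent runs
# independent epsilon-greedy Q-learning (stateless, so effectively a bandit).

rng = np.random.default_rng(0)
prices = np.linspace(0.1, 1.0, 10)      # admissible price levels
cost = 0.1                              # marginal cost per unit sold
n_agents, n_steps, eps, lr = 2, 50_000, 0.1, 0.05
q = np.zeros((n_agents, len(prices)))   # one value estimate per price per agent

def profits(p0, p1):
    """Split a downward-sloping demand in favour of the cheaper seller."""
    demand = max(0.0, 1.5 - (p0 + p1))
    if p0 == p1:
        share = np.array([0.5, 0.5])
    elif p0 < p1:
        share = np.array([0.8, 0.2])
    else:
        share = np.array([0.2, 0.8])
    return (np.array([p0, p1]) - cost) * demand * share

for _ in range(n_steps):
    # epsilon-greedy action selection, independently for each agent
    a = [int(rng.integers(len(prices))) if rng.random() < eps else int(q[i].argmax())
         for i in range(n_agents)]
    r = profits(prices[a[0]], prices[a[1]])
    for i in range(n_agents):
        q[i, a[i]] += lr * (r[i] - q[i, a[i]])   # incremental value update

print("learned prices:", [round(float(prices[q[i].argmax()]), 2) for i in range(n_agents)])
```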
-
A Trip to Arizona
The plan was to go to WACV25, held in early March in Tucson, Arizona. The desert biome there has been on my wishlist for a long time, so naturally expectations were set high, not for the conference but for the trip itself. It ain't often that a little European poverino gets to travel to the US, especially in such "glorious" times as the present, so I decided to make the most of it. Here are some random details from this satisfying jaunt.
-
Value-Based Methods
This post is a brief summary of the different classes of value-based RL methods. Personally, I find this topic incredibly rewarding for the deep intuition it provides about how the RL algorithm landscape is laid out. Fundamental concepts like on-policy and off-policy learning, bootstrapping, and many others all stem from these simple settings.
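As a taste of what bootstrapping and the on-/off-policy split mean in practice, here is a minimal tabular sketch (my own illustrative snippet, not code from the post) contrasting the SARSA and Q-learning targets.

```python
import numpy as np

# Illustrative only: the two classic bootstrapped targets on a tabular Q-table.
# SARSA (on-policy) bootstraps from the action actually taken next, while
# Q-learning (off-policy) bootstraps from the greedy action instead.

gamma, alpha = 0.99, 0.1
Q = np.zeros((5, 2))                    # toy table: 5 states, 2 actions

def sarsa_update(s, a, r, s_next, a_next):
    target = r + gamma * Q[s_next, a_next]   # bootstrap from the behaviour policy
    Q[s, a] += alpha * (target - Q[s, a])

def q_learning_update(s, a, r, s_next):
    target = r + gamma * Q[s_next].max()     # bootstrap from the greedy policy
    Q[s, a] += alpha * (target - Q[s, a])

# Example transitions (state, action, reward, next state[, next action]):
sarsa_update(0, 1, r=1.0, s_next=2, a_next=0)
q_learning_update(0, 1, r=1.0, s_next=2)

# Monte Carlo methods, by contrast, would replace the bootstrapped term with
# the full observed return, trading bias for variance.
```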
-
Analytic World Models
Differentiable simulators have recently shown great promise for training autonomous vehicle controllers. Because one can backpropagate through them, they can be placed into an end-to-end training loop where their known dynamics become useful priors for the policy to learn, removing the typical black-box assumption about the environment. So far, these systems have only been used to train policies. However, that is not the end of the story in terms of what they can offer. They can also be used to train world models. Specifically, in this post we propose and explore three new task setups that allow us to learn next-state predictors, optimal planners, and optimal inverse states. Unlike analytic policy gradients (APG), which require the gradient of the next simulator state with respect to the current actions, the proposed setups rely on the gradient of the next state with respect to the current state. We call this approach Analytic World Models (AWMs).
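To make the distinction concrete, here is a minimal sketch (illustrative only, not the post's implementation): a toy point-mass simulator, with the Jacobian of the next state with respect to the current state, which the AWM setups rely on, next to the Jacobian with respect to the action, which APG backpropagates through.

```python
import numpy as np

# Toy differentiable simulator: a point mass pushed by a scalar force.
# state = [position, velocity]; the dynamics are linear, so both Jacobians
# can be written out by hand.

dt, mass = 0.1, 1.0

def step(state, action):
    pos, vel = state
    new_vel = vel + dt * action / mass
    new_pos = pos + dt * new_vel
    return np.array([new_pos, new_vel])

def jac_wrt_state(state, action):
    # d(next state) / d(current state) -- what the AWM setups need
    return np.array([[1.0, dt],
                     [0.0, 1.0]])

def jac_wrt_action(state, action):
    # d(next state) / d(action) -- what APG needs
    return np.array([[dt * dt / mass],
                     [dt / mass]])

s, a = np.array([0.0, 1.0]), 0.5
print(step(s, a), jac_wrt_state(s, a), jac_wrt_action(s, a), sep="\n")
```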
Every subset of less than half the total number of vertices has a proportionally large boundary of edges.
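In symbols, and using standard notation that may differ from the post's: for a graph G = (V, E) on n vertices, the property says there is a constant h > 0 such that

```latex
% Edge expansion (Cheeger constant); notation assumed, not taken from the post.
\[
  |\partial S| \;\ge\; h\,|S|
  \quad \text{for every } S \subseteq V \text{ with } 0 < |S| \le \tfrac{n}{2},
  \qquad \text{where } \partial S = \{\{u,v\} \in E : u \in S,\ v \notin S\},
\]
\[
  \text{and the largest such } h \text{ is the edge expansion }
  h(G) = \min_{0 < |S| \le n/2} \frac{|\partial S|}{|S|}.
\]
```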