MIRI updates
- MIRI researcher Abram Demski writes regarding counterfactuals:
  > I've felt like the problem of counterfactuals is "mostly settled" (modulo some math working out) for about a year, but I don't think I've really communicated this online. Partly, I've been waiting to write up more formal results. But other research has taken up most of my time, so I'm not sure when I would get to it.
  >
  > So, the following contains some "shovel-ready" problems. If you're convinced by my overall perspective, you may be interested in pursuing some of them. I think these directions have a high chance of basically solving the problem of counterfactuals (including logical counterfactuals). […]
- Alex Mennen writes a thoughtful critique of one of the core arguments behind Abram's new take on counterfactuals; Abram replies.
- Abram distinguishes simple Bayesians (who reason according to the laws of probability theory) from reflective Bayesians (who endorse background views that justify Bayesianism), and argues that simple Bayesians can better "escape the trap" of traditional issues with Bayesian reasoning.
- Abram explains the motivations behind his learning normativity research agenda, providing "four different elevator pitches, which tell different stories" about how the research agenda's desiderata hang together.
News and links
- CFAR co-founder Julia Galef has an excellent new book out on human rationality and motivated reasoning: The Scout Mindset: Why Some People See Things Clearly and Others Don't.
- Katja Grace argues that there is pressure for systems with preferences to become more coherent, efficient, and goal-directed.
- Andrew Critch discusses multipolar failure scenarios and "multi-agent processes with a robust tendency to play out irrespective of which agents execute which steps in the process".
- Daniel Filan's AI X-Risk Research Podcast is joined by a second AI alignment podcast: Quinn Dougherty's Technical AI Safety Podcast, which recently featured Alex Turner.
- A simple but important observation by Mark Xu: Strong Evidence is Common.
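As a rough illustration of the arithmetic behind Mark Xu's point, here is a minimal sketch of how many bits of evidence an everyday observation can carry. The name example echoes the post; the specific prior and posterior odds below are illustrative assumptions, not numbers taken from it.

```python
import math

def bits_of_evidence(prior_odds: float, posterior_odds: float) -> float:
    """Bits of evidence = log2 of the Bayes factor (posterior odds / prior odds)."""
    return math.log2(posterior_odds / prior_odds)

# Illustrative assumptions: prior odds that a stranger is named "Mark Xu"
# might be about 1 in a million; after they introduce themselves, you might
# assign roughly 20:1 odds that that really is their name.
prior = 1 / 1_000_000
posterior = 20 / 1
print(f"{bits_of_evidence(prior, posterior):.1f} bits")  # ~24.3 bits
```

On these (made-up) numbers, a single casual introduction moves you by more than 20 bits, which is the sense in which strong evidence is common.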