Will MacAskill on normative uncertainty

Conversations

Will MacAskill recently completed his DPhil at Oxford University and, as of October 2014, will be a Research Fellow at Emmanuel College, Cambridge.

He is the cofounder of Giving What We Can and 80,000 Hours. He’s currently writing a book, Effective Altruism, to be published by Gotham (Penguin USA) in summer 2015.

Luke Muehlhauser: In MacAskill (2014) you tackle the question of normative uncertainty:

Very often, we are unsure about what we ought to do… Sometimes, this uncertainty arises out of empirical uncertainty: we might not know to what extent non-human animals feel pain, or how much we are really able to improve the lives of distant strangers compared to our family members. But this uncertainty can also arise out of fundamental normative uncertainty: out of not knowing, for example, what moral weight the wellbeing of distant strangers has compared to the wellbeing of our family; or whether non-human animals are worthy of moral concern even given knowledge of all the facts about their biology and psychology.

…one might have expected philosophers to have devoted considerable research time to the question of how one ought to take one’s normative uncertainty into account in one’s decisions. But the issue has been largely neglected. This thesis attempts to begin to fill this gap.

In the first part of your thesis you argue that when the moral theories to which an agent assigns some credence are cardinally measurable (as opposed to ordinal-scale) and they are intertheoretically comparable, the agent should choose an action which “maximizes expected choice-worthiness” (MEC), which is akin to maximizing expected value across multiple uncertain theories of what is desirable.
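
In code, MEC is just an expected-value calculation in which moral theories play the role of states of the world. A minimal sketch in Python (the theories, credences, and cardinal scores below are made up purely to make the arithmetic concrete, and the sketch assumes the scores are intertheoretically comparable):

```python
# Toy illustration of "maximize expected choice-worthiness" (MEC). The theories,
# credences, and cardinal choice-worthiness scores are hypothetical, and the
# sketch assumes the scores are comparable across theories.

credences = {"utilitarianism": 0.6, "prioritarianism": 0.4}

# choice_worthiness[theory][option] -> cardinal score on a shared scale
choice_worthiness = {
    "utilitarianism": {"A": 10.0, "B": 4.0},
    "prioritarianism": {"A": 2.0, "B": 7.0},
}

def expected_choice_worthiness(option):
    return sum(credences[t] * choice_worthiness[t][option] for t in credences)

options = ["A", "B"]
print({o: expected_choice_worthiness(o) for o in options})  # {'A': 6.8, 'B': 5.2}
print("MEC picks:", max(options, key=expected_choice_worthiness))  # MEC picks: A
```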

I suspect that result will be intuitive to many, so let’s jump forward to where things get more interesting. You write:

Sometimes, [value] theories are merely ordinal, and, sometimes, even when theories are cardinal, choice-worthiness is not comparable between them. In either of these situations, MEC cannot be applied. In light of this problem, I propose that the correct metanormative theory is sensitive to the different sorts of information that different theories provide. In chapter 2, I consider how to take normative uncertainty into account in conditions where all theories provide merely ordinal choice-worthiness, and where choice-worthiness is noncomparable between theories, arguing in favour of the Borda Rule.

What is the Borda Rule, and why do you think it’s the best action rule under these conditions?


Will MacAskill: Re: “I suspect that result will be intuitive to many.” Maybe in your circles that’s true! Many, or even most, philosophers get off the boat way before this point. They say that there’s no sense of ‘ought’ according to which what one ought to do takes normative uncertainty into account. I’m glad that I don’t have to defend that for you, though, as I think it’s perfectly obvious that the ‘no ought’ position is silly.

As for the Borda Rule: the Borda Rule is a type of voting system, which works as follows. For each theory, an option’s Borda Score is equal to the number of options that rank lower in the theory’s choice-worthiness ordering than that option. An option’s Credence-Weighted Borda Score is equal to the sum, across all theories, of the decision-maker’s credence in the theory multiplied by the Borda Score of the option, on that theory.

So, for example, suppose I have 80% credence in Kantianism and 20% credence in Contractualism. (Suppose I’ve had some very misleading evidence….) Kantianism says that option A is the best option, then option B, then option C. Contractualism says that option C is the best option, then option B, then option A.

The Borda scores, on Kantianism, are:
A = 2
B = 1
C = 0

The Borda scores, on Contractualism, are:
A = 0
B = 1
C = 2

Each option’s Credence-Weighted Borda Score is:
A = 0.8*2 + 0.2*0 = 1.6
B = 0.8*1 + 0.2*1 = 1
C = 0.8*0 + 0.2*2 = 0.4

So, in this case, the Borda Rule would say that A is the most appropriate option, followed by B, and then C.
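
The same calculation in code, reproducing the example above (a minimal sketch; it assumes every theory ranks the same options and that no ordering contains ties):

```python
# Minimal sketch of the credence-weighted Borda Rule, reproducing the
# Kantianism/Contractualism example above. Assumes every theory ranks the same
# options and that no ordering contains ties.

credences = {"Kantianism": 0.8, "Contractualism": 0.2}

# Each theory's choice-worthiness ordering, most choice-worthy option first.
rankings = {
    "Kantianism": ["A", "B", "C"],
    "Contractualism": ["C", "B", "A"],
}

def borda_scores(ranking):
    """An option's Borda Score = the number of options ranked below it."""
    n = len(ranking)
    return {option: n - 1 - position for position, option in enumerate(ranking)}

def credence_weighted_borda(credences, rankings):
    options = next(iter(rankings.values()))
    totals = {option: 0.0 for option in options}
    for theory, credence in credences.items():
        scores = borda_scores(rankings[theory])
        for option in options:
            totals[option] += credence * scores[option]
    return totals

print(credence_weighted_borda(credences, rankings))
# {'A': 1.6, 'B': 1.0, 'C': 0.4} -> A is most appropriate, then B, then C
```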

The reason we need to use some sort of voting system is because I’m considering, at this point, only ordinal theories: theories that tell you that it’s better to choose A over B (alt: that “A is more choice-worthy than B”), but won’t tell you how much more choice-worthy A is than B. So, in these conditions, we have to have a theory of how to take normative uncertainty into account that’s sensitive only to each theory’s choice-worthiness ordering (as well as the degree of credence in each theory), because the theories I’m considering don’t give you anything more than an ordering.

The key reason why I think the Borda Rule is better than any other voting system is that it satisfies a condition I call Updating Consistency. The idea is that increasing your credence in some particular theory T1 shouldn’t make the appropriateness ordering (that is, the ordering of options in terms of what-you-ought-to-do-under-normative-uncertainty) worse by the lights of T1.

This condition seems to me to be very plausible indeed. But, surprisingly, very few voting systems satisfy that property, and the few that do satisfy it have other problems.
Read more »

Erik DeBenedictis on supercomputing

Conversations

Erik DeBenedictis works for Sandia’s Advanced Device Technologies department. He has been a member of the International Technology Roadmap for Semiconductors since 2005.

DeBenedictis received his Ph.D. in computer science from Caltech. As a grad student and post-doc, he worked on the hardware that turned into the first hypercube multiprocessor computer. Later dubbed the “Cosmic Cube,” it ran for more than a decade after he left the university and was copied over and over. It’s considered the ancestor of most of today’s supercomputers.

In the 1980s, then working for Bell Labs in Holmdel, N.J., DeBenedictis was part of a consortium competing for the first Gordon Bell award. The team took second place, with first place going to Sandia. During the 1990s, he ran NetAlive, Inc., a company developing information management software for desktops and wireless systems. Starting in 2002, DeBenedictis was one of the project leads on the Red Storm supercomputer.

Read more »

2013 in Review: Fundraising

MIRI Strategy

Update 04/16/2014: At a donor’s request, I have replaced the Total Donations per Year chart with one that shows what proportion of the donations came from new versus returning donors. Some tweaks were also made to our donation database since the publication of this post, so I have updated the post to reflect these changes.

 

This is the 5th part of my personal and qualitative self-review of MIRI in 2013, in which I review MIRI’s 2013 fundraising activities.

For this post, “fundraising” includes donations and grants, but not other sources of revenue.1

 

Summary

  1. Our funding in 2013 grew by about 75% compared to 2012, though comparing to past years is problematic because MIRI is now a very different organization than it was in 2012 and earlier.
  2. We began to apply for grants in late 2013. We haven’t received money from any of these grantmakers yet, but several of our grant applications are pending.
  3. MIRI’s ability to spend money on its highest-value work is much greater than it was one year ago, and we plan to fundraise heavily to meet our 2014 fundraising goal of $1.7 million.

Read more »


  1. I didn’t include other sources of revenue in this analysis because they don’t seem likely to play a significant role in our ongoing funding strategy in the foreseeable future. For example, revenue from ebook sales is marginal, and we don’t plan to sell tickets to a new conference. 

Lyle Ungar on forecasting

Conversations

Dr. Lyle Ungar is a Professor of Computer and Information Science at the University of Pennsylvania, where he also holds appointments in multiple departments in the schools of Engineering, Arts and Sciences, Medicine, and Business. He has published over 200 articles and is co-inventor on eleven patents. His research areas include machine learning, data and text mining, and psychology, with a current focus on statistical natural language processing, spectral methods, and the use of social media to understand the psychology of individuals and communities.

Luke Muehlhauser: One of your interests (among many) is forecasting. Some of your current work is funded by IARPA’s ACE program — one of the most exciting research programs happening anywhere in the world, if you ask me.

One of your recent papers, co-authored with Barbara Mellers, Jonathan Baron, and several others, is “Psychological Strategies for Winning a Geopolitical Forecasting Tournament.” The abstract is:

Five university-based research groups competed to assign the most accurate probabilities to events in two geopolitical forecasting tournaments. Our group tested and found support for three psychological drivers of accuracy: training, teaming, and tracking. Training corrected cognitive biases, encouraged forecasters to use reference classes, and provided them with heuristics, such as averaging when multiple estimates were available. Teaming allowed forecasters to share information and discuss the rationales behind their beliefs. Tracking placed the highest performers (top 2% from Year 1) in elite teams that worked together. Results showed that probability training improved calibration. Team collaboration and tracking enhanced both calibration and resolution. Forecasting is often viewed as a statistical problem; but it is also a deep psychological problem. Behavioral interventions improved the accuracy of forecasts, and statistical algorithms improved the accuracy of aggregations. Our group produced the best forecasts two years in a row by putting statistics and psychology to work.
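
The abstract mentions calibration, resolution, and the heuristic of averaging multiple estimates. Accuracy in these tournaments was scored with the Brier score, the squared error between a probability forecast and the 0/1 outcome; a rough illustration with made-up numbers shows why the averaging heuristic helps:

```python
# Rough illustration (made-up numbers) of the Brier score, the accuracy measure
# used in these tournaments, and of the "average multiple estimates" heuristic.

def brier(p, outcome):
    """Squared error between a probability forecast and the 0/1 outcome; lower is better."""
    return (p - outcome) ** 2

forecasts = [0.7, 0.9, 0.4]   # three hypothetical forecasters, one binary question
outcome = 1                   # the event occurred

mean_individual_score = sum(brier(p, outcome) for p in forecasts) / len(forecasts)
averaged_forecast = sum(forecasts) / len(forecasts)

print(round(mean_individual_score, 3))              # 0.153: the typical solo forecaster
print(round(brier(averaged_forecast, outcome), 3))  # 0.111: simple averaging does better
```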

In these experiments, some groups were given scenario training or probability training, which “took approximately 45 minutes, and could be examined throughout the tournament.”

Are these modules available to the public online? If not, can you give us a sense of what they were like? And, do you suspect that significant additional probability or scenario training would further reduce forecasting errors, e.g. if new probability training content was administered to subjects for 30 minutes every two weeks?

Read more »

Anil Nerode on hybrid systems control

Conversations

Dr. Anil Nerode is a Goldwin Smith Professor of Mathematics and Computer Science at Cornell University. He is “a pioneer in mathematical logic, computability, automata theory, and the understanding of computable processes, both theoretical and practical for over half a century, whose work comes from a venerable and distinguished mathematical tradition combined with the newest developments in computing and technology.”

His 50 Ph.D. students and their students occupy many major university and industrial positions worldwide in mathematics, computer science, software engineering, electrical engineering, and related fields. He and Wolf Kohn founded the discipline of hybrid systems in 1992, which has become a major area of research in mathematics, computer science, and many branches of engineering. Their work on modeling control of macroscopic systems as relaxed calculus of variations problems on Finsler manifolds is the basis for their current efforts in quantum control and artificial photosynthesis. His research has been supported consistently by many entities, ranging from NSF (50 years) to ADWADC, AFOSR, ARO, USEPA, etc. He has been a consultant on military development projects since 1954. He received his Ph.D. in Mathematics from the University of Chicago under Saunders MacLane (1956).

Luke Muehlhauser: In Nerode (2007), you tell the origin story of hybrid systems control. A 1990 DARPA meeting in Pacifica seems to have been particularly seminal. As you describe it:

the purpose of the meeting was to explore how to clear a major bottleneck, the control of large military systems such as air-land-sea forces in battle space.

Can you describe in more detail what DARPA’s long-term objectives for that meeting seemed to be? Presumably they hoped the meeting would spur new lines of research that would allow them to solve particular control problems in the next 5-20 years?

Read more »

Michael Carbin on integrity properties in approximate computing

Conversations

Michael Carbin is a Ph.D. Candidate in Electrical Engineering and Computer Science at MIT. His interests include the design of programming systems that deliver improved performance and resilience by incorporating approximate computing and self-healing.

His work on program analysis at Stanford University as an undergraduate received an award for Best Computer Science Undergraduate Honors Thesis. As a graduate student, he has received the MIT Lemelson Presidential and Microsoft Research Graduate Fellowships. His recent research on verifying the reliability of programs that execute on unreliable hardware received a best paper award at OOPSLA 2013.

Luke Muehlhauser: In Carbin et al. (2013), you and your co-authors present Rely, a new programming language that “enables developers to reason about the… probability that [a program] produces the correct result when executed on unreliable hardware.” How is Rely different from earlier methods for achieving reliable approximate computing?


Michael Carbin: This is a great question. Building applications that work with unreliable components has been a long-standing goal of the distributed systems community and other communities that have investigated how to build systems that are fault-tolerant. A key goal of a fault tolerant system is to deliver a correct result even in the presence of errors in the system’s constituent components.

This goal stands in contrast to the goal of the unreliable hardware that we have targeted in my work. Specifically, hardware designers are considering new designs that will — purposely — expose components that may silently produce incorrect results with some non-negligible probability. These hardware designers are working in a subfield that is broadly called approximate computing.
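
Some back-of-the-envelope arithmetic (not Rely’s analysis, just the problem that motivates it) shows why developers need tool support here: if each unreliable operation independently produces a correct result with probability r, a computation whose result depends on k such operations is fully correct with probability roughly r^k, which collapses quickly as k grows.

```python
# Illustration only; Rely's analysis works over program structure rather than a
# flat operation count, but the basic arithmetic is the motivation.

def chain_reliability(per_op_reliability, num_ops):
    """Probability that num_ops independent unreliable operations all succeed."""
    return per_op_reliability ** num_ops

print(chain_reliability(0.99999, 1_000))      # ~0.990: perhaps tolerable for one output pixel
print(chain_reliability(0.99999, 1_000_000))  # ~0.000045: hopeless without reasoning about which ops may fail
```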

The key idea of the approximate computing community is that many large-scale computations (e.g., machine learning, big data analytics, financial analysis, and media processing) have a natural trade-off between the quality of their results and the time and resources required to produce a result. Exploiting this fact, researchers have devised a number of techniques that take an existing application and modify it to trade the quality of its results for increased performance or decreased power consumption.

One example that my group has worked on is simply skipping parts of a computation that we have demonstrated — through testing — can be elided without substantially affecting the overall quality of the application’s result. Another approach is executing portions of an application that are naturally tolerant of errors on these new unreliable hardware systems.
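
A hedged sketch of the first idea, in the spirit of what this line of work calls loop perforation; the function and data are illustrative, not taken from any of the papers discussed here:

```python
# Illustrative "skip part of the work" sketch in the spirit of loop perforation:
# compute an approximate mean by touching only every skip_factor-th element.
# The function and data are hypothetical, not from the papers discussed here.

def perforated_mean(values, skip_factor=4):
    sampled = values[::skip_factor]
    return sum(sampled) / len(sampled)

data = [float(i % 100) for i in range(1_000_000)]
exact = sum(data) / len(data)
approx = perforated_mean(data, skip_factor=4)
print(exact, approx)  # 49.5 vs. 48.0: within a few percent, while summing a quarter of the elements
```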

A natural follow-on question to this is, how have developers previously dealt with approximation?

These large-scale applications are naturally approximate because exact solutions are often intractable or perhaps do not even exist (e.g., machine learning). The developers of these applications therefore often start from an exact model of how to compute an accurate result and then use that model as a guide to design a tractable algorithm and a corresponding implementation that returns a more approximate solution. These developers have therefore been manually applying approximations to their algorithms (and their implementations) and reasoning about the accuracy of their algorithms for some time. A prime example of this is the field of numerical analysis and its contributions to scientific computing.

The emerging approximate computing community represents the realization that programming languages, runtime systems, operating systems, and hardware architectures can not only help developers navigate the approximations they need to make when building these applications, but can also incorporate approximations themselves. So, for example, the hardware architecture may itself export unreliable hardware components that an application’s developers can then use as one of their many tools for performing approximation.
Read more »

Randal Koene on whole brain emulation

Conversations

Dr. Randal A. Koene is CEO and Founder of the not-for-profit science foundation Carboncopies as well as the neural interfaces company NeuraLink Co. Dr. Koene is Science Director of the 2045 Initiative and a scientific board member in several neurotechnology companies and organizations.

Dr. Koene is a neuroscientist with a focus on neural interfaces, neuroprostheses and the precise functional reconstruction of neural tissue, a multi‑disciplinary field known as (whole) brain emulation. Koene’s work has emphasized the promotion of feasible technological solutions and “big‑picture” roadmapping aspects of the field. Activities since 1994 include science-curation such as bringing together experts and projects in cutting‑edge research and development that advance key portions of the field.

Randal Koene was Director of Analysis at the Silicon Valley nanotechnology company Halcyon Molecular (2010-2012) and Director of the Department of Neuroengineering at Tecnalia, the third largest private research organization in Europe (2008-2010). Dr. Koene founded the Neural Engineering Corporation (Massachusetts) and was a research professor at Boston University’s Center for Memory and Brain. Dr. Koene earned his Ph.D. in Computational Neuroscience at the Department of Psychology at McGill University, as well as an M.Sc. in Electrical Engineering with a specialization in Information Theory at Delft University of Technology. He is a core member of the University of Oxford working group that convened in 2007 to create the first roadmap toward whole brain emulation (a term Koene proposed in 2000). Dr. Koene’s professional expertise includes computational neuroscience, neural engineering, psychology, information theory, electrical engineering and physics.

In collaboration with the VU University Amsterdam, Dr. Koene led the creation of NETMORPH, a computational framework for the simulated morphological development of large‑scale high‑resolution neuroanatomically realistic neuronal circuitry.

Luke Muehlhauser: You were a participant in the 2007 workshop that led to FHI’s Whole Brain Emulation: A Roadmap report. The report summarizes the participants’ views on several issues. Would you mind sharing your own estimates on some of the key questions from the report? In particular, at what level of detail do you think we’ll need to emulate a human brain to achieve WBE? (molecules, proteome, metabolome, electrophysiology, spiking neural network, etc.)

(By “WBE” I mean what the report calls success criterion 6a (“social role-fit emulation”), so as to set aside questions of consciousness and personal identity.)


Randal Koene: It would be problematic to base your questions largely on the 2007 report. All of those involved are pretty much in agreement that said report did not constitute a “roadmap”, because it did not actually lay out a concrete, well-devised theoretical plan by which whole brain emulation is both possible and feasible. The 2007 white paper focuses almost exclusively on structural data acquisition and does not explicitly address the problem of system identification in an unknown (“black box”) system. That problem is fundamental to questions about “levels of detail” and more. It immediately forces you to think about constraints: what is successful/satisfactory brain emulation?

System identification (in the small) is demonstrated by the neuroprosthetic work of Ted Berger. Taking that example and proof-of-principle, and applying it to the whole brain, leads to a plan for decomposition into feasible parts. That’s what the actual roadmap is about.

I don’t know if you’ve encountered these two papers, but you might want to read them and contrast them with the 2007 report:

I think that a range of different levels of detail will be involved in WBE. For example, as work by Ted Berger on a prosthetic hippocampus has already shown, it may often be adequate to emulate at the level of spike timing and patterns of neural spikes. It is quite possible that, from a functional perspective, emulation at that level can capture that which is perceptible to us. Consider: differences between pre- and post-synaptic spike times are the basis for synaptic strengthening (spike-timing-dependent potentiation), i.e., the encoding of long-term memory. Trains of spikes are used to communicate sensory input (visual, auditory, etc.). Patterns of spikes are used to drive groups of muscles (locomotion, speech, etc.).
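
Pair-based spike-timing rules of this kind are commonly written as exponentially decaying weight updates. A minimal sketch, using standard textbook parameter values that are not specific to Koene’s work or to any particular emulation proposal:

```python
import math

# Sketch of a standard pair-based spike-timing-dependent plasticity (STDP) rule:
# the synaptic weight change depends on the difference between post- and
# pre-synaptic spike times. Amplitudes and time constants below are arbitrary
# textbook-style values, not parameters from Koene's work.

A_PLUS, A_MINUS = 0.01, 0.012      # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # decay time constants (milliseconds)

def stdp_weight_change(t_pre_ms, t_post_ms):
    dt = t_post_ms - t_pre_ms
    if dt > 0:      # pre fires before post: strengthen the synapse
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    if dt < 0:      # post fires before pre: weaken the synapse
        return -A_MINUS * math.exp(dt / TAU_MINUS)
    return 0.0

print(stdp_weight_change(10.0, 15.0))   # ~ +0.0078 (potentiation)
print(stdp_weight_change(15.0, 10.0))   # ~ -0.0093 (depression)
```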

That said, a good emulation will probably require a deeper level of data acquisition for parameter estimation, and possibly also a deeper level of emulation in some cases, for example if we try to distinguish different types of synaptic receptors, and therefore how particular neurons can communicate with each other. I’m sure there are many other examples.
So, my hunch (strictly a hunch!) is that whole brain emulation will ultimately involve a combination of tools that carry out most data acquisition at one level, but which in some places or at some times dive deeper to pick up local dynamics.

I think it is likely that we will need to acquire structure data at least at the level of current connectomics that enables identification of small axons/dendrites and synapses. I also think it is likely that we will need to carry out much electrophysiology, amounting to what is now called the Brain Activity Map (BAM).
I think it is less likely that we will need to map all proteins or molecules throughout an entire brain – though it is very likely that we will be studying each of those thoroughly in representative components of brains in order to learn how best to relate measurable quantities with parameters and dynamics to be represented in emulation.

(Please don’t interpret my answer as “spiking neural networks”, because that does not refer to a data acquisition level, but a certain type of network abstraction for artificial neural networks.)
Read more »

Max Tegmark on the mathematical universe

Conversations

Known as “Mad Max” for his unorthodox ideas and passion for adventure, Max Tegmark has scientific interests ranging from precision cosmology to the ultimate nature of reality, all explored in his new popular book “Our Mathematical Universe”. He is an MIT physics professor with more than two hundred technical papers, 12 of them cited over 500 times, and he has been featured in dozens of science documentaries. His work with the SDSS collaboration on galaxy clustering shared first prize in Science magazine’s “Breakthrough of the Year: 2003.”

Luke Muehlhauser: Your book opens with a concise argument against the absurdity heuristic — the rule of thumb which says “If a theory sounds absurd to my human psychology, it’s probably false.” You write:

Evolution endowed us with intuition only for those aspects of physics that had survival value for our distant ancestors, such as the parabolic orbits of flying rocks (explaining our penchant for baseball). A cavewoman thinking too hard about what matter is ultimately made of might fail to notice the tiger sneaking up behind and get cleaned right out of the gene pool. Darwin’s theory thus makes the testable prediction that whenever we use technology to glimpse reality beyond the human scale, our evolved intuition should break down. We’ve repeatedly tested this prediction, and the results overwhelmingly support Darwin. At high speeds, Einstein realized that time slows down, and curmudgeons on the Swedish Nobel committee found this so weird that they refused to give him the Nobel Prize for his relativity theory. At low temperatures, liquid helium can flow upward. At high temperatures, colliding particles change identity; to me, an electron colliding with a positron and turning into a Z-boson feels about as intuitive as two colliding cars turning into a cruise ship. On microscopic scales, particles schizophrenically appear in two places at once, leading to the quantum conundrums mentioned above. On astronomically large scales… weirdness strikes again: if you intuitively understand all aspects of black holes [then you] should immediately put down this book and publish your findings before someone scoops you on the Nobel Prize for quantum gravity… [also,] the leading theory for what happened [in the early universe] suggests that space isn’t merely really really big, but actually infinite, containing infinitely many exact copies of you, and even more near-copies living out every possible variant of your life in two different types of parallel universes.

Like much of modern physics, the hypotheses motivating MIRI’s work can easily run afoul of a reader’s own absurdity heuristic. What are your best tips for getting someone to give up the absurdity heuristic, and try to judge hypotheses via argument and evidence instead?


Max Tegmark: That’s a very important question: I think of the absurdity heuristic as a cognitive bias that’s not only devastating for any scientist hoping to make fundamental discoveries, but also dangerous for any sentient species hoping to avoid extinction. Although it appears daunting to get most people to drop this bias altogether, I think it’s easier if we focus on a specific example. For instance, whereas our instinctive fear of snakes is innate and evolved, our instinctive fear of guns (which the Incas lacked) is learned. Just as people learned to fear nuclear weapons through blockbuster horror movies such as “The Day After”, rational fear of unfriendly AI could undoubtedly be learned through a future horror movie that’s less unrealistic than Terminator III, backed up by a steady barrage of rational arguments from organizations such as MIRI.

In the meantime, I think a good strategy is to confront people with some incontrovertible fact that violates their absurdity heuristic and the whole notion that we’re devoting adequate resources and attention to existential risks. For example, I like to ask why more people have heard of Justin Bieber than of Vasili Arkhipov, even though it wasn’t Justin who singlehandedly prevented a Soviet nuclear attack during the Cuban Missile Crisis.

Read more »