Seminar | November 9 | 10-11 a.m. | 8019 Berkeley Way West
Michael Dennis
Electrical Engineering and Computer Sciences (EECS)
Before applying an AI system to any real-world problem, we must first explicitly specify an objective for the system to pursue. Whether this is a loss function over a dataset, an MDP to plan in, or a simulator in which to train an RL agent, eventually we must write down a specification that reduces our vague real-world problem to an explicit computational problem. While most agree that the true problem we want to solve is not what was explicitly written down, there has been no effective way of studying when a given specification is good enough, or when one specification is better than another. In this talk, I will discuss a formal framework for resolving such disagreements (specification design problems), which formally models the decision faced by the designer when choosing between several problem specifications. With this theory, we can compute the justification value of a specification, which I argue ought to guide AI designers in determining what they should specify. I explore some implications of this theory, such as how effective AI designers will often use self-referential claims to efficiently give agents advice, and show how traditional Bayesian reasoning can be extended to include such claims. Finally, I discuss how the problem specifications described in this theory can be solved with algorithms for unsupervised environment design such as PAIRED.
michael_dennis@cs.berkeley.edu
Jean Nguyen, jeannguyen@eecs.berkeley.edu, 510-642-9413