The alignment problem: how can machines learn human values?

After reading the excellent Algorithms to Live By: The Computer Science of Human Decisions last year, I was pleased to see that Brian Christian (@brianchristian) not only has a new book out but was also presenting a virtual event at the Ri.

The presentation explored the ethical issues of artificial intelligence and what happens when it goes wrong. It is based on his new book, The Alignment Problem: Machine Learning and Human Values.

The presentation can be seen on the Ri YouTube channel.

The journey starts in 1939 with Bertrand Russell, Walter Pitts and Warren McCulloch. Citing examples such as machine learning issues in the judicial system and racial bias, the discussion moved on to the two key challenges:

  1. The training data
  2. The objective function

Examples where issues with training data caused undesirable consequences included offensive misclassification of photos and even a self-driving car collision that resulted in the death of a pedestrian.
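To make the training data point concrete, here is a minimal sketch of my own (not taken from the talk or the book) of one way skewed data can go wrong. The dataset, group names and labels are all hypothetical, and the "model" is deliberately the simplest one possible: it always predicts the most common label it saw.

```python
from collections import Counter

# Hypothetical toy dataset of (group, true_label) pairs.
# Group "B" is badly under-represented: 5 examples out of 100.
train = [("A", "cat")] * 95 + [("B", "dog")] * 5


def fit_majority(rows):
    """'Train' by returning the most common label, ignoring the group."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def accuracy(rows, prediction):
    """Fraction of rows whose true label matches the single prediction."""
    return sum(label == prediction for _, label in rows) / len(rows)


prediction = fit_majority(train)
overall = accuracy(train, prediction)
group_b = accuracy([row for row in train if row[0] == "B"], prediction)

print(f"prediction={prediction} overall={overall:.2f} group_B={group_b:.2f}")
```

The overall score looks healthy, yet every example from the under-represented group is misclassified. Real systems are far more complex, but the same shape of failure can hide behind a good headline number.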

Problems with the objective function included ethical issues and models that learn, in unexpected ways, to “game” the system and produce outcomes nobody intended.
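As a rough illustration of "gaming" an objective, here is a toy sketch of my own (again, not from the talk). The policy names, point values and step counts are invented: the designer wants the agent to finish a race, but the objective only counts points, and points can also be collected by circling forever.

```python
def run(policy, steps=20):
    """Simulate a hypothetical policy; return (points, finished_race)."""
    points, finished = 0, False
    for t in range(steps):
        if policy == "race_to_finish":
            if t == 9:  # crosses the finish line on the 10th step
                points += 10
                finished = True
                break
        elif policy == "circle_for_bonuses":
            points += 1  # picks up a respawning bonus every step
    return points, finished


policies = ["race_to_finish", "circle_for_bonuses"]
results = {p: run(p) for p in policies}

# The objective function only sees points, not whether the race was finished.
best = max(policies, key=lambda p: results[p][0])

for name, (points, finished) in results.items():
    print(f"{name}: points={points} finished={finished}")
print("objective prefers:", best)
```

The objective prefers the policy that never finishes the race, because that is what the points reward. Nothing here is broken in the software sense; the objective simply does not say what the designer meant.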

Worth the donation to watch!