Buck Shlegeris

15 Podcast Episodes

The current alignment plan, and how we might improve it | Buck Shlegeris | EAG Bay Area 23

In this session, Buck is discussing how he thinks we should try to align artificial general intelligen...

26 May 2023

50mins

[Week 1] “Worst-case thinking in AI alignment” by Buck Shlegeris, 2021

Alternative title: “When should you assume that what could go wrong, will go wrong?” Thanks to Mary Phuong and Ryan Gree...

13 May 2023

Similar People

[Week 4] “Supervising strong learners by amplifying weak experts” by Paul Christiano, Buck Shlegeris & Dario Amodei

Abstract: Many real world learning tasks involve complex or hard-to-specify objectives, and using an easier-to-specify p...

[Week 1] “Worst-case thinking in AI alignment” by Buck Shlegeris, 2021

Alternative title: “When should you assume that what could go wrong, will go wrong?” Thanks to Mary Phuong and Ryan Gree...

11 May 2023

Most Popular

AF - Polysemanticity and Capacity in Neural Networks by Buck Shlegeris

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...

7 Oct 2022

4mins

SERI 2022: AI alignment and Redwood Research | Buck Shlegeris (CTO)

Buck Shlegeris is the CTO of Redwood Research. Buck previously worked at MIRI, studied computer science and physics at t...

12 Aug 2022

29mins

Taking pleasure in being wrong (with Buck Shlegeris)

How hard is it to arrive at true beliefs about the world? How can you find enjoyment in b...

8 Jun 2022

1hr 16mins

AF - Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing by Buck Shlegeris

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...

2 Jun 2022

4mins

AF - The prototypical catastrophic AI action is getting root access to its datacenter by Buck Shlegeris

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...

2 Jun 2022

3mins

AF - The case for becoming a black-box investigator of language models by Buck Shlegeris

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist ...

6 May 2022

4mins
