
Rank #122 in Science category

Technology
Science

Data Skeptic

Updated about 1 month ago


The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.


iTunes Ratings

428 Ratings
Ratings breakdown:
5 stars: 291
4 stars: 85
3 stars: 22
2 stars: 20
1 star: 10

Very Informative Enjoyable To Listen To

By prime_player - May 05 2020
I enjoy the topic and discussion Kyle generates. Very interesting material and interviews. Look forward to each episode.

great podcast

By Artemis_2 - Apr 19 2020
great podcast for introducing recent important works in machine learning


Data Skeptic

Latest release on Jul 11, 2020


Rank #1: [MINI] Markov Chain Monte Carlo


This episode explores how going wine tasting could teach us about using Markov chain Monte Carlo (MCMC).
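
As a rough sketch of the idea (not the episode's wine-tasting example), a minimal Metropolis sampler can draw from a distribution we only know up to a constant; the standard-normal target, starting point, and proposal width below are all illustrative choices:

```python
import math
import random

def metropolis(log_target, start, proposal_sd, n_samples, seed=0):
    """Draw correlated samples from an unnormalized density via Metropolis MCMC."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        # Propose a small random step around the current state.
        candidate = x + rng.gauss(0, proposal_sd)
        # Accept with probability min(1, target(candidate) / target(x)).
        if math.log(rng.random()) < log_target(candidate) - log_target(x):
            x = candidate
        samples.append(x)
    return samples

# Target: standard normal (known answer: mean 0), so we can sanity-check.
samples = metropolis(lambda x: -0.5 * x * x, start=5.0, proposal_sd=1.0, n_samples=20000)
mean = sum(samples[2000:]) / len(samples[2000:])  # discard burn-in
print(f"posterior mean estimate: {mean:.3f}")
```

Even starting far from the mode (at 5.0), the chain wanders into the high-probability region and the post-burn-in average lands near the true mean of 0.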

Apr 03 2015

15mins


Rank #2: Quantum Computing


In this week's episode, Scott Aaronson, a professor at the University of Texas at Austin, explains what a quantum computer is, various possible applications, the types of problems they are good at solving and much more. Kyle and Scott have a lively discussion about the capabilities and limits of quantum computers and computational complexity.

Dec 01 2017

47mins


Rank #3: Game Theory


Thanks to our sponsor The Great Courses.

This week's episode is a short primer on game theory.

For tickets to the free Data Skeptic meetup in Chicago on Tuesday, May 15 at the Mendoza College of Business (224 South Michigan Avenue, Suite 350), click here.

May 11 2018

24mins


Rank #4: [MINI] Confidence Intervals


Commute times and BBQ invites help frame a discussion about the statistical concept of confidence intervals.
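
As a hedged illustration of the concept (the commute-time numbers below are invented, not from the episode), a 95% confidence interval for a mean can be computed from the sample standard deviation and a t critical value:

```python
import math

# Hypothetical commute times in minutes -- illustrative data only.
commutes = [31, 28, 35, 40, 26, 33, 29, 38, 31, 34, 27, 36]

n = len(commutes)
mean = sum(commutes) / n
# Sample standard deviation (with Bessel's correction, dividing by n - 1).
sd = math.sqrt(sum((x - mean) ** 2 for x in commutes) / (n - 1))
# Two-sided 95% t critical value for n - 1 = 11 degrees of freedom is about 2.201.
t_crit = 2.201
half_width = t_crit * sd / math.sqrt(n)
ci = (mean - half_width, mean + half_width)
print(f"mean = {mean:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

The interpretation to be careful about: under repeated sampling, 95% of intervals constructed this way would contain the true mean commute time.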

Sep 26 2014

11mins


Rank #5: [MINI] The Chi-Squared Test


The χ² (Chi-Squared) test is a methodology for hypothesis testing. When one has categorical data in the form of frequency counts or observations (e.g. Vegetarian, Pescetarian, and Omnivore), split into two or more categories (e.g. Male, Female), a question may arise such as "Are women more likely than men to be vegetarian?" or, put more precisely, "Does the frequency with which women report being vegetarian differ in a statistically significant way from the frequency with which men do?"
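
A minimal sketch of the test described above, using hypothetical survey counts (the numbers are invented for illustration):

```python
# Hypothetical survey counts: rows are women, men; columns are vegetarian, not vegetarian.
observed = [[34, 166],
            [14, 186]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Under independence, the expected count in each cell is row_total * col_total / grand_total.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

# Critical value for alpha = 0.05 with (2-1)*(2-1) = 1 degree of freedom is 3.841.
significant = chi2 > 3.841
print(f"chi2 = {chi2:.2f}, reject independence at 5%: {significant}")
```

If the statistic exceeds the critical value, we reject the hypothesis that diet and sex are independent in this (made-up) sample.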

Feb 06 2015

17mins


Rank #6: Being Bayesian


This episode explores the root concept of what it is to be Bayesian: describing knowledge of a system probabilistically, having an appropriate prior probability, knowing how to weigh new evidence, and following Bayes's rule to compute the revised distribution.

We present this concept in a few different contexts but primarily focus on how our bird Yoshi sends signals about her food preferences.

Like many animals, Yoshi is a complex creature whose preferences cannot easily be summarized by a straightforward utility function the way they might in a textbook reinforcement learning problem. Her preferences are sequential, conditional, and evolving. We may not always know what our bird is thinking, but we have some good indicators that give us clues.

Oct 26 2018

24mins


Rank #7: [MINI] Natural Language Processing


This episode overviews some of the fundamental concepts of natural language processing, including stemming, n-grams, part-of-speech tagging, and the bag-of-words approach.
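
Two of the concepts mentioned above, bag of words and n-grams, can be sketched in a few lines (the tokenizer here is deliberately simplistic):

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace, stripping basic punctuation."""
    return [w.strip(".,!?").lower() for w in text.split()]

def bag_of_words(text):
    """Order-free word counts: the bag-of-words representation."""
    return Counter(tokenize(text))

def ngrams(tokens, n):
    """Contiguous n-token sequences, which retain some local word order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

doc = "The skeptic questions the claim."
bow = bag_of_words(doc)             # word -> count, order discarded
bigrams = ngrams(tokenize(doc), 2)  # adjacent word pairs
```

The bag of words throws away word order entirely, which is exactly the information n-grams partially recover.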

Apr 17 2015

13mins


Rank #8: [MINI] k-means clustering


The k-means clustering algorithm computes a deterministic label assignment, for a given number of clusters k, over an n-dimensional dataset. This mini-episode explores how the biological processes of Yoshi, our lilac-crowned Amazon, might be a useful way of measuring where she sits when there are no humans around. Listen to find out how!
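
A minimal sketch of the algorithm (plain Lloyd's iterations with a naive first-k initialization; the toy points are illustrative, not Yoshi's perch data):

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm: alternate assignment and centroid-update steps."""
    centroids = points[:k]  # naive initialization: first k points
    for _ in range(iters):
        # Assignment step: each point gets the label of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            label = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[label].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids

# Two well-separated blobs; k = 2 should recover their centers.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centroids = kmeans(points, k=2)
```

In practice one runs multiple random initializations, since Lloyd's algorithm only finds a local optimum.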

Feb 20 2015

14mins


Rank #9: Applied Data Science in Industry


Kyle sits down with Jen Stirrup to inquire about her experiences helping companies deploy data science solutions in a variety of different settings.

Sep 06 2019

21mins


Rank #10: The Complexity of Learning Neural Networks


Over the past several years, we have seen many success stories in machine learning brought about by deep learning techniques. While the practical success of deep learning has been phenomenal, the formal guarantees have been lacking. Our current theoretical understanding of the many techniques central to the ongoing big-data revolution is, at best, far from sufficient for rigorous analysis. In this episode of Data Skeptic, our host Kyle Polich welcomes guest John Wilmes, a mathematics post-doctoral researcher at Georgia Tech, to discuss the efficiency of neural network learning through complexity theory.

Oct 20 2017

38mins


Rank #11: [MINI] Bias Variance Tradeoff


A discussion of the expected number of cars at a stoplight frames today's discussion of the bias-variance tradeoff. The central idea of this concept relates to model complexity. A very simple model will likely generalize well from training to testing data, but will have high bias, since its simplicity can prevent it from capturing the relationship between the covariates and the output. As a model grows more complex, it may capture more of the underlying structure, but the risk increases that it overfits the training data and therefore does not generalize (has high variance). The tradeoff between minimizing variance and minimizing bias is an ongoing challenge for data scientists, and an important discussion for skeptics around how much we should trust models.
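
The tradeoff can be made concrete with a toy comparison (invented for illustration, not from the episode): a high-bias model that ignores the input entirely versus a high-variance model that memorizes the training set:

```python
import random

rng = random.Random(0)

def make_data(n):
    """Noisy linear ground truth: y = x + Gaussian noise."""
    return [(x, x + rng.gauss(0, 1.0)) for x in [rng.uniform(0, 10) for _ in range(n)]]

train, test = make_data(50), make_data(50)

def mse(pairs, predict):
    return sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs)

# High-bias model: always predict the training mean, ignoring x entirely.
mean_y = sum(y for _, y in train) / len(train)
bias_model = lambda x: mean_y

# High-variance model: 1-nearest-neighbour, which memorizes the training set.
def nn_model(x):
    return min(train, key=lambda p: abs(p[0] - x))[1]

# 1-NN is perfect on the training data, but chasing the noise hurts on new data.
print("mean model  train/test MSE:", mse(train, bias_model), mse(test, bias_model))
print("1-NN model  train/test MSE:", mse(train, nn_model), mse(test, nn_model))
```

The memorizing model achieves zero training error yet still pays a test-time penalty, which is the overfitting half of the tradeoff in miniature.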

Nov 13 2015

13mins


Rank #12: [MINI] Primer on Deep Learning


In this episode, we give a high-level description of deep learning. Kyle presents a simple game (pictured below), which is really more of a puzzle, to try to give Linh Da the basic concept.

Thanks to our sponsor for this week, the Data Science Association. Please check out their upcoming Dallas conference at dallasdatascience.eventbrite.com

Feb 10 2017

14mins


Rank #13: BERT


Kyle provides a non-technical overview of why Bidirectional Encoder Representations from Transformers (BERT) is a powerful tool for natural language processing projects.

Jul 29 2019

13mins


Rank #14: Practicing and Communicating Data Science with Jeff Stanton


Jeff Stanton joins me in this episode to discuss his book An Introduction to Data Science and some of the unique challenges and issues faced by someone doing applied data science. A challenge for any data scientist is making sure they have a good input data set and applying any necessary data munging steps before analysis. We cover some good advice on how to approach such problems.

Oct 24 2014

36mins


Rank #15: Data Science Hiring Processes


Kyle shares a few thoughts on mistakes observed by job applicants and also shares a few procedural insights listeners at early stages in their careers might find value in.

Dec 28 2018

33mins


Rank #16: The Right (big data) Tool for the Job with Jay Shankar


In this week's episode, we discuss applied solutions to big data problems with big data engineer Jay Shankar. The episode explores approaches and design philosophies for solving real-world big data business problems, and the wide array of tools available.

Jul 07 2014

49mins


Rank #17: [MINI] Bayesian Updating


In this minisode, we discuss Bayesian updating: the process by which one calculates how likely a hypothesis is to be true given one's prior belief and all new evidence.
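
A minimal sketch of one such update (the coin hypotheses and numbers below are illustrative, not from the episode):

```python
# Two hypotheses about a coin: it is fair, or it is biased toward 80% heads.
priors = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5, "biased": 0.8}  # P(heads | hypothesis)

def update(beliefs, heads):
    """One Bayesian update: posterior is proportional to prior times likelihood."""
    unnormalized = {
        h: p * (likelihood[h] if heads else 1 - likelihood[h])
        for h, p in beliefs.items()
    }
    z = sum(unnormalized.values())  # normalizing constant
    return {h: u / z for h, u in unnormalized.items()}

# Observe three heads in a row; belief shifts toward the biased coin.
beliefs = priors
for flip in [True, True, True]:
    beliefs = update(beliefs, flip)
print(beliefs)
```

Each observation reuses the previous posterior as the new prior, which is the "updating" in Bayesian updating.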

Jun 27 2014

11mins


Rank #18: Advertising Attribution with Nathan Janos


A conversation with Convertro's Nathan Janos about the methodologies used to help advertisers understand the contribution each of their marketing efforts (print, SEM, display, skywriting, etc.) makes to their overall return.

Jun 06 2014

1hr 16mins


Rank #19: Data Infrastructure in the Cloud


Kyle chats with Rohan Kumar about hyperscale, data at the edge, and a variety of other trends in data engineering in the cloud.

May 18 2019

30mins


Rank #20: Neuroscience from a Data Scientist's Perspective


... or should this have been called data science from a neuroscientist's perspective? Either way, I'm sure you'll enjoy this discussion with Laurie Skelly. Laurie earned a PhD in Integrative Neuroscience from the Department of Psychology at the University of Chicago. In her life as a social neuroscientist, using fMRI to study the neural processes behind empathy and psychopathy, she learned the ropes of zooming in and out between the macroscopic and the microscopic -- how millions of data points come together to tell us something meaningful about human nature. She's currently at Metis Data Science, an organization that helps people learn the skills of data science to transition into industry.

In this episode, we discuss fMRI technology, Laurie's research studying empathy and psychopathy, as well as the skills and tools used in common between neuroscientists and data scientists. For listeners interested in more on this subject, Laurie recommended the blogs Neuroskeptic, Neurocritic, and Neuroecology.

We conclude the episode with a mention of the upcoming Metis Data Science San Francisco cohort which Laurie will be teaching. If anyone is interested in applying to participate, they can do so here.

Nov 20 2015

40mins


GANs Can Be Interpretable


Erik Härkönen joins us to discuss the paper GANSpace: Discovering Interpretable GAN Controls. During the interview, Kyle makes reference to this amazing interpretable GAN controls video and its accompanying codebase found here. Erik mentions the GANspace Colab notebook, which is a rapid way to try these ideas out for yourself.

Jul 11 2020

26mins


Interpretability Practitioners

Jun 26 2020

32mins


Facial Recognition Auditing

Jun 19 2020

47mins


Robust Fit to Nature

Jun 12 2020

38mins


Black Boxes Are Not Required


Deep neural networks are undeniably effective. They rely on such a high number of parameters that they are appropriately described as “black boxes”.

While black boxes lack desirable properties like interpretability and explainability, in some cases their accuracy makes them incredibly useful.

But does achieving “usefulness” require a black box? Can we be sure an equally valid but simpler solution does not exist?

Cynthia Rudin helps us answer that question. We discuss her recent paper with co-author Joanna Radin titled (spoiler warning)…

Why Are We Using Black Box Models in AI When We Don’t Need To? A Lesson From An Explainable AI Competition

Jun 05 2020

32mins


Robustness to Unforeseen Adversarial Attacks


May 30 2020

21mins


Estimating the Size of Language Acquisition

May 22 2020

25mins


Interpretable AI in Healthcare

May 15 2020

35mins


Understanding Neural Networks


What does it mean to understand a neural network? That's the question posed in this arXiv paper. Kyle speaks with Tim Lillicrap about this and several other big questions.

May 08 2020

34mins


Self-Explaining AI


Dan Elton joins us to discuss self-explaining AI. What could be better than an interpretable model? How about a model which explains itself in a conversational way, engaging in a back-and-forth with the user?

We discuss the paper Self-explaining AI as an alternative to interpretable AI, which presents a framework for self-explaining AI.

May 02 2020

32mins


Plastic Bag Bans


Becca Taylor joins us to discuss her work studying the impact of plastic bag bans as published in Bag Leakage: The Effect of Disposable Carryout Bag Regulations on Unregulated Bags from the Journal of Environmental Economics and Management. How does one measure the impact of these bans? Are they achieving their intended goals? Join us and find out!

Apr 24 2020

34mins


Self Driving Cars and Pedestrians

Apr 18 2020

30mins


Computer Vision is Not Perfect


Julia Evans joins us to help answer the question: why do neural networks think a panda is a vulture? Kyle talks to Julia about her hands-on work fooling neural networks.

Julia runs Wizard Zines, which publishes works such as Your Linux Toolbox. You can find her on Twitter @b0rk

Apr 10 2020

26mins


Uncertainty Representations


Jessica Hullman joins us to share her expertise on data visualization and communication of data in the media. We discuss Jessica’s work on visualizing uncertainty, interviewing visualization designers on why they don't visualize uncertainty, and modeling interactions with visualizations as Bayesian updates.

Homepage: http://users.eecs.northwestern.edu/~jhullman/

Lab: MU Collective

Apr 04 2020

39mins


AlphaGo, COVID-19 Contact Tracing and New Data Set

Announcing Journal Club

I am pleased to announce Data Skeptic is launching a new spin-off show called "Journal Club", with similar themes but a very different format from the Data Skeptic everyone is used to.

In Journal Club, we will have a regular panel and occasional guest panelists to discuss interesting news items and one featured journal article every week in a roundtable discussion. Each week, I'll be joined by Lan Guo and George Kemp for a discussion of interesting data science related news articles and a featured journal or pre-print article.

We hope that this podcast will give listeners an introduction to the works we cover and how people discuss these works. Our topics will often coincide with the original Data Skeptic podcast's current Interpretability theme, but we have few rules right now on what we pick. We enjoy discussing these items with each other and we hope you will too.

In the coming weeks, we will start opening up the guest chair more often to bring new voices to our discussion. After that we'll be looking for ways we can engage with our audience.

Keep reading and thanks for listening!

Kyle

Mar 28 2020

33mins


Visualizing Uncertainty


Mar 20 2020

32mins


Interpretability Tooling


Pramit Choudhary joins us to talk about the methodologies and tools used to assist with model interpretability.

Mar 13 2020

42mins


Shapley Values


Kyle and Linhda discuss how Shapley Values might be a good tool for determining what makes the cut for a home renovation.
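
As an illustrative sketch (the renovation projects and dollar values below are invented, not from the episode), exact Shapley values for a small cooperative game can be computed by averaging each player's marginal contribution over all join orders:

```python
import math
from itertools import permutations

# Hypothetical renovation "game": total value added (in $k) by each coalition of projects.
v = {
    frozenset(): 0,
    frozenset({"kitchen"}): 30,
    frozenset({"bath"}): 20,
    frozenset({"paint"}): 5,
    frozenset({"kitchen", "bath"}): 60,   # kitchen and bath together are worth extra
    frozenset({"kitchen", "paint"}): 36,
    frozenset({"bath", "paint"}): 26,
    frozenset({"kitchen", "bath", "paint"}): 67,
}

players = ["kitchen", "bath", "paint"]

def shapley(player):
    """Average the player's marginal contribution over all n! join orders."""
    total = 0.0
    for order in permutations(players):
        before = frozenset(order[: order.index(player)])
        total += v[before | {player}] - v[before]
    return total / math.factorial(len(players))

values = {p: shapley(p) for p in players}
```

A key property on display: the Shapley values sum exactly to the value of the grand coalition, so the full budget is attributed with nothing left over.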

Mar 06 2020

20mins


Anchors as Explanations


We welcome back Marco Tulio Ribeiro to discuss research he has done since our original discussion on LIME.

In particular, we ask the question Are Red Roses Red? and discuss how Anchors provide high precision model-agnostic explanations.

Please take our listener survey.

Feb 28 2020

37mins

