Education
Science & Medicine
Technology
Natural Sciences

Data Skeptic

Updated 13 days ago

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

iTunes Ratings

389 Ratings
Average Ratings
5 stars: 269
4 stars: 78
3 stars: 16
2 stars: 18
1 star: 8

The “banter” is truly awful

By fizzbuzzbot - Jul 12 2019
The main podcast is good. But, I have to skip the episodes where he talks to the female co-host as if she’s an idiot while she asks ridiculous questions and makes insipid comments. Honestly, she’s frequently repeating back something simple that he said as if it’s a deep mystery of the universe. i.e. Female co-host: “computer?” (Sounding utterly puzzled) Male host: “Yes, a computer” (launches into 10 minute explanation as if to a child). Then, often she will laugh or sigh or giggle for no apparent reason. I’ve wondered if she’s actually high when she talk so slow and giggles at random. If she’s not and this is scripted, pretty demeaning.

Good. Could be great.

By Peaceful bird - Jul 02 2019
This show is very good, but would be great if the co-host was more science-minded. As it stands, the mini episodes consist of Kyle explaining technical concepts to Linh Da, who is intended to be the layperson and prevent Kyle from getting too jargon-y. She is effective in that capacity. However, quite a bit of time gets wasted with arguments that would mostly not occur if Kyle were speaking to, say, a trained biologist, or even an attorney. Because it gets pretty annoying, I have to keep my listening of the show fairly sparse. Great show overall, and the deeper dives with guests are killer.

Rank #1: BERT

Kyle provides a non-technical overview of why Bidirectional Encoder Representations from Transformers (BERT) is a powerful tool for natural language processing projects.

Jul 29 2019
13 mins

Rank #2: Being Bayesian

This episode explores the root concept of what it is to be Bayesian: describing knowledge of a system probabilistically, having an appropriate prior probability, knowing how to weigh new evidence, and following Bayes's rule to compute the revised distribution.

We present this concept in a few different contexts but primarily focus on how our bird Yoshi sends signals about her food preferences.

Like many animals, Yoshi is a complex creature whose preferences cannot easily be summarized by a straightforward utility function the way they might be in a textbook reinforcement learning problem. Her preferences are sequential, conditional, and evolving. We may not always know what our bird is thinking, but we have some good indicators that give us clues.

Oct 26 2018
24 mins
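
The update step described above (prior, evidence, Bayes's rule, revised distribution) can be sketched in a few lines of Python. This is only an illustration: the two hypotheses about the bird and all of the probabilities are invented here and do not come from the episode.

```python
def bayes_update(prior, likelihood):
    """Return the posterior distribution after one observation.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(observation | hypothesis)
    """
    # Numerator of Bayes's rule for every hypothesis.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    # Total probability of the observation (the normalizing constant).
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical numbers: two competing beliefs about the bird's preference,
# and how likely each makes the signal we just observed.
prior = {"prefers_seeds": 0.5, "prefers_fruit": 0.5}
likelihood = {"prefers_seeds": 0.2, "prefers_fruit": 0.8}

posterior = bayes_update(prior, likelihood)
print(posterior)
```

Feeding the posterior back in as the next prior is one simple way to model preferences that evolve as more signals arrive.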

Rank #3: Machine Learning Done Wrong

Cheng-tao Chu (@chengtao_chu) joins us this week to discuss his perspective on common mistakes and pitfalls that are made when doing machine learning. This episode is filled with sage advice for beginners and intermediate users of machine learning, and possibly some good reminders for experts as well. Our discussion parallels his recent blog post Machine Learning Done Wrong.

Cheng-tao Chu is an entrepreneur who has worked at many well-known Silicon Valley companies. His paper Map-Reduce for Machine Learning on Multicore is the basis for Apache Mahout. His most recent endeavor has just emerged from stealth, so please check out OneInterview.io.

Apr 01 2016
25 mins

Rank #4: Game Theory

Thanks to our sponsor The Great Courses.

This week's episode is a short primer on game theory.

For tickets to the free Data Skeptic meetup in Chicago on Tuesday, May 15 at the Mendoza College of Business (224 South Michigan Avenue, Suite 350), click here.

May 11 2018
24 mins

Rank #5: Zillow Zestimate

Zillow is a leading real estate information and home-related marketplace. We interviewed Andrew Martin, a data science Research Manager at Zillow, to learn more about how Zillow uses data science and big data to make real estate predictions.

Sep 01 2017
37 mins

Rank #6: [MINI] type i / type ii errors

In this first mini-episode of the Data Skeptic Podcast, we define and discuss type i and type ii errors (a.k.a. false positives and false negatives).

May 30 2014
11 mins

Rank #7: [MINI] R-squared

How well does your model explain your data? R-squared is a useful statistic for answering this question. In this episode we explore how it applies to the problem of valuing a house. Aspects like the number of bedrooms go a long way in explaining why different houses have different prices. There's some amount of variance that can be explained by a model, and some amount that cannot be directly measured. R-squared is the ratio of the explained variance to the total variance. It's not a measure of accuracy; it's a measure of the explanatory power of one's model.

Mar 04 2016
13 mins
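
As a rough illustration of the ratio described above, the sketch below computes R-squared as one minus the unexplained (residual) share of the total variance, which matches "explained over total" for an ordinary least-squares fit with an intercept. The house prices and model predictions are made-up numbers, not data from the episode.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the observed values around their mean.
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    # Variation the model failed to explain.
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical sale prices (in thousands) and a model's predictions.
prices = [200.0, 300.0, 400.0, 500.0]
predicted = [220.0, 280.0, 410.0, 490.0]

score = r_squared(prices, predicted)
print(round(score, 4))  # 0.98
```

A score of 0.98 here says the model accounts for 98% of the variation in these prices; it says nothing about how large any individual pricing error is.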

Rank #8: Let's Talk About Natural Language Processing

This episode reboots our podcast with the theme of Natural Language Processing for the next few months.

We begin with introductions of Yoshi and Linh Da and then get into a broad discussion about natural language processing: what it is, what some of the classic problems are, and just a bit on approaches.

Finishing out the show is an interview with Lucy Park about her work on the KoNLPy library for Korean NLP in Python.

If you want to share your NLP project, please join our Slack channel.  We're eager to see what listeners are working on!

http://konlpy.org/en/latest/

Jan 04 2019
36 mins

Rank #9: Data Science Hiring Processes

Kyle shares a few thoughts on mistakes he has observed job applicants making, along with a few procedural insights that listeners at early stages in their careers might find valuable.

Dec 28 2018
33 mins

Rank #10: Quantum Computing

In this week's episode, Scott Aaronson, a professor at the University of Texas at Austin, explains what a quantum computer is, various possible applications, the types of problems they are good at solving and much more. Kyle and Scott have a lively discussion about the capabilities and limits of quantum computers and computational complexity.

Dec 01 2017
47 mins

Rank #11: Crypto

How do people think rationally about small probability events?

What is the optimal statistical process by which one can update their beliefs in light of new evidence?

This episode of Data Skeptic explores questions like these as Kyle consults a cast of previous guests and experts to try to answer the question "What is the probability, however small, that Bigfoot is real?"

Jul 17 2015
1 hour 24 mins

Rank #12: Advertising Attribution with Nathan Janos

A conversation with Convertro's Nathan Janos about methodologies used to help advertisers understand the effect each of their marketing efforts (print, SEM, display, skywriting, etc.) has on their overall return.

Jun 06 2014
1 hour 16 mins

Rank #13: [MINI] Primer on Deep Learning

In this episode, we talk about a high-level description of deep learning. Kyle presents a simple game (pictured below), which is more of a puzzle really, to try and give Linh Da the basic concept.

Thanks to our sponsor for this week, the Data Science Association. Please check out their upcoming Dallas conference at dallasdatascience.eventbrite.com

Feb 10 2017
14 mins

Rank #14: The Complexity of Learning Neural Networks

Over the past several years, we have seen many success stories in machine learning brought about by deep learning techniques. While the practical success of deep learning has been phenomenal, the formal guarantees have been lacking. Our current theoretical understanding of the many techniques that are central to the current ongoing big-data revolution is far from being sufficient for rigorous analysis, at best. In this episode of Data Skeptic, our host Kyle Polich welcomes guest John Wilmes, a mathematics post-doctoral researcher at Georgia Tech, to discuss the efficiency of neural network learning through complexity theory.

Oct 20 2017
38 mins

Rank #15: [MINI] Recurrent Neural Networks

RNNs are a class of deep learning models designed to capture sequential behavior.  An RNN trains a set of weights which depend not just on new input but also on the previous state of the neural network.  This directed cycle allows the training phase to find solutions which rely on the state at a previous time, thus giving the network a form of memory.  RNNs have been used effectively in language analysis, translation, speech recognition, and many other tasks.

Aug 18 2017
17 mins

Rank #16: [MINI] Gradient Descent

Today's mini episode discusses the widely known optimization algorithm gradient descent in the context of hiking on a foggy hillside.

Jan 08 2016
14 mins
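
The foggy-hillside picture translates directly into code: at each step you can only feel the local slope, so you take a small step downhill and repeat. The sketch below is a minimal one-dimensional version; the function being minimized, the learning rate, and the step count are arbitrary choices made for illustration.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        # Move a small distance in the downhill direction.
        x -= learning_rate * gradient(x)
    return x


# Hypothetical hillside: f(x) = (x - 3)^2, whose slope is 2 * (x - 3).
# The lowest point is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 6))  # 3.0
```

On a bumpier function the same loop can settle into a local dip rather than the lowest valley, which is the hiker's problem in the fog.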

Rank #17: [MINI] Random Forest

Random forest is a popular ensemble learning algorithm which leverages bagging both for sampling and feature selection. In this episode we make an analogy to the process of running a bookstore.

Oct 07 2016
12 mins

Rank #18: [MINI] The T-Test

The t-test is this week's mini-episode topic. The t-test is a statistical testing procedure used to determine if the means of two datasets differ by a statistically significant amount. We discuss how a wine manufacturer might apply a t-test to determine if the sweetness, acidity, or some other property of two separate grape vines might differ in a statistically meaningful way.

Oct 17 2014
17 mins
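
For a concrete feel for the comparison described above, the sketch below computes a two-sample t statistic using only the Python standard library. It uses Welch's unequal-variance form, which may not be the exact variant discussed in the episode, and the acidity readings for the two vines are invented.

```python
import math
import statistics


def welch_t_statistic(sample_a, sample_b):
    """Two-sample t statistic (Welch's form, unequal variances allowed)."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical acidity measurements from two grape vines.
vine_a = [3.1, 3.3, 3.2, 3.4, 3.0]
vine_b = [3.6, 3.8, 3.7, 3.9, 3.5]

t_stat = welch_t_statistic(vine_a, vine_b)
print(round(t_stat, 3))  # -5.0
```

The statistic alone is not a verdict: it still has to be compared against a t distribution with the appropriate degrees of freedom to obtain a p-value.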

Rank #19: Big Data Tools and Trends

In this episode, I speak with Raghu Ramakrishnan, CTO for Data at Microsoft.  We discuss services, tools, and developments in the big data sphere as well as the underlying needs that drove these innovations.

Feb 17 2017
30 mins

Rank #20: [MINI] The Chi-Squared Test

The χ² (Chi-Squared) test is a methodology for hypothesis testing. When one has categorical data in the form of frequency counts of observations (e.g. Vegetarian, Pescetarian, and Omnivore), split into two or more categories (e.g. Male, Female), a question may arise such as "Are women more likely than men to be vegetarian?" or, put more accurately, "Does the frequency with which women report being vegetarian differ in a statistically significant way from the frequency with which men report that?"

Feb 06 2015
17 mins
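
The test described above compares observed counts with the counts one would expect if diet were independent of gender. The sketch below computes the χ² statistic for a small contingency table; the survey counts are invented for illustration.

```python
def chi_squared_statistic(table):
    """Pearson's chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical survey counts.
# Rows: Male, Female. Columns: Vegetarian, Pescetarian, Omnivore.
counts = [
    [10, 10, 80],
    [20, 10, 70],
]

chi2 = chi_squared_statistic(counts)
degrees_of_freedom = (len(counts) - 1) * (len(counts[0]) - 1)
print(round(chi2, 3), degrees_of_freedom)  # 4.0 2
```

With 2 degrees of freedom the critical value at the 5% level is about 5.99, so in this made-up survey the observed difference would not count as statistically significant.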
