Data Engineering Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some o...

Ranked #1

Putting Apache Spark Into Action with Jean Georges Perrin - Episode 60

Summary: Apache Spark is a popular and widely used tool for a variety of data oriented projects. With the large array of ...

10 Dec 2018

50mins

Ranked #2

Rebuilding Yelp's Data Pipeline with Justin Cunningham - Episode 5

Summary: Yelp needs to be able to consume and process all of the user interactions that happen in their platform in as cl...

18 Jun 2017

42mins

Ranked #3

Automating Your Production Dataflows On Spark

Summary: As data engineers the health of our pipelines is our highest priority. Unfortunately, there are countless ways t...

4 Nov 2019

48mins

Ranked #4

Putting Airflow Into Production With James Meickle - Episode 43

Summary: The theory behind how a tool is supposed to work and the realities of putting it into practice are often at odds...

13 Aug 2018

48mins

Ranked #5

Taking A Tour Of PostgreSQL with Jonathan Katz - Episode 42

Summary: One of the longest running and most popular open source database projects is PostgreSQL. Because of its extensib...

6 Aug 2018

56mins

Ranked #6

Organizing And Empowering Data Engineers At Citadel

Summary: The financial industry has long been driven by data, requiring a mature and robust capacity for discovering and ...

3 Dec 2019

45mins

Ranked #7

Data Engineering Weekly with Joe Crobak - Episode 27

Summary: The rate of change in the data engineering industry is alternately exciting and exhausting. Joe Crobak found his...

15 Apr 2018

43mins

Ranked #8

An Agile Approach To Master Data Management with Mark Marinelli - Episode 46

Summary: With the proliferation of data sources to give a more comprehensive view of the information critical to your bus...

3 Sep 2018

47mins

Ranked #9

Build Maintainable And Testable Data Applications With Dagster

Summary: Despite the fact that businesses have relied on useful and accurate data to succeed for decades now, the state o...

28 Oct 2019

1hr 7mins

Ranked #10

Building Data Flows In Apache NiFi With Kevin Doran and Andy LoPresto - Episode 39

Summary: Data integration and routing is a constantly evolving problem and one that is fraught with edge cases and compli...

8 Jul 2018

1hr 4mins

Ranked #11

Data Teams with Will McGinnis - Episode 19

Summary: The responsibilities of a data scientist and a data engineer often overlap and occasionally come to cross purpos...

19 Feb 2018

28mins

Ranked #12

The Workflow Engine For Data Engineers And Data Scientists

Summary: Building a data platform that works equally well for data engineering and data science is a task that requires f...

25 Jun 2019

1hr 8mins

Ranked #13

Scaling Data Governance For Global Businesses With A Data Hub Architecture

Summary: Data governance is a complex endeavor, but scaling it to meet the needs of a complex or globally distributed org...

9 Mar 2020

54mins

Ranked #14

A DataOps vs DevOps Cookoff In The Data Kitchen

Summary: Delivering a data analytics project on time and with accurate information is critical to the success of any busi...

18 Mar 2019

54mins

Ranked #15

Building Tools And Platforms For Data Analytics

Summary: Data engineers are responsible for building tools and platforms to power the workflows of other members of the b...

26 Aug 2019

48mins

Ranked #16

Data Serialization Formats with Doug Cutting and Julien Le Dem - Episode 8

Summary: With the wealth of formats for sending and storing data it can be difficult to determine which one to use. In th...

22 Nov 2017

51mins

Ranked #17

Buzzfeed Data Infrastructure with Walter Menendez - Episode 7

Summary: Buzzfeed needs to be able to understand how its users are interacting with the myriad articles, videos, etc. tha...

14 Nov 2017

43mins

Ranked #18

Build Your Data Analytics Like An Engineer With DBT

Summary: In recent years the traditional approach to building data warehouses has shifted from transforming records befor...

20 May 2019

56mins

Ranked #19

Solving Data Discovery At Lyft

Summary: Data is only valuable if you use it for something, and the first step is knowing that it is available. As organi...

5 Aug 2019

51mins

Ranked #20

Evolving An ETL Pipeline For Better Productivity

Summary: Building an ETL pipeline can be a significant undertaking, and sometimes it needs to be rebuilt when a better op...

4 Jun 2019

1hr 2mins
