OwlTail

Devin Petersohn

4 Podcast Episodes

Latest 3 Jul 2022 | Updated Daily

Modin: Pandas Scalability with Devin Petersohn

Software Daily

Pandas is a Python data analysis library, and an essential tool in data science. Pandas allows users to load large quantities of data into a data structure called a dataframe, over which the user can call mathematical operations. When the data fits entirely into memory this works well, but sometimes there is too much data for a single box. The Modin project scales Pandas workflows to multiple machines by utilizing Dask or Ray, which are distributed computing primitives for Python programs. Modin builds an execution plan for large data frames to be operated on against each other, which makes data science considerably easier for these large data sets. Devin Petersohn started the Modin project, and he joins the show to talk about data science with Python, and his work in the Berkeley RISELab.

23 Jul 2020

Modin: Pandas Scalability with Devin Petersohn

Software Engineering Daily

Pandas is a Python data analysis library, and an essential tool in data science. Pandas allows users to load large quantities of data into a data structure called a dataframe, over which the user can call mathematical operations. When the data fits entirely into memory this works well, but sometimes there is too much data for a single box. The Modin project scales Pandas workflows to multiple machines by utilizing Dask or Ray, which are distributed computing primitives for Python programs. Modin builds an execution plan for large data frames to be operated on against each other, which makes data science considerably easier for these large data sets. Devin Petersohn started the Modin project, and he joins the show to talk about data science with Python, and his work in the Berkeley RISELab. Sponsorship inquiries: sponsor@softwareengineeringdaily.com The post Modin: Pandas Scalability with Devin Petersohn appeared first on Software Engineering Daily.

53mins

23 Jul 2020

Modin: Pandas Scalability with Devin Petersohn

Data – Software Engineering Daily

Pandas is a Python data analysis library, and an essential tool in data science. Pandas allows users to load large quantities of data into a data structure called a dataframe, over which the user can call mathematical operations. When the data fits entirely into memory this works well, but sometimes there is too much data for a single box. The Modin project scales Pandas workflows to multiple machines by utilizing Dask or Ray, which are distributed computing primitives for Python programs. Modin builds an execution plan for large data frames to be operated on against each other, which makes data science considerably easier for these large data sets. Devin Petersohn started the Modin project, and he joins the show to talk about data science with Python, and his work in the Berkeley RISELab.

53mins

23 Jul 2020

Modin: Pandas Scalability with Devin Petersohn

Podcast – Software Engineering Daily

Pandas is a Python data analysis library, and an essential tool in data science. Pandas allows users to load large quantities of data into a data structure called a dataframe, over which the user can call mathematical operations. When the data fits entirely into memory this works well, but sometimes there is too much data for a single box. The Modin project scales Pandas workflows to multiple machines by utilizing Dask or Ray, which are distributed computing primitives for Python programs. Modin builds an execution plan for large data frames to be operated on against each other, which makes data science considerably easier for these large data sets. Devin Petersohn started the Modin project, and he joins the show to talk about data science with Python, and his work in the Berkeley RISELab.

53mins

23 Jul 2020
