Machine Learning – Software Engineering Daily

Updated 2 days ago

News
Tech News
Machine learning and data science episodes of Software Engineering Daily.

iTunes Ratings

52 Ratings
Ratings breakdown (in the order listed): 38, 7, 2, 3, 2

Latest release on Feb 13, 2020

The Best Episodes Ranked Using User Listens

Updated by OwlTail 2 days ago

Rank #1: TensorFlow in Practice with Rajat Monga

TensorFlow is Google’s open source machine learning library. Rajat Monga is the engineering director for TensorFlow. In this episode, we cover how to use TensorFlow, including an example of how to build a machine learning model to identify whether a picture contains a cat or not.
TensorFlow was built with the mission of simplifying the process of deploying a machine learning model from research to production, so we also talk about that, as well as how TensorFlow can be used effectively in combination with Google’s open-source cluster manager, Kubernetes.

Sponsors

SnapCI is a continuous integration tool built by Thoughtworks. Go to snap.ci/softwareengineeringdaily to check it out. Alooma is your data pipeline as a service. Alooma is a fully managed tool for pulling from different data sources–MySQL, Postgres, Elasticsearch, Salesforce, and many others. Go to alooma.com/sedaily for more information.

The post TensorFlow in Practice with Rajat Monga appeared first on Software Engineering Daily.

Aug 18 2016

44mins

Rank #2: TensorFlow Applications with Rajat Monga

Rajat Monga is a director of engineering at Google where he works on TensorFlow. TensorFlow is a framework for numerical computation developed at Google.

The majority of TensorFlow users are building machine learning applications such as image recognition, recommendation systems, and natural language processing–but TensorFlow is actually applicable to a broader range of scientific computation than just machine learning. TensorFlow has APIs for decision trees, support vector machines, and linear algebra libraries.

The current focus of the TensorFlow team is usability. There are thousands of engineers building data intensive applications with TensorFlow, but Rajat and the rest of the TensorFlow team would like to see millions more. In today’s show, Rajat and I discussed how TensorFlow is becoming more usable, as well as some of the developments in TensorFlow around edge computing, TensorFlow Hub, and TensorFlow.js, which allows TensorFlow to run in the browser.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors


Datadog was built to bring clarity to complex, dynamic applications—in the cloud, on-premises, in containers, or wherever they run. With beautiful dashboards, seamless integrations with more than 200 technologies, and distributed request tracing, Datadog provides deep, end-to-end visibility into the health and performance of modern applications. Visualize key metrics, set alerts to identify anomalies, and collaborate with your team to troubleshoot and fix issues fast. Try it yourself by starting a free, 14-day trial today. Listeners of this podcast will also receive a free Datadog T-shirt! softwareengineeringdaily.com/datadog


The octopus: a sea creature known for its intelligence and flexibility. Octopus Deploy: a friendly deployment automation tool for deploying applications like .NET apps, Java apps and more. Ask any developer and they’ll tell you it’s never fun pushing code at 5pm on a Friday then crossing your fingers hoping for the best. That’s where Octopus Deploy comes into the picture. Octopus Deploy is a friendly deployment automation tool, taking over where your build/CI server ends. Use Octopus to promote releases on-prem or to the cloud. Octopus integrates with your existing build pipeline–TFS and VSTS, Bamboo, TeamCity, and Jenkins. It integrates with AWS, Azure, and on-prem environments. Reliably and repeatedly deploy your .NET and Java apps and more. If you can package it, Octopus can deploy it! It’s quick and easy to install. Go to Octopus.com to trial Octopus free for 45 days. That’s Octopus.com


There’s no need to reinvent the wheel when it comes to making your app “realtime.” PubNub makes it simple, enabling you to build immersive and interactive experiences on the web, on mobile phones, embedded into hardware, and any other device connected to the Internet. With powerful APIs, and a robust global infrastructure, you can stream geolocation data, send chat messages, turn on your sprinklers, or rock your baby’s crib when they start crying (PubNub literally powers IoT cribs). 70 SDKs for web, mobile, IoT, and more means you can start streaming data in realtime without a ton of compatibility headaches, and no need to build your own SDKs from scratch. Go to PubNub.com/sedaily to get started. They offer a generous sandbox tier that’s free forever (until your app takes off).


GoCD is a continuous delivery tool created by ThoughtWorks. GoCD agents use Kubernetes to scale as needed. Check out gocd.org/sedaily and learn about how you can get started. GoCD was built with the learnings of the ThoughtWorks engineering team, who have talked about building the product in previous episodes of Software Engineering Daily. It’s great to see the continued progress on GoCD with the new Kubernetes integrations–and you can check it out for yourself at gocd.org/sedaily.

The post TensorFlow Applications with Rajat Monga appeared first on Software Engineering Daily.

Apr 26 2018

56mins

Rank #3: Kubeflow: TensorFlow on Kubernetes with David Aronchick

When TensorFlow came out of Google, the machine learning community converged around it. TensorFlow is a framework for building machine learning models, but the lifecycle of a machine learning model has a scope that is bigger than just creating a model. Machine learning developers also need to have a testing and deployment process for continuous delivery of models.

The continuous delivery process for machine learning models is like the continuous delivery process for microservices, but can be more complicated. A developer testing a model on their local machine is working with a smaller data set than what they will have access to when it is deployed. A machine learning engineer needs to be conscious of versioning and auditability.

Kubeflow is a machine learning toolkit for Kubernetes based on Google’s internal machine learning pipelines. Google open sourced Kubernetes and TensorFlow, and the projects have users at AWS and Microsoft. David Aronchick is the head of open source machine learning strategy at Microsoft, and he joins the show to talk about the problems that Kubeflow solves for developers, and the evolving strategies for cloud providers.

David was previously on the show when he worked at Google, and in this episode he provides some useful discussion about how open source software presents a great opportunity for the cloud providers to collaborate with each other in a positive sum relationship.

The post Kubeflow: TensorFlow on Kubernetes with David Aronchick appeared first on Software Engineering Daily.

Jan 25 2019

1hr 2mins

Rank #4: Convolutional Neural Networks with Matt Zeiler

Convolutional neural networks are a machine learning tool that uses layers of convolution and pooling to process and classify inputs. CNNs are useful for identifying objects in images and video. In this episode, we focus on the application of convolutional neural networks to image and video recognition and classification.
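As a rough illustration of the two layer types mentioned above, here is a minimal pure-Python sketch of one convolution followed by one max-pooling step. The toy image, the hand-picked edge kernel, and the function names are assumptions made for this example; a real CNN learns its kernels during training and would use a library rather than nested lists.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5x5 "image" with a vertical edge between columns 1 and 2 (illustrative data).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

features = convolve2d(image, kernel)  # 4x4 feature map, strongest where the edge is
pooled = max_pool2d(features)         # 2x2 summary that keeps the strongest responses
```

Pooling shrinks the feature map while keeping the strongest response in each window, which is what lets later layers work on a coarser view of the input.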

Matt Zeiler is the CEO of Clarifai, an API for image and video recognition. Matt takes us through the basics of a convolutional neural network–you don’t need any background in machine learning to understand the content of the episode. He also discusses the subjective aspects of image and video recognition, and some of the tactics Clarifai has explored. This is far from a solved problem.

Matt also discusses the infrastructure of Clarifai–how they use Kubernetes, how models are deployed, and how models are updated.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view the transcript for this episode.

Sponsors


Deep learning promises to dramatically improve how our world works. To make deep learning easier and faster, we need new kinds of hardware and software–which is why Intel acquired Nervana Systems, a platform for deep learning. Intel Nervana is hiring engineers to help develop a full stack for AI, from chip design to software frameworks. Go to softwareengineeringdaily.com/intel to apply for a job at Intel Nervana. If you don’t know much about the company, check out the interviews I have conducted with engineers from the company. You can find these at softwareengineeringdaily.com/intel.


Oracle Dyn provides DNS that is as dynamic and intelligent as your applications. Dyn DNS gets your users to the right cloud service, CDN, or data center, using intelligent response to steer traffic based on business policies, as well as real-time internet conditions, like the security and performance of the network path. Get started with a free 30-day trial for your application by going to dyn.com/sedaily.  After the free trial, Dyn’s developer plans start at just $7 a month for world-class DNS. Rethink DNS. Go to dyn.com/sedaily to learn more and get your free trial of Dyn DNS.


Don’t let your database be a black box–drill down into the metrics of your database with 1-second granularity. VividCortex provides database monitoring for MySQL, Postgres, Redis, MongoDB, and Amazon Aurora. Database uptime, efficiency, and performance can all be measured using VividCortex. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can be proactive, and fix performance problems before customers are impacted. If you have a database that you would like to monitor more closely, check out vividcortex.com/sedaily. Github, DigitalOcean, and Yelp all use VividCortex to understand database performance. Learn more at vividcortex.com/sedaily, and request a demo!

The post Convolutional Neural Networks with Matt Zeiler appeared first on Software Engineering Daily.

May 10 2017

54mins

Rank #5: TensorFlow with Greg Corrado

“You don’t mind if failures slow things down, but it’s very important that failures do not stop forward progress.”

TensorFlow is an open source machine learning library intended to bring large-scale, distributed machine learning and deep learning to everyone. Google recently released the framework to the public as a second-generation API, having learned from the successes and failures of DistBelief.

Greg Corrado is a senior research scientist and tech lead at Google, where he focuses on the research areas of machine intelligence, machine perception and natural language processing.

Questions

  • From the end-user’s point of view, how does Smart Reply work?
  • How can teams blend research and engineering to make better products?
  • How did the DistBelief project shape TensorFlow?
  • How does TensorFlow differ from streaming frameworks that are more generalized like Spark or Storm?
  • Why would I want to do machine learning on my phone?
  • How is TensorFlow fault tolerant?
  • What are things the open source community should dive into in TensorFlow, to fix and improve it?

Sponsors

Hired.com is the job marketplace for software engineers. Go to hired.com/softwareengineeringdaily to get a $600 bonus upon landing a job through Hired.

Digital Ocean is the simplest cloud hosting provider. Use promo code SEDAILY for $10 in free credit.

The post TensorFlow with Greg Corrado appeared first on Software Engineering Daily.

Dec 15 2015

41mins

Rank #6: People.ai: Machine Learning for Sales with Andrey Akselrod

A large sales organization has hundreds of sales people. Each of those sales people manages a set of accounts with which they are trying to close deals. Sales people are overseen by managers who ensure that the sales people are performing well. Directors and VPs ensure the scalability and health of the overall sales organization.

The sales lifecycle mostly takes place within a piece of software called a CRM: customer relationship management. This tool documents the interactions between sales people and accounts. CRMs have been around for many years, and although CRM software is a useful repository of data, it does not fulfill all the needs of a salesperson.

People.ai is a system of machine learning tools built around the sales tooling ecosystem. People.ai helps a sales organization avoid manual data entry, understand areas of potential improvement, and decide which sales lead is the most valuable to pursue. Andrey Akselrod is the CTO at People.ai and he joins the show to discuss the potential applications of machine learning in the domain of sales, and the engineering work that his company has done.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

ANNOUNCEMENTS

  • FindCollabs is a place to find collaborators and build projects. We recently launched GitHub integrations. It’s easier than ever to find collaborators for your open source projects. And if you are looking for some people to start a project with, FindCollabs has topic rooms that allow you to find other people who are interested in a particular technology, so that you can find people who are curious about React, or cryptocurrencies, or Kubernetes, or whatever you want to build with.
  • Podsheets is an open source podcast hosting platform that we recently launched. We are building Podsheets with the learnings from Software Engineering Daily, and our goal is to be the best place to host and monetize your podcast. If you have been thinking about starting a podcast, check out podsheets.com.
  • New SEDaily app for iOS and for Android. It includes all 1000 of our old episodes, as well as related links, greatest hits, and topics. You can comment on episodes and have discussions with other members of the community. I’ll be commenting on each episode, so if you hear an episode that you have some commentary on, jump onto the app, or on SoftwareDaily.com to share your thoughts. And you can become a paid subscriber for ad free episodes at softwareengineeringdaily.com/subscribe. Altalogy is the company who has been developing much of the software for the newest app, and if you are looking for a company to help you with your mobile and web development, I recommend checking them out. 

The post People.ai: Machine Learning for Sales with Andrey Akselrod appeared first on Software Engineering Daily.

Aug 07 2019

51mins

Rank #7: Python Data Visualization with Jake VanderPlas

Data visualization tools are required to translate the findings of data scientists into charts, graphs, and pictures. Understanding how to utilize these tools and display data is necessary for a data scientist to communicate with people in other domains. In this episode, Srini Kadamati hosts a discussion with Jake VanderPlas about the Python ecosystem for data science and the different attempts at creating a data visualization library.

Jake VanderPlas is the Director of Research for Physical Sciences at the University of Washington’s eScience institute, where he also received his PhD in Astronomy. In addition to contributing to many Python data science libraries like scikit-learn, scipy, numpy, and matplotlib, he’s written multiple books that have been published by O’Reilly and has given many talks on data science tools and techniques. He’s also the co-creator of the Altair project, which is a declarative data visualization library for Python built on the Vega-Lite visualization grammar.

Sponsors


Dice.com helps you manage your career in tech.  Dice.com has a huge index of tech job opportunities that it has developed from 20 years in the business of connecting tech professionals with job opportunities. To check out Dice and support Software Engineering Daily, go to dice.com/sedaily.


Saagie is an end-to-end data platform that lets you focus on deriving business value from data. Saagie helps you take control of your wide variety of data sources, and gets them in one place. Check it out at Saagie.com


SnapCI is a continuous integration tool built by Thoughtworks. Go to snap.ci/softwareengineeringdaily to check it out.

The post Python Data Visualization with Jake VanderPlas appeared first on Software Engineering Daily.

Jan 16 2017

48mins

Rank #8: Real Estate Machine Learning with Or Hiltch

Stock traders have access to high volumes of information to help them make decisions on whether to buy an asset. A trader who is considering buying a share of Google stock can find charts, reports, and statistical tools to help with their decision. There are a variety of machine learning products to help a technical investor create models of how a stock price might change in the future.

Real estate investors do not have access to the same data and tooling. Most people who invest in apartment buildings are using a combination of experience, news, and basic reports.

Real estate data is very different from stock data. Real estate assets are not fungible–each one is arguably unique from all others, whereas one share of Google stock is the same as another share. But there are commonalities between real estate assets.

Just like collaborative filtering can be applied to find a new movie that is similar to the ones you have watched on Netflix, comparable analysis can be used to find an apartment building that is very similar to another apartment building which recently appreciated in asset value.
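To make the idea of comparable analysis concrete, here is a minimal nearest-neighbor sketch in Python. The building names, the three features (units, year built, average rent), and the choice of min-max scaling with Euclidean distance are all assumptions for illustration; the episode description does not say how Skyline.ai actually measures similarity.

```python
import math


def normalize(rows):
    """Min-max scale each column to [0, 1] so that no single feature dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def most_comparable(target_name, buildings):
    """Return the name of the building closest to the target in scaled feature space."""
    names = list(buildings)
    scaled = dict(zip(names, normalize([buildings[n] for n in names])))
    target = scaled[target_name]
    others = [n for n in names if n != target_name]
    return min(others, key=lambda n: math.dist(target, scaled[n]))


# Hypothetical data: (number of units, year built, average monthly rent).
buildings = {
    "A": (120, 1985, 1400),
    "B": (115, 1988, 1450),
    "C": (30, 2015, 2600),
    "D": (200, 1960, 1100),
}

comparable = most_comparable("A", buildings)
```

If building B recently appreciated, an investor could treat A as a candidate worth a closer look, which is the same "similar items" reasoning that collaborative filtering applies to movies.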

Skyline.ai is a company that is building tools and machine learning models for real estate investors. Or Hiltch is the CTO at Skyline.ai and he joins the show to explain how to apply machine learning to real estate investing. He also describes the mostly serverless architecture of the company. This is one of the first companies we have talked to that relies so heavily on managed services and functions-as-a-service.

The post Real Estate Machine Learning with Or Hiltch appeared first on Software Engineering Daily.

Sep 11 2018

58mins

Rank #9: Distributed Deep Learning with Will Constable

Deep learning allows engineers to build models that can make decisions based on training data. These models improve over time using stochastic gradient descent. When a model gets big enough, the training must be broken up across multiple machines. Two strategies for doing this are “model parallelism” which divides the model across machines and “data parallelism” which divides the data across multiple copies of the model.
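Data parallelism can be sketched in a few lines of Python. In this toy version, each "worker" holds a full copy of a one-parameter model and computes a gradient on its own shard of the data; the gradients are then averaged before a single shared update. The model, the data, the learning rate, and the function names are assumptions for illustration, and each worker uses its whole shard per step rather than a random mini-batch as true stochastic gradient descent would.

```python
def shard_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, shards, lr):
    # Every worker evaluates the same weights on a different shard of the data.
    grads = [shard_gradient(w, shard) for shard in shards]
    # "All-reduce": average the worker gradients (shards are equal-sized here,
    # so the average equals the gradient over the full data set).
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


# Synthetic data generated from a true weight of 3.0.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[0::2], data[1::2]]  # two workers, four examples each

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

Model parallelism would instead split the parameters themselves across machines, which this one-parameter example is too small to show.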

Distributed deep learning brings together two advanced software engineering concepts: distributed systems and deep learning. In this episode, Will Constable, the head of distributed deep learning algorithms at Intel Nervana, joins the show to give us a refresher on deep learning and explain how to parallelize training a model.

Full disclosure: Intel is a sponsor of Software Engineering Daily, and if you want to find out more about Intel Nervana including other interviews and job postings, go to softwareengineeringdaily.com/intel. Intel Nervana is looking for great engineers at all levels of the stack, and in this episode we’ll dive into some of the problems the Intel Nervana team is solving.

Related episodes about machine learning can be found here.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors


Ready to build your own stunning website? Go to Wix-DOT-com and start for free! With Wix, you can choose from hundreds of beautiful, designer-made templates. Simply drag and drop to customize anything and everything. Add your text, images, videos and more. Wix makes it easy to get your stunning website looking exactly the way you want. Plus, your site is mobile optimized, so you’ll look amazing on any device. Whatever you need a website for, Wix has you covered. So, showcase your talents. Start that dev blog, detailing your latest projects. Grow your network with Wix apps made to work seamlessly with your site. Or, simply explore and share new ideas. You decide. Over one-hundred-million people choose Wix to create their website – what are you waiting for? Make yours happen today. It’s easy and free. And when you’re ready to upgrade, use the promo code SEDaily for a special SE Daily listener discount. Terms and conditions apply. For more details, go to Wix.com/wix-lp/SEdaily. Create your stunning website today with Wix.com, that’s W-I-X-DOT-com. 


Oracle Dyn provides DNS that is as dynamic and intelligent as your applications. Dyn DNS gets your users to the right cloud service, CDN, or data center, using intelligent response to steer traffic based on business policies, as well as real-time internet conditions, like the security and performance of the network path. Get started with a free 30-day trial for your application by going to dyn.com/sedaily.  After the free trial, Dyn’s developer plans start at just $7 a month for world-class DNS. Rethink DNS. Go to dyn.com/sedaily to learn more and get your free trial of Dyn DNS.


Deep learning promises to dramatically improve how our world works. To make deep learning easier and faster, we need new kinds of hardware and software–which is why Intel acquired Nervana Systems, a platform for deep learning. Intel Nervana is hiring engineers to help develop a full stack for AI, from chip design to software frameworks. Go to softwareengineeringdaily.com/intel to apply for a job at Intel Nervana. If you don’t know much about the company, check out the interviews I have conducted with engineers from the company. You can find these at softwareengineeringdaily.com/intel.


Codepath is an 8-week iOS and Android development class for professional engineers who are looking to build a new skill. Codepath has free evening classes for dedicated, experienced engineers and designers. Whether you are an engineer who is looking to retrain as a mobile developer–or you are looking to hire mobile engineers, go to Codepath.com to learn more.

The post Distributed Deep Learning with Will Constable appeared first on Software Engineering Daily.

Jun 14 2017

57mins

Rank #10: Self-Driving Engineering with George Hotz

In the smartphone market there are two dominant operating systems: one closed source (iPhone) and one open source (Android). The market for self-driving cars could play out the same way, with a company like Tesla becoming the closed source iPhone of cars, and a company like Comma.ai developing the open source Android of self-driving cars.

George Hotz is the CEO of Comma.ai. Comma makes hardware devices that allow “normal” cars to be augmented with advanced cruise control and lane assist features. This means you can take your own car–for example, a Toyota Prius–and outfit it to have something similar to the Tesla Autopilot. Comma’s hardware devices cost under $1000 to order online.

George joins the show to explain how the Comma hardware and software stack works in detail–from the low level interface with a car’s CAN bus to the high level machine learning infrastructure.

Users who purchase the Comma.ai hardware drive around with a camera facing the front of their windshield. This video is used to orient the state of the car in space. The video from that camera also gets saved and uploaded to Comma’s servers. Comma can use this video together with labeled events from the user’s driving experience to crowdsource their model for self-driving.

For example, if a user is driving down a long stretch of highway, and they turn on the Comma.ai driving assistance, the car will start driving itself and the video capture will begin. If the car begins to swerve into another lane, the user will take over for the car and the Comma system will disengage. This “disengagement” event gets labeled as such, and when that data makes it back to Comma’s servers, Comma can use the data to update their models.
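The labeling step described above can be sketched as a scan over a drive log for the moments when assistance switches from engaged to disengaged. The `Frame` record and its fields are hypothetical and are not Comma's actual data format; they are assumptions made only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    t: float              # seconds since the start of the drive (hypothetical field)
    assist_engaged: bool  # whether driving assistance was active at this frame


def label_disengagements(frames):
    """Return the timestamps at which assistance went from engaged to disengaged."""
    events = []
    for prev, cur in zip(frames, frames[1:]):
        if prev.assist_engaged and not cur.assist_engaged:
            events.append(cur.t)
    return events


# Illustrative drive: assistance is turned on twice and the driver takes over twice.
drive = [
    Frame(0.0, False),
    Frame(1.0, True),
    Frame(2.0, True),
    Frame(3.0, False),
    Frame(4.0, True),
    Frame(5.0, False),
]

events = label_disengagements(drive)
```

Each labeled timestamp points at a stretch of video where the model's behavior was corrected by a human, which is the kind of example that is most useful for retraining.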

George is very good at explaining complex engineering topics, and is also quite entertaining and open to discussing the technology as well as other competitors in the autonomous car space. I have not been able to get many other people on the show to talk about autonomous cars, so this was quite refreshing! I hope to do more in the future.

The post Self-Driving Engineering with George Hotz appeared first on Software Engineering Daily.

Aug 08 2018

1hr 4mins

Rank #11: Stripe Machine Learning Infrastructure with Rob Story and Kelley Rivoire

Machine learning allows software to improve as that software consumes more data.

Machine learning is a tool that every software engineer wants to be able to use. Because machine learning is so broadly applicable, software companies want to make the tools more accessible to the developers across the organization.

There are many steps that an engineer must go through to use machine learning, and each additional step reduces the chances that the engineer will actually get their model into production.

An engineer who wants to build machine learning into their application needs access to data sets. They need to join those data sets, and load them into a machine (or multiple machines) where their model can be trained. Once the model is trained, it needs to be tested on additional data to ensure quality. If the initial model quality is insufficient, the engineer might need to tweak the training parameters.

Once a model is accurate enough, the engineer needs to deploy that model. After deployment, the model might need to be updated with new data later on. If the model is processing sensitive or financially relevant data, a provenance process might be necessary to allow for an audit trail of decisions that have been made by the model.
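One way to picture an audit trail of model decisions is an append-only log in which each entry records the model version, the input, and the decision, and also hashes the previous entry so that later edits are detectable. This is a minimal sketch under those assumptions; the field names, the model version strings, and the hash-chaining design are illustrative and do not describe how Stripe or Railyard implements provenance.

```python
import hashlib
import json


def _digest(body):
    """Stable SHA-256 digest of an entry body (keys sorted for determinism)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_decision(log, model_version, features, decision):
    """Append one audit entry that is chained to the previous entry's hash."""
    body = {
        "model_version": model_version,
        "features": features,
        "decision": decision,
        "prev_hash": log[-1]["entry_hash"] if log else "",
    }
    log.append({**body, "entry_hash": _digest(body)})


def verify(log):
    """Check that no entry was altered and that the chain is unbroken."""
    prev_hash = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True


# Hypothetical decisions made by two versions of a model.
audit_log = []
record_decision(audit_log, "fraud-v1", {"amount": 120.0}, "allow")
record_decision(audit_log, "fraud-v2", {"amount": 9800.0}, "review")
```

Recording the model version alongside each decision is what makes it possible to answer, after a model has been retrained, which version produced a given outcome.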

Rob Story and Kelley Rivoire are engineers working on machine learning infrastructure at Stripe. After recognizing the difficulties that engineers faced in creating and deploying machine learning models, Stripe engineers built out Railyard, an API for machine learning workloads within the company.

Rob and Kelley join the show to discuss data engineering and machine learning at Stripe, and their work on Railyard.

The post Stripe Machine Learning Infrastructure with Rob Story and Kelley Rivoire appeared first on Software Engineering Daily.

Jun 13 2019

1hr 11mins

Rank #12: Deep Learning Systems with Milena Marinova

The applications that demand deep learning range from self-driving cars to healthcare, but the way that models are developed and trained is similar. A model is trained in the cloud and deployed to a device. The device engages with the real world, gathering more data. That data is sent back to the cloud, where it can improve the model.

From the processor level to the software frameworks at the top of the stack, the impact of deep learning is so significant that it is driving changes everywhere. At the hardware level, new chips are being designed to perform the matrix calculations at the heart of a neural net. At the software level, programmers are empowered by new frameworks like Neon and TensorFlow. In between the programmer and the hardware, middleware can transform software models into representations that can execute with better performance.

Milena Marinova is the senior director of AI solutions at the Intel AI products group, and joins the show today to talk about modern applications of machine learning and how those translate into Intel’s business strategy around hardware, software, and cloud.

From September 18-20, Milena is attending the O’Reilly AI Conference, hosted by Intel Nervana and O’Reilly.

Full disclosure: Intel is a sponsor of Software Engineering Daily.

Question of the Week: What is your favorite continuous delivery or continuous integration tool? Email jeff@softwareengineeringdaily.com and a winner will be chosen at random to receive a Software Engineering Daily hoodie. 

Show Notes

Data Skeptic podcast: Generative Adversarial Networks

Sponsors


Have you been thinking you’d be happier at a new job? If you’re dreaming about a new job and have been waiting for the right time to make a move, go to hired.com/sedaily. Hired makes finding work enjoyable. Hired uses an algorithmic job-matching tool in combination with a talent advocate who will walk you through the process of finding a better job. Check out hired.com/sedaily to get a special offer for Software Engineering Daily listeners–a $600 signing bonus from Hired when you find that great job that gives you the respect and salary that you deserve as a talented engineer. 


Cloudflare runs 10% of the Internet, providing performance and security to millions of websites. Many of you probably already use Cloudflare on your sites. We’re not talking about using Cloudflare today though, we’re here to talk about building on top of it. If you’re a developer you can build apps which can be installed by the millions of sites which rely on Cloudflare. You can even sell your apps; they can make you money every month. Visit cloudflare.com/sedaily to watch how you can build and deploy an app in less than 3 minutes.


Who do you use for log management? I want to tell you about Scalyr, the first purpose-built log management tool on the market. Most tools on the market utilize text indexing search, which is great… for indexing a book. But if you want to search logs, at scale, fast… it breaks down. Scalyr built their own database from scratch: the system is fast. Most searches take less than 1 second. In fact, 99% of their queries execute in <1 second.  Companies like OKCupid, Giphy and CareerBuilder use Scalyr. It was built by one of the founders of Writely (aka Google Docs). Scalyr has a consumer-grade UI that scales infinitely. You can monitor key metrics, trigger alerts, and integrate with PagerDuty. It’s easy to use and did we mention: lightning fast. Give it a try today. It’s free for 90 days at softwareengineeringdaily.com/scalyr.

The post Deep Learning Systems with Milena Marinova appeared first on Software Engineering Daily.

Sep 19 2017

54mins

Rank #13: Bridging Data Science and Engineering with Greg Lamp

Current infrastructure makes it difficult for data scientists to share analytical models with the software engineers who need to integrate them.

Yhat is an enterprise software company tackling the challenge of how data science gets done. Their products enable companies and users to easily deploy data science environments and translate analytical models into production code.

Greg Lamp is the Co-founder and CTO of Yhat and previously worked as a product manager in financial services. Yhat was part of the Y Combinator winter 2015 class.

Questions

  • At a software company, what is the typical relationship between data scientists and software engineers?
  • Does Yhat turn data scientists into HTTP endpoints?
  • What was the most counterintuitive advice you received at Y-Combinator?
  • What is the moonshot goal for Yhat?
  • Is it easier to teach data science to an engineer or engineering to a data scientist?

The post Bridging Data Science and Engineering with Greg Lamp appeared first on Software Engineering Daily.

Oct 05 2015

47mins

Rank #14: Hedge Fund Artificial Intelligence with Xander Dunn

A hedge fund is a collection of investors that make bets on the future. The “hedge” refers to the fact that the investors often try to diversify their strategies so that the directions of their bets are less correlated, and they can be successful in a variety of future scenarios. Engineering-focused hedge funds have used what might be called “machine learning” for a long time to predict what will happen in the future.

Numerai is a hedge fund that crowdsources its investment strategies by allowing anyone to train models against Numerai’s data. A model that succeeds in a simulated environment will be adopted by Numerai and used within its real money portfolio. The engineers who create the models are rewarded in proportion to how well the models perform.

Xander Dunn is a software engineer at Numerai, and in this episode he explains what a hedge fund is, why the traditional strategies are not optimal, and how Numerai creates the right incentive structure to crowdsource market intelligence. This interview was fun and thought-provoking. Numerai is one of those companies that makes me very excited about the future.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view or download the transcript for this show.

Sponsors


To understand how your application is performing, you need visibility into your database. VividCortex provides database monitoring for MySQL, Postgres, Redis, MongoDB, and Amazon Aurora. Database uptime, efficiency, and performance can all be measured using VividCortex. You can learn more about how VividCortex works at vividcortex.com/sedaily.


Good customer relationships define the success of your business. Zendesk helps you build better mobile apps and retain users. With Zendesk Mobile SDKs, you can bring native, in-app support to your app quickly and easily. If a user discovers a bug in your app, that user can view help content and start a conversation with your support team without leaving your app. Keep your customers happy with Zendesk. Check out zendesk.com/sedaily to support Software Engineering Daily, and get $177 off.


Exaptive simplifies data application development for the web. Work with the tech you know. Leave the other stuff and the glue code to the platform. Go to exaptive.com/sedaily to learn more and get a free account.

The post Hedge Fund Artificial Intelligence with Xander Dunn appeared first on Software Engineering Daily.

Apr 03 2017

58mins

Rank #15: Word2Vec with Adrian Colyer

Machines understand the world through mathematical representations. In order to train a machine learning model, we need to describe everything in terms of numbers.  Images, words, and sounds are too abstract for a computer. But a series of numbers is a representation that we can all agree on, whether we are a computer or a human.

In recent shows, we have explored how to train machine learning models to understand images and video. Today, we explore words. You might be thinking: “Isn’t a word easy to understand? Can’t you just take the dictionary definition?” A dictionary definition does not capture the richness of a word. Dictionaries do not give you a way to measure similarity between one word and all other words in a given language.

Word2vec is a system for defining words in terms of the words that appear close to that word. For example, the sentence “Howard is sitting in a Starbucks cafe drinking a cup of coffee” gives an obvious indication that the words “cafe,” “cup,” and “coffee” are all related. With enough sentences like that, we can start to understand the entire language.
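The idea of describing a word by its neighbors can be illustrated with a simple co-occurrence count. This is only a sketch of the context-window intuition, not word2vec itself, which learns dense vectors with a neural network. The function name `cooccurrence_counts` and the window size are assumptions made for this example.

```python
from collections import Counter


def cooccurrence_counts(sentence, window=3):
    """Count word pairs that appear within `window` positions of each other.

    Hypothetical helper for illustration; word2vec trains on such
    (word, context) pairs rather than simply counting them.
    """
    words = [w.strip('.,"').lower() for w in sentence.split()]
    counts = Counter()
    for i, word in enumerate(words):
        # Look only to the right; sorting the pair makes it order-independent.
        for j in range(i + 1, min(i + window + 1, len(words))):
            pair = tuple(sorted((word, words[j])))
            counts[pair] += 1
    return counts


sentence = "Howard is sitting in a Starbucks cafe drinking a cup of coffee"
counts = cooccurrence_counts(sentence, window=3)

print(counts[("coffee", "cup")])     # "cup" and "coffee" share a window
print(counts[("coffee", "howard")])  # too far apart to be counted
```

With many sentences, pairs such as "cup" and "coffee" accumulate high counts, and that signal is what lets a model place related words near each other.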

Adrian Colyer is a venture capitalist with Accel, and blogs about technical topics such as word2vec. We talked about word2vec specifically, and the deep learning space more generally. We also explored how the rapidly improving tools around deep learning are changing the venture investment landscape.

If you like this episode, we have done many other shows about machine learning with guests like Matt Zeiler, the founder of Clarifai, and Francois Chollet, the creator of Keras. You can check out our back catalog by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that.

Question of the Week: What is your favorite continuous delivery or continuous integration tool? Email jeff@softwareengineeringdaily.com and a winner will be chosen at random to receive a Software Engineering Daily hoodie. 

Sponsors


To build the kinds of things developers want to build today, they need better tools. That’s why Amazon Web Services built Amazon Aurora, a relational database engine that’s compatible with MySQL and PostgreSQL and provides up to five times the performance of standard MySQL on the same hardware, at a tenth of the cost. Amazon Aurora from AWS can scale up to millions of transactions per minute, automatically grow your storage up to 64 terabytes, and replicate data to three different Availability Zones. And you don’t have to manage a thing. There are no upfront charges and no commitments; you only pay for what you use. Check it out at aurora.aws.


Toptal is the best place to find reasonably priced, extremely talented software engineers to build your projects from scratch or scale your workforce. Get a free pair of Apple AirPods when you use Toptal.com/sedaily to work with an engineer for at least 20 hours.


Cloudflare runs 10% of the Internet, providing performance and security to millions of websites. Many of you probably already use Cloudflare on your sites. We’re not talking about using Cloudflare today though, we’re here to talk about building on top of it. If you’re a developer, you can build apps which can be installed by the millions of sites which rely on Cloudflare. You can even sell your apps; they can make you money every month. Visit cloudflare.com/sedaily to watch how you can build and deploy an app in less than 3 minutes.

The post Word2Vec with Adrian Colyer appeared first on Software Engineering Daily.

Sep 13 2017

1hr 1min

Rank #16: Deep Learning Topologies with Yinyin Liu

Algorithms for building neural networks have existed for decades. For a long time, neural networks were not widely used. Recent changes to the cost of compute and the size of our data have made neural networks extremely useful. Our smartphones generate terabytes of useful data. Lower storage costs make it economical to keep that data. Cloud computing democratized the ability to do large-scale machine learning across deep learning hardware.

Over the last few years, these trends have been driving widespread use of deep learning, in which neural nets with a large series of layers are used to create powerful results in various fields of classification and prediction. Neural networks are a tool for making sense of unstructured data–text, images, sound waves, and videos.

“Unstructured” data is data with high volume or high dimensionality. For example, an image has a huge collection of pixels, and each pixel has a color value. One way to think about image classification is that you are finding correlations between those pixels. A certain cluster of pixels might represent an edge. After doing edge detection on pixels, you have a collection of edges. Then you can find correlations between those edges, and build up higher levels of abstraction.
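As a rough illustration of finding an edge from pixel values, the sketch below marks the places where brightness jumps sharply between horizontally adjacent pixels. The function name `horizontal_edges`, the threshold, and the tiny grayscale image are assumptions made for this example; a neural network learns its own edge filters rather than using a fixed rule like this one.

```python
def horizontal_edges(image, threshold=100):
    """Return a map with 1 where adjacent pixels in a row differ sharply.

    `image` is a list of rows of grayscale values (0 to 255). Each output row
    is one element shorter than the input row because it describes the gaps
    between neighboring pixels.
    """
    edges = []
    for row in image:
        edges.append([
            1 if abs(row[x + 1] - row[x]) >= threshold else 0
            for x in range(len(row) - 1)
        ])
    return edges


# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

edge_map = horizontal_edges(image)
for row in edge_map:
    print(row)  # each row is [0, 1, 0]: a vertical edge between columns 1 and 2
```

A later layer could then look for patterns among these edge positions, which is the "higher levels of abstraction" step described above.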

Yinyin Liu is a principal engineer and head of data science at the Intel AI products group. She studies techniques for building neural networks. Each different configuration of a neural network for a given problem is called a “topology.” Engineers are always looking at new topologies for solving a deep learning application–such as natural language processing.

In this episode, Yinyin describes what a deep learning topology is and describes topologies for natural language processing. We also talk about the opportunities and the bottlenecks in deep learning–including why the tools are so immature, and what it will take to make the tooling better.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors


Segment allows us to gather customer data from anywhere and send that data to any analytics tool. Segment is the customer data infrastructure that has saved us from writing duplicate code across all of the different platforms that we want to analyze. And if you’re using cloud apps such as Mailchimp, Marketo, Intercom, AppNexus, or Zendesk, you can integrate with all of these different tools and centralize your customer data in one place with Segment. To get a free 90-day trial, sign up for Segment at segment.com and enter SEDaily in the “How did you hear about us?” box during signup.


Azure Container Service simplifies the deployment, management and operations of Kubernetes. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/sedaily.


LiveRamp is one of the fastest-growing companies in data connectivity in the Bay Area, and they are looking for senior-level talent to join their team. LiveRamp helps the world’s largest brands activate their data to improve customer interactions on any channel or device. The infrastructure is at a tremendous scale: a 500-billion-node identity graph generated from over a thousand data sources, running an 85 PB Hadoop cluster; and application servers that process over 20 billion HTTP requests per day. The LiveRamp team thrives on mind-bending technical challenges. LiveRamp members value entrepreneurship, humility, and constant personal growth. If this sounds like a fit for you, check out softwareengineeringdaily.com/liveramp.


GoCD is a continuous delivery tool created by ThoughtWorks. GoCD agents use Kubernetes to scale as needed. Check out gocd.org/sedaily and learn about how you can get started. GoCD was built with the learnings of the ThoughtWorks engineering team, who have talked about building the product in previous episodes of Software Engineering Daily. It’s great to see the continued progress on GoCD with the new Kubernetes integrations–and you can check it out for yourself at gocd.org/sedaily.

The post Deep Learning Topologies with Yinyin Liu appeared first on Software Engineering Daily.

May 10 2018

1hr

Rank #17: Deep Learning Hardware with Xin Wang

Training a deep learning model involves operations over tensors. A tensor is a multi-dimensional array of numbers. For several years, GPUs were used for these linear algebra calculations. That’s because graphics chips are built to efficiently process matrix operations.

Tensor processing consists of linear algebra operations that are similar in some ways to graphics processing–but not identical. Deep learning workloads do not run as efficiently on these conventional GPUs as they would on specialized chips, built specifically for deep learning.

In order to train deep learning models faster, new hardware needs to be designed with tensor processing in mind.

Xin Wang is a data scientist with the artificial intelligence products group at Intel. He joins today’s show to discuss deep learning hardware and Flexpoint, a way to reduce the space that tensors take up on a chip. Xin presented his work at NIPS, the Neural Information Processing Systems conference, and we talked about what he saw at NIPS that excited him. Full disclosure: Intel, where Xin works, is a sponsor of Software Engineering Daily.
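One general way to shrink the space a tensor needs is to store a small integer per element plus a single scale shared by the whole tensor. The sketch below illustrates that shared-exponent idea under stated assumptions: the function names, the 16-bit integer width, and the rounding rule are chosen for this example and are not taken from the episode or from Intel's Flexpoint implementation.

```python
import math


def encode_shared_exponent(values, mantissa_bits=16):
    """Encode floats as signed integers that all share one power-of-two scale.

    Hypothetical helper: picks the smallest exponent such that the largest
    magnitude still fits in a signed integer of `mantissa_bits` bits.
    """
    limit = 2 ** (mantissa_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    exponent = math.ceil(math.log2(max_abs / limit)) if max_abs > 0 else 0
    scale = 2.0 ** exponent
    mantissas = [round(v / scale) for v in values]
    return mantissas, exponent


def decode_shared_exponent(mantissas, exponent):
    """Recover approximate floats from the integers and the shared exponent."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]


tensor = [0.5, -1.25, 3.0]  # a flattened tensor with made-up values
mantissas, exponent = encode_shared_exponent(tensor)
restored = decode_shared_exponent(mantissas, exponent)

print(mantissas, exponent)
print(restored)
```

Each element is stored as one integer instead of a full floating-point number, at the cost of precision for values much smaller than the largest one in the tensor.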

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors


Azure Container Service simplifies the deployment, management and operations of Kubernetes. Eliminate the complicated planning and deployment of fully orchestrated containerized applications with Kubernetes. You can quickly provision clusters to be up and running in no time, while simplifying your monitoring and cluster management through auto upgrades and a built-in operations console. Avoid being locked into any one vendor or resource. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/sedaily.


Your company needs to build a new app, but you don’t have the spare engineering resources. There are some technical people in your company who have time to build apps–but they are not engineers. OutSystems is a platform for building low-code apps. As an enterprise grows, it needs more and more apps to support different types of customers and internal employee use cases. OutSystems has everything that you need to build, release, and update your apps without needing an expert engineer. And if you are an engineer, you will be massively productive with OutSystems. Find out how to get started with low-code apps today–at OutSystems.com/sedaily. There are videos showing how to use the OutSystems development platform, and testimonials from enterprises like FICO, Mercedes Benz, and SafeWay. OutSystems enables you to quickly build web and mobile applications–whether you are an engineer or not. Check out how to build low-code apps by going to OutSystems.com/sedaily.


Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.

The post Deep Learning Hardware with Xin Wang appeared first on Software Engineering Daily.

Jan 29 2018

57mins

Rank #18: Model Training with Yufeng Guo

Machine learning models can be built by plotting points in space and optimizing a function based off of those points.

For example, I can plot every person in the United States in a 3 dimensional space: age, geographic location, and yearly salary. Then I can draw a function that minimizes the distance between my function and each of those data points. Once I define that function, you can give me your age and a geographic location, and I can predict your salary.
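A minimal sketch of this idea, assuming a straight line as the function and using only age as the input (geographic location is left out to keep the example short). The data points are made up for illustration, and "distance" is taken to mean the squared vertical gap between the line and each point, which is the usual least-squares choice.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the normalizing factor cancels in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: ages and yearly salaries.
ages = [25, 35, 45, 55]
salaries = [40000, 50000, 60000, 70000]

slope, intercept = fit_line(ages, salaries)


def predict_salary(age):
    """Predict a salary from an age using the fitted line."""
    return slope * age + intercept


print(predict_salary(30))  # 45000.0 for this made-up data
```

Real data would not fall exactly on a line, so the fitted function would be the best compromise across all points rather than a perfect match.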

Plotting these points in space is called embedding. By embedding a rich data set, and then experimenting with different functions, we can build a model that makes predictions based on those data sets. Yufeng Guo is a developer advocate at Google working on CloudML. In this show, we described two separate examples for preparing data, embedding the data points, and iterating on the function in order to train the model.

In a future episode, Yufeng will discuss CloudML and more advanced concepts of machine learning.

Transcript

Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.

Sponsors


Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.


Digital Ocean Spaces gives you simple object storage with a beautiful user interface. You need an easy way to host objects like images and videos. Your users need to upload objects like PDFs and music files. Digital Ocean Spaces is modern object storage with a modern UI that you will love to use–it’s like the UI for Dropbox, but with the pricing of raw object storage; I almost want to use it like a consumer product. To try Digital Ocean Spaces, go to do.co/sedaily and get 2 months of Spaces plus a $10 credit to use on any other Digital Ocean products–and you get this credit even if you have been with Digital Ocean for a while. It’s a nice added bonus just for trying out Spaces. If you become a customer, the pricing is simple: $5 per month, which includes 250GB of storage and 1TB of outbound bandwidth. There are no costs per request and additional storage is priced at the lowest rate available: $0.01 per GB transferred and $0.02 per GB stored. There won’t be any surprises on your bill. Digital Ocean simplifies the cloud–they look for every opportunity to remove friction from a developer’s experience. I love it, and I think you will too–check it out at do.co/sedaily.


The octopus: a sea creature known for its intelligence and flexibility. Octopus Deploy: a friendly deployment automation tool for deploying applications like .NET apps, Java apps and more. Ask any developer and they’ll tell you it’s never fun pushing code at 5pm on a Friday then crossing your fingers hoping for the best. That’s where Octopus Deploy comes into the picture. Octopus Deploy is a friendly deployment automation tool, taking over where your build/CI server ends. Use Octopus to promote releases on-prem or to the cloud. Octopus integrates with your existing build pipeline–TFS and VSTS, Bamboo, TeamCity, and Jenkins. It integrates with AWS, Azure, and on-prem environments. Reliably and repeatedly deploy your .NET and Java apps and more. If you can package it, Octopus can deploy it! It’s quick and easy to install. Go to Octopus.com to trial Octopus free for 45 days. That’s Octopus.com


Who do you use for log management? I want to tell you about Scalyr, the first purpose-built log management tool on the market. Most tools on the market utilize text indexing search, which is great… for indexing a book. But if you want to search logs, at scale, fast… it breaks down. Scalyr built their own database from scratch: the system is fast. Most searches take less than 1 second. In fact, 99% of their queries execute in <1 second. Companies like OKCupid, Giphy and CareerBuilder use Scalyr. It was built by one of the founders of Writely (aka Google Docs). Scalyr has a consumer-grade UI that scales infinitely. You can monitor key metrics, trigger alerts, and integrate with PagerDuty. It’s easy to use and, did we mention, lightning fast. Give it a try today. It’s free for 90 days at softwareengineeringdaily.com/scalyr.

The post Model Training with Yufeng Guo appeared first on Software Engineering Daily.

Oct 18 2017

49mins

Rank #19: Drishti: Deep Learning for Manufacturing with Krish Chaudhury

RECENT UPDATES:

Podsheets is our open source set of tools for managing podcasts and podcast businesses

New version of Software Daily, our app and ad-free subscription service

Software Daily is looking for help with Android engineering, QA, machine learning, and more

FindCollabs Hackathon has ended–winners will probably be announced by the time this episode airs; we will be announcing our next hackathon in a few weeks, so stay tuned

Drishti is a company focused on improving manufacturing workflows using computer vision.

A manufacturing environment consists of assembly lines. Each line is composed of sequential stations. At each station, a worker performs an operation on the item that is being manufactured. This type of workflow is used for the manufacturing of cars, laptops, stereo equipment, and many other technology products.

With Drishti, the manufacturing process is augmented by adding a camera at each station. Camera footage is used to train a machine learning model for each station on the assembly line. That machine learning model is used to ensure the accuracy and performance of each task that is being conducted on the assembly line.

Krish Chaudhury is the CTO at Drishti. From 2005 to 2015 he led image processing and computer vision projects at Google before joining Flipkart, where he worked on image science and deep learning for another four years. Krish had spent more than twenty years working on image and vision related problems when he co-founded Drishti.

In today’s episode, we discuss the science and application of computer vision, as well as the future of manufacturing technology and the business strategy of Drishti.

The post Drishti: Deep Learning for Manufacturing with Krish Chaudhury appeared first on Software Engineering Daily.

Apr 17 2019

59mins

Rank #20: Data Science at Spotify with Boxun Zhang

“I normally try to sit together or very close to a product team or engineering team. And by doing so, I get very close to the source of all kinds of challenging problems.”

Spotify is a streaming music service that uses data science and machine learning to implement product features such as recommendation systems and music categorization, but also to answer internal questions.

Boxun Zhang is a data scientist at Spotify where he focuses on understanding user behavior within the product.

Questions

  • What is the overlap between distributed systems and data science?
  • How has Spotify’s big data architecture evolved over time?
  • As a data scientist do you need to understand this big data architecture well?
  • What were the benefits of starting to use Kafka?
  • What kinds of data science problems do you tackle at Spotify?
  • Could you describe what a random forest is?
  • Why are there so many streaming systems, and what do you use at Spotify?
  • How will data science change moving towards the future?

Sponsors

Hired.com is the job marketplace for software engineers. Go to hired.com/softwareengineeringdaily to get a $600 bonus upon landing a job through Hired.

Digital Ocean is the simplest cloud hosting provider. Use promo code SEDAILY for $10 in free credit.

The post Data Science at Spotify with Boxun Zhang appeared first on Software Engineering Daily.

Dec 11 2015

57mins