Rank #124 in Technology category

Technology

Python Bytes

Updated 10 days ago

Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.

iTunes Ratings

146 Ratings
Ratings breakdown (5 stars to 1 star): 133, 4, 4, 2, 3

Recommendations req

By jx2233 - Jan 20 2020
Hey question what are the good python programming books for starters??

Outstanding Podcast

By Aggienaut - Sep 21 2019
Always informative and entertaining!

Latest release on Oct 23, 2020

The Best Episodes Ranked Using User Listens

Updated by OwlTail 10 days ago

Rank #1: #204 Take the PSF survey and Will & Carlton drop by

Sponsored by Techmeme Ride Home podcast: pythonbytes.fm/ride

Special guests

Brian #1: nbQA: Quality Assurance for Jupyter Notebooks

  • Sent in by listener and Patreon supporter (woohoo!!!) Marco Gorelli.
  • We’ve talked about running black on Jupyter notebooks in at least 2 past shows.
  • Marco’s recommendation is nbQA
  • nbQA lets you run all this on notebooks:
    • black
    • isort
    • mypy
    • flake8
    • pylint
    • pyupgrade to upgrade syntax
    • doctest to check examples
  • Run as a pre-commit hook
  • Configure in pyproject.toml
  • Also (from Marco) better than standalone black due to:
    • can run on a directory, not just one file at a time
    • keeps diffs minimal and easier to read than black
    • preserves trailing semicolons, as they are used to suppress output in notebooks
    • supports most standard magic commands
  • And the nbQA project is tested with …. drum roll …. pytest (of course)

Michael #2: The PSF yearly survey is out, go take it now!

  • This is the fourth iteration of the official Python Developers Survey.
  • The results of this survey serve as a major source of knowledge about the current state of the Python community
  • Takes about 10 minutes
  • They will randomly choose 100 winners (from those who complete the survey in its entirety), who will each receive an amazing Python Surprise Gift Pack.
  • Analysis is really well done, see the 2019 results.

Will #3: From Prototype to Production in Django

  • Django defaults to local/prototype settings when you first run the startproject command.
  • settings.py file contains global configs for a project. What needs to change for production?
    • DEBUG set to False
    • SECRET_KEY actually secret and not in source control
    • ALLOWED_HOSTS
    • Database probably not SQLite
    • Configure static/media files
    • Change admin path away from /admin
    • User registration, typically use django-allauth
  • Environment variables are the preferred way to separate local and production settings. environs is Will’s favorite 3rd party package, but there are multiple others.
  • Use the Django deployment checklist, aka python manage.py check --deploy, and add HTTPS all over, basically
  • DJ Checkup website
  • What else? Testing, logging, performance, security, etc etc etc
  • Django for Professionals book covers all of this and more including Docker
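
As a rough illustration of the environment-variable approach, the sketch below reads three of the settings listed above from the environment using only the standard library. The variable names (DJANGO_DEBUG, DJANGO_SECRET_KEY, DJANGO_ALLOWED_HOSTS) and the helper functions are assumptions made for this example, not Django or environs conventions; a package such as environs wraps the same idea with more parsing helpers.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Parse a boolean-like environment variable (hypothetical helper)."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_list(name, default=(), environ=os.environ):
    """Parse a comma-separated environment variable into a list."""
    raw = environ.get(name, "")
    items = [item.strip() for item in raw.split(",") if item.strip()]
    return items or list(default)


def production_settings(environ=os.environ):
    """Return the settings that typically differ between local and production."""
    return {
        # Off unless explicitly enabled, so production is safe by default.
        "DEBUG": env_bool("DJANGO_DEBUG", default=False, environ=environ),
        # Raises KeyError when missing: better to fail at startup than run
        # with a key that is checked into source control.
        "SECRET_KEY": environ["DJANGO_SECRET_KEY"],
        "ALLOWED_HOSTS": env_list("DJANGO_ALLOWED_HOSTS", environ=environ),
    }
```

In a real settings.py these values would be assigned to module-level names such as DEBUG and ALLOWED_HOSTS; they are returned as a dict here only to keep the sketch easy to exercise.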

Carlton #4: Deployment: Getting your app online

  • Deployment (and how hard it is) seems to come up almost every week on Django Chat.
  • I think a lot about The Deployment Gap that exists: a new user finishes the Django tutorial, or the DRF tutorial or Will’s Django for Beginners book, and they’re still a long way from being able to deploy.
  • PaaS looks like a good option (Heroku/App Service/DO’s new one, and so on), but it’s a bit of a cul-de-sac (do you use that in America?): you drive in, get to the end and then have to drive out again.
  • On the other hand, DIY all looks far-too-scary™: there’s provisioning servers, private clouds, firewalls, permissions, block stores, object stores, and so on. Argh! (On top of all the usual DNS, and all the rest of it.)
    • Plus there’s a tendency I think towards fashion: you’d think you can’t possibly deploy without adopting a micro-service architecture, or a container orchestration platform. It’s too much.
    • This is the same as “You couldn’t possibly use Django, Postgres…”: you have to use the New Hotness™.
  • I think there’s a simpler story: start with a VM, a relational database, a simple network setup, and grow from there. There’s still moving parts, but it’s not that complex.
  • I’m working on a tool for all this, Button. It’s coming in 2021. It’s a simpler deployment story. It’s part tool, part guide. It gets you through the “Argh! It’s too scary” bit.
  • You can sign up for early updates at https://btn.dev

Brian #5: All Contributors
“This is a specification for recognizing contributors to an open source project in a way that rewards each and every contribution, not just code.
The basic idea is this:

Use the project README (or another prominent public documentation page in the project) to recognize the contributions of members of the project community.

People are giving themselves and their free time to contribute to open source projects in so many ways, so we believe everyone should be praised for their contributions (code or not).”

  • used by nbQA
  • It is a specification for how to be consistent in listing contributors.
  • Also includes an Emoji Key, to be used with a contributor’s name (and optionally avatar) to denote the kind of contribution they’ve made:
    • 💻 code
    • 📖 doc
    • 🎨 design
    • 💡 example
    • 🚧 maintenance
    • 🔌 plugin
    • and many, many more
  • And a GitHub bot to automate acknowledging contributors to your open source projects.
    • Uses natural language parsing to add people as contributors and add the appropriate emoji
  • Also includes a cli for adding contributors, comparing GitHub contributors to your listed contributors, and more.

Michael #6: MovingPandas

  • A Python library for handling movement data based on Pandas and GeoPandas
  • It provides trajectory data structures and functions for analysis and visualization.
  • MovingPandas development started as a QGIS plugin idea in 2018, but it made more sense as its own library.
  • Features
    • Convert GeoPandas GeoDataFrames of time-stamped points into MovingPandas Trajectories and TrajectoryCollections
    • Add trajectory properties, such as movement speed and direction
    • Split continuous observations into individual trips
    • Generalize Trajectories
    • Aggregate TrajectoryCollections into flow maps
    • Create static and interactive visualizations for data exploration
  • MovingPandas makes it straightforward to compute movement characteristics, such as trajectory length and duration, as well as movement speed and direction.
  • Example
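import pandas as pd
import geopandas as gpd
import movingpandas as mpd
from datetime import datetime
from pyproj import CRS
from shapely.geometry import Point
df = pd.DataFrame([
    {'geometry': Point(0, 0), 't': datetime(2018, 1, 1, 12, 0, 0)},
    {'geometry': Point(6, 0), 't': datetime(2018, 1, 1, 12, 6, 0)},
    {'geometry': Point(6, 6), 't': datetime(2018, 1, 1, 12, 10, 0)},
    {'geometry': Point(9, 9), 't': datetime(2018, 1, 1, 12, 15, 0)}
]).set_index('t')  # index the points by timestamp
gdf = gpd.GeoDataFrame(df, crs=CRS(31256))
traj = mpd.Trajectory(gdf, 1)  # 1 is the trajectory id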
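
To make "trajectory length, duration, and speed" concrete, here is a plain-Python sketch that computes them by hand for the same four points, assuming planar coordinates. It does not use the MovingPandas API; the function name trajectory_summary is made up for this illustration.

```python
from datetime import datetime
from math import hypot


def trajectory_summary(points):
    """points: list of (x, y, timestamp) tuples sorted by time."""
    length = 0.0
    # Sum the straight-line distance between each pair of consecutive points.
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        length += hypot(x1 - x0, y1 - y0)
    duration = (points[-1][2] - points[0][2]).total_seconds()
    speed = length / duration if duration else 0.0
    return length, duration, speed


points = [
    (0, 0, datetime(2018, 1, 1, 12, 0, 0)),
    (6, 0, datetime(2018, 1, 1, 12, 6, 0)),
    (6, 6, datetime(2018, 1, 1, 12, 10, 0)),
    (9, 9, datetime(2018, 1, 1, 12, 15, 0)),
]
length, duration, speed = trajectory_summary(points)
print(f"{length:.2f} units in {duration:.0f} s ({speed:.4f} units/s)")
```

The path is 6 + 6 + √18 units long and spans 15 minutes; a library like MovingPandas adds CRS handling, per-segment values, and plotting on top of this kind of arithmetic.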

Extras

Carlton:

Michael:

Joke:

“Give a person a program, frustrate them for a day. Teach them to program, frustrate them for a lifetime. 🙂” (…unless you teach them to test at the same time. - Brian)

The failed interview: “Sorry, we’re looking for someone aged 22-26… with 30 years of experience with Flask”

Oct 23 2020

40mins


Rank #2: #203 Scripting a masterpiece for Python web automation

Sponsored by DataDog: pythonbytes.fm/datadog

Michael #1: Introducing DigitalOcean App Platform

  • Reimagining PaaS to make it simpler for you to build, deploy, and scale apps.
  • Many of our customers have come to DigitalOcean after their PaaS became too expensive, or after hitting various limitations.
  • You can build, deploy, and scale apps and static sites by simply pointing to your GitHub repository.
  • Built on DigitalOcean Kubernetes, the App Platform brings the power, scale, and flexibility of Kubernetes without exposing you to any of its complexity.
  • App Platform is built on open standards providing more visibility into the underlying infrastructure than in a typical closed PaaS environment.
  • You can also enable ‘Autodeploy on Push,’ which automatically re-deploys the app each time you push to the branch containing the source code.
  • To efficiently handle traffic spikes (planned or unplanned), the App Platform lets you scale apps horizontally (i.e., add more instances that serve your app) and vertically (beef up the instances with more CPU and memory resources), with zero downtime.
  • What can you build with the App Platform? Web apps, Static sites, APIs, Background workers

Brian #2: Announcing Playwright for Python

  • playwright-python
  • playwright-pytest
  • it’s a Microsoft thing
  • the pitch: “With the Playwright API, you can author end-to-end tests that run on all modern web browsers. Playwright delivers automation that is faster, more reliable and more capable than existing testing tools.”
  • timeout-free automation
    • automatically waits for the UI to be ready
  • Intended to stay modern
    • emulation of mobile viewports
    • geolocation
    • web permissions
    • can automate scenarios across multiple pages
  • cross browser
    • Chromium (Chrome and Edge), WebKit (Safari), and Firefox
    • Safari rendering even works on Windows and Linux
  • pytest compatible
  • Django compatible
  • Can work within CI/CD, even GH actions.

Michael #3: Asynchronously Opening and Closing Files in asyncio

  • Article by Chris Wellons
  • asyncio has support for asynchronous networking, subprocesses, and interprocess communication. However, it has nothing for asynchronous file operations — opening, reading, writing, or closing.
  • If a file operation takes a long time, perhaps because the file is on a network mount, then the entire Python process will hang.
  • Let’s build it!
  • The usual way to work around the lack of operating system support for a particular asynchronous operation is to dedicate threads to waiting on those operations. By using a thread pool, we can even avoid the overhead of spawning threads when we need them. Plus asyncio is designed to play nicely with thread pools anyway.
  • open() uses with so build an aopen() to have async with. Here’s the tasty bit:
    def __aenter__(self):
        def thread_open():
            return open(*self._args, **self._kwargs)
        loop = asyncio.get_event_loop()
        self._future = loop.run_in_executor(None, thread_open)
        return self._future
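
For context, a complete async context manager built on the same thread-pool idea might look like the sketch below. This is a simplified variation written for these notes, not the code from Chris Wellons's article: it uses async def methods and asyncio.get_running_loop(), and the helper read_text is a hypothetical usage example.

```python
import asyncio


class aopen:
    """Open and close a file on the event loop's default thread pool."""

    def __init__(self, *args, **kwargs):
        self._args = args
        self._kwargs = kwargs
        self._file = None

    async def __aenter__(self):
        loop = asyncio.get_running_loop()
        # open() may block (e.g. on a network mount), so run it in a worker thread.
        self._file = await loop.run_in_executor(
            None, lambda: open(*self._args, **self._kwargs)
        )
        return self._file

    async def __aexit__(self, exc_type, exc, tb):
        loop = asyncio.get_running_loop()
        # close() can block too (flushing buffers), so offload it as well.
        await loop.run_in_executor(None, self._file.close)
        return False  # do not swallow exceptions raised in the with-body


async def read_text(path):
    async with aopen(path, encoding="utf-8") as f:
        # Note: f.read() itself still blocks; only open/close are offloaded here.
        return f.read()
```

Only opening and closing are moved off the event loop in this sketch; reads and writes would need the same run_in_executor treatment to be fully non-blocking.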

Brian #4: Excel: Why using Microsoft's tool caused Covid-19 results to be lost

  • this article was on bbc.com, but it was in several places
  • Nearly 16,000 coronavirus cases went unreported in England.
  • Logs pulled together from data from commercial testing firms (filed as csv files) were combined in an Excel xls template so that they could then be uploaded to a central system and made available to the NHS Test and Trace team, as well as other government computer dashboards.
  • XLS was one problem. Limit is about 65k rows.
  • XLSX increases that limit by about 16 times.
  • But still, …. Excel for this?
  • Comment from Prof Jon Crowcroft from the University of Cambridge:
    • "Excel was always meant for people mucking around with a bunch of data for their small company to see what it looked like.”
    • “And then when you need to do something more serious, you build something bespoke that works - there's dozens of other things you could do.”
    • "But you wouldn't use XLS. Nobody would start with that."
  • In short: Best practices in computing don’t always make it into the rest of the world. Much of the world still runs on Excel.
  • What does this have to do with Python? Well... big datasets should use databases and Python.
  • Check out the Talk Python free webcast on moving from Excel to Python: talkpython.fm/excel-webcast
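
As one possible picture of "databases and Python", the sketch below loads CSV rows into SQLite using only the standard library. The table and column names (results, case_id, result) and the generated sample data are invented for illustration. Unlike a legacy XLS sheet, which stops at 65,536 rows, nothing is silently dropped once that row count is passed.

```python
import csv
import io
import sqlite3

XLS_MAX_ROWS = 65_536  # row limit of the legacy .xls format


def load_csv(conn, csv_file):
    """Insert every CSV row into a (hypothetical) results table; return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS results (case_id TEXT, result TEXT)")
    reader = csv.DictReader(csv_file)
    rows = ((row["case_id"], row["result"]) for row in reader)
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO results VALUES (?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]


# Build an in-memory CSV that is larger than an XLS sheet could hold.
n = XLS_MAX_ROWS + 1000
data = io.StringIO("case_id,result\n" + "".join(f"{i},positive\n" for i in range(n)))
conn = sqlite3.connect(":memory:")
count = load_csv(conn, data)
print(count > XLS_MAX_ROWS)
```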

Michael #5: locust.io

  • via Prayson Daniel
  • locust.io is an awesome tool to simulate users hammering your endpoint. Quite handy.
  • An open source load testing tool: Define user behavior with Python code, and swarm your system with millions of simultaneous users.
  • Usage: after installing it via pip, point it at your local endpoint with locust --host=http://localhost:5000, then open http://localhost:8089 to access the locust web UI and simulate usage
  • Features:
    • Define user behavior in code: No need for clunky UIs or bloated XML. Just plain code.
    • Distributed & scalable: Locust supports running load tests distributed over multiple machines, and can therefore be used to simulate millions of simultaneous users
    • Proven & battle tested: Locust has been used to simulate millions of simultaneous users. Battlelog, the web app for the Battlefield games, is load tested using Locust, so one can really say Locust is Battletested ;).
  • Example:
from locust import HttpUser, between, task


class WebsiteUser(HttpUser):
    wait_time = between(5, 15)

    def on_start(self):
        self.client.post("/login", {
            "username": "test_user",
            "password": ""
        })

    @task
    def index(self):
        self.client.get("/")
        self.client.get("/static/assets.js")

    @task
    def about(self):
        self.client.get("/about/")

Brian #6: Fixing Hacktoberfest

  • various sources
  • Hacktoberfest is an interesting idea sponsored by DigitalOcean and other sponsors.
    • Overall, it’s a good idea. Encourage people to contribute by bribing them with a t-shirt and other swag.
  • Problem and some solutions outlined well by Anthony Sottile in what’s (wrong with) hacktoberfest?
    • There’s always been some spam associated with hacktoberfest.
      • Tiny bizarre PRs, PRs to unmaintained repos, etc.
    • This year has been worse
    • A fairly popular YouTuber posted a video showing people how to get a free t-shirt by doing things like adding “- an awesome project” or expanding “It’s” to “It is” to the readme, then submitting it as “improved docs”.
  • Changes:
    PRs count if:
      Submitted during the month of October AND (
          The PR is labelled as hacktoberfest-accepted by a maintainer OR
          Submitted in a repo with the hacktoberfest topic AND (
              The PR is merged OR
              The PR has been approved
          )
      )
  • The deadline for completions, merging, labeling, and approving is November 1.
  • I applaud DO and whoever else is working on hacktoberfest for reacting quickly to this.
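
The nested AND/OR rule is easy to misread, so here it is restated as a small Python predicate. The function and parameter names are invented for this sketch, and checking only the month is a simplification; it mirrors the parenthesization of the rule as quoted and is not Hacktoberfest's actual implementation.

```python
from datetime import date


def pr_counts(submitted, labelled_accepted, repo_has_topic, merged, approved):
    """Return True if a pull request counts under the revised rules."""
    in_october = submitted.month == 10
    return bool(
        in_october
        and (
            labelled_accepted
            or (repo_has_topic and (merged or approved))
        )
    )


# A maintainer label is enough on its own, even outside an opted-in repo.
print(pr_counts(date(2020, 10, 5), True, False, False, False))
# Merged, but the repo never opted in and nobody labelled it.
print(pr_counts(date(2020, 10, 5), False, False, True, False))
```

The key change is visible in the second call: a merged PR alone no longer counts unless the repository opted in with the hacktoberfest topic or a maintainer labelled the PR.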

Extras:

Michael:

  • PyCascades 2021 will take place on Saturday, February 20th from many locations across the Pacific Northwest and beyond.
  • Call for Proposals 📣 PyCascades has been lucky to give our stage to incredible speakers with wonderful talks over the last three years. We are really looking forward to showcasing our community again next year. Our Call for Proposals (CFP) opens today and closes at the end of the day on Tuesday, November 10th, 2020 Anywhere on Earth.
  • Patricio Reyes, a researcher at Barcelona Supercomputing Center (virtual tour):

Joke: More Classical Programmer Paintings

Oct 16 2020

40mins


Rank #3: #202 Jupyter is back in black!

Sponsored by DataDog: pythonbytes.fm/datadog

Brian #1: New in Python 3.9

  • scheduled to be released Oct 5
  • Python 3.9.0rc2 released Sept 17
  • New features (highlights)
    • Dictionary merge (|) and update (|=) operators.
    • String methods str.removeprefix(prefix) and str.removesuffix(suffix). These have also been added to bytes, bytearray, and collections.UserString.
    • In type annotations you can now use built-in collection types such as list and dict as generic types instead of importing the corresponding capitalized types (e.g. List or Dict) from typing.
    • New PEG parser
    • Any valid expression can be used as a decorator. see PEP 614. Haven’t quite wrapped my head around the possibilities yet.
    • The zoneinfo module brings support for the IANA time zone database to the standard library (https://docs.python.org/3.9/library/zoneinfo.html#module-zoneinfo).
  • Lots of other great stuff too, please check out the changelog and give 3.9 a spin
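
A few of these features in action, as a small sketch that needs Python 3.9 or later. The variable and function names are made up for the example.

```python
# Dictionary merge (|) and update (|=)
defaults = {"debug": False, "port": 8000}
overrides = {"port": 9000}
merged = defaults | overrides        # new dict; the right-hand side wins
defaults |= {"host": "localhost"}    # in-place update

# str.removeprefix / str.removesuffix
name = "test_login.py"
stem = name.removeprefix("test_").removesuffix(".py")


# Built-in collection types as generics, no typing.List needed
def total(values: list[int]) -> int:
    return sum(values)


# PEP 614: any valid expression can be a decorator, e.g. a subscript
handlers = {"shout": lambda func: (lambda *args: func(*args).upper())}


@handlers["shout"]
def greet(who):
    return f"hello {who}"


print(merged, stem, total([1, 2, 3]), greet("pep 614"))
```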

Michael #2: jupyter-black

  • via Mary Hoang
  • I recently tuned into the auto racing episode on Talk Python and liked Kane’s pypi suggestion of blackcellmagic.
  • There are a couple of other pypi packages that envelop the idea of black formatting Jupyter Notebooks and I recently started using a new pypi tool called jupyterblack!
  • This tool lets you black format Notebooks like you would Python files, only you call jblack instead of black.
  • Then the extension provides
    • a toolbar button
    • a keyboard shortcut for reformatting the current code-cell (default: Ctrl-B)
    • a keyboard shortcut for reformatting whole code-cells (default: Ctrl-Shift-B)
  • It will also point out basic syntax errors.

Brian #3: Understanding and preventing DoS in web applications

  • listener submitted suggestion, which led me to a bit of a rabbit hole
  • by Jacob Kaplan-Moss
  • Great discussion of what a DoS attack is, and how to check for and prevent problems, including a focus on Python and Django.
  • One example is ReDoS, regular expression DoS
  • “ReDoS bugs occur when certain types of strings can cause improperly crafted regular expressions to perform extremely poorly.”
  • Links to Finding Python ReDoS bugs at scale using Dlint and r2c, which talks about using dlint.
  • dlint
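
To see what "perform extremely poorly" means in practice, the sketch below times a textbook backtracking-prone pattern against a safer equivalent. The patterns and the helper timed_match are chosen for illustration and do not come from the article or from dlint; the input is kept short so the slow case still finishes quickly.

```python
import re
import time

VULNERABLE = re.compile(r"^(a+)+$")  # nested quantifiers: many ways to split the a's
SAFER = re.compile(r"^a+$")          # accepts the same strings without nesting


def timed_match(pattern, text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


attack = "a" * 18 + "b"  # almost matches, forcing the engine to backtrack
for label, pattern in (("vulnerable", VULNERABLE), ("safer", SAFER)):
    matched, seconds = timed_match(pattern, attack)
    print(f"{label}: matched={matched} in {seconds:.4f}s")
```

With the nested pattern, each additional "a" in the input roughly doubles the work, which is how a short request body can tie up a worker for a very long time.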

Michael #4: bbox-visualizer

  • via Shoumik Sharar Chowdhury (SHOH-mik CHOW-duh-ree)
  • I work with computer vision, and one of the pain points of working with something like object detection or object recognition is positioning the labels once you get the bounding boxes.
  • So for example, in the first image in the README, you get the positions of the boxes around the objects using any object-detection method. That part isn't hard. Positioning the labels like "person", "bicycle", "car" right on top of the boxes, however, is quite annoying. You have to do some clumsy math to make it work like that.
  • This library helps make that very easy. You just use the bounding box locations and their corresponding labels and the library takes care of everything. Moreover, there are some other cool visualizations that you can use, other than the standard label on top of the boxes.
  • Uses OpenCV in Python to work with the image files and for in-memory drawing
  • Define the bounds, set the label text and you’re off.
bbv.draw_rectangle(img, bbox)
bbv.add_T_label(img, label, bbox)

Brian #5: How to NEVER use lambdas.

  • Another listener suggestion.
  • Starts off with a brief example showing how to rewrite a power function as a lambda.
  • Then jumps right into crazy code
  • Replacing import statements with __import__('library') expressions
  • Moving on to lambda-ifying class definitions
  • Ending with a complete Flask application as a lambda expression.
  • Truly horrible stuff
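
The first two steps can be sketched as follows. These snippets were written for these notes rather than copied from the article, and, in keeping with its title, they are a curiosity and not a recommendation.

```python
# Conventional version
def power(base, exponent):
    return base ** exponent


# The same function as a lambda bound to a name
power_lambda = lambda base, exponent: base ** exponent

# An import turned into an expression, so it can live inside a lambda
hypotenuse = lambda a, b: __import__("math").sqrt(
    power_lambda(a, 2) + power_lambda(b, 2)
)

print(power(2, 10), power_lambda(2, 10), hypotenuse(3, 4))
```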

Michael #6: Uncommon Contributions: Making impact without touching the core of a library

  • via Alexander, by Vincent Warmerdam
  • Different ways that people can contribute to open source software besides the typical code contribution.
  • Often, contributions include adding features to a library, fixing bugs, or providing examples to a documentation page. But consider:
  • Info
    • rasa --version
    • Before, this command would list the current version of Rasa. In the new version, it lists:
    • The version of python.
    • The path to your virtual environment.
    • The versions of related packages.
  • Cron on Dependencies
    • A user for scikit-lego, a package that I maintain, discovered that the reason the code wasn’t working was because scikit-learn introduced a minor, but breaking, change. To fix this the user added a cronjob with Github actions to the project.
  • Spellcheck
    • Run a spellchecker, not just against our docs, but also on our source code! It turns out we had some issues in our docstrings as well.
  • Error Messages
    • In whatlies, we’ve recently allowed for optional dependencies. If you try to use a part of the library that requires a dependency that is not part of the base package, then you’ll get this error message.
In order to use ConveRTLanguage you'll need to install via;
> pip install whatlies[tfhub]
See installation guide here: https://rasahq.github.io/whatlies/#installation
  • I added something like this to fluentcheck: github.com/csparpa/fluentcheck/pull/22
  • Failing Unit Tests
    • There’s a lovely plugin for mkdocs called mkdocs-jupyter. It allows you to easily add jupyter notebooks to your documentation pages. When I was playing with it, I noticed that it wasn’t compatible with a new version of mkdocs. Instead of just submitting a bug to Github, I went the extra mile. I created a PR that contained a failing unit-test for this issue.
  • Renaming files
    • Is there a file.py and a class File in file within a package? Careful there.
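
The error-message idea is straightforward to reuse in other projects. Below is a minimal sketch of a helper that imports an optional dependency and, on failure, tells the user which extra to install. The helper name require_optional and the default package name are hypothetical; this is not how whatlies implements it.

```python
import importlib


def require_optional(module_name, extra, package="mypackage"):
    """Import an optional dependency or raise an error that says how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"In order to use this feature you'll need to install {module_name} via;\n"
            f"> pip install {package}[{extra}]"
        ) from exc


# Succeeds for anything importable; json stands in for an installed dependency.
json_module = require_optional("json", extra="std")
print(json_module.__name__)
```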

Extras:

Joke:

  • Suggested by Tim Skov Jacobsen
    • Kelsey Hightower’s project nocode
    • No Code: “No code is the best way to write secure and reliable applications. Write nothing; deploy nowhere.”
    • No Code Style Guide: “All no code programs are the same, regardless of use case, any code you write is a liability.”
    • 43.6k stars
    • 3.2k issues
    • 426 PRs

Oct 09 2020

33mins


Rank #4: #201 Understand git by rebuilding it in Python

Sponsored by us! Support our work through:

Michael #1: Under the hood of calling C/C++ from Python

  • Basics first: what does C compile to?
    • Each operating system features some exact format to work with. Among the most popular ones are:
    • ELF (Executable and Linkable Format), used by most of Linux distros
    • PE (Portable Executable), used by Windows
    • Mach-O (Mach object), used by Apple products
  • We also need to make our library visible to our programs. The easiest way to do so is to copy it to /usr/lib/, the default system-wide directory for libraries. Maybe put it in system / system32 on Windows?
  • ctypes: the simplest way
    • With the shared object compiled, we are ready to call it.
    • Consider ctypes to be the easiest way to execute some C code, because:
    • it’s included in the standard library,
    • writing a wrapper uses plain Python.
    • lib = ctypes.CDLL(f'/usr/lib/libdullmath.so')
    • lib.get_pi
  • For C: You need to be clear about the calling convention (extern “C” for example)
    • Now we can load libraries at runtime, but we are still missing a way to generate the correct caller ABI to use external C libraries. To deal with it, libffi was created.
    • Libffi is a portable C library, designed for implementing FFI tools, hence the name. Given structs and functions definitions, it calculates an ABI of function calls at runtime.
  • A mature approach to improve in this area is to allow libraries to introduce themselves. We can oblige every library to define a function named entry_point, which will return metadata about functions it contains.
  • Final destination: C/C++ extensions and Python/C API
// NOTE: the entry point function has a dynamic name: PyInit_<module name>
PyMODINIT_FUNC PyInit_mymath(void)
{
    return PyModule_Create(&mymathmodule);
}
  • The main difference is that we have to wrap initial C functions with Python-specific ones. CPython interpreter uses its own PyObject type internally rather than raw int, char*, and so on, and we need the wrappers to perform the conversion.
  • Cython, Boost.Python, pybind11 and all all all
    • The main challenge of writing pure C extensions is a massive amount of boilerplate that needs to be written. Mainly this boilerplate is related to wrapping and unwrapping PyObject. It becomes especially hard if a module introduces its own classes (object types).
    • To solve this issue, a plethora of different tools was created. All of them introduce a certain way to generate wrapping boilerplate automatically. They also provide easy access to C++ code and advanced tools for the compilation of extensions.
    • Examples
    • aiohttp - asyncio web framework that uses Cython for HTTP parsing,
    • uvloop - event loop that is wrapping libuv, fully written in Cython,
    • httptools - bindings to nodejs HTTP parser, also fully written in Cython (a lot of other big projects like sanic or uvicorn use httptools).
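
As a runnable counterpart to the ctypes bullets above, this sketch calls strlen from the C standard library instead of the article's libdullmath.so, which readers will not have installed. It assumes a POSIX system (Linux or macOS); library lookup works differently on Windows.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If the lookup returns None, CDLL(None) on
# POSIX exposes the symbols already loaded into the current process instead.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts values correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"python bytes")
print(length)
```

Setting argtypes and restype is the part that is easy to forget: without them ctypes assumes int arguments and an int return value, which is wrong for functions returning double or pointers.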

Cecil #2: ugit: DIY Git in Python

Michael #3: Things I Learned to Become a Senior Software Engineer

  • by Neil Kakkar
  • Growing using different ladders of abstraction
    • Entering my second year, I had all the basics in place.
    • I did figure out something insightful. I’m working inside the software development lifecycle, but this lifecycle is part of a bigger lifecycle: the product and infrastructure development lifecycle.
  • Learning what people around me are doing
    • Since we’re not in a closed system, it makes sense to better understand the job of the product managers, the sales people, and the analysts.
    • Product managers are the best source for this. They know how the business makes money, who are the clients, and what do clients need.
  • Learning good habits of mind
  • Strategies for making day-to-day more effective: The other side of the coin is habits that allow you to think well. It starts with noticing little irritations during the day, inefficiencies in meetings, and then figuring out strategies to avoid them.
  • Some good habits I’ve noticed:
    • Never leave a meeting without making the decision / having a next action
    • Decide who is going to get it done. Things without an owner rarely get done.
    • Document design decisions made during a project
  • Acquiring new tools for thought & mental models
    • New tools for thought are related to thinking well, but more specific to software engineering.
    • For example, I was recently struggling with a domain with lots of complex business logic. Edge cases were the norm, and we wanted to design a system that handles this cleanly. That’s when I read about Domain Driven Design
  • Protect your slack
    • When I say slack, I don’t mean the company, but the adjective.
    • One thing that gives me high output and productivity gains is to “slow down”. Want to get more done? Slow down.
    • When there is slack, you get a chance to experiment, learn, and think things through. This means you get enough time to get things done.
    • When there is no slack, deadlines are tight, and all your focus goes into getting shit done.
  • Ask Questions
    • Q: What is a package?
    • A: It’s code wrapped together that can be installed on a system.
    • Q: Why do I need packages? A: They give a consistent way of getting all the files you need in the right place. Without them, things are easy to mess up. You need to ensure every file is where it’s supposed to be, the system paths are set up, and dependent packages are available.
    • Q: How do packages differ from applications I can install on my system? A: It’s a very similar idea! Windows installer is like a package manager that helps install applications. Similarly, DPKG and rpm packages are like .exes that you can install on Linux systems, with the help of apt and yum package managers, which are like the windows installers.
  • Force multipliers
    • One sprint I didn’t get much done myself. I wrote very limited code. Instead, I co-ordinated which changes should go out when (it was a complicated sprint), tested they worked well, did lots of code reviews, made alternate design suggestions, and pair-programmed wherever I could to get things un-stuck. We got everything done, and in this case, zooming out helped make decisions for PRs easier. It was one of our highest velocity sprints.
  • Embrace fear: I’ve learned to embrace this feeling. It excites me. It’s information about what I’m going to learn. I’ve taken it so far that I’ve started tracking it in my human log - “Did I feel fear this week?” If the answer is no too many weeks in a row, I’ve gotten too comfortable.
  • Super powers
    • Getting into the source code when documentation isn’t enough
      • Quest: Reading open source code.
    • Quickly build a mental model for the code you’re looking at
      • Quest: Reading open source code.
    • Embracing fear
      • Quest: Build a side project.
    • Confidence to express ignorance
      • Quest: Overcome the first gotcha with growing.

Cecil #4: Build tech skills for space exploration

Michael #5: Profiling Django Views

  • by Farhan Azmi
  • We know we need to profile our code
  • Many Python profiling tools exist, but this article limits itself to the most used ones: cProfile and django-silk. Both mainly profile function calls and execution time.
  • To incorporate cProfile to Django views, we can write our own middleware that captures the profiling on every request sent to our Django views.
  • Thankfully, there exists a simpler solution: django-cprofile-middleware. It is a simple profiling middleware created by a Github user omarish.
  • To profile this view with the installed middleware, we can just append the prof parameter to the end of the URL, i.e. http://localhost:8000/api/auth/users/availability/?username=<username>&email=<email>&prof
  • We can visualize the profile result further with Python profiler visualizing library, such as SnakeViz. Just add &download to the request.
  • However, the profile result does not show which database query causes the performance hit. That matters especially when the application is centered around database (SQL) queries, and that’s where django-silk comes in.
  • Add as middleware: Silk will automatically intercept requests we make to our views and the UI can be accessed from the path /silk/ .
  • Dive into a request to see all the headers/form/etc + DB query and perf.
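
To show roughly what "write our own middleware" involves, here is a sketch of a Django-style middleware class that profiles a request with cProfile when the prof query parameter is present. It follows the get_response callable convention used by Django middleware but imports nothing from Django; the class name and the last_report attribute are invented for this example, and it is not the django-cprofile-middleware implementation.

```python
import cProfile
import io
import pstats


class ProfileMiddleware:
    """Profile the wrapped view when the request carries a ?prof parameter."""

    def __init__(self, get_response):
        self.get_response = get_response
        self.last_report = None  # hypothetical: kept in memory for inspection

    def __call__(self, request):
        if "prof" not in request.GET:
            return self.get_response(request)

        profiler = cProfile.Profile()
        # runcall profiles exactly one call and returns its result.
        response = profiler.runcall(self.get_response, request)

        buffer = io.StringIO()
        stats = pstats.Stats(profiler, stream=buffer)
        stats.sort_stats("cumulative").print_stats(10)  # top 10 entries
        self.last_report = buffer.getvalue()
        return response
```

A real middleware would more likely return the report in the response or write it to a file that a viewer such as SnakeViz can open, rather than holding it on the instance.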

Cecil #6: Send an SMS message with Azure Communication Services

Extras:

Joke: Dependencies

Oct 02 2020

40mins


Rank #5: #200 No dog-piling please (it's episode 200!)

Sponsored by us! Support our work through:

Brian #1: How to be helpful online

  • Ned Batchelder
  • Advice for answering questions online. There’s lots of it; we’ll focus on just a few points here.
    • Answer the question first. There may be other problems with their code that they are not asking about that you want to point out. But keep that for after you’ve helped them and built up trust.
    • No third rails. “It should be OK for someone to ask for help with a program using sockets, and not have to defend using sockets, especially if the specific question has nothing to do with sockets.” Same for pickle, threads, globals, singletons, etc. Don’t let your strong opinions derail the conversation. The goal is to help people. Strong reactions can make the asker feel attacked.
    • No dog-piling.
    • Meet their level. “Try to determine what they know, and give them a reasonable next step, not the ultimate solution. A suboptimal solution they understand is better than a gold standard they can’t make use of.”
    • Say yes.
    • Avoid absolutes.
    • Step back.
    • Take some blame.
    • Use more words. “IRC and other online mediums encourage quick short responses, which are exactly the kinds of responses that will be easy to misinterpret. Try to use more words, especially encouraging optimistic words.”
    • Understand your motivations.
    • Humility.
    • Make connections.
    • Finally: It’s hard.
  • All of Ned’s advice is great. Good meditations for when you read a question and your mouth drops open and your eyes stare in shock.

Michael #2: blackcellmagic

  • IPython magic command to format python code in cell using black.
  • Has a great animated gif ;)
  • Just do: %load_ext blackcellmagic
  • Then in any cell %%black and magic!
  • Accepts “arguments” like %%black -l 79
  • Tobin Jones has been kind enough to develop an NPM package on top of blackcellmagic to format all cells at once, which can be found here. But it’s archived, so no idea whether it’s current.

Brian #3: Test smarter, not harder

  • Luke Plant
  • There’s lots of great advice in here, but I want to highlight two parts that are often overlooked.
  • “Write your test code with the functions/methods/classes you wish existed, not the ones you’ve been given.” “If the API you want to use doesn’t exist yet, you still use it, and then make it exist.”
    • This is huge.
    • People tend to think like this while coding, but forget to do it while testing.
    • Also, your tests are often the first client for your API, so if the API in question is under your control and you need an easier API for testing, consider adding it to the real API. If it’s easier for testing, it may be easier for other clients of the API as well.
  • “Only write necessary tests — specifically, tests whose estimated value is greater than their estimated cost. This is a hard judgement call, of course, but it does mean that at least some of the time you should be saying “it’s not worth it”.”

Michael #4: US: The Greatest Package in the World

  • by Jeremy Carbaugh
  • A package for easily working with US and state metadata:
  • all US states and territories
  • postal abbreviations
  • Associated Press style abbreviations
  • FIPS codes
  • capitals
  • years of statehood
  • time zones
  • phonetic state name lookup
  • is contiguous or continental
  • URLs to shapefiles for state, census, congressional districts, counties, and census tracts
  • The state lookup method allows matching by FIPS code, abbreviation, and name
  • Even a CLI: $ states md

Brian #5: Think Like A Coder

  • Part of TED-Ed
  • “… a 10-episode series that will challenge viewers with programming puzzles as the main characters— a girl and her robot companion— attempt to save a world that has been plunged into turmoil.”
  • Although I only count 9 episodes, I was hooked 4 episodes in.
  • Main cool thing, I think, is introducing terms and topics so they will be familiar when someone really does start coding:
    • loops, for loops, until loops, while loops
    • conditionals
    • variables
    • path logic
    • permutations
    • searches
    • tables
    • recursion
    • Big O
  • TED-Ed has tons of other cool series on lots of subjects.
  • Also highly recommended for getting excited about coding: CodeCombat

Michael #6: Costs of running a Python web app for 55k monthly users

  • How much does running a web app in production actually cost?
  • KeepTheScore is an online software for scorekeeping. Create your own scoreboard for up to 150 players and start tracking points. It's mostly free and requires no user account.
  • Keepthescore.co is a Python Flask application running on DigitalOcean and Firebase. It currently has around 55k unique visitors per month, or around 3.4k per day.
  • Servers and database on DigitalOcean: Costs per month: $95, the servers are oversized for the load they’re currently seeing.
  • Amazon Web Services: Costs per month: $60, used to run a reporting tool called Metabase that generates insights and reports from the database
  • Google Cloud, costs per month: $1.32, for Firebase
  • DNS hosting, costs per month: $5
  • Disqus, costs per month: $10
  • Is it worth it? Is there revenue?
  • In total that’s around $171 USD per month. If you’re running a company with employees that would be peanuts, but in this case the cost is being borne by a single indie-developer out of his own pocket.
  • The bigger issue is that on the revenue side there’s a big fat zero. This is the reason why we are currently working on monetization.
  • Some Talk Python stats:
  • Maybe 40k monthly visitors, but oh, the podcast clients
  • 3M requests / month just RSS, resulting in 320 GB / mo of XML traffic.
  • We run on two prod servers: $10 & $5 as well as a dedicated MongoDB server @ $10. Total $25/mo.
  • On the other hand, Talk Python Training's AWS bill last month was over $1,000 USD.
  • You can hear a bunch about this on Talk Python 215.

Joke:

  • From twitter, originally from Netlify:

    • "Oh no! We lost the hackers! Where did they go?"
    • "I don't know! They just ransomware!"
  • Number of days since I have encountered an array index error: -1.

Sep 25 2020

32mins


Rank #6: #199 Big news for a very small Python runtime


Sponsored by us! Support our work through:

Michael #1: micropython updated

  • via Matt Trentini
  • v1.13 is packed with features and bugfixes including solid asyncio support and tasty BLE improvements. Heck, we've even got the walrus operator.
  • a new implementation of the uasyncio module which aims to be more compatible with CPython's asyncio module.
  • The main change is to use a Task object for each coroutine, allowing more flexibility to queue tasks in various places, eg the main run loop, tasks waiting on events, locks or other tasks.
  • It no longer requires pre-allocating a fixed queue size for the main run loop.
  • Most code in this repository is now auto-formatted using uncrustify for C code and Black for Python code.
  • BlueKitchen BTstack bindings have been added for the ubluetooth module, as an optional alternative to the NimBLE stack. The unix port can now be built with BLE support using these bindings
  • Other Bluetooth additions include: new events for service/characteristic/descriptor discovery complete; new events for read done and indicate acknowledgement; and support for active scanning in BLE.gap_scan().
  • PEP 572 (the walrus operator) has been implemented
  • There has been an important bug fix when importing ARM machine code from an .mpy file: the system now correctly tracks the executable memory allocated to the machine code so this memory is not reclaimed by the garbage collector.
  • For testing, a multi-instance test runner has been added (see tests/run-multitests.py) which allows running a synchronised test across two or more MicroPython targets.
  • There are breaking changes
  • First release since Dec 19, 2019

Brian #2: respx: A utility for mocking out the Python HTTPX library

import httpx
import respx

@respx.mock
def test_something():
    request = respx.post("https://foo.bar/baz/", status_code=201)
    response = httpx.post("https://foo.bar/baz/")
    assert request.called
    assert response.status_code == 201
  • Documentation includes examples of using respx with both pytest and unittest, including how to set up mocked_api fixtures for pytest.
  • There are call statistics you can assert on.
  • Ability to raise exceptions, return non-200 status codes, set custom return content.
  • Content can be generated in a callback method.
  • JSON content can be returned
  • Tons of nice options to help test your httpx based application.

Michael #3: GetPy - A Vectorized Python Dict/Set

  • The goal of GetPy is to provide the highest performance python dict/set that integrates into the python scientific ecosystem.
  • GetPy is a thin binding to the Parallel Hashmap (https://github.com/greg7mdp/parallel-hashmap.git) which is the current state of the art unordered map/set with minimal memory overhead and fast runtime speed.
  • The binding layer is supported by PyBind11 (https://github.com/pybind/pybind11.git)
  • The gp.Dict and gp.Set objects are designed to maintain a similar interface to the corresponding standard python objects.
  • Simple example:
import numpy as np
import getpy as gp

key_type = np.dtype('u8')
value_type = np.dtype('u8')

keys = np.random.randint(1, 1000, size=10**2, dtype=key_type)
values = np.random.randint(1, 1000, size=10**2, dtype=value_type)

gp_dict = gp.Dict(key_type, value_type)
gp_dict[keys] = values

Brian #4: isort and black now play nice together easily

Michael #5: Scientists rename human genes to stop Microsoft Excel from misreading them as dates

  • Via Chris Moffitt
  • There are tens of thousands of genes in the human genome
  • Each gene is given a name and alphanumeric code, known as a symbol, which scientists use to coordinate research.
  • Over the past year or so, some 27 human genes have been renamed, all because Microsoft Excel kept misreading their symbols as dates.
  • Excel is regularly used by scientists to track their work and even conduct clinical trials.
  • But its default settings were designed with more mundane applications in mind, so when a user inputs a gene’s alphanumeric symbol into a spreadsheet, like MARCH1 — short for “Membrane Associated Ring-CH-Type Finger 1” — Excel converts that into a date: 1-Mar.
  • One study from 2016 examined genetic data shared alongside 3,597 published papers and found that roughly one-fifth had been affected by Excel errors.
  • See 12 of the Biggest Spreadsheet Fails in History for more examples: https://blogs.oracle.com/smb/10-of-the-costliest-spreadsheet-boo-boos-in-history
  • The scientific body in charge of standardizing the names of genes, the HUGO Gene Nomenclature Committee, published new guidelines for gene naming. From now on, human genes and the proteins they express will be named with one eye on Excel’s auto-formatting.
  • Check out the Excel to Python course and webcast to escape this.

Brian #6: Never Run ‘python’ In Your Downloads Folder

  • by Glyph
  • This is really a nice, short tutorial on how sys.path is populated, why you should care, and why you need to make sure it’s only trusted locations.
  • “downloads” is definitely not trusted.
  • So never, ever, ever run python from the downloads directory, even with python -m something, as that adds the download dir to the module search path (sys.path).
  • Example includes a demonstration of malicious js code that downloads a fake pip.py to your downloads folder, so when you call python -m pip install ./legit_package.whl you get the fake pip.
  • Further examples show how you need to be vigilant to check your dot files for weird PYTHONPATH extensions and additions.
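
A small sketch of the idea, not code from Glyph's article: with python -m, the current working directory goes to the front of sys.path, so a stray pip.py sitting there would be found before the real pip. The helper name shadowing_candidates and the default list of module names are made up for illustration.

```python
import sys
from pathlib import Path


def shadowing_candidates(directory, names=("pip", "setuptools", "json")):
    """Return .py files in `directory` whose names match modules we care about.

    `names` is a hypothetical watch list; extend it with whatever you
    normally run via `python -m`.
    """
    folder = Path(directory)
    return sorted(p.name for p in folder.glob("*.py") if p.stem in names)


if __name__ == "__main__":
    # The first sys.path entry is where Python looks first for imports.
    print("first sys.path entry:", repr(sys.path[0]))
    print("possible shadowing files here:", shadowing_candidates(Path.cwd()))
```

Running this from a folder before typing python -m pip there is a cheap sanity check, though the safer habit is simply to cd somewhere trusted first.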

Extras:

Michael:

  • We recently passed 5,000,000 downloads of the audio files over at Python Bytes and are the 130th most popular tech podcast in the world. Thank you everyone!
  • Got a new LinkSys WiFi 6 mesh router, and wow, highly recommended.

Joke

Are you a real programmer? Check with XKCD to find out.

Sep 17 2020

29mins


Rank #7: #198 There's a beaver in your database and Anna-Lena drops by


Sponsored by us! Support our work through:

Special guest: Anna-Lena Popkes

Brian #1: Easily create Python scripts using argparse

  • Back in the day, when I was writing most of my utility scripts in bash, I’d keep around an example.bash file with different types of arguments and flags and control structures, etc to use as a template for new scripts.
  • Python has the same problem, or worse, if you use the built-in argparse instead of something like click or typer. However, there are many times where you don’t want to have any external dependencies on a script, so built-in argparse it is.
  • But I definitely relate to this tweet:
    • “Every time I write a python script, I have to go back to an old script of mine to remember how to set up argparse. For some reason it just does not stick in my mind AT ALL.” - Joshua Schraiber
  • Well, then steps in Ken Youens-Clark with a little utility called new.py. It’s not pip install-able, so you gotta clone it or fork it or copy it or whatever. But it’s cool and fairly simple to hack on yourself, and you’re going to want to make it your own anyway, so that’s fine.
  • You do something like python new.py foo.py and it creates an example starter foo.py for you with:
    • a positional argument
    • a string argument
    • an integer argument
    • a file argument (which also checks to make sure the file is readable)
    • a boolean flag
  • Modify, copy, paste, delete, whatever you want to it now to make it the script you need super fast.
  • Also, add a -t flag to it, like this python new.py -t foo.py, and it generates a test stub to test your new script.
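
For a feel of what such a starter might contain, here is a minimal argparse sketch with the five kinds of arguments listed above. It is not the output of new.py; the option names (--arg, --int, --file, --on) and the function name build_parser are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser with one argument of each kind from the list above."""
    parser = argparse.ArgumentParser(description="Hypothetical starter script")
    # A positional argument
    parser.add_argument("positional", metavar="str", help="A positional argument")
    # A named string argument
    parser.add_argument("-a", "--arg", metavar="str", default="",
                        help="A named string argument")
    # A named integer argument; argparse converts and validates it
    parser.add_argument("-i", "--int", metavar="int", type=int, default=0,
                        help="A named integer argument")
    # A file argument; FileType opens it, so an unreadable file is an error
    parser.add_argument("-f", "--file", metavar="FILE",
                        type=argparse.FileType("rt"), default=None,
                        help="A readable file")
    # A boolean flag
    parser.add_argument("-o", "--on", action="store_true", help="A boolean flag")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Something like python foo.py hello -i 3 -o would then give a namespace with positional='hello', int=3 and on=True.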

Michael #2: DBeaver Database UI Tool

  • via exhuma
  • Remember I mentioned BeeKeeper
  • Free multi-platform database tool for developers, database administrators, analysts and all people who need to work with databases.
  • Supports all popular databases: MySQL, PostgreSQL, SQLite, Oracle, DB2, SQL Server, Sybase, MS Access, Teradata, Firebird, Apache Hive, Phoenix, Presto, etc.
  • Out-of-the box DBeaver supports more than 80 databases.
  • Having usability as its main goal, DBeaver offers:
    • Carefully designed and implemented User Interface
    • Support of Cloud datasources
    • Support for Enterprise security standard
    • Capability to work with various extensions for integration with Excel, Git and others.
    • Multiplatform support
    • Nice UML table/entity diagrams
  • Open source: github.com/dbeaver/dbeaver
  • Based on Eclipse

Anna-Lena #3: pdb++ debugger

  • I recently switched from using ipdb to pdb++
  • Extension of the pdb module of the standard library
  • Fully compatible with pdb but introduces some new features to improve debugging experience
  • Can easily be installed with pip install pdbpp (pdb++ is not a valid package name)
  • Favorites: 1) sticky mode, 2) smart command parsing
  • Sticky mode: “When in this mode, every time the current position changes, the screen is repainted and the whole function shown. Thus, when doing step-by-step execution you can easily follow the flow of the execution.”
  • Smart command parsing:
    • pdb tries to interpret entered commands as one of its builtin commands
    • Inconvenient in some situations
    • Example: printing value of a local variable which happens to have the same name as one of the commands (e.g. c could refer to a local variable but is interpreted as the command ‘continue’)
    • pdb++ solution: in case of ambiguity / if a variable with the same name exists in the scope, it’s preferred
    • To execute the corresponding command, you can prefix it with !!

Brian #4: Markdown toys

  1. HackMD.io
    • I just found out about HackMD at hackmd.io and I’m quite impressed.
    • “HackMD is a realtime, multi-platform collaborative markdown knowledge base. You can write notes with other people on your desktop, tablet or even on the phone.”
    • Two panel markdown editor with some nice menus to help you remember how to do all the fancy stuff like
      • inserting pictures
      • tables, with all the table options
      • quotes, references, TOC blocks, links, etc.
    • Great for people learning Markdown and for collaborating.
    • Even has fancy addons like
      • math expressions
      • UML Diagrams
      • todo lists
    • And now, sync with github works, so you can edit files that are saved on github.
  2. Markdown Guide
    • Just a really good, clean, “… free and open-source reference guide that explains how to use Markdown, the simple and easy-to-use markup language you can use to format virtually any document.”
    • Includes
      • Getting started page
      • Cheat Sheet for super common elements
      • Basic Syntax for more of the details
      • Extended Syntax page
      • Tools with links to lots of tools, including HackMD

Michael #5: Python Malware and obfuscation

  • via Connor Ferster
  • Malware is starting to appear that has been written using the Python programming language. Traditionally, most malware has been written in compiled languages, such as C or C++.
  • Uses all the tools we promote for distributing apps: py2exe and py2app (which I used for urlify)
  • Specific examples of Python malware include SeaDuke that was used against the Democratic National Committee back in 2015 and 2016.
  • Lots of interesting tools
    • uncompyle6: The successor to decompyle, uncompyle, and uncompyle2- uncompyle6 is a native Python cross-version decompiler and fragment decompiler. It can be used to translate Python bytecode back into Python source code.
    • pyinstxtractor.py: The PyInstaller Extractor can extract Python data from PyInstaller compiled executables.
  • Detecting Python Compiled Executables: Both PyInstaller and py2exe when compiled on Windows place unique strings within their binary executable.

Anna-Lena #6: attrs package

  • What is attrs? → Python package that simplifies writing classes (dunder methods are created automatically)
  • How is this related to dataclasses?
    • PEP 557 added Data Classes to Python 3.7 that resemble attrs in many ways.
    • The PEP was inspired by attrs and is the result of the wish to simplify writing classes without having to deal with the problems of namedtuples
    • Main difference: data classes are less powerful than attrs (certain features were sacrificed for the sake of simplicity)
    • Example: with attrs you can use validators in your initializer that perform some kind of validation of the input arguments (e.g. checking that they have the correct type)

Extras:

Michael:

Joke:

New code quality metric: WTFs/minute

Sep 11 2020

34mins


Rank #8: #197 Structured concurrency in Python


Sponsored by us! Support our work through:

Michael #1: Structured concurrency in Python with AnyIO

  • AnyIO is a Python library providing structured concurrency primitives on top of asyncio.
  • Structured concurrency is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by using a structured approach to concurrent programming. The core concept is the encapsulation of concurrent threads of execution (here encompassing kernel and userland threads and processes) by way of control flow constructs that have clear entry and exit points and that ensure all spawned threads have completed before exit. — Wikipedia
  • The best overview is Notes on structured concurrency by Nathaniel Smith (or his video if you prefer).
  • Python has three well-known concurrency libraries built around the async/await syntax: asyncio, Curio, and Trio. (WHERE IS unsync?!?! 🙂 )
  • Since it's the default, the overwhelming majority of async applications and libraries are written with asyncio.
  • The second and third are attempts to improve on asyncio, by David Beazley and Nathaniel Smith respectively
  • The AnyIO library by Alex Grönholm describes itself as follows:
    > an asynchronous compatibility API that allows applications and libraries written against it to run unmodified on asyncio, curio and trio.

Example:

import anyio

async def task(n):
    await anyio.sleep(n)

async def main():
    try:
        async with anyio.create_task_group() as tg:
            await tg.spawn(task, 1)
            await tg.spawn(task, 2)
    finally:
        # e.g. release locks
        print('cleanup')

anyio.run(main)

Brian #2: The Consortium for Python Data API Standards

  • One unintended consequence of the advances in multiple frameworks for data science, machine learning, deep learning and numerical computing is fragmentation and differences in common function signatures.
  • The Consortium for Python Data API Standards aims to tackle this fragmentation by developing API standards for arrays (a.k.a. tensors) and dataframes.
  • They intend to work with library maintainers and the community and have a review process.
  • One example of the problem, “mean”. Five different interfaces over 8 frameworks:
numpy: mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
dask.array: mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
cupy: mean(a, axis=None, dtype=None, out=None, keepdims=False)
jax.numpy: mean(a, axis=None, dtype=None, out=None, keepdims=False)
mxnet.np: mean(a, axis=None, dtype=None, out=None, keepdims=False)
sparse: s.mean(axis=None, keepdims=False, dtype=None, out=None)
torch: mean(input, dim, keepdim=False, out=None)
tensorflow: reduce_mean(input_tensor, axis=None, keepdims=None, name=None, reduction_indices=None, keep_dims=None)
  • They are going to start with array API
  • Then dataframes
  • Also, it’s happening fast, hoping to make traction in next few months.

Michael #3: Ask for Forgiveness or Look Before You Leap?

  • via PyCoders
  • Think C++ style vs Python style of error handling
  • Or any exception-first/only language vs. some hybrid thing
  • If you “look before you leap”, you first check if everything is set correctly, then you perform an action.
  • Example:
from pathlib import Path
if Path("/path/to/file").exists():
    ...
  • With “ask for forgiveness,” you don’t check anything. You perform whatever action you want, but you wrap it in a try/except block.
try:
    with open("path/to/file.txt", "r") as input_file:
        return input_file.read()
except IOError:
    # Handle the error or just ignore it
    pass
  • In their example (testing for a subclass, with basically no errors), “Look before you leap” is around 30% slower (155/118≈1.314).
  • But if there are errors: The tables have turned. “Ask for forgiveness” is now over four times as slow as “Look before you leap” (562/135≈4.163). That’s because this time, our code throws an exception. And handling exceptions is expensive.
  • If you expect your code to fail often, then “Look before you leap” might be much faster.
  • Michael’s counter example: gist.github.com/mikeckennedy/00828db1d49d2cd2dac8fa0295e54c23
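
To make the two styles concrete, here is a small side-by-side sketch using dictionary lookups. It is not the article's benchmark or Michael's gist; the function names lookup_lbyl and lookup_eafp are made up for illustration, and no timings are claimed.

```python
def lookup_lbyl(table, key, default=None):
    """Look before you leap: check first, then act."""
    if key in table:
        return table[key]
    return default


def lookup_eafp(table, key, default=None):
    """Ask for forgiveness: act, and handle the failure if it happens."""
    try:
        return table[key]
    except KeyError:
        return default


if __name__ == "__main__":
    import timeit

    data = {"a": 1}
    # Compare the hit path (no exception) and the miss path (exception raised).
    for key in ("a", "missing"):
        for func in (lookup_lbyl, lookup_eafp):
            seconds = timeit.timeit(lambda: func(data, key), number=200_000)
            print(f"{func.__name__:12} key={key!r:10} {seconds:.3f}s")
```

Both functions return the same results; which one is faster on your machine depends on how often the lookup misses, which is the whole point of the discussion above.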

Brian #4: myrepos

  • “You have a lot of version control repositories. Sometimes you want to update them all at once. Or push out all your local changes. You use special command lines in some repositories to implement specific workflows. Myrepos provides a mr command, which is a tool to manage all your version control repositories.”
  • Run mr register for all repos under a shared directory.
  • Then be able to do common operations on a subtree of repos, like mr status, mr update, mr diff, or really anything.
  • See also: Maintaining Multiple Python Projects With myrepos - Adam Johnson

Michael #5: A deep dive into the official Docker image for Python

  • by Itamar Turner-Trauring, via PyCoders
  • Wait, there’s an official Docker image for Python?
  • The base image is Debian GNU/Linux 10, the current stable release of the Debian distribution, also known as Buster because Debian names all their releases after characters from Toy Story
  • Next, environment variables are added: ENV PATH /usr/local/bin:$PATH
  • Next, the locale is set: ENV LANG C.UTF-8
  • There’s also an environment variable that tells you the current Python version: ENV PYTHON_VERSION 3.8.5
  • In order to run, Python needs some additional packages (the dreaded certificates, etc)
  • Next, a compiler toolchain is installed, Python source code is downloaded, Python is compiled, and then the unneeded Debian packages are uninstalled. Interestingly, the packages (gcc and so on) needed to compile Python are removed once they are no longer needed.
  • Next, /usr/local/bin/python3 gets an alias /usr/local/bin/python, so you can call it either way
  • The Dockerfile makes sure to include a newer pip
  • Finally, the Dockerfile specifies the entrypoint: CMD ["python3"]. This means docker run launches into the REPL:
$ docker run -it python:3.8-slim-buster
Python 3.8.5 (default, Aug 4 2020, 16:24:08)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Brian #6: “Only in a Pandemic” section nannernest: Optimal Peanut Butter and Banana Sandwiches

  • Ethan Rosenthal
  • Computer vision, deep learning, machine learning, and Python come together to make sandwiches.
  • Just a really fun read about problems called “nesting” or “packing” and how to apply it to banana slices and bread.

Extras:

Brian:

Michael:

Joke

via Eduardo Orochena

Sep 05 2020

36mins


Rank #9: #196 Version your SQL schemas with git + automatically migrate them


Sponsored by Datadog: pythonbytes.fm/datadog

Brian #1: Surviving Django (if you care about databases)

  • Daniele Varrazzo
  • Hard to summarize, but this is an interesting perspective on getting to know your database better and using database migrations and database schemas, etc. instead of relying on Django’s seemingly agnostic view of databases.
  • Following the article is a nice civilized discussion in the comments between the author, Paolo Melchiorre, Andrew Godwin, and others.
  • Interesting comment by Andrew: “I agree that at some point in a project/company's life, if it's big enough, SQL migrations are the way to go. … Migrations in the out-of-the-box state are mostly there to supplement rapid prototyping, and then like a lot of Django, can be removed/ignored progressively if and when you outgrow the single set of design constraints we had to choose for them.”

Michael #2: Python Numbers and the Flyweight design pattern

  • Working on allocation and other memory internals from my upcoming Python Memory Management and Tips course
  • Flyweight design pattern via Wikipedia
  • Python numbers are expensive ( >= 28 bytes each)
  • Python does not allocate more than one int in the range [-5, 256]
  • Example code: app_flyweight.py
  • Also working on a Python Design Patterns course and Flyweight is back there too.
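
A quick way to poke at this yourself; this is a sketch and not the app_flyweight.py from the show notes. It relies on CPython implementation details (the small-int cache and object sizes on a 64-bit build), and the helper name same_object is made up for illustration.

```python
import sys


def same_object(n):
    """Build the same int twice at runtime and check whether it is one object.

    Going through str() keeps the compiler from folding both values into a
    single shared constant.
    """
    a = int(str(n))
    b = int(str(n))
    return a is b


if __name__ == "__main__":
    print("bytes used by the int 1:", sys.getsizeof(1))
    for n in (-5, 0, 256, 257, 10**6):
        print(n, "shared" if same_object(n) else "separate objects")
```

On CPython, values inside the cached range come back as the very same object, which is the Flyweight pattern in action. Other interpreters may behave differently.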

Brian #3: What Are Python Wheels and Why Should You Care?

  • Brad Solomon
  • Second half is about creating wheels
  • I’m more interested in the first half, a discussion of wheels from the user’s perspective.
  • Most package authors know all this stuff, or most of it. But this is a nice quick intro for the rest of the Python ecosystem as package users.
  • If you pip install something that isn’t a wheel, it’s probably a tarball.
    • pip downloads the tar.gz file
    • builds a wheel, which includes calling setup.py
    • labels it
    • then installs it
  • For pre-wheeled packages, the build and label aren’t needed, so it’s faster
  • Also, the download size is usually smaller, so that part is also faster.
  • Wheels are essentially zip files with specially crafted filenames that tell installers what Python versions and platforms the wheel will support:
    • {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    • ex: cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl
  • Wheels can have platform specific builds, like for macosx vs unix, etc.
  • Advantages of wheels:
    • install faster
    • smaller
    • cut the setup.py execution out
    • no need for compiler as they can be os specific
    • auto generated .pyc files
    • provide consistency
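
The filename pattern above is easy to take apart by hand. Here is a minimal sketch that splits a wheel filename into its tags, assuming the name follows the pattern exactly; the names WheelTags and parse_wheel_filename are made up for illustration, and real tooling should use a maintained parser instead.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    dist: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # No optional build tag present
        dist, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelTags(dist, version, build, python, abi, platform)


if __name__ == "__main__":
    print(parse_wheel_filename("cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl"))
```

For the cryptography example, the python tag is cp35, the abi tag is abi3, and the platform tag is macosx_10_9_x86_64, which is how an installer decides whether the wheel fits your machine.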

Michael #4: Pandas_Alive

  • By Jack McKew
  • Pandas_Alive is intended to provide a plotting backend for animated matplotlib charts for Pandas DataFrames, similar to the already existing Visualization feature of Pandas.
  • With Pandas_Alive, creating stunning, animated visualizations is as easy as calling df.plot_animated()
  • Also supports: GeoSpatialData with geopandas, basemaps with contextily, writing to GIF in memory (no external dependencies), progress bars with tqdm
  • Since release 3 weeks ago, Pandas-Alive has been downloaded over 11,000 times off PyPI which is absolutely unreal
  • Comes with visuals :)

Brian #5: How To Use the Python Map Function

  • Kathryn Hancox
  • map() is so useful, but not obvious to people coming from other languages. If you don’t use it much, that’s fine, but it’s nice to review occasionally because it does more than I originally gave it credit for.
  • This tutorial walks through:
    • Normal map use with a lambda function applied to a list.
    • Using user defined functions.
      • Actually this bit is more confusing than it needs to be as it has a function returning a map object, which is kinda weird in the particular circumstance. But doesn’t detract from the rest too much.
    • Using built in functions to map.
    • Using functions that take more than one argument and using map across multiple iterables.
  • Takeaways
    • map() only applies the function one element at a time during iteration, so it’s efficient with large data sets and with sequences that won’t reach the end.
    • Remember you can use lambdas, built-in functions, and your own functions with map.
    • You can use functions that take multiple arguments, but that requires passing in multiple iterables, one for each function argument.
    • Comprehensions are often just as useful, especially for small data sets, but don’t forget about map.
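
A short sketch of the takeaways above, using made-up sample data rather than the tutorial's own examples:

```python
import itertools

words = ["pear", "fig", "banana"]

# A built-in function with map
lengths = list(map(len, words))

# A lambda with map
doubled = list(map(lambda n: n * 2, [1, 2, 3]))


# Your own function taking two arguments, fed by two iterables
def area(width, height):
    return width * height


areas = list(map(area, [1, 2, 3], [4, 5, 6]))

# map is lazy, so it can even wrap an endless sequence;
# only the items actually requested get computed.
squares = map(lambda n: n * n, itertools.count(1))
first_three = list(itertools.islice(squares, 3))

# The comprehension equivalent of the first example
lengths_again = [len(word) for word in words]

print(lengths, doubled, areas, first_three)
```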

Michael #6: Version your SQL schemas with git + automatically migrate them

  • automigrate project
  • Automigrate is a command-line tool for SQL migrations. Unlike other migration tools, it uses git history to do diffs on create table statements instead of forcing you to write up/down diffs for every change.
  • This tool doesn't make you write & manage a giant folder of up/down migrations. It uses git history to infer them instead, and to version production databases.
  • Not as nice as alembic (even though it portrays itself otherwise). But if you are writing DDL by hand, this is much better!
  • Speaking of which: Generate ORM definitions from SQL: Experimental sqlalchemy generator in sa_harness.py. Try it out with:
python -m automig.lib.sa_harness 'test/schema/*.sql'

Extras:

Michael:

  • Get notified of release for new courses at training.talkpython.fm/getnotified
    • Excel to Python
    • Getting Started in Data Science
    • Python Memory Management and Tips
    • Getting started with Git
    • Python Design Patterns

Jokes!

“Engineers remove dead code after dropping a feature flag”, Sir Frank Bernard Dicksee, 1893, Oil on canvas

“CSS without comments”, Pablo Picasso, 1912

“Experienced developer deploys hotfix on production”, Francisco Goya, Oil on canvas, circa 1788

Aug 27 2020

31mins


Rank #10: #195 Runtime type checking for Python type hints


Sponsored by us! Support our work through:

Michael #1: watchdog

  • via Prayson Daniel
  • Python API and shell utilities to monitor file system events.
  • Example:
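from watchdog.observers import Observer

# event_handler and path are assumed to be defined elsewhere
observer = Observer()
observer.schedule(event_handler, path, recursive=True)
observer.start()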
  • Watchdog comes with an optional utility script called watchmedo
  • try $ watchmedo log and see what happens in that folder.
  • Why Watchdog? Compared to other similar libs

Brian #2: Status code 418

  • Thanks Andy Howe for the suggestion
  • Python 3.9 rc1 is out.
  • One nice enhancement that has made it into 3.9 is a fix for the http library missing HTTP status code 418 “I’m a teapot”.
  • https://bugs.python.org/issue39507
    • Title: http library missing HTTP status code 418 "I'm a teapot"
  • See also: status code 418 comes from HTCPCP, the Hyper Text Coffee Pot Control Protocol, https://tools.ietf.org/html/rfc2324
    • 418 I'm a teapot
Any attempt to brew coffee with a teapot should result in the error code "418 I'm a teapot". The resulting entity body MAY be short and stout.
  • The only other unique HTCPCP code is 406
    • 406 Not Acceptable
… In HTCPCP, this response code MAY be returned if the operator of the coffee pot cannot comply with the Accept-Addition request. Unless the request was a HEAD request, the response SHOULD include an entity containing a list of available coffee additions.
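
If you want to see the new status code from Python, something like this should work on Python 3.9 or newer (on older versions the IM_A_TEAPOT member does not exist, which is exactly what the bug report was about):

```python
from http import HTTPStatus

# Added to the http module in Python 3.9
teapot = HTTPStatus.IM_A_TEAPOT
print(teapot.value, teapot.phrase)

# 406 has been there all along as a regular HTTP status code
not_acceptable = HTTPStatus(406)
print(not_acceptable.value, not_acceptable.phrase)
```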

Michael #3: pydantic’s new Validation decorator

  • via Andy Shapiro
  • Built-in type checking for any function via a decorator
  • easy to add for any public methods in a package
  • pydantic uses lots of cython under the hood so it should be fast....
  • The validate_arguments decorator allows the arguments passed to a function to be parsed and validated using the function's annotations before the function is called.
  • Under the hood this uses the same approach of model creation and initialization; it provides an extremely easy way to apply validation to your code with minimal boilerplate.
  • Example:
    from pydantic import validate_arguments, ValidationError


    @validate_arguments
    def repeat(s: str, count: int, *, separator: bytes = b'') -> bytes:
        b = s.encode()
        return separator.join(b for _ in range(count))


    a = repeat('hello', 3)
    print(a)
    #> b'hellohellohello'

    b = repeat('x', '4', separator=' ')
    print(b)
    #> b'x x x x'

    try:
        c = repeat('hello', 'wrong')
    except ValidationError as exc:
        print(exc)
        """
        1 validation error for Repeat
        count
          value is not a valid integer (type=type_error.integer)
        """
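To get a feel for the general idea of checking arguments against annotations before the call, here is a rough standard-library sketch. It is not how pydantic implements `validate_arguments`: the decorator name `check_arguments` is hypothetical, it only handles plain classes as annotations, and it rejects mismatches instead of coercing them the way pydantic does with `'4'`.

```python
import functools
import inspect


def check_arguments(func):
    """Reject calls whose arguments do not match simple class annotations."""
    sig = inspect.signature(func)
    skipped_kinds = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            expected = param.annotation
            # Skip *args/**kwargs and anything that is not a plain class annotation.
            if param.kind in skipped_kinds or not isinstance(expected, type):
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_arguments
def repeat(s: str, count: int, *, separator: bytes = b"") -> bytes:
    return separator.join(s.encode() for _ in range(count))


print(repeat("hello", 3))
try:
    repeat("hello", "wrong")
except TypeError as exc:
    print(exc)
```

The real decorator goes further by building a model from the signature, which is what gives it parsing, coercion and detailed error reports.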

Brian #4: Building Python Extension Modules in Assembly

  • Anthony Shaw
  • From twitter announcement:
    • “After a series of highly questionable life decisions, my Python extension written in pure assembly is now on PyPI. https://pypi.org/project/pymult/ it required writing an Assembly extension for distutils, I also added GitHub Actions support so its running CI/CD and testing with pytest”.
  • A proof-of-concept to demonstrate how you can create a Python Extension in 100% assembly.
  • Demonstrates:
    • How to write a Python module in pure assembly
    • How to write a function in pure assembly and call it from Python with Python objects
    • How to call the C API to create a PyObject and parse PyTuple (arguments) into raw pointers
    • How to pass data back into Python
    • How to register a module from assembly
    • How to create a method definition in assembly
    • How to write back to the Python stack using the dynamic module loader
    • How to package a NASM/Assembly Python extension with distutils
  • The simple proof-of-concept function takes 2 parameters:
    >>> import pymult
    >>> pymult.multiply(2, 4)
    8
  • May need a few more test cases:
    >>> pymult.multiply(2, 3)
    6
    >>> pymult.multiply(-2, -3)
    6
    >>> pymult.multiply(-2, 3)
    4294967290
  • Also, clearly Anthony has too much time on his hands. Just saying.
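The notes do not say why `multiply(-2, 3)` comes back as 4294967290. One plausible reading, and it is only an assumption, is that the product -6 ends up interpreted as an unsigned 32-bit value. The arithmetic can be checked in plain Python; the helper `as_uint32` is made up for illustration:

```python
import ctypes


def as_uint32(value: int) -> int:
    """Reinterpret an integer as an unsigned 32-bit value (two's complement)."""
    return value & 0xFFFFFFFF


product = -2 * 3

print(as_uint32(product))                         # 4294967290
print(ctypes.c_uint32(product).value)             # 4294967290, same wraparound via ctypes
print(ctypes.c_int32(as_uint32(product)).value)   # -6, reading the same bits as signed
```

In other words, 4294967290 is 2**32 - 6, which matches the value shown in the session above.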

Michael #5: easy property

  • via Ruud van der Ham
  • The easy_property module, developed by me, offers a more intuitive way to define a Python property with getter, setter, deleter, getter_setter and documenter decorators.
  • Normally when you want to define a property that has a getter and a setter, you have to do something like
    class Demo:
        def __init__(self, val):
            self.a = val

        @property
        def a(self):
            return self._a

        @a.setter
        def a(self, val):
            self._a = val
  • IMHO, the @a.setter is a rather ugly decorator, and hard to remember. And there's no way to not define the getter.
  • With the easy_property module, one can use the decorators
    • getter
    • setter
    • deleter
  • as in:
    class Demo:
        def __init__(self, val):
            self.a = val

        @getter
        def a(self):
            return self._a

        @setter
        def a(self, val):
            self._a = val

        @deleter
        def a(self):
            print('delete')
            del self._a
  • In contrast with an ordinary property, the order of definition of getter, setter and deleter is not important. And it is even possible to define a setter only (without a getter), just in case.
  • With easy_property, you can even create a combined getter/setter decorator:
    class Demo:
        def __init__(self, val):
            self.a = val

        @getter_setter
        def a(self, val=None):
            if val is None:
                return self._a
            self._a = val
  • Finally, it is possible to add a docstring to the property, with the @documenter decorator:
    class Demo:
        def __init__(self, val):
            self.a = val

        @getter
        def a(self):
            return self._a

        @documenter
        def a(self):
            return "this is the docstring of Demo.a"

Although this might not be always a good solution, I think in many cases this will make it easier and more intuitive to define properties.
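For a sense of how a combined getter/setter decorator could sit on top of the built-in `property`, here is a small standard-library sketch. It is not the easy_property implementation; the name `combined_property` is hypothetical and the approach is only one way to get similar behaviour.

```python
def combined_property(func):
    """Build a property whose getter and setter both delegate to one function."""

    def fget(self):
        # Called without a value, so the wrapped function acts as a getter.
        return func(self)

    def fset(self, value):
        # Called with a value, so the wrapped function acts as a setter.
        func(self, value)

    return property(fget, fset, doc=func.__doc__)


class Demo:
    def __init__(self, val):
        self.a = val

    @combined_property
    def a(self, val=None):
        """Value stored on the instance."""
        if val is None:
            return self._a
        self._a = val


demo = Demo(3)
demo.a = 5
print(demo.a)  # 5
```

One consequence of using `None` as the "no value" marker in this sketch is that `None` itself cannot be stored through the property.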

Brian #6: Non Blocking Assertion Failures with pytest-check

  • Ryan Howard wrote an article about a project of mine on the TestProject blog.
  • I think it’s a first that someone else wrote an article about something I made. So that’s cool.
  • Most tests do the “check” part with assert statements.
  • The problem is assert stops after the first failure and you often want to check lots of stuff, and you want to see all the failures.
  • Ryan has a good example with checking web pages using selenium and a simple example of wanting to check both the content of an element on the page, and the url.
  • Cool use of pytest-check
  • See also:
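The difference between a blocking assert and a non-blocking check can be illustrated without any plugin. The sketch below is not the pytest-check API; `CheckCollector`, `check_page` and the sample page data are all made up, and the point is only to show failures being recorded and then reported together.

```python
class CheckCollector:
    """Record failed comparisons instead of stopping at the first one."""

    def __init__(self):
        self.failures = []

    def equal(self, actual, expected, label):
        if actual != expected:
            self.failures.append(f"{label}: expected {expected!r}, got {actual!r}")

    def assert_all(self):
        # Raise once at the end, listing every recorded failure.
        if self.failures:
            raise AssertionError("\n".join(self.failures))


def check_page(page):
    checks = CheckCollector()
    checks.equal(page["title"], "Welcome", "title")
    checks.equal(page["url"], "https://example.com/", "url")
    checks.assert_all()


try:
    check_page({"title": "Home", "url": "https://example.com/start"})
except AssertionError as exc:
    print(exc)  # both the title and the url mismatch are reported
```

With plain `assert` statements, the url mismatch would never be seen because the title check would stop the test first.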

Extras:

Brian

  • PSA: There are no capital letters in pytest, even if it begins a sentence.

Michael

Joke:

  • XKCD git - xkcd.com/1597
  • “I used to do low-level programming. Then a product I bought told me, "No assembly required." Since then, I've been coding in Python.” - From Reuven Lerner, inspired by Anthony Shaw

Aug 18 2020

33mins
