
Python Bytes

Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.


#255 Closember eve, the cure for Hacktoberfest?

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Will McGugan Michael #1: Wrapping C++ with Cython By Anton Zhdan-Pushkin A small series showcasing the implementation of a Cython wrapper over a C++ library. C++ library: yaacrl - Yet Another Audio Recognition Library is a small Shazam-like library, which can recognize songs using a small recorded fragment. For Cython to consume yaacrl correctly, we need to “teach” it about the API using `cdef extern`. It is convenient to put such declarations in *.pxd files. One of the first features of Cython that I find extremely useful: aliasing. With aliasing, we can use names like Storage or Fingerprint for Python classes without shadowing the original C++ classes. Implementing a wrapper: pyaacrl - The most common way to wrap a C++ class is to use extension types. As an extension type is just a C struct, it can have an underlying C++ class as a field and act as a proxy to it. Cython documentation has a whole page dedicated to the pitfalls of “Using C++ in Cython.” Distribution is hard, but there is a tool that is designed specifically for such needs: scikit-build. PyBind11 too Brian #2: tbump : bump software releases suggested by Sephi Berry limits the manual process of updating a project version tbump init 1.2.2 initializes a tbump.toml file with customizable settings --pyproject will append to pyproject.toml instead tbump 1.2.3 will patch files: wherever the version is listed (optional) run configured commands before commit; failing commands stop the bump. commit the changes with a configurable message add a version tag push code push tag (optional) run post publish command Tells you what it’s going to do before it does it. (can opt out of this check) pretty much everything is customizable and configurable. I tried this on a flit based project.
Only required one change:

    # For each file to patch, add a [[file]] config
    # section containing the path of the file, relative to the
    # tbump.toml location.
    [[file]]
    src = "pytest_srcpaths.py"
    search = '__version__ = "{current_version}"'

cool example of a pre-commit check:

    # [[before_commit]]
    # name = "check changelog"
    # cmd = "grep -q {new_version} Changelog.rst"

Will #3: Closember by Matthias Bussonnier Michael #4: scikit learn goes 1.0 via Brian Skinn The library has been stable for quite some time, releasing version 1.0 is recognizing that and signalling it to our users. Features: Keyword and positional arguments - To improve the readability of code written based on scikit-learn, now users have to provide most parameters with their names, as keyword arguments, instead of positional arguments. Spline Transformers - One way to add nonlinear terms to a dataset’s feature set is to generate spline basis functions for continuous/numerical features with the new SplineTransformer. Quantile Regressor - Quantile regression estimates the median or other quantiles of Y conditional on X Feature Names Support - When an estimator is passed a pandas’ dataframe during fit, the estimator will set a feature_names_in_ attribute containing the feature names. A more flexible plotting API Online One-Class SVM Histogram-based Gradient Boosting Models are now stable Better docs Brian #5: Using devpi as an offline PyPI cache Jason R. Coombs This is the devpi tutorial I’ve been waiting for. Single machine local server mirror of PyPI (the mirror needs to be primed first), usable in offline mode.

    $ pipx install devpi-server
    $ devpi-init
    $ devpi-server

now in another window, prime the cache by grabbing whatever you need, with the index redirected

    (venv) $ export PIP_INDEX_URL=http://localhost:3141/root/pypi/
    (venv) $ pip install pytest, ...

then you can restart the server anytime, or even offline

    $ devpi-server --offline

tutorial includes examples, proving how simple this is.
Will #6: PyPI command line Extras Brian: I’ve started using pyenv on my Mac just for downloading Python versions. Verdict still out if I like it better than just downloading from python.org. Also started using Starship with no customizations so far. I’d like to hear from people if they have nice Starship customizations I should try. vscode.dev is a thing, announcement just today Michael: PyCascades Call for Proposals is currently open Got your M1 Max? Prediction: Tools like Crossover for Windows apps will become more of a thing. Will: GIL removal https://docs.google.com/document/u/0/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0/mobilebasic?urp=gmail_link https://lwn.net/SubscriberLink/872869/0e62bba2db51ec7a/ vscode.dev Joke: The torture never stops IE (“Safari”) Eating Glue
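The scikit-learn 1.0 item above says most parameters now have to be passed by name. In plain Python that behaviour comes from keyword-only parameters (everything after a bare `*`). The sketch below uses a made-up function, `fit_model`, with made-up parameters; it is not scikit-learn API, just an illustration of the language feature.

```python
def fit_model(data, *, alpha=1.0, max_iter=100):
    """Hypothetical estimator-style function: only `data` may be positional."""
    return {"n_samples": len(data), "alpha": alpha, "max_iter": max_iter}


# Passing options by name works and reads clearly at the call site.
result = fit_model([1, 2, 3], alpha=0.5)

# Passing the same option positionally is rejected with a TypeError.
try:
    fit_model([1, 2, 3], 0.5)
    positional_allowed = True
except TypeError:
    positional_allowed = False
```

The trade-off is a little more typing at the call site in exchange for calls that stay readable when a function has many numeric options.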


20 Oct 2021


#254 Do Excel things, get notebook Python code with Mito

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Muhammad Raza Brian #1: yaml, GH Actions, and Python 3.10 Anthony Shaw (and others) Old: python: [3.7, 3.8, 3.9, 3.10-dev] New: python: ["3.7", "3.8", "3.9", "3.10"] Reasons: GitHub Actions use yaml. yaml treats 3.10-dev as a string, since it’s got non-numbers in it. yaml treats 3.10 as a number, the same as 3.1. Hence, we have to use quotes for “3.10”. Using them on “3.7”, etc. is not necessary, but is a nice consistency. Michael #2: Beating C and Java, Python Becomes the #1 Most Popular Programming Language, Says TIOBE via Brian Skinn "For the first time in more than 20 years we have a new leader of the pack..." the TIOBE Index announced this month. "The long-standing hegemony of Java and C is over.” For Tiobe, its enterprise focus has seen Java and C dominate in recent years, but Python has been snapping at the heels of Java, and has now overtaken it... "Its ease of learning, its huge amount of libraries, and its widespread use in all kinds of domains, has made it the most popular programming language of today. Congratulations Guido van Rossum!" Muhammad #3: Newspaper3k: Article scraping & curation News, full-text, and article metadata extraction This allows you to extract useful information from news articles, similar to Pocket or InstaPaper. Brian #4: PEP 660, pip 21.3, flit 3.4 -> easy editable installs pip install -e /local/dir is a great way to have a project installed while you are developing it. It used to not work with pyproject.toml based projects.
Flit worked around this with flit install --pth-file (or --symlink) PEP 660 - Editable installs for pyproject.toml based builds (wheel based) Plus tons of work by Stéphane Bidoul and others, see Test & Code, episode 163 pip 21.3 (Oct 11), flit 3.4 (Oct 10) now support PEP 660 And now with pip 21.3 and flit 3.4, pip install -e works for flit projects If you are using optional dependencies, for example: [project.optional-dependencies] test = [ "pytest", "tox", ] Then you need to use quotes: pip install -e ".[test]" Michael #5: Mito - a JupyterLab Extension - generates Python code while you work on your analysis via Tomas Rollo Mito is a spreadsheet that helps you complete your Python analyses 10x faster. You edit the Mitosheet, and it generates Python code for you. Best way to experience it is to watch the video Muhammad #6: troposphere Python library to create AWS CloudFormation descriptions The troposphere library allows easier creation of CloudFormation templates by writing Python code to describe AWS resources. Extras Muhammad How to learn Unix Tools Brian PyCon 2022 site is live, https://us.pycon.org/2022/ Joke: Alphabet cancels Loon
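The yaml item in this episode comes down to number parsing: an unquoted 3.10 is read as the number 3.1, so the trailing zero is gone before the CI system ever sees it. A small sketch of the same effect, using Python float literals as a stand-in for what a YAML parser does with unquoted values (that substitution is an assumption made here to keep the example dependency-free):

```python
# Unquoted entries behave like numbers: 3.10 and 3.1 are the same float.
versions_unquoted = [3.7, 3.8, 3.9, 3.10]

# Quoted entries stay strings, so "3.10" keeps its trailing zero.
versions_quoted = ["3.7", "3.8", "3.9", "3.10"]

# Turning the numbers back into text shows what a CI matrix would receive.
as_text = [str(v) for v in versions_unquoted]
lost_the_zero = as_text[-1] == "3.1"
```

Quoting every entry, not only "3.10", costs nothing and avoids having to remember which versions are affected.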


13 Oct 2021


#253 A new Python for you, and for everyone!

Watch the live stream: Watch on YouTube About the show Special guest: Yael Mintz Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Michael #1: awesome-htmx An awesome list of resources about htmx such as articles, posts, videos, talks and more. Good for all sorts of examples and multiple languages We get a few nice shoutouts, thanks Brian #2: Python 3.10 is here !!!! As of Monday. Of course I have it installed on Mac and Windows. Running like a charm. You can watch the Release Party recording. It’s like 3 hours. And starts with hats. Pablo’s is my fav. Also a What’s New video which aired before that with Brandt Bucher, Łukasz Langa, and Sebastian Ramirez (33 min) Includes a deep dive into structural pattern matching that I highly recommend. Reminder of new features: PEP 623 -- Deprecate and prepare for the removal of the wstr member in PyUnicodeObject. PEP 604 -- Allow writing union types as X | Y PEP 612 -- Parameter Specification Variables PEP 626 -- Precise line numbers for debugging and other tools. PEP 618 -- Add Optional Length-Checking To zip. bpo-12782: Parenthesized context managers are now officially allowed. PEP 632 -- Deprecate distutils module. PEP 613 -- Explicit Type Aliases PEP 634 -- Structural Pattern Matching: Specification PEP 635 -- Structural Pattern Matching: Motivation and Rationale PEP 636 -- Structural Pattern Matching: Tutorial PEP 644 -- Require OpenSSL 1.1.1 or newer PEP 624 -- Remove Py_UNICODE encoder APIs PEP 597 -- Add optional EncodingWarning Takeaway I wasn’t expecting: black doesn’t handle Structural Pattern Matching yet.
Yael #3: Prospector (almost) All Python analysis tools together Instead of running pylint, pycodestyle, mccabe and other separately, prospector allows you to bundle them all together Includes the common Pylint and Pydocstyle / Pep257, but also some other less common goodies, such as Mccabe, Dodgy, Vulture, Bandit, Pyroma and many others Relatively easy configuration that supports profiles, for different cases Built-in support for celery, Flask and Django frameworks https://soshace.com/how-to-use-prospector-for-python-static-code-analysis/ Michael #4: Rich Pandas DataFrames via Avi Perl, by Khuyen Tran Create animated and pretty Pandas Dataframe or Pandas Series (in the terminal, using Rich) I just had Will over on Talk Python last week BTW: Terminal magic with Rich and Textual Can limit rows, control the animation speed, show head or tail, go “full screen” with clear, etc. Example: from sklearn.datasets import fetch_openml from rich_dataframe import prettify speed_dating = fetch_openml(name='SpeedDating', version=1)['frame'] table = prettify(speed_dating) Brian #5: Union types, baby! From Python 3.10: “PEP 604 -- Allow writing union types as X | Y” Use as possibly not intended, to avoid Optional: def foo(x: str | None = None) -> None: pass 3.9 example: from typing import Optional def foo(x: Optional[str] = None) -> None: pass But here’s the issue. I need to support Python 3.9 at least, and probably early, what should I do? For 3.7 and above, you can use from __future__ import annotations. And of course Anthony Sottile worked this into pyupgrade and Adam Johnson wrote about it: Python Type Hints - How to Upgrade Syntax with pyupgrade This article covers: PEP 585 added generic syntax to builtin types. This allows us to write e.g. list[int] instead of using typing.List[int]. PEP 604 added the | operator as union syntax. This allows us to write e.g. int | str instead of typing.Union[int, str], and int | None instead of typing.Optional[int]. How to use these. 
What they look like. And how to use pyupgrade to just convert your code for you if you’ve already written it the old way. Awesome. Yael #6: Make your code darker - Improving Python code incrementally The idea behind Darker is to reformat code using Black (and optionally isort), but only apply new formatting to regions which have been modified by the developer Instead of having one huge PR, darker allows you to reformat the code gradually, when you're touching the code for other reasons. Every modified line will be Black formatted Once added to a Git pre-commit hook, or added to PyCharm / VS Code, the formatting will happen automatically Extras Brian: I got a couple PRs accepted into pytest. So that’s fun: 9133: Add a deselected parameter to assert_outcomes() 9134: Add a pythonpath setting to allow paths to be added to sys.path I’ve tested, provided feedback, written about, and submitted issues to the project before. I’ve even contributed some test code. But these are the first source code contributions. It was a cool experience. Great team there at pytest. Michael: New htmx course: HTMX + Flask: Modern Python Web Apps, Hold the JavaScript auto-optional: Due to the comments on the show I remembered to add support for Union[X, None] and Python 3.10’s X | None syntax. Coverage 6.0 released Django 3.2.8 released Yael: data-oriented-programming - an innovative approach to coding without OOP, with an emphasis on code and data separation, which simplifies state management and eases concurrency Help us to make Cornell awesome 🙂 - contributors are warmly welcomed Joke: Pair CAPTCHAing


7 Oct 2021


#252 Jupyter is now a desktop app!

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Ethan Swan Michael #0: Changing themes to DIY Brian #1: SQLFluff Suggested by Dave Kotchessa. A SQL Linter, written in Python, tested with pytest Configurable, and configuration can live in many places including tox.ini and pyproject.toml. Great docs Rule reference with anti-pattern/best practice format Includes dialects for ANSI, PostgreSQL, MySQL, Teradata, BigQuery, Snowflake Note in docs: “SQLFluff is still in an open alpha phase - expect the tool to change significantly over the coming months, and expect potentially non-backward compatible api changes to happen at any point.” Michael #2: JupyterLab Desktop JupyterLab App is the cross-platform standalone application distribution of JupyterLab. Bundles a Python environment with several popular Python libraries ready to use in scientific computing and data science workflows. JupyterLab App works on Debian and Fedora based Linux, macOS and Windows operating systems. Ethan #3: Requests Cache Create a requests_cache session and call HTTP methods from there You can also do it without a session but that’s a bit weird, looks like it’s monkey patching requests or something… Results are cached Very handy for repeatedly calling endpoints especially if the returned data is large, or the server has to do some compute Reminds me of @functools.lru_cache Can set things like how long the cache should last (when to invalidate) Funny easter egg in example: “# Cache 400 responses as a solemn reminder of your failures” Brian #4: pypi-rename This is a cookiecutter template from Simon Willison Backstory: To refresh my memory on how to publish a new package with flit I created a new pytest plugin. Brian Skinn noticed it somehow, and suggested a better name. Thanks Brian. So, how to nicely rename. I searched and found Simon’s template, which is… A cookiecutter template.
So you can use cookiecutter to do some of this work for you. But it’s based on setuptools, and I kinda like flit lately, so I just used the instructions. The README.md includes instructions for the steps needed: Create renamed version Publish under new name Change old one to depend on new one, but be mostly empty Modify readme to tell people what's going on Publish old name as a notice Now people looking for old one will find new one. People just installing old one will end up with new one also since it’s a dependency. Michael #5: Django 4 coming with Redis Adapter #33012 closed New feature (fixed) → Add a Redis cache backend. Adds support for Redis to be used as a caching backend with Django. Redis is the most popular caching backend, adding it to django.core.cache module would be a great addition for developers who previously had to rely on the use of third party packages. It will be simpler than that provided by django-redis, for instance customising the serialiser is out-of-scope for the initial pass. Ethan #6: PEP 612 It wasn’t possible to type a function that took in a function and returned a function with the same signature (which is what many decorators do) This creates a ParamSpec – which is much like a TypeVar, for anyone who has used them to type generic functions/classes It’s a reminder that typing is still missing features and evolving, and it’s good to accept the edge cases for now – “gradual typing” Reading Fluent Python by Ramalho has influenced my view on this – don’t lose your mind trying to type crazy stuff, just accept that it’s “gradual” Mention how typing is still evolving in Python and it’s good to keep an eye out for new features that help you (see also PEP 645 – using int? 
for Optional[int]; and PEP 655 – annotating some TypedDict keys as required and others not required) Extras Michael Earsketch Django Critical CVE: CVE-2021-35042 Vulnerable versions: >= 3.0.0, < 3.1.13 Patched version: 3.1.13 Django 3.1.x before 3.1.13 and 3.2.x before 3.2.5 allows QuerySet.order_by SQL injection if order_by is untrusted input from a client of a web application. Ethan Pedalboard I happened upon this project recently and checked back, only to see that Brett Cannon was the last committer! A doc fix, like he suggested last episode Brian Zero Cost Exceptions in Python 3.11 Suggested by John Hagen Guido, Mark Shannon, and others at Microsoft are working on speeding up Python faster-cpython/ideas repo includes a slide deck from Guido which includes “Zero overhead” exception handling. Python 3.11 “What’s New” page, Optimizations section includes: “Zero-cost” exceptions are implemented. The cost of try statements is almost eliminated when no exception is raised. (Contributed by Mark Shannon in bpo-40222.) MK: I played with this a bit. Joke: QA 101


29 Sep 2021


#251 A 95% complete episode (wait for it)

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Brett Cannon Michael #1: auto-optional by Daan Luttik Did you know that concrete types cannot be None in Python typing? This is wrong: def do_a_thing(extra_info: str = None): ... auto-optional will fix it: def do_a_thing(extra_info: Optional[str] = None): ... Why would you want this? Easily modify external libraries that didn't pay attention to proper use of optional to improve mypy linting. Force consistency in your own code-base: Enforcing that None parameter implies an Optional type. Run via the CLI: auto-optional [path] Brian #2: Making World-Class Docs Takes Effort Daniel Stenberg Six requirements for a project to get a gold star docs in the code repo NOT extracted from the code examples, lots of examples, more than you think you need document every API call you provide easily accessible and browsable and hopefully offline readable as well easy to contribute to Non-stop iterating is key to having good docs. extra goodness consistency for section titles cross-references I’d add Check for grammar and spelling mistakes Consistency in all things, formatting, style, tone, depth of info of diff topics Don’t be afraid to have a personality. docs that include easter eggs, fun examples, tasteful jokes, etc are nice, as long as that fun stuff doesn’t complicate the docs. Don’t slam projects for having bad docs. Not all open source projects exist for your benefit. You can make them better by contributing. :) Brett #3: Starship Continuing the trend of stuff to help make your coding better, Python or not. 😉 Also to make Michael’s new love of nerd fonts more useful. 😁 And more Rust on this show as Paul Everitt says I must do. 
😉 Gives you a common shell prompt no matter which shell you use; I also find it easy to set up compared to most shells for their prompts Lots of integrated support for various developer things such as printing what Python version you have when the directory has a pyproject.toml file. Works nicely with the Python Launcher (as I mentioned the last time I was on). Has some pyenv support that I don’t use. 😁 Michael #4: JMESPath via Josh Thurston Spent tons of time figuring out how to parse the pretty print results that had layers of nested dictionaries and lists. This module saved me time in a big way. JMESPath (pronounced “james path”) allows you to declaratively specify how to extract elements from a JSON document. For example, given this document: {"foo": {"bar": "baz"}} The jmespath expression foo.bar will return “baz”. Even works with a projection-like result: {"foo": {"bar": [{"name": "one"}, {"name": "two"}]}} The expression: foo.bar[*].name will return ["one", "two"]. Negative indexing is also supported (-1 refers to the last element in the list). Given the data above, the expression foo.bar[-1].name will return "two". Brian #5: pedalboard - audio effects library from Spotify The “power, speed, and sound quality of a DAW”, but in Python. Introduction Article (warning: weird color changing header image that is painful to look at, so scroll past that quickly) Built-in support for a number of basic audio transformations: Convolution, Compressor, Chorus, Distortion Gain, HighpassFilter, LadderFilter, Limiter, LowpassFilter Phaser, Reverb Brett #6: PEP 665 (and the journey so far) Attempt to standardize lock files for Python. Spent six months talking w/ folks privately to come up with the first public draft. Initially a strict lock file, but Poetry and PDM feedback was platform-agnostic was important. Proposal morphed to cover that. Took it public and led to over 150 comments on Discourse. 
People disliked it: from the title to the explanation to the proposed problem space to the actual solution. Gone back to the drawing board privately w/ one of the original objectors participating; looking like we are reaching a good consensus on how to frame things and how it should ultimately look. (Packaging) PEPs are hard. Extras Brian Python is popular, apparently, and “on the verge of another big step forward” (another good place for dun, dun, duuunnn, ?) "It only needs to bridge 0.16% to surpass C. This might happen any time now. If Python becomes number 1, a new milestone has been reached in the TIOBE index. Only 2 other languages have ever been leading the pack so far, i.e. C and Java." Michael Nerd Fonts Evrone interview with me Henry Schreiner’s Fish setup Aliases rather than CLI/venvs Brett Will McGugan did a webinar w/ Paul Everitt about Textual (because it’s not a Python Bytes episode if Will’s name is not brought up). Python Launcher officially launched! (Last covered 30 episodes ago.) Available in AUR, Fedora, and Homebrew (both macOS and Linux). No reported bugs since launch! Still doing my syntactic sugar blog posts. The Python extension for VS Code has a refreshed testing UX; we’re coming for you, Brian. 😉 Joke: Last 5%
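For the JMESPath item, it helps to see the hand-written traversal that the expressions replace. The sketch below deliberately does not import the jmespath package; it only spells out, in plain Python, what the two expressions from the show notes evaluate to on the sample document.

```python
doc = {"foo": {"bar": [{"name": "one"}, {"name": "two"}]}}

# Plain-Python equivalent of the expression  foo.bar[*].name
all_names = [item["name"] for item in doc["foo"]["bar"]]

# Plain-Python equivalent of the expression  foo.bar[-1].name
last_name = doc["foo"]["bar"][-1]["name"]
```

With two levels of nesting the comprehension is still manageable; the declarative expression starts to pay off when the document is deeper and the path changes often.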


22 Sep 2021


#250 skorch your scikit-learn together with PyTorch

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Prayson Daniel Brian #1: Exciting New Ways To Be Told That Your Python Code is Bad Two new pylint errors consider-ternary-expression: if condition(): x = 4 else: x = 5 becomes x = 4 if condition() else 5 while-used: it unconditionally flags every use of while expressions. generally, while should be avoided. Michael #2: GitHub Readme Stats via Роман Великий Dynamically generated stats for your github readmes These are for your repo or your stats (others too I suppose) posted somewhere outside of github Card for a project: https://github-readme-stats.vercel.app/api/pin/?username=mikeckennedy&repo=python-switch Card for a user: https://github-readme-stats.vercel.app/api?username=mikeckennedy&show_icons=true&theme=radical Card for your languages: https://github-readme-stats.vercel.app/api/top-langs/?username=mikeckennedy&repo=python-switch Prayson #3: Nox Nox appeared as “footnotes” in Episodes 182 and 248 (Hypermodern Python …) It does to tox what invoke did to GNU Make (substituting for it) Brian #4: Two tools for dealing with text python-easyfrontmatter - a small package to load and parse files (or just text) with YAML (or JSON, TOML or other) front matter. >>> post = frontmatter.load('tests/yaml/hello-world.txt') >>> print(post['title']) Hello, world! Tried it with a helper script I’m using with Hugo, and it parses Hugo metadata in blog posts like a dream.
ftfy - fixes text for you “Take in bad Unicode and output good Unicode” >>> import ftfy >>> ftfy.fix_text('âœ” No problems') '✔ No problems' Michael #5: MPIRE (MultiProcessing Is Really Easy) A Python package for easy multiprocessing, but faster than multiprocessing It combines the convenience of map like functions of multiprocessing.Pool with the benefits of using copy-on-write shared objects of multiprocessing.Process, together with easy-to-use worker state, worker insights, and progress bar functionality. Many features Requisite shoutout to unsync too. Prayson #6: skorch Going deep learning with scikit-learn pipelines (Breaking limits of multi-layer perceptron (MLP)) Using PyTorch, skorch provides an API to extend neural network models in scikit-learn. Example: Penguins Classification shameless Gist Extras Michael vim + jupyter, via Marco Gorelli PyBay talk Prayson python-decouple Joke: Adoption
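The consider-ternary-expression check from Brian's first item is about two spellings of the same assignment. Laid out on separate lines, the before and after look like this; `condition` is a placeholder function made up so the snippet runs on its own.

```python
def condition() -> bool:
    """Placeholder for whatever test the real code performs."""
    return True


# The form the check flags: four lines to pick one of two values.
if condition():
    x = 4
else:
    x = 5

# The form it suggests: a single conditional expression.
y = 4 if condition() else 5
```

Both forms assign the same value; which one reads better is partly a matter of taste, which is the joke in the article's title.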


15 Sep 2021


#249 All of Linux as a Python API

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Erik Christiansen Michael #1: Fickling via Oli A Python pickling decompiler and static analyzer Pickled ML models are becoming the data exchange and workflow of ML Analyses pickle files for security risks - It can also remove or insert [malicious] code into pickle files... Created by a security firm, it can be a useful defensive or offensive tool. Perhaps it is time to screen all pickles? >>> import ast >>> import pickle >>> from fickling.pickle import Pickled >>> print(ast.dump(Pickled.load(pickle.dumps([1, 2, 3, 4])).ast, indent=4)) Module( body=[ Assign( targets=[ Name(id='result', ctx=Store())], value=List( elts=[ Constant(value=1), Constant(value=2), Constant(value=3), Constant(value=4)], ctx=Load()))]) You can test for common patterns of malicious pickle files with the --check-safety option You can also safely trace the execution of the Pickle virtual machine without exercising any malicious code with the --trace option. Finally, you can inject arbitrary Python code that will be run on unpickling into an existing pickle file with the --inject option. See Risky Biz's episode for more details. Brian #2: Python Project-Local Virtualenv Management Hynek Schlawack Only works on UNIX-like systems. MacOS, for example. Instructions Install direnv. (ex: brew install direnv) Put this into a .envrc file in your project root: layout python python3.9 Now when you cd into that directory or a subdirectory, your virtual environment is loaded. when you cd out of it, the venv is unloaded Notes: Michael covered direnv on Episode 185. But it wasn’t until Hynek spelled it out for me how to use it with venv that I understood the simplicity and power. Not really faster than creating a venv, but when flipping between several projects, it’s way faster than deactivating/activating. 
You can also set env variables per directory (kinda the point of direnv) Erik #3: Testcontainers “Python port for testcontainers-java that allows using docker containers for functional and integration testing. Testcontainers-python provides capabilities to spin up docker containers (such as a database, Selenium web browser, or any other container) for testing. “ (pypi description). Provides cloud native services, many databases and the like (e.g. Google Cloud Pub/Sub, Kafka..) Originally a java project, still a way to go for us python programmers to implement all services Provides an example for use in CI/CD by leveraging Docker in Docker import sqlalchemy from testcontainers.mysql import MySqlContainer with MySqlContainer('mysql:5.7.17') as mysql: engine = sqlalchemy.create_engine(mysql.get_connection_url()) version, = engine.execute("select version()").fetchone() print(version) # 5.7.17 Michael #4: jc via Garett CLI tool and python library that converts the output of popular command-line tools and file-types to JSON or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts. Run it as COMMAND ARGS | jc --COMMAND Commands include: systemctl, passwd, ls, jobs, hosts, du, and cksum. Brian #5: What is Python's Ellipsis Object? Florian Dahlitz Ellipsis or … is a constant defined in Python “Ellipsis: The same as the ellipsis literal “...”. Special value used mostly in conjunction with extended slicing syntax for user-defined container data types.” Can be used in type hinting Func returns two int tuple def return_tuple() -> tuple[int, int]: pass Func returns one or more integer: def return_tuple() -> tuple[int, ...]: pass Replacement for pass: def my_function(): ... 
Ellipsis in the wild, “if you want to implement a certain feature where you need a non-used literal, you can use the ellipsis object.” FastAPI : Ellipsis used to make parameters required Typer: Same Erik #6: PyTorch Forecasting PyTorch Forecasting aims to ease state-of-the-art timeseries forecasting with neural networks for both real-world cases and research alike. The goal is to provide a high-level API with maximum flexibility for professionals and reasonable defaults for beginners. basically tries to achieve for time series what fast.ai has achieved for computer vision and natural language processing The package is built on PyTorch Lightning to allow training on CPUs, single and multiple GPUs out-of-the-box. Implements Temporal Fusion Transformers interpretable - can calculate feature importance Hyperparameter tuning with optuna Extras Brian Python 3.10rc2 available. 3.10 is about a month away Michael GoAccess follow up Caffeinate more - via Nathan Henrie: you mentioned the MacOS /usr/bin/caffeinate tool on "https://pythonbytes.fm/episodes/show/247/do-you-dare-to-press-.". Follow caffeinate with long-running command to keep awake until done (caffeinate python -c 'import time; time.sleep(10)'), or caffeinate -w "$PID" for an already running task. Python Keyboard (via Sean Tabor) Open source is booming (via Mark Little) FFMPEG.WASM ffmpeg.wasm is a pure WebAssembly via Jim Anderson Everything is fine: PyPI packages Python 3.10 RC 2 is out Joke: 200 == 400
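The Ellipsis item lists three uses: in type hints, as a stand-in body, and as a "non-used literal". A short sketch of all three follows, assuming Python 3.9 or newer for the `tuple[int, ...]` hint. The function names are made up, and the sentinel pattern in `lookup` is only an illustration of the idea, not how FastAPI or Typer implement it.

```python
def total(values: tuple[int, ...]) -> int:
    """Type hint use: a tuple of any length whose items are all ints."""
    return sum(values)


def not_written_yet():
    """Placeholder body: `...` is a valid expression, so this compiles."""
    ...


def lookup(key, default=...):
    """Sentinel use: `...` marks 'no default given', so None stays a legal value."""
    table = {"a": None}
    if key in table:
        return table[key]
    if default is ...:
        raise KeyError(key)
    return default
```

Using `...` as the sentinel lets callers pass `default=None` on purpose, which a plain `default=None` signature cannot distinguish from "nothing passed".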


9 Sep 2021


#248 while True: stand up, sit down

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: Paul Everitt Brian #1: Why I use attrs instead of pydantic Tin Tvrtković, @tintvrtkovic attrs vs dataclasses: since dataclasses are a strict subset of attrs functionality, he recommends using attrs in most cases over dataclasses. attrs is faster, has more features, releases more frequently, and is offered over a wider range of Python versions. attrs vs Pydantic: attrs is a library for generating the boring parts of writing classes; Pydantic is that but also a complex validation library and a structuring/unstructuring library, e.g. converting to json and back. attrs has opt-in validation that you have more control over. cattrs can be used for structuring/unstructuring. Converters are opt-in for attrs, built into Pydantic, and can be wrong: example using Pendulum that Pydantic mishandles. Summary: attrs + cattrs + validators where necessary, converters where necessary will be faster and you’ll have more control. Kind of a “small, sharp, specialized tools” vs “swiss army knife” comparison. Michael #2: mcfly via __dann__ McFly is an incredible Ctrl+r replacement McFly replaces your default ctrl-r shell history search with an intelligent search engine that takes into account your working directory and the context of recently executed commands. McFly's suggestions are prioritized in real time with a small neural network. Features Rebinds ctrl-r to bring up a full-screen reverse history search prioritized with a small neural network. Augments your shell history to track command exit status, timestamp, and execution directory in a SQLite database. Maintains your normal shell history file as well so that you can stop using McFly whenever you want. Includes a simple action to scrub any history item from the McFly database and your shell history files. Designed to be extensible for other shells in the future. Written in Rust, so it's fast and safe.
Paul #3: Textual and boilerplate removal In the race to make Textual the most talked-about package in Python Bytes history… I’d like to zoom in on a Twitter discussion Will had about removing boilerplate. I have traditionally been opposed to the convention-over-configuration approach that most successful Python projects have taken. I dislike magic variable and file names; I prefer “explicit is better than implicit” and actual symbols. Lately, because of…tooling. But Will’s approach to “boilerplate removal” is compelling, as it remains mypy friendly. Still, I find it flawed…code is meant to be read 2 years from now…the stuff that is implied away worries me. Will is great at working in the open, being a gentle, encouraging public figure.
Brian #4: xdoctest “The xdoctest package is a re-write of Python's builtin doctest module. It replaces the old regex-based parser with a new abstract-syntax-tree based parser (using Python's ast module). The goal is to make doctests easier to write, simpler to configure, and encourage the pattern of test driven development.” “The main enhancements xdoctest offers over doctest are: All lines in the doctest can now be prefixed with >>>. Old-style doctests with ... are still valid. Additionally, the multi-line strings don't require any prefix (but its ok if they do have either prefix). Tests are executed in blocks, rather than line-by-line, thus comment-based directives (e.g. # doctest: +SKIP) are now applied to an entire block, rather than just a single line. Tests without a "want" statement will ignore any stdout / final evaluated value. This makes it easy to use simple assert statements to perform checks in code that might write to stdout. If your test has a "want" statement and ends with both a value and stdout, both are checked, and the test will pass if either matches. Output from multiple sequential print statements can now be checked by a single "got" statement. 
(new in 0.4.0).” Features I love: “The new got/want tester is very permissive by default; it ignores differences in whitespace” You can make doctest normalize whitespace, but why should you have to?
Michael #5: Automate the standing desk with python via Joe Riedley, by David Kong “When I first started using it, I was very excited, but I quickly found myself sitting all day, in spite of the fancy desk.” I took off a few screws and … voila! A row of pins neatly exposed right in front. The pins in my control box, when connected correctly, simulate the pressing of the buttons on the front of the box. Raspberry Pi Zero, the simplest, most basic version. It doesn’t have all the bells and whistles, but it does everything I needed for this simple project, and it’s just $5(!). And the code:
from gpiozero import LED  # The LED library allows easy pin control
from time import sleep
import random

relay = LED(17)  # I connected the relay to pin 17 and ground

while True:
    # Close the relay for one second to simulate a button press
    relay.on()
    sleep(1)
    relay.off()
    # Wait a random 45 to 60 minutes before the next press
    sleep(random.randint(45, 60) * 60)
Paul #6: Hypermodern Python Cookiecutter I’ve been noodling with some code the last two years about bringing frontend DX to Python web dev. Learning and talking more than adoption. Running a modern Python project is a LOT of housekeeping. Hypermodern Python Cookiecutter from Claudio Jolowicz teleported me to the state of the art I was looking for: Poetry, Nox, GHA, pre-commit, flake8, PyPI uploads from CI, release drafter, Black, prettier, pytest, mypy, Sphinx and friends, GitHub labeler. It’s NOT AT ALL just a cookiecutter. The best part…it’s an enormously detailed user guide, with some blog posts on the “why”, and it’s actively maintained. The PR workflow is really well explained and wired up. This could be…a course, a webinar. Thanks Claudio.
Extras Michael: ActiveState's 2021 Software Supply Chain Security Survey Python 3.9.7 and 3.8.12 are now available From Shlomi Lanton, on your #2 Brian talked about having a history of all files to find the ones that were updated 
last, so I created granpa Also: wakepy now works correctly on macOS Joke: Meaning
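To make the xdoctest item from this episode (Brian #4) more concrete, here is a minimal sketch of a docstring written in the style the quoted README describes: every line, including continuation lines, carries the >>> prefix, and the final assert has no "want" line. The function name scale and its behaviour are made up for illustration. The stdlib doctest module would reject the multi-line list below because it treats each >>> line as a separate example, so this docstring is only meant to be collected by xdoctest.

```python
def scale(values, factor=2):
    """Multiply every item in ``values`` by ``factor``.

    The function itself is hypothetical; only the docstring style matters.

    Example:
        >>> # xdoctest style: continuation lines also use the >>> prefix
        >>> data = [
        >>>     1,
        >>>     2,
        >>> ]
        >>> print(scale(data))
        [2, 4]
        >>> # No "want" line follows, so nothing is compared for this check
        >>> assert scale([], factor=10) == []
    """
    return [value * factor for value in values]
```

Assuming xdoctest is installed, the README describes running it against a module with something like python -m xdoctest your_module, where your_module is a placeholder.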


2 Sep 2021

Rank #8

Podcast cover

#247 Do you dare to press "."?

Watch the live stream: Watch on YouTube About the show Special guest: Dan Taylor
Michael #1: Keep your computer awake during long processing For now, use Michael’s fork when on macOS, until this PR is merged. Do you have work that will take a long time? Keeping your OS working away is just a context block:
from wakepy import keepawake

with keepawake(keep_screen_awake=False):
    ...  # do stuff that takes long time
Brian #2: How to write a great Stack Overflow question via Kevin Markham The punchline (but it’s not enough): Write a brief introduction. Provide a self-contained code example. Detail the expected results and why I expect those results. Add any important notes. Link to any relevant questions. Write a title that summarizes the question. Kevin starts with a question about pandas dataframes and filling in missing values. The question is really application specific. The rewrite of the question is awesome: Simplifies the problem into a toy example, literally, and out of the domain specific context. Includes example code that can be copied, pasted, and run that sets up the problem. Uses short and simple variable names. Talks about expected results, and why he expects those results. Includes a dataset in the sample code that covers the cases the solution needs to handle. Includes non-obvious requirements or non-requirements. Links to related questions and why they don’t solve your problem. I don’t think I’ve ever seen this, but I think it’d be cool to add test code that will pass when the problem is solved. But that might make the question unnecessarily long.
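As a rough sketch of Brian's idea of shipping test code with a question, the snippet below uses a plain Python list instead of a pandas dataframe so it runs with nothing installed. The names toy, expected, check, and fill_forward are all hypothetical, and the fill-forward rule is only an assumed stand-in for the missing-value problem in Kevin's post. In a real question you would post the dataset and the check, and an answerer would supply the function.

```python
# Toy dataset covering the cases a solution must handle:
# a leading gap, a single gap, and two consecutive gaps.
toy = [None, 1, None, 2, None, None]

# Expected result: each gap takes the most recent earlier value,
# and a leading gap stays empty because nothing precedes it.
expected = [None, 1, 1, 2, 2, 2]


def check(solution):
    """Pass only when ``solution`` produces the expected result."""
    assert solution(toy) == expected


def fill_forward(values):
    """One candidate answer: carry the last seen value into each gap."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


check(fill_forward)
```

Keeping the check to a single assert on a six-item list addresses the length concern: it adds only a few lines to the question.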
Dan #3: Github.dev - press ‘.’ to edit code in any GitHub repo Fun bonus feature released at the same time as GitHub Codespaces. Runs VS Code entirely in your browser - a supercharged “edit button”. Nothing to install. There’s no server to pay for, though functionality is limited. The file system is your browser’s local storage and the GitHub repo. You can add files and commit changes directly to your repo. You can install extensions that support running in “VS Code Web”. Added basic web support to the Python Extension just yesterday: syntax checking, auto-complete, go-to-definition. Uses type hints for packages (no Python interpreter in the browser). You can also install vscode-pyodide to run Python code using Jupyter+Pyodide. Overall, this means you can do more powerful code editing quickly on GitHub.com; I’m looking forward to seeing how this evolves.
Michael #4: Log analyzer (minus google analytics) GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser. Features: Fast, real-time, millisecond/second updates, written in C. Only ncurses as a dependency. Nearly all web log formats (Apache, Nginx, Amazon S3, Elastic Load Balancing, CloudFront, Caddy, etc). Simply set the log format and run it against your log. Beautiful terminal and bootstrap dashboards (tailor GoAccess to suit your own color taste/schemes).
Brian #5: KMK: Clackety Keyboards Powered by Python recommended by Blaise “firmware for computer keyboards written and configured in CircuitPython.” Cool list of features: Fully configured through a single, easy to understand Python file. 
Single-piece or two-piece split keyboards are supported. Chainable keys such as KC.LWIN(KC.L) to lock the screen on a Windows PC. Built-in unicode macros, including emojis. RGB underglow and LED backlights. One key can turn into many more based on how many times you tap it. One writeup I found of someone using it for a 10-key, “KMK: run Python on your keyboard”, includes a video. Seems like limited hardware so far, and although the coding might not be too difficult, you still gotta swap out the circuit board. I’m bringing this topic up because I’m hoping some keyboard kit people will put together something that just starts with the ability to run CircuitPython so I can just skip to the coding part.
Dan #6: SQLModel - use the same models for SQL and FastAPI via Sebastián Ramírez (creator of SQLModel and FastAPI) Write a schema once and use it everywhere; this removes a lot of repetitive code. Traditionally you have to manage several layers of code to pass your data from database queries, to the backend code, expose it to your API, and consume it from the client. Code-first ORMs (SQLAlchemy, Django ORM) make it easy to write code that generates SQL. FastAPI makes it easy to expose objects to your API using Pydantic models. Before, you would need to create both models and convert from ORM to Pydantic using .from_orm. SQLModel unifies those: a SQLModel is both a SQLAlchemy model and a Pydantic model. You can use SQLModel to interact with the database (via wrapping SQLAlchemy). You can use that same model as a Pydantic model in FastAPI requests and responses. FastAPI also uses the Pydantic models to generate an openapi.json, meaning you could generate a client library in any language using OpenAPI Generator. Some other cool things: Designed using type annotations so that editors like VS Code and PyCharm give great auto-complete out of the box; uses the proposed dataclass_transform spec for dynamic typing. Supports async database sessions and alembic migrations because it’s based on SQLAlchemy (not yet 
documented) Should be possible to integrate with postgis, ts_vectors
Extras
Brian: pip install ./local_directory is pretty interesting. Test & Code 163: The way pip installs from a local directory is about to change. Stéphane Bidoul joins the show to talk about it.
Dan: type4py - using ML to add type annotations to your codebase. Retrofitting codebases with types is a pain; static type checkers can only infer so much. The type4py research paper outlines a state of the art ML model for inferring types, adopting some techniques used in computer vision. Open sourced training code, data set, VS Code extension, and inferencing server. If you have a need to add type annotations to a large code base, worth giving this a try! WARNING: the VS Code extension sends code tokens to their API on type4py.com (they do have a privacy policy). If this is a concern, be sure to host the inferencing server yourself!
Joke: Continuous Deployment Also: “If a programmer gets an interview because of a recommendation from a friend, are they being passed by reference?” From @CarlaNotarobot, via @bluefiddleguy


26 Aug 2021

Rank #9

Podcast cover

#246 Love your crashes, use Rich to beautify tracebacks

Watch the live stream: Watch on YouTube About the show Sponsored by us: Check out the courses over at Talk Python And Brian’s book too! Special guest: David Smit
Brian #1: mktestdocs Vincent D. Warmerdam Tutorial with videos Utilities to check for valid Python code within markdown files and markdown formatted docstrings. Example:
import pathlib
import pytest
from mktestdocs import check_md_file

@pytest.mark.parametrize('fpath', pathlib.Path("docs").glob("**/*.md"), ids=str)
def test_files_good(fpath):
    check_md_file(fpath=fpath)
This will take any codeblock that starts with ```python and run it, checking for any errors that might happen. Putting assert statements in the code block will actually check things. Other examples in README.md for markdown formatted docstrings from functions and classes. Suggested usage is for code in mkdocs documentation. I’m planning on trying it with blog posts.
Michael #2: Redis powered queues (QR3) via Scot Hacker QR queues store serialized Python objects (using cPickle by default), but that can be changed by setting the serializer on a per-queue basis. There are a few constraints on what can be pickled, and thus put into queues. Create a queue:
bqueue = Queue('brand_new_queue_name', host='localhost', port=9000)
Add items to the queue:
>>> bqueue.push('Pete')
>>> bqueue.push('John')
>>> bqueue.push('Paul')
>>> bqueue.push('George')
Getting items out:
>>> bqueue.pop()
'Pete'
Also supports deque, or double-ended queue, capped collections/queues, and priority queues.
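To show roughly what "take any Python codeblock in a markdown file and run it" means for the mktestdocs item above (Brian #1), here is a stdlib-only sketch. It is not how mktestdocs is implemented, and the helper names python_blocks and check_markdown are invented for this illustration. The fence marker is built programmatically only so that the example does not contain a literal fence of its own.

```python
import re

FENCE = "`" * 3  # the three-backtick markdown fence marker


def python_blocks(markdown_text):
    """Return the source of every fenced block tagged as python."""
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    return pattern.findall(markdown_text)


def check_markdown(markdown_text):
    """Run each python block; any exception (including a failed assert) propagates."""
    for block in python_blocks(markdown_text):
        exec(block, {})  # fresh namespace per block, so blocks stay independent


# A tiny markdown document with one python block that contains an assert
doc = "\n".join([
    "# Demo",
    FENCE + "python",
    "total = sum([1, 2, 3])",
    "assert total == 6",
    FENCE,
])

check_markdown(doc)  # completes silently because the block runs without error
```

Running every block in a fresh namespace is a choice made for this sketch: it keeps each example self-contained, at the cost of not letting later blocks reuse names defined earlier in the same document.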
David #3: 25 Pandas Functions You Didn’t Know Existed Bex T So often, I come across a pandas method or function that makes me go “AH!” because it saves me so much time and simplifies my code. Example: Transform. I don’t normally like these articles, but this one had several “AH” moments, including: Styler options, convert_dtypes, mask, nsmallest, nlargest, clip, at_time.
Brian #4: FastAPI and Rich Tracebacks in Development Hayden Kotelman Rich has, among other cool features, beautiful tracebacks and logging. FastAPI makes it easy to create web APIs. This post shows how to integrate the two for APIs that are easy to debug. It’s really only a few simple steps: Create a dataclass for the logger config. Create a function that will either install rich as the handler (while not in production) or use the production log configuration. Call logging.basicConfig() with the new settings. And possibly override the logger for Uvicorn. The article contains all the code necessary, including examples of the resulting logging and tracebacks.
Michael #5: Dev in Residence I am the new CPython Developer in Residence Report on first week Łukasz Langa: “When the PSF first announced the Developer in Residence position, I was immediately incredibly hopeful for Python. I think it’s a role with transformational potential for the project. In short, I believe the mission of the Developer in Residence (DIR) is to accelerate the developer experience of everybody else.” The DIR can help by: providing a steady review stream which helps dealing with PR backlog; triaging issues on the tracker dealing with issue backlog; being present in official communication channels to unblock people with questions; keeping CI and the test suite in usable state which further helps contributors focus on their changes at hand; keeping tabs on where the most work is needed and what parts of the project are most important.
David #6: Dagster Dagster is a data orchestrator for machine learning, analytics, and ETL. Great for local development that can be deployed on Kubernetes, etc. Dagit provides a rich UI to monitor the execution, view detailed logs, etc. Can deploy to Airflow, Dask, etc. Quick demo? References https://www.dataengineeringpodcast.com/dagster-data-applications-episode-104/ https://softwareengineeringdaily.com/2019/11/15/dagster-with-nick-schrock/
Extras Michael: Get a vaccine, please. Python 3.10 type info... er, make that 3.9, thanks John Hagen. Here is a quick example. All of these are functionally equivalent to PyCharm/mypy:
# Python 3.5-3.8+
from typing import List, Optional
def fun(l: Optional[List[str]]) -> None: ...

# Python 3.9+
from typing import Optional
def fun(l: Optional[list[str]]) -> None: ...

# Python 3.10+
def fun(l: list[str] | None) -> None: ...
Note how with 3.10 we no longer need any imports to represent this type. David: Great SQL resource Joke: Pray


11 Aug 2021

Rank #10