Python Bytes
Michael Kennedy and Brian Okken
Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.
#209 JITing Python with .NET, no irons in sight
Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Michael #1: Running Python on .NET 5 by Anthony Shaw Talked about Pyjion way back when on episode 49 with Brett Cannon. .NET 5 was released on November 10, 2020. It is the cross-platform and open-source replacement of the .NET Core project and the .NET project that ran exclusively on Windows since the late 90’s. See the conference about it if you want to go deeper. Performance: I just saw a SO post about someone complaining their Python was 31x slower than C#. The most common way around this performance barrier is to compile Python extensions from C or to use something like Cython. The .NET 5 CLR comes bundled with a performant JIT compiler (codenamed RyuJIT) that will compile .NET’s IL into native machine instructions on Intel x86, x86-64, and ARM CPU architectures. Pyjion is a project to replace the core execution loop of CPython by transpiling CPython bytecode to ECMA CIL and then using the .NET 5 CLR to compile that into machine code. It then executes the machine-code compiled JIT frames at runtime instead of using the native execution loop of CPython. A few releases of Python ago (CPython specifically, the most commonly used version of Python), in 3.6, a new API was added to be able to swap out “frame execution” with a replacement implementation. This is otherwise known as PEP 523. This extension uses the same standard library as Python 3.9. Will this be compatible with my existing Python code? What about C Extensions? The short answer is: if your existing Python code runs on CPython 3.9, yes, it will be compatible. Tested against the full CPython “test suite” on all platforms. In fact, it was the first JIT ever to pass the test suite. Is this faster? The short answer: a little, but not by much (yet). 
see also: https://twitter.com/anthonypjshaw/status/1328457723608928256?s=20 Brian #2: PEP 621 -- Storing project metadata in pyproject.toml Progress on standardizing what goes into pyproject.toml Authors Brett Cannon, Paul Ganssle, Pradyun Gedam, Sébastien Eustace (of poetry), Thomas Kluyver (of flit), Tzu-Ping Chung Motivators of this PEP are: Encourage users to specify core metadata statically for speed, ease of specification, unambiguity, and deterministic consumption by build back-ends Provide a tool-agnostic way of specifying metadata for ease of learning and transitioning between build back-ends Allow for more code sharing between build back-ends for the "boring parts" of a project's metadata Doesn’t change any existing core metadata Doesn’t attempt to standardize all possible metadata Included in table named [project]: name version description readme requires-python license authors/maintainers keywords classifiers urls entry points dependencies/optional-dependencies dynamic There’s an example in the PEP that helps clear things up Many items have synonyms specified for flit/poetry/setuptools (presumably for backward compatibility) Michael #3: GitHub revamps copyright takedown policy after restoring YouTube-dl In October following a DMCA complaint from the Recording Industry Association of America (RIAA) it was taken down at GitHub. Citing a letter from the Electronic Frontier Foundation (the EFF), GitHub says it ultimately found that the RIAA’s complaint didn’t have any merit. The RIAA argued the tool ran afoul of section 1201 of the US copyright law by giving people the means to circumvent YouTube’s DRM. the EFF dissects the RIAA’s claims, highlighting where the organization had either misinterpreted the law or how the code of YouTube-dl works. 
“Importantly, YouTube-dl does not decrypt video streams that are encrypted with commercial DRM technologies, such as Widevine, that are used by subscription video sites, such as Netflix,” the organization points out when it comes to the RIAA’s primary claim. GitHub is implementing new policies to avoid a repeat of the situation moving forward. First, it says a team of both technical and legal experts will manually evaluate every single section 1201 claim. If the company’s technical and legal teams ultimately find any issues with a project, GitHub will give its owners the chance to address those problems before it takes down their work. GitHub is establishing a $1 million legal defense fund for developers. Sidebar: EFF has just launched How to Fix the Internet, a new podcast mini-series that examines potential solutions to six ills facing the modern digital landscape. Brian #4: Install & Configure MongoDB on the Raspberry Pi Mark Smith Definitely a “wow, I didn’t know you could do that” article. Tutorial walks through: Install 64-bit Ubuntu Server on a Raspberry Pi Configure wifi Install MongoDB on the Pi Set up a user account to safely expose MongoDB on a home network. Now you’ve got a MongoDB server in your house. So cool Michael #5: Extra! extra! extra!, hear all about it! Follow up on my critique of things like SQL & CSS put next to Python and Java. Maybe best to grab the conversation from here. Guido joins Microsoft, why? People seem to see this as a positive for sure. But they checked him out! New code editor roaming the streets: Nova from Panic. Two thumbs up on Big Sur and now waiting on the Mac Mini M1. 
Brian #6: A Python driven AI Stylist Inspired by Social Media Dale Markowitz A bunch of Google tools (cloud storage, firebase, cloud vision api, product search api) Some React for the front end Python for batch scripts General oversimplified process: photos from social media for inspiration photos of everything in your closet, multiple of each item use AI to suggest outfits from your closet that match the inspiration photos Ok. The process is really more of a promo for Google AI products, and not so much about Python, but it’s a cool “look what you can do with software” kinda thing. Also, many of the tools used by online retail, like “similar products” and such, are available to lots of people now, and that’s cool. Joke: Back to the [dev] future!
Nov 27
33 min
#208 Dependencies out of control? Just pip chill.
Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Brian #1: pip-chill - Make requirements with only the packages you need Ricardo Bánffy Like pip freeze but lists only the packages that are not dependencies of installed packages. Will be great for creating requirements.txt files that look like the ones you would write by hand. I wish it had an option to not list itself, but pip-chill | grep -v pip-chill works. What do I have installed? (foo) $ pip freeze appdirs==1.4.4 black==20.8b1 click==7.1.2 mypy-extensions==0.4.3 ... No really, what did I myself install? (foo) $ pip-chill black==20.8b1 pip-chill==1.0.0 Without versions? (foo) $ pip-chill --no-version black pip-chill What did those things install as dependencies? (foo) $ pip-chill -v --no-version black pip-chill # appdirs # Installed as dependency for black # click # Installed as dependency for black ... Michael #2: Windows update broke NumPy Sent in by Daniel Mulkey A recent Windows update broke some behavior that I think OpenBLAS (used by NumPy) relied on. There's a Developer Community thread here. I am a NumPy developer. We have been trying to track down a strange issue where after updating to windows 10 2004, suddenly code that worked no longer works. Here is the NumPy issue and here is the corresponding issue in OpenBLAS. The problem can be summarized: when calling fmod, something is changed so that much later calling an OpenBLAS assembly routine fails. The only difference I can see in the registers that visual studio exposes is that after the call to fmod, register ST(0) is set to NAN. Steve Dower and other Microsoft people have commented. The fix is slated to take until January 2021 to be released, though there are workarounds for some scenarios. Matt P. posted a workaround: For all those at home following along and looking for a quick fix, NumPy has released a bugfix 1.19.3 to work around this issue. 
The bugfix broke something else on Linux, so we had to revert the fix in release 1.19.4, but you can still install 1.19.3 via pip install numpy==1.19.3. Note that this only works around the way this bug crashes NumPy (technically, in OpenBLAS, which is shipped with NumPy) and may not fix all your problems related to this bug; Microsoft’s help is needed to do that. Brian #3: Build Plugins with Pluggy kracekumar Blog post related to talks given at PyGotham and PyCon India Pluggy is the plugin library used by pytest Article starts with a CLI application that has one output format. Need is for more formats, implemented as plugins. Quick look at pluggy architecture of host/caller/core system and plugin/hook. Also plugin manager, hook specs, and hook implementations. Walks through the changes to the application needed to support plugins. I’ve been waiting for an article on pluggy, and this is nice. But I admit I’m still a little lost. I guess I need to watch one of the presentations and try to build something with pluggy. Michael #4: LINQ in Python via Adam: I seem to recall that Michael had a C# background, so this might be of interest: Bringing LINQ-like expressions to Python with linqit Example:
last_hot_pizza_slice = programmers.where(lambda e: e.experience > 15).except_for(elon_musk).of_type(Avi).take(3).select(lambda avi: avi.lunch).where(lambda p: p.is_hot() and p.origin != 'Pizza Hut').last().slices.last() Also interesting: asq, https://github.com/sixty-north/asq Brian #5: Klio : a framework for processing audio files or any binary files, at large scale Recently open sourced by Spotify An article about it Klio is based on Apache Beam and allows integration with cloud processing engines open graph of job dependencies batch and streaming pipelines goals: large-file input/output scalability, reproducibility, efficiency closer collaboration between researchers and engineers uses Python Obviously useful for Spotify, but they are hoping it will help with other audio research and applications. Michael #6: Collapsing code cells in Jupyter Notebooks via Marco Gorelli You mentioned in that episode that you'd like to have a way of collapsing code cells in Jupyter Notebooks so you can export them as reports - incidentally, I wrote a little blog post about how to do that - in case it's useful/of interest to you, here it is! Basically get a static HTML file that is the static notebook output but can start with the code cells collapsed and can toggle their visibility. Extras Michael: New Apple Silicon macs? Bot tweets: twitter.com/MichelARenard/status/1324269474544029696 Joke: By Richard Cairns Q: Why did the data scientist get in trouble with Animal Welfare? A: She was caught trying to import pandas. “10e engineeeeeeeeers are the future.” - detahq
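The LINQ-style chaining in Michael's linqit item above can be sketched in plain Python. This is a minimal illustration of the idea only, assuming a hypothetical `Query` wrapper and `Programmer` record invented here; it is not linqit's or asq's actual API or implementation.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Query(Generic[T]):
    """Hypothetical chainable wrapper around a list (not linqit's API)."""

    def __init__(self, items: Iterable[T]) -> None:
        self._items: List[T] = list(items)

    def where(self, predicate: Callable[[T], bool]) -> "Query[T]":
        # Keep only the items for which the predicate is true.
        return Query(x for x in self._items if predicate(x))

    def select(self, projection: Callable[[T], U]) -> "Query[U]":
        # Map every item through the projection function.
        return Query(projection(x) for x in self._items)

    def take(self, count: int) -> "Query[T]":
        # Keep at most the first `count` items.
        return Query(self._items[:count])

    def to_list(self) -> List[T]:
        return list(self._items)


@dataclass
class Programmer:  # hypothetical record type for the demo
    name: str
    experience: int


programmers = Query([
    Programmer("Ada", 20),
    Programmer("Bob", 3),
    Programmer("Cy", 17),
    Programmer("Di", 30),
])

# Each call returns a new Query, so the calls chain left to right.
veteran_names = (
    programmers.where(lambda p: p.experience > 15)
    .take(2)
    .select(lambda p: p.name)
    .to_list()
)
print(veteran_names)
```

Returning a fresh `Query` from every method is what makes the chain read like LINQ; the original data is never modified.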
Nov 19
30 min
#207 FastAPI as a web platform (not just APIs)
Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Michael #1: fastapi-chameleon (and fastapi-jinja) Chameleon via Michael, Jinja via Marc Brooks Convert a FastAPI API app to a proper web app Then just decorate the FastAPI view methods (works on sync and async methods): @router.post('/') @fastapi_chameleon.template('home/index.pt') async def home_post(request: Request): form = await request.form() vm = PersonViewModel(**form) return vm.dict() # {'first':'Michael', 'last':'Kennedy', ...} The view method should return a dict to be passed as variables/values to the template. If a fastapi.Response is returned, the template is skipped and the response along with status_code and other values is directly passed through. This is common for redirects and error responses not meant for this page template. Brian #2: Django REST API in a single file, without using DRF Adam Johnson He’s been on Test & Code a couple times, 128 & 135 Not sure if you should do this, but it is possible. Example Django app that is a REST API that gives you information about characters from Rick & Morty. Specifically, just Rick and Morty. / - redirects to /characters/ /characters/ - returns a JSON list /characters - redirects to /characters/ /characters/1 - returns JSON info about Rick /characters/2 - same, but for Morty Shows off how with Django off the shelf, can do redirects and JSON output. Shows data using dataclasses. Hardcoded here, but easy to see how you could get this data from a database or other part of your system. Michael #3: 2020 StackOverflow survey results Most Popular Technologies Languages: JavaScript (68%), Python (44%), Java(40%) Web frameworks: Just broken, jQuery? Seriously!?! 
Databases: MySQL (56%), PostgreSQL (36%), Microsoft SQL Server (33%), MongoDB (26%) Platforms: Windows (46%), macOS (28%), Linux(27%) Most loved languages: Rust, TypeScript, Python Most wanted languages: Python, JavaScript, Go Most dreaded language: VBA & ObjectiveC Most loved DBs: Redis (67%), PostgreSQL (64%), Elasticsearch (59%), MongoDB (56%) Most wanted DBs: MongoDB (19%), PostgreSQL (16%) Most dreaded DB: DB2 Brian #4: A Visual Guide to Regular Expression Amit Chaudhary Gentle introduction to regex by building up correct mental models using visual highlighting. Goes through different patterns: specific character white space (any whitespace \s, tab \t, newline \n) single-digit number \d word characters \w : lowercase, uppercase, digits, underscore this sometimes throws me, since w seems like it might somehow be related to whitespace. It’s not. dot . : anything except newline pattern negations: \d is digits, \D is anything that is not a digit \s whitespace, \S not whitespace \w word characters, \W everything else character sets with square brackets [], and optionally dash - for range anchors ^ beginning of line $ end of line escaping patterns with \ repetition with {}, *, +, ? Using Python re module findall match and match.group search Michael #5: Taking credit by Tim Nolet Oh @awscloud I really do love you! But next time you fork my OS project https://github.com/checkly/headless-recorder and present it as your new service, give the maintainers a short "nice job, kids" or something. Not necessary as per the APLv2 license, but still, ya know? Amazon CloudWatch Synthetics launches Recorder to generate user flow scripts for canaries A Chrome browser extension, to help you create canaries more easily. 
Brian #6: Raspberry Pi 400 “complete personal computer, built into a compact keyboard” by itself, or as a kit with mouse and power adapter and cables and such, for $100 4 core, 64-bit processor, 4 GB RAM, wifi & LAN, can drive 2 displays, 4K video 40-pin GPIO header, so you can still play with hardware and such. There’s an Adafruit video with Limor Fried where she describes this as something as close as we get today to an Apple IIe from my youth. For me, the IIe was at school; at home I had a TRS-80 plugged into an old TV, using my sister’s tape deck for disk storage. This seems great for education use, but also as a second computer in your house, or a kid’s computer. Comes with a Beginner’s Guide that includes getting started with Python Extras: Brian: vim-adventures.com - with a dash. Practice vim key bindings while playing an adventure game. Super cool. Michael: TIOBE Index for November 2020 via Tyler Pedersen Joke: You built it, you run it.
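The patterns from Brian's regex item above (\d, \w, \S, anchors, and the re module's findall, match, and search) can be tried out in a few lines. This is a small sketch; the sample string is made up for illustration.

```python
import re

# Made-up sample text containing digits, an underscore, and a tab.
text = "Order 66 shipped on 2020-11-13 to user_42\tat dock 7"

# \d+ : runs of digits
digits = re.findall(r"\d+", text)

# \w+ : runs of word characters (letters, digits, underscore)
words = re.findall(r"\w+", text)

# \S+ : runs of anything that is NOT whitespace (negation of \s)
chunks = re.findall(r"\S+", text)

# search scans the whole string; groups capture the parts in parentheses
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)

# match only succeeds at the start of the string
starts_with_digits = re.match(r"\d+", text)
first_word = re.match(r"[A-Z][a-z]+", text)

# $ anchors the pattern to the end of the string
last_digit = re.search(r"\d$", text)

print(digits)
print(date.group(0), date.group(1))
print(first_word.group())
```

Note that `user_42` survives as one item in `words` because the underscore counts as a word character, while the tab splits `user_42` from `at` because it is whitespace.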
Nov 13
33 min
#206 Python dropping old operating systems is normal!
Sponsored by Techmeme Ride Home podcast: pythonbytes.fm/ride Special guest: Steve Dower - @zooba Brian #1: Making Enums (as always, arguably) more Pythonic “I hate enums” Harry Percival Hilarious look at why enums are frustrating in Python and a semi-reasonable workaround to make them usable. Problems with enums of strings: Can’t directly compare enum elements with the values Having to use .value is dumb. Can’t do random choice of enum values Can’t convert directly to a list of values If you use IntEnum instead of Enum and use integer values instead of strings, it kinda works better. Making your own StringEnum also is better, but still doesn’t allow comparison. Solution: class BRAIN(str, Enum): SMALL = 'small' MEDIUM = 'medium' GALAXY = 'galaxy' def __str__(self) -> str: return str.__str__(self) Derive from both str and Enum, and add a __str__(self) method. Fixes everything except random.choice(). Michael #2: Python 3.10 will be up to 10% faster 4.5 years in the making, from Yury Selivanov, with the work picked up by Pablo Galindo, Python core developer, Python 3.10/3.11 release manager LOAD_METHOD, CALL_METHOD, and LOAD_GLOBAL improved “Lot of conversations with Victor about his PEP 509, and he sent me a link to his amazing compilation of notes about CPython performance. One optimization that he pointed out to me was LOAD/CALL_METHOD opcodes, an idea first originated in PyPy.” There is a patch that implements this optimization Based on: LOAD_ATTR stores in its cache a pointer to the type of the object it works with, its tp_version_tag, and a hint for PyDict_GetItemHint. When we have a cache hit, LOAD_ATTR becomes super fast, since it only needs to look up key/value in type's dict by a known offset (the real code is a bit more complex, to handle all edge cases of descriptor protocol etc). 
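Going back to Brian's enum item above, the str-plus-Enum workaround can be checked directly. A minimal sketch: the class body is the one from the notes, while the surrounding checks and the `list(...)` workaround for `random.choice` are additions made here for illustration.

```python
import random
from enum import Enum


class BRAIN(str, Enum):
    SMALL = "small"
    MEDIUM = "medium"
    GALAXY = "galaxy"

    def __str__(self) -> str:
        # Use the plain string form instead of "BRAIN.SMALL".
        return str.__str__(self)


# Members now compare equal to their values, no .value needed.
is_equal = BRAIN.SMALL == "small"

# str() gives the bare value.
as_text = str(BRAIN.GALAXY)

# Converting to a list of values is a simple comprehension.
values = [str(member) for member in BRAIN]

# random.choice needs a real sequence, so wrap the enum in list() first
# (an assumption-free workaround, but not part of the original recipe).
pick = random.choice(list(BRAIN))

print(is_equal, as_text, values, pick)
```

The `list(BRAIN)` step is needed because indexing an enum class looks members up by name, not by position, which is why `random.choice(BRAIN)` on its own does not work.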
Steve #3: Python 3.9 and no more Windows 7 PEP 11 -- Removing support for little used platforms | Python.org Windows 7 - Microsoft Lifecycle | Microsoft Docs Default x64 download Brian #4: Writing Robust Bash Shell Scripts David Pashley Some great tips that I learned, and I’ve been writing bash scripts for decades. set -u : exits your script if you use an uninitialized variable set -e : exit the script if any statement returns a non-true return value. Prevents errors from snowballing. Expect the unexpected, like missing files, missing directories, etc. Be prepared for spaces in filenames. if [ "$filename" = "foo" ]; Using trap to handle interrupts, exits, terminal kills, to leave the system in a good state. Be careful of race conditions Be atomic Michael #5: Ideas for 5x faster CPython Twitter post by Anthony Shaw calling attention to roadmap by Mark Shannon Implementation plan for speeding up CPython: The overall aim is to speed up CPython by a factor of (approximately) five. We aim to do this in four distinct stages, each stage increasing the speed of CPython by (approximately) 50%: 1.5**4 ≈ 5 Each stage will be targeted at a separate release of CPython. Stage 1 -- Python 3.10: The key improvement for 3.10 will be an adaptive, specializing interpreter. The interpreter will adapt to types and values during execution, exploiting type stability in the program, without needing runtime code generation. Stage 2 -- Python 3.11: Improved performance for integers of less than one machine word. Faster calls and returns, through better handling of frames. Better object memory layout and reduced memory management overhead. Stage 3 -- Python 3.12 (requires runtime code generation): Simple "JIT" compiler for small regions. Stage 4 -- Python 3.13 (requires runtime code generation): Extend regions for compilation. Enhance compiler to generate superior machine code. Wild conversation over here. 
One excerpt, from Larry Hastings: Speaking as the Gilectomy guy: borrowed references are evil. The definition of the valid lifetime of a borrowed reference doesn't exist, because they are a hack (baked into the API!) that we mostly "get away with" just because of the GIL. If I still had wishes left on my monkey's paw I'd wish them away (1). (1) Unfortunately, I used my last wish back in February, wishing I could spend more time at home. Steve #6: CPython core developer sprints Hosted by pythondiscord.com https://youtu.be/gXMdfBTcOfQ - Core dev Q&A Extras Brian: Tools I found recently that are kinda awesome in their own way - Brian mcbroken.com - Is the ice cream machine near you working? Just a funny single-purpose website vim-adventures.com - with a dash. Practice vim key bindings while playing an adventure game. Super cool. Joke: Hacktoberfest 2020 t-shirt https://twitter.com/McCroden/status/1319646107790704640 5 Most Difficult Programming Languages in the World (Not really long enough for a full topic, but funny. I think I’ll cut short the last code example after we record) suggested by Troy Caudill Author: Lokajit Tikayatray Malbolge, Intercal, brainf*, cow, and whitespace whitespace is my favorite: “Entire language depends on space, tab, and linefeed for writing any program. The Whitespace interpreter ignores Non-Whitespace characters and considers them as code comments.” Intercal is kinda great in that One thing I love about this article is that the author actually writes a “Hello World!” for each language. 
Examples of “Hello World!” malboge (=<`#9]~6ZY32Vx/4Rs+0No-&Jk)"Fh}|Bcy?`=*z]Kw%oG4UUS0/@-ejc(:'8dc intercal DO ,1 <- #13 PLEASE DO ,1 SUB #1 <- #238 DO ,1 SUB #2 <- #108 DO ,1 SUB #3 <- #112 DO ,1 SUB #4 <- #0 DO ,1 SUB #5 <- #64 DO ,1 SUB #6 <- #194 DO ,1 SUB #7 <- #48 PLEASE DO ,1 SUB #8 <- #22 DO ,1 SUB #9 <- #248 DO ,1 SUB #10 <- #168 DO ,1 SUB #11 <- #24 DO ,1 SUB #12 <- #16 DO ,1 SUB #13 <- #162 PLEASE READ OUT ,1 PLEASE GIVE UP brain**ck (censored) ++++++++++[>+++++++>++++++++++>+++<<<-]>++.>+.+++++++ ..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+. cow MoO MoO MoO MoO MoO MoO MoO MoO MOO moO MoO MoO MoO MoO MoO moO MoO MoO MoO MoO moO MoO MoO MoO MoO moO MoO MoO MoO MoO MoO MoO MoO MoO MoO moO MoO MoO MoO MoO mOo mOo mOo mOo mOo MOo moo moO moO moO moO Moo moO MOO mOo MoO moO MOo moo mOo MOo MOo MOo Moo MoO MoO MoO MoO MoO MoO MoO Moo Moo MoO MoO MoO Moo MMM mOo mOo mOo MoO MoO MoO MoO Moo moO Moo MOO moO moO MOo mOo mOo MOo moo moO moO MoO MoO MoO MoO MoO MoO MoO MoO Moo MMM MMM Moo MoO MoO MoO Moo MMM MOo MOo MOo Moo MOo MOo MOo MOo MOo MOo MOo MOo Moo mOo MoO Moo whitespace S S S T S S T S S S L T L S S S S S T T S S T S T L T L S S S S S T T S T T S S L T L S S S S S T T S T T S S L T L S S S S S T T S T T T T L T L S S S S S T S T T S S L T L S S S S S T S S S S S L T L S S S S S T T T S T T T L T L S S S S S T T S T T T T L T L S S S S S T T T S S T S L T L S S S S S T T S T T S S L T L S S S S S T T S S T S S L T L S S S S S T S S S S T L T L S S L L L APL: (~R∊R∘.×R)/R←1↓ιR
Nov 8
42 min
#205 This is going to be a little bit awkward
Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Michael #1: Awkward arrays via Simon Thor Awkward Array is a library for nested, variable-sized data, including arbitrary-length lists, records, mixed types, and missing data, using NumPy-like idioms. This makes it better than NumPy at handling data where e.g. the rows in a 2D array have different lengths. It can even be used together with numba to jit-compile the code to make it even faster. Arrays are dynamically typed, but operations on them are compiled and fast. Their behavior coincides with NumPy when array dimensions are regular and generalizes when they’re not. Recently rewritten in C++ for the 1.0 release and can even be used from C++ as well as Python. Careful on installation: pip install awkward1 ← Notice the 1. Brian #2: Ordered dict surprises Ned Batchelder “Since Python 3.6, regular dictionaries retain their insertion order: when you iterate over a dict, you get the items in the same order they were added to the dict. Before 3.6, dicts were unordered: the iteration order was seemingly random.” The surprises: You can’t get the first item, like d[0], since that’s just the value matching key 0, if key 0 exists. (I’m not actually surprised by this.) equality and order (this I am surprised by) Python 3.6+ dicts ignore order when testing for equality {"a": 1, "b": 2} == {"b": 2, "a": 1} OrderedDicts care about order when testing for equality OrderedDict([("a", 1), ("b", 2)]) != OrderedDict([("b", 2), ("a", 1)]) Michael #3: jupyter lab autocomplete and more via Anders Källmar Examples show Python code, but most features also work in R, bash, typescript, and many other languages. 
Hover: Hover over any piece of code; if an underline appears, you can press Ctrl to get a tooltip with function/class signature, module documentation or any other piece of information that the language server provides Diagnostics: Critical errors have red underline, warnings are orange, etc. Hover over the underlined code to see a more detailed message Jump to Definition: Use the context menu entries to jump to definitions Highlight References: Place your cursor on a variable, function, etc and all the usages will be highlighted Automatic Completion: Certain characters, for example '.' (dot) in Python, will automatically trigger completion Automatic Signature Suggestions: Function signatures will automatically be displayed Rename: Rename variables, functions and more, in both notebooks and the file editor. Brian #4: Open Source Tools & Data for Music Source Separation An online “book” powered by Jupyter Book By Ethan Manilow, Prem Seetharaman, and Justin Salamon A tutorial intended to guide people “through modern, open-source tooling and datasets for running, evaluating, researching, and deploying source separation approaches. We will pay special attention to musical source separation, though we will note where certain approaches are applicable to a wider array of source types.” Uses Python and interactive demos with visualizations. Section “basics of source separation” that includes a primer on digitizing audio signals, a look at time-frequency representations, what phase is, and some evaluations and measurements. Includes use of a library called nussl deep learning approaches datasets training deep networks Brian’s comments: Very cool that this is an open source book Even if you don’t care about source separation, the primer on waveform digitization is amazing. The interactive features are great. Michael #5: Pass by Reference in Python: Background and Best Practices Does Python have pointers? 
Some languages handle function arguments as references to existing variables, which is known as pass by reference. Other languages handle them as independent values, an approach known as pass by value. Python uses pass by assignment, very similar to pass by ref. In languages that default to passing by value, you may find performance benefits from passing the variable by reference instead If you actually want to change the value, consider Returning multiple values with tuple unpacking A mutable data type Returning optional “value” types For example, how would we recreate this in Python? public static bool TryParse (string s, out int result); Tuple unpacking def tryparse(string, base=10): try: return True, int(string, base=base) except ValueError: return False, None success, result = tryparse("123") Optional types: def tryparse(string, base=10) -> Optional[int]: try: return int(string, base=base) except ValueError: return None if (n := tryparse("123")) is not None: print(n) Best Practice: Return and Reassign Brian #6: Visualizing Git Concepts by onlywei Wei Wang Git Basics is good, and important, but hard to get all these concepts to sink in well until you play with it. Visualizing Git Concepts with D3 solidifies the concepts Practice using git commands without any code, just visualizing the changes to the repository (and sometimes the remote origin repository) while typing commands. commit, branch, checkout, checkout -b reset, revert merge, rebase tag fetch, pull, push Incredibly powerful to be able to play around with these concepts without using any code or possibly mucking up your repo. Extras: Brian: micro:bit now has a speaker and a microphone - available in November Michael: Firefox containers Twitch! Joke: Q: Where do developers drink? A: The Foo bar - Knock Knock! - An async function - Who's there?
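Michael's point above that Python uses pass by assignment is easiest to see by contrasting rebinding a parameter with mutating the object it refers to. This is a small sketch; the function names `rebind` and `mutate` are invented for illustration.

```python
def rebind(items):
    # Assigning to the parameter binds the LOCAL name to a new list;
    # the caller's list is left alone.
    items = items + [99]
    return items


def mutate(items):
    # Calling a mutating method changes the one shared list object,
    # so the caller sees the change.
    items.append(99)


original = [1, 2, 3]

rebound = rebind(original)
after_rebind = list(original)   # snapshot: still [1, 2, 3]

mutate(original)
after_mutate = list(original)   # snapshot: now [1, 2, 3, 99]

# "Return and reassign": the caller opts in to the new value explicitly.
original = rebind(original)

print(after_rebind, after_mutate, original)
```

The "return and reassign" pattern on the last lines is the best practice the article recommends: the function stays free of side effects and the call site shows that the variable changes.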
Oct 31
34 min
#204 Take the PSF survey and Will & Carlton drop by
Sponsored by Techmeme Ride Home podcast: pythonbytes.fm/ride Special guests Carlton Gibson William Vincent Brian #1: nbQA : Quality Assurance for Jupyter Notebooks Sent in by listener and Patreon supporter (woohoo!!!) Marco Gorelli. We’ve talked about running black on Jupyter notebooks on at least two past shows. Marco’s recommendation is nbQA nbQA lets you run all this on notebooks: black isort mypy flake8 pylint pyupgrade to upgrade syntax doctest to check examples Run as a pre-commit hook Configure in pyproject.toml Also (from Marco) better than standalone black due to: can run on a directory, not just one file at a time keeps diffs minimal and easier to read than black preserves trailing semicolons, as they are used to suppress output in notebooks supports most standard magic commands And the nbQA project is tested with …. drum roll …. pytest (of course) Michael #2: The PSF yearly survey is out, go take it now! This is the fourth iteration of the official Python Developers Survey. The results of this survey serve as a major source of knowledge about the current state of the Python community Takes about 10 minutes They will randomly choose 100 winners (from those who complete the survey in its entirety), who will each receive an amazing Python Surprise Gift Pack. Analysis is really well done, see the 2019 results. Will #3: From Prototype to Production in Django Django defaults to local/prototype settings initially when you run the startproject command. The settings.py file contains global configs for a project. What needs to change for production? DEBUG set to False SECRET_KEY actually secret and not in source control ALLOWED_HOSTS Database probably not SQLite Configure static/media files Change admin path away from /admin User registration, typically use django-allauth Environment variables are the preferred method to have local/production settings. environs is Will’s favorite 3rd party package, but there are multiple others. 
Use the Django deployment checklist, aka python manage.py check --deploy, add HTTPS all over basically DJ Checkup website What else? Testing, logging, performance, security, etc etc etc Django for Professionals book covers all of this and more including Docker Carlton #4: Deployment: Getting your app online Deployment (and how hard it is) seems to come up almost every week on Django Chat. I think a lot about The Deployment Gap that exists: a new user finishes the Django tutorial, or the DRF tutorial or Will’s Django for Beginners book, and they’re still a long way from being able to deploy. PaaS look like a good option. (Heroku/App Service/DO’s new one, and so on) but they’re a bit of a cul-de-sac (do you use that in America?): you drive in, get to the end and then have to drive out again. On the other hand, DIY all looks far-too-scary™: there’s provisioning servers, private clouds, firewalls, permissions, block stores, object stores, … Argh! (On top of all the usual DNS, and all the rest of it.) Plus there’s a tendency I think towards fashion: you’d think you can’t possibly deploy without adopting a micro-service architecture, or a container orchestration platform. It’s too much. This is the same as, You couldn’t possibly use Django, Postgres…; you have to use the New Hotness™. I think there’s a simpler story: start with a VM, a relational database, a simple network setup, and grow from there. There’s still moving parts, but it’s not that complex. I’m working on a tool for all this, Button. It’s coming in 2021. It’s a simpler deployment story. It’s part tool, part guide. It gets you through the “Argh! It’s too scary” bit. You can sign up for early updates at https://btn.dev Brian #5: All Contributors “This is a specification for recognizing contributors to an open source project in a way that rewards each and every contribution, not just code. 
The basic idea is this: Use the project README (or another prominent public documentation page in the project) to recognize the contributions of members of the project community. People are giving themselves and their free time to contribute to open source projects in so many ways, so we believe everyone should be praised for their contributions (code or not).” used by nbQA It is a specification for how to be consistent in listing contributors. Also includes an Emoji Key, to be used with contributors name (and optionally avatar) to denote the kind of contribution they’ve made: 💻 code 📖 doc 🎨 design 💡 example 🚧 maintenance 🔌 plugin and many, many more And a GitHub bot to automate acknowledging contributors to your open source projects. Uses natural language parsing to add people as contributors and add the appropriate emoji Also includes a cli for adding contributors, comparing GitHub contributors to your listed contributors, and more. Michael #6: MovingPandas A Python library for handling movement data based on Pandas and GeoPandas It provides trajectory data structures and functions for analysis and visualization. MovingPandas development started as a QGIS plugin idea in 2018. But made more sense as its own library Features Convert GeoPandas GeoDataFrames of time-stamped points into MovingPandas Trajectories and TrajectoryCollections Add trajectory properties, such as movement speed and direction Split continuous observations into individual trips Generalize Trajectories Aggregate TrajectoryCollections into flow maps Create static and interactive visualizations for data exploration MovingPandas makes it straightforward to compute movement characteristics, such as trajectory length and duration, as well as movement speed and direction. 
Example df = pd.DataFrame([ {'geometry':Point(0,0), 't':datetime(2018,1,1,12,0,0)}, {'geometry':Point(6,0), 't':datetime(2018,1,1,12,6,0)}, {'geometry':Point(6,6), 't':datetime(2018,1,1,12,10,0)}, {'geometry':Point(9,9), 't':datetime(2018,1,1,12,15,0)} ]).set_index('t') gdf = gpd.GeoDataFrame(df, crs=CRS(31256)) traj = mpd.Trajectory(gdf, 1) Extras Carlton: Listen to Django Chat Check out Will’s Tutorials and Books Sign up for early updates on Button Michael: Transcripts are back! Trying to live in DuckDuckGo land. Crazy or smart? :) And remember to do your periodic Google Takeout Deployed my first Fast API web API & site Joke: “Give a person a program, frustrate them for a day. Teach them to program, frustrate them for a lifetime. 🙂” (…unless you teach them to test at the same time. - Brian) The failed interview: “Sorry, we’re looking for someone aged 22-26… with 30 years of experience with Flask”
Oct 23
40 min
#203 Scripting a masterpiece for Python web automation
Sponsored by DataDog: pythonbytes.fm/datadog Michael #1: Introducing DigitalOcean App Platform. Reimagining PaaS to make it simpler for you to build, deploy, and scale apps. Many of our customers have come to DigitalOcean after their PaaS became too expensive, or after hitting various limitations. You can build, deploy, and scale apps and static sites by simply pointing to your GitHub repository. Built on DigitalOcean Kubernetes, the App Platform brings the power, scale, and flexibility of Kubernetes without exposing you to any of its complexity. App Platform is built on open standards, providing more visibility into the underlying infrastructure than in a typical closed PaaS environment. You can also enable ‘Autodeploy on Push,’ which automatically re-deploys the app each time you push to the branch containing the source code. To efficiently handle traffic spikes (planned or unplanned), the App Platform lets you scale apps horizontally (i.e., add more instances that serve your app) and vertically (beef up the instances with more CPU and memory resources), with zero downtime. What can you build with the App Platform? Web apps, static sites, APIs, background workers. Brian #2: Announcing Playwright for Python. Projects: playwright-python and playwright-pytest. It’s a Microsoft thing. The pitch: “With the Playwright API, you can author end-to-end tests that run on all modern web browsers. Playwright delivers automation that is faster, more reliable and more capable than existing testing tools.” Timeout-free automation: it automatically waits for the UI to be ready. Intended to stay modern: emulation of mobile viewports, geolocation, web permissions. Can automate scenarios across multiple pages. Cross-browser: Chromium (Chrome and Edge), WebKit (Safari), and Firefox. Safari rendering even works on Windows and Linux. pytest compatible. Django compatible. Can work within CI/CD, even GitHub Actions.
Michael #3: Asynchronously Opening and Closing Files in asyncio Article by Chris Wellons asyncio has support for asynchronous networking, subprocesses, and interprocess communication. However, it has nothing for asynchronous file operations — opening, reading, writing, or closing. If a file operation takes a long time, perhaps because the file is on a network mount, then the entire Python process will hang. Let’s build it! The usual way to work around the lack of operating system support for a particular asynchronous operation is to dedicate threads to waiting on those operations. By using a thread pool, we can even avoid the overhead of spawning threads when we need them. Plus asyncio is designed to play nicely with thread pools anyway. open() uses with so build an aopen() to have async with. Here’s the tasty bit: def __aenter__(self): def thread_open(): return open(*self._args, **self._kwargs) loop = asyncio.get_event_loop() self._future = loop.run_in_executor(None, thread_open) return self._future aiofile package Brian #4: Excel: Why using Microsoft's tool caused Covid-19 results to be lost this article was on bbc.com, but it was in several places Nearly 16,000 coronavirus cases went unreported in England. Logs pulled together from data from commercial testing firms (filed as csv files) was combined in a Excel xls template so that it could then be uploaded to a central system and made available to the NHS Test and Trace team, as well as other government computer dashboards. XLS was one problem. Limit is about 65k rows. XLSX increases that limit by about 16 times. But still, …. Excel for this? Comment from Prof Jon Crowcroft from the University of Cambridge: "Excel was always meant for people mucking around with a bunch of data for their small company to see what it looked like.” “And then when you need to do something more serious, you build something bespoke that works - there's dozens of other things you could do.” "But you wouldn't use XLS. 
Nobody would start with that." In short: Best practices in computing don’t always make it into the rest of the world. Much of the world still runs on Excel. What does this have to do with Python? Well.. Big datasets should use databases and Python. Check out the Talk Python free webcast on moving from Excel to Python: talkpython.fm/excel-webcast Michael #5: locust.io via Prayson Daniel locust.io is awesome tool to simulate users hammering your endpoint. Quite handy. An open source load testing tool: Define user behavior with Python code, and swarm your system with millions of simultaneous users. Usage: after installing it via pip, you can map your local endpoint locust --host=http://localhost:5000 and open http://localhost:8089 to access the locust web ui to simulate usage Features: Define user behavior in code: No need for clunky UIs or bloated XML. Just plain code. Distributed & scalable: Locust supports running load tests distributed over multiple machines, and can therefore be used to simulate millions of simultaneous users Proven & battle tested: Locust has been used to simulate millions of simultaneous users. Battlelog, the web app for the Battlefield games, is load tested using Locust, so one can really say Locust is Battletested ;). Example: from locust import HttpUser, between, task class WebsiteUser(HttpUser): wait_time = between(5, 15) def on_start(self): self.client.post("/login", { "username": "test_user", "password": "" }) @task def index(self): self.client.get("/") self.client.get("/static/assets.js") @task def about(self): self.client.get("/about/") Brian #6: Fixing Hacktoberfest various sources Hacktoberfest is an interesting idea sponsored by Digital Ocean, and other sponsors. Overall, it’s a good idea. Encourage people to contribute by bribing them with a t-shirt and other swag. Problem and some solutions outlined well by Anthony Sottile in what’s (wrong with) hacktoberfest? There’s always been some spam associated with hacktoberfest. 
Tiny bizarre PRs, PRs to unmaintained repos, etc. This year has been worse. A fairly popular YouTuber posted a video showing people how to get a free t-shirt by doing things like adding “- an awesome project” or expanding “It’s” to “It is” in the readme, then submitting it as “improved docs”. Changes: On 10/3, the rules changed: “An update on efforts to reduce spam with Hacktoberfest: introducing maintainer opt-in and more”. Maintainers can opt in by adding the hacktoberfest topic to their repo. They no longer have to opt out. This should discourage spamming of inactive repos. Summary: PRs count if: submitted during the month of October AND (the PR is labelled as hacktoberfest-accepted by a maintainer OR submitted in a repo with the hacktoberfest topic AND (the PR is merged OR the PR has been approved)). The deadline for completions, merging, labeling, and approving is November 1. I applaud DO and whoever else is working on hacktoberfest for reacting quickly to this. Extras: Michael: PyCascades 2021 will take place on Saturday, February 20th from many locations across the Pacific Northwest and beyond. Call for Proposals 📣 PyCascades has been lucky to give our stage to incredible speakers with wonderful talks over the last three years. We are really looking forward to showcasing our community again next year. Our Call for Proposals (CFP) opens today and closes at the end of the day on Tuesday, November 10th, 2020 Anywhere on Earth. Patricio Reyes, a researcher at Barcelona Supercomputing Center (virtual tour): You could also consider talking about nb_black: a simple black formatter for Jupyter and JupyterLab too. There is another project (only for JupyterLab): JupyterLab Code Formatter: jupyterlab-code-formatter.readthedocs.io Joke: More Classical Programmer Paintings “Delivering a feature in the time of a codefreeze” “RHEL sysadmins entering the Docker convention floor”
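Going back to the asyncio item in this episode, the aopen() idea can be fleshed out into a runnable sketch. The class follows the fragment quoted in the notes (open the file in the default thread pool); the __aexit__ body and the demo coroutine are assumptions of this sketch, not code taken from the article.

```python
import asyncio
import os
import tempfile


class aopen:
    """Async context manager that opens and closes a file in a thread pool."""

    def __init__(self, *args, **kwargs):
        self._args = args
        self._kwargs = kwargs
        self._file = None

    async def __aenter__(self):
        def thread_open():
            return open(*self._args, **self._kwargs)

        loop = asyncio.get_running_loop()
        # Passing None selects the loop's default ThreadPoolExecutor.
        self._file = await loop.run_in_executor(None, thread_open)
        return self._file

    async def __aexit__(self, exc_type, exc, tb):
        loop = asyncio.get_running_loop()
        # close() can block too (for example on a network mount),
        # so it also runs in the pool.
        await loop.run_in_executor(None, self._file.close)
        return False  # do not swallow exceptions from the with-body


async def demo(path):
    async with aopen(path, "w") as f:
        f.write("hello")  # the write itself is still an ordinary blocking call
    async with aopen(path) as f:
        return f.read()


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "demo.txt")
    print(asyncio.run(demo(target)))
```

Only opening and closing are moved off the event loop here; reads and writes stay synchronous, which matches the scope of the article's title.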
Oct 16
40 min
#202 Jupyter is back in black!
Sponsored by DataDog: pythonbytes.fm/datadog Brian #1: New in Python 3.9. Scheduled to be released Oct 5. Python 3.9.0rc2 was released Sept 17. New features (highlights): Dictionary merge (|) and update (|=) operators. String methods str.removeprefix(prefix) and str.removesuffix(suffix). These have also been added to bytes, bytearray, and collections.UserString. In type annotations you can now use built-in collection types such as list and dict as generic types instead of importing the corresponding capitalized types (e.g. List or Dict) from typing. New PEG parser. Any valid expression can be used as a decorator; see PEP 614. Haven’t quite wrapped my head around the possibilities yet. The zoneinfo module (https://docs.python.org/3.9/library/zoneinfo.html#module-zoneinfo) brings support for the IANA time zone database to the standard library. Lots of other great stuff too, please check out the changelog and give 3.9 a spin. Michael #2: jupyter-black via Mary Hoang. I recently tuned into the auto racing episode on Talk Python and liked Kane’s pypi suggestion of blackcellmagic. There are a couple of other pypi packages that envelop the idea of black formatting Jupyter Notebooks and I recently started using a new pypi tool called jupyterblack! This tool lets you black format Notebooks like you would Python files, only you call jblack instead of black. The extension then provides a toolbar button, a keyboard shortcut for reformatting the current code-cell (default: Ctrl-B), and a keyboard shortcut for reformatting whole code-cells (default: Ctrl-Shift-B). It will also point out basic syntax errors. Brian #3: Understanding and preventing DoS in web applications. A listener-submitted suggestion, which led me down a bit of a rabbit hole. By Jacob Kaplan-Moss. Great discussion of what a DoS attack is, and how to check for and prevent problems, including a focus on Python and Django.
One example is ReDoS, regular expression DoS: “ReDoS bugs occur when certain types of strings can cause improperly crafted regular expressions to perform extremely poorly.” Links to Finding Python ReDoS bugs at scale using Dlint and r2c, which talks about using dlint. dlint: DoS linter plugin for flake8. Checks for a huge number of security problems in Python code. Can be used alongside Bandit. Michael #4: bbox-visualizer via Shoumik Sharar Chowdhury (SHOH-mik CHOW-duh-ree). I work with computer vision, and one of the pain points of working with something like object detection or object recognition is positioning the labels once you get the bounding boxes. So for example, in the first image in the README, you get the positions of the boxes around the objects using any object-detection method. That part isn't hard. Positioning the labels like "person", "bicycle", "car" right on top of the boxes, however, is quite annoying. You have to do some clumsy math to make it work like that. This library helps make that very easy. You just use the bounding box locations and their corresponding labels and the library takes care of everything. Moreover, there are some other cool visualizations that you can use, other than the standard label on top of the boxes. Uses OpenCV in Python to work with the image files and in-memory drawing. Define the bounds, set the label text and you’re off: bbv.draw_rectangle(img, bbox) then bbv.add_T_label(img, label, bbox). Brian #5: How to NEVER use lambdas. Another listener suggestion. Starts off with a brief example showing how to rewrite a power function as a lambda. Then jumps right into crazy code: replacing import statements with __import__('library') expressions, moving on to lambda-ifying class definitions, and ending with a complete Flask application as a lambda expression.
Truly horrible stuff Michael #6: Uncommon Contributions: Making impact without touching the core of a library via Alexander, by Vincent Warmerdam Different ways that people can contribute to open source software besides the typical code contribution. Often, contributions include adding features to a library, fixing bugs, or providing examples to a documentation page. But consider: Info rasa --version Before, this command would list the current version of Rasa. In the new version, it lists: The version of python. The path to your virtual environment. The versions of related packages. Cron on Dependencies A user for scikit-lego, a package that I maintain, discovered that the reason the code wasn’t working was because scikit-learn introduced a minor, but breaking, change. To fix this the user added a cronjob with Github actions to the project. Spellcheck Run a spellchecker, not just against our docs, but also on our source code! It turns out we had some issues in our docstrings as well. Error Messages In whatlies, we’ve recently allowed for optional dependencies. If you try to use a part of the library that requires a dependency that is not part of the base package, then you’ll get this error message. In order to use ConveRTLanguage you'll need to install via; > pip install whatlies[tfhub] See installation guide here: https://rasahq.github.io/whatlies/#installation I added something like this to fluentcheck: github.com/csparpa/fluentcheck/pull/22 Failing Unit Tests There’s a lovely plugin for mkdocs called mkdocs-jupyter. It allows you to easily add jupyter notebooks to your documentation pages. When I was playing with it, I noticed that it wasn’t compatible with a new version of mkdocs. Instead of just submitting a bug to Github, I went the extra mile. I created a PR that contained a failing unit-test for this issue. Renaming files Is there a file.py and a class File in file within a package? Careful there. 
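The “Error Messages” idea above can be sketched in a few lines. The helper require_optional, and the package and extra names used with it, are hypothetical; this is not whatlies' actual implementation, only one way to turn a bare ImportError into an actionable message.

```python
import importlib


def require_optional(module_name, package, extra):
    """Import an optional dependency, or explain how to install it.

    module_name: the module to import.
    package, extra: used only to build the install hint
                    ("pip install package[extra]").
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"{module_name!r} is needed for this feature but is not installed.\n"
            f"Install it with:\n"
            f"    pip install {package}[{extra}]"
        ) from err
```

A library would call this at the point of use, for example inside the constructor of the class that needs the extra, so that users who never touch that feature never see the error.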
Extras: Brian: Learn to code with Wonder Woman, Smithsonian Learning Labs, and NASA Microsoft Education Direct link: https://www.microsoft.com/inculture/wonderwoman-1984/ At least some of the tutorials are Python. Not sure if all are. Michael: IndyPy: Python Memory Deep Dive with Michael Kennedy Joke: Suggested by Tim Skov Jacobsen Kelsey Hightower’s project nocode “No Code: No code is the best way to write secure and reliable applications. Write nothing; deploy nowhere.” “No Code Style Guide: All no code programs are the same, regardless of use case, any code you write is a liability.” 43.6k stars 3.2k issues 426 PRs
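For the Python 3.9 highlights at the top of this episode, here is a short sketch of three of them. It needs Python 3.9 or later, and the variable and function names are made up for illustration.

```python
defaults = {"host": "localhost", "port": 8000}
overrides = {"port": 9000}

# PEP 584: | builds a new dict; the right-hand side wins on key conflicts.
merged = defaults | overrides

# |= updates the left-hand dict in place.
defaults |= {"debug": True}

# PEP 616: strip a known prefix or suffix (a no-op when it is absent).
stem = "test_login.py".removeprefix("test_").removesuffix(".py")


# PEP 585: built-in collections as generic types, no typing.List/Dict import.
def port_numbers(configs: list[dict[str, int]]) -> list[int]:
    return [c["port"] for c in configs]
```

Unlike str.lstrip and str.rstrip, which treat their argument as a set of characters, removeprefix and removesuffix remove the exact substring or nothing at all.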
Oct 9
33 min
#201 Understand git by rebuilding it in Python
Sponsored by us! Support our work through: Our courses at Talk Python Training, Python Testing with pytest. Michael #1: Under the hood of calling C/C++ from Python. Basics first: what does C compile to? Each operating system has its own exact format to work with. Among the most popular ones are: ELF (Executable and Linkable Format), used by most Linux distros; PE (Portable Executable), used by Windows; Mach-O (Mach object), used by Apple products. We also need to make our library visible to our programs. The easiest way to do so is to copy it to /usr/lib/, the default system-wide directory for libraries. Maybe put it in system / system32 on Windows? ctypes: the simplest way. With the shared object compiled, we are ready to call it. Consider ctypes to be the easiest way to execute some C code, because it’s included in the standard library and writing a wrapper uses plain Python: lib = ctypes.CDLL('/usr/lib/libdullmath.so'); lib.get_pi For C: You need to be clear about the calling convention (extern “C” for example). Now we can load libraries at runtime, but we are still missing a way to generate the correct caller ABI to use external C libraries. To deal with it, libffi was created. Libffi is a portable C library, designed for implementing FFI tools, hence the name. Given struct and function definitions, it calculates the ABI of function calls at runtime. A mature approach to improve in this area is to allow libraries to introduce themselves. We can oblige every library to define a function named entry_point, which will return metadata about the functions it contains. Final destination: C/C++ extensions and the Python/C API. CPython provides a similar API for implementing C-based extensions: “Extending and Embedding the Python Interpreter”. // NOTE: entry point function has dynamic name PyInit_[HTML_REMOVED] PyMODINIT_FUNC PyInit_mymath(void) { return PyModule_Create(&mymathmodule); } The main difference is that we have to wrap the initial C functions with Python-specific ones.
The CPython interpreter uses its own PyObject type internally rather than raw int, char*, and so on, and we need the wrappers to perform the conversion. Cython, Boost.Python, pybind11 and all all all: The main challenge of writing pure C extensions is the massive amount of boilerplate that needs to be written. Mainly this boilerplate is related to wrapping and unwrapping PyObject. It becomes especially hard if a module introduces its own classes (object types). To solve this issue, a plethora of different tools was created. All of them introduce a certain way to generate wrapping boilerplate automatically. They also provide easy access to C++ code and advanced tools for the compilation of extensions. Examples: aiohttp, an asyncio web framework that uses Cython for HTTP parsing; uvloop, an event loop that wraps libuv, fully written in Cython; httptools, bindings to the nodejs HTTP parser, also fully written in Cython (a lot of other big projects like sanic or uvicorn use httptools). Cecil #2: ugit: DIY Git in Python. Michael #3: Things I Learned to Become a Senior Software Engineer by Neil Kakkar. Growing using different ladders of abstraction: Entering my second year, I had all the basics in place. I did figure out something insightful. I’m working inside the software development lifecycle, but this lifecycle is part of a bigger lifecycle: the product and infrastructure development lifecycle. Learning what people around me are doing: Since we’re not in a closed system, it makes sense to better understand the job of the product managers, the sales people, and the analysts. Product managers are the best source for this. They know how the business makes money, who the clients are, and what the clients need. Learning good habits of mind: Thinking well: Diving into cog sci, one output was a framework for critical thinking. It’s compounding, and compounding is powerful. Strategies for making day-to-day more effective: The other side of the coin is habits that allow you to think well.
It starts with noticing little irritations during the day, inefficiencies in meetings, and then figuring out strategies to avoid them. Some good habits I’ve noticed: Never leave a meeting without making the decision / having a next action Decide who is going to get it done. Things without an owner rarely get done. Document design decisions made during a project Acquiring new tools for thought & mental models New tools for thought are related to thinking well, but more specific to software engineering. For example, I was recently struggling with a domain with lots of complex business logic. Edge cases were the norm, and we wanted to design a system that handles this cleanly. That’s when I read about Domain Driven Design Protect your slack When I say slack, I don’t mean the company, but the adjective. One thing that gives me high output and productivity gains is to “slow down”. Want to get more done? Slow down. When there is slack, you get a chance to experiment, learn, and think things through. This means you get enough time to get things done. When there is no slack, deadlines are tight, and all your focus goes into getting shit done. Ask Questions Q: What is a package? A: It’s code wrapped together that can be installed on a system. Q: Why do I need packages? A: They give a consistent way of getting all the files you need in the right place. Without them, things are easy to mess up. You need to ensure every file is where it’s supposed to be, the system paths are set up, and dependent packages are available. Q: How do packages differ from applications I can install on my system? A: It’s a very similar idea! Windows installer is like a package manager that helps install applications. Similarly, DPKG and rpm packages are like .exes that you can install on Linux systems, with the help of apt and yum package managers, which are like the windows installers. Force multipliers One sprint I didn’t get much done myself. I wrote very limited code. 
Instead, I co-ordinated which changes should go out when (it was a complicated sprint), tested they worked well, did lots of code reviews, made alternate design suggestions, and pair-programmed wherever I could to get things un-stuck. We got everything done, and in this case, zooming out helped make decisions for PRs easier. It was one of our highest velocity sprints. Embrace fear: I’ve learned to embrace this feeling. It excites me. It’s information about what I’m going to learn. I’ve taken it so far that I’ve started tracking it in my human log - “Did I feel fear this week?” If the answer is no too many weeks in a row, I’ve gotten too comfortable. Super powers Getting into the source code when documentation isn’t enough Quest: Reading open source code. Quickly build a mental model for the code you’re looking at Quest: Reading open source code. Embracing fear Quest: Build a side project. Confidence to express ignorance Quest: Overcome the first gotcha with growing. Cecil #4: Build tech skills for space exploration Michael #5: Profiling Django Views by Farhan Azmi We know we need to profile our code Many Python profiling tools exist, but this article will limit only to the most used tools: cProfile and django-silk . The two tools mainly profile in regards to function calls and execution time. To incorporate cProfile to Django views, we can write our own middleware that captures the profiling on every request sent to our Django views. Thankfully, there exists a simpler solution: django-cprofile-middleware. It is a simple profiling middleware created by a Github user omarish. To profile this view with the installed middleware, we can just append prof parameter to the end of the URL, i.e. http://localhost:8000/api/auth/users/availability/?username=[HTML_REMOVED]&email=[HTML_REMOVED]&prof We can visualize the profile result further with Python profiler visualizing library, such as SnakeViz. Just add &download to the request. 
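What a profiling middleware of this kind does for each request can be sketched with the standard library alone. The helper profile_call and the sample function slow_sum are hypothetical and are not taken from django-cprofile-middleware; the sketch only shows the cProfile and pstats calls that such a tool builds on.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    # runcall profiles exactly one call and hands back its return value.
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and print the 10 most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_sum(n):
    """Stand-in for a view function doing some work."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 100_000)
    print(report)
```

Calling profiler.dump_stats with a file path instead of printing would write the binary profile format that visualizers such as SnakeViz read.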
the profile result could not show which database query that brings performance hit. This is needed especially when our application is centered around database (SQL) queries: That’s where django-silk comes in. Add as middleware: Silk will automatically intercept requests we make to our views and the UI can be accessed from the path /silk/ . Dive into a request to see all the headers/form/etc + DB query and perf. Cecil #6: Send an SMS message with Azure Communication Services Extras: Michael: Was on Real Python podcast Cecil: https://studentambassadors.microsoft.com/ Joke: Dependencies
Oct 2
40 min
#200 No dog-piling please (it's episode 200!)
Sponsored by us! Support our work through: Our courses at Talk Python Training Python Testing with pytest Brian #1: How to be helpful online Ned Batchelder When answering questions. Lots of great advice. We’ll focus on just a few here. Answer the question first. There may be other problems with their code that they are not asking about that you want to point out. But keep that for after you’ve helped them and built up trust. No third rails. “It should be OK for someone to ask for help with a program using sockets, and not have to defend using sockets, especially if the specific question has nothing to do with sockets.” Same for pickle, threads, globals, singletons, etc. Don’t let your strong opinions derail the conversation. The goal is to help people. Strong reactions can make the asker feel attacked. No dog-piling. Meet their level. “Try to determine what they know, and give them a reasonable next step, not the ultimate solution. A suboptimal solution they understand is better than a gold standard they can’t make use of.” Say yes. Avoid absolutes. Step back. Take some blame. Use more words. “IRC and other online mediums encourage quick short responses, which are exactly the kinds of responses that will be easy to misinterpret. Try to use more words, especially encouraging optimistic words.” Understand your motivations. Humility. Make connections. Finally: It’s hard. All of Ned’s advice is great. Good meditations for when you read a question and your mouth drops open and your eyes stare in shock. Michael #2: blackcellmagic IPython magic command to format python code in cell using black. Has a great animated gif ;) Just do: %load_ext blackcellmagic Then in any cell %%black and magic! Accepts “arguments” like %%black -l 79 Tobin Jones has been kind enough to develop a NPM package over blackcellmagic to format all cells at once which can be found here. But it’s archived so no idea whether it’s current. 
Brian #3: Test smarter, not harder Luke Plant There’s lots of great advice in here, but I want to highlight two parts that are often overlooked. “Write your test code with the functions/methods/classes you wish existed, not the ones you’ve been given.” “If the API you want to use doesn’t exist yet, you still use it, and then make it exist.” This is huge. People tend to think like this while coding, but forget to do it while testing. Also. Your tests are often the first client for your API, so if the API in question is under your control and you need an easier API for testing, consider adding it to the real API. If it’s easier for testing, it may be easier for other clients of the API as well. “Only write necessary tests — specifically, tests whose estimated value is greater than their estimated cost. This is a hard judgement call, of course, but it does mean that at least some of the time you should be saying “it’s not worth it”.” Michael #4: US: The Greatest Package in the World by Jeremy Carbaugh A package for easily working with US and state metadata: all US states and territories postal abbreviations Associated Press style abbreviations FIPS codes capitals years of statehood time zones phonetic state name lookup is contiguous or continental URLs to shapefiles for state, census, congressional districts, counties, and census tracts The state lookup method allows matching by FIPS code, abbreviation, and name Even a CLI: $ states md Brian #5: Think Like A Coder Part of TED-Ed “… a 10-episode series that will challenge viewers with programming puzzles as the main characters— a girl and her robot companion— attempt to save a world that has been plunged into turmoil.” Although, I only count 9 episodes, I was 4 episodes in and hooked. 
Main cool thing, I think, is introducing terms and topics so they will be familiar when someone really does start coding: loops, for loops, until loops, while loops conditionals variables path logic permutations searches tables recursion Big O Also highly recommended for getting excited about coding: Girls Who Code: Learn to Code and Change the World TED-Ed has tons of other cool series on lots of subjects. CodeCombat Michael #6: Costs of running a Python web app for 55k monthly users How much does running a web app in production actually cost? KeepTheScore is an online software for scorekeeping. Create your own scoreboard for up to 150 players and start tracking points. It's mostly free and requires no user account. Keepthescore.co is a Python flask application running on DigitalOcean and Firebase. It currently has around 55k unique visitors per month, per day it’s around 3.4k. Servers and database on DigitalOcean: Costs per month: $95, the servers are oversized for the load they’re currently seeing. Amazon Web Services: Costs per month: $60, use a reporting tool called Metabase to generate insights and reports from the database Google Cloud, costs per month: $1.32, for Firebase DNS hosting, costs per month: $5 Disqus, costs per month: $10 Is it worth it? Is there revenue? In total that’s around $171 USD per month. If you’re running a company with employees that would be peanuts, but in this case the cost is being borne by a single indie-developer out of his own pocket. The bigger issue is that on the revenue side there’s a big fat zero. This is the reason why we are currently working on monetization. Some Talk Python stats: Maybe 40k monthly visitors, but oh, the podcast clients 3M requests / month just RSS, resulting in 320 GB / mo of XML traffic. We run on two prod servers: $10 & $5 as well as a dedicated MongoDB server @ $10. Total $25/mo. On the other hand, Talk Python Training's AWS bill last month was over $1,000 USD. 
You can hear a bunch about this on Talk Python 215. Joke: From twitter, originally from Netlify: "Oh no! We lost the hackers! Where did they go?" "I don't know! They just ransomware!” Number of days since I have encountered an array index error: -1.
Sep 25
32 min