August 2, 2020
Sponsored by us! Support our work through:
- Our courses at Talk Python Training
- Test & Code Podcast

Brian #1: Building a self-updating profile README for GitHub
- Simon Willison, co-creator of Django
- “GitHub quietly released a new feature at some point in the past few days: profile READMEs. Create a repository with the same name as your GitHub account (in my case that’s github.com/simonw/simonw), add a README.md to it and GitHub will render the contents at the top of your personal profile page—for me that’s github.com/simonw”
- Simon takes it one step further and uses GitHub Actions to keep the README up to date.
- Uses Python to:
  - Grab recent releases from certain GH repos using the GH GraphQL API
  - Link to blog entries using feedparser
  - Retrieve latest links using SQL queries

Michael #2: Handcalcs
- Created by Connor Ferster
- In design engineering, you need to do lots of calculations and have those calculation sheets be kept as legal records as part of the project's design history.
- If they are not being done by hand, then often Excel is used, but formatting calculations in Excel is time consuming and a maintenance nightmare.
- However, doing calculations in Jupyter is not any better even if you fill it up with print() statements and print to PDF: it just looks like a bunch of code output.
- Even proprietary software like MathCAD cannot render math as well as a hand calculation because it does not show the numerical substitution step. No software does.
- Why handcalcs exists:
- Type the formula once into a Jupyter cell
- Have the calculation be rendered out as beautifully as though you had written it by hand.
- Write your notebooks once, and use them for calculation again and again; the formula you write is the same as the representation of the formula.

Symbolic: The primary purpose of handcalcs is to render the full calculation with the numeric substitution. This allows for easy traceability and verification of the calculation.
However, there may be instances when it is preferred to simply display calculations symbolically. For example, you can use the # Symbolic tag to use handcalcs as a fast way to render LaTeX equations symbolically.
- Includes longhand vs. shorthand
- Use units (mm^3), for example.

Brian #3: The (non-)return of the Python print statement
- Article by Jake Edge
- Idea by Guido van Rossum to bring back the print statement.
- Short answer: not gonna happen

Michael #4: FastAPI for Flask Users
- Flask has become the de facto choice for API development
- FastAPI has been getting a lot of community traction lately
- Benefits:
  - Automatic data validation
  - Documentation generation
  - Baked-in best practices such as pydantic schemas and Python typing
- Running “Hello World”: super similar, but FastAPI is uvicorn out of the box
- @app.get('/') vs @app.route('/')
- FastAPI defers serving to a production-ready server called uvicorn.
- URL variables, Flask:

    @app.route('/users/[HTML_REMOVED]')
    def get_user_details(user_id):

- URL variables, FastAPI:

    @app.get('/users/{user_id}')
    def get_user_details(user_id: int):

- Query strings, Flask:

    @app.route('/search')
    def search():
        query = request.args.get('q')

- Query strings, FastAPI:

    @app.get('/search')
    def search(q: str):

- Taking an inbound JSON request in FastAPI: def lower_case(json_data: Dict)
- Nicer if you define a Sentence model via pydantic:

    @app.post('/lowercase')
    def lower_case(sentence: Sentence):

- Blueprints == Routers
- Automatic validation via pydantic

Brian #5: Tweet deleting with tweepy
- Chris Albon
- A useful and simple example of using tweepy to interact with Twitter
- Chris set up and shared a Python script that deletes tweets that:
  - are older than 62 days
  - have been liked by fewer than 100 people
  - haven’t been liked by yourself

Michael #6: Clinging to memory: how Python function calls can increase your memory usage
- by Itamar Turner-Trauring
- I had Itamar on Talk Python episode 274 to discuss FIL, which was recently covered.
- This article basically uses FIL to explore patterns for lowering memory usage within the context of a function.
- With simple code like this, we expected 2GB of memory usage, but we saw 3GB:

    def process_data():
        data = load_1GB_of_data()
        return modify2(modify1(data))

- The problem is that first allocation: we don’t need it any more once modify1() has created the modified version. But because of the local variable data in process_data(), it is not freed from memory until process_data() returns.
- Solution #1: No local variable at all

    return modify2(modify1(load_1GB_of_data()))

- Solution #2: Re-use the local variable

    data = load_1GB_of_data()
    data = modify1(data)
    data = modify2(data)
    return data

- Solution #3: Transfer object ownership (see article)

Extras:

Michael:
- Pickle Use Example via Adam. I once had to work on an API that spoke to a 3rd party service that was a little unusual. That communication to the 3rd party service was over a raw socket connection, so we were responsible for crafting specifically formatted byte arrays to send to them, and we'd get specifically formatted byte arrays back which we'd then have to parse out to determine what pieces of data were in the message. The other wrinkle: that service wasn't available 24/7 but only during limited specific testing periods which had to be negotiated days in advance. We instrumented the code with a feature flag to enable pickling all received messages from that 3rd party service.
- Python 3.8.4 is out
- I'm an Arctic Code Vault Contributor over at GitHub. You might be too.

Joke:
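The effect described in Michael #6 above can be observed on a small scale. The following is only a sketch, not the article's code: Blob, load_data, process_keep_local, and process_reuse_local are hypothetical stand-ins, and the demonstration assumes CPython, where reference counting frees an object as soon as its last reference goes away. A weak reference is used to check whether the originally loaded object is still alive while modify2() runs.

```python
import weakref


class Blob:
    """Hypothetical stand-in for a large buffer (plain lists cannot be weak-referenced)."""

    def __init__(self, payload):
        self.payload = payload


original = None          # weak reference to the most recently loaded Blob
alive_in_modify2 = []    # one entry per modify2() call


def load_data():
    global original
    blob = Blob(list(range(1000)))
    original = weakref.ref(blob)
    return blob


def modify1(blob):
    return Blob([x + 1 for x in blob.payload])


def modify2(blob):
    # Record whether the originally loaded Blob still exists at this point.
    alive_in_modify2.append(original() is not None)
    return Blob([x * 2 for x in blob.payload])


def process_keep_local():
    data = load_data()
    # `data` still refers to the original while modify2() runs.
    return modify2(modify1(data))


def process_reuse_local():
    data = load_data()
    data = modify1(data)   # rebinding drops the last reference to the original
    data = modify2(data)
    return data


process_keep_local()
process_reuse_local()
print(alive_in_modify2)
```

On CPython this should print [True, False]: with the extra local variable the original is still held during modify2(), and with the re-used variable it is already gone. Other Python implementations with different garbage collectors may behave differently.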
July 22, 2020
Special guest: Ines Montani

Michael #1: VS Code Device Simulator
- Want to experiment with MicroPython?
- Teaching a course with little IoT devices?
  - Circuit Playground Express
  - BBC micro:bit
  - Adafruit CLUE with a screen
- Get a free VS Code extension that adds a high fidelity simulator
- Easily create the starter code (main.py)
- Interact with all the sensors (buttons, motion sensors, acceleration detection, device shake detection, etc.)
- Deploy and debug on a real device when ready
- Had the team over on Talk Python.

Brian #2: pytest 6.0.0rc1
- New features
  - You can put configuration in pyproject.toml
  - Inline type annotations. Most user facing API and internal code.
  - New flags
    - --no-header
    - --no-summary
    - --strict-config : error on unknown config key
    - --code-highlight : turn on/off code highlighting in terminal
  - Recursive comparison for dataclass and attrs
- Tons of fixes
- Improved documentation
- There’s a list of breaking changes and deprecations. But really, nothing in the list seems like a big deal to me.
- Plugin authors, including myself, should go test this. Already found one problem. pytest-check: stop on fail works fine, but failing tests marked with xfail show up as xpass. Gonna have to look into that. And might have to recruit Anthony to help out again.
- To try it: pip install pytest==6.0.0rc1
- I’m currently running through the pytest book to make sure it all still works with pytest 6. So far, so good. The one hiccup I’ve found so far: TinyDB had a breaking change with 4.0, so you need to pip install tinydb==3.15.2 to get the tasks project to run right. I should have pinned that in the original setup.py. However, all of the pytest stuff is still valid.
- Guido just tweeted: “Yay type annotations in pytest!”

Ines #3: TextAttack
- Python framework for adversarial attacks and data augmentation for natural language processing
- What are adversarial attacks?
You might have seen examples like these:
- image classifier predicting a cat even if the image is complete noise
- people at protests wearing shirts and masks with certain patterns to trick facial recognition
- Google Translate hallucinating bible texts if you feed it nonsense or repetitive syllables
- What does it mean to "understand" a model? How does it behave in different situations, with unexpected data?
- We can't just inspect the weights – that's not how neural networks work
- To understand a model, we need to run it and find behaviours we don't like
- TextAttack lets you run various different “attacks” from the current academic literature
- It also lets you create more robust training data using data augmentation, for example, replacing words with synonyms, swapping characters, etc.

Michael #4: What is the core of the Python programming language?
- By Brett Cannon, core developer
- Brett and I discussed Python implementation for WebAssembly before
- Get Python into the browser, but with the fact that both iOS and Android support running JavaScript as part of an app it would also get Python on to mobile.
- We have lived with CPython for so long that I suspect most of us simply think that "Python == CPython".
- PyPy tries to be so compatible that they will implement implementation details of CPython.
- Basically most implementations of Python strive to pass CPython's test suite and to be as compatible with CPython as possible.
- Python’s dynamic nature makes it hard to do outside of an interpreter
- That has led Brett to contemplate the question of what exactly is Python?
- How much would one have to implement to compile Python directly to WebAssembly and still be considered a Python implementation?
  - Does Python need a REPL?
  - Could you live without locals()?
- How much compatibility is necessary to be useful?
- The answer dictates how hard it is to implement Python and how compatible it would be with preexisting software.
- [Brett] has no answers
  - It might make sense to develop a compiler that translates Python code directly to WebAssembly and sacrifice some compatibility for performance.
  - It might make sense to develop an interpreter that targets WebAssembly's design but maintains a lot of compatibility with preexisting code.
  - It might make sense to simply support RustPython in their WebAssembly endeavours.
  - Maybe Pyodide will get us there.
- Michael’s thoughts:
  - How about a Python standard language spec?
  - A standard-library “standard???!?” spec.
  - It’s possible - .NET did it.
  - What would we build if we could build it with WebAssembly?
  - Interesting options open up, say with NodeJS-like capabilities, front-end frameworks
  - This could be MUCH bigger if we got browser makers to support alternative runtimes through WebAssembly

Brian #5: Getting started with Pathlib
- Chris May
- Blog post: Stop working so hard on paths. Get started with pathlib!
- PDF “field guide”: Getting started with Pathlib
- Really great introduction to Pathlib
- Some of the info
  - This file as a path object: Path(__file__)
  - Parent directory: Path(__file__).parent
  - Absolute path: Path(__file__).parent.resolve()
  - Two levels up: Path(__file__).resolve(strict=True).parents[1] (see PDF for explanation)
  - Current working dir: Path.cwd()
  - Path building with /
  - Working with files and folders
  - Using glob
  - Finding parts of paths and file names.
- Any time spent learning Pathlib is worth it. If I can do it in Pathlib, I do. It makes my code more readable.

Ines #6: Data Version Control (DVC)
- We're currently working on v3.0 of spaCy and one of the big features is going to be a completely new way to train your custom models, manage end-to-end training workflows and make your experiments reproducible
- It will also integrate with a tool called DVC (short for Data Version Control), which we've started using internally
- DVC is an open-source tool for version control, specifically for machine learning and data
- Machine learning = code + data.
You can check your code into a Git repo, but you can't really check in your datasets and model weights. So it's very difficult to keep track of changes.
- You can think of DVC as “Git for data” and the command line usage is actually pretty similar – for example, you run dvc init to initialize a repo and dvc add to start tracking assets
- DVC lets you track any assets by adding meta files to your repository. So everything, including your data, is versioned, and you can always go back to the commit with the best accuracy
- It also builds a dependency graph based on the inputs and outputs of each step, so you only have to re-run a step if things changed
  - for example, you might have a preprocessing step that converts your data and then a step that trains your model. If the data hasn't changed, you don't have to re-run the preprocessing step.
- They recently released a new tool called CML (short for Continuous Machine Learning), which we haven't tried yet. CI for Machine Learning
- Previews look pretty cool: you can submit a PR with some changes and a GitHub action will run your experiment and auto-comment on the PR with the results, changes in accuracy and some graphs (similar to tools like Code Coverage etc.)

Extra

Michael:
- Podcast Python Search API package, by Anton Zhiyanov
- Mid-string f-string upgrades coming to PyCharm. And Flynt! via Colin Martin

Ines:
- Built-in generic types in 3.9 (PEP 585): you can now write list[str]!

Brian:
- https://testandcode.com/120: FastAPI & Typer - Sebastián Ramírez

Jokes
- Fast API Job Experience, Sebastián Ramírez - @tiangolo: I saw a job post the other day. It required 4+ years of experience in FastAPI. I couldn't apply as I only have 1.5+ years of experience since I created that thing. Maybe it's time to re-evaluate that "years of experience = skill level".
- Defragged Zebra
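The pathlib operations listed under Brian #5 above can be tried without touching real project files. This is a minimal sketch, assuming a throwaway directory layout (project/src/app.py and notes.txt are made-up names for illustration); it uses a concrete file path in place of Path(__file__) so it runs the same way in a REPL or a script.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()

    # Path building with the / operator
    script = root / "project" / "src" / "app.py"
    script.parent.mkdir(parents=True)
    script.write_text("print('hello')\n")
    (script.parent / "notes.txt").write_text("todo\n")

    parent_dir = script.parent           # .../project/src
    two_levels_up = script.parents[1]    # .../project

    # Finding parts of paths and file names
    parts = (script.name, script.stem, script.suffix)

    # glob yields matches in arbitrary order, so sort for a stable result
    python_files = sorted(p.name for p in script.parent.glob("*.py"))

    print(parent_dir.name, two_levels_up.name, parts, python_files)
```

Here parents[0] is the same as .parent, which is why parents[1] is "two levels up" from the file itself.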
July 16, 2020
- Our courses at Talk Python Training
- Brian’s pytest book

Brian #1: Python async frameworks - Beyond developer tribalism
- Tom Christie
- Written on encode.io. encode also encompasses several awesome projects:
  - Django REST framework
  - HTTPX
  - async projects: starlette, uvicorn, orm, databases, broadcaster
- Partly a reaction to “Async Python is not faster”
- Tom would like to see the Python community move beyond polarizing discussions.
- “… we could probably benefit from a bit more recognition of where there is shared ground. And in areas where there’s less clarity, to be able to have constructive conversations around the relative merits in differing approaches.”
- Some points about performance
  - You probably shouldn’t care about performance when you start a project.
  - Success of a project is more related to development experience and strength of the surrounding ecosystem.
  - We should care enough about performance that people don’t dismiss Python due to performance issues.
  - Be careful about the word “performance”.
  - Single async function calls are slightly slower. But as concurrency increases on I/O bound systems, async Python will remain more efficient at interleaving the concurrent tasks.
  - There are no good benchmarks.
- There are valid unknowns.
  - Should we have hybrid frameworks or have new async frameworks?
  - There are different approaches: asyncio, trio, twisted, curio
- In general, Python async discussions continue to move toward positive discourse, even with this divisive topic and strong opinions.
- “In short this is a call for the benefits of adopting a genuinely collaborative mindset rather than a competitive mindset.
We may all be working on different little corners of the landscape, but we can still all appreciate that in the bigger view, we’re all working together.”

Michael #2: commitizen
- Create committing rules for projects 🚀, auto bump versions ⬆️ and auto changelog generation 📂
- by Wei Lee
- For teams
- Main purpose is to define a standard way of committing rules and communicating it (using the cli provided by commitizen).
- Commitizen features
  - Command-line utility to create commits with your rules. Defaults: Conventional commits
  - Display information about your commit rules (commands: schema, example, info)
  - Bump version automatically using semantic versioning based on the commits.
  - Generate a changelog using Keep a changelog

Anthony #3: International PyCons go online (kind of)
- There are lots of regional PyCons across the world. Some of the larger ones are:
  - Cancelled: PyConWeb, PyCon Israel, PyCon Odessa (Ukraine), EuroSciPy (Spain), PyCon Brasil, PyCon UK, PyCon Thailand, PyCon ES (Spain), DragonPy (Slovenia), PyLatam, PyCon DE, PyCon CZ
  - Online: PyCon US, Python Web Conference, FlaskCon, SciPy, PyHEP, EuroPython, PyCon JP (Japan), PyCon AU (Australia), PyCon Bolivia, PyCon ZA (RSA), PyCon APAC (Malaysia), PyCon HK, PyBay, PyCon Africa
  - In Person: PyCon Taiwan, PyCon Italia, PyCon Russia
- Favourites:
  - Deceptive Security using Python - Gajendra Deshpande
  - If Statements are a Code Smell - Aly Sivji
  - Stop Using Mocks.. For a while - Harry Percival
  - Network Analysis and Text PEP in Analysis - Tomoko Furuki
  - Using Python to Detect Vulnerabilities in Binaries - Terri Oda
  - Optimize Python and Django apps with PostGres superpowers - Louise Grandjonc

Brian #4: PEP 618 -- Add Optional Length-Checking To zip
- This PEP proposes adding an optional strict boolean keyword parameter to the built-in zip. When enabled, a ValueError is raised if one of the arguments is exhausted before the others.
- Accepted for Python 3.10
- Awesome.
I really dislike checking the length of everything before using zip.
- Without it, zip silently throws away any data past the point where there’s data in every iterable.

    >>> x = (1, 2, 3)  # 3 will be lost
    >>> y = ('a', 'b')
    >>> list(zip(x, y))
    [(1, 'a'), (2, 'b')]

- With this change, list(zip(x, y, strict=True)) would raise ValueError

Michael #5: timedelta and division?
- How do you find the difference between two times?

    t0 = datetime.datetime.now()
    dt = datetime.datetime.now() - t0

- Now what? We have dt.total_seconds() and things like dt.days, dt.seconds, dt.microseconds but these combine in, … weird ways.
- What about dt.total_hours()? Oh, and why aren’t these properties?
- Well, I learned the right way from Paul Ganssle on TP 271: talkpython.fm/271

    weeks = dt / timedelta(days=7)
    hours = dt / timedelta(hours=1)

- And so on!
- Jeff commented on the episode page “Learning that you can divide a timedelta by a timedelta to come up with days, weeks, etc is like the Python tip of the year.” I agree.
- https://docs.python.org/3/library/datetime.html#timedelta-objects

Anthony #6: Pylance released for Microsoft VS Code
- New extension for VS Code
- Designed to be used with the Python extension
- The Python Extension and the Python Language Server (using the LSP) were different projects. The Python LSP was written in .NET and the Python Extension is written in TypeScript.
- This extension is closed-source as Microsoft plan to use it for some proprietary technology.
- Features docstring automation, signature help, parameter suggestions, code completion (better than existing)
- Supports auto-imports: if you start typing from a namespace, like a standard library module, it will add the import for you.
- Go to Reference, Go to Implementation shortcuts
- Uses the pyright type checker (an alternative to MyPy (Dropbox), Pyre (Facebook) and Pytype (Google))
- If you have the pyright extension installed, remove it first!
- Extension is more useful with a couple of non-default settings
  - Change diagnostic mode to workspace so it inspects all files, not just open ones
  - Change type checking mode to basic (is off by default)
- In my testing had a few issues resolving super methods with mixin (multiple inheritance) in Django

Extras:

Anthony:
- My book is out and Guido is reviewing it!

Michael:
- Humble Bundle is going strong: talkpython.fm/humble2020
- Python 3.9.0b4 Is Now Ready for Testing
- Excel to Python course is coming to Talk Python Training (get notified).

Joke:
July 9, 2020
Sponsored by us! Support our work through:
- Our courses at Talk Python Training
- Brian’s pytest book

Brian #1: Improving Python exception chaining with raise-from
- Ram Rachum
- Python 3 has a change called PEP 3134: Exception Chaining and Embedded Tracebacks
- It should be used more than it is.
- If an exception is raised from an except clause, it could be because:
  - something unexpected happened
  - “An exception was raised, and we decided to replace it with a different exception that will make more sense to whoever called this code. Maybe the new exception will make more sense because we’re giving a more helpful error message. Or maybe we’re using an exception class that’s more relevant to the problem domain, and whoever’s calling our code could wrap the call with an except clause that’s tailored for this failure mode.”
- If it’s the second case, you should change your code to something like this:

    try:
        [HTML_REMOVED]
    except ExpectedExceptionType as e:
        raise BetterException('Better explanation') from e

- It’s the from e that does the magic.
- And now instead of getting “During handling of the above exception, another exception occurred:” you get “The above exception was the direct cause of the following exception:”
- “That’s how you know you have a case of a friendly wrapping of an exception.”

Michael #2: Create and publish interactive reports in Python
- via Tim Pogue
- Datapane is an open source framework which makes it easy to turn scripts and notebooks into interactive reports.
- Free for individuals, paid(?) for teams
- Build reports in Python and deploy scripts and notebooks as self-service reporting tools.
- Analyze data in your own tools: Write code and analyze data in your own editor or environment, whether it’s Jupyter, Colab, or Airflow.
- Build reports in code: Datapane's framework makes it easy to create rich reports from DataFrames, Markdown, and visualization libraries, such as Altair.
- Publish and share: Export as standalone HTML files, or publish reports to Datapane.com for free, where they can be shared and embedded.
- Add forms to filter / drive the report
- Everything in Datapane is an API. Deploy scripts and generate reports from your server, GitHub, Airflow, or CI system.
- Check out the gallery.

Brian #3: Pickle’s nine flaws
- Ned Batchelder
- Instead of “never use pickle”, Ned says “only use pickle if you are OK with its nine flaws”
- The flaws
  - Insecure: Malicious pickles can get the unpickler to run bad code
  - Old pickles look like old code: Any changes to your data structures might break your ability to read old pickles
  - Implicit: All data is serialized as class objects, even if that’s not what you want.
  - Over-serializes: Serializes everything in your objects, even things like cached computation
  - __init__ isn’t called: during unpickling, even if it really should be for your situation
  - Python only: for the most part, it’s not something you can use with other languages
  - Unreadable: binary, so good luck debugging problems
  - Appears to pickle code: but doesn’t really. Keeping a list of functions or classes? It’ll get pickled as names and get bound to a function/class matching the name during unpickling.
  - Slow
- Some of it you can work around, but then, why?
- Alternatives: JSON, marshmallow, cattrs, protocol buffers, …

Michael #4: PEP 602 -- Annual Release Cycle for Python
- by Łukasz Langa
- Status: accepted
- This PEP proposes that Python 3.X.0 will be developed for around 17 months:
- The first five months overlap with Python 3.(X-1).0's beta and release candidate stages and are thus unversioned.
- The next seven months are spent on versioned alpha releases where both new features are incrementally added and bug fixes are included.
- The following three months are spent on four versioned beta releases where no new features can be added but bug fixes are still included.
- The final two months are spent on two release candidates (or more, if necessary) and conclude with the release of the final release of Python 3.X.0.
- Annual release cadence: Feature development of Python 3.(X+1).0 starts as soon as Python 3.X.0 Beta 1 is released. This creates a twelve month delta between major Python versions.
- This change provides the following advantages:
  - makes releases smaller: since doubling the cadence doesn't double our available development resources, consecutive releases are going to be smaller in terms of features
  - puts features and bug fixes in hands of users sooner
  - creates a more gradual upgrade path for users, by decreasing the surface of change in any single release
  - creates a predictable calendar for releases where the final release is always in October (so after the annual core sprint), and the beta phase starts in late May (so after PyCon US sprints), which is especially important for core developers who need to plan to include Python involvement in their calendar

Brian #5: More git Resources:
- On episode 187 we talked about Oh Sh*t, Git!, a zine by Julia Evans
- I mentioned that I was concerned about buying it for a team due to the mild swearing.
- John Place reached out to tell us there’s a non-swearing version: Dangit, git!, the zine.
- Also both of these are inspired by two websites by Katie Sylor-Miller:
  - dangitgit.com
  - ohshitgit.com
- These are free websites with “something went wrong, how do I fix it” solutions.
- All issues have titles that are links/anchors, so you can send someone a link if they ask you a question of how to fix something with git, and hopefully they can figure it out themselves sometime.
- Also Git Cheatsheet
- Not just a pdf
- An interactive single page site that is, for one, beautifully designed.
- There are 5 columns: Stash, Workspace, Index, Local Repo, Upstream Repo
- Hover over a column and it shows you git commands that affect that part and flows to other columns.
- Hover over a command and the description pops up at the bottom.
- The visual is great for reinforcing how actions move data between different parts of a repository, and a great way to teach people to have that mental model that git is not just your repo, it’s all of these components working together.
- Lastly, git-pretty
- Similar goals as the dangit and ohsh*t offerings, this is a single page png flowchart that starts with “so you have a mess on your hands” and asks a bunch of questions to funnel you to how to fix it.
- A fun thing to print out and pin to your wall.

Michael #6: PEP 616 -- String methods to remove prefixes and suffixes
- Dennis Sweeney
- Status: Accepted
- Question: What does this return? "saturday is the 1st".strip('st')
- Answer: aturday is the 1
- If you expected it to remove the string st, well, no. That’s PEP 616.
- Add two new methods, removeprefix() and removesuffix(), to the APIs of Python's various string objects.
- These methods would remove a prefix or suffix (respectively) from a string, if present, and would be added to Unicode str objects, binary bytes and bytearray objects, and collections.UserString.

Extras:

Michael:
- Manning conference Python Bytes event
- Michael's 10 tips for web dev PyCon recording out
- Learn Python Humble Bundle
- Telegram bots by Abhishek Pednekar
  - Python Bytes
  - https://t.me/TalkPythonBot

Joke: Karen Chee (@karencheee):
- you: A famous engineer / inventor is coming over for dinner tonight, want to join us?
- me: Sure, who is it?
- you: His name is Rube Goldberg
- me: That name rings a bell, which sets off a trap that undoes a buckle and releases a ball that rolls down a pipe and …
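The strip() surprise from Michael #6 above, and the PEP 616 methods that address it, can be compared side by side. A minimal sketch, assuming Python 3.9 or newer (where removeprefix() and removesuffix() are available); the prefixes chosen are just for illustration.

```python
text = "saturday is the 1st"

# strip() treats its argument as a set of characters to trim from both ends.
stripped = text.strip("st")

# removeprefix()/removesuffix() (Python 3.9+) remove one exact substring.
no_suffix = text.removesuffix("st")
no_prefix = text.removeprefix("satur")
unchanged = text.removeprefix("sunday")   # prefix absent: string comes back as is

print(stripped, no_suffix, no_prefix, unchanged, sep=" | ")
```

Note that strip("st") eats the leading "s" as well as the trailing "st", while removesuffix("st") only touches the end of the string.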
July 3, 2020
Sponsored by us! Support our work through:
- Our courses at Talk Python Training
- Brian’s pytest book

Michael #1: Making a trading bot asynchronous using Python’s “unsync” library
- by Matt Gosden
- The older way: using the threading and multiprocessing libraries
- The newer way: using async and await from the asyncio library embedded into core Python from 3.7 onwards
- The easier way (I think): using the @unsync decorator from the Python unsync library
- Somewhat realistic example worth looking at.
- Could discuss scalability more
- Also, proper async def and asyncio.sleep() for those playing at home
- But its absence kind of shows unsync winning anyway. 🙂 It does work, right?

Brian #2: Fruit salad scrum estimation scale
- From a Twitter question by Lacy Henschel, answered by Kathleen Jones
- Fruit related to work
  - how easy
  - potential for mess
  - how many seeds, possible problems
  - does it need to be divided
- The scale
  - 1 - grape - trivial
  - 2 - apple - may take a bit of time but everyone knows how to divide it
  - 3 - cherry - easy but with some unknowns (what do you do with the pit?)
  - 5 - pineapple - somewhat undefined, no major unknowns, still a lot of work (lots of opinions on how to cut it)
  - 8 - watermelon - lots of work, some unknowns, messy (don’t know what you are getting into until you cut it open)
  - ?? - tomato - unknown task, needs more info before estimating (doesn’t belong in a fruit salad)
  - ?? - avocado - not scopable, probably urgent (goes bad quickly)

Michael #3: Math to Code
- Math to Code is an interactive Python tutorial to teach engineers how to read and implement math using the NumPy library.
- by Vernon Thommeret
- Nice flashcard style of learning the building blocks of np for standard math
- Give it a try, solutions if you get stuck
- Python and NP together
- Source at GitHub
- Interesting building blocks
  - Skulpt for interpreting Python
  - Skulpt NumPy for a subset of NumPy
  - KaTeX for rendering LaTeX
  - Next.js for frontend framework
  - Tailwind CSS for styling
  - remark for rendering Markdown questions
  - gray-matter for extracting Markdown frontmatter
  - RealFavIconGenerator for generating favicons

Brian #4: PEP 622 -- Structural Pattern Matching
- Draft status, targeted for Python 3.10
- Syntax looks similar to switch/case statement, even though two switch PEPs were rejected earlier
- Designed not only to optimize if/elif/else statements but also to focus on sequence, mapping, and object destructuring.
- match/case statement with many allowed patterns:
  - literal pattern: would then act similar to a switch/case statement
  - name pattern: assigns expression to new variable if previous case doesn’t succeed
  - constant value pattern: enums, similar to literal
  - sequence pattern: works like unpacking assignment
  - mapping pattern: like sequence unpacking, but for mappings, like dictionaries
  - class pattern: create objects for each case and call __match__()
  - combining patterns: | for multiple patterns, including binding patterns like name
  - guards: if expression to further clarify a case
  - named sub-patterns: ok. still getting my head around this

Michael #5: CodeArtifact from AWS
- via Tormod Macleod
- AWS CodeArtifact is a fully managed software artifact repository service that makes it easy for organizations of any size to securely store, publish, and share packages used in their software development process
- AWS CodeArtifact works with commonly used package managers and build tools such as Maven and Gradle (Java), npm and yarn (JavaScript), pip and twine (Python), making it easy to integrate CodeArtifact into your existing development workflows.
- Can be configured to automatically fetch software packages from public artifact repositories such as npm public registry, Maven Central, and Python Package Index (PyPI), ensuring teams have reliable access to the most up-to-date packages.

Brian #6: invoke
- suggested by Joreg Benesch
- replacement for Makefiles
- Confusion:
  - documentation is at pyinvoke.org
  - install with pip install invoke
  - there’s also another PyPI package, called pyinvoke, which is NOT what we are talking about.
- invoke: task execution library
- Write tasks.py files in Python for Makefile-like things
- tasks are Python functions decorated with @task, like

    from invoke import task

    @task
    def build(c, clean=False):
        if clean:
            print("Cleaning!")
        print("Building!")

- invoke tasks with invoke

    $ invoke build -c
    $ invoke build --clean

- you can
  - run shell commands with c.run()
  - declare pre-tasks, tasks that need to run before this one, like “build” requires “clean”, etc.
  - namespaces with multiple files
- tool intended for building documentation, but could probably run lots of stuff with it, like deployment, testing, etc.

Extras:

Brian:

Michael:
- From Guido: Python 3.9.0 beta 3 is out now, for your immediate testing. Wait, what happened to beta 2? Interesting story.
- The next pre-release, the fourth beta release of Python 3.9, will be 3.9.0b4. It is currently scheduled for 2020-06-29.

Joke: Parenting a geek
June 26, 2020
Sponsored by us! Support our work through:
- Our courses at Talk Python Training
- Brian’s pytest book

Brian #1: LEGO Mindstorms Robot Inventor supports Python
- Past
  - NXT 2006
  - NXT 2.0 2009
  - EV3 2013 (plus, weird post apocalypse thing going on)
- Robot Inventor will be available Autumn 2020 (not sure what that means).
- Controllable with both Scratch and Python
- Great updates to help with STEM education
- Instructions for 5 different robots
- interesting:
  - 5x5 LED matrix
  - 6 input/output ports for connecting a variety of sensors and motors.
  - 6 axis gyro/accelerometer
  - color sensor
  - distance sensor
  - and Python!
- Can be programmed with Windows & Mac, of course. But also iOS & Android tablets and phones and even some FireOS devices.
- Related: MicroscoPy - IBM open source, motorized, modular microscope built using LEGO bricks, Arduino, Raspberry Pi and 3D printing.

Michael #2: Step-by-step guide to contributing on GitHub
- by Kevin Markham
- Want to contribute to an open source project? Follow this detailed visual guide to make your first contribution TODAY
- Although there are other guides like it out there, mine is (1) up-to-date with the latest GitHub interface, (2) much more detailed, and (3) highly visual.
- Includes 16 annotated screenshots + 2 workflow diagrams.
- The only prerequisite is that the reader has a tiny bit of Git knowledge. They don't even have to be a great coder, because what I suggest is that they start by fixing a typo or broken link in the documentation. That way they can focus on learning the contribution workflow!
Steps: choose a project to contribute to fork the project clone your fork locally load your local copy in an editor make sure you have an "origin" remote add the project repository as the "upstream" remote pull the latest changes from upstream into your local repository create a new branch make changes in your local repository commit your changes push your changes to your fork create the pull request review the pull request add more commits to your pull request discuss the pull request delete your branch from your fork synchronize your fork with the project repository Nice Tips for contributing code section too. Brian #3: sneklang Snek: A Python-inspired Language for Embedded Devices An even smaller footprint than MicroPython or CircuitPython Can’t wait for Robot Inventor? Snek supports Lego EV3. “Snek is a tiny embeddable language targeting processors with only a few kB of flash and ram. … These processors are too small to run MicroPython.” Can develop using Mu editor Custom Snekboard runs either Snek or CircuitPython. Or run Snek on Lego EV3. Smaller language than Python, but intended to have all learning of Snek transferable to later development with Python. “The goals of the Snek language are: Text-based. A text-based language offers a richer environment for people comfortable with using a keyboard. It is more representative of real-world programming than building software using icons and a mouse. Forward-looking. Skills developed while learning Snek should be transferable to other development environments. Small. This is not just to fit in smaller devices: the Snek language should be small enough to teach in a few hours to people with limited exposure to software. Snek is Python-inspired, but it is not Python. 
It is possible to write Snek programs that run under a full Python system, but most Python programs will not run under Snek.” Michael #4: Oh sh*t git via Andrew Simon, by Julia Evans Does cost $10, no affiliations This zine explains git fundamentals (what’s a SHA?) How to fix a lot of common git mistakes (I committed to the wrong branch!!). Fundamentals Mistakes and how to fix them Merge conflicts Committed the wrong file Going back in time Brian #5: Why I don't like SemVer anymore Brett Cannon Interesting thoughts on SemVer SemVer isn't as straightforward as it sounds; we don't all agree on what a major, minor, or micro change really is. Is adding a deprecation warning a bug fix? or a major interface break? What if projects depending on your project have CI with warnings as errors? Your version number represents your branching strategy, so you should choose a versioning scheme that's appropriate for your branching and release strategy. While maintaining multiple branches, x.y.z might make sense: x - current release x.y - current development x.y.z - bug fixes x+1 - crazy new stuff If you aren’t maintaining 3+ branches at all times, that might be overkill Maybe x.y is enough Maybe just x is enough Rely on CI, potentially on a cron job, to detect when a project breaks for you instead of leaving it up to the project to try and make that call based on their interpretation of SemVer; you will inevitably disagree Remember to pin your dependencies in your apps if you really don't want to have to worry about a dependency breaking you unexpectedly Libraries/packages should be setting a floor, and if necessary excluding known buggy versions, but otherwise don't cap the maximum version as you can't predict future compatibility Michael #6: git fame via Björn Olsson Pretty-print git repository collaborators sorted by contributions.
Install via pip: pip install --user git-fame Register with git: git config --global alias.fame "!python -m gitfame" Run in a repo directory: git fame Get a table of contributors including: Author, Lines of Code, Files, Distribution (stats), sorted by most contributions. Extras: Patreon Shoutout: We have 26 supporters at https://www.patreon.com/pythonbytes Many donate $1 a month, and that’s awesome. A few go above and beyond with more than that: Special shout out to those above a buck: Brent Kincer Brian Cochrane Bert Raeymaekers Richard Stonehouse Jeff Keifer Thank you Michael: __pypackages__ follow up from Kushal Das Joke: https://www.commitstrip.com/en/2017/02/28/definitely-not-lazy/
June 18, 2020
Sponsored by us! Support our work through: Our courses at Talk Python Training Brian’s pytest book Michael #1: sidetable - Create Simple Summary Tables in Pandas by Chris Moffitt Makes it easy to build a frequency table and simple summary of missing values in a DataFrame. Example without and with A useful tool when starting data exploration on a new data set At its core, sidetable is a super-charged version of pandas value_counts with a little bit of crosstab mixed in. Once sidetable is imported, you have a new accessor on all your DataFrames - stb that you can use to build summary tables. Brian #2: tabulate suggested by Tom McDermott Pretty-print tabular data in Python, a library and a command-line utility. from tabulate import tabulate table = [["Sun",696000,1989100000], ["Earth",6371,5973.6], ["Moon",1737,73.5], ["Mars",3390,641.85]] headers=["Planet","R (km)", "mass (x 10^29 kg)"] table_str = tabulate(table, headers=headers) print(table_str) Planet R (km) mass (x 10^29 kg) -------- -------- ------------------- Sun 696000 1.9891e+09 Earth 6371 5973.6 Moon 1737 73.5 Mars 3390 641.85 lots of table formats, including simple (Markdown extended) github (github flavored markdown) pipe jira mediawiki html plain (just spaces) different column alignment options number formatting Michael #3: treebeard - ci for notebooks via Brian Skinn Continuous Integration for binder-ready repos A solution for setting up continuous integration on data science projects requiring minimal configuration. Functionality: Automatically installs dependencies for binder-ready repos (which can use conda, pip, or pipenv) Runs notebooks in the repo (using papermill) Uploads outputs, providing versioned URLs and nbconverted output notebooks Integrates with repos via a GitHub App Slack notifications A secret store for integrating with existing infrastructure A notebook that can run all code cells successfully will be tagged as successful.
Treebeard shows a summary of all notebook statuses once execution is finished. Brian #4: Upcoming features in venv/virtualenv In episode 184, we discussed how virtualenv and venv Coming in Python 3.9, venv will get a --upgrade-deps flag. --upgrade-deps: Upgrade core dependencies: pip setuptools to the latest version in PyPI It’s listed as being changed in 3.8, but it just missed 3.8 by a smidge and will have to wait until 3.9, which is available as beta now. Here’s beta 3. Automatically updates pip and setuptools in the new environment. virtualenv is also getting a new goodie, periodic update. Not only does it create environments with updated setuptools, pip, wheel packages, it will periodically go out and check for updates to make sure it’s ready for your next virtual environment. You can also manually have it update, with the --upgrade-embed-wheels flag. Michael #5: PEP 582 now! via Luiz Irber This PEP proposes to add to Python a mechanism to automatically recognize a __pypackages__ directory and prefer importing packages installed in this location over user or global site-packages. How virtual environments work is a lot of information for anyone new. It takes a lot of extra time and effort to explain them. Different platforms and shell environments require different sets of commands to activate the virtual environments. Virtual environments need to be activated on each opened terminal. Tools like pip can be used to install the required dependencies directly into this directory. Still in draft mode but Python 3.8? https://github.com/David-OConnor/pyflow implements PEP 582 Unfortunately requires everything running via pyflow for now.
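As a rough illustration of the PEP 582 idea above, the sketch below checks for a __pypackages__ directory next to a script and, if it exists, puts its lib folder at the front of the import path. The __pypackages__/X.Y/lib layout follows the draft PEP as we understand it; the function name prefer_pypackages is made up for this example, and the real proposal would do this inside interpreter startup, not in user code.

```python
import sys
from pathlib import Path


def prefer_pypackages(script_dir, path=None):
    """Return a copy of `path` with __pypackages__/X.Y/lib first, if it exists.

    `prefer_pypackages` is a hypothetical helper for illustration only.
    """
    path = list(sys.path if path is None else path)
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    local_lib = Path(script_dir) / "__pypackages__" / version / "lib"
    if local_lib.is_dir():
        # Local packages win over user and global site-packages.
        path.insert(0, str(local_lib))
    return path
```

Nothing needs to be activated here: the lookup depends only on where the script lives, which is the main appeal over virtual environments for newcomers.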
Brian #6: awesome pyproject.toml projects “We think pyproject.toml is pretty awesome, so this awesome list contains projects already using it, or discussing its inclusion.” Testing and formatting apparently switched pretty quick coverage.py pytest tox ward (new to me, no test names, test names are strings) black isort code analysis projects pylint unimport wemake-python-styleguide packaging projects some articles on pyproject.toml and a list of projects discussing the switch Python bytes awesome list Extras: Brian: new website for Pragmatic Michael: Check out our latest episode on pytest-plugins Managing Secrets and your Environment with 1Password Joke: Spouse: Stop by the store on the way home from work, "Honey, please stop at the market and buy 1 bottle of milk. If they have eggs, bring 6" Me: I came back with 6 bottles of milk. Spouse: "Why the hell did you buy 6 bottles of milk? It's just the two of us!" Me: "Why do you think? Because they had eggs!"
June 12, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: MyST - Markedly Structured Text I think this came from a tweet from Chris Holdgraf A fully-functional markdown flavor and parser for Sphinx. MyST allows you to write Sphinx documentation entirely in markdown. MyST markdown provides a markdown equivalent of the reStructuredText syntax, meaning that you can do anything in MyST that you can do with reStructuredText. It is an attempt to have the best of both worlds: the flexibility and extensibility of Sphinx with the simplicity and readability of Markdown. MyST has the following main features: A markdown parser for Sphinx. You can write your entire Sphinx documentation in markdown. Call Sphinx directives and roles from within Markdown, allowing you to extend your document via Sphinx extensions. Extended Markdown syntax for useful rST features, such as line commenting and footnotes. A Sphinx-independent parser of MyST markdown that can be extended to add new functionality and outputs for MyST. A superset of CommonMark markdown. Any CommonMark markdown (such as Jupyter Notebook markdown) is natively supported by the MyST parser. Michael #2: direnv via __dann__ direnv is an extension for your shell. It augments existing shells with a new feature that can load and unload environment variables depending on the current directory. Use cases Load 12factor apps environment variables Create per-project isolated development environments Load secrets for deployment Before each prompt, direnv checks for the existence of a .envrc file in the current and parent directories. If the file exists, it is loaded into a bash sub-shell and all exported variables are then captured by direnv and then made available to the current shell. It supports hooks for all the common shells like bash, zsh, tcsh and fish. This allows project-specific environment variables without cluttering the ~/.profile file. 
Because direnv is compiled into a single static executable, it is fast enough to be unnoticeable on each prompt. Brian #3: Convert a Python Enum to JSON Alexander Hultner Problem: Enum values by default are not serializable, so you can't use them as values in JSON or as values passed to databases. Solution: Derived enumerations, like IntEnum or custom derived enumerations are simple to define and serializable. You can convert them to json and store them as database values. Example: >>> from enum import Enum, IntEnum >>> import json >>> class Color(Enum): ... red = 1 ... blue = 2 ... >>> c = Color.red >>> c <Color.red: 1> >>> >>> json.dumps(c) Traceback (most recent call last): ... TypeError: Object of type Color is not JSON serializable >>> class Color(IntEnum): ... red = 1 ... blue = 2 ... >>> c = Color.red >>> c <Color.red: 1> >>> json.dumps(c) '1' >>> class Color(str, Enum): ... red = "red" ... blue = "blue" ... >>> c = Color.red >>> c <Color.red: 'red'> >>> json.dumps(c) '"red"' Michael #4: Pendulum: Python datetimes made easy via tuckerbeck Drop-in replacement for the standard datetime class. Time deltas dur = pendulum.duration(days=15) # More properties dur.weeks dur.hours # Handy methods dur.in_hours() 360 dur.in_words(locale="en_us") '2 weeks 1 day' Intervals dt = pendulum.now() # A period is the difference between 2 instances period = dt - dt.subtract(days=3) period.in_weekdays() # A period is iterable for dt in period: print(dt) Brian #5: PySnooper - Never use print for debugging again Thanks @pylang23 for the suggestion. With PySnooper you can just add one decorator line to a function and you get a play-by-play log of your function, including which lines ran and when, and exactly when local variables were changed.
Logs every modified variable with value which line of code is being run return value passed in parameters elapsed time Options to: isolate logging to a section of a function with a with block log to a file instead of stdout extend watch to a list of non-local variables extend watch to functions called by the function being decorated All with a simple decorator and a pretty simple API Michael #6: Fil: A New Python Memory Profiler for Data Scientists and Scientists via PyCoders If your Python data pipeline is using too much memory, it can be very difficult to figure where exactly all that memory is going. Yes, there are existing memory profilers for Python that help you measure memory usage, but none of them are designed for batch processing applications that read in data, process it, and write out the result. What you need is some way to know exactly where peak memory usage is, and what code was responsible for memory at that point. And that’s exactly what the Fil memory profiler does. Because of this difference in lifetime, the impact of memory usage is different. Servers: Because they run forever, memory leaks are a common cause of memory problems. Even a small amount of leakage can add up over tens of thousands of calls. Most servers just process small amounts of data at a time, so actual business logic memory usage is usually less of a concern. Data pipelines: With a limited lifetime, small memory leaks are less of a concern with pipelines. Spikes in memory usage due to processing large chunks of data are a more common problem. This is Fil’s primary goal: diagnosing spikes in memory usage. Many tools track just Python memory. Fil captures all allocations going to the standard C memory allocation APIs. Extras: Michael: Student cohorts: training.talkpython.fm/cohorts/apply, but had to close after just a day due to high volume Joke: Senior dev: Where did you get the code that does this from?
Junior dev: Stack Overflow Senior dev: Was it from the question part or from the answer part?
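Picking up the Enum-to-JSON item from this episode: the sketch below shows one way a str-derived enum could make a full round trip through JSON. The helper names dump_record and load_record are invented for illustration; only json and enum from the standard library are used.

```python
import json
from enum import Enum


class Color(str, Enum):
    """A str-derived enum: each member is also a plain string."""
    red = "red"
    blue = "blue"


def dump_record(name, color):
    """Serialize a record whose 'color' field is a Color member."""
    return json.dumps({"name": name, "color": color})


def load_record(text):
    """Parse the JSON and turn the 'color' string back into a Color member."""
    record = json.loads(text)
    record["color"] = Color(record["color"])
    return record
```

Serializing works without a custom encoder because the member is a str; getting the enum back is an explicit step, since JSON itself only carries the string.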
June 5, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean - $100 credit for new users to build something awesome. Michael #1: Waiting in asyncio by Hynek Schlawack One of the main appeals of using Python’s asyncio is being able to fire off many coroutines and run them concurrently. How many ways do you know for waiting for their results? The simplest case is to await your coroutines: result_f = await f() result_g = await g() Drawbacks: The coroutines do not run concurrently. g only starts executing after f has finished. You can’t cancel them once you started awaiting. [asyncio.Task](https://docs.python.org/3/library/asyncio-task.html#asyncio.Task)s wrap your coroutines and get independently scheduled for execution by the event loop whenever you yield control to it task_f = asyncio.create_task(f()) task_g = asyncio.create_task(g()) await asyncio.sleep(0.1) #
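The difference between awaiting coroutines one after another and wrapping them in tasks can be seen by timing both. This is a small sketch, not code from Hynek's article: f and g here are stand-ins that just sleep, and timed is a helper made up for the comparison.

```python
import asyncio
import time


async def f():
    await asyncio.sleep(0.2)
    return "f"


async def g():
    await asyncio.sleep(0.2)
    return "g"


async def sequential():
    # g() only starts after f() has finished.
    return [await f(), await g()]


async def concurrent():
    # Both coroutines are wrapped in tasks, so the event loop can
    # run them side by side while we wait for the results.
    task_f = asyncio.create_task(f())
    task_g = asyncio.create_task(g())
    return [await task_f, await task_g]


def timed(coro_func):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start
```

Calling timed(sequential) and timed(concurrent) returns the same results, but the sequential version waits for both sleeps back to back while the task version overlaps them.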
May 29, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Calvin Hendryx-Parker @calvinhp Brian #1: fastpages: An easy to use blogging platform, with enhanced support for Jupyter Notebooks. Uses GH actions to Jekyll blog posts on GitHub Pages. Create posts with code, output of code, formatted text, directory from Jupyter Notebooks. Altair interactive visualizations Collapsible code cells that can be open or closed by default. Metadata like title, summary, in special markdown cells. twitter cards and YouTube videos tags support Support for pure markdown posts and even MS Word docs for posts. (but really, don’t). Documentation and introduction written in fastpages itself, https://fastpages.fast.ai/ Michael #2: BeeKeeper Studio Open Source SQL Editor and Database Manager Use Beekeeper Studio to query and manage your relational databases, like MySQL, Postgres, SQLite, and SQL Server. Runs on all the things (Windows, Linux, macOS) Features Autocomplete SQL query editor with syntax highlighting Tabbed interface, so you can multitask Sort and filter table data to find just what you need Sensible keyboard-shortcuts Save queries for later Query run-history, so you can find that one query you got working 3 days ago Default dark theme Connect: Alongside normal connections you can encrypt your connection with SSL, or tunnel through SSH. Save a connection password and Beekeeper Studio will make sure to encrypt it to keep it safe. SQL Auto Completion: Built-in editor provides syntax highlighting and auto-complete suggestions for your tables so you can work quickly and easily. Open Lots of Tabs: Open dozens of tabs so you can write multiple queries and tables in tandem without having to switch windows. Save queries View Table Data: Tables get their own tabs too! Use our table view to sort and filter results by column. 
Calvin #3: 2nd Annual Python Web Conference The most in-depth Python conference for web developers Targeted at production users of Python Talks on Django, Flask, Twisted, Testing, SQLAlchemy, Containers, Deployment and more June 17th-19th — One day of tutorials and two days of talks in 3 tracks Keynote talks by Lorena Mesa Hynek Schlawack Russell Keith-Magee Steve Flanders Fireside Chat with Carl Meyer about Instagram’s infrastructure, best practices Participate in 40+ presentations and 6 tutorials Fun will be had and connections made Virtual cocktails Online gaming Board game night Tickets are $199 and $99 for Students As a bonus, for every Professional ticket purchased, we'll donate a ticket to an attendee in a developing country. As a Python Bytes listener you can get a 20% discount with the code PB20 Brian #4: Mimesis - Fake Data Generator “…helps generate big volumes of fake data for a variety of purposes in a variety of languages.” Custom and generic data providers >33 locales Lots of locale dependent providers, like address, Food, Person, … Locale independent providers. Super fast. Benchmarking with 10k full names was like 60x faster than Faker. Data generation by schema. Very cool >>> from mimesis.schema import Field, Schema >>> _ = Field('en') >>> description = ( ... lambda: { ... 'id': _('uuid'), ... 'name': _('text.word'), ... 'version': _('version', pre_release=True), ... 'timestamp': _('timestamp', posix=False), ... 'owner': { ... 'email': _('person.email', domains=['test.com'], key=str.lower), ... 'token': _('token_hex'), ... 'creator': _('full_name'), ... }, ... } ...
) >>> schema = Schema(schema=description) >>> schema.create(iterations=1) - Output: [ { "owner": { "email": "aisling2032@test.com", "token": "cc8450298958f8b95891d90200f189ef591cf2c27e66e5c8f362f839fcc01370", "creator": "Veronika Dyer" }, "name": "widget", "version": "4.3.1-rc.5", "id": "33abf08a-77fd-1d78-86ae-04d88443d0e0", "timestamp": "2018-07-29T15:25:02Z" } ] Michael #5: Schemathesis A tool for testing your web applications built with Open API / Swagger specifications. Supported specification versions: Swagger 2.0 Open API 3.0.x Built with: hypothesis hypothesis_jsonschema pytest It reads the application schema and generates test cases which will ensure that your application is compliant with its schema. Use: There are two basic ways to use Schemathesis: Command Line Interface Writing tests in Python CLI supports passing options to hypothesis.settings. To speed up the testing process Schemathesis provides -w/--workers option for concurrent test execution If you'd like to test your web app (Flask or AioHTTP for example) then there is --app option for you Schemathesis CLI also available as a docker image Code example: import requests import schemathesis schema = schemathesis.from_uri("http://0.0.0.0:8080/swagger.json") @schema.parametrize() def test_no_server_errors(case): # `requests` will make an appropriate call under the hood response = case.call() # use `call_wsgi` if you used `schemathesis.from_wsgi` # You could use built-in checks case.validate_response(response) # Or assert the response manually assert response.status_code < 500 Calvin #6: Finding secrets by decompiling Python bytecode in public repositories Jesse’s initial research revealed that thousands of GitHub repositories contain secrets hidden inside their bytecode. 
It has been common practice to store secrets in Python files that are typically ignored such as settings.py, config.py or secrets.py, but this is potentially insecure Includes a nice crash course on Python byte code and cached source This post comes with a small capture-the-flag style lab for you to try out this style of attack yourself. You can find it at https://github.com/veggiedefender/pyc-secret-lab/ Look through your repositories for loose .pyc files, and delete them If you have .pyc files and they contain secrets, then revoke and rotate your secrets Use a standard gitignore to prevent checking in .pyc files Use JSON files or environment variables for configuration Extras: Michael: Python 3.9.0b1 Is Now Available for Testing Python 3.8.3 Is Now Available Ventilators and Python: Some particle physicists put some of their free time to design and build a low-cost ventilator for covid-19 patients for use in hospitals. https://arxiv.org/pdf/2003.10405.pdf Search of the PDF for Python: "Target computing platform: Raspberry Pi 4 (any memory size), chosen as a trade-off between its computing power over power consumption ratio and its wide availability on the market; • Target operating: Raspbian version 2020-02-13; • Target programming language: Python 3.5; • Target PyQt5: version 5.11.3." "The MVM GUI is a Python3 software, written using the PyQt5 toolkit, that allows steering and monitoring the MVM equipment." Brian: Call for Volunteers! Python GitHub Migration Work Group migration from bugs.python.org to GitHub Calvin: Learn Python Humble Bundle Pay $15+ and get an amazing set of Python books to start learning at all levels Book Industry Charitable Foundation The No Starch Press Foundation Joke: More O’Really book covers
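To see why stray .pyc files matter for the secrets item in this episode, the sketch below compiles a throwaway settings module and checks whether the secret string is still readable in the compiled file. The file name settings.py and the variable API_KEY are placeholders; the function name is invented for this example.

```python
import py_compile
import tempfile
from pathlib import Path


def secret_survives_compilation(secret):
    """Compile a tiny settings module and report whether the secret's bytes
    appear verbatim in the resulting .pyc file."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "settings.py"
        source.write_text(f"API_KEY = {secret!r}\n")
        # py_compile.compile returns the path of the byte-compiled file.
        compiled = py_compile.compile(
            str(source), cfile=str(Path(tmp) / "settings.pyc")
        )
        return secret.encode() in Path(compiled).read_bytes()
```

With an ASCII secret this is expected to return True, which is the point of the item: ignoring settings.py in git does not help if its compiled copy gets committed.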
May 19, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: PSF / JetBrains Survey via Jose Nario Let’s talk results: 84% of people who use Python do so as their primary language [unchanged] Other languages: JavaScript (down), Bash (down), HTML (down), C++ (down) Web vs Data Science languages: More C++ / Java / R / C# on Data Science side More SQL / JavaScript / HTML Why do you mainly use Python? 58% work and personal What do you use Python for? Average answers was 3.9 Data analysis [59% / 59% — now vs. last year] Web Development [51% / 55%] ML [40% / 39%] DevOps [39% / 43%] What do you use Python for the most? Web [28% / 29%] Data analysis [18% / 17%] Machine Learning [13% / 11%] Python 3 vs Python 2: 90% Python 3, 10% Python 2 Widest disparity of versions (pro 3) is in data science. Web Frameworks: Flask [48%] Django [44%] Data Science NumPy 63% Pandas 55% Matplotlib 46% Testing pytest 49% unittest 30% none 34% Cloud AWS 55% Google 33% DigitalOcean 22% Heroku 20% Azure 19% How do you run code in the cloud (in the production environment) Containers 47% VMs 46% PAAS 25% Editors PyCharm 33% VS Code 24% Vim 9% tool use version control 90% write tests 80% code linting 80% use type hints 65% code coverage 52% Brian #2: Hypermodern Python Claudio Jolowicz, @cjolowicz An opinionated and fun tour of Python development practices. Chapter 1: Setup Setup a project with pyenv and Poetry, src layout, virtual environments, dependency management, click for CLI, using requests for a REST API. Chapter 2: Testing Unit testing with pytest, using coverage.py, nox for automation, pytest-mock. Plus refactoring, handling exceptions, fakes, end-to-end testing opinions. Chapter 3: Linting Flake8, Black, import-order, bugbear, bandit, Safety. Plus more on managing dependencies, and using pre-commit for git hooks. 
Chapter 4: Typing mypy and pytype, adding annotations, data validation with Desert & Marshmallow, Typeguard, flake8-annotations, adding checks to test suite Chapter 5: Documentation docstrings, linting docstrings, docstrings in nox sessions and test suites, darglint, xdoctest, Sphinx, reStructuredText, and autodoc Chapter 6: CI/CD CI with GitHub Actions, reporting coverage with Codecov, uploading to PyPI, Release Drafter for release documentation, single-sourcing the package version, using TestPyPI, docs on RTD The series is worth it even for just the artwork. Lots of fun tools to try, lots to learn. Michael #3: Open AI Jukebox via Dan Bader Listen to the songs under “Curated samples.” A neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles. Code is available on github. Dataset: To train this model, we crawled the web to curate a new dataset of 1.2 million songs (600,000 of which are in English), paired with the corresponding lyrics and metadata from LyricWiki. The top-level transformer is trained on the task of predicting compressed audio tokens. We can provide additional information, such as the artist and genre for each song. Two advantages: first, it reduces the entropy of the audio prediction, so the model is able to achieve better quality in any particular style; second, at generation time, we are able to steer the model to generate in a style of our choosing. Brian #4: The Curious Case of Python's Context Manager Redowan Delowar, @rednafi A quick tour of context managers that goes deeper than most introductions. Writing custom context managers with __init__, __enter__, __exit__. Using the decorator contextlib.contextmanager Then it gets even more fun Context managers as decorators Nesting contexts within one with statement.
Combining context managers into new ones Examples Context managers for SQLAlchemy sessions Context managers for exception handling Persistent parameters across http requests Michael #5: nbstripout via Clément Robert In the latest episode, you praised NBDev for having a git hook that strips out notebook outputs. strip output from Jupyter and IPython notebooks Opens a notebook, strips its output, and writes the outputless version to the original file. Useful mainly as a git filter or pre-commit hook for users who don’t want to track output in VCS. This does mostly the same thing as the Clear All Output command in the notebook UI. Has a nice youtube tutorial right in the pypi listing Just do nbstripout --install in a git repo! Brian #6: Write ups for The 2020 Python Language Summit Guido talked about this in episode 179 But these write-ups are excellent and really interesting. Should All Strings Become f-strings?, Eric V. Smith Replacing CPython’s Parser with a PEG-based parser, Pablo Galindo, Lysandros Nikolaou, Guido van Rossum A Formal Specification for the (C)Python Virtual Machine, Mark Shannon HPy: a Future-Proof Way of Extending Python?, Antonio Cuni CPython Documentation: The Next 5 Years, Carol Willing, Ned Batchelder Lightning talks (pre-selected) What do you need from pip, PyPI, and packaging?, Sumana Harihareswara A Retrospective on My "Multi-Core Python" Project, Eric Snow The Path Forward for Typing, Guido van Rossum Property-Based Testing for Python Builtins and the Standard Library, Zac Hatfield-Dodds Core Workflow Updates, Mariatta Wijaya CPython on Mobile Platforms, Russell Keith-Magee Wanted to bring this up because Python is a living language and it’s important to pay attention and get involved, or at least pay attention to where Python might be going. Also, another way to get involved is to become a member of the PSF board of directors What’s a PSF board of directors member do?
video There are some open seats, Nominations are open until May 31 Extras: Michael: Updated search engine for better result ranking Windel Bouwman wrote a nice little script for speedscope https://github.com/windelbouwman/pyspeedscope (follow up from Austin profiler) Jokes: “Due to social distancing, I wonder how many projects are migrating to UDP and away from TLS to avoid all the handshakes?” - From Sviatoslav Sydorenko “A chef and a vagrant walk into a bar. Within a few seconds, it was identical to the last bar they went to.” - From Benjamin Jones, crediting @lufcraft Understanding both of these jokes is left as an exercise for the reader.
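Going back to the context manager item from this episode, here is a compact sketch of the three ideas it lists: a class with __enter__ and __exit__, a generator wrapped with contextlib.contextmanager, and two contexts nested in one with statement. Timer, tag, and demo are names chosen for this example and are not taken from the article.

```python
import time
from contextlib import contextmanager


class Timer:
    """Class-based context manager built from __enter__ and __exit__."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self.start
        return False  # do not swallow exceptions


@contextmanager
def tag(name, out):
    """Generator-based version: code before yield runs on enter, after on exit."""
    out.append(f"<{name}>")
    try:
        yield
    finally:
        out.append(f"</{name}>")


def demo():
    out = []
    # Two contexts nested in a single with statement.
    with Timer() as t, tag("p", out):
        out.append("hello")
    return out, t.elapsed
```

The contexts are entered left to right and exited in reverse, so the closing tag is appended before Timer records the elapsed time.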
May 14, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: interrogate: checks your code base for missing docstrings Suggested by Herbert Beemster Written and Maintained by Lynn Root, @roguelynn Having docstrings helps you understand code. They can be on methods, functions, classes, and modules, and even packages, if you put a docstring in __init__.py files. I love what editors like VS Code & PyCharm do with them. If you hover over a function call, a popup shows up which includes the docstring for the function. Other tools like Sphinx, pydoc, docutils can generate documentation with the help of docstrings. But how good is your project at including docstrings? interrogate is a command line tool that checks your code to make sure everything has docstrings. Neato. What’s missing? -vv will tell you which pieces are covered and not. Don’t want to have everything forced to include docstrings? There are options to select what needs a docstring and what doesn’t. Also can be incorporated into tox testing, and CI workflows. Michael #2: Streamlit: Turn Python Scripts into Beautiful ML Tools via Daniel Hoadley Many folks come to Python from “scripting” angles The gap between that and interactive, high perf SPA web apps is gigantic Streamlit lets you build these as if they were imperative top-to-bottom code Really neat tricks make callbacks act like blocking methods Use existing data science toolkits Brian #3: Why You Should Document Your Tests Hynek Schlawack, @hynek All test_ methods should include a docstring telling you or someone else the what and why of the test. The test name should be descriptive, and the code should be clear. But still, you can get confused in the future. Hynek includes a great example of a simple test where it is not obvious what it’s doing, because the test is checking for a side effect of an action. “This is quite common in testing: very often, you can’t ask questions directly.
Instead you verify certain properties that prove that your code is achieving its goals.” “If you don’t explain what you’re actually testing, you force the reader (possibly future you) to deduce the main intent by looking at all of its properties. This makes it tiring and time-consuming to quickly scan a file for a certain test or to understand what you’ve actually broken if a test starts failing.” Want to make sure all of your test methods have docstrings? interrogate -vv --fail-under 100 --whitelist-regex "test_.*" tests will do the trick. See also: How to write docstrings for tests Michael #4: HoloViz project HoloViz is a coordinated effort to make browser-based data visualization in Python easier to use, easier to learn, and more powerful. HoloViz provides: High-level tools that make it easier to apply Python plotting libraries to your data. A comprehensive tutorial showing how to use the available tools together to do a wide range of different tasks. A Conda metapackage "holoviz" that makes it simple to install matching versions of libraries that work well together. Sample datasets to work with. Comprised of a bunch of cool independent projects Panel for making apps and dashboards for your plots from any supported plotting library hvPlot to quickly generate interactive plots from your data HoloViews to help you make all of your data instantly visualizable GeoViews to extend HoloViews for geographic data Datashader for rendering even the largest datasets Param to create declarative user-configurable objects Colorcet for perceptually uniform colormaps. Brian #5: A cool new progress bar for python Rogério Sampaio, @rsalmei project: alive-progress Way cool CLI progress bars with or without spinners Clean coding interface. Fun features and options like sequential framing, scrolling, bouncing, delays, pausing and restarting. Repo README notes: Great animations in the README. 
(we love this) “To do” list, encourages contributions “Interesting facts” functional style extensive use of closures and generators no dependencies “Changelog highlights” I love this. 1-2 lines of semicolon separated features added per version. Michael #6: Awesome Panel by Marc Skov Madsen The Awesome Panel project exists to share knowledge on how awesome Panel is and can become. A curated list of awesome Panel resources. A gallery of awesome Panel applications. The app itself is a best practice multi-page app with a nice layout developed in Panel. Kind of meta as it’s built with Panel. :) Browse the gallery to get a sense of what it can do Extras: Michael: Kevin Vanderveen created a cool COVID explorer with Streamlit app: https://analysis-covid-19.herokuapp.com repo: https://github.com/kvanderveen/analysis_covid_19 Code Together for editor sharing Brian: PyCon 2020 online new content still being posted, through first few weeks of May. My talk is up. yay! Multiply your Testing Effectiveness with Parameterized Testing description, video, github repo with slides and code Plus lots of other great talks, tutorials, charlas, sponsor workshops, online poster hall pytest resolution https://twitter.com/pytestdotorg/status/1257940818255462408 Joke: O’Really book covers
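Circling back to items #1 and #3 in this episode: to get a feel for what a docstring checker has to do, here is a minimal sketch built only on the standard library's ast module. It is not how interrogate is implemented, and the function name missing_docstrings and the sample source are made up for illustration.

```python
import ast


def missing_docstrings(source):
    """Return the names of modules, classes, and functions without a docstring."""
    tree = ast.parse(source)
    checked = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, checked) and ast.get_docstring(node) is None:
            # Module nodes have no name attribute, so use a placeholder label.
            missing.append(getattr(node, "name", "<module>"))
    return missing


# Hypothetical sample: one documented function, one undocumented test.
sample = '''
"""Module docstring."""

def documented():
    """Explain what this does."""

def test_side_effect():
    assert True
'''

print(missing_docstrings(sample))  # ['test_side_effect']
```

For real projects the interrogate command shown above is the better choice; the sketch only shows that the underlying check is a walk over the syntax tree.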
May 8, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean - $100 credit for new users to build something awesome. Michael #1: Ubuntu 20.04 is out! Next LTS support version since 26th April 2018 (18.04). Comes with Python 3.8 included! Already upgraded all our servers, super smooth. Kernel has been updated to the 5.4 based Linux kernel, with additional support for Wireguard VPN, AUFS5, and improved support for IBM, Intel, Raspberry Pi and AMD hardware. Features the latest version of the GNOME desktop environment. Brings support for installing an Ubuntu desktop system on top of ZFS. 20.04 already an option on DigitalOcean ;) Brian #2: Working with warnings in Python (Or: When is an exception not an exception?) Reuven Lerner Exceptions, the class hierarchy of exceptions, and warnings. “… most of the time, warnings are aimed at developers rather than users. Warnings in Python are sort of like the “service needed” light on a car; the user might know that something is wrong, but only a qualified repairperson will know what to do. Developers should avoid showing warnings to end users.” Python’s warning system …: It treats the warnings as a separate type of output, so that we cannot confuse it with either exceptions or the program’s printed text, It lets us indicate what kind of warning we’re sending the user, It lets the user indicate what should happen with different types of warnings, with some causing fatal errors, others displaying their messages on the screen, and still others being ignored, It lets programmers develop their own, new kinds of warnings. Reuven goes on to show how to use warnings in your code. using them creating custom warnings filtering Michael #3: Safer file writer pip installable, see the article and the repo too. Consider this code: with open(filename, 'w') as fp: json.dump(data, fp) It’s using with, so it’s good right? 
Well the file itself may be overwritten and maybe corrupted With safer, you write almost identical code: with safer.open(filename, 'w') as fp: json.dump(data, fp) Now if json.dump() throws an exception, the original file is unchanged, so your important data file lives to see another day. The actual 28 lines of code is pretty interesting: https://github.com/rec/safer/blob/v1.0.0/safer.py#L70-L97 Brian #4: codespell codespell : Fix common misspellings in text files. It's designed primarily for checking misspelled words in source code, but it can be used with other files as well. I got a cool pull request against the cards project to add a pre-commit hook to run codespell. (Thanks Christian Clauss) codespell caught a documentation spelling error in cards, where I had spelled “arguments” as “arguements”. Oops. Spelling errors are annoying and embarrassing in code and comments, and distracting. Also hard to deal with using traditional spell checkers. So super glad this is a thing. Michael #5: Austin profiler via Anthony Shaw Python frame stack sampler for CPython Profiles CPU and Memory! Why Austin? Written in pure C Austin is written in pure C code. There are no dependencies on third-party libraries. Just a sampler - fast: Austin is just a frame stack sampler. It looks into a running Python application at regular intervals of time and dumps whatever frame stack it finds. Simple output, powerful tools Austin uses the collapsed stack format of FlameGraph that is easy to parse. You can then go and build your own tool to analyse Austin's output. You could even make a player that replays the application execution in slow motion, so that you can see what has happened in temporal order. Small size Austin compiles to a single binary executable of just a bunch of KB. Easy to maintain Occasionally, the Python C API changes and Austin will need to be adjusted to new releases. 
However, given that Austin, like CPython, is written in C, implementing the new changes is rather straight-forward. Creates nice flame graphs The Austin TUI is nice! Web Austin is yet another example of how to use Austin to make a profiling tool. It makes use of d3-flame-graph to display a live flame graph in the web browser that refreshes every 3 seconds with newly collected samples. Austin output format can be converted easily into the Speedscope JSON format. You can find a sample utility along with the TUI and Austin Web. Brian #6: Numbers in Python Moshe Zadka A great article on integers, floats, fractions, & decimals Integers They turn into floats very easily, (4/3)*3 → 4.0, int → float Floats don’t behave like the floating point numbers in theory don’t obey mathematical properties subtraction and addition are not inverses 0.1 + 0.2 - 0.2 - 0.1 != 0.0 addition is not associative My added comment: Don’t compare floats with ==, use pytest.approx or other approximation techniques. Fractions Kinda cool that they are there but be very careful about your input Algorithms on fractions can explode in time and to some extent memory. Generally better to use floats Decimals Good for financial transactions. Weird dependence on a global state variable, the context precision. Safer to use a local context to set the precision locally >>> with localcontext() as ctx: ... ctx.prec = 10 ... Decimal(1) / Decimal(7) ... Decimal('0.1428571429') See also fractions in std lib decimals in std lib What Every Computer Scientist Should Know About Floating-Point Arithmetic Extras: Brian: python 3.9.0a6, now with the new PEG parser for CPython Michael: Keep subscribing over at youtube: pythonbytes.fm/youtube Joke: Unix is user friendly. It's just very particular about who its friends are. (via PyJoke) If you put 1000 monkeys at 1000 computers eventually one will write a Python program. The rest will write PERL. (via @JamesAbel)
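Going back to item #6 (Numbers in Python) from this episode, the float and decimal points are easy to try out. The snippet below is a small sketch using only the standard library; math.isclose stands in for pytest.approx so that it runs without pytest installed.

```python
import math
from decimal import Decimal, localcontext
from fractions import Fraction

# Addition and subtraction are not exact inverses for floats.
residue = 0.1 + 0.2 - 0.2 - 0.1
print(residue == 0.0)  # False

# Compare floats with a tolerance instead of ==.
print(math.isclose(0.1 + 0.2, 0.3))  # True

# Integers turn into floats as soon as true division is involved.
print((4 / 3) * 3)  # 4.0

# Fractions stay exact, at the cost of potentially growing numerators/denominators.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True

# A local context keeps the precision change from leaking into global state.
with localcontext() as ctx:
    ctx.prec = 10
    one_seventh = Decimal(1) / Decimal(7)
print(one_seventh)  # 0.1428571429
```

Outside the with block the default context precision applies again, which is the reason the article prefers localcontext over changing the global context.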
April 30, 2020
Sponsored by DigitalOcean: pythonbytes.fm/datadog Special guest: Guido van Rossum Brian #1: New governance model for the Django project James Bennett on DjangoProject Blog DEP 10 (Django Enhancement Proposal) Looks like it’s been in the making since at least 2018 The specifics are definitely interesting “core team” dissolved new role, “merger” with commit access only for merging pull requests. hold no decision making privileges technical decisions made in public venues “technical board” kept where necessary, but historically it’s rare. no longer elected by committers, but anyone can run and be elected by DSF individual members. More interesting to me is the rationale Grow the set of people contributing to Django Remove the barriers to participation Looking at how decisions are made anyway historically, by reviewing pull requests, and merges done by “Fellows”, paid contractors of the DSF. Specifically, taking into account the specifics of the current state of participation in Django, trying to set it up for inclusion and growth in the future, and the specifics of this project. Not trying to clone the governance of a different project. Michael #2: missingno Missing data visualization module for Python. A small toolset of flexible and easy-to-use missing data visualizations Quick visual summary of the completeness (or lack thereof) of your dataset Just call msno.matrix(collisions.sample(250)) to get a matrix view of where values are missing. The sparkline at right summarizes the general shape of the data completeness and points out the rows with the maximum and minimum nullity in the dataset. Other visualizations are available (heat maps, bar charts, etc) The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap. The dendrogram uses a hierarchical clustering algorithm (courtesy of scipy) to bin variables against one another by their nullity correlation.
Guido #3: Announcements from the language summit. See the schedule of topics covered here. Brian #4: Codes of Conduct and Enforcement I’ve been thinking about this a lot lately. No reason. Just interesting topic, I think. Interesting the differences in CoC and enforcement clauses of different projects based on the types of interaction most likely to need enforcement. Two examples PSF Scope (focus seems to be first on events, second on online) PSF Code of Conduct being open focus on what’s best for the community acknowledging time and effort being respectful of different viewpoints and experiences showing empathy towards other community members being considerate being respectful gracefully accepting constructive criticism using welcoming and inclusive language list of inappropriate behavior PSF CoC Enforcement Procedures 2/3 majority vote among non conflicted work group members. Process for disagreement of the work group Django Scope (focus on online spaces, events seem to be covered elsewhere) Django Code of Conduct be friendly and patient be welcoming be considerate be respectful be careful in the words you choose Includes examples of harassment and exclusionary behavior that isn’t acceptable. when we disagree try to understand why Django CoC Enforcement Manual Resolution timelines in place. Aiming for resolution within a week. Unilateral authority: Any committee member may act immediately (before consensus) to end the situation if the act is ongoing or threatening. Otherwise, consensus must be reached. Otherwise, it’s turned over to the DSF board for resolution. Differences are interesting The focus on online interactions and the Django push to try to get more people involved I think are part of the need for really fast reaction times for problems, and then trying to reach consensus. The ability to bump the decision up to the DSF is interesting too. Also the 2/3 vs consensus. 
For other projects Looking at these two examples, why they are different, and what similarities and needs for inclusion and growth of more developers, online vs events, etc, before deciding how to enforce CoC on your project. Enforcement and quick enforcement and public statement of what enforcement looks like seems really important. Don’t ignore it. Figure out the process before you have to use it. Michael #5: Myths about Indentation Python can come across as a funky language using spacing, not { } for code blocks So let’s talk about some myths #1 Whitespace is significant in Python source code. No, not in general. Only the indentation level of your statements is significant (i.e. the whitespace at the very left of your statements). Everywhere else, whitespace is not significant and can be used as you like, just like in any other language. The exact amount of indentation doesn't matter at all, but only the relative indentation of nested blocks (relative to each other). Furthermore, the indentation level is ignored when you use explicit or implicit continuation lines. # For example: >>> foo = [ ... 'some string', ... 'another string', ... 'short string' ... ] #2 Python forces me to use a certain indentation style Yes and no. You can write the inner block all on one line if you like, therefore not having to care about indentation at all. These are equivalent >>> if 1 + 1 == 2: ... print("foo") ... print("bar") ... x = 42 >>> if 1 + 1 == 2: ... print("foo"); print("bar"); x = 42 >>> if 1 + 1 == 2: print("foo"); print("bar"); x = 42 If you decide to write the block on separate lines, then yes, Python forces you to obey its indentation rules The conclusion is: Python forces you to use indentation that you would have used anyway, unless you wanted to obfuscate the structure of the program. Seen C code like this: if (some condition) if (another condition) do_something(fancy); else this_sucks(badluck); Either the indentation is wrong, or the program is buggy. 
In Python, this error cannot occur. The program always does what you expect when you look at the indentation. #3 You cannot safely mix tabs and spaces in Python That's right, and you don't want that. Most good editors support transparent translation of tabs, automatic indent and dedent. It's behaving like you would expect a tab key to do, but still maintaining portability by using spaces in the file only. This is convenient and safe. #4 I just don't like it - That's perfectly OK; you're free to dislike it - But it does have a lot of advantages, and you get used to it very quickly when you seriously start programming in Python. #5 How does the compiler parse the indentation The parsing is well-defined and quite simple. Basically, changes to the indentation level are inserted as tokens into the token stream. After the lexical analysis (before parsing starts), there is no whitespace left in the list of tokens (except possibly within string literals, of course). In other words, the indentation is handled by the lexer, not by the parser. Guido #6: Parsers and LibCST - https://github.com/Instagram/LibCST Extras: Michael: Django no longer supports Python 2 AT ALL (via Adam (Codependent Codr)). April 1st this year, the 1.11 line of Django has left Long Term Support (LTS). Leaving only 2.2.12+ with exclusively Python 3 support. Quick follow up on “Coding is Googling”. I went through a recent blip of mad googling. Brian: Gotta get my talk recorded this week, deadlines Friday. A little worried. As a writer and developer, me and deadlines don’t always see eye to eye. Follow-ups from previous episodes: Got lots of help with my Mac / Windows problem and modifier keys. Thanks everyone. Simplest solution Apple→System Prefs→Keyboard→Modifier Keys, and swap control and command for my external keyboard. So far, so good. 
You can’t use the setuptools_scm trick to get GitHub Actions to automatically publish to Test PyPI or PyPI for Flit or Poetry projects, since the version number is a simple string in the repo. Would love to hear if anyone has a solution to this one. Otherwise I’m fine with a make or tox snippet for publishing that combines bumping the version. Guido: PyCon goes online. Python 2.7.18 was released, the last Python 2 release ever. Joke: Via https://twitter.com/derchambers/status/1226760532763410432 How can you borrow more money at the same time? With asyncIOUs!
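Back to myth #5 from this episode (indentation is handled by the lexer): the standard library's tokenize module makes this visible. The sketch below assumes tokenize behaves like CPython's own tokenizer for these inputs; layout_tokens and the sample sources are illustrative names, not part of the original article.

```python
import io
import tokenize


def layout_tokens(code):
    """Return the INDENT/DEDENT token names produced for `code`, in order."""
    tokens = tokenize.generate_tokens(io.StringIO(code).readline)
    wanted = (tokenize.INDENT, tokenize.DEDENT)
    return [tokenize.tok_name[tok.type] for tok in tokens if tok.type in wanted]


four_spaces = "if 1 + 1 == 2:\n    print('foo')\n    x = 42\nprint('done')\n"
eight_spaces = "if 1 + 1 == 2:\n        print('foo')\n        x = 42\nprint('done')\n"
continuation = "foo = [\n    'some string',\n        'another string',\n]\n"

# The amount of indentation does not matter, only the change in level.
print(layout_tokens(four_spaces))   # ['INDENT', 'DEDENT']
print(layout_tokens(eight_spaces))  # ['INDENT', 'DEDENT']

# Inside brackets (implicit continuation lines) indentation is ignored entirely.
print(layout_tokens(continuation))  # []
```

Both indented versions yield the same two tokens, which matches the point of myth #1 as well: only relative indentation of nested blocks is significant.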
April 22, 2020
This episode is brought to you by DigitalOcean: pythonbytes.fm/digitalocean YouTube is going strong over at pythonbytes.fm/youtube Michael #1: Python String Format Website by Lachlan Eagling Have you ever forgotten the arguments to datetime.strftime()? Quick: What’s the format for Wed April 15, 10:30am? I don’t know but the site says '%a %B %d, %I:%M%p' and it’s right! Brian #2: Pandas-Bokeh Suggested by Jack McKew “Pandas-Bokeh provides a Bokeh plotting backend for Pandas, GeoPandas and Pyspark DataFrames, similar to the already existing Visualization feature of Pandas. Importing the library adds a complementary plotting method plot_bokeh() on DataFrames and Series.” “With Pandas-Bokeh, creating stunning, interactive, HTML-based visualization is as easy as calling: df.plot_bokeh()" You can also switch the default plotting of pandas to Bokeh with pd.set_option('plotting.backend', 'pandas_bokeh') This interface looks a lot easier to me, instead of frames and plots and shows and such. Lots of options, and all collected in parameters to the plot call. Can also export a notebook or a standalone html file. Plus, the combined install of pip install pandas-bokeh pulls in everything you need. Michael #3: NBDev nbdev is a library that allows you to fully develop a library in Jupyter Notebooks, putting all your code, tests and documentation in one place. That is: you now have a true literate programming environment, as envisioned by Donald Knuth back in 1983!
This seems to be a massive upgrade for notebooks and related tooling Creates Python packages out of a notebook Creates documentation from the notebook Solves the git perma-conflict issues with git pre-commit hooks Use #export to declare a cell should become a function in the package Manages the boilerplate issues for creating Python packages (setup.py, etc) Makes testing possible inside notebooks Navigate and edit your code in a standard text editor or IDE, and sync any changes automatically back into your notebooks (reverse basically) Follow getting started instructions. Docs render slightly better at nbdev.fast.ai Brian #4: Stop naming your python modules “utils” Sebastian Buczyński, @EnforcerPL Lots of projects, public and private, end up having a utils.py. “utils is arguably one of the worst names for a module because it is very blurry and imprecise. Such a name does not say what is the purpose of code inside. On the contrary, a utils module can as well contain almost anything. By naming a module utils, a software developer lays down perfect conditions for an incohesive code blob. Since the module name does not hint team members if something fits there or not, it is likely that unrelated code will eventually appear there, as more utils.” one occurrence of misbehavior invites more of them I have seen this in action. I’ve put 2-3 hard to classify methods, but used in lots of modules, into a utils.py, only to come back in a few months and see a couple dozen completely unrelated methods, now that the team has a junk drawer to throw things in. Excuses: It’s just one function There is no other place to put this code I need a place for company commons But Django does it Instead: Try naming based on role of the code or group functions by theme. If you see a utils.py crop up in a code review, request that it be renamed or split and renamed. 
Michael #5: Scalene A high-performance, high-precision CPU and memory profiler for Python It runs orders of magnitude faster than other profilers while delivering far more detailed information. Scalene is fast. It uses sampling instead of instrumentation or relying on Python's tracing facilities. Its overhead is typically no more than 10-20% (and often less). Scalene is precise. Unlike most other Python profilers, Scalene performs CPU profiling at the line level, pointing to the specific lines of code that are responsible for the execution time in your program. Scalene separates out time spent running in Python from time spent in native code (including libraries). Scalene profiles memory usage. In addition to tracking CPU usage, Scalene also points to the specific lines of code responsible for memory growth. It accomplishes this via an included specialized memory allocator. Requires special install, not just pip (see brew install instructions for the docs) Scalene profiles copying volume, making it easy to spot inadvertent copying, especially due to crossing Python/library boundaries (e.g., accidentally converting numpy arrays into Python arrays, and vice versa). See the performance comparison chart. Would be nice to have integrated in the editors (PyCharm and VS Code) Brian #6: From 1 to 10,000 test cases in under an hour: A beginner's guide to property-based testing Carolyn Stransky, @carolynstran Excellent intro to property based testing and hypothesis Starts with a unit test that uses example based testing. Before showing similar test using hypothesis, she talks about the different mindset of testing for properties instead of exact examples. Like not the exact sorted list you should but instead, the length should be the same the contents should contain the same things, for instance, using set for that assertion you could element-wise walk the list and make sure i
April 16, 2020
Sponsored by Datadog: pythonbytes.fm/datadog We’re launching a YouTube Project: pythonbytes.fm/youtube Brian #1: Announcing a new Sponsorship Program for Python Packaging “The Packaging Working Group of the Python Software Foundation is launching an all-new sponsorship program to sustain and improve Python's packaging ecosystem. Funds raised through this program will go directly towards improving the tools that your company uses every day and sustaining the continued operation of the Python Package Index.” Improvements since 2017, as a result of one time grants, a contract, and a gift: relaunch PyPI in 2018 added security features in 2019 improve support for users with disabilities and multiple locales in 2019 security features in 2019, 2020 pip & dependency resolver in 2020 Let’s keep it going We use PyPI every day We need packaging to keep getting better You, and your company, can sponsor. View the prospectus, apply to sponsor, or ask questions. Individuals can also donate. Michael #2: energy-usage A Python package that measures the environmental impact of computation. Provides a function to evaluate the energy usage and related carbon emissions of another function. Emissions are calculated based on the user's location via the GeoJS API and that location's energy mix data (sources: US E.I.A and eGRID for the year 2016). Can save report to PDF, run silently, etc. Only runs on Linux Brian #3: Coding is 90% Google Searching — A Brief Note for Beginners Colin Warn Short article, mostly chosen to discuss the topic. Michael & Brian disagree, so, what’s wrong with this statement? Michael #4: Using WSL to Build a Python Development Environment on Windows Article by Chris Moffitt VMs aren’t fair to Windows (or macOS or …) But you need to test on Linux-like systems! Enter WSL. In 2016, Microsoft launched Windows Subsystem for Linux (WSL) which brought robust Unix functionality to Windows.
May 2019, Microsoft announced the release of WSL 2 which includes an updated architecture that improved many aspects of WSL - especially file system performance. Check out Chris’ article for What is WSL and why you may want to install and use it on your system? Instructions for installing WSL 2 and some helper apps to make development more streamlined. How to use this new capability to work effectively with python in a combined Windows and Linux environment. The main advantage of WSL 2 is the efficient use of system resources. Running a very minimal subset of Hyper-V features and only using minimal resources when not running. Takes about 1 second to start. The other benefit of this arrangement is that you can easily copy files between the virtual environment and your base Windows system. Get the most out of this with VS Code + Remote - WSL Python Extension Anaconda Extension Pack Brian #5: A Pythonic Guide to SOLID Design Principles Derek D Again, mostly including this as a discussion point But for reference, here’s the decoder Single Responsibility Principle Every module/class should only have one responsibility and therefore only one reason to change. Open Closed Principle Software Entities (classes, functions, modules) should be open for extension but closed to change. Liskov's Substitutability Principle If S is a subtype of T, then objects of type T may be replaced with objects of Type S. Interface Segregation Principle A client should not depend on methods it does not use. Dependency Inversion Principle High-level modules should not depend on low-level modules. They should depend on abstractions and abstractions should not depend on details, rather details should depend on abstractions. Michael #6: Types for Python HTTP APIs: An Instagram Story Let’s talk about Typed HTTP endpoints Instagram has a few (thousand!) on a single Django app We can have data access layers with type annotations, but how do these manifest in HTTP endpoints? 
Instagram has a cool api_view decorator to “upgrade” regular typed methods to HTTP endpoints. For data exchange, dataclasses are nice, they have types, they have type validation, they are immutable via frozen. But some code is old and crusty, so TypedDict out of mypy allows raw dict usage with validation still. OpenAPI can be used for very nice documentation generation. Comments are super interesting. Suggesting pydantic, fastapi, and more. But that all ignores the massive legacy code story. But one is helpful and suggests Schemathesis: A tool for testing your web applications built with Open API / Swagger specifications. Extras: Michael: superstring follow up Joke: "How many programmers does it take to kill a cockroach? Two: one holds, the other installs Windows on it."
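Following up on item #6 of this episode (typed HTTP APIs): the two data-exchange styles mentioned there can be sketched with the standard library alone. This is not Instagram's api_view decorator, which is not shown in the article summary; UserResponse, LegacyUserDict, get_user, and get_user_legacy are hypothetical names. Note that the annotations here are checked by a static type checker such as mypy, not enforced when the code runs.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass
from typing import TypedDict


@dataclass(frozen=True)
class UserResponse:
    """Hypothetical response type for newly written endpoints."""

    user_id: int
    username: str


class LegacyUserDict(TypedDict):
    """Hypothetical typed view over an existing raw-dict response."""

    user_id: int
    username: str


def get_user(user_id: int) -> UserResponse:
    # A data access layer would be called here; the value is hard-coded for the sketch.
    return UserResponse(user_id=user_id, username="example")


def get_user_legacy(user_id: int) -> LegacyUserDict:
    # Old code keeps returning a plain dict, but the type checker now knows its shape.
    return {"user_id": user_id, "username": "example"}


response = get_user(7)
try:
    response.username = "changed"  # frozen=True makes instances immutable
except FrozenInstanceError:
    print("mutation rejected")

print(asdict(response))     # {'user_id': 7, 'username': 'example'}
print(get_user_legacy(7))   # {'user_id': 7, 'username': 'example'}
```

Both functions serialize to the same JSON-ready dictionary, so a legacy endpoint could move from the TypedDict style to the dataclass style without changing what clients see.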
April 7, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Topic #0: Quick chat about COVID 19 Brian #1: What the heck is pyproject.toml? Brett Cannon pyproject.toml PEP 517 and 518 define what this file looks like and how to use it to build projects We’re familiar with it being used for flit and poetry based projects. Not so much with setuptools, but it does work with setuptools. You can add configuration for non-build related activities, such as coverage, tox, even though those tools support their own config files. Black is gaining popularity, probably more so than the use of flit. Black only uses pyproject.toml for configuration (what little config is available. But there is some.) So. A project adds black, ends up configuring it with pyproject.toml, but doesn’t specify build steps. Now builds are broken. :( Brett has the answers. Add the following to pyproject.toml. Then go read the rest of Brett’s article. It’s good. [build-system] requires = ["setuptools >= 40.6.0", "wheel"] build-backend = "setuptools.build_meta" Michael #2: Awesome Python Bytes Awesome List By Jack McKew Will be adding to this repo whenever I hear about awesome packages (in my opinion), PRs are welcome for anyone else though! Already has 5 PRs accepted Comes with graphics!!! Like all good presentations should. Some fun projects this made me recall: Great Expectations - for validating, documenting, and profiling, your data pandas-vet - a plugin for flake8 that provides opinionated linting for pandas code. GeoAlchemy - Using SQLAlchemy with Spatial Databases. vue.py - Provides Python bindings for Vue.js. It uses brython to run Python in the browser. Remember we have speedy search for our content over at pythonbytes.fm/search Brian #3: Publishing package distribution releases using GitHub Actions CI/CD workflows PyPA You’ve moved to flit (or not) and started using GitHub actions to build and test whenever you push to GitHub. So awesome.
But now, there’s still a manual step to remember to publish to PyPI. And maybe we should be checking publish more often with the Test PyPI server. This article is a step by step walkthrough. It’s a bit dated, 3.7. So I’m trying to walk through all the steps with my cards project and it will be finished by the time this episode goes live. Stumbling blocks right now: I’ve left my email blank, no email for author or maintainer in pyproject.toml, because neither flit, nor pip require it. But PyPI still does. grrrr. Trying to decide between: normal email, setting up a new email for it, using a me+pypi gmail alias, setting up a new email address just for pypi, etc. test pypi fails due to “file already exists”, so, that’s always gonna be the case unless I bump the version, so gonna have to try to figure out a way around that. Michael #4: Rich text for terminals Rich is a Python library for rich text and beautiful formatting in the terminal. Add colorful text (up to 16.7 million colors) with styles (bold, italic, underline etc.) to your script or application. Rich can also render pretty tables, progress bars, markdown, syntax highlighted source code, and tracebacks -- out of the box. Centered or justified text Tables, tables! Syntax highlighted code Markdown! Can replace print() and does pretty printing of dictionaries with color. Good Windows support for the new Windows Terminal Brian #5: psutil: Cross-platform lib for process and system monitoring in Python “psutil (process and system utilities) is a cross-platform library for retrieving information on running processes and system utilization (CPU, memory, disks, network, sensors) in Python. It is useful mainly for system monitoring, profiling and limiting process resources and management of running processes. 
It implements many functionalities offered by classic UNIX command line tools such as ps, top, iotop, lsof, netstat, ifconfig, free and others.” Useful for an incredible amount of information about the system you are running on: cpu times, stats, load, number of cores memory size and usage disk partitions, usage sensors, including battery users processes and process management getting ids, names, etc. cpu, memory, connections, files, threads, etc per process signaling processes, like suspend, resume, kill Michael #6: How python implements super long integers by Arpit Bhayani In C, you worry about picking the right data type and qualifiers for your integers; at every step, you need to think about whether int would suffice or whether you should go for a long or even higher to a long double. In Python, you need not worry about these "trivial" things because Python supports integers of arbitrary size. 2 ** 20000 in C is INF whereas in Python it’s fine, just a 6,021-digit result. But how!?! Integers are represented as: typedef struct { PyObject ob_base; Py_ssize_t ob_size; /* Number of items in variable part */ } PyVarObject; Other types that have PyObject_VAR_HEAD are PyBytesObject PyTupleObject PyListObject # Python's number: struct _longobject { PyObject ob_base; Py_ssize_t ob_size; /* Number of items in variable part */ digit ob_digit[1]; }; A "digit" is base 2**30, hence if you convert 1152921504606846976 (which is 2**60) into base 2**30 you get 100 Operations on super long integers Addition: Integers are persisted "digit-wise", which means addition is as simple as what we learned in grade school Subtraction: Same Multiplication: In order to keep things efficient, CPython implements the Karatsuba algorithm, which multiplies two n-digit numbers in O(n^log2(3)), roughly O(n^1.585), elementary steps. Optimization of commonly-used integers: Python preallocates small integers in a range of -5 to 256. This allocation happens during initialization Extras: Michael: We're coming to YouTube, probably. :) npm is joining GitHub Joke:
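Returning to item #6 of this episode (super long integers): the digit decomposition is simple to reproduce in pure Python. This is a sketch of the idea, not CPython's C code; to_digits is a made-up helper, and the 30-bit digit size is an assumption that holds on typical 64-bit builds (sys.int_info reports the real value).

```python
import sys


def to_digits(n, bits=30):
    """Split a non-negative int into base-2**bits digits, least significant first."""
    base = 1 << bits
    digits = []
    while True:
        n, digit = divmod(n, base)
        digits.append(digit)
        if n == 0:
            return digits


# 2**60 is "100" in base 2**30; stored least significant digit first it reads 0, 0, 1.
print(to_digits(1152921504606846976))  # [0, 0, 1]

# 2**20000 needs 667 thirty-bit digits: 666 zero digits plus one holding 2**20.
print(len(to_digits(2 ** 20000)))  # 667

# The digit size the running interpreter actually uses.
print(sys.int_info.bits_per_digit)
```

The least-significant-first order mirrors how the ob_digit array is laid out, which is what makes grade-school addition with a carry straightforward.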
April 1, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Special Guest: Matt Harrison Topic #0: Quick chat about COVID 19. What does your world look like? Amusing to see news channels, daily shows, etc, learning what we podcasters have figured out years ago Brian #1: Dictionary Merging and Updating in Python 3.9 Yong Cui, Ph.D. Python 3.9, scheduled for Oct release, will introduce new merge (|) and update (|=) operators, a.k.a. union operators Available in alpha 4 and later see also pep 584 # merge d1 = {'a': 1, 'b': 2} d2 = {'c': 3, 'd': 4} d3 = d1 | d2 # d3 is now {'a': 1, 'b': 2, 'c': 3, 'd': 4} # update d1 = {'a': 1, 'b': 2} d1 |= {'c': 3, 'd': 4} # d1 is now {'a': 1, 'b': 2, 'c': 3, 'd': 4} # last one wins if contention for both | and |= d1 = {'a': 1, 'b': 2} d1 |= {'a': 10, 'c': 3, 'd': 4} # d1 is now {'a': 10, 'b': 2, 'c': 3, 'd': 4} Matt #2: superstring An efficient library for heavy-text manipulation in Python, that achieves a remarkable memory and CPU optimization. Uses Rope (data structure) and optimization techniques. Performance comparisons for 50,000 char text memory: 1/20th speed: 1/5th Features Fast and Memory-optimized Rich API concatenation (a + b) len() and .length() indexing slicing strip lower upper Similar functionalities to python built-in string Easy to embed and use. I wonder if any of these optimizations could be brought into CPython Beware, it’s lacking tests Michael #3: New pip resolver to roll out this year via PyCoders The developers of pip are in the process of developing a new resolver for pip (as announced on the PSF blog last year). As part of that work, there will be some major changes to how pip determines what to install, based on package requirements. What will change: It will reduce inconsistency: it will no longer install a combination of packages that is mutually inconsistent. 
It will be stricter - if you ask pip to install two packages with incompatible requirements, it will refuse (rather than installing a broken combination, like it does now). What you can do to help: First and most fundamentally, please help us understand how you use pip by talking with our user experience researchers. Even before we release the new resolver as a beta, you can help by running pip check on your current environment. Please make time to test the new version of pip, probably in May. Spread the word! And if you develop or support a tool that wraps pip or uses it to deliver part of your functionality, please make time to test your integration with our beta in May. Matt #4: Covid-19 Data Think global act local Problem - No local data Made my own plots - current status no predictions ML works ok for basic model Implementing SIR Model with ordinary differential equations scipy odeint function Brian #5: Why does all() return True if the iterable is empty? Carl Johnson Q: “Why does all() return True if the iterable is empty? Shouldn’t it return False just like if my_list: would evaluate to False if the list is empty? What’s the thinking behind it returning True?” Lesson 1: “… basically doesn’t matter. The Python core team chose to make all([]) return True, and whatever their reasons, you can program your way around by adding wrapper functions or if tests.” Lesson 2: “all unicorns are blue” Lesson 3: “This is literally a 2,500 year old debate in philosophy. The ancients thought “all unicorns are blue” should be false because there are no unicorns, but modern logic says it is true because there are no unicorns that aren’t blue. Python is just siding with modern predicate logic, but your intuition is also quite common and was the orthodox position until the last few hundred years.” Blog post goes into teaching about predicate logic, Socrates, Aristotelian syllogisms, and such. And, really, no answer to why. But now, I’ll never forget that all([]) == True.
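A quick sketch of the behavior, plus one way to "program your way around" it as Lesson 1 suggests. The wrapper name `all_and_nonempty` is invented for this example and is not from the blog post.

```python
def all_and_nonempty(iterable):
    """Like all(), but returns False when the iterable is empty."""
    seen_any = False
    for item in iterable:
        seen_any = True
        if not item:
            return False   # found a falsy item, same as all()
    return seen_any        # False only if the loop never ran


print(all([]))                   # True: no element is falsy ("all unicorns are blue")
print(any([]))                   # False: no element is truthy
print(all_and_nonempty([]))      # False: the empty case is rejected explicitly
print(all_and_nonempty([1, 2]))  # True
```

Because the wrapper walks the iterable once, it also works for generators, where a separate `if my_list:` check would not.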
Michael #6: pytest-monitor written by Jean-Sébastien Dieu pytest plugin for analyzing resource usage during test sessions Analyze your resources consumption through test functions: memory consumption time duration CPU usage Keep a history of your resource consumption measurements. Compare how your code behaves between different environments. Usage: Simply run pytest as usual: pytest-monitor is active by default as soon as it is installed. After running your first session, a .pymon sqlite database will be accessible in the directory where pytest was run. You will need a valid Python 3.5+ interpreter. To get measures, we rely on: psutil to extract CPU usage memory_profiler to collect memory usage and pytest (obviously!) Extras: Michael: switchlang is now on pypi : pip install switchlang markdown-subtemplate is now on pypi: pip install markdown-subtemplate Joke: Light timer fix: https://twitter.com/Sarcastic_Pharm/status/1238060786658009089
March 26, 2020
Sponsored by us! Talk Python courses & pytest book. Topic #0: Quick chat about COVID 19. Brian #1: Documentation as a way to build Community Melissa Mendonça “… educational materials can have a huge impact and effectively bring people into the community.” Quality documentation for OSS is often lacking due to: decentralized development documentation is not as glamorous or as praised as new features or major bug fixes “Even when the community is welcoming, documentation is often seen as a "good first issue", meaning that the docs end up being written by the least experienced contributors in the community.” Possible solution: organize/re-organize docs into: tutorials how-tos reference guide explanations consequences: Improving on the quality and discoverability Clear difference between docs aimed at different users Give users more opportunities to contribute, generating content that can be shared directly on the official documentation Building a documentation team as a first-class team in the project, which helps create an explicit role as documentation creator. This helps people better identify how they can contribute beyond code. Diversifying our contributor base, allowing people from different levels of expertise and different life experiences to contribute. This is also extremely important so that we have a better understanding of our community and can be accessible, unbiased and welcoming to all people. Referenced in article: "What nobody tells you about documentation" Michael #2: The Django Speed Handbook: making a Django app faster By Shibel Mansour Speed of your app is very important: 100ms is an eternity. SEO, user conversions, bounce rates, etc. Use the tried-and-true django-debug-toolbar. Analyze your request/response cycles and see where most of the time is spent. Provides database query execution times and provides a nice SQL EXPLAIN in a separate pane that appears in the browser. 
ORM/Database: Two ORM functionalities I want to mention first: these are select_related and prefetch_related. Nice 24x perf improvement example in the article. Basically, beware of the N+1 problem. Indexes: Be sure to add them but they slow writes. Pagination: Use it if you have lots of data Async / background tasks. Content size: Shrunk 9x by adding gzip middleware Static files: minify and bundle as you can, cache, serve through nginx, etc. At Python Bytes, Talk Python, etc, we use webassets, cssmin, and jsmin. PageSpeed from Google, talk python’s ranking. ImageOptim (for macOS, others) Lazy-loading images: Lazily loading images means that we only request them when or a little before they enter the client’s (user’s) viewport. With excellent, dependency-free JavaScript libraries like LazyLoad, there really isn’t an excuse to not lazy-load images. Moreover, Google Chrome natively supports the lazy attribute. Remember: Test and measure everything, before and after. Brian #3: dacite: simplifies creation of data classes from dictionaries Konrad Hałas dataclasses are awesome quick and easy fields can have default values be excluded from comparison and/or repr and more data often gets to us in dictionaries Converting from dict to dataclass is trivial for trivial cases: x = MyClass(**data_as_dict) For more complicated conversions, you need dacite dacite.from_dict supports: nested structures optional fields and unions collections type_hooks, which allow you to have custom converters for certain types strict mode. Normally allows extra input data that is just ignored if it doesn’t match up with fields. But you can use strict to not allow that. Raises exceptions when something weird happens, like the wrong type, missing values, etc. Michael #4: How we retired Python 2 and improved developer happiness By Barry Warsaw The Python Clock is at 0:00. In 2018, LinkedIn embarked on a multi-quarter effort to fully transition to a Python 3 code base. 
In total, the effort entailed the migration of about 550 code repositories. We don't use Python in our product or as a monolithic web service, and instead have hundreds of independent microservices and tools, and dozens of supporting libraries, all owned by independent teams in separate repositories. In the early days, most internal libraries were ported to be “bilingual,” meaning they could be used in either Python 2 or 3. Given that the migration affected all of LinkedIn engineering across so many disparate teams and thousands of engineers, the effort was overseen by our Horizontal Initiatives (HI) program. Phase 1: In the first quarter of 2019, we performed detailed dependency graphing, identifying a number of repositories that were more foundational, and thus needed to be fully ported first because they blocked the ports of everything that depended on them. Phase 2: In the second quarter of 2019, we identified the remainder of repositories that needed porting. Post-migration reflections: Our primary indicator for completing the migration of a multiproduct was that it built successfully and passed its unit and integration tests. For other organizations planning or in the midst of their own migration paths, we offer the following guidelines: Plan early, and engage your organization’s Python experts. Find and leverage champions in your affected teams, and promote the benefits of Python 3. Adopt the bilingual approach to supporting libraries so that consumers of your libraries can port to Python 3 on their own schedules. Invest in tests and code coverage; these will be your best success metrics. Ensure that your data models are explicit and clear, especially in identifying which data are bytes and which are human-readable text. Benefits: We no longer have to worry about supporting Python 2 and have seen our support loads decrease. We can now depend on the latest open source libraries and tools, and are free from the constrictions of having to write bilingual Python.
Opportunistically and enthusiastically adopting type hinting and the mypy type checker, improving the overall quality, craft, and readability of Python code bases. Brian #5: The Troublesome Active Record Pattern Cal Paterson "Object relational mappers" (ORMs) exist to bridge the gap between the programmers' friend (the object), and the database's primitive (the relation). Examples include Django ORM and SQLAlchemy The Active Record pattern of data access is marked by: A whole-object basis Access by key (mostly primary key) Problem: Queries that don’t need all information for objects retrieve it all anyway, and it’s easy to code for loops to select or collect info that are wildly inefficient. how many books are there how many books about software testing written by Oregon authors Problem: transactions. people can forget to use transactions, some ORMs don’t support them, they are not taught in beginner tutorials, etc. SQLAlchemy has sessions Django has atomic() REST APIs can suffer the same problems. Solutions: just use SQL first class queries first class transactions avoid Active Record style access patterns Be careful with REST APIs Alternatives: GraphQL RPC-style APIs Michael #6: Types at the edges in Python By Steve Brazier For a new web service in python there are 3 things to start with: Pydantic mypy Production error tracking of some kind Why: Because what is this about? AttributeError: 'NoneType' object has no attribute 'strip' It should be: none is not an allowed value (type=type_error.none.not_allowed) We then launch this code into production and our assumptions are tested against reality. If we’re lucky our assumptions turn out to be correct. If not we likely encounter some cryptic NoneType errors like the one at the start of this post. Pydantic can help by formalizing our assumptions. mypy carries on helping: Once you see the error at the start of this post (thanks error reporting) you know what is wrong about assumptions. 
Make the following change to your code: field: typing.Optional[str] BTW: FastAPI integrates with Pydantic out of the box. A mini-kata like exercise here that can be worked through: meadsteve/types-at-the-edges-minikata Extras: Michael: Python Bytes Awesome Package List by Jack Mckew Visual Basic Will Stall Out With .NET 5 COVID 19 data sets New course in dev: Adding a CMS to Your Data-Driven Web App [in Pyramid|Flask] Joke: https://trello-attachments.s3.amazonaws.com/58e3f7c543422d7f3ad84f33/5e5ff59ebac11305019c191c/73ba3e752f0d132242e626d8ffc53cf2/docs.jpg
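On the "Types at the edges" item (#6) above: a stdlib-only sketch of the idea, checking assumptions where data enters the program so a bad value fails with a clear message instead of a later NoneType error. This is a hand-rolled stand-in for what Pydantic does for you, not Pydantic's API; `SignupPayload`, `parse_signup`, and `display_name` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SignupPayload:
    name: str
    nickname: Optional[str] = None   # explicitly allowed to be missing


def parse_signup(raw: dict) -> SignupPayload:
    """Validate untrusted input at the edge and fail with a readable error."""
    name = raw.get("name")
    if name is None:
        raise ValueError("name: none is not an allowed value")
    if not isinstance(name, str):
        raise ValueError("name: str type expected")
    nickname = raw.get("nickname")
    if nickname is not None and not isinstance(nickname, str):
        raise ValueError("nickname: str type expected")
    return SignupPayload(name=name.strip(), nickname=nickname)


def display_name(payload: SignupPayload) -> str:
    # A type checker such as mypy would flag an unguarded payload.nickname.strip()
    # here, because nickname is Optional[str]; the None check makes it safe.
    if payload.nickname is not None:
        return payload.nickname.strip()
    return payload.name


print(display_name(parse_signup({"name": "  Ada  "})))
```

With a real Pydantic model the manual checks go away, but the shape is the same: declare the assumption once, enforce it at the boundary.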
March 19, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Advanced usage of Python requests - timeouts, retries, hooks Dani Hodovic, @DaniHodovic “While it's easy to immediately be productive with requests because of the simple API, the library also offers extensibility for advanced use cases. If you're writing an API-heavy client or a web scraper you'll probably need tolerance for network failures, helpful debugging traces and syntactic sugar.” Lots of cool tricks I didn’t know you could do with requests. Using hooks to call raise_for_status() on every call. Using sessions and setting base URLs Setting default timeouts with transport adapters Retry on failure, with gobs of configuration options. Combining timeouts and retries Debugging http requests by printing out headers or printing everything. Testing and mocking requests Mimicking browser behaviors by overriding the User-Agent request header Michael #2: Fluent Assertions Via Dean Agan fluentcheck helps you reduce the lines of code by providing a human-friendly and fluent way to make assertions. Example (for now): def my_function(n, obj): assert n is not None assert isinstance(n, float) assert 0. < n < 1 assert obj is not None assert isinstance(obj, MyCustomType) can be def my_function(n, obj): Check(n).is_not_None().is_float().is_between(0., 1.) Check(obj).is_not_None().is_subtype_of(MyCustomType) With a PR I’m working on (now accepted), it’ll support: def my_function(n, obj): Is(n).not_none.float.between(0., 1.) Is(obj).not_none.subtype_of(MyCustomType) Brian #3: Python in GitHub Actions Hynek Schlawack, @hynek “for an open source Python package, … my current recommendation for most people is to switch to GitHub Actions for its simplicity and better integration.” vs Azure Pipelines. Article describes how to get started and some basic configuration for: Running tests through tox, including coverage, for multiple Python versions. Including yml config and tox.ini changes necessary.
Nice reminder to clean out old configurations for other CIs. Combining coverage reports and pushing code coverage info to Codecov Building the package. Running twine check to check the long description. Checking the install on Linux, Windows, and Mac Related: How to write good quality Python code with GitHub Actions Michael #4: VCR.py via Tim Head VCR.py simplifies and speeds up tests that make HTTP requests. The first time you run code that is inside a VCR.py context manager or decorated function, VCR.py records all HTTP interactions that take place through the libraries it supports and serializes and writes them to a flat file (in yaml format by default). Intercept any HTTP requests that it recognizes from the original test run and return the responses that corresponded to those requests. This means that the requests will not actually result in HTTP traffic, which confers several benefits including: The ability to work offline Completely deterministic tests Increased test execution speed If the server you are testing against ever changes its API, all you need to do is delete your existing cassette files, and run your tests again. Test and Code 102 pytest-vcr: pytest plugin for managing VCR.py cassettes @pytest.mark.vcr() def test_iana(): response = urlopen('http://iana.org/domains/reserved').read() assert b'Example domains' in response Brian #5: 8 Coolest Python Programming Language Features Jeremy Grifski, @RenegadeCoder94 Nice reminder of why I love Python and things I miss when I use other languages. The list list comprehensions generator expressions slice assignment iterable unpacking negative indexing dictionary comprehensions chaining comparisons f-strings Michael #6: Bento Find Python web-app bugs delightfully fast, without changing your workflow Find bugs that matter: Checks find security and reliability bugs in your code. They’re vetted across thousands of open source projects and never nit your style. 
Upgrade your tooling: You don’t have to fix existing bugs to adopt Bento. It’s diff-centric, finding new bugs introduced by your changes. And there’s zero config. Go delightfully fast: Run Bento automatically locally or in CI. Either way, it runs offline and never sends your code anywhere. Checks: https://bento.dev/checks/ Joke: https://trello-attachments.s3.amazonaws.com/58e3f7c543422d7f3ad84f33/5e5ff5b454e93258e907753b/ecd7567c50cc0d073820bf961f489365/debugging.jpg
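For the "8 Coolest Python Programming Language Features" item (#5) above, a compact sketch that touches each of the eight features in turn. The data is made up; nothing here is taken from the article itself.

```python
numbers = [1, 2, 3, 4, 5, 6]

# 1. list comprehension
squares = [n * n for n in numbers]

# 2. generator expression (nothing is materialized as a list)
evens_total = sum(n for n in numbers if n % 2 == 0)

# 3. slice assignment: replace two items with one, in place
letters = list("abcdef")
letters[1:3] = ["X"]

# 4. iterable unpacking with a starred target
first, *middle, last = numbers

# 5. negative indexing
last_item = numbers[-1]

# 6. dictionary comprehension
index_of = {ch: i for i, ch in enumerate("abc")}

# 7. chained comparison
in_range = 0 < last_item <= 6

# 8. f-string
summary = f"{first} to {last}: total {sum(numbers)}"

print(squares, evens_total, letters, middle, index_of, in_range, summary)
```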
March 13, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: Python in Production Hynek A key part is missing from the public Python discourse and I would like to help to change that. Hynek was listening to a podcast about running Python services in production. Although I disagreed with some of the choices they made, it acutely reminded me about what I’ve been missing in the past years from the public Python discourse. And yet despite the fact that the details aren’t relevant to me, the mindsets, thought processes, and stories around it captivated me and I happily listened to it on my vacation. Python conferences were a lot more like this. I remember startups and established companies alike talking about running Python in production, lessons learned, and so on. (Instagram and to a certain degree Spotify being notable exceptions) An Offer: So in a completely egoistical move, I would like to encourage people who do interesting stuff with Python to run websites or some kind of web and network services to tell us about it at PyCons, meetups, and in blogs. Dan Bader and I covered this back on Talk Python, episode 215. Brian #2: How to cheat at unit tests with pytest and Black Simon Willison Premise: “In pure test-driven development you write the tests first, and don’t start on the implementation until you’ve watched them fail.” That’s too slow, so … “cheat”: write a pytest test that calls the function you are working on and compares the return value to something obviously wrong; when it fails, copy the actual output and paste it into your test; now it should pass; run black to reformat the huge return value to something manageable. Brian’s comments: That’s turning exploratory and manual testing into automated regression tests, not cheating. There is no “pure test-driven development”, we still can’t agree on what a unit is or if mocks are good or evil.
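The "cheat" workflow can be sketched roughly like this. `word_lengths` is a made-up function standing in for whatever you are working on, and the draft is deliberately named without the `test_` prefix so pytest does not collect the failing version.

```python
def word_lengths(sentence):
    """Hypothetical function under development."""
    return {word: len(word) for word in sentence.split()}


# Step 1: compare against something obviously wrong and watch it fail.
# pytest's failure report shows the actual return value.
def draft_test_word_lengths():
    assert word_lengths("pytest and black") == {}


# Step 2: paste the actual output from the failure into the test
# (then run black on the file if the pasted literal is huge).
def test_word_lengths():
    assert word_lengths("pytest and black") == {
        "pytest": 6,
        "and": 3,
        "black": 5,
    }
```

The pasted value still deserves a read before committing: the test now pins whatever the function returned, bugs included.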
Michael #3: Goodbye Microservices: From 100s of problem children to 1 superstar Retrospective by Alexandra Noonan Javascript but the lessons are cross language Microservices is the architecture du jour Segment adopted this as a best practice early-on, which served us well in some cases, and, as you’ll soon learn, not so well in others. Microservices is a service-oriented software architecture in which server-side applications are constructed by combining many single-purpose, low-footprint network services. Touted benefits are improved modularity, reduced testing burden, better functional composition, environmental isolation, and development team autonomy. Instead of enabling us to move faster, the small team found themselves mired in exploding complexity. Essential benefits of this architecture became burdens. As our velocity plummeted, our defect rate exploded. Her post is the story of how we took a step back and embraced an approach that aligned well with our product requirements and needs of the team. Brian #4: Helium Michael #5: uncertainties package From Tim Head on upcoming Talk Python Binder episode. Do you know how uncertainty flows through calculations? Example: Jane needs to calculate the volume of her pool, so that she knows how much water she'll need to fill it. She measures the length, width, and height: length L = 5.56 +/- 0.14 meters = 5.56 m +/- 2.5% width W = 3.12 +/- 0.08 meters = 3.12 m +/- 2.6% depth D = 2.94 +/- 0.11 meters = 2.94 m +/- 3.7% One can find the percentage uncertainty in the result by adding together the percentage uncertainties in each individual measurement: percentage uncertainty in volume = (percentage uncertainty in L) + (percentage uncertainty in W) + (percentage uncertainty in D) = 2.5% + 2.6% + 3.7% = 8.8% We don’t want to deal with these manually! So we use the uncertainties package. Example of using the library: >>> from uncertainties import ufloat >>> from uncertainties.umath import * # sin(), etc. 
>>> x = ufloat(1, 0.1) # x = 1+/-0.1 >>> print(2*x) 2.00+/-0.20 >>> sin(2*x) # In a Python shell, "print" is optional 0.9092974268256817+/-0.08322936730942848 Brian #6: Personalize your python prompt Arpit Bhayani Those three >>> in the interactive Python prompt: you can muck with those by changing sys.ps1. Fun. But you can also implement dynamic behavior by creating a class and putting code in the __str__ method. Very clever. Note to self: task for the day: reproduce the windows command prompt with directory listing and slashes in the other direction. Extras: Michael: Now that Python for Absolute Beginners is out, starting on a new course: Hybrid Data-Driven + CMS web apps. Joke: A Python Editor Limerick via Alexander A. CODING ENVIRONMENT, IN THREE PARTS: To this day, some prefer BBEdit. VSCode is now getting some credit. Vim and Emacs are fine; so are Atom and Sublime. Doesn't matter much, if you don't let it. But wait! Let's not forget IDEs! Using PyCharm sure is a breeze! Komodo, Eclipse, and IDEA; CLion is my panacea, and XCode leaves me at ease. But Jupyter Notebook is also legit! Data scientists must prefer it. In the browser, you code; results are then showed. But good luck when you try to use git.
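For the prompt item (#6) above, a minimal sketch of the dynamic-prompt trick. It relies on the documented behavior that a non-string object assigned to `sys.ps1` has its `str()` re-evaluated each time the interactive interpreter shows a prompt. The class name `CwdPrompt` is made up, and this only has a visible effect in an interactive session (for example when loaded from a startup file).

```python
import os
import sys


class CwdPrompt:
    """Prompt object whose text is recomputed every time it is displayed."""

    def __str__(self):
        # Show the current working directory, Windows-prompt style.
        return f"{os.getcwd()}> "


# The interpreter calls str(sys.ps1) before reading each new command,
# so the prompt follows os.chdir() calls made during the session.
sys.ps1 = CwdPrompt()

print(str(sys.ps1))
```

Flipping the slashes for the "note to self" is one `str.replace("/", "\\")` away.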
March 5, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: David Amos David #1: PEP 614 – Relaxing Grammar Restrictions on Decorators Python currently requires that all decorators consist of a dotted name, optionally followed by a single call. E.g., can’t use subscripts or chained calls PEP proposes allowing any valid expression. Motivation for limitation is not a technical requirement: “I have a gut feeling about this one. I'm not sure where it comes from, but I have it... So while it would be quite easy to change the syntax in the future, I'd like to stick to the more restricted form unless a real use case is presented where [changing the syntax] would increase readability.” (Guido van Rossum, Source) Use case highlighted by PEP: List of Qt buttons: buttons = [button0, button1, …] Decorator is a method on a class attribute: button.clicked.connect Under current restrictions you can’t do @buttons[0].clicked.connect Workarounds involve assigning list element to a variable first: button0 = buttons[0] @button0.clicked.connect Author points out grammar is already loose enough to hack around: Define function def _(x): return x Then use _ as your decorator: @_(buttons[0].clicked.connect) That’s less readable than just using the subscript PEP proposes relaxing grammar to “any valid expression” (sort of), i.e. anything that you can use as a test in if, elif, or while blocks (as opposed to valid string input to eval) Some things wouldn’t be allowed, though E.g., tuples require parentheses, @f, g doesn’t make sense Does a tuple as a decorator make sense in the first place, though? CPython implementation on GitHub: https://github.com/brandtbucher/cpython/tree/decorators Michael #2: Create a macOS Menu Bar App with Python (Pomodoro Timer) by Camillo Visini Nice article: Learn how to create your very own macOS Menu Bar App using Python, rumps and py2app The mac menu bar is super useful. I leverage the heck out of this thing. Why not write Python for it?
Tools: Python 3 and PyCharm as an IDE Rumps → Ridiculously Uncomplicated macOS Python Statusbar apps py2app → For creating standalone macOS apps from Python code (how cool is that?) Get started with the code: app = rumps.App("Pomodoro", "🍅") app.run() Then easily use Py2App to convert this into a full macOS app. Would love to see somebody try to submit one of these to the mac app store. Brian #3: Conditional Coverage Nikita Sobolev - CTO of wemake.services announcement post, repo suggested from @OpensourceF: https://twitter.com/OpensourceF/status/1232264318323957760 From README.md: Conditional coverage based on any rules you define! Some projects have different parts that rely on different environments: Python version, some code is only executed on specific versions and ignored on others OS version, some code might be Windows, Mac, or Linux only External packages, some code is only executed when some 3rd party package is installed Traditional method: combine coverage data before reporting. This works ok on CI systems or with tox for multiple Python/package versions. Doesn’t help much locally if the split you want is due to OS dependencies. Requires multiple test runs to get full coverage. New coverage plugin allows you to maintain coverage while developing locally: a single test run and a reasonable coverage report. So cool. Recommend keeping conditionals to a minimum and somewhat isolated. I wouldn’t want this all over my code base. Still want real full coverage on CI.
David #4: Pycel – A library for compiling excel spreadsheets to python code & visualizing them as a graph Compile an Excel file with formulas as a Python object The compiler converts formulas in the spreadsheet to executable code Once compiled, you can set values for cells and inspect the output in other cells This is all happening in Python now, not touching Excel anymore You can visualize all of the formulas as a graph to explore how formulas depend on one another The author of the package wrote it to solve a problem in civilian aerospace engineering Blog post here: https://dirkgorissen.com/2011/10/19/pycel-compiling-excel-spreadsheets-to-python-and-making-pretty-pictures/ From 2011, but still relevant! Finally, with all the formulas compiled, the package can solve for variables using an optimization process In original use case this was to optimize engineering parameters to produce aircraft that could actually fly Author describes how using Python he increased the cases that could be optimized from 65% to 98% and reduced calculation time from 10 minutes to around 30 seconds to 1 minute. Michael #5: markdown-subtemplate A template engine to render Markdown with external template imports and basic variable replacements. Choice between data-driven server apps (typical Flask app), CMSes that let us edit content on the web such as WordPress, and even flat file systems like Pelican. This should not be a black and white decision. Here's how it works: You write standard markdown files for content. Markdown files can be shared and imported into your top-level markdown. Fragments of HTML can be used when css classes and other specializations are needed, but generally HTML is avoided. A dictionary of variables and their values to replace in the merged markdown is processed. Markdown content is converted to HTML and embedded in your larger site layout (e.g. within a Jinja2 template).
Markdown transforms are cached to achieve very high performance regardless of the complexity of the content. Extensible logging and caching. Extensible storage coming soon. PRs and contributions are welcome. More to come Brian #6: FlakeHell wemake.services, from Conditional Coverage, also makes the wemake-python-styleguide, and recommends using FlakeHell Allows you to configure flake8 and plugins more easily in pyproject.toml files. Provides a ramp to start using linting tools with “legacy first”: flakehell baseline > .flakehell_baseline specify that file in your pyproject.toml flakehell lint will run your linting tools and only report new failures you can start fixing older stuff later, or just apply style guide to new code. Lots of awesome shortcuts for configuration with wildcards and such. Can specify a shared config in one repo and use it in multiple projects as a starting point with local changes. FlakeHell: It's a Flake8 wrapper to make it cool. Shareable and remote configs. Legacy-friendly: ability to get report only about new errors. Caching for much better performance. Use only specified plugins, not everything installed. Manage codes per plugin. Enable and disable plugins and codes by wildcard. Make output beautiful. pyproject.toml support. Show codes for installed plugins. Show all messages and codes for a plugin. Check that all required plugins are installed. Syntax highlighting in messages and code snippets. PyLint integration. Allow codes intersection for different plugins. Extras: Brian: Lots of great new content weekly on Test & Code Podcast Michael Qt follow up Moon base geekout David: PyTexas 2020 Registration Opening Registration page Joke: Why does it work!?!
February 25, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean - $100 credit for new users to build something awesome. Michael #1: Python visualization graph via Prayson Daniel The PyViz.org website is an open platform for helping users decide on the best open-source (OSS) Python data visualization tools for their purposes, with links, overviews, comparisons, and examples. Overviews of the OSS visualization packages High-level tools for getting started A live table for comparing maturity, popularity, and support. Dashboarding tools SciVis tools for rendering data embedded in three-dimensional space. Tutorials Topic examples of using Python viz tools to analyze or describe specific datasets Brian #2: Awesome Zen of Python A Rabbit Hole lot of Zen yes, I know, that’s a terrible mixed metaphor List of articles on “the Zen of Python” Well, articles, talks, tools, and “other?” Al Sweigart: The Zen of Python, Explained is a nice quick reference. Moshe Zadka: Meditations on the Zen of Python is slightly longer, but good and still a quick read. One line (“There should be one-- and preferably only one --obvious way to do it.”) is a joke making fun of pre-decrement, post-decrement in C. Abdur-Rahmaan Janhangeer: The Zen Of Python Is A Joke And Here Is Why is a must read. Michael #3: Jupytext via Matt Harrison Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts Wished Jupyter notebooks were plain text documents? Wished you could edit them in your favorite IDE? And get clear and meaningful diffs when doing version control? Then... Jupytext may well be the tool you're looking for! Jupytext can save Jupyter notebooks as Markdown and R Markdown documents, or as scripts in many languages. The languages that are currently supported by Jupytext are: Julia, Python, R, Bash, Scheme, Clojure, Matlab, Octave, C++, q/kdb+, IDL, TypeScript, Javascript, Scala, Rust/Evxcr, PowerShell, C#, F#, and Robot Framework.
Brian #4: Tour of Python Itertools Martin Heinz Very cool quick look at some of the cool-ness to be found in itertools and more_itertools. itertools compress - one iterator to another eliminating elements that fail a bool expression accumulate - like functools.reduce but returns all intermediate values cycle - so cool, create a never ending repeating iterable tee - multiple references to one iterable more_itertools divide - divides iterable into sub-iterables partition - split into two based on a predicate bool expression side_effect - attach a side effect function to an iterable that gets called with each element collapse - like flatten split_at - multiple iterables splitting at divider items, specified with predicate bucket - multiple iterables based on multi-return-value expression map_reduce - specify 3 functions: key function (for categorizing), value function (for transforming) and finally reduce function (for reducing). sort_together seekable filter_except unique_to_each Michael #5: justpy.io JustPy is an object-oriented, component based, high-level Python Web Framework that requires no front-end programming. JustPy has no front-end/back-end distinction. All programming is done on the back-end allowing a simpler, more productive, and more Pythonic web development experience. JustPy removes the front-end/back-end distinction by intercepting the relevant events on the front-end and sending them to the back-end to be processed. Elements on the web page are instances of component classes. A component in JustPy is a Python class that allows you to instantiate reusable custom elements whose functionality and design is encapsulated away from the rest of your code. Custom components can be created using other components as building blocks. Out of the box, JustPy comes with support for HTML and SVG components as well as more complex components such as charts and grids. 
Supports most of the components and the functionality of the Quasar library Based on solid libraries: Starlette, uvicorn, and Vue.js. Brian #6: Modularity for Maintenance Glyph A list of many automation tools you can use to help with the maintenance of open source projects. CI, tox, linting, type checking, dependencies, security, coverage, formatting, releasing with lots of options and links A request for some kind of tool to help automate all the automation when starting new projects. Maybe a cookie-cutter thing…. That would be cool. But frankly, the list is super helpful also. Extras: Brian: Sentry helping fund some OSS projects. black, pypi, pytest, structlog, gimli (last one is a Rust thing). Michael: Just launched a new 7.2 hour course: Python for absolute beginners Talk Python Training now streaming newest courses in HiDPI (nearly 4K) and it’s super crisp. More details here. AWS Cloud has decided to no longer publish awscli to #pypi pulling a 700M+ download package (via Anthony Sottile) The podcast RSS feed is a little smaller now. Joke: First law of software quality: e = mc^2 → errors = (more code)^2.
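For the itertools item (#4) above, a short sketch of the four standard-library functions Brian called out. The inputs are made up; `more_itertools` is a separate third-party package and is left out here.

```python
from itertools import accumulate, compress, cycle, islice, tee

# compress: keep items whose matching selector is truthy
kept = list(compress("ABCDEF", [1, 0, 1, 0, 1, 1]))

# accumulate: like functools.reduce, but yields every intermediate value
running_totals = list(accumulate([1, 2, 3, 4, 5]))

# cycle: repeats forever, so take a finite slice of it
pattern = list(islice(cycle("ab"), 5))

# tee: two independent iterators over one underlying iterable;
# offsetting the second by one gives adjacent pairs
left, right = tee(iter([1, 2, 3]))
pairs = list(zip(left, islice(right, 1, None)))

print(kept, running_totals, pattern, pairs)
```

Note that `compress` takes a sequence of selectors rather than a predicate; for a predicate, `filter` or a generator expression is the closer fit.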
February 19, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: D-Tale suggested by @davidouglasmit via twitter “D-Tale is the combination of a Flask back-end and a React front-end to bring you an easy way to view & analyze Pandas data structures. It integrates seamlessly with ipython notebooks & python/ipython terminals. Currently this tool supports such Pandas objects as DataFrame, Series, MultiIndex, DatetimeIndex & RangeIndex.” way cool UI for visualizing data Live Demo shows Describe shows column statistics, graph, and top 100 values filter, correlations, charts, heat map Michael #2: Carnets by Nicolas Holzschuch A standalone Jupyter notebooks implementation for iOS. The power of Jupyter notebooks. In your pocket. Anywhere. Everything runs on your device. No need to setup a server, no need for an internet connection. Standard packages like Numpy, Matplotlib, Sympy and Pandas are already installed. You're ready to edit notebooks. Carnets uses iOS 11 filesharing ability. You can store your notebooks in iCloud, access them using other apps, share them. Extended keyboard on iPads, you get an extended toolbar with basic actions on your keyboard. Install more packages: Add more Python packages with %pip (if they are pure Python). OpenSource: Carnets is entirely OpenSource, and released under the FreeBSD license. Brian #3: BeeWare Podium suggested by Katie McLaughlin, @glasnt on twitter NOT a pip install, download a binary from https://github.com/beeware/podium/releases Linux and macOS Still early, so you gotta do the open and trust from the apps directory thing for running stuff not from the app store. But Oh man is it worth it. HTML5 based presentation frameworks are cool. run a presentation right in your browser. 
My favorite has been remark.js presenter mode, notes are especially useful while practicing a talk running timer super helpful while giving a talk write talk in markdown, so it’s super easy to version control issues: presenter mode, full screen, with extended monitor hard to do. notes and timer on laptop, full presentation on extended screen super cool but requires full screening with mouse Podium uses similar syntax as remark.js and I think uses remark under the hood. but it’s a native app, not a browser Handles the presenter mode and extended screen smoothly, like keynote and others. Removes the need for boilerplate html in your markdown file (remark.js md files have cruft). Can’t wait to try this out for my next presentation Michael #4: pytest-mock-resources via Daniel Cardin pytest fixture factories to make it easier to test against code that depends on external resources like Postgres, Redshift, and MongoDB. Code which depends on external resources such as databases (postgres, redshift, etc) can be difficult to write automated tests for. Conventional wisdom might be to mock or stub out the actual database calls and assert that the code works correctly before/after the calls. Whether the actual query did the correct thing truly requires that you execute the query. Having tests depend upon a real postgres instance running somewhere is a pain, very fragile, and prone to issues across machines and test failures. Therefore pytest-mock-resources (primarily) works by managing the lifecycle of docker containers and providing access to them inside your tests. Brian #5: How James Bennett is testing in 2020 Follow up from Testing Django applications in 2018 Favors unittest over pytest. tox for testing over multiple Django and Python versions, including tox-travis plugin pyenv for local Python installation management and pyenv-virtualenv plugin for venvs. Custom runtests.py for setting up environment and running tests. Changed to src/ directory layout.
Coverage and reporting failure if coverage dips, with a healthy perspective: “… this isn’t because I have 100% coverage as a goal. Achieving that is so easy in most projects that it’s meaningless as a way to measure quality. Instead, I use the coverage report as a canary. It’s a thing that shouldn’t change, and if it ever does change I want to know, because it will almost always mean something else has gone wrong, and the coverage report will give me some pointers for where to look as I start investigating.” Testing is more than tests, it’s also black, isort, flake8, mypy, and even spell checking sphinx documentation. Using tox.ini for utility scripts, like cleanup, pipupgrade, … Michael #6: Python and PyQt: Building a GUI Desktop Calculator by Leodanis Pozo Ramos at realpython Some interesting take-aways: Basics of PyQt Widgets: QWidget is the base class for all user interface objects, or widgets. These are rectangular-shaped graphical components that you can place on your application’s windows to build the GUI. Layout Managers: Layout managers are classes that allow you to size and position your widgets at the places you want them to be on the application’s form. Main Windows: Most of the time, your GUI applications will be Main Window-Style. This means that they’ll have a menu bar, some toolbars, a status bar, and a central widget that will be the GUI’s main element. Applications: The most basic class you’ll use when developing PyQt GUI applications is QApplication. This class is at the core of any PyQt application. It manages the application’s control flow as well as its main settings. Signals and Slots: PyQt widgets act as event-catchers. Widgets always emit a signal, which is a kind of message that announces a change in its state. Due to Qt licensing, you can only use the free version for non-commercial projects or internal, non-redistributed ones; otherwise you need to purchase a commercial license for $5,500/yr/dev.
Extras Brian PyCascades 2020 livestream videos of day 1 & day 2 are available. Huge shout-out and thank you to all of the volunteers for this event. In particular Nina Zakharenko for calming me down before my talk. Michael Recording for Python for .NET devs webcast available. Take some of our free courses with our mobile app. Joke Why do programmers confuse Halloween with Christmas? Because OCT 31 == DEC 25. Speed dating is useless. 5 minutes is not enough to properly explain the benefits of the Unix philosophy.
February 11, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Kojo Idrissa! Michael #1: donkeycar Have you ever seen a proper RC car race? Donkeycar is a minimalist and modular self driving library for Python. It is developed for hobbyists and students with a focus on allowing fast experimentation and easy community contributions. Use Donkey if you want to: Make an RC car drive itself. Compete in self driving races like DIY Robocars Experiment with autopilots, mapping computer vision and neural networks. Log sensor data (images, user inputs, sensor readings). Drive your car via a web or game controller. Leverage community contributed driving data. Use existing CAD models for design upgrades. Brian #2: RIP Pipenv: Tried Too Hard. Do what you need with pip-tools. Nick Timkovich No releases of pipenv in 2019. It “has been held back by several subdependencies and a complicated release process” main benefits of pipenv: pin everything and use hashes for verifying packages The two file concept (Pipfile, Pipfile.lock) is pretty cool and useful But we can do that with pip-tools command line tool pip-compile, which is also used by pipenv: pip-compile --generate-hashes --output-file requirements.txt requirements.in What about virtual environment support? python -m venv venv --prompt $(basename $PWD) or equivalent for your shell works fine, and it’s built in. Kojo #3: str.casefold() used for caseless matching “Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string.” especially helpful for Unicode characters firstString = "der Fluß" secondString = "der Fluss" # ß is equivalent to ss if firstString.casefold() == secondString.casefold(): print('The strings are equal.') else: print('The strings are not equal.') # prints "The strings are equal."
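To see why casefold() is described as more aggressive than lowercasing, here is a small side-by-side sketch. The word list is made up for illustration; only built-in str methods are used.

```python
# Three spellings of the same German word that should match caselessly
words = ["Straße", "STRASSE", "strasse"]

# lower() leaves the ß alone, so two distinct forms survive
lowered = {w.lower() for w in words}

# casefold() maps ß to ss, so every spelling collapses to one form
folded = {w.casefold() for w in words}

print(sorted(lowered))
print(sorted(folded))
```

The lowered set still holds two different strings, while the folded set holds one, which is the behavior you want for caseless matching.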
Michael #4: Virtualenv via Brian Skinn Virtualenv 20.0.0 beta1 is available Announcement by Bernat Gabor Why the major release I identified three main pain points: Creating a virtual environment is slow (takes around 3 seconds, even in offline mode; while 3 seconds does not seem that long if you need to create tens of virtual environments, it quickly adds up). The API used within PEP-405 is excellent if you want to create virtual environments; however, only that. It does not allow us to describe the target environment flexibly or to do that without actually creating the environment. The duality of virtualenv versus venv. Right, python3.4 has the venv module as defined by PEP-405. In theory, we could switch to that and forget virtualenv. However, it is not that simple. virtualenv offers a few benefits that venv does not Benefits over venv Ability to discover alternate versions (-p 2 creates a python 2 virtual environment, -p 3.8 a python 3.8, -p pypy3 a PyPy 3, and so on). virtualenv packages out of the box the wheel package as part of the seed packages, this significantly improves package installation speed as pip can now use its wheel cache when installing packages. You are guaranteed to work even when distributions decide not to ship venv (Debian derivates notably make venv an extra package, and not part of the core binary). Can be upgraded out of band from the host python (often via just pip/curl - so can pull in bug fixes and improvements without needing to wait until the platform upgrades venv). Easier to extend, e.g., we added Xonsh activation script generation without much pushback, support for PowerShell activation on POSIX platforms. Brian #5: Property-based tests for the Python standard library (and builtins) Zac Hatfield-Dodds and Paul Ganssle, so far. Goal: Find and fix bugs in Python, before they ship to users. “CPython's existing test suite is good, but bugs still slip through occasionally. We think that using property-based testing tools - i.e. 
Hypothesis - can help with this. They're no magic bullet, but computer-assisted testing techniques routinely try inputs that humans wouldn't think of (or bother trying), and turn up bugs that humans missed.” “Writing tests that describe every valid input often leads to tighter validation and cleaner designs too, even when no counterexamples are found!” “We aim to have a compelling proof-of-concept by PyCon US, and be running as part of the CPython CI suite by the end of the sprints.” Hypothesis and property based testing is superb to throw at algorithmic pure functions, and the test criteria is relatively straightforward for function pairs that have round trip logic, like tokenize/untokenize, encode/decode, compress/decompress, etc. And there’s probably tons of those types of methods in Python. At the very least, I’m interested in this to watch how other people are using hypothesis. Kojo #6: PyCon US Tutorial Schedule & Registration Find the schedule at https://us.pycon.org/2020/schedule/tutorials/ They tend to sell out FAST Videos are up fast afterwards What’s interesting to me? Migration from Python 2 to 3 Welcome to Circuit Python (Kattni Rembor) Intro to Property-Based Testing Minimum Viable Documentation (Heidi Waterhouse) Extras Michael: Foreword for Mastering Python Networking Pyramid (Waitress) and Django both issued security CVEs. You should upgrade! StackOverflow Survey 2020 is open. Go fill it out! Joke See the cartoon: https://trello-attachments.s3.amazonaws.com/58e3f7c543422d7f3ad84f33/5df14f77efb5642d017a593f/31cba5cdf0e9805d47837916555dd7ab/b5cb6570af72883f06c3dcbf47679e9d.jpg
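The round trip idea from Brian #5 above can be sketched without any third-party packages. This is not Hypothesis: it is a hand-rolled stand-in that throws seeded random byte strings at an encode/decode pair, with no shrinking or smart input generation. The helper names random_bytes and check_round_trip are hypothetical.

```python
import random
import zlib


def random_bytes(rng, max_len=64):
    """Build a random byte string of length 0..max_len (hypothetical helper)."""
    return bytes(rng.randrange(256) for _ in range(rng.randrange(max_len + 1)))


def check_round_trip(encode, decode, trials=200, seed=0):
    """Assert decode(encode(x)) == x for many random inputs; return the trial count."""
    rng = random.Random(seed)  # seeded so a failure is reproducible
    for _ in range(trials):
        payload = random_bytes(rng)
        assert decode(encode(payload)) == payload, payload
    return trials


checked = check_round_trip(zlib.compress, zlib.decompress)
print(f"round trip held for {checked} random inputs")
```

A real property-based tool adds much more on top of this, notably shrinking a failing input down to a minimal counterexample, but the property being stated is the same.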
February 3, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Vicki Boykis: @vboykis Michael #1: clize: Turn functions into command-line interfaces via Marcelo Follow up from Typer on episode 164. Features Create command-line interfaces by creating functions and passing them to [clize.run](https://clize.readthedocs.io/en/stable/api.html#clize.run). Enjoy a CLI automatically created from your functions’ parameters. Bring your users familiar --help messages generated from your docstrings. Reuse functionality across multiple commands using decorators. Extend Clize with new parameter behavior. I love how this is pure Python without its own API for the default case Vicki #2: How to cheat at Kaggle AI contests Kaggle is a platform, now owned by Google, that allows data scientists to find data sets, learn data science, and participate in competitions Many people participate in Kaggle competitions to sharpen their data science/modeling skills Recently, a competition that was related to analyzing pet shelter data resulted in a huge controversy Petfinder.my is a platform that helps people find pets to rescue in Malaysia from shelters. In 2019, they announced a collaboration with Kaggle to create a machine learning predictor algorithm of which pets (worldwide) were more likely to be adopted based on the metadata of the descriptions on the site. The total prize offered was $25,000 After several months, a contestant won. He was previously a Kaggle grandmaster, and won $10k. 
A volunteer, Benjamin Minixhofer, offered to put the algorithm in production, and when he did, he found that there was a huge discrepancy between first and second place Technical Aspects of the controversy: The data they gave asked the contestants to predict the speed at which a pet would be adopted, from 1-5, and included input features like type of animal, breed, coloration, whether the animal was vaccinated, and adoption fee The initial training set had 15k animals and the teams, after a couple months, were then given 4k animals that their algorithms had not seen before as a test of how accurate they were (common machine learning best practice). In a Jupyter notebook Kernel on Kaggle, Minixhofer explains how the winning team cheated First, they individually scraped Petfinder.my to find the answers for the 4k test data Using md5, they created a hash for each unique pet, and looked up the score for each hash from the external dataset - there were 3500 overlaps Did Pandas column manipulation to get at the hidden prediction variable for every 10th pet and replaced the prediction that should have been generated by the algorithm with the actual value Using mostly: obfuscated functions, Pandas, and dictionaries, as well as MD5 hashes Fallout: He was fired from H2O.ai Kaggle issued an apology Michael #3: Configuring uWSGI for Production Deployment We run a lot of uWSGI backed services. I’ve spoken in-depth back on Talk Python 215: The software powering Talk Python courses and podcast about this. This is guidance from Bloomberg Engineering’s Structured Products Applications group We chose uWSGI as our host because of its performance and feature set. But, while powerful, uWSGI’s defaults are driven by backward compatibility and are not ideal for new deployments. There is also an official Things to Know doc. Unbit, the developer of uWSGI, has “decided to fix all of the bad defaults (especially for the Python plugin) in the 2.1 branch.” The 2.1 branch is not released yet.
Warning, I had trouble with die-on-term and systemctl Settings I’m using: # This option tells uWSGI to fail to start if any parameter # in the configuration file isn’t explicitly understood by uWSGI. strict = true # The master uWSGI process is necessary to gracefully re-spawn # and pre-fork workers, consolidate logs, and manage many other features master = true # uWSGI disables Python threads by default, as described in the Things to Know doc. enable-threads = true # This option will instruct uWSGI to clean up any temporary files or UNIX sockets it created vacuum = true # By default, uWSGI starts in multiple interpreter mode single-interpreter = true # Prevents uWSGI from starting if it is unable to find or load your application module need-app = true # uWSGI provides some functionality which can help identify the workers auto-procname = true procname-prefix = pythonbytes- # Forcefully kill workers after 60 seconds. Without this feature, # a stuck process could stay stuck forever. harakiri = 60 harakiri-verbose = true Vicki #4: Thinc: A functional take on deep learning, compatible with Tensorflow, PyTorch, and MXNet A deep learning library that abstracts away some TF and Pytorch boilerplate, from Explosion Already runs under the covers in SpaCy, an NLP library used for deep learning type checking, particularly helpful for Tensors: PyTorchWrapper and TensorFlowWrapper classes and the intermingling of both Deep support for numpy structures and semantics Assumes you’re going to be using stochastic gradient descent And operates in batches Also cleans up the configuration and hyperparameters Mainly hopes to make it easier and more flexible to do matrix manipulations, using a codebase that already existed but was not customer-facing. Examples and code are all available in notebooks in the GitHub repo Michael #5: pandas-vet via Jacob Deppen A plugin for Flake8 that checks pandas code Starting with pandas can be daunting. 
The usual internet help sites are littered with different ways to do the same thing and some features that the pandas docs themselves discourage live on in the API. Makes pandas a little more friendly for newcomers by taking some opinionated stances about pandas best practices. The idea to create a linter was sparked by Ania Kapuścińska's talk at PyCascades 2019, "Lint your code responsibly!" Vicki #6: NumPy beginner documentation NumPy is the backbone of numerical computing in Python: Pandas (which I mentioned before), scikit-learn, Tensorflow, and Pytorch all lean heavily on, if not directly depend on, its core concepts, which include matrix operations through a data structure known as a NumPy array (which is different than a Python list) - ndarray Anne Bonner wrote up new documentation for NumPy that introduces these fundamental concepts to beginners coming to both Python and scientific computing Before, you went directly to the section about arrays and had to search through it to find what you wanted. The new guide, which is very nice, includes a step-by-step on how arrays work, how to reshape them, and illustrated guides on basic array operations. Extras: Vicki I write a newsletter, Normcore Tech, about all things tech that I’m not seeing covered in the mainstream tech media. I’ve written before about machine learning, data for NLP, Elon Musk memes, and Nginx. There’s a free version that goes out once a week and paid subscribers get access to one more newsletter per week, but really it’s more about the idea of supporting in-depth writing about tech. vicki.substack.com Michael: pip 20.0 Released - Default to doing a user install (as if --user was passed) when the main site-packages directory is not writeable and user site-packages are enabled, cache wheels built from Git requirements, and more. Homebrew: brew install python@3.8 Joke: An SEO expert walks into a bar, bars, pub, public house, Irish pub, tavern, bartender, beer, liquor, wine, alcohol, spirits...
January 27, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: Amazon is now offering quantum computing as a service Amazon Braket – A fully managed service that allows scientists, researchers, and developers to begin experimenting with computers from multiple quantum hardware providers in a single place. We all know about bits. Quantum computers use a more sophisticated data representation known as a qubit or quantum bit. Each qubit can exist in state 1 or 0, but also in superpositions of 1 and 0, meaning that the qubit simultaneously occupies both states. Such states can be specified by a two-dimensional vector that contains a pair of complex numbers, making for an infinite number of states. Each of the complex numbers is a probability amplitude, basically the odds that the qubit is a 0 or a 1, respectively. Amazon Braket is a new service designed to let you get some hands-on experience with qubits and quantum circuits. You can build and test your circuits in a simulated environment and then run them on an actual quantum computer. See linked announcement. Language looks familiar: [1]: bell = Circuit().h(0).cnot(0, 1) print(device.run(bell, s3_folder).result().measurement_counts()) How it Works: Quantum computers work by manipulating the amplitudes of the state vector. To program a quantum computer, you figure out how many qubits you need, wire them together into a quantum circuit, and run the circuit. When you build the circuit, you set it up so that the correct answer is the most probable one, and all the rest are highly improbable. Brian #2: A quick-and-dirty guide on how to install packages for Python Brett Cannon Good modern intro to venv use. Pro short. simple. quick uses --prompt in every example (more people need to use this) and suggests using the directory name containing the env. 
send it to all your co-workers that STILL aren’t using virtual environments hints at an improved form of --prompt coming in Python 3.9 Con uses .venv, I’m a venv (no dot) kinda guy hints at an improved form of --prompt coming in Python 3.9 --prompt . will deduce the directory name. In 3.8 it just names your env “.”. Michael #3: Say No to the no code movement Article by Alex Hudson 2020 is going to be the year of “no code”: the movement that says you can write business logic and even entire applications without having the training of a software developer. Every company is a software company But software devs are in short supply and outcomes are variable two distinct benefits to transitioning business processes into the software domain “change control” becomes a software problem rather than a people problem. it’s easier to innovate on what makes a business distinct. The basic problem with “no code” the idea of writing business logic in text form according to the syntax of a technical programming language is anathema. The “simpler abstraction” misconception The “simpler syntax” misconception Configuration over code: Many No Code advocates are building significant systems by pulling together off-the-shelf applications and integrating them. But the logic has been implemented as configuration as opposed to code. The equivalence of code: There are reasons why developers still use plain text, if something came along that was better, many (not all!) developers would drop text like a hot rock. Where does “No code” fail in practice? 80% there and then … Where does “No code” succeed? “No Code” systems are extremely good for putting together proofs-of-concept which can demonstrate the value of moving forward with development. Brian #4: What I learned going from prison to Python Shadeed “Sha” Wallace-Stepter Presented at North Bay Python I got this recommended to me by many people, even those not in the Python community, including my good friends Chuck Forbes and Dr.
Donna Beegle, who work to fight poverty. Amazing story. Go listen to it. Michael #5: A real QUICK → Qt5 based gUI generator for ClicK Via Ricky Teachey. Inspired by Gooey, the GUI generator for classical Python argparse-based command line programs. Take a standard Click-based app, add --gui to the command line and you get a GUI! Brian #6: Falsehoods programmers believe about time also More falsehoods programmers believe about time; “wisdom of the crowd” edition All of these assumptions are wrong There are always 24 hours in a day. Months have either 30 or 31 days. … A week always begins and ends in the same month. … The system clock will always be set to the correct local time The system clock will always be set to a time that is not wildly different from the correct local time. If the system clock is incorrect, it will at least always be off by a consistent number of seconds. … It will never be necessary to set the system time to any value other than the correct local time. Ok, testing might require setting the system time to a value other than the correct local time but it will never be necessary to do so in production. … Human-readable dates can be specified in universally understood formats such as 05/07/11. … from more … The day before Saturday is always Friday. … Two subsequent calls to a getCurrentTime() function will return distinct results. The second of two subsequent calls to a getCurrentTime() function will return a larger result. The software will never run on a space ship that is orbiting a black hole. Extras Michael: REMI GUI editor Joke https://twitter.com/mbbillz/status/921119218703257600
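The falsehood about "universally understood formats such as 05/07/11" from Brian #6 above is easy to demonstrate with the standard library. This sketch parses the same text under three plausible conventions; the labels in the dictionary are made up for illustration.

```python
from datetime import datetime

text = "05/07/11"

# Three conventions a reader might reasonably assume for the same string
formats = {
    "month/day/year": "%m/%d/%y",
    "day/month/year": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

# Each convention yields a different calendar date
parsed = {label: datetime.strptime(text, fmt).date() for label, fmt in formats.items()}

for label, value in parsed.items():
    print(f"{label}: {value.isoformat()}")
```

An unambiguous format such as ISO 8601 (2011-05-07) avoids the guesswork entirely.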
January 21, 2020
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: iterators, generators, coroutines Cool quick read article by Mark McDonnell. Starts with an attempt at a gentle introduction to the iterator protocol (why does everyone think that users need to start with this info?) Muscle through this part or just skim it. Should be an appendix. Generators (start here): functions that use yield Unbound generators: they don’t stop Generator Expressions: Like for v in ("foo" for i in range(5)): … Use parens instead of brackets, otherwise they are like list comprehensions. Specifically: (expression for item in collection if condition) Generators using generators / nested generators : yield from Given bar() and baz() are generators, this works: def foo(): yield from bar() yield from baz() Coroutines are an extension of generators “Generators use the yield keyword to return a value at some point in time within a function, but with coroutines the yield directive can also be used on the right-hand side of an = operator to signify it will accept a value at that point in time.” Then….. coroutine example, some asyncio stuff, … honestly I got lost. Bottom line: I’m still looking for a great tutorial on coroutines that doesn’t explain the iterator protocol (boring!) shows an example NOT using asyncio and NOT a REPL example I want to know how I can make use of coroutines in an actual program (toy ok) where the use of coroutines actually helps the structure and makes it more maintainable, etc. Michael #2: requests-toolbelt A toolbelt of useful classes and functions to be used with requests multipart/form-data encoder - The main attraction is a streaming multipart form-data object, MultipartEncoder. 
User-Agent constructor - You can easily construct a requests-style User-Agent string SSLAdapter - Allows the user to choose one of the SSL protocols made available in Python's ssl module for outgoing HTTPS connections ForgetfulCookieJar - prevents a particular requests session from storing cookies Brian #3: Pandas Validation We covered Bulwark in episode 162 There are other approaches and projects looking at the same problem. pandas-validation Suggested by Lance “… pandas-validation lets you create a template of what your pandas dataframe should look like and it'll validate the entire dataframe against that template. So if you have a dataframe with first column being strings second column being dates and the third being address, you can use a mixture of built in validate types to ensure your data conforms to that. It will even let you set up some regex and make sure that the data in a column conforms to that regex.” - Lance supports dates, timestamps, numeric values, strings pandera “pandera provides a flexible and expressive API for performing data validation on tidy (long-form) and wide data to make data processing pipelines more readable and robust.” “pandas data structures contain information that pandera explicitly validates at runtime. This is useful in production-critical or reproducible research settings.” “pandera enables users to: Check the types and properties of columns in a DataFrame or values in a Series. Perform more complex statistical validation. Seamlessly integrate with existing data analysis/processing pipelines via function decorators.” A few different approaches. I can’t really tell from the outside if there is a clear winner or solution that’s working better for most cases. I’d like to hear from listeners which they use, if any. Or if we missed the obvious validation method most people are using. Michael #4: qtpy I have been inspired to check out Qt again, but the libraries and versions are confusing.
Provides a uniform layer to support PyQt5, PySide2, PyQt4 and PySide with a single codebase Basically, you can write your code as if you were using PySide2 but import Qt modules from qtpy instead of PySide2 (or PyQt5). Brian #5: pylightxl Viktor Kis submission “A light weight, zero dependency, minimal functionality excel read/writer python library” Well. Reader right now. Writing coming soon. :) Some cool examples in the docs to get you started grabbing data from spreadsheets right away. Features: Zero non-standard library dependencies Single source code that supports both Python37 and Python27. The light weight library is only 3 source files that can be easily copied directly into a project for those that have installation/download restrictions. In addition the library’s size and zero dependency makes pyinstaller compilation small and easy! 100% test-driven development for highest reliability/maintainability with 100% coverage on all supported versions API aimed to be user friendly, intuitive and to the point with no bells and whistles. Structure: database > worksheet > indexing example: db.ws('Sheet1').index(row=1,col=2) or db.ws('Sheet1').address(address='B1') Read excel files (.xlsx, .xlsm), all sheets or selective few for speed/memory management Index cells data by row/col number or address Calling an entire row/col of data returns an easy to use list output: db.ws('Sheet1').row(1) or db.ws('Sheet1').rows Worksheet data size is consistent for each row/col. Any data that is empty will return a ‘’ Michael #6: python-ranges via Aiden Price Continuous Range, RangeSet, and RangeDict data structures for Python Best understood as an example: tax_info = RangeDict({ Range(0, 9701): (0, 0.10, 0), Range(9701, 39476): (970, 0.12, 9700), ... }) income = int(input("What is your income?
$")) base, marginal_rate, bracket_floor = tax_info[income] Range and RangeSet objects are mutually compatible for things like union(), intersection(), difference(), and symmetric_difference() Extras: Brian: pytest-check works with pytest-rerunfailures - with other plugins, it may not. - known incompatibility with flaky and retry Michael: Pandas goes 1.0 (via Jeremy Schendel). Just put out a release candidate for 1.0, and will be using SemVer going forward. PyCharm security from Anthony Shaw. Video for Python for Decision Makers webcast is out. Joke: Optimist: The glass is half full. Pessimist: The glass is half empty. Engineer: The glass is twice as large as it needs to be.
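Picking up Brian #1 above, here is a small sketch of yield from and of a coroutine that receives values through yield, with no asyncio involved. The function names (numbers, letters, combined, running_average) are made up for illustration and stand in for the bar(), baz() and foo() of the article.

```python
def numbers():
    yield from range(1, 4)


def letters():
    yield from "ab"


def combined():
    # Delegate to each sub-generator in turn
    yield from numbers()
    yield from letters()


def running_average():
    """Coroutine: each send(value) returns the average of all values so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # yield on the right-hand side of = receives the value passed to send()
        value = yield average
        total += value
        count += 1
        average = total / count


items = list(combined())

averager = running_average()
next(averager)  # prime the coroutine so it pauses at the first yield
averages = [averager.send(v) for v in (10, 20, 60)]

print(items)
print(averages)
```

The coroutine keeps its own state between calls, so the caller only pushes values in and reads results out, with no class or global needed.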
January 16, 2020
Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: Data driven journalism via cjworkbench via Michael Paholski The data journalism platform with built in training Think spreadsheet + ETL automation Designed around modular tools for data processing -- table in, table out -- with no code required Features include: Modules to scrape, clean, analyze and visualize data An integrated data journalism training program Connect to Google Drive, Twitter, and API endpoints. Every action is recorded, so all workflows are repeatable and transparent All data is live and versioned, and you can monitor for changes. Write custom modules in Python and add them to the module library Brian #2: remi: A Platform-independent Python GUI library for your applications. Python REMote Interface library. “Remi is a GUI library for Python applications which transpiles an application's interface into HTML to be rendered in a web browser. This removes platform-specific dependencies and lets you easily develop cross-platform applications in Python!” No dependencies. pip install git+https://github.com/dddomodossola/remi.git doesn’t install anything else. Yes. Another GUI in a web page, but for quick and dirty internal tools, this will be very usable. Basic app: import remi.gui as gui from remi import start, App class MyApp(App): def __init__(self, *args): super(MyApp, self).__init__(*args) def main(self): container = gui.VBox(width=120, height=100) self.lbl = gui.Label('Hello world!') self.bt = gui.Button('Press me!') self.bt.onclick.do(self.on_button_pressed) container.append(self.lbl) container.append(self.bt) return container def on_button_pressed(self, widget): self.lbl.set_text('Button pressed!') self.bt.set_text('Hi!') start(MyApp) Michael #3: Typer Build great CLIs. Easy to code. Based on Python type hints. Typer is FastAPI's little sibling. And it's intended to be the FastAPI of CLIs. Just declare once the types of parameters (arguments and options) as function parameters. 
You do that with standard modern Python types. You don't have to learn a new syntax, the methods or classes of a specific library, etc. Based on Click Example (min version) import typer def main(name: str): typer.echo(f"Hello {name}") if __name__ == "__main__": typer.run(main) Brian #4: Effectively using Matplotlib Chris Moffitt “… I think I was a little premature in dismissing matplotlib. To be honest, I did not quite understand it and how to use it effectively in my workflow.” That very much sums up my relationship with matplotlib. But I’m ready to take another serious look at it. one reason for complexity is 2 interfaces MATLAB like state-based interface object based interface (use this) recommendations: Learn the basic matplotlib terminology, specifically what is a Figure and an Axes . Always use the object-oriented interface. Get in the habit of using it from the start of your analysis. Start your visualizations with basic pandas plotting. Use seaborn for the more complex statistical visualizations. Use matplotlib to customize the pandas or seaborn visualization. Runs through an example Describes figures and plots Includes a handy reference for customizing a plot. Related: StackOverflow answer that shows how to generate and embed a matplotlib image into a flask app without saving it to a file. Style it with pylustrator.readthedocs.io :) Michael #5: Django Simple Task django-simple-task runs background tasks in Django 3 without requiring other services and workers. It runs them in the same event loop as your ASGI application. Here’s a simple overview of how it works: On application start, a queue is created and a number of workers starts to listen to the queue When defer is called, a task(function or coroutine function) is added to the queue When a worker gets a task, it runs it or delegates it to a threadpool On application shutdown, it waits for tasks to finish before exiting ASGI server It is required to run Django with ASGI server. 
Example: import asyncio import time from django.http import HttpResponse from django_simple_task import defer def task1(): time.sleep(1) print("task1 done") async def task2(): await asyncio.sleep(1) print("task2 done") def view(request): defer(task1) defer(task2) return HttpResponse(b"My View") Brian #6: PyPI Stats at pypistats.org Simple interface. Pop in a package name and get the download stats. Example use: Why is my open source project now getting PRs and issues? I’ve got a few packages on PyPI, not updated much. cards and submark are mostly for demo purposes for teaching testing. pytest-check is a pytest plugin that allows multiple failures per test. I only hear about issues and PRs on one of these. So let’s look at traffic. cards: downloads day: 2 week: 24 month: 339 submark: day: 5 week: 9 month: 61 pytest-check: day: 976 week: 4,524 month: 19,636 That totally explains why I need to start actually supporting pytest-check. Cool. Note: it’s still small. Top 20 packages are all downloaded over 1.3 million times per day. Extras: Comment from January Python PDX West meetup “Please remember to have one beginner friendly talk per meetup.” Good point. Even if you can’t present here in Portland / Hillsboro, or don’t want to, I’d love to hear feedback on good beginner friendly topics for meetups. PyCascades 2020 discount code listeners-at-pycascades for 10% off Firefox 72 is out with anti-fingerprinting and PIP - Ars Technica Joke: Language essays comic
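For the django-simple-task item above, the queue-and-workers flow it describes can be sketched with the standard library alone. This is only an illustration of the idea, not the package's actual implementation; the names worker, sync_task, async_task and main are made up for the sketch.

```python
import asyncio


async def worker(queue, results):
    # Each worker listens on the shared queue until it is cancelled.
    while True:
        task = await queue.get()
        try:
            if asyncio.iscoroutinefunction(task):
                # Coroutine functions run directly on the event loop.
                results.append(await task())
            else:
                # Plain functions are delegated to the default thread pool.
                loop = asyncio.get_running_loop()
                results.append(await loop.run_in_executor(None, task))
        finally:
            queue.task_done()


def sync_task():
    return "sync done"


async def async_task():
    await asyncio.sleep(0)
    return "async done"


async def main():
    # "Application start": create the queue and a couple of workers.
    queue = asyncio.Queue()
    results = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(2)]

    # "defer": add a function or coroutine function to the queue.
    queue.put_nowait(sync_task)
    queue.put_nowait(async_task)

    # "Application shutdown": wait for queued tasks, then stop the workers.
    await queue.join()
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(results)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The sketch waits on queue.join() before cancelling the workers, which mirrors the "waits for tasks to finish before exiting" step in the list above.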
January 9, 2020
Sponsored by us! Support us by visiting pythonbytes.fm/biz [courses] and pythonbytes.fm/pytest [book], or becoming a patron at patreon.com/pythonbytes Brian #1: Meditations on the Zen of Python Moshe Zadka The Zen of Python is not "the rules of Python" or "guidelines of Python". It is full of contradiction and allusion. It is not intended to be followed: it is intended to be meditated upon. Moshe gives some of his thoughts on the different lines of the Zen of Python. Full Zen of Python can be found here or in a REPL with import this A few Beautiful is better than ugly Consistency helps. So black, flake8, pylint are useful. “But even more important, only humans can judge what humans find beautiful. Code reviews and a collaborative approach to writing code are the only realistic way to build beautiful code. Listening to other people is an important skill in software development.” Complex is better than complicated. “When solving a hard problem, it is often the case that no simple solution will do. In that case, the most Pythonic strategy is to go "bottom-up." Build simple tools and combine them to solve the problem.” Readability counts “In the face of immense pressure to throw readability to the side and just "solve the problem," the Zen of Python reminds us: readability counts. Writing the code so it can be read is a form of compassion for yourself and others.” Michael #2: nginx raided by Russian police Russian police have raided today the Moscow offices of NGINX, Inc., a subsidiary of F5 Networks and the company behind the internet's most popular web server technology. Russian search engine Rambler.ru claims full ownership of NGINX code. Rambler claims that Igor Sysoev developed NGINX while he was working as a system administrator for the company, hence they are the rightful owner of the project. Sysoev never denied creating NGINX while working at Rambler. 
In a 2012 interview, Sysoev claimed he developed NGINX in his free time and that Rambler wasn't even aware of it for years. Update: Promptly following the event we took measures to ensure the security of our master software builds for NGINX, NGINX Plus, NGINX WAF and NGINX Unit, all of which are stored on servers outside of Russia. No other products are developed within Russia. F5 remains committed to innovating with NGINX, NGINX Plus, NGINX WAF and NGINX Unit, and we will continue to provide the best-in-class support you’ve come to expect. Brian #3: I'm not feeling the async pressure Armin Ronacher “Async is all the rage.” But before you go there, make sure you understand flow control and back pressure. “…back pressure is resistance that opposes the flow of data through a system. Back pressure sounds quite negative … but it's here to save your day.” If parts of your system are async, you have to make sure the entire flow through the system doesn’t have overflow points. An example shown with reader/writer that is way hairier than you’d think it should be. “New Footguns: async/await is great but it encourages writing stuff that will behave catastrophically when overloaded.” “So for you developers of async libraries here is a new year's resolution for you: give back pressure and flow control the importance they deserve in documentation and API.” Michael #4: codetiming from Real Python via Doug Farrell A flexible, customizable timer for your Python code For a complete tutorial on how codetiming works, see Python Timer Functions: Three Ways to Monitor Your Code on Real Python. 
Time your code via A timer class A decorator A context manager Brian #5: Making Python Programs Blazingly Fast Martin Heinz Seemed like a good followup to the last topic Profiling with command line time python something.py python -m cProfile -s time something.py timing functions with wrapper Misses timeit, but see that also, https://docs.python.org/3.8/library/timeit.html How to make things faster: use built in types over custom types caching/memoization with lru_cache use local variables and local aliases when looping use functions… (kinda duh, but sure). don’t repeatedly access attributes in loops use f-strings over other formatting use generators. or at least experiment with them. the memory savings could result in speedup Michael #6: LocalStack via Graham Williamson and Jan 'oglop' Gazda A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline! LocalStack spins up the following core Cloud APIs on your local machine: S3, DynamoDB, Lambda, Elasticsearch see many more services paid one has more LocalStack builds on existing best-of-breed mocking/testing tools, most notably kinesalite/dynalite and moto. While these tools are awesome (!), they lack functionality for certain use cases. LocalStack combines the tools, makes them interoperable, and adds important missing functionality on top of them Has lots of config and knobs, but runs in docker so that helps Extras: Python Job Board Michael: Guido interviewed for JavaScript language! Microsoft: We're creating a new Rust-based programming language for secure coding New webcast: Python for the .NET developer Ace Python Interviews free course Joke: Types of software jobs.
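For the "Making Python Programs Blazingly Fast" item above, here is a small sketch combining two of the points mentioned: memoization with lru_cache, and timeit (which the article skips). The Fibonacci functions are just a stand-in workload chosen for this sketch; actual timings will vary by machine, so none are quoted here.

```python
import timeit
from functools import lru_cache


def fib_plain(n):
    # Naive recursion: recomputes the same subproblems over and over.
    return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    # Same logic, but each distinct n is computed only once and then cached.
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


if __name__ == "__main__":
    # timeit runs each callable `number` times and returns total seconds.
    plain = timeit.timeit(lambda: fib_plain(20), number=20)
    cached = timeit.timeit(lambda: fib_cached(20), number=20)
    print(f"plain:  {plain:.4f}s")
    print(f"cached: {cached:.4f}s")
    print(fib_cached.cache_info())
```

fib_cached.cache_info() reports hits and misses, which is a quick way to confirm the cache is actually being used.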
January 3, 2020
Sponsored by DataDog: pythonbytes.fm/datadog Special guest: Aly Aly #1: Andrew Godwin - Just Add Await: Retrofitting Async into Django - DjangoCon 2019 Andrew is leading the implementation of asynchronous support for Django Overview of Async Landscape How synchronous and asynchronous code interact Async functions are different than sync functions which makes it hard to design APIs Difficulties in adding Async support to Django Django is a project that a lot of people are familiar with; its new async implementation also needs to feel familiar The plan was to implement async capabilities in three phases Phase 1: ASGI Support (Django 3.0) This phase lays the groundwork for future changes ORM is async-aware: using it from async code raises a SynchronousOnlyOperation exception Phase 2: Async Views, Async Handlers, and Async Middleware (Django 3.1) Add async capabilities for the core part of the request path There is a branch where things are mostly working, just need to fix a couple of tests Phase 3: Async ORM (Django 3.2 / 4.0) Largest, most difficult and most unbounded part of the project ORM queries can result in lots of database lookups; have to be careful here Async Project Wiki - project status, find out how to contribute Brian #2: gamesbyexample Al Sweigart “PythonStdioGames : A collection of games (with source code) to use for example programming lessons. Written in Python 3. Click on the src folder to view all of the programs.” I first learned programming by modifying games written by others and seeing what the different parts do when I change them. For me it was Lunar Lander on a TRS-80, and it took forever to type in the listing from the back of a magazine. But now, you can just clone a repo and play with existing files. Cool features: They're short, with a limit of 256 lines of code. They fit into a single source code file and have no installer. They only use the Python standard library. They only use stdio text; print() and input() in Python. 
They're well commented. They use as few programming concepts as possible. If classes, list comprehensions, recursion, aren't necessary for the program, then they aren't used. Elegant and efficient code is worthless next to code that is easy to understand and readable. These programs are for education, not production. Standard best practices, like not using global variables, can be ignored to make it easier to understand. They do input validation and are bug free. All functions have docstrings. There’s also a todo list if people want to help out. Aly #3: Bulwark Open-source library that allows users to property test pandas DataFrames Goal is to make it easy for data analysts and data scientists to write tests Tests around data are different; they are not deterministic, and they require us to think about testing in a different way With property tests, we can check an object has a certain property Property tests for DataFrames include validating the shape of the DataFrame, checking that a column is within a certain range, verifying a DataFrame has no NaNs, etc Bulwark allows you to implement property tests as checks. Each check Takes a DataFrame and optional arguments The check will make an assertion about a DataFrame property If the assertion passes, the check will return the original, unaltered DataFrame If the check fails, an AssertionError is raised and you have context around why it failed Bulwark also allows you to implement property checks as decorators This is useful if you design data pipelines as functions Each function takes in input data, performs an action, returns output Add decorators that validate properties of the input DataFrame to pipeline functions Lots of builtin checks and decorators; easy to add your own Slides with example usage and tips: Property Testing with Pandas with Bulwark Brian #4: Poetry 1.0.0 Sebastien Eustace caution: not backwards compatible full change log Highlights: Poetry is getting serious. 
more ways to manage environments switch between python versions in a project with poetry env use /path/to/python or poetry env use python3.7 Improved support for private indices (instead of just pypi) can specify index per dependency can specify a secondary index can specify a non-pypi index as default, avoiding pypi Env variable support to more easily work with poetry in a CI environment Improved add command to allow for constraints, paths, directories, etc for a dependency publishing allows api tokens marker specifiers on dependencies. Aly #5: Kubernetes for Full-Stack Developers With the rise of containers, Kubernetes has become the de facto platform for running and coordinating containerized applications across multiple machines This guide follows steps new users would take when learning how to deploy applications to Kubernetes: Learn Kubernetes core concepts Build modern 12 Factor web applications Get applications working inside of containers Deploy applications to Kubernetes Manage cluster operations New to containers? Check out my Introduction to Docker talk Brian #6: testmon: selects tests affected by changed files and methods On a previous episode (159) we mentioned pytest-picked and I incorrectly assumed it would run tests related to code that has changed, ‘cause it says “Run the tests related to the unstaged files or the current branch (according to Git)”. I was wrong, Michael was right. It runs the tests that are in modified test files. What I was thinking of is “testmon” which does what I was hoping for. “pytest-testmon is a pytest plugin which selects and executes only tests you need to run. It does this by collecting dependencies between tests and all executed code (internally using Coverage.py) and comparing the dependencies against changes. 
testmon updates its database on each test execution, so it works independently of version control.” If you had tried testmon before, like me, be aware that there have been significant changes in 1.0.0 Very cool to see continued effort on this project. Extras: Aly: Finding local Python User Groups PyCon.org Events Calendar Meetup.com search for Python PyTennessee 2019 on March 7 - 8. Tickets on sale now! I will be giving a talk on the Facade Design Pattern Brian: Next episode is planned to be a live recording during the Jan 7 Python PDX West meetup. There will also be 1-2 other talks. Joke: From Tyler Matteson Two coroutines walk into a bar. RuntimeError: 'bar' was never awaited. From Ben Sandofsky Q: How many developers on a message board does it take to screw in a light bulb? A: “Why are you trying to do that?”
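For the Bulwark item above, the check-then-decorator pattern it describes can be sketched without any third-party packages. To stay runnable with only the standard library, this uses a plain list of dicts in place of a pandas DataFrame; has_no_nones, checked and add_total are hypothetical names for this sketch and are not Bulwark's API.

```python
from functools import wraps


def has_no_nones(rows):
    """Check: assert no cell is None, then return the rows unaltered."""
    for i, row in enumerate(rows):
        missing = [key for key, value in row.items() if value is None]
        # The message gives context about why the check failed.
        assert not missing, f"row {i} has missing values in {missing}"
    return rows


def checked(check):
    """Turn a check into a decorator that validates a function's input."""
    def decorator(func):
        @wraps(func)
        def wrapper(rows, *args, **kwargs):
            # Validate first; the check returns the same object on success.
            return func(check(rows), *args, **kwargs)
        return wrapper
    return decorator


@checked(has_no_nones)
def add_total(rows):
    # A pipeline step: takes data in, performs an action, returns output.
    return [{**row, "total": row["price"] * row["qty"]} for row in rows]


if __name__ == "__main__":
    print(add_total([{"price": 2.0, "qty": 3}]))
```

Because a passing check returns its input unchanged, checks written this way can be chained or dropped into an existing pipeline without altering the data that flows through it.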
December 18, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Anthony Herbert Anthony #1: Larry Hastings - Solve Your Problem With Sloppy Python - PyCon 2018 Michael’s personal automation things that I do all the time stripe to sheets automation urlify tons of reporting wakeup - to get 100 on Lighthouse deploy (on my servers) creating import data for video courses measuring duration of audio files Michael #2: Introduction to ASGI: Emergence of an Async Python Web Ecosystem by Florimond Manca Python growth is not just data science Python web development is back with an async spin, and it's exciting. One of the main drivers of this endeavour is ASGI, the Asynchronous Server Gateway Interface. A guided tour about what ASGI is and what it means for modern Python web development. Since 3.5 was released, the community has been literally async-ifying all the things. If you're curious, a lot of the resulting projects are now listed in aio-libs and awesome-asyncio. An overview of ASGI Why should I care? Interoperability is a strong selling point, but there are many more advantages to using ASGI-based components for building Python web apps. Speed: the async nature of ASGI apps and servers makes them really fast (for Python, at least): we're talking about 60k-70k req/s (consider that Flask and Django only achieve 10-20k in a similar situation). Features: ASGI servers and frameworks give you access to inherently concurrent features (WebSocket, Server-Sent Events, HTTP/2) that are impossible to implement using sync/WSGI. Stability: ASGI as a spec has been around for about 3 years now, and version 3.0 is considered very stable. Foundational parts of the ecosystem are stabilizing as a result. To get your hands dirty, try out any of the following projects: uvicorn: ASGI server. Starlette: ASGI framework. TypeSystem: data validation and form rendering Databases: async database library. orm: asynchronous ORM. 
HTTPX: async HTTP client w/ support for calling ASGI apps (useful as a test client). Anthony #3: Python Insights Michael #4: Assembly via Luiz Honda Assembly is a Pythonic, object-oriented, mid-stack, batteries-included web framework built on Flask that adds structure to your Flask application and groups your routes by class. Assembly allows you to build web applications in much the same way you would build any other object-oriented Python program. Assembly helps you create small to enterprise level applications easily. Decisions made for you + features: github.com/mardix/assembly#decisions-made-for-you--features Examples, root URLs: # Extending Assembly makes it a route automatically # By default, Index will be the root url class Index(Assembly): # index is the entry route # -> / def index(self): return "welcome to my site" # method name becomes the route # -> /hello/ def hello(self): return "I am a string" # underscore method name will be dasherized # -> /about-us/ def about_us(self): return "I am a string" Example of /blog. # The class name is part of the url prefix # This will become -> /blog class Blog(Assembly): # index will be the root # -> /blog/ def index(self): return [ { "title": "title 1", "content": "content" }, ... ] # with params. The order will be respected # -> /comments/1234/ # 1234 will be passed to the id def comments(self, id): return [ { comments... } ] Anthony #5: Building a Standalone GPS Logger with CircuitPython using @Adafruit and particle hardware Michael #6: 10 reasons python is good to learn Python is popular and good to learn because, in Michael’s words, it’s a full spectrum language. 
And the reasons are: Python Is Free and Open-Source Python Is Popular, Loved, and Wanted Python Has a Friendly and Devoted Community Python Has Elegant and Concise Syntax Python Is Multi-Platform Python Supports Multiple Programming Paradigms Python Offers Useful Built-In Libraries Python Has Many Third-Party Packages Python Is a General-Purpose Programming Language Python Plays Nice with Others Extras: Michael: I was just on .NET Rocks podcast talking about Python for the .NET Developer New Python for the .NET Developer 9-hour course New Python for Decision Makers course, 2.5 hours of exploring Python for your org. Hidden files in Finder: use shortcut cmd+shift+. Anthony: Pretty Printed YouTube channel Joke: The failed pickup line A girl is hanging out at a bar with her friends. Some guy comes up to her and says: “You are the ; to my line of code.” She responds, “Get outta here creep, I code in Python.”
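For the ASGI item above, it may help to see how small the interface is. Below is a minimal sketch of an ASGI HTTP application: a single async callable taking scope, receive and send. The call_app helper is a hypothetical stand-in for a server, written only so the sketch runs with the standard library; a real ASGI server such as uvicorn would do that part.

```python
import asyncio


async def app(scope, receive, send):
    """Minimal ASGI HTTP app: answers every request with plain text."""
    assert scope["type"] == "http"
    body = f"Hello from {scope['path']}".encode()
    # An HTTP response is sent as two messages: the start, then the body.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})


async def call_app(path):
    """Hypothetical harness: drive the app by hand and collect its messages."""
    sent = []

    async def receive():
        # A request with an empty body and nothing more to come.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_app("/hi")):
        print(message)
```

Because the app is just a callable exchanging dict messages, the same function can be handed to any ASGI server or test client, which is the interoperability point made in the article.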
December 12, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Type Hints for Busy Python Programmers Al Sweigart, @AlSweigart We’ve (Michael and myself, of course) convinced you that type hints might be a good thing to help reduce bugs or confusion or whatever. Now what? Al’s got your back with this no nonsense guide to get you going very quickly. Written as a conversation between a programmer and a type hint expert. Super short. Super helpful. typing and mypy are the modules you need. There are other tools, but let’s start there. Doesn’t affect run time, so you gotta run the tool. Gradually add, don’t have to do everything in one go. Covers the basics And then the “just after basics” stuff you’ll run into right away when you start, like: Allowing a type and None: Union[int, NoneType] Optional parameters Shout out to Callable, Sequence, Mapping, Iterable, available in the documentation when you are ready for them later Just really a great get started today guide. Michael #2: auto-py-to-exe A .py to .exe converter using a simple graphical interface built using Eel and PyInstaller in Python. Using the Application Select your script location (paste in or use a file explorer) Outline will become blue when file exists Select other options and add things like an icon or other files Click the big blue button at the bottom to convert Find your converted files in /output when complete Short 3 min video. Brian #3: How to document Python code with Sphinx Moshe Zadka, @moshezadka I’m intimidated by sphinx. Not sure why. But what I’ve really just wanted to do is to use it for this use of generating documentation of code based on the code and the docstrings. Many of the tutorials I’ve looked at before got me stuck somewhere along the way and I’ve given up. But this looks promising. Example module with docstring shown. Simple docs/index.rst, no previous knowledge of restructured text necessary. 
Specifically what extensions do I need: autodoc, napoleon, and viewcode example docs/conf.py that’s really short setting up tox to generate the docs and the magic command line incantation necessary: sphinx-build -W -b html -d {envtmpdir}/doctrees . {envtmpdir}/html That’s it. (well, you may want to host the output somewhere, but I can figure that out. ) Super simple. Awesome Michael #4: Snek is a cross-platform PowerShell module for integrating with Python via Chad Miars Snek is a cross-platform PowerShell module for integrating with Python. It uses the Python for .NET library to load the Python runtime directly into PowerShell. Using the dynamic language runtime, it can then invoke Python scripts and modules and return the result directly to PowerShell as managed .NET objects. Kind of funky syntax, but that’s PowerShell for you ;) Even allows for external packages installed via pip Brian #5: How to use Pandas to access databases Irina Truong, @irinatruong You can use pandas and sqlalchemy to easily slurp tables right out of your db into memory. But don’t. pandas isn’t lazy and reads everything, even the stuff you don’t need. This article has tips on how to do it right. Recommendation to use the CLI for exploring, then shift to pandas and sqlalchemy. Tips (with examples, not shown here): limit the fields to just those you care about limit the number of records with limit or by selecting only rows where a particular field is a specific value, or something. Let the database do joins, even though you can do it in pandas Estimate memory usage with small queries and .memory_usage().sum(). Tips on reading chunks and converting small int types into small pandas types instead of 64 bit types. Michael #6: ijson - Iterative JSON parser with a standard Python iterator interface Most common usage is having ijson yield native Python objects out of a JSON stream located under a prefix. 
Here’s how to process all European cities: // from: { "earth": { "europe": [ ... ] } } stream each entry in europe as item: objects = ijson.items(f, 'earth.europe.item') cities = (o for o in objects if o['type'] == 'city') for city in cities: do_something_with(city) Extras: Michael: Python decision makers webcast on January 14th, 9:30am US Pacific Guido steps down from Steering Council via Vincent POULAILLEAU GitHub Archive Program, Preserving open source software for future generations, video Python 2.7 will be removed from Homebrew, via Allan Hansen Django 3.0 released Joke: Question: "What is the best prefix for global variables?" Answer: # via shinjitsu A web developer walks into a restaurant. He immediately leaves in disgust as the restaurant was laid out in tables. via shinjitsu
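For the type hints item above, here is a short sketch of the "just after basics" pieces it lists: a value that may be None, an optional parameter, and the Callable, Sequence and Iterable types. The function names are made up for illustration. Optional[int] is shorthand for Union[int, None]. As the notes say, hints do not affect run time, so a checker such as mypy has to be run separately to catch mistakes.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def find_index(items: Sequence[str], target: str) -> Optional[int]:
    """Return the position of target, or None when it is absent."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return None


def apply_all(values: Iterable[int], func: Callable[[int], int]) -> List[int]:
    """Apply a one-argument int function to every value."""
    return [func(v) for v in values]


def greet(name: str, greeting: Optional[str] = None) -> str:
    """An optional parameter: callers may omit greeting entirely."""
    if greeting is None:
        greeting = "Hello"
    return f"{greeting}, {name}!"


if __name__ == "__main__":
    print(find_index(["a", "b"], "b"))
    print(apply_all(range(3), lambda v: v * 2))
    print(greet("Brian"))
```

Sequence and Iterable accept lists, tuples, ranges and so on, so annotating parameters with them keeps the functions flexible about what callers pass in.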
December 3, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: Final type PEP 591 -- Adding a final qualifier to typing This PEP proposes a "final" qualifier to be added to the typing module---in the form of a final decorator and a Final type annotation---to serve three related purposes: Declaring that a method should not be overridden Declaring that a class should not be subclassed Declaring that a variable or attribute should not be reassigned Some situations where a final class or method may be useful include: A class wasn’t designed to be subclassed or a method wasn't designed to be overridden. Perhaps it would not work as expected, or be error-prone. Subclassing or overriding would make code harder to understand or maintain. For example, you may want to prevent unnecessarily tight coupling between base classes and subclasses. You want to retain the freedom to arbitrarily change the class implementation in the future, and these changes might break subclasses. # Example for a class: from typing import final @final class Base: ... class Derived(Base): # Error: Cannot inherit from final class "Base" ... And for a method: class Base: @final def foo(self) -> None: ... class Derived(Base): def foo(self) -> None: # Error: Cannot override final attribute "foo" # (previously declared in base class "Base") ... It seems to also mean const RATE: Final = 3000 class Base: DEFAULT_ID: Final = 0 RATE = 300 # Error: can't assign to final attribute Base.DEFAULT_ID = 1 # Error: can't override a final attribute Brian #2: flit 2 Michael #3: Pint via Andrew Simon Physical units and builtin unit conversion to everyday python numbers like floats. Receive inputs in different unit systems it can make life difficult to account for that in software. Pint handles the unit conversion automatically in a wide array of contexts – Can add 2 meters and 5 inches and get the correct result without any additional work. 
The integration with numpy and pandas are seamless, and it’s made my life so much simpler overall. Units and types of measurements Think you need this? How about the Mars Climate Orbiter The MCO MIB has determined that the root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file, “Small Forces,” used in trajectory models. Specifically, thruster performance data in English units instead of metric units was used in the software application code titled SM_FORCES (small forces). Brian #4: 8 great pytest plugins Jeff Triplett Michael #5: 11 new web frameworks via LuisCarlos Contreras Sanic [flask like] - a web server and web framework that’s written to go fast. It allows the usage of the async / await syntax added in Python 3.5 Starlette [flask like] - A lightweight ASGI framework which is ideal for building high performance asyncio services, designed to be used either as a complete framework, or as an ASGI toolkit. Masonite - A developer centric Python web framework that strives for an actual batteries included developer tool with a lot of out of the box functionality. Craft CLI is the edge here. FastAPI - A modern, high-performance, web framework for building APIs with Python 3.6+ based on standard Python type hints. Responder - Based on Starlette, Responder’s primary concept is to bring the niceties that are brought forth from both Flask and Falcon and unify them into a single framework. Molten - A minimal, extensible, fast and productive framework for building HTTP APIs with Python. Molten can automatically validate requests according to predefined schemas. Japronto - A screaming-fast, scalable, asynchronous Python 3.5+ HTTP toolkit integrated with pipelining HTTP server based on uvloop and picohttpparser. Klein [flask like] - A micro-framework for developing production-ready web services with Python. It is ‘micro’ in that it has an incredibly small API similar to Bottle and Flask. 
Quart [flask like]- A Python ASGI web microframework. It is intended to provide the easiest way to use asyncio functionality in a web context, especially with existing Flask apps. BlackSheep - An asynchronous web framework to build event based, non-blocking Python web applications. It is inspired by Flask and ASP.NET Core. BlackSheep supports automatic binding of values for request handlers, by type annotation or by conventions. Cyclone - A web server framework that implements the Tornado API as a Twisted protocol. The idea is to bridge Tornado’s elegant and straightforward API to Twisted’s Event-Loop, enabling a vast number of supported protocols. Brian #6: Raise Better Exceptions in Python Extras Michael: Naming venvs --prompt Another new course coming soon: Python for decision makers and business leaders Some random interview over at Real Python: Python Community Interview With Brian Okken Joke via Daniel Pope What's a tractor's least favorite programming language? Rust.
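A note on the PEP 591 item above: the "Error" comments in those examples are what a static type checker such as mypy reports. The PEP does not add runtime enforcement, so the same code runs without complaint. A minimal sketch (Python 3.8+) that demonstrates this:

```python
from typing import Final, final


@final
class Base:
    RATE: Final = 3000


# A type checker flags this subclass, but Python itself allows it.
class Derived(Base):
    pass


# Reassigning a Final attribute is also only a type-checker error.
Base.RATE = 300

if __name__ == "__main__":
    print(issubclass(Derived, Base), Base.RATE)
```

So the qualifier documents intent and lets tooling catch violations, but it only helps if a checker actually runs as part of your workflow.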
November 27, 2019
This episode is sponsored by DigitalOcean - pythonbytes.fm/digitalocean Brian #1: Python already replaced Excel in banking “If you wanted to prove your mettle as an entry-level banker or trader it used to be the case that you had to know all about financial modeling in Excel. Not any more. These days it's all about Python, especially on the trading floor. "Python already replaced Excel," said Matthew Hampson, deputy chief digital officer at Nomura, speaking at last Friday's Quant Conference in London. "You can already walk across the trading floor and see people writing Python code...it will become much more common in the next three to four years."” Michael #2: GitHub launches 'Security Lab' to help secure open source ecosystem At the GitHub Universe developer conference, GitHub announced the launch of a new community program called Security Lab GitHub says Security Lab founding members have found, reported, and helped fix more than 100 security flaws already. Other organizations, as well as individual security researchers, can also join. A bug bounty program with rewards of up to $3,000 is also available, to compensate bug hunters for the time they put into searching for vulnerabilities in open source projects. Bug reports must contain a CodeQL query. CodeQL is a new open source tool that GitHub released today; a semantic code analysis engine that was designed to find different versions of the same vulnerability across vast swaths of code. Starting today automated security updates are generally available and have been rolled out to every active repository with security alerts enabled. Once a security flaw is fixed, the project owner can publish the security advisory, and GitHub will warn all upstream project owners who are using vulnerable versions of the original maintainer's code. But before publishing a security advisory, project owners can also request and receive a CVE number for their project's vulnerability directly from GitHub. 
And last, but not least, GitHub also updated Token Scanning, its in-house service that can scan users' projects for API keys and tokens that have been accidentally left inside their source code. Brian #3: pybit.es now has some test challenges Uses pytest, coverage.py, and mutpy (for mutation testing) Most other challenges have tests that validate the code you write. New challenges (3 so far) have you write the tests for existing code. Tests are evaluated with both coverage.py and mutpy. Another mutation testing tool is mutmut, written about earlier this year by Ned Batchelder. Michael #4: pyhttptest - a command-line tool for HTTP tests over RESTful APIs via Florian Dahlitz Tired of writing test scripts against your RESTful APIs every time? Describe HTTP request test cases in a JSON file, a simple and widely used format. Run one command and gain a summary report. Example { "name": "TEST: List all users", "verb": "GET", "endpoint": "users", "host": "https://github.com", "headers": { "Accept-Language": "en-US" }, "query_string": { "limit": 5 } } Brian #5: xarray suggested by Guido Imperiale xarray is a mature library that builds on top of numpy, pandas and dask to offer arrays that are n-dimensional (numpy and dask do it, but pandas doesn't) self-described and indexed (pandas does it, but numpy and dask don't) out-of-memory, multi-threaded, and cloud-distributed (dask does it, but numpy and pandas don't). Additionally, xarray can semi-transparently swap numpy with other backends, such as sparse, while retaining the same API. Michael #6: Animated SVG Terminals Florian Dahlitz termtosvg is a Unix terminal recorder written in Python that renders your command line sessions as standalone SVG animations. 
Features: Produce lightweight and clean looking animations or still frames embeddable on a project page Custom color themes, terminal UI and animation controls via user-defined SVG templates Rendering of recordings in asciicast format made with asciinema Examples: nbedos.github.io/termtosvg/pages/examples.html Extras pytest 5.3.0 released, please read changelog if you use pytest, especially if you use it with CI and depend on --junitxml, as they have changed the format to a newer version. Michael: PyCon registration is open (via Jacqueline Wilson) Facebook: Microsoft's Visual Studio Code is now our default development platform Black friday at Talk Python Training! New course coming soon: Python for the .NET developer Jokes What do you get when you put root beer in a square glass? Beer. Q: What do you call optimistic front-end developers? A: Stack half-full developers. Also, I was going to tell a version control joke, but they are only funny if you git them.
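To make the pyhttptest JSON format from this episode a bit more concrete, here is a minimal sketch of how a test case like that could be turned into a request description. This is not how pyhttptest is implemented; build_request is a hypothetical helper, and the only data used is the example from the notes.

```python
import json
from urllib.parse import urlencode, urljoin

# Test case copied from the pyhttptest example in the show notes.
TEST_CASE = """
{
  "name": "TEST: List all users",
  "verb": "GET",
  "endpoint": "users",
  "host": "https://github.com",
  "headers": {"Accept-Language": "en-US"},
  "query_string": {"limit": 5}
}
"""


def build_request(case: dict) -> dict:
    """Hypothetical helper: combine host, endpoint and query string into a URL."""
    url = urljoin(case["host"].rstrip("/") + "/", case["endpoint"])
    query = urlencode(case.get("query_string", {}))
    if query:
        url = f"{url}?{query}"
    return {
        "method": case["verb"],
        "url": url,
        "headers": case.get("headers", {}),
    }


request = build_request(json.loads(TEST_CASE))
print(request["method"], request["url"])
```

The sketch only builds the request parts and sends nothing over the network; the real tool also runs the request and reports a summary.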
November 20, 2019
This episode is sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: pydantic via Colin Sullivan Data validation and settings management using python type annotations. (We covered Cerberus, this is similar) pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid. class User(pydantic.BaseModel): id: int name = 'John Doe' signup_ts: datetime = None friends: List[int] = [] external_data = { 'id': '123', 'signup_ts': '2019-06-01 12:22', 'friends': [1, 2, '3'] } user = User(**external_data) id is of type int; the annotation-only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible; otherwise an exception will be raised. name is inferred as a string from the provided default; because it has a default, it is not required. signup_ts is a datetime field which is not required (and takes the value None if it's not supplied). Why use it? There's no new schema definition micro-language to learn. In benchmarks pydantic is faster than all other tested libraries. Use of recursive pydantic models, typing's standard types (e.g. List, Tuple, Dict etc.) and validators allow complex data schemas to be clearly and easily defined, validated, and parsed. As well as BaseModel, pydantic provides a [dataclass](https://pydantic-docs.helpmanual.io/usage/dataclasses/) decorator which creates (almost) vanilla python dataclasses with input data parsing and validation. Brian #2: Coverage.py 5.0 beta 1 adds context support Please try out the beta, even without trying contexts, as it helps Ned Batchelder to make sure it’s as backwards compatible as possible while still adding this super cool functionality. Coverage 5.0 beta 1 announcement The changes. Measurement contexts in depth. 
Trying out contexts with pytest and pytest-cov: (venv) $ pip install coverage==5.0b1 (venv) $ pip install pytest-cov (venv) $ pytest --cov=foo --cov-context=test test_foo.py (venv) $ coverage html --show-contexts (venv) $ open htmlcov/index.html results in coverage report that has little dropdowns on the right for lines that are covered, and what context they were covered. For the example above, with pytest-cov, it shows what test caused each line to be hit. Contexts can do way more than this. One example, split up different levels of tests, to see which lines are only hit by unit tests, indicating missing higher level tests, or the opposite. The stored db could also possibly be mined to see how much overlap there is between tests, and maybe help with higher level tools to predict the harm or benefit from removing some tests. I’m excited about the future, with contexts in place. Even if you ignore contexts, please go try out the beta ASAP to make sure your old use model still works. Michael #3: PSF is seeking developers for paid contract improving pip via Brian Rutledge The Python Software Foundation Packaging Working Group is receiving funding to work on the design, implementation, and rollout of pip's next-generation dependency resolver. This project aims to complete the design, implementation, and rollout of pip's next-generation dependency resolver. Lower the barriers to installing Python software, empowering users to get a version of a package that works. It will also lower the barriers to distributing Python software, empowering developers to make their work available in an easily reusable form. Because of the size of the project, funding has been allocated to secure two contractors, a senior developer and an intermediate developer, to work on development, testing and building test infrastructure, code review, bug triage, and assisting in the rollout of necessary features. 
Total pay: Stage 1: $116,375, Stage 2: $103,700 Brian #4: dovpanda - Directions OVer PANDAs Dean Langsam “Directions are hints and tips for using pandas in an analysis environment. dovpanda is an overlay for working with pandas in an analysis environment.” “If you think your task is common enough, it probably is, and Pandas probably has a built-in solution. dovpanda is an overlay module that tries to understand what you are trying to do with your data, and help you find easier ways to write your code.” “The main usage of dovpanda is its hints mechanism, which is very easy and works out-of-the-box. Just import it after you import pandas, whether inside a notebook or in a console.” It’s like training wheels for pandas to help you get the most out of pandas and learn while you are doing your work. Very cool. Michael #5: removestar via PyCoders newsletter Tool to automatically replace 'import *' in Python files with explicit imports Report only mode and modify in place mode. Brian #6: pytest-quarantine : Save the list of failing tests, so that they can be automatically marked as expected failures on future test runs. Brian Rutledge Really nice email from Brian: "Hi Brian! We've met a couple times at PyCon in Cleveland. Thanks for your podcasts, and your book. I've gone from being a complete pytest newbie, to helping my company adopt it, to writing a plugin. The plugin was something I developed at work, and they let me open-source it. I wanted to share it with you as a way of saying "thank you", and because you seem to be a bit of connoisseur of pytest plugins. ;) Here it is: https://github.com/EnergySage/pytest-quarantine/" pytest has a cool feature called xfail, to allow you to mark tests you know fail. pytest-quarantine allows you to run your suite and generate a file of all failures, then use that to mark the xfails. Then you or your team can chip away at these failures until you get rid of them. 
But in the meantime, your suite can still be useful for finding new failures. And, the use of an external file to mark failures makes it so you don’t have to edit your test files to mark the tests that are xfail. Extras: MK: Our infrastructure is fully carbon neutral! Joke: A cop pulls Dr. Heisenberg over for speeding. The officer asks, "Do you know how fast you were going?" Heisenberg pauses for a moment, then answers, "No, but I know where I am.” [1] See Uncertainty principle, also called Heisenberg uncertainty principle or indeterminacy principle, statement, articulated (1927) by the German physicist Werner Heisenberg, that the position and the velocity of an object cannot both be measured exactly, at the same time, even in theory.
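For the removestar item in this episode, the first step of any such tool is finding the 'import *' lines. A minimal sketch using only the standard library ast module follows; it is not removestar's own code, and find_star_imports is a name made up for illustration. It only reports star imports, it does not rewrite them.

```python
import ast


def find_star_imports(source: str) -> list:
    """Return (line number, module) for each 'from module import *'."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # 'from x import *' parses to an ImportFrom whose only alias is '*'.
        if isinstance(node, ast.ImportFrom) and any(
            alias.name == "*" for alias in node.names
        ):
            found.append((node.lineno, node.module))
    return found


SAMPLE = "from os.path import *\nimport sys\nfrom math import sqrt\n"
star_imports = find_star_imports(SAMPLE)
print(star_imports)
```

Replacing the star with explicit names is the harder part, since it requires working out which imported names the file actually uses.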
November 15, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guests: Dan Bader Cecil Philip Dan #1: Why You Should Use python -m pip https://snarky.ca/why-you-should-use-python-m-pip/ Cecil #2: Visual Studio Online: Web-Based IDE & Collaborative Code Editor https://visualstudio.microsoft.com/services/visual-studio-online/ Michael #3: Python Adopts a 12-month Release Cycle The long discussion on changing the Python project's release cadence has come to a conclusion: the project will now be releasing new versions on an annual basis. Described in PEP 602 The steering council thinks that having a consistent schedule every year for when we hit beta, RC, and final will help the community: Know when to start testing the beta to provide feedback Know when to expect the RC so the community can prepare their projects for the final release Know when the final release will occur to coordinate their own releases (if necessary) when the final release of Python occurs Allow core developers to more easily plan their work to make sure work lands in the release they are targeting Make sure that core developers and the community have a shorter amount of time to wait for new features to be released Dan #4: Black 19.10b0 Released — stable release coming soon https://twitter.com/llanga/status/1188968251918819329 “Black Friday” release date? https://twitter.com/llanga/status/1189145837991014402 Playground: https://black.now.sh/ Cecil #5: Navigating code on GitHub Example: https://github.com/talkpython/100daysofcode-with-python-course/blob/master/days/10-12-pytest/guess/guess.py Michael #6: lolcommits: selfies for software developers. lolcommits takes a snapshot with your webcam every time you git commit code, and archives a lolcat style image with it. git blame has never been so much fun. Infinite uses: Animate your progress through a project and watch as you age. See what you looked like when you broke the build. Keep a joint lolrepository for your entire company. 
Plugins: Lolcommits allows a growing list of plugins to perform additional work on your lolcommit image after capturing. Animate: Configure lolcommits to generate an animated GIF with each commit for extra lulz! Extras: Dan: Article & Course on Python 3.8 https://realpython.com/python38-new-features/ https://realpython.com/courses/cool-new-features-python-38/ Cecil: Twitch learning Python channel Michael: New Anvil course, free one - https://talkpython.fm/anvil PSF yearly survey is out: https://twitter.com/thepsf/status/1190004772704784385 Joke: LOLCODE
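On the "python -m pip" item from this episode: the point of the idiom is that pip runs under the interpreter you name, so packages land in that interpreter's environment. A small sketch of the same idea from inside a script, assuming you want to target the interpreter that is currently running; pip_command is a hypothetical helper and "requests" is only a placeholder package name.

```python
import sys


def pip_command(*args: str) -> list:
    """Build a pip invocation tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


cmd = pip_command("install", "requests")
print(cmd)
# To actually run it: subprocess.run(cmd, check=True)
```

The command is only built here, not executed, so the sketch has no side effects.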
November 6, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: Guido retires Guido van Rossum has left Dropbox and retired (post) Let’s all take a moment to say thank you. I wonder what will come next in terms of creative projects Some comments from community members (see Twitter thread) Brian #2: SeleniumBase Automated UI Testing with Selenium WebDriver and pytest. Very expressive and intuitive automation library built on top of Selenium WebDriver. method overview very readable (this is a workflow test, but still, quite readable): from seleniumbase import BaseCase class MyTestClass(BaseCase): def test_basic(self): self.open("https://xkcd.com/353/") self.assert_title("xkcd: Python") self.assert_element('img[alt="Python"]') self.click('a[rel="license"]') self.assert_text("free to copy and reuse") self.go_back() self.click("link=About") self.assert_text("xkcd.com", "h2") self.open("https://store.xkcd.com/collections/everything") self.update_text("input.search-input", "xkcd book\n") self.assert_exact_text("xkcd: volume 0", "h3") includes plugins for including screenshots in test results. supports major CI systems some cool features that I didn’t expect user onboarding demos assisted QA (partially automated with manual questions) support for selenium grid lots of command line options, including headless Michael #3: Reimplementing a Solaris command in Python gained 17x performance improvement from C Postmortem by Darren Moffat Is Python slow? A result of fixing a memory allocation issue in the /usr/bin/listusers command Decided to investigate if this ancient C code could be improved by conversion to Python. The C code was largely untouched since 1988 and was around 800 lines long, it was written in an era when the number of users was fairly small and probably existed in the local files /etc/passwd or a smallish NIS server. It turns out that the algorithm to implement the listusers is basically some simple set manipulation. 
Rewrite of listusers in Python 3 turned out to be roughly a 10th of the number of lines of code But Python would be slower, right? Turns out it isn't and in fact for some of my datasets (that had over 100,000 users in them) it was 17 times faster. A few of the comments asked about the availability of the Python version. It is the listusers command in Oracle Solaris 11.4 SRU 9 and higher. Since we ship the /usr/bin/listusers file as the Python code you can see it by just viewing the file in an editor. Note though that it is not open source and is covered by the Oracle Solaris licenses. Brian #4: 20 useful Python tips and tricks you should know I admit it, I’m capable of getting link-baited by the occasional listicle. Some great stuff, especially for people coming from other languages. Chained assignment: x = y = z = 2 Chained comparison: 2 < x … zip: with x = [1, 2, 3] and y = ['a', 'b', 'c'], zip(x, y) pairs them up as [(1, 'a'), (2, 'b'), (3, 'c')] and then some other weird stuff that I don’t find that useful. Michael #5: Complexity Waterfall via Ahrem Ahreff Heavy use of wemake-python-styleguide Code smells! Use your refactoring tools and write tests. Automation enables an opportunity of “Continuous Refactoring” and “Architecture on Demand” development styles. Brian #6: Plynth Plynth is a GUI framework for building cross-platform desktop applications with HTML, CSS and Python. Plynth has integrated the standard CPython implementation with Chromium's rendering engine. You can run your python scripts seamlessly with HTML/CSS instead of using Javascript with modules from pip Plynth uses Chromium/Electron for its rendering. With Plynth, every Javascript library turns into a Python module. Not open source. But free for individuals, including commercial use and education. A bunch of tutorial videos that are not difficult to follow, and not long, but… not really obvious code either. 
Python 3.6 and 3.7 development kits available Extras: Michael: Google Is Uncovering Hundreds Of Race Conditions Within The Linux Kernel Joke: Q: What's a web developer's favorite tea? A: URL gray via Aideen Barry
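The tips from Brian's listicle item in this episode are easy to try in a few lines. A minimal sketch; the variable names and the comparison bounds are chosen here for illustration and are not taken from the article.

```python
# Chained assignment: all three names are bound to the same value.
x = y = z = 2

# Chained comparison: equivalent to (1 < x) and (x <= 2).
in_range = 1 < x <= 2

# zip pairs items up by position. In Python 3 it returns a lazy iterator,
# so wrap it in list() to see the pairs.
numbers = [1, 2, 3]
letters = ["a", "b", "c"]
pairs = list(zip(numbers, letters))

print(x, y, z, in_range, pairs)
```

The list() call matters on Python 3; printing a bare zip object only shows its repr.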
October 29, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Bob Belderbos Brian #1: Lesser Known Coding Fonts Interesting examination of some coding fonts. Link to a great talk called Cracking the Code, by Jonathan David Ross, about coding fonts and Input. I’m trying out Input Mono right now, and quite like it. Fira code: https://github.com/tonsky/FiraCode Bob #2: Django Admin Handbook As a Django developer knowing the admin is pretty important. Free ebook of 40 or so pages, you can consume it in one evening. There are a lot of good tricks, 3 I liked: How to optimize queries in Django admin (override get_queryset) How to export CSV from Django admin (useful for data analysis in Jupyter for example) How to override save behaviour for Django admin (used this to notify users upon publishing a new exercise on our platform) Some more cool ebooks on that site, e.g. Tweetable #Python. Michael #3: Your Guide to the CPython Source Code Let’s talk about exploring the CPython code You’ll want to get the code: git clone https://github.com/python/cpython Compile the code (Anthony gives lots of steps for macOS, Windows, and Linux) Structure: cpython/ │ ├── Doc ← Source for the documentation ├── Grammar ← The computer-readable language definition ├── Include ← The C header files ├── Lib ← Standard library modules written in Python ├── Mac ← macOS support files ├── Misc ← Miscellaneous files ├── Modules ← Standard Library Modules written in C ├── Objects ← Core types and the object model ├── Parser ← The Python parser source code ├── PC ← Windows build support files ├── PCbuild ← Windows build support files for older Windows versions ├── Programs ← Source code for the python executable and other binaries ├── Python ← The CPython interpreter source code └── Tools ← Standalone tools useful for building or extending Python Some cool “hidden” goodies. For example, check out Lib/concurrent/futures/process.py, it comes with a cool ascii diagram of the process. 
Lots more covered, that we don’t have time for The Python Interpreter Process The CPython Compiler and Execution Loop Objects in CPython The CPython Standard Library Installing a custom version Brian #4: Six Django template tags not often used in tutorials Here’s a few: {% empty %}, for use in for loops when the array is empty {% lorem [count] [method] [random] %} for automatically filling with Lorem Ipsum text. {% verbatim %} … {% endverbatim %}, stop the rendering engine from trying to parse it and replace stuff. https://hipsum.co/ Bob #5: Beautiful code snippets with Carbon Beautiful images, great for teaching Python / programming. Used by a lot of developers, nice example I spotted today. Supports typing and drag and drop, just generated this link by dropping a test module onto the canvas! Great to expand Twitter char limit (we use it to generate Python Tip images). Follow the project here, seems they now integrate with Github. Michael #6: Researchers find bug in Python script may have affected hundreds of studies More info via Mike Driscoll at Thousands of Scientific Papers May be Invalid Due to Misunderstanding Python In a paper published October 8, researchers at the University of Hawaii found that a programming error in a set of Python scripts commonly used for computational analysis of chemistry data returned varying results based on which operating system they were run on. Scientists did not understand that Python’s glob.glob() does not return sorted results, throwing doubt on the results of more than 150 published chemistry studies. Researchers trying to analyze results from an experiment involving cyanobacteria discovered significant variations in results run against the same nuclear magnetic resonance spectroscopy (NMR) data. The scripts, called the "Willoughby-Hoye" scripts after their creators, were found to return correct results on macOS Mavericks and Windows 10. But on macOS Mojave and Ubuntu, the results were off by nearly a full percent. 
The module depends on the operating system for the order in which the files are returned. And the results of the scripts' calculations are affected by the order in which the files are processed. The fix: A simple list.sort()! Williams said he hopes the paper will get scientists to pay more attention to the computational side of experiments in the future. Extras: Nov 5 is the next Python PDX West Using Big Tech Tools Working on: PyBites platform: added flake8/ black code formatting, UI enhancements. Michael: Bezos DDoS'd: Amazon Web Services' DNS systems knackered by hours-long cyber-attack PyPI Just Crossed the 200,000 Packages Threshold! (via RP) XKCD Date — via André Jaenisch, Enter https://explainxkcd.com/wiki/index.php/1425:_Tasks and learn, that it was published on 24th Sep 2014. Joke: Q: What did the Network Administrator say when they caught a nasty virus? A: It hurts when IP
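The glob.glob() fix from Michael's item in this episode is small enough to show. A minimal sketch, using throwaway files in a temporary directory; the file names are made up for illustration. glob.glob() returns paths in whatever order the operating system lists them, so sort explicitly before any order-dependent processing.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files, deliberately not in alphabetical order.
    for name in ("b.out", "a.out", "c.out"):
        open(os.path.join(tmp, name), "w").close()

    # Order here is OS- and filesystem-dependent; do not rely on it.
    unsorted_paths = glob.glob(os.path.join(tmp, "*.out"))

    # The fix: sort before processing.
    sorted_paths = sorted(unsorted_paths)
    names = [os.path.basename(path) for path in sorted_paths]

print(names)
```

sorted() returns a new list; calling .sort() on the list returned by glob.glob(), as the notes mention, works equally well.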
October 23, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: Building a Python C Extension Module Tutorial, learn to use the Python API to write Python C extension modules. Invoke C functions from within Python Pass arguments from Python to C and parse them accordingly Raise exceptions from C code and create custom Python exceptions in C Define global constants in C and make them accessible in Python Test, package, and distribute your Python C extension module Extending Your Python Program there may be other lesser-used system calls that are only accessible through C Steps: Writing a Python Interface in C Figure out the arguments (e.g. int fputs(const char *, FILE *) ) Implement in C: #include <Python.h> static PyObject *method_fputs(PyObject *self, PyObject *args) { char *str, *filename = NULL; int bytes_copied = -1; /* Parse arguments */ if(!PyArg_ParseTuple(args, "ss", &str, &filename)) { return NULL; } FILE *fp = fopen(filename, "w"); bytes_copied = fputs(str, fp); fclose(fp); return PyLong_FromLong(bytes_copied); } In line 2, you declare the argument types you wish to receive from your Python code. In line 6, you’ll see that PyArg_ParseTuple() copies the arguments into the char* variables Write regular C code (fopen, fputs) Return: PyLong_FromLong() creates a PyLongObject, which represents an integer object in Python. a few extra functions that are necessary write definitions of your module and the methods it contains Before you can import your new module, you first need to build it. You can do this by using the Python package distutils. 
Brian #2: What’s New in Python 3.8 - docs.python.org We’ve already talked about the big hitters: assignment expressions, (the walrus operator) positional only parameters, (the / in the param list) f-strings support = for self-documenting expressions and debugging There are a few more goodies I wanted to quickly mention: More async: python -m asyncio launches a native async REPL More helpful warnings and messages when using is and is not to compare strings and integers and other types intended to be compared with == and != Missing the comma in stuff like [(1,2) (3,4)]. Happens all the time with parametrized testing you can do iterable unpacking in a yield or return statement x = (1, 2, 3) a, *b = x return a, *b $8,935 / month Brian #6: Auto formatters for Python A comparison of autopep8, yapf, and black Auto formatters are super helpful for teams. They shut down the unproductive arguments over style and make code reviews way more pleasant. People can focus on content, not being the style police. We love black. But it might be a bit over the top for some people. Here are a couple of other alternatives. autopep8 - mostly focuses on PEP8 “autopep8 automatically formats Python code to conform to the PEP 8 style guide. It uses the pycodestyle utility to determine what parts of the code needs to be formatted. autopep8 is capable of fixing most of the formatting issues that can be reported by pycodestyle.” black - does more doesn’t have many options, but you can alter line length, can turn off string quote normalization, and you can limit or focus the files it sees. does a cool “check that the reformatted code still produces a valid AST that is equivalent to the original.” but you can turn that off with --fast yapf - way more customizable. Great if you want to auto format to a custom style guide. “The ultimate goal is that the code YAPF produces is as good as the code that a programmer would write if they were following the style guide. 
It takes away some of the drudgery of maintaining your code.” Article is cool in that it shows some sample code and how it’s changed by the different formatters. Extras: Michael: New courses coming Financial Aid Launches for PyCon US 2020! Joke: American Py Song From Eric Nelson: Math joke. “i is as complex as it gets. jk.”
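The Python 3.8 features Brian lists in this episode fit in one short script. A minimal sketch that needs Python 3.8 or newer; the function names and sample data are invented for illustration.

```python
def split_first(values):
    a, *b = values
    # New in 3.8: unpacking in a return statement without parentheses.
    return a, *b


def clamp(value, low, high, /):
    # The '/' makes all three parameters positional-only.
    return max(low, min(value, high))


data = [3, 1, 4, 1, 5]

# Assignment expression (walrus operator): bind and test in one step.
if (n := len(data)) > 3:
    # The '=' specifier prints both the expression and its value.
    message = f"{n=}"
    print(message)

print(split_first((1, 2, 3)), clamp(12, 0, 10))
```

Calling clamp(value=12, low=0, high=10) raises a TypeError, which is the whole point of positional-only parameters.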
October 15, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #1: JPMorgan’s Athena Has 35 Million Lines of Python 2 Code, and Won’t Be Updated to Python 3 in Time With 35 million lines of Python code, the Athena trading platform is at the core of JPMorgan's business operations. A late start to migrating to Python 3 could create a security risk. Athena platform is used internally at JPMorgan for pricing, trading, risk management, and analytics, with tools for data science and machine learning. This extensive feature set utilizes over 150,000 Python modules, over 500 open source packages, and 35 million lines of Python code contributed by over 1,500 developers, according to data presented by Misha Tselman, executive director at J.P. Morgan Chase in a talk at PyData 2017. And JPMorgan is going to miss the deadline Roadmap puts "most strategic components" compatible with Python 3 by the end of Q1 2020 JPMorgan uses Continuous Delivery, with 10,000 to 15,000 production changes per week "If you maintain a library that other developers depend on," the post states, "you may be preventing them from updating to 3. By holding other developers back, you are indirectly and likely unintentionally increasing the security risks of others," adding that developers who do not publish code publicly should "consider your colleagues who may also be using your code internally." 
Brian #2: organize suggested by Ariel Barkan a Python based file management automation tool configuration is via a yml file command line tool to organize your file system examples: move all of your screenshots off of your desktop into a screenshots folder move old incomplete downloads into trash remove empty files from certain folders organize receipts and invoices into date based folders Michael #3: PEP 589 – TypedDict: Type Hints for Dictionaries With a Fixed Set of Keys Author: Jukka Lehtosalo Sponsor: Guido van Rossum Status: Accepted Version: 3.8 PEP 484 defines the type Dict[K, V] for uniform dictionaries, where each value has the same type, and arbitrary key values are supported. It doesn't properly support the common pattern where the type of a dictionary value depends on the string value of the key. Core idea: Consider creating a type to validate an arbitrary JSON document with a fixed schema Proposed syntax: from typing import TypedDict class Movie(TypedDict): name: str year: int movie: Movie = {'name': 'Blade Runner', 'year': 1982} Operations on movie can be checked by a static type checker movie['director'] = 'Ridley Scott' # Error: invalid key 'director' movie['year'] = '1982' # Error: invalid value type ("int" expected) Brian #4: gazpacho gazpacho is a web scraping library “It replaces requests and BeautifulSoup for most projects. “ “gazpacho is small, simple, fast, and consistent.” example of using gazpacho to scrape hockey data for fantasy sports. simple interface, short scripts, really beginner friendly retrieve with get, parse with Soup. I don’t think it will completely replace the other tools, but for simple get/parse/find operations, it may make for slimmer code. Note, I needed to update certificates to get this to work. see this. Michael #5: How pip install Works via PyDist What happens when you run pip install [somepackage]? First pip needs to decide which distribution of the package to install. 
This is more complex for Python than many other languages There are 7 different kinds of distributions, but the most common these days are source distributions and binary wheels. A binary wheel is a more complex archive format, which can contain compiled C extension code. Compiling, say, numpy from source takes a long time (~4 minutes on my desktop), and it is hard for package authors to ensure that their source code will compile on other people's machines. Most packages with C extensions will build multiple wheel distributions, and pip needs to decide which if any are suitable for your computer. To find the distributions available, pip requests https://pypi.org/simple/[somepackage], which is a simple HTML page full of links, where the text of the link is the filename of the distribution. To select a distribution, pip first determines which distributions are compatible with your system and implementation of python. For binary wheels, it parses the filenames according to PEP 425, extracting the python implementation, application binary interface, and platform. All source distributions are assumed to be compatible, at least at this step in the process Once pip has a list of compatible distributions, it sorts them by version, chooses the most recent version, and then chooses the "best" distribution for that version It prefers binary wheels if there are any Determining the dependencies for this distribution is not simple either. For binary wheels, the dependencies are listed in a file called METADATA. But for source distributions the dependencies are effectively whatever gets installed when you execute their setup.py script with the install command. What happens though if one of the distributions pip finds violates the requirements of another? It ignores the requirement and installs idna anyway! Next pip has to actually build and install the package. It needs to determine which library directory to install the package in: the system's, the user's, or a virtualenv's? 
Controlled by sys.prefix, which in turn is controlled by pip's executable path and the PYTHONPATH and PYTHONHOME environment variables. Finally, it moves the wheel files into the appropriate library directory, and compiles the python source files into bytecode for faster execution. Now your package is installed! Brian #6: daily pandas tricks Kevin Markham is sending out one pandas tip or trick per day via twitter. It’s been fun to watch and learn new bits. The link is a sampling of a bunch of them. Here’s just one example: Need to rename all of your columns in the same way? Use a string method: Replace spaces with _: df.columns = df.columns.str.replace(' ', '_') Make lowercase & remove trailing whitespace: df.columns = df.columns.str.lower().str.rstrip() Extras Michael: Switched to Adobe Audition Azure Databricks drops Python 2 Better Jupyter in VS Code macOS Catalina (so far so good) Jokes: via Sarcastic Pharmacist Hard to distinguish hard from easy in programming
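For the "How pip install Works" item in this episode, the step where pip reads compatibility tags out of a wheel filename can be sketched in a few lines. This is an illustration of the filename convention only, not pip's own code; parse_wheel_filename is a hypothetical helper and the numpy filename is just an example of the pattern.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its dash-separated parts.

    Expected shape: name-version[-build]-python-abi-platform.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("numpy-1.17.2-cp37-cp37m-manylinux1_x86_64.whl")
print(info)
```

pip then compares the python, abi and platform tags against the tags your interpreter supports to decide whether the wheel is a candidate at all.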
October 10, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Python alternative to Docker Matt Layman Using Shiv, from LinkedIn Mentioned briefly in episode 114 Shiv uses zipapp, PEP 441. Execute code directly from a zip file. App code and dependencies can be bundled together. “Having one artifact eliminates the possibility of a bad interaction getting to your production system.” article includes an example of all the steps for packaging a Django app with Gunicorn. includes talking about deployment. Matt includes shoutouts to: Platform as a Service providers Manual steps to do it all. Docker Compares the process against Docker and discusses when to choose one over the other. Also an interesting read: Docker is in deep trouble Michael #2: How to support open-source software and stay sane via Jason Thomas written by Anna Nowogrodzki Releasing lab-built open-source software often involves a mountain of unforeseen work for the developers. Article opens: “On 10 April, astrophysicists announced that they had captured the first ever image of a black hole. This was exhilarating news, but none of the giddy headlines mentioned that the image would have been impossible without open-source software.” The image was created using Matplotlib, a Python library for graphing data, as well as other components of the open-source Python ecosystem. Just five days later, the US National Science Foundation (NSF) rejected a grant proposal to support that ecosystem, saying that the software lacked sufficient impact. Open-source software is widely acknowledged as crucially important in science, yet it is funded non-sustainably. “It’s sort of the difference between having insurance and having a GoFundMe when their grandma goes to the hospital,” says Anne Carpenter Challenges Scientists writing open-source software often lack formal training in software engineering. Yet poorly maintained software can waste time and effort, and hinder reproducibility. 
If your research group is planning to release open-source software, you can prepare for the support work Obsolescence isn’t bad, she adds: knowing when to stop supporting software is an important skill. However long your software will be used for, good software-engineering practices and documentation are essential. These include continuous integration systems (such as TravisCI), version control (Git) and unit testing. To facilitate maintenance, Varoquaux recommends focusing on code readability over peak performance. Brian #3: The Hippocratic License Coraline Ada Ehmke Interesting idea to derive from MIT, but add restrictions. This license adds these restrictions: “The software may not be used by individuals, corporations, governments, or other groups for systems or activities that actively and knowingly endanger, harm, or otherwise threaten the physical, mental, economic, or general well-being of individuals or groups in violation of the United Nations Universal Declaration of Human Rights” I could see others with different restrictions, or this but more. Michael #4: MATLAB vs Python: Why and How to Make the Switch MATLAB® is widely known as a high-quality environment for any work that involves arrays, matrices, or linear algebra. I personally used it for wavelet-decomposition of real time eye measurements during cognitively intensive human workloads… That toolbox costs $2000 per user. Difference in philosophy: Closed, paid vs. open source. Since Python is available at no cost, a much broader audience can use the code you develop Also, there is GNU Octave is a free and open-source clone of MATLAB apparently Brian #5: PyperCard - Easy GUIs for All Nicholas Tollervey Came up on episode 143 Also, episode 89 of Test & Code Really easy to quickly set up a GUI specified by a list of “Card” objects. 
(different from cards project) Simple examples are choose-your-own-adventure type applications, where one button takes you to another card, and another button, a different card. However, the “next card” could be a Python function that can do anything, as long as it returns a string with the name of the next card. Lots of potential here, especially with input boxes, images, sound, and more. Super fun, but also might have business use. Michael #6: pynode Article: Bridging Node.js and Python with PyNode to Predict Home Prices Call Python code from Node.js Define a Python method In node: require pynode: const pynode = require('@fridgerator/pynode') Start an interpreter: pynode.startInterpreter() Call the function pynode.call('add', 1, 2, (err, result) => { if (err) return console.log('error : ', err) result === 3 // true }) Jokes The "Works on My Machine" Certification Program, get certified!
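Item #1 above notes that Shiv builds on zipapp (PEP 441), which lets Python execute code directly from a zip file. The sketch below uses only the standard-library zipapp module, not Shiv itself, so it does not bundle dependencies the way Shiv does. The app directory name, the archive name, and the helper function are hypothetical and chosen only for illustration.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run_pyz(message: str) -> str:
    """Bundle a tiny app into a .pyz archive and run it with this interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / "hello_app"  # hypothetical app directory
        app_dir.mkdir()
        # When the archive is executed, Python runs its __main__.py.
        (app_dir / "__main__.py").write_text(f"print({message!r})\n")

        archive = Path(tmp) / "hello_app.pyz"  # hypothetical archive name
        zipapp.create_archive(app_dir, target=archive)

        # The interpreter accepts the zip file as if it were a script.
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


print(build_and_run_pyz("hello from a zipapp"))
```

Shiv adds dependency bundling on top of this mechanism; the sketch only shows the single-artifact idea.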
October 5, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Michael #1: How to Stand Out in a Python Coding Interview Real Python, by James Timmins Are tech interviews broken? Well at least we can try to succeed at them anyway You’ve made it past the phone call with the recruiter, and now it’s time to show that you know how to solve problems with actual code… Interviews aren’t just about solving problems: they’re also about showing that you can write clean production code. This means that you have a deep knowledge of Python’s built-in functionality and libraries. Things to learn Use enumerate() to iterate over both indices and values Debug problematic code with breakpoint() Format strings effectively with f-strings Sort lists with custom arguments Use generators instead of list comprehensions to conserve memory Define default values when looking up dictionary keys Count hashable objects with the collections.Counter class Use the standard library to get lists of permutations and combinations Brian #2: The Python Software Foundation has updated its Code of Conduct There’s now one code of conduct for PSF and PyCon US and other spaces sponsored by the PSF This includes some regional conferences, such as PyCascades, and some meetup groups, (ears perk up) The docs Code of Conduct Enforcement Guidelines Reporting Guidelines Do we need to care? all of us, yes. If there weren’t problems, we wouldn’t need these. attendees, yes. Know before you go. organizers, yes. Better to think about it ahead of time and have a plan than have to make up a strategy during an event if something happens. me, in particular, and Michael. Ugh. yes. our first meetup is next month. I’d like to be in line with the rest of Python. So, yep, we are going to have to talk about this and put something in place. Michael #3: The Interview Study Guide For Software Engineers A checklist on my last round of interviews that covers many of the popular topics. Warm Up With The Classics Fizz Buzz 560. 
Subarray Sum Equals K Arrays: Left Rotation Strings: Making Anagrams Nth Fibonacci Many, many videos on interview topics and ideas Data Structures Algorithms Big O Notation Dynamic Programming String Manipulation System Design Operating Systems Threads Object Oriented Design Patterns SQL Fun conversation in the comments Brian #4: re-assert: “show where your regex match assertion failed” Anthony Sottile “re-assert provides a helper class to make assertions of regexes simpler.” The Matches object allows for useful pytest assertion messages. In order to get my head around it, I looked at the test code: https://raw.githubusercontent.com/asottile/re-assert/master/tests/re_assert_test.py and modified it to remove all of the with pytest.raises(AssertionError)… to actually get to see the errors and how to use it. def test_match_old(): > assert re.match('foo', 'fob') E AssertionError: assert None E + where None = [HTML_REMOVED]('foo', 'fob') E + where [HTML_REMOVED] = re.match test_re.py:8: AssertionError ____________ test_match_new ___________________ def test_match_new(): > assert Matches('foo') == 'fob' E AssertionError: assert Matches('foo') ^ == 'fob' E -Matches('foo') E - # regex failed to match at: E - # E - #> fob E - # ^ E +'fob' Michael #5: awesome-python-typing Collection of awesome Python types, stubs, plugins, and tools to work with them. Taxonomy Static type checkers Stub packages Tools Integrations Articles Communities Related Static type checkers: mypy - Optional static typing for Python 3 and 2 (PEP 484). Stub packages: Typeshed - Collection of library stubs for Python, with static types. Tools (super category): pytest-mypy - Mypy static type checker plugin for Pytest. Articles: Typechecking Django and DRF - Full tutorial about type-checking django. Brian #6: Developer Advocacy: Frequently Asked Questions Dustin Ingram I know a handful of people who have this job title. What is it? disclaimer: Dustin is a DA at Google.
Other companies might be different What is it? “I help represent the Python community at [company]" “part of my job is to be deeply involved in the Python community.” working on projects that help Python, PyPI, packaging, etc. speaking at conferences talking to people. customers and non-customers talking to product teams being “user zero” for new products and features paying attention to places users might raise issues about products working in open source creating content for Python devs being involved in the community as a company rep representing Python in the company coordinating with other DAs Work/life? Not all DAs travel all the time. that was my main question. Talk Python episode: War Stories of the Developer Evangelists Extras: https://www.meetup.com/Python-PDX-West/ Michael: requests moves to PSF Joke: via https://twitter.com/NotGbo/status/1173667028965777410 Web Dev Merit Badges
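Several of the “things to learn” from item #1 (the Real Python interview article) fit into a few lines of standard-library Python. The sketch below is only an illustration of those built-ins; the sample data and variable names are made up here and do not come from the article.

```python
from collections import Counter
from itertools import combinations, permutations

words = ["spam", "eggs", "spam", "ham"]  # made-up sample data

# enumerate() yields (index, value) pairs, so no manual counter is needed.
indexed = list(enumerate(words))

# Counter tallies hashable objects.
counts = Counter(words)

# dict.get() supplies a default when a key is missing.
prices = {"spam": 2}
eggs_price = prices.get("eggs", 0)

# A generator expression feeds sum() lazily instead of building a list first.
total_length = sum(len(w) for w in words)

# sorted() takes a key function for custom ordering: length first, then name.
by_length = sorted(set(words), key=lambda w: (len(w), w))

# itertools covers combinations and permutations.
pairs = list(combinations("abc", 2))
orders = list(permutations("ab"))

# f-strings format the results.
print(f"spam appears {counts['spam']} times; total length is {total_length}")
```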
September 25, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Dropbox: Our journey to type checking 4 million lines of Python Continuing saga, but this is a cool write up. Benefits “Experience tells us that understanding code becomes the key to maintaining developer productivity. Without type annotations, basic reasoning such as figuring out the valid arguments to a function, or the possible return value types, becomes a hard problem. Here are typical questions that are often tricky to answer without type annotations: Can this function return None? What is this items argument supposed to be? What is the type of the id attribute: is it int, str, or perhaps some custom type? Does this argument need to be a list, or can I give a tuple or a set?” Type checker will find many subtle bugs. Refactoring is easier. Running type checking is faster than running large suites of unit tests, so feedback can be faster. Typing helps IDEs with better completion, static error checking, and more. Long story, but really cool learnings of how and why to tackle adding type hints to a large project with many developers. Conclusion. mypy is great now, because Dropbox needed it to be. Michael #2: Setting Up a Flask Application in Visual Studio Code Video, but also as a post Follow on to the same in PyCharm: video and post Steps outside VS Code Clone repo Create a virtual env (via venv) Install requirements (via requirements.txt) Set up the Flask app ENV variable flask deploy ← custom command for DB VS Code Open the folder where the repo and venv live Open any Python file to trigger the Python subsystem Ensure the correct VENV is selected (bottom left) Open the debugger tab, add config, pick Flask, choose your app.py file Debug menu, start without debugging (or with) Adding tests via VS Code Open command palette (CMD SHIFT P), Python: Discover Tests, select framework, select directory of tests, file pattern, new tests bottle on the right bar Brian #3: Multiprocessing vs.
Threading in Python: What Every Data Scientist Needs to Know How data scientists can go about choosing between multiprocessing and threading, and which factors should be kept in mind while doing so. Does not consider async, but still some great info. Overview of both concepts in general and some of the pitfalls of parallel computing. The specifics in Python, with the GIL Use threads for waiting on IO or waiting on users. Use multiprocessing for CPU intensive work. The surprising bit for me was the benchmarks Using something speeds up the code. That’s obvious. The difference between the two isn’t as great as I would have expected. A discussion of the merits and benefits of both, from the perspective of data science. A few more examples, with code, included. Michael #4: ORM - async ORM And https://github.com/encode/databases The orm package is an async ORM for Python, with support for Postgres, MySQL, and SQLite. SQLAlchemy core for query building. databases for cross-database async support. typesystem for data validation. Because ORM is built on SQLAlchemy core, you can use Alembic to provide database migrations. Need to be pretty async savvy Brian #5: Getting Started with APIs dataquest.io post Conceptual introduction of web APIs Discussion of GET status codes, including a nice list with descriptions. examples: 301: The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint name is changed. 400: The server thinks you made a bad request. This can happen when you don’t send along the right data, among other things. endpoints endpoints that take query parameters JSON data Examples in Python for using: requests to query endpoints. json to load and dump JSON data. Michael #6: Memory management in Python This article describes memory management in Python 3.6. Everything in Python is an object. Some objects can hold other objects, such as lists, tuples, dicts, classes, etc.
such an approach requires a lot of small memory allocations To speed-up memory operations and reduce fragmentation Python uses a special manager on top of the general-purpose allocator, called PyMalloc. Layered managers RAM OS VMM C-malloc PyMem Python Object allocator Object memory Three levels of organization To reduce overhead for small objects (less than 512 bytes) Python sub-allocates big blocks of memory. Larger objects are routed to standard C allocator. three levels of abstraction — arena, pool, and block. Block is a chunk of memory of a certain size. Each block can keep only one Python object of a fixed size. The size of the block can vary from 8 to 512 bytes and must be a multiple of eight A collection of blocks of the same size is called a pool. Normally, the size of the pool is equal to the size of a memory page, i.e., 4Kb. The arena is a chunk of 256kB memory allocated on the heap, which provides memory for 64 pools. Python's small object manager rarely returns memory back to the Operating System. An arena gets fully released If and only if all the pools in it are empty. Extras Brian: Tuesday, Oct 6, Python PDX West, Thursday, Sept 26, I’ll be speaking at PDX Python, downtown. Both events, mostly, I’ll be working on new programming jokes unless I come up with something better. :) Michael: Gitbook Call for Proposals for PyCon 2020 Is Open Jokes: A few I liked from the dad joke list. What do you call a 3.14 foot long snake? A π-thon What if it’s 3.14 inches, instead of feet? A μ-π-thon Why doesn't Hollywood make more Big Data movies? NoSQL. Why didn't the div get invited to the dinner party? Because it had no class.
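The arena, pool, and block numbers quoted in item #6 can be checked with a little arithmetic. The sketch below only restates the figures from the notes (256 KB arenas, 4 KB pools, blocks in multiples of 8 bytes up to 512); it does not inspect the real allocator. The function name is hypothetical, and treating exactly 512 bytes as a “small” request is an assumption.

```python
ARENA_SIZE = 256 * 1024  # 256 KB arena, as quoted in the notes
POOL_SIZE = 4 * 1024     # 4 KB pool, one memory page
ALIGNMENT = 8            # block sizes are multiples of eight
MAX_SMALL_REQUEST = 512  # assumption: requests up to 512 bytes count as small

# 256 KB / 4 KB gives the 64 pools per arena mentioned above.
POOLS_PER_ARENA = ARENA_SIZE // POOL_SIZE


def block_size_for(nbytes):
    """Return the block size a small request would be rounded up to.

    Returns None for requests above the small-object threshold, which the
    notes say are routed to the standard C allocator instead.
    """
    if nbytes <= 0:
        raise ValueError("request size must be positive")
    if nbytes > MAX_SMALL_REQUEST:
        return None
    # Round up to the next multiple of ALIGNMENT.
    return -(-nbytes // ALIGNMENT) * ALIGNMENT


print(POOLS_PER_ARENA, block_size_for(9), block_size_for(513))
```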
September 18, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Annual Release Cycle for Python - PEP 602 Under discussion Annual release cadence Seventeen months to develop a major version: 5 months unversioned, 7 months alpha releases when new features and fixes come in, 4 months of betas with no new features, 1 month final RCs. One year of full support, four more years of security fixes. Rationale/Goals smaller releases features and fixes to users sooner more gradual upgrade path predictable calendar releases that line up well with sprints and PyConUS explicit alpha phase decrease pressure and rush to get features into beta 1 Risks Increase concurrent supported versions from 4 to 5. Test matrix increase for integrators and distributions. PEP includes rejected ideas like: 9 month cadence keep 18 month cadence Michael #2: awesome-asgi A curated list of awesome ASGI servers, frameworks, apps, libraries, and other resources ASGI is a standard interface positioned as a spiritual successor to WSGI. It enables communication and interoperability across the whole Python async web stack: servers, applications, middleware, and individual components. Born in 2016 to power the Django Channels project, ASGI and its ecosystem have been expanding ever since, boosted by the arrival of projects such as Starlette and Uvicorn in 2018. Frameworks for building ASGI web applications. Bocadillo - Fast, scalable and real-time capable web APIs for everyone. Powered by Starlette. Supports HTTP (incl. SSE) and WebSockets. Channels - Asynchronous support for Django, and the original driving force behind the ASGI project. Supports HTTP and WebSockets with Django integration, and any protocol with ASGI-native code. FastAPI - A modern, high-performance web framework for building APIs with Python 3.6+ based on standard Python type hints. Powered by Starlette and Pydantic. Supports HTTP and WebSockets. Quart - A Python ASGI web microframework whose API is a superset of the Flask API.
Supports HTTP (incl. SSE and HTTP/2 server push) and WebSockets. Responder - A familiar HTTP Service Framework for Python, powered by Starlette. (ASGI 2.0 only, ed.) Starlette - Starlette is a lightweight ASGI framework/toolkit, which is ideal for building high performance asyncio services. Supports HTTP and WebSockets. Brian #3: Jupyter meets the Earth Lindsey Heagy & Fernando Pérez “We are thrilled to announce that the NSF is funding our EarthCube proposal “Jupyter meets the Earth: Enabling discovery in geoscience through interactive computing at scale” “ “This project provides our team with $2 Million in funding over 3 years as a part of the NSF EarthCube program. It also represents the first time federal funding is being allocated for the development of core Jupyter infrastructure.” “Our project team includes members from the Jupyter and Pangeo communities, with representation across the geosciences including climate modeling, water resource applications, and geophysics. Three active research projects, one in each domain, will motivate developments in the Jupyter and Pangeo ecosystems. Each of these research applications demonstrates aspects of a research workflow which requires scalable, interactive computational tools.” “The adoption of open languages such as Python and the coalescence of communities of practice around open-source tools, is visible in nearly every domain of science. This is a fundamental shift in how science is conducted and shared.” Geoscience use cases climate data analysis hydrologic modeling geophysical inversions User-Centered Development data discovery scientific discovery through interactive computing established tools and data visualization using and managing shared computational infrastructure Michael #4: Asynchronous Django via Jose Nario Python compatibility Django 3.0 supports Python 3.6, 3.7, and 3.8. 
We highly recommend and only officially support the latest release of each series The Django 2.2.x series is the last to support Python 3.5. Other items but Big news: ASGI support Django 3.0 begins our journey to making Django fully async-capable by providing support for running as an ASGI application. This is in addition to our existing WSGI support. Django intends to support both for the foreseeable future. Note that as a side-effect of this change, Django is now aware of asynchronous event loops and will block you calling code marked as “async unsafe” - such as ORM operations - from an asynchronous context. Brian #5: The 1x Engineer Fun take on 10x. List actually looks like probably a 3-4x to me, maybe even 8x or more. How high does this scale go anyway? A non-exhaustive list of qualities; here are a few. Has a life outside engineering. Writes code that others can read. Doesn't act surprised when someone doesn’t know something. Asks for help when they need it. Is able to say "I don't know." Asks questions. Constructively participates in code reviews. Can collaborate with others. Supports code, even if they did not write it. Can feel like an impostor at times. Shares knowledge. Never stops learning. [obviously listens to Python Bytes, Talk Python, and Test & Code] Is willing to leave their comfort zone. Contributes to the community. Has productive and unproductive days. Doesn't take themselves too seriously. Fails from time to time. Has a favorite editor, browser, and operating system, but realizes others do too. Michael #6: Sunsetting Python 2 January 1, 2020, will be the day that we sunset Python 2 Why are you doing this? We need to sunset Python 2 so we can help Python users. How long is it till the sunset date? pythonclock.org will tell you. What will happen if I do not upgrade by January 1st, 2020? If people find catastrophic security problems in Python 2, or in software written in Python 2, then most volunteers will not help fix them. I wrote code in Python 2.
How should I port it to Python 3? Please read the official "Porting Python 2 Code to Python 3" guide. I didn't hear anything about this till just now. Where did you announce it? We talked about it at software conferences, on the Python announcement mailing list, on official Python blogs, in textbooks and technical articles, on social media, and to companies that sell Python support. Extras Brian: working on a Portland Westside Python Meetup, info will be at pythonpdx.com Hoping to get something ready for Oct. But… if not, hopefully by Nov. Michael: Humble Level Up Your Python Bundle
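Item #2 describes ASGI as a standard interface between servers and applications. As a rough sketch of what that interface looks like, an ASGI application is an async callable taking scope, receive, and send. The example below drives such an app by hand with asyncio instead of a real server such as Uvicorn; the scope dictionary is deliberately simplified, and the call_app helper is hypothetical.

```python
import asyncio


async def app(scope, receive, send):
    """A minimal ASGI HTTP application: always answers 200 with plain text."""
    assert scope["type"] == "http"
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        }
    )
    await send({"type": "http.response.body", "body": b"Hello, ASGI!"})


async def call_app(path="/"):
    """Hypothetical helper: play the server's role and record what the app sends."""
    sent = []

    async def receive():
        # A real server would deliver the request body here.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    # Simplified scope; real servers include more keys.
    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_app())
print(messages[0]["status"], messages[1]["body"])
```

Frameworks such as Starlette or FastAPI produce callables of this shape, which is what lets any ASGI server run them.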
September 11, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: rapidtables “rapidtables … converts lists of dictionaries to pre-formatted tables. And it does the job as fast as possible.” Also can do color formatting if used in conjunction with termcolor.colored, but I’m mostly excited about really easily generating tabular data with print. Can also format to markdown or reStructured text, and can do alignment, … Michael #2: httpx A next generation HTTP client for Python. 🦋 HTTPX builds on the well-established usability of requests, and gives you: A requests-compatible API. HTTP/2 and HTTP/1.1 support. Support for issuing HTTP requests in parallel. (Coming soon) Standard synchronous interface, but with [async](https://www.encode.io/httpx/async/)/[await](https://www.encode.io/httpx/async/) support if you need it. Ability to make requests directly to WSGI or ASGI applications. This is particularly useful for two main use-cases: Using httpx as a client, inside test cases. Mocking out external services, during tests or in dev/staging environments. Strict timeouts everywhere. Fully type annotated. 100% test coverage. Lovely support for “parallel requests” without full asyncio (at the API level). Also pairs with async / await with async client. Plus all the requests features Brian #3: Quick and dirty mock service with Starlette Matt Layman Mock out / fake a third party service in a testing environment. Starlette looks fun, but the process can be used with other API producing server packages. We tell people to do things like this all the time, but there are few examples showing how to. This example also introduces a delay because the service used in production takes over a minute and part of the testing is to make sure the system under test handles that delay gracefully. Very cool, easy to follow write up. (Should probably have Matt on a Test & Code episode to talk about this strategy.) 
Michael #4: Mocking out AWS APIs via Giuseppe Cunsolo A library that allows you to easily mock out tests based on AWS infrastructure. Lovely use of a decorator to mock out S3 Moto isn't just for Python code and it isn't just for S3. Look at the standalone server mode for more information about running Moto with other languages. Be sure to check out very important note. Brian #5: μMongo: sync/async ODM “μMongo is a Python MongoDB ODM. It inception comes from two needs: the lack of async ODM and the difficulty to do document (un)serialization with existing ODMs.” works with common mongo drivers such as PyMongo, TxMongo, motor_asyncio, and mongomock. (Hadn’t heard of mongomock before, I’ll have to try that out.) Note: We’ve discussed MongoEngine before. (I’m curious what Michael has to say about uMongo.) Michael #6: Single Responsibility Principle in Python via Tyler Matteson I’m a big fan of the SOLID principles They even come in demotivator style posters DI Liskov Substitution Principle Open/Closed Principle Single Responsibility Principle Interface Segregation Principle This article will guide you through a complex process of writing simple code. Extras Michael: Code Challenge Bite 220. Analyzing @pythonbytes RSS feed Jokes Q: What do you get when you cross a computer and a life guard? A: A screensaver! Q: What do you get when you cross a computer with an elephant? A: Lots of memory! via https://github.com/wesbos/dad-jokes Anti-joke (we ready for those yet?): A Python developer, a PHP developer, a C# developer, and a Go developer went to lunch together. They had a nice lunch and got along fine.
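To make the Single Responsibility Principle from item #6 a bit more concrete, here is a small sketch that is not taken from the article. Each function has one reason to change: parsing, computing, or formatting. All names and the sample input are made up for illustration.

```python
def parse_scores(text):
    """Parsing only: turn comma-separated text into a list of ints."""
    return [int(part) for part in text.split(",") if part.strip()]


def average(scores):
    """Computation only: the arithmetic mean of a non-empty list."""
    if not scores:
        raise ValueError("no scores given")
    return sum(scores) / len(scores)


def format_report(avg):
    """Presentation only: how the result is shown to the user."""
    return f"Average score: {avg:.1f}"


def report(text):
    """Composition: wire the single-purpose pieces together."""
    return format_report(average(parse_scores(text)))


print(report("90, 80,70"))
```

Because the pieces are separate, a change to the input format touches only parse_scores, and each function can be tested on its own.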
September 8, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Trey Hunner Brian #1: Positional-only arguments in Python by Sanket Feature in 3.8 “To specify arguments as positional-only, a / marker should be added after all those arguments in the function definition. “ Great example of a pow(x, y, /, mod=None) function where the names x and y are meaningless and if given in the wrong order would be confusing. Trey #2: django-stubs Type checking for Django! It’s new and from my very quick testing on my Django site it definitely doesn’t work with everything yet, but it’s promising I don’t use type annotations in Django yet, but I very well might eventually Michael #3: CodeCombat Super fun game for learning to code Real code but incredibly easy coding Can subscribe or use the free tier Just got $6M VC funding Brian #4: Four Use Cases for When to Use Celery in a Flask Application or really any web framework by Nick Janetakis “Celery helps you run code asynchronously or on a periodic schedule which are very common things you'd want to do in most web projects.” examples: sending emails out connecting to 3rd party APIs. performing long running tasks. Like, really long. Running tasks on a schedule. 
Trey #5: pytest-steps Created by smarie Can use a generator syntax with yield statements to break a big test up into multiple named “steps” that’ll show up in your pytest output If one step fails, the rest of the steps will be skipped by default You can also customize it to make optional steps, which aren’t required for future steps to run, or steps which depend on other optional steps explicitly The documentation shows a lot of different ways to use it, but the generator approach looks by far the most readable to me (also state is shared between steps with this approach whereas the others require some fancy state-capturing object which looks confusing to me) I haven’t tried this, but my use case would be my end-to-end/functional tests, which would work great with steps because I’m often using Selenium to navigate between a number of different pages and forms, one click at a time. Michael #6: docassemble Created by Jonathan Pyle A free, open-source expert system for guided interviews and document assembly, based on Python, YAML, and Markdown. Features WYSIWYG: Compose your templates in .docx (with help of a Word Add-in) or .pdf files. Signatures: Gather touchscreen signatures and embed them in documents. Live chat: Assist users in real time with live chat, screen sharing, and remote screen control. AI: Use machine learning to process user input. SMS: Send text messages to your users E-mail: Send and receive e-mails in your interviews. OCR: Use optical character recognition to process images uploaded by the user. Multilingual: Offer interviews in multiple languages. Multiuser: Develop applications that involve more than one user, such as mediation or counseling interviews. Extensible: Use the power of Python to extend the capabilities of your interviews. Open: Package your interviews and use GitHub and PyPI to share your work with the docassemble user community. Background Tasks: Do things behind the scenes of the interview, even when the user is not logged in. 
Scalable: Deploy your interviews on multiple machines to handle high traffic. Secure: Protect user information with server-side encryption, two-factor authentication, document redaction, and other security features. APIs: Integrate with third-party applications using the API, or send your interviews input and extract output using code. Extras Michael: PyPI closes in on 200k NumPy 1.17.0 released Python 3.8.0b4 is out Joke via Avram Lubkin Knock! Knock! Who's there? Recursive function. Recursive function who? Knock! Knock! Nice. to get that joke, you’ll have to understand recursion. to understand recursion, either google “recursion”, and click on “did you mean “recursion”” learn it in small steps. step one, recursion text conversation: first person: “Hey, what’s your address?” second: [HTML_REMOVED] first: “No. Your local address” second: 127.0.0.1 first: “No. Your physical address” second: [HTML_REMOVED]
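The positional-only marker from item #1 can be tried out with a small function modeled on the pow(x, y, /, mod=None) example mentioned there. This is a sketch that needs Python 3.8 or later; the function name power is chosen here to avoid shadowing the built-in.

```python
def power(x, y, /, mod=None):
    """x and y are positional-only; mod may still be passed by keyword."""
    result = x ** y
    return result if mod is None else result % mod


print(power(2, 10))            # positional arguments are fine
print(power(2, 10, mod=1000))  # mod can be given by name

try:
    power(x=2, y=10)  # names before the / cannot be used as keywords
except TypeError as exc:
    print(f"TypeError: {exc}")
```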
August 31, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Special guests Matt Harrison Anthony Sottile Michael #1: friendly-traceback via Jose Carlos Garcia (I think 🙂 ) Aimed at Python beginners: replacing standard traceback by something easier to understand Shows help for exception type Shows local variable values Shows code in a cleaner form with more context 3 ways to install As an exception hook Explicit explain When running an app Matt #2: Pandas Users Survey Most use it almost everyday but have less than 2 years experience Linux 61%, Windows 60%, Mac 42% 93% Python 3 Anthony #3: python3 “Y2K” problem (python3.10 / python4.0) with python3.8 close to release and python3.9 right around the corner, what comes after? both python3.10 and python4.0 present some problems sys.version[:3] which will suddenly report '3.1' in 3.10 a lot of code (including six.PY3!) uses sys.version_info[0] == 3 which will suddenly be false in python4.0 (and start running python2 code!) early-to-mid 2020 we should start seeing the next version in the wild as python3.9 reaches beta easy ways to start testing this early: python3.10 - a build of cpython for ubuntu with the version number changed flake8-2020 - a flake8 plugin which checks for these common issues Michael #4: pypi research via Adam (Codependent Codr) Really interesting research paper on the current state of PyPI from a couple authors at the University of Michigan: "An Empirical Analysis of the Python Package Index" - https://arxiv.org/pdf/1907.11073.pdf Comprehensive empirical summary of the Python Package Repository, PyPI, including both package metadata and source code covering 178,592 packages, 1,745,744 releases, 76,997 contributors, and 156,816,750 import statements. We provide counts and trends for packages, releases, dependencies, category classifications, licenses, and package imports, as well as authors, maintainers, and organizations.
Within PyPI, we find that the growth of the repository has been robust under all measures, with a compound annual growth rate of 47% for active packages, 39% for new authors, and 61% for new import statements over the last 15 years. In 2005, there were 96 active packages, 96! MIT is the most common license (Matt) I saw this and was surprised at most commonly used libraries. What do you think the most common 3rd party library is? Matt #5: DaPy “Pandas for humans” - Matt’s words Has portions of pandas, scikit-learn, yellowbrick, and numpy Designed for “data analysis, not for coders” Anthony #6: python-remote-pdb very small over-the-network remote debugger thin wrapper around pdb in a single file (easy to drop the file on PYTHONPATH if you can’t pip install) not as fully featured as other remote debuggers such as pudb / rpdb / pycharm’s debugger but very easy to drop in fully supports [breakpoint()](https://www.python.org/dev/peps/pep-0553/) (python3.7+ or via future-breakpoint) access pdb via telnet / nc / socat I’m using it to debug a text editor I’m writing to learn curses! Extras: Michael: Hacker Gets $12,000 In Parking Tickets After 'NULL' License Plate Trick Backfires PyCon 2020 site is up Matt http://bit.ly/psxgb - My new course on Machine Learning with XGBoost Anthony: https://github.com/DRMacIver/hecate “like selenium webdriver for the terminal” Jokes: Michael: Two mathematicians are sitting at a table in a pub having an argument about the level of math education among the general public. The one defending overall math knowledge gets up to go to the washroom. On the way back, he encounters their waitress and says, "I'll add an extra $10 to your tip, if you'll answer a question for me when I ask it. All you have to say is 'x-squared'." She agrees. A few minutes later the populist mathematician says to his buddy, "I'll bet you $20 that even our waitress can tell us the integral of 2x." The cynic agrees to the bet. 
So the schemer beckons the waitress to their table and asks the question, to which she replies "x-squared". As he begins to gloat and demand his winnings, the waitress continues, "Plus a constant." Anthony: I had a golang joke prepared, but then I panic()d
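The two version-check pitfalls from Anthony's item #3 are easy to reproduce with plain strings and tuples, without needing a Python 3.10 or 4.0 build. The helper names below are hypothetical; the sketch only contrasts the fragile checks quoted in the notes with tuple-based alternatives.

```python
import sys


def short_version_fragile(version_string):
    """Slicing sys.version-style text: breaks once the minor version has two digits."""
    return version_string[:3]


def short_version_robust(version_info):
    """Build major.minor from the version tuple instead of slicing text."""
    return f"{version_info[0]}.{version_info[1]}"


def is_py3_fragile(version_info):
    """Equality on the major version: false on a hypothetical 4.0."""
    return version_info[0] == 3


def is_py3_or_newer(version_info):
    """Tuple comparison keeps working for any later version."""
    return version_info >= (3,)


print(short_version_fragile("3.10.0"), short_version_robust((3, 10, 0)))
print(is_py3_fragile((4, 0, 0)), is_py3_or_newer((4, 0, 0)))
print(is_py3_or_newer(sys.version_info))
```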
August 23, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Chris #1: Why your mock doesn’t work Ned Batchelder TDD is an important practice for development, and as my team is finding out, mocking objects is not as easy as it seems at first. I love that Ned gives an overview of how Mock works But also gives two resources to show you alternatives to Mock, when you really don’t need it. From reading these articles and video, I’ve learned that it’s hard to make mocks but it’s important to: Create only one mock for each object you’re mocking that mocks only what you need have tests that run the mock against your code and your mock against the third party Mahmoud #2: Vermin By Morten Kristensen Rules-based Python version compatibility detector caniuse is cool, but it’s based on classifiers. When it comes to your own code, it’ll only tell you what you tell it. If you’ve got legacy libraries, or like most companies, an application, then you’ll need something more powerful. Vermin tells you the minimum compatible Python version, all the way down to the module and even function level. Brian #3: The nonlocal statement in Python Abhilash Raj When global is too big of a hammer. This doesn’t work: def function(): x = 100 def incr(y): x = x + y incr(100) This does: def function(): x = 100 def incr(y): nonlocal x x = x + y incr(100) print(x) Chris #4: twitter.com/brettsky/status/1163860672762933249 Brett Cannon Microsoft Azure improves Python support 2 key points about the new Python support in Azure Functions: it's debuting w/ 3.6, but 3.7 support is actively being worked on and 3.8 support won't take nearly as long, and native async/await support!
Mahmoud #5: Awesome Python Applications update Presented at PyBay 2019 Slides/summary (video forthcoming): http://sedimental.org/talks.html#ask-the-ecosystem-lessons-from-250-foss-python-applications 250+ applications, dating back to 1998 (mailman, gedit) 95% of applications have commits in 2019 65% of applications support Python 3 (even the ones with a long history!) Other interesting findings Presenting these findings and more at PyGotham 2019. NYC in early October. Brian #6: pre-commit now has a quick start guide Wanna use pre-commit but don’t know how to start? Here ya go! Runs through install configuration installing hooks running hooks against your project I’d like to add Add hooks to your project one at a time For each new hook add to .pre-commit-config.yaml run pre-commit install to install hook run pre-commit run --all-files review changes made to your project if good, commit if bad revert modify config of tools, such as pyproject.toml for black, .flake8 for flake8, etc. try again Extras Chris: Humble Bundle by No Starch supports the Python Software Foundation https://codechalleng.es/ released Newbie Bites… challenges that are intended for people brand new to Python. [[direct link](https://gumroad.com/l/Xhxeo)] Mahmoud: PyGotham 2019 October (Maintainers Conf in Washington DC, too) Real Python Pandas course Brian: http://py3readiness.org/ shows 360 of the top downloaded Python packages are all Python 3 ready. Jokes I was looking for some programming one liners online; looked on a reddit thread; read a great answer; which was “any joke can be a one-liner with enough semicolons.” A SQL statement walks into a bar and up to two tables and asks, “Mind if I join you?”
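The nonlocal example from Brian's item #3 earlier in this episode can be laid out as runnable code. The two functions below follow the snippets in the notes, with return statements added so the outcome can be checked; the names broken and working are chosen here for illustration.

```python
def broken():
    x = 100

    def incr(y):
        # Assigning to x makes it local to incr, so reading it first fails.
        x = x + y

    incr(100)
    return x


def working():
    x = 100

    def incr(y):
        nonlocal x  # rebind the x of the enclosing function instead
        x = x + y

    incr(100)
    return x


try:
    broken()
except UnboundLocalError as exc:
    print(f"UnboundLocalError: {exc}")

print(working())
```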
August 14, 2019
Special guest: Kelly Schuster-Paredes Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Keynote: Python 2020 - Łukasz Langa - PyLondinium19 Enabling Python on new platforms is important. Python needs to expand further than just CPython. Web, 3D games, system orchestration, mobile, all have other languages that are more used. Perhaps that’s because the full Python language, as implemented by CPython, is more than is needed, and a limited language is necessary. MicroPython and CircuitPython are successful. They are limited implementations of Python. Łukasz talks about many parts of Python that could probably be trimmed to make targeted platforms very usable without losing too much. It’d be great if more projects tried to implement Python versions for other platforms, even if the Python implementation is limited. Kelly #2: Mu Editor by Nicholas Tollervey Lots of updates happening to the Code with Mu software Mu is a Python code editor for beginner programmers, originally created as a contribution from the Python Software Foundation for the BBC’s micro:bit project Code with Mu presented at EuroPython and shared a lot of interesting updates and things in the alpha version of Mu, available on the Code with Mu website. Mu is a modal editor: BBC Microbit Circuit Python ESP Micropython Pygame Zero Python 3 Tiago Monte’s recorded presentation at EuroPython Game with Turtle Flask - release notes Made with Mu at EuroPython videos Hot off the press: Nick just released PyperCard, a HyperCard-inspired GUI framework for beginner developers in Python, based off of Adafruit’s release. “PyperCard is a HyperCard inspired Pythonic and deliberately constrained GUI framework for beginner programmers.” Linked repos on GitHub. The module re-uses the JSON specification used to create HyperCard. The concept allows users to “create Hypercard like stacks of states” so that beginner coders can create choose-your-own-adventure games.
Michael #3: Understanding the Python Traceback by Chad Hansen The Python traceback has a wealth of information that can help you diagnose and fix the reason for the exception being raised in your code. What do we learn right away? The type of error A description of the error (hopefully, sometimes) The line of code the error occurred on The call stack (filenames, line numbers, and module names) If the error happened while handling another error Read from bottom to top - that was weird to me Most common error? AttributeError: 'NoneType' object has no attribute 'an_attribute' Article talks about other common errors Are you creating custom exceptions to make your packages more useful? Brian #4: My oh my, flake8-mypy and pytest-mypy contributed by Ray Cote via email “For some reason, I continually have problems running mypy, getting it to look at the correct paths, etc. However, when I run it from flake8-mypy, I'm getting reasonable, actionable output that is helping me slowly type hint my code (and shake out a few bugs in the process). There's also a pytest-mypy, which I've not yet tried.” - Ray flake8-mypy Maintained by Łukasz Langa “The idea is to enable limited type checking as a linter inside editors and other tools that already support Flake8 warning syntax and config.” pytest-mypy Maintained by Dan Bader and David Tucker “Runs the mypy static type checker on your source files as part of your pytest test runs.” Remind me to do a PR against the README to make pytest lowercase. Kelly #5: Lego Education and Spike In March of this year, Lego Education announced a new robot, the first since the EV3 release of Mindstorms in 2013. Currently the EV3 Mindstorm can be coded with Python and it is assumed that Spike Prime can be as well. The current EV3 robots can be coded in Python thanks to Nigel Ward, who created a site back in 2016 or earlier, building on a project called ev3dev.
ev3dev is a Debian Linux-based operating system Until recently, Lego had not endorsed the use of Python, nor had they released documentation. Lego released a Getting started with EV3 MicroPython 59 page guide, Version 1.0.0 EV3 MicroPython runs on top of ev3dev with a new Pybricks MicroPython runtime and library. has its own Visual Studio Code extension no need for terminal Has instructions and lists of the different features and classes used to program the PyBricks API - a Python wrapper for the Databricks REST API. Pybricks is on GitHub from one contributor, Sebastien Thomas, under MIT license David Lechner, Laurens Valk, and Anton Vanhoucke are contributors to the Lego MicroPython release. This opens up opportunities for students that compete in the First Lego League Competition to code in Python. Example code for the Gyrobot Michael #6: Python 3 at Mozilla From January 2019. Mozilla uses a lot of Python. In mozilla-central there are over 3500 Python files (excluding third party files), comprising roughly 230k lines of code. Additionally there are 462 repositories labelled with Python in the Mozilla org on GitHub. That’s a lot of Python, and most of it is Python 2. But before tackling those questions, I want to address another one that often comes up right off the bat: Do we need to be 100% migrated by Python 2’s EOL? No. But punting the migration into the indefinite future would be a big mistake: Python 2 will no longer receive security fixes. All of the third party packages we rely on (and there are a lot of them) will also stop being supported Delaying means more code to migrate Opportunity cost: Python 3 was first released in 2008 and in that time there have been a huge number of features and improvements that are not available in Python 2. The best time to get serious about migrating to Python 3 was five years ago. The second best time is now. Moving to Python 3 We stood up some linters.
One linter that makes sure Python files can at least get imported in Python 3 without failing One that makes sure Python 2 files use appropriate __future__ statements to make migrating that file slightly easier in the future. Pipenv & poetry & Jetty: a little experiment I’ve been building. It is a very thin wrapper around Poetry Extras Brian: Python 3.8.0b3 “We strongly encourage maintainers of third-party Python projects to test with 3.8 during the beta phase and report issues …” Michael: pipx now has shell completions Kelly: Teaching Python podcast Jokes via Real Python and Nick Spirit Python private method → Joke cartoon image.
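Two points from Michael's traceback item (#3) in this episode, the common 'NoneType' AttributeError and the "happened while handling another error" section, can be reproduced in a few lines. This is only a sketch: ConfigError, load and capture are hypothetical names, and raise ... from is used here as one way to produce a chained traceback.

```python
import traceback


class ConfigError(Exception):
    """Hypothetical custom exception, as a package might define."""


def load(settings):
    try:
        # settings is None below, so .get raises AttributeError.
        return settings.get("name")
    except AttributeError as exc:
        # Chain the original error so both appear in the traceback.
        raise ConfigError("settings was never loaded") from exc


def capture():
    """Return the formatted traceback text instead of printing it."""
    try:
        load(None)
    except ConfigError:
        return traceback.format_exc()
```

Printing capture() and reading it from the bottom up gives the custom ConfigError first, then the line explaining that it was caused by the AttributeError above it.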
August 6, 2019
Special guest: Brett Thomas Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Writing sustainable Python scripts Vincent Bernat Turning a quick Python script into a maintainable bit of software. Topics covered: Documentation as a docstring helps future users/maintainers know what problem you are solving. CLI arguments with defaults instead of hardcoded values help extend the usability of the script. Logging. Including debug logging (and how to turn them on with CLI arguments), and system logging for unattended scripts. Tests. Simple doctests, and pytest tests utilizing parametrize to have one test and many test cases. Brett #2: Static Analysis and Bandit Michael #3: jupyter-black Black formatter for Jupyter Notebook One of the big gripes I have about these online editors is their formatting (often entirely absent) Then the extension provides a toolbar button a keyboard shortcut for reformatting the current code-cell (default: Ctrl-B) a keyboard shortcut for reformatting whole code-cells (default: Ctrl-Shift-B) Brian #4: Report Generation workflow with papermill, jupyter, rclone, nbconvert, … Chris Moffitt articles Automated Report Generation with Papermill: Part 1 Automated Report Generation with Papermill: Part 2 Jupyter Notebooks used to create a report with pandas and matplotlib nbconvert to create an html report Papermill to parametrize the process with different data, and execute the notebook Copy the reports to shared cloud folders using Rclone. Set up a process to automate everything. Hook it up to cron to run regularly Brett #5: Rant on time deltas datetime.timedelta(months=1) # Boom, too bad. Use: https://dateutil.readthedocs.io/en/stable/ Michael #6: How — and why — you should use Python Generators by Radu Raicea Generator functions allow you to declare a function that behaves like an iterator. They allow programmers to make an iterator in a fast, easy, and clean way. They only compute it when you ask for it. This is known as lazy evaluation. 
If you’re not using generators, you’re missing a powerful feature Often they result in simpler code than with lists and standard functions Extras Brian: PyPI now supports uploading via API token also on Test PyPI Michael: Chocolatey package manager on windows via Prayson Daniel GvM’s Next PEG article Jokes A good programmer is someone who always looks both ways before crossing a one-way street. (reminds me of another joke: Adulthood is like looking both ways before crossing the street, then getting hit by an airplane) Little bobby tables
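The lazy evaluation point from Michael's generators item (#6) in this episode can be seen with an infinite generator, which would be impossible to build as a list. A minimal sketch, where squares and first_squares are illustrative names, not taken from the article:

```python
import itertools


def squares():
    """Yield 0, 1, 4, 9, ... forever; each value is computed only on request."""
    n = 0
    while True:
        yield n * n
        n += 1


def first_squares(count):
    # islice pulls just `count` values, so the infinite loop never runs away.
    return list(itertools.islice(squares(), count))
```

Because nothing runs until a value is requested, first_squares(5) does only five iterations of the loop.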
July 29, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Debugging with f-strings in Python 3.8 We’ve talked about the walrus operator, :=, but not yet “debug support for f-strings” this: print(f'foo={foo} bar={bar}') can change to this: print(f'{foo=} {bar=}') and if you don’t want to print with repr() you can have str() be used with !s. print(f'{foo=!s} {bar=!s}') format specifiers, such as float modifiers, go right after the =: >>> import math >>> print(f'{math.pi=:.2f}') math.pi=3.14 one more feature, space preservation in the f-string expressions: >>> a = 37 >>> print(f'{a = }, {a = }') a = 37, a = 37 Michael #2: Am I a "real" software developer yet? by Sun-Li Beatteay To new programmers joining the field, especially those without CS degrees, it can feel like the title is safe-guarded. Only bestowed on the select that have proven themselves. Sometimes manifests itself as Impostor Syndrome Focused on front-end development as I had heard that HTML, CSS and JavaScript were easy to pick up That was when I decided to create a portfolio site for my wife, who was a product designer. Did my best to surround myself with tech culture. Watched YouTube videos listened to podcasts read blog posts from experienced engineers to keep myself motivated. Daydreamed what it would be like to stand in their shoes. My wife’s website went live in July of that year. I had done it. Could I finally start calling myself something of a Software Engineer? “Web development isn’t real programming” Spent the next 18 months studying software development full time. I quit my job and moved in with my in-laws, which was a journey in-and-of itself. Software engineer after 1-2 years?
Not so fast (says the internet) The solution that I found for myself was simple yet terrifying: talking to people MK: BTW, I don’t really like the term “engineer” Brian #3: Debugging with local variables and snoop debugging tools ex: “You want to know which lines are running and which aren't, and what the values of the local variables are.” Throw a @snoop decorator on a function and the function lines and local variable values will be dumped to stderr during the run. Even showing loops a bunch of times. These tools let you debug almost as if you had a debugger, without a debugger, and without having to add a bunch of logging or print statements. Lots of other use models to allow more focus. wrap just part of your function with a with snoop block only watch certain local variables. turn off reporting for deep function/block levels. Michael #4: New home for Humans This came out of the blue with some trepidation: kennethreitz commented 6 days ago: In the spirit of transparency, I'd like to (publicly) find a new home for my repositories. I want to be able to still make contributions to them, but no longer be considered the "owner" or "arbiter" or "BDFL" of these repositories. Some notable repos: https://github.com/kennethreitz/requests https://github.com/kennethreitz/records https://github.com/kennethreitz/requests-html https://github.com/kennethreitz/setup.py https://github.com/kennethreitz/legit https://github.com/kennethreitz/responder Lots of back and forth until Ernest jumped in. The Python Software Foundation would like to offer to accept transfers of these repositories into the @psf GitHub organization. This organization was recently acquired by the Python Software Foundation and intended to provide administrative backstopping for projects in the ecosystem; existing maintainers of various projects will remain and the PSF staff will be available to manage repositories and teams as necessary.
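For Brian's item #1 in this episode, here is a small runnable sketch of the f-string = specifier as it behaves in released Python 3.8 and later. The variable values and the function name debug_lines are made up for illustration; note that a format spec is written directly after the =.

```python
import math


def debug_lines():
    foo, bar = 42, "spam"
    a = 37
    return [
        f"{foo=} {bar=}",        # repr() is used by default
        f"{foo=!s} {bar=!s}",    # !s switches to str()
        f"{math.pi=:.2f}",       # format spec follows the = directly
        f"{a = }",               # whitespace around = is preserved
    ]
```

Printing each entry of debug_lines() gives foo=42 bar='spam', then foo=42 bar=spam, then math.pi=3.14, then a = 37.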
Brian #5: The Backwards Commercial License Eran Hammer - open source dev, including hapi.js Interesting idea to make open source projects maintainable Three phases of software lifecycle for some projects: first: project created to fill a need in one project/team/company, a single use case second: used by many, active community, growing audience three: work feels finished. bug fixes, security issues, minor features continue, but most people can stay on old stable versions During the “done” phase, companies would like to have bug fixes but don’t want to have to keep changing their code to keep up. Idea: commercial license to support old stable versions. “If you keep up with the latest version, you do not require a license (unless you want the additional benefits it will provide).” “However, very few companies can quickly migrate every time there is a new major release of a core component. Engineering resources are limited and in most cases, are better directed at building great products than upgrading supporting infrastructure. The backwards license provides this exact assurance. You can stay on any version you would like knowing that you are still running supported, well-maintained, and secure code.” “The new commercial license will include additional benefits focused on providing enterprise customers the assurances needed to rely on these critical components for many years to come. “ Michael #6: Switching Python Parsers? via Gi Bi, article by Guido van Rossum Alternative to the home-grown parser generator that I developed 30 years ago when I started working on Python. (That parser generator, dubbed “pgen”, was just about the first piece of code I wrote for Python.) Here are some of the issues with pgen that annoy me. The “1” in the LL(1) moniker implies that it uses only a single token lookahead, and this limits our ability of writing nice grammar rules. 
Because of the single-token lookahead, the parser cannot determine whether it is looking at the start of an expression or an assignment. So how does a PEG parser solve these annoyances? By using an infinite lookahead buffer! The typical implementation of a PEG parser uses something called “packrat parsing”, which not only loads the entire program in memory before parsing it, but also allows the parser to backtrack arbitrarily. Why not sooner? Memory! But that is much less of an issue now. My idea now, putting these things together, is to see if we can create a new parser for CPython that uses PEG and packrat parsing to construct the AST directly during parsing, thereby skipping the intermediate parse tree construction, possibly saving memory despite using an infinite lookahead buffer. Extras Brian: Plone 5.2 https://plone.org/news/2019/plone-5-2-the-future-proofing-release Plone is a content management system built on top of Zope, a web application server framework. Plone 5.2 supports Python 3.6, 3.7, 3.8 uses Zope 4, which also supports Python 3 Multi-year effort Interview with Philip Bauer, organizer of 5.2. Michael: Building Dab and T-Pose Controlled Lights - Make Art with Python Jokes A couple of quick ones: “What is a whale’s favorite language?” “C” - via Eric Nelson Why do Pythons live on land? Because they are above C-level! - via Jesper Kjær Sørensen @JKSlonester
July 23, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Becoming a 10x Developer : 10 ways to be a better teammate Kate Heddleston “A 10x engineer isn’t someone who is 10x better than those around them, but someone who makes those around them 10x better.” Create an environment of psychological safety Encourage everyone to participate equally Assign credit accurately and generously Amplify unheard voices in meetings Give constructive, actionable feedback and avoid personal criticism Hold yourself and others accountable Cultivate excellence in an area that is valuable to the team Educate yourself about diversity, inclusivity, and equality in the workplace Maintain a growth mindset Advocate for company policies that increase workplace equality article includes lots of actionable advice on how to put these into practice. examples: Ask people their opinions in meetings. Notice when someone else might be dominating a conversation and make room for others to speak. Michael #2: quasar & vue.py via Doug Farrell Quasar is a Vue.js based framework, which allows you as a web developer to quickly create responsive++ websites/apps in many flavours: SPAs (Single Page App) SSR (Server-side Rendered App) (+ optional PWA client takeover) PWAs (Progressive Web App) Mobile Apps (Android, iOS, …) through Apache Cordova Multi-platform Desktop Apps (using Electron) Great for python backends tons of vue components But could it be all python? vue.py provides Python bindings for Vue.js. It uses brython to run Python in the browser. Examples can be found here. Brian #3: Regular Expressions 101 We talked about regular expressions in episode 138 Some tools shared with me after I shared a regex joke on twitter, including this one. 
build expressions for Python and also PHP, JavaScript, and Go put in an example, and build the regex to match explanations included match information including match groups and multiple matches quick reference of all the special characters and what they mean generates code for you to see how to use it in Python Also fun (and shared from twitter): Regex Golf see how far you can get matching strings on the left but not the list on the right. I got 3 in and got stuck. seems I need to practice some more Michael #4: python-diskcache Caching can be HUGE for perf benefits But memory can be an issue Persistence across executions (e.g. web app redeploy) an issue Servers can be issues themselves Enter the disk! Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python. DigitalOcean and many hosts now offer SSDs by default Unfortunately the file-based cache in Django is essentially broken. DiskCache efficiently makes gigabytes of storage space available for caching. By leveraging rock-solid database libraries and memory-mapped files, cache performance can match and exceed industry-standard solutions. There's no need for a C compiler or running another process. Performance is a feature Testing has 100% coverage with unit tests and hours of stress. Nice comparison chart Brian #5: The Python Help System Overview of the built in Python help system, help() examples to try in a repl help(print) help(dict) help('assert') import math; help(math.log) Also returns docstrings from your non-built-in stuff, like your own methods. Michael #6: Python Architecture Graphs by David Seddon Impulse - a CLI which allows you to quickly see a picture of the import graph of any installed Python package at any level within the package. Useful to run on an unfamiliar part of a code base, to help get a quick idea of the structure. It's a visual explorer to give you a quick signal on architecture.
Import Linter - this allows you to declare and check contracts about your dependency graph, which gives you the ability to lint your code base against architectural rules. Helpful to enforce certain architectural constraints and prevent circular dependencies creeping in. Extras Michael: tabnanny flask course is out, give it a look Jokes Two threads walk into a bar. The barkeeper looks up and yells, 'Hey, I want don't any conditions race like time last!’ A string value walked into a bar, and then was sent to stdout.
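Brian's note in item #5 that help() also works on your own code can be tried without an interactive pager. The sketch below uses pydoc.render_doc, which returns roughly the text help() would display; area and help_text are hypothetical names used only for this example.

```python
import pydoc


def area(width, height):
    """Return the area of a width x height rectangle."""
    return width * height


def help_text(obj):
    # plaintext avoids the terminal bold sequences the default renderer adds.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)
```

In a REPL, help(area) shows the same docstring, so a one-line docstring is already enough to make a function discoverable this way.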
July 18, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Ines Montani Brian #1: Simplify Your Python Developer Environment Contributed by Nils de Bruin “Three tools (pyenv, pipx, pipenv) make for smooth, isolated, reproducible Python developer and production environments.” The tools: pyenv - install and manage multiple Python versions and flavors pipx - install a Python application with its own virtual environment for use globally pipenv - managing virtual environments, dependencies, on a per project basis Brian note: I’m not sold on any of these yet, but honestly haven’t given them a fair shake either, and also didn’t really know how to try them all out. This is a really good write up to get started. Ines #2: New fast.ai course: A Code-First Introduction to Natural Language Processing fast.ai is a really popular, free course for deep learning by Rachel Thomas and Jeremy Howard Also comes with a Python library and lots of notebooks Some influential research developed alongside the course, e.g. ULMFiT (popular algorithm for NLP tasks like text classification) New course on Natural Language Processing: Practical introduction to NLP covering both modern neural network approaches and traditional techniques Highlights: NLP background: topic modeling and linear models Rule-based approaches and real-world problem solving Focus on ethics – videos on bias and disinformation Michael #3: Cloning the human voice In 5 minutes, with Python via Brenden Clone a voice in 5 seconds to generate arbitrary speech in real-time An implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time.
Watch the video: https://www.youtube.com/watch?v=-O_hYhToKoA Also: Fake voices 'help cyber-crooks steal cash’ Brian #4: Ab(using) pyproject.toml and stuffing pytest.ini and mypy.ini content into it Contributed by Andrew Spittlemeister My first reaction is horror, but this is kinda my thought process with this one toml is not ini (but they look close) neither pytest nor mypy support storing configuration in pyproject.toml they both do support using setup.cfg (but flit and poetry projects don’t use that file, or try not to) they both support passing in the config file as a command line argument you can be careful and write a pyproject.toml file that is both toml and ini compliant drat, this is a reasonable idea, if not a little wacky no guarantee that it will keep working one thing to note: use quotes for stuff you normally wouldn’t need to in ini file. Example ini: [pytest] addopts = -ra -v if stuffed in pyproject.toml [pytest] addopts = "-ra -v" to run: > mypy --config-file pyproject.toml module_name > pytest -c pyproject.toml Ines #5: *Polyaxon* A platform for reproducing and managing the whole life cycle of machine learning and deep learning applications. We talked to lots of research groups and everyone works with just their GPU on desktop. Super slow – you need to wait for results, schedule next job etc. Polyaxon is a free open source library built on Kubernetes. Really easy to set up, especially on Google Kubernetes Engine. Especially good for hyper-parameter search, where you might not need GPU experiments if you can run lots of experiments in parallel Release v0.5 just came today. 
Big improvements: Plugins system Local runs, for much easier debugging New workflow engine for chaining things together and running experiments with lots of steps Michael #6: Flynt for f-strings A tool to automatically convert old string literal formatting to f-strings F-Strings: Not only are they more readable, more concise, and less prone to error than other ways of formatting, but they are also faster! Converted over 500 lines / expressions in Talk Python Training and Python Bytes. Get started with a pipx install: pipx install flynt Then point it at A file: flynt somefile.py A directory (recursively): flynt ./ Converts code like this: print("Greetings {}, you have found {:,} items!".format(name, count)) To code like this: print(f"Greetings {name}, you have found {count:,} items!") Beware of the digit grouping bug. Good project for jumping in and contributing to open source Extras: Thanks to André Jaenisch for pointing out the existence of ReDoS attacks and a good video explaining them. Michael: Python httptoolkit Python Magic’s name via David Martínez Flying Fractals (video and code) Python 3.7.4 is out Ines: Explosion (?) spaCy IRL 2019 our very first conference held on July 6 in Berlin many amazing speakers from research, applied NLP and the community all talks were recorded and will be up on our YouTube channel very soon FastAPI core developer Sebastián Ramírez is joining our team FastAPI was presented by Brian in episode 123 of this podcast we’re big fans and have been switching all our APIs over to FastAPI we’ll keep supporting the project and will definitely give Sebastián enough time to keep working on it Joke: A programmer walks into a bar and orders 1.38 root beers. The bartender informs her it's a root beer float. She says 'Make it a double!’ What do you call a developer without a side project? Well rested.
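The before and after from Michael's Flynt item (#6) can be checked for equivalence directly, which is a reasonable habit after any automated rewrite. A minimal sketch; old_style and new_style are illustrative wrappers, not part of flynt.

```python
def old_style(name, count):
    # The str.format() form that flynt looks for.
    return "Greetings {}, you have found {:,} items!".format(name, count)


def new_style(name, count):
    # The f-string form it is converted to; the :, spec carries over unchanged.
    return f"Greetings {name}, you have found {count:,} items!"
```

Both functions return the same string for the same inputs, including the thousands separators produced by the :, format spec.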
July 8, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: flake8-comprehensions submitted by Florian Dahlitz I’m already using flake8, so adding this plugin is a nice idea. checks your code for questionable generator and comprehension usage. C400 Unnecessary generator - rewrite as a list comprehension. C401 Unnecessary generator - rewrite as a set comprehension. C402 Unnecessary generator - rewrite as a dict comprehension. C403 Unnecessary list comprehension - rewrite as a set comprehension. C404 Unnecessary list comprehension - rewrite as a dict comprehension. C405 Unnecessary (list/tuple) literal - rewrite as a set literal. C406 Unnecessary (list/tuple) literal - rewrite as a dict literal. C407 Unnecessary list comprehension - '[HTML_REMOVED]' can take a generator. C408 Unnecessary (dict/list/tuple) call - rewrite as a literal. C409 Unnecessary (list/tuple) passed to tuple() - (remove the outer call to tuple()/rewrite as a tuple literal). C410 Unnecessary (list/tuple) passed to list() - (remove the outer call to list()/rewrite as a list literal). C411 Unnecessary list call - remove the outer call to list(). Example: Rewrite list(f(x) for x in foo) as [f(x) for x in foo] Rewrite set(f(x) for x in foo) as {f(x) for x in foo} Rewrite dict((x, f(x)) for x in foo) as {x: f(x) for x in foo} Michael #2: PyOxidizer (again) Michael’s assessment - There are three large and looming threats to Python. Lack of A real mobile development story GUI applications on desktop operating systems Sharing your application with users (this is VERY far from deployment to servers) Covered PyOxidizer before, but it seems to have just rocketed off in the last couple of weeks. In his PyCon 2019 keynote talk, Russell Keith-Magee identified code distribution as a potential black swan - an existential threat for longevity - for Python.
“Python hasn't ever had a consistent story for how I give my code to someone else, especially if that someone else isn't a developer and just wants to use my application.” They announced the first release of PyOxidizer (project, documentation), an open source utility that aims to solve the Python application distribution problem! PyOxidizer's marquee feature is that it can produce a single file executable containing a fully-featured Python interpreter, its extensions, standard library, and your application's modules and resources. You can have a single .exe providing your application. Unlike other tools in this space which tend to be operating system specific, PyOxidizer works across platforms (currently Windows, macOS, and Linux - the most popular platforms for Python today). PyOxidizer loads everything from memory and there is no explicit I/O being performed. When you import a Python module, the bytecode for that module is being loaded from a memory address in the executable using zero-copy. This makes PyOxidizer executables faster to start and import - faster than a python executable itself! Brian #3: Using changedir to avoid the need for src I’ve been experimenting with combining flit, pytest, tox, and coverage for new projects. And in doing so, ran across a cool feature of tox that I didn’t know about before, changedir. It’s a feature of tox that allows you to run tests in a different directory than the top level project directory. tox changedir docs tox and pytest and changedir I talk about this more in episode 80 of Test & Code. As an example project I built yet another markdown converter using regular expressions. This is funny to me, considering the recent cloudflare outage due to a single regular expression. https://blog.cloudflare.com/cloudflare-outage/ “Tragedy is what happens to me, comedy is what happens to you” - Mel Brooks approximate quote.
Michael #4: WebRTC and ORTC implementation for Python using asyncio Web Real-Time Communication (WebRTC) - WebRTC is a free, open project that provides browsers and mobile applications with Real-Time Communications (RTC) capabilities via simple APIs. Object Real-Time Communication (ORTC) - ORTC (Object Real-Time Communications) is an API allowing developers to build next generation real-time communication applications for web, mobile, or server environments. The API closely follows its Javascript counterpart while using pythonic constructs: promises are replaced by coroutines events are emitted using pyee.EventEmitter The main WebRTC and ORTC implementations are either built into web browsers, or come in the form of native code. In contrast, the aiortc implementation is fairly simple and readable. Good starting point for programmers wishing to understand how WebRTC works or tinker with its internals. Easy to create innovative products by leveraging the extensive modules available in the Python ecosystem. For instance you can build a full server handling both signaling and data channels or apply computer vision algorithms to video frames using OpenCV. Brian #5: Apprise - Push Notifications that work with just about every platform! listener suggestion cool shim project to allow multiple notification services in one app “Apprise allows you to send a notification to almost all of the most popular notification services available to us today such as: Telegram, Pushbullet, Slack, Twitter, etc. One notification library to rule them all. A common and intuitive notification syntax. 
Supports the handling of images (to the notification services that will accept them).” supports notification services such as discord, gitter, ifttt, mailgun, mattermost, MS teams, twitter, … SMS notification through Twilio, Nexmo, AWS, D7 email notifications Michael #6: Websauna web framework Websauna is a full stack Python web framework for building web services and back offices with admin interface and sign up process https://websauna.org "We have web applications 80% figured out. Websauna takes it up to 95%.” Built upon Python 3, Pyramid, and SQLAlchemy. When to use it? Websauna is focused on Internet facing sites where you have a public or private sign up process and an administrative interface. Its sweet spots include custom business portals and software-as-a-service products which are too specialized for off-the-shelf solutions. Benefits Focus on core business logic as Websauna provides basic website building blocks like sign up and sign in. Low learning curve and friendly comprehensive documentation help novice developers Emphasis is on meeting business requirements with reliable delivery times, responsiveness, consistency Site operations is half the story. Websauna provides an automated deployment process and integrates with monitoring, security and other DevOps solutions. Extras Michael: Data driven Flask course is out! Brian: Recent Test & Code episodes were solo because I’m in the middle of a work move and didn’t want to schedule interviews around a crazy work schedule. However, that should settle down in July and I can get back to getting great guests on the show. But I’m also having fun with solo topics, so I’ll keep that in the mix. upshot: if I’ve contacted you or you me about being on the show and you haven’t heard from me lately, give me a nudge with a DM or email or something. Jokes An SQL query goes into a bar, walks up to two tables and asks, 'Can I join you?' Not a joke, really, but along the lines of “comedy when it happens to you”. 
Reset procedure for GE lightbulbs theregister.co.uk/2019/06/20/ge_lightblulb_reset
July 2, 2019
Sponsored by Rollbar: https://pythonbytes.fm/rollbar Brian #1: Comparing the Same Project in Rust, Haskell, C++, Python, Scala and OCaml Tristan Hume, writing about a university project Teams of up to 3 people, multi month, write a Java to x86 compiler in language of choice Needed to pass both known and unknown tests. Secret tests to be run after submission encouraged teams to add more testing than provided. Nothing but standard libraries, and no parsing libraries, even if in standard. Lines of code Rust baseline Haskell: 1-1.6x C++: 1.4x Rust (another team): 3x Scala: 0.7 x OCaml: 1-1.6x Python: about half the size Python version one person used metaprogramming more extra features than any other team passed all public and secret tests Michael #2 : Pylustrator is a program to style your matplotlib plots via Len Wanger Pylustrator is a program to style your matplotlib plots for publication. Subplots can be resized and dragged around by the mouse, text and annotations can be added. Changes can be saved to the initial plot file as python code. Brian #3: MongoDB 4.2 Distributed Transactions extends multi-document ACID transactions across documents, collections, dbs in a replica set, and sharded cluster. Field Level Encryption encryption done on client side satisfies GDPR by allowing customer key destruction rendering server data on customer useless. system administration can be done with no exposure to private data Michael #4: Deep Difference and search of any Python object/data via François Leblanc DeepDiff: Deep Difference of dictionaries, iterables, strings and other objects. It will recursively look for all the changes. Lots of nice touches: List difference ignoring order or duplicates Report repetitions Exclude certain types from comparison Exclude part of your object tree from comparison Significant Digits DeepSearch: Search for objects within other objects. DeepHash: Hash of ANY python object based on its contents even if the object is not considered hashable! 
DeepHash is supposed to be deterministic in order to make sure 2 objects that contain the same data, produce the same hash. Brian #5: Advanced Python Testing Josh Peak “This article is mostly for me to process my thoughts but also to pave a path for anyone that wants to follow a similar journey on some more advanced python testing topics.” Learning journey (including some great podcasts and an awesome book on testing) Testing tools basic test structure adding black to testing with pytest-black linting with pylint including a very cool speed up trick to only lint modified files. flake8, including docstring checking tox.ini modifications code coverage goals and how to ratchet up to that goal with --cov-fail-under cool learning: “Increase code coverage by testing more code OR deleting code.” fixtures for database connections utilizing mocks, spies, stubs, and monkey patches, including pytest-mock pytest-vcr to save network interactions and replay them in future test runs, resulting in a 10x speedup. Lots of links and tangents possible from this article. Michael #6: Understanding Python's del via Kevin Buchs Official docs General confusion of what this does Looks like memory management, and it mostly isn’t Primary use: remove an item from a list given its index instead of its value or from a dictionary given its key: del person['profession'] # person is a dict del statement can also be used to remove slices from a list del lst[2:4] del can also be used to delete entire variables: del variable Recently covered how The CPython Bytecode Compiler is Dumb. Proactive dels could help. Extras Michael: Pynsource: Reverse engineer Python source code into UML diagrams (via Anders Klint) Language Bar chart race (via Josh Thurston) My Local maximum appearance. Jokes Optimist: The glass is half full. Pessimist: The glass is half empty. Programmer: The glass is twice as large as necessary. Pragmatist: allowing room for requirements oversights, scope creep, and schedule overrun. 
From “The Upside” with Kevin Hart and Bryan Cranston (watched it last night): K: Would you invest in [HTML_REMOVED]? B: That seems too niche. K: What’s “niche” mean? B: It’s the girl version of “nephew”.
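Going back to item #6 of this episode (Understanding Python's del): the three uses listed there can be tried in a few lines. This is only a small sketch; the function name demo and the sample data are made up for illustration and do not come from the article.

```python
def demo():
    # Remove a dictionary entry by key.
    person = {"name": "Ada", "profession": "engineer"}
    del person["profession"]

    # Remove a list item by index, then a slice.
    lst = [10, 20, 30, 40, 50]
    del lst[0]      # lst is now [20, 30, 40, 50]
    del lst[1:3]    # lst is now [20, 50]

    # Unbind a whole variable. This removes the name, not necessarily the
    # object: the object is only freed once nothing else refers to it.
    variable = "temporary"
    del variable
    try:
        variable
    except NameError:  # UnboundLocalError is a subclass of NameError
        unbound = True
    else:
        unbound = False

    return person, lst, unbound


person, lst, unbound = demo()
print(person, lst, unbound)
```

The last part is the reason del "looks like memory management, and mostly isn't": it unbinds a name, and whether memory is released depends on remaining references.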
June 25, 2019
Brought to you by Datadog: pythonbytes.fm/datadog Brian #1: Voilà! “from Jupyter notebooks to standalone applications and dashboards” Turn a notebook into a web app with: custom widgets runnable code (but not editable) interactive plots different custom grid layouts templates Michael #2: Toward a “Kernel Python” By Glyph Glyph wants to Marie Kondō the standard library (and I think I agree with him) We have PEP 594 for removing obviously obsolete and unmaintained detritus from the standard library. PEP 594 is great news for Python, and in particular for the maintainers of its standard library, who can now address a reduced surface area. Believes the PEP may be approaching the problem from the wrong direction. One “dead” battery is the colorsys module: why not remove it? “The module is useful to convert CSS colors between coordinate systems. Today, however, the modules you need to convert colors between coordinate systems are only a pip install away. Every little bit is overhead for the core devs, consider the state of PRs Looking at CPython’s keyword-based review queue, we can see that there are 429 tickets currently awaiting review. The oldest PR awaiting review hasn’t been touched since February 2, 2018, which is almost 500 days old. By Glyph’s subjective assessment, on this page of 25 PRs, 14 were about the standard library, 10 were about the core language or interpreter code We need a “kernel” version of Python that contains only the most absolutely minimal library, so that all implementations can agree on a core baseline that gives you a “python” Michael: There will be a cost to beginners. But there is already. Brian #3: Use __main__.py I didn’t know it was that easy to get python -m [HTML_REMOVED] to work. Michael #4: The CPython Bytecode Compiler is Dumb by Chris Wellons Given multiple ways to express the same algorithm or idea, Chris tends to prefer the one that compiles to the more efficient bytecode. 
Fortunately CPython, the main and most widely used implementation of Python, is very transparent about its bytecode. It’s easy to inspect and reason about its bytecode. The disassembly listing is easy to read and understand. One fact has become quite apparent: the CPython bytecode compiler is pretty dumb. With a few exceptions, it’s a very literal translation of a Python program, and there is almost no optimization. Darius Bacon points out that Guido van Rossum himself said, “Python is about having the simplest, dumbest compiler imaginable.” So this is all very much by design. The consensus seems to be that if you want or need better performance, use something other than Python. (And if you can’t do that, at least use PyPy.) ← Cython people, Cython. Example:
    def foo():
        x = 0
        y = 1
        return x
Could easily be:
    def foo():
        return 0
Yet, CPython completely misses this optimization for both x and y:
      2           0 LOAD_CONST               1 (0)
                  2 STORE_FAST               0 (x)
      3           4 LOAD_CONST               2 (1)
                  6 STORE_FAST               1 (y)
      4           8 LOAD_FAST                0 (x)
                 10 RETURN_VALUE
And so on. Brett Cannon has expressed performance as a major focus for CPython, maybe there is something here? Brian #5: You can play with EdgeDB now, maybe A Path to a 10x Database EdgeDB roadmap Alpha 1 is available. “EdgeDB is the next generation relational database based on PostgreSQL. It features a novel data model and an advanced query language.” I’m excited about what they’re doing. Looking forward to 1.0. Lots of great features listed in the 10x post, but what I’m most intrigued by is their replacement of SQL with a different query language. Michael #6: 16 Python libraries that helped a healthcare startup grow via Waqas Younas Worked with a U.S.-based healthcare startup for 7 years. This startup developed a software product that sent appointment reminders to the patients of healthcare facilities; the reminders were sent via email, text, and IVR. Paramiko - A Python implementation of SSHv2.
built-in CSV module SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper Requests - HTTP for Humans™ BeautifulSoup - Python library for pulling data out of HTML and XML files. testscenarios - a pyunit extension for dependency injection HL7 - a simple library for parsing messages of Health Level 7 (HL7) version 2.x into Python objects. Python-Phonenumbers - Library for parsing, formatting, and validating international phone numbers gevent - a coroutine -based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libev or libuv event loop. dateutil - powerful extensions to datetime (pip install python-dateutil) Matplotlib - a Python 2D plotting library which produces publication quality figures python-magic - a python interface to the libmagic file type identification library. libmagic identifies file types by checking their headers according to a predefined list of file types. Django - a high-level Python Web framework that encourages rapid development and clean, pragmatic design Boto - a Python package that provides interfaces to Amazon Web Services. Mailgun Python bindings - helped us send appointment reminders seamlessly Twilio’s Python bindings - helped us send appointment reminders seamlessly Extras Michael: United States Digital Service Jokes Difference between ML & AI? Ans.
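Going back to item #4 of this episode (The CPython Bytecode Compiler is Dumb): the disassembly is easy to reproduce with the standard dis module. The helper opnames below is a made-up name for this sketch, and the exact opcodes and offsets vary between CPython versions, so the output will probably not match the 2019 listing byte for byte.

```python
import dis


def foo():
    x = 0
    y = 1
    return x


def opnames(func):
    """Return just the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


# Print the full disassembly, then the opcode names on their own.
dis.dis(foo)
print(opnames(foo))
```

If the dead stores to x and y were optimized away, no store instruction would appear in the list; on the CPython versions this was written against, the stores are still there.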
June 20, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest Max Sklar Brian #1: Why do Python lists let you += a tuple, when you can’t + a tuple? Reuven Lerner >>> x = [1, 2, 3] >>> b = (4, 5, 6) >>> x + b Traceback (most recent call last): File "[HTML_REMOVED]", line 1, in [HTML_REMOVED] TypeError: can only concatenate list (not "tuple") to list >>> x += b >>> x [1, 2, 3, 4, 5, 6] Huh?? “It turns out that the implementation of list.__iadd__ (in place add) takes the second (right-hand side) argument and adds it, one element at a time, to the list. It does this internally, so that you don’t need to execute any assignment after. The second argument to “+=” must be iterable.” Max #2: R vs Python, R is out of top 20 languages despite statistical boom Subtitle: is R declining because of Python? First of all, this article is about the TIOBE index, which ranks the popularity of programming languages. Obviously it’s a combination of many different scores, and that could be controversial, but I’m going to assume that they put some thought into how the rankings are calculated, and that it’s as good as any. A few stories here: first, Python hit an all-time high in their ranking at number 3, beating out C++ I believe for the first time, and only Java and C are above it. The other story is that the statistical language R dipped below 20 to number 21, and the speculation is that Python has sort of taken over from R as the preferred statistical language. Personally, I got into Python much sooner, because I started as a software engineer, and moved into data science and machine learning. So after taking CS, and programming in Java and C for a few years, Python came much more naturally. But still, a lot of people who are data-science first (and they have additional skills compared with the kind of hybrid that I am) like and prefer R, and they can use it in a specialized way and get good results.
Personally, I’m going to stick with Python, because there are so many statistical libraries yet to learn, and it’s served me well thus far. The language I’ve used most in recent years, Scala, is surprisingly down at 31 - not even close! related: https://www.zdnet.com/article/programming-languages-python-predicted-to-overtake-c-and-java-in-next-4-years/ Michael #3: macOS deprecates Python 2, will stop shipping it (eventually) via Dan Bader, on the heels of WWDC 2019 “Future versions of macOS won’t include scripting language runtimes by default” Contrast this with Windows just now starting to ship with Python 3 In the same announcement: “Use of Python 2.7 isn’t recommended as this version is included in macOS for compatibility with legacy software. Future versions of macOS won’t include Python 2.7. Instead, it’s recommended that you run python3 from within Terminal. (51097165)” Also has impact wider than “us”. E.g. no Ruby or Perl means Homebrew doesn’t install easily, which is how we get Python 3! Brian #4: Pythonic Ways to Use Dictionaries Al Sweigart A few pythonic uses of dictionaries that are not obvious to new people. Use get() and setdefault() with Dictionaries get(key, default=[HTML_REMOVED]) allows you to read a key without checking for its existence beforehand. setdefault(key, default=[HTML_REMOVED]) is a bit of a strange duck but still useful. Set the value of something if it doesn’t exist yet. Python Uses Dictionaries Instead of a Switch Statement Just do it a few times to get the hang of it. Then it becomes natural. Michael's switch addition for Python: https://github.com/mikeckennedy/python-switch Max #5: Things you are probably not using in Python 3 But Should This is from Datawhatnow.com This is particularly relevant for me, since I used legacy Python at Foursquare for many years, and am now coming back to it and taking another look at Python 3.
One that looks very useful is f-strings, where you can put the variable name in braces in a string and just have it replaced. I’ve seen things like this in other languages - notably PHP and most front-end scripts. Makes the code very readable. Except I know I’m going to screw up by leaving out that stray “f” in front of the string. It should almost be automatic, because how often are you putting these variable names in braces? Another thing I didn’t know Python 3 had (again, I’m kind of just getting started with Python 3) is enumerations. I’ve been using Enums for years in Scala (really case classes) to make my code WAY more readable. Will keep that in mind when developing in Python 3. Michael #6: Have a time machine? C++ would get the Python 2 → 3 treatment too via James Small In a recent CppCast interview, Herb Sutter describes how he would change C/C++ types if he could go back in time. This is almost exactly how things were changed from Python 2 to Python 3 (str split into Unicode strings and byte arrays) So my question to you two is: Why was the transition so hard? Was it just habit and stubbornness? What could the PSF have done? Extras Michael: pip install mystery by Divo Kaplan A random Python package every time. Mystery is a Python package that is instantiated as a different package every time you install it! Inspired by one of our episodes Get our effective pycharm book bundle with the courses over at effectivepycharm.com Brian: Python 3.8.0b1 If you support a package, please test. Max: The Local Maximum Weekly podcast that covers the theoretical issues in probability theory, philosophy, and machine learning, and then applies them in a practical way to things like current events and product development.
For example, a few weeks ago I did a show on how to estimate the probability of an event that has never occurred. We also cover things like Apple’s decision to break up iTunes, how the internet is shaping up in places like Cuba, and the controversy around YouTube’s recommendation algorithm. Jokes MK: There are only two hard problems in Computer Science: cache invalidation, naming things and off-by-one-errors.
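Going back to item #4 of this episode (Pythonic Ways to Use Dictionaries): here is a minimal sketch of get(), setdefault(), and a dictionary standing in for a switch statement. The function names and sample data are invented for illustration and are not taken from Al Sweigart's article.

```python
import math


def count_words(words):
    """get() reads a key with a fallback, so no existence check is needed."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def group_by_first_letter(words):
    """setdefault() stores the default only if the key is missing, then returns the value."""
    groups = {}
    for word in words:
        groups.setdefault(word[0], []).append(word)
    return groups


def area(shape, size):
    """A dict of callables used in place of a switch statement."""
    handlers = {
        "square": lambda s: s * s,
        "circle": lambda s: math.pi * s * s,
    }
    handler = handlers.get(shape)
    if handler is None:
        raise ValueError(f"unknown shape: {shape}")
    return handler(size)


print(count_words(["a", "b", "a"]))
print(group_by_first_letter(["apple", "avocado", "banana"]))
print(area("square", 3))
```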
June 12, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Three scientists publish a paper proving that Mercury, not Venus, is the closest planet to Earth, using Python. Contributed by, and explained by, listener Andrew Diederich. “This is from the March 19th, 2019 Strange Maps article. Which planet is, on average, closest to the Earth? Answer: Mercury. Actually, Mercury is, on average, the closest to all other planets, because it’s closest to the sun.” The article, including video, uses PyEphem, which apparently is now deprecated and largely replaced with skyfield. Michael #2: GitHub Semantic Parsing, analyzing, and comparing source code across many languages Written in Haskell, it’s a library and command line tool for parsing, analyzing, and comparing source code. It’s still early days yet, but semantic can do a lot of cool things, and is powering public-facing GitHub features. I’m tremendously excited to see how it’ll evolve now that it’s a community-facing project. Understands: Python, TypeScript, JavaScript, Ruby, Go, … Here are some cool things inside it: A flow-sensitive, caching, generalized interpreter for imperative languages An abstract interpreter that generates scope graphs for a given program text A strategic rewriting system based on recursion schemes for open syntax terms Brian #3: flake8-black Contributed by Nathan Clayton “The point of this plugin is to be able to run black --check ... from within the flake8 plugin ecosystem.” I like to run flake8 during development both to keep things neat, and to train myself to just write code in a more standard way. This is a way to run black with no surprises. Michael #4: Python Preview for VS Code You write Python code (script style mostly), it creates an object-visualization Think of a picture your first year C++ CS prof might draw. This extension does that automatically as you write Python code Looks to be based (conceptually) on Philip Guo’s Python Tutor site.
Brian #5: Create and Publish a Python Package with Poetry John Franey Walks through creating a package, customizing the pyproject.toml, and talks about the different settings in the toml and what they mean. Then uses TestPyPI, and finally publishes. Michael #6: Pointers in Python: What's the Point? by Logan Jones Quick question: Does Python have pointers (outside of C-extensions, etc of course)? Yet Python is more pointer heavy than most languages (more so than C#, more so than even C++)! In Python, everything is an object, even numbers and booleans. Each object contains at least three pieces of data: Reference count Type Value Check that you have the same object with is instead of == Python variables are pointers, just safe ones. Interesting little tidbit from the article: Interning strings is useful to gain a little performance on dictionary lookup: if the keys in a dictionary are interned, and the lookup key is interned, the key comparisons (after hashing) can be done by a pointer compare instead of a string compare. (Source) But like we have inline-assembly in C++ and unsafe mode in C#, we can use pointers in Cython or more fine-grained with ctypes. Extras Michael: PSF needs your help. Spread the word about the fundraiser and please, ask your company to contribute: Building the PSF: the Q2 2019 Fundraiser (Donations are tax-deductible for individuals and organizations that pay taxes in the United States) “Contributions help fund workshops, conferences, pay meetup fees, support fiscal sponsorships, PyCon financial aid, and development sprints. ” Jokes via Jay Miller What did the developer name his newborn boy? JSON
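Going back to item #6 of this episode (Pointers in Python): the difference between == (same value) and is (same object), plus the string interning tidbit, can be checked directly. This is a small sketch with made-up variable names, not code from Logan Jones's article.

```python
import sys

a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name pointing at the same object

same_value = a == b     # compares contents
same_object = a is b    # compares identity (the "pointer")
alias = c is a

# sys.intern returns one canonical object per distinct string value, so two
# equal strings built in different ways end up as the very same object.
s1 = sys.intern("".join(["py", "thon"]))
s2 = sys.intern("python")
interned_same = s1 is s2

print(same_value, same_object, alias, interned_same)
```

Mutating the list through c also changes what a sees, which is the "variables are pointers, just safe ones" point in practice.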
June 5, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Python built-ins worth learning Trey Hunner “I estimate most Python developers will only ever need about 30 built-in functions, but which 30 depends on what you’re actually doing with Python.” “I recommend triaging your knowledge: Things I should memorize such that I know them well Things I should know about so I can look them up more effectively later Things I shouldn’t bother with at all until/unless I need them one day” all 69 built-in functions, split into: commonly known, overlooked by beginners, learn it later, maybe learn it eventually, you likely don’t need these. Highlighting some: overlooked by beginners: sum, enumerate, zip, bool, reversed, sorted, min, max, any, all know it’s there, but learn it later: open, input, repr, super, property, issubclass, isinstance, hasattr, getattr, setattr, delattr, classmethod, staticmethod, next my notes: I think getattr should be learned early on, because its default behavior is so useful. But you can’t use it for dict keys. Use mydict.get(key, default) for dictionaries. Michael #2: GitHub sponsors and match Like Patreon but for GitHub projects 2x your sponsorship: GitHub matches! To boost community funding, we'll match contributions up to $5,000 during a developer’s first year in GitHub Sponsors with the GitHub Sponsors Matching Fund. 100% to developers, Zero fees: GitHub will not charge fees for GitHub Sponsors. Anyone who contributes to open source (whether through code, documentation, leadership, mentorship, design, or beyond) is eligible for sponsorship. Brian #3: Build a REST API in 30 minutes with Django REST Framework Bennett Garner Very fast intro including: Set up Django Create a model in the database that the Django ORM will manage Set up the Django REST Framework Serialize the model from step 2 Create the URI endpoints to view the serialized data Example is a simple hero db with hero name and alias.
Michael #4: Dependabot has been acquired by GitHub Automated dependency updates: Dependabot creates pull requests to keep your dependencies secure and up-to-date. I personally use and recommend PyUP: https://pyup.io/ How it works: Dependabot checks for updates: Dependabot pulls down your dependency files and looks for any outdated or insecure requirements. Dependabot opens pull requests: If any of your dependencies are out-of-date, Dependabot opens individual pull requests to update each one. You review and merge: You check that your tests pass, scan the included changelog and release notes, then hit merge with confidence. Here's what you need to know: We're integrating Dependabot directly into GitHub, starting with security fix PRs 👮‍♂️ You can still install Dependabot from the GitHub Marketplace whilst we integrate it into GitHub, but it's now free of charge 🎁 We've doubled the size of Dependabot's team; expect lots of great improvements over the coming months 👩‍💻👨‍💻👩‍💻👨‍💻👩‍💻👨‍💻 Paid accounts are now free, automatically. Brian #5: spoof “New features planned for Python 4.0” Charles Leifer - also known for Peewee ORM This is funny, but painful. Is it too soon to joke about the pain of 2 to 3? A few of my favorites PEP8 will be updated. Line lengths will be increased to 89.5 characters. (compromise between 79 and 100) All new libraries and standard lib modules must include the phrase "for humans" somewhere in their title. Type-hinting has been extended to provide even fewer tangible benefits and will be called type whispering. You can make stuff go faster by adding async before every other keyword. Notable items left out of 4.0 Still no switch statement. No improvements to packaging. Michael #6: BlackSheep web framework Fast HTTP Server/Client microframework for Python asyncio, using Cython, uvloop, and httptools. Very Flask-like API. Interesting to consider the “popularity” of Flask vs Django in this context. 
Objectives Clean architecture and source code, following SOLID principles Intelligible and easy to learn API, similar to those of many Python web frameworks Keep the core package minimal and focused, as much as possible, on features defined in HTTP and HTML standards Targeting stateless applications to be deployed in the cloud High performance, see results from TechEmpower benchmarks (links in Wiki page) Also has an async client much like aiohttp. Extras Michael: Free courses in the Training mobile apps Upcoming webcast: 10 Tools and Techniques Python Web Developers Should Explore 2019 PSF Board Elections Get PyCharm, Support Python Until June 1st, get PyCharm at 30% OFF All the money raised will go toward the Python Software Foundation Jokes How do you generate a random string? Put a first year Computer Science student in Vim and ask them to save and exit. Waiter: He's choking! Is anyone a doctor? Programmer: I'm a Vim user.
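Going back to Brian's note in item #1 of this episode (getattr's default, and why it does not work for dictionary keys): a quick sketch, with a made-up Config class and settings dict, shows the difference.

```python
class Config:
    debug = True


cfg = Config()

# getattr with a third argument returns the default instead of raising
# AttributeError when the attribute is missing.
debug = getattr(cfg, "debug", False)
verbose = getattr(cfg, "verbose", False)

# Dictionary keys are not attributes, so getattr never finds them and
# always falls back to the default here.
settings = {"debug": True}
via_getattr = getattr(settings, "debug", None)

# For dictionaries, use .get(key, default) instead.
via_get = settings.get("debug", False)
missing = settings.get("verbose", False)

print(debug, verbose, via_getattr, via_get, missing)
```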
May 30, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: History of CircuitPython PSF blog, A. Jesse Jiryu Davis Adafruit hired Scott Shawcroft to port MicroPython to the SAMD21 chip they use on many of their boards. CircuitPython is a friendly fork of MicroPython. Same licensing, and they share improvements back and forth. “MicroPython customizes its hardware APIs for each chip family to provide speed and flexibility for hardware experts. Adafruit’s audience, however, is first-time coders. Shawcroft said, ‘Our goal is to focus on the first five minutes someone has ever coded.’” “Shawcroft aims to remove all roadblocks for beginners to be productive with CircuitPython. As he demonstrated, CircuitPython auto-reloads and runs code when the user saves it; there are two more user experience improvements in the latest release. First, serial output is shown on a connected display, so a program like print("hello world") will have visible output even before the coder learns how to control LEDs or other observable effects.” Related: CircuitPython 4.0.0 released Michael #2: R Risks Python Swallowing It Whole: TIOBE Is the R programming language in serious trouble? According to the latest update of the TIOBE Index, the answer seems to be “yes.” R has finally tumbled out of the top 20 languages “It seems that there is a consolidation going on in the statistical programming market. Python has become the big winner.” Briefly speculates on why Python (which ranked fourth on this month’s list) is winning big in data science. My thought: Python is a full spectrum language with solid numerical support. Brian #3: The Missing Introduction To Containerization Aymen El Amri Understanding containerization through history chroot jail, 1979, allowed isolation of a root process and its children from the rest of the OS, but with no security restrictions. FreeBSD Jail, 2000, more secure, also isolating the file system.
Linux VServer, 2001, added “security contexts” and used new OS system-level virtualization. Allows you to run multiple Linux distros on a single VPS. Oracle Solaris Containers, 2004, system resource controls and boundary separation provided by “zone”. OpenVZ, 2005, OS-level virtualization. Used by many hosting companies to isolate and sell VPSs. Google’s CGroups, 2007, a mechanism to limit and isolate resource usage. Was mainlined into the Linux kernel the same year. LXC, Linux Containers, 2008, similar to OpenVZ, but uses CGroups. CloudFoundry’s Warden, 2013, an API to manage environments. Docker, 2013, OS-level virtualization Google’s LMCTFY (Let me contain that for you), 2014, an OSS version of Google’s container stack, providing Linux application containers. Most of this tech is being incorporated into libcontainer. “Everything at Google runs on containers. There are more than 2 billion containers running on Google infrastructure every week.” CoreOS’s rkt, 2014, an alternative to Docker. Lots of terms defined VPS, Virtual Machine, System VM, Process VM, … OS Containers vs App Containers Docker is both a Container and a Platform This is halfway through the article, and where I got lost in an example on creating a container sort of from scratch. I think I’ll skip to a Docker tutorial now, but really appreciate the back story and mental model of containers. Michael #4: Algorithms as objects We usually think of an algorithm as a single function with inputs and outputs. Our algorithms textbooks reinforce this notion. They present very concise descriptions that neatly fit in half of a page. Little details add up until you’re left with a gigantic, monolithic function. The monolithic function lacks readability; the function also lacks maintainability. Nobody wants to touch this code because it’s such a pain to get any context Complex code requires abstractions How to tell if your algorithm is an object Code smell #1. It’s too long or too deeply nested Code smell #2.
Banner comments Code smell #3. Helper functions as nested closures, but it’s still too long Code smell #4. There are actual helper functions, but they shouldn’t be called by anyone else Code smell #5. You’re passing state between your helper functions Write your algorithm as an object Refactoring a monolithic algorithm into a class improves readability, which is our #1 goal. Lots of concrete examples in the article Brian #5: pico-pytest Oliver Bestwalter Super tiny implementation of pytest core. 25 lines My original hand crafted test framework was way more code than that, and not as readable. This is good to look at to understand the heart of what test frameworks do: find test code, run it, mark any exceptions as failures. Of course, the bells and whistles added in the full implementation are super important, but this is the heart of what is happening. Michael #6: An Introduction to Cython, the Secret Python Extension with Superpowers Cython is one of the best kept secrets of Python. It extends Python in a direction that addresses many of the shortcomings of the language and the platform, such as execution speed, GIL-free concurrency, absence of type checking and not creating an executable. A number of widely used packages are written in it, such as spaCy, uvloop, and significant parts of scikit-learn, Numpy and Pandas. Cython makes use of the architectural organization of Python by translating (or 'transpiling', as it is now called) a Python file into the C equivalent of what the Python runtime would be doing, and compiling this into machine code. Can sometimes avoid Python types altogether (e.g. sqrt function) C arrays versus lists: Python collection types (list, dict, tuple and set) can be used as a type in cdef functions. The problem with the list structure, however, is that it leads to Python runtime interaction, and is accordingly slow Nice article for getting started and motivation.
But I didn’t see Python type annotations in play (they are now supported) Extras Brian: The Price of the Hallway Track - Hynek It’s lame to speak to an empty room, so go to some talks, and lean toward less known speakers. Definitely on my todo list for next year. Who put Python in the Windows 10 May 2019 Update? - Steve Dower more back story Michael: Little development board to production via Crowd Supply: The TinyPICO is an ESP32-based board that's, well, tiny ;) but packs a pretty significant punch...and it's been designed from day 1 to have first-class MicroPython support! via matt_trentini PyCon 2019 Reflections by Automation Panda Python Bytes (yeah, us!) has a Patreon page. Upcoming webcast: 10 Tools and Techniques Python Web Developers Should Explore Jokes What do you call eight hobbits? A hobbyte. Two bytes meet. The first byte asks, 'Are you ill?' The second byte replies, 'No, just feeling a bit off.’ OR: What is Benoit B. Mandelbrot's middle name? Benoit B. Mandelbrot.
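Going back to item #5 of this episode (pico-pytest): the three steps Brian calls the heart of a test framework (find test code, run it, mark any exceptions as failures) fit in a few lines. This is not Oliver Bestwalter's code, only a rough sketch of the same idea; run_tests and the sample functions are hypothetical names.

```python
def run_tests(namespace, prefix="test_"):
    """Find callables whose name starts with prefix, run them, record outcomes."""
    results = {}
    for name, obj in sorted(namespace.items()):
        if not (name.startswith(prefix) and callable(obj)):
            continue  # not test code
        try:
            obj()
        except Exception as exc:  # any exception counts as a failure
            results[name] = f"FAILED: {exc!r}"
        else:
            results[name] = "passed"
    return results


def _adds_up():
    assert 1 + 1 == 2


def _is_broken():
    assert "a".upper() == "a"


# The namespace maps names to objects, as a module's vars() would.
sample = {
    "test_addition": _adds_up,
    "test_broken": _is_broken,
    "helper": _adds_up,  # ignored: name does not start with "test_"
}
results = run_tests(sample)
print(results)
```

A real framework would pass something like vars(module) as the namespace; everything else (fixtures, reporting, discovery across files) is the bells and whistles.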
May 21, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: PEP 581 (Using GitHub issues for CPython) is accepted PEP 581 The email announcing the acceptance. “The migration will be a large effort, with much planning, development, and testing, and we welcome volunteers who wish to help make it a reality. I look forward to your contributions on PEP 588 and the actual work of migrating issues to GitHub.” (Barry Warsaw) Michael #2: Replace Nested Conditional with Guard Clauses Deeply nested code is problematic (does it have deodorant, err, comments?) But what can you do? Guard clauses! See Martin Fowler’s article and this one.
    # BAD!
    def checkout(user):
        shipping, express = [], []
        if user is not None:
            for item in user.cart:
                if item.is_available:
                    shipping.append(item)
                    if item.express_selected:
                        express.append(item)
        return shipping, express

    # BETTER!
    def checkout(user):
        shipping, express = [], []
        if user is None:
            return shipping, express
        for item in user.cart:
            if not item.is_available:
                continue
            shipping.append(item)
            if item.express_selected:
                express.append(item)
        return shipping, express
Brian #3: Things you’re probably not using in Python 3 – but should Vinko Kodžoman Some “of course” items: f-strings, pathlib (side note: pytest’s tmp_path fixture creates temporary directories and files with pathlib), data classes. Some I’m warming to: type hinting. And those I’m really glad for the reminder of: enumerations
    from enum import Enum, auto

    class Monster(Enum):
        ZOMBIE = auto()
        WARRIOR = auto()
        BEAR = auto()

    print(Monster.ZOMBIE)  # Monster.ZOMBIE
built in lru_cache: easy memoization with the functools.lru_cache decorator.
    from functools import lru_cache

    @lru_cache(maxsize=512)
    def fib_memoization(number: int) -> int:
        ...
extended iterable unpacking >>> head, *body, tail = range(5) >>> print(head, body, tail) 0 [1, 2, 3] 4 >>> py, filename, *cmds = "python3.7 script.py -n 5 -l 15".split() >>> cmds ['-n', '5', '-l', '15'] >>> first, _, third, *_ = range(10) >>> first, third (0, 2) Michael #4: The Python Arcade Library Arcade is an easy-to-learn Python library for creating 2D video games. It is ideal for people learning to program, or developers that want to code a 2D game without learning a complex framework. Minesweeper games, hangman, platformer games in general. Check out Sample Games Made With The Arcade Library too Includes physics and other goodies Based on OpenGL Brian #5: Teaching a kid to code with Pygame Zero Matt Layman Scratch too far removed from coding. Using Mu to simplify coding interface. comes with a built in Python. Pygame Zero preinstalled “[Pygame Zero] is intended for use in education, so that teachers can teach basic programming without needing to explain the Pygame API or write an event loop.” Initial 29 line game taught: naming things and variables mutability and fiddling with “constants” to see the effect functions and side effects state and time interactions and mouse events Article also includes some tips on how to behave as the adult when working with kids and coding. Michael #6: Follow up on GIL / PEP 554 Has the Python GIL been slain? by Anthony Shaw multithreading in CPython is easy, but it’s not truly concurrent, and multiprocessing is concurrent but has a significant overhead. Because Interpreter state contains the memory allocation arena, a collection of all pointers to Python objects (local and global), sub-interpreters in PEP 554 cannot access the global variables of other interpreters. the way to share objects between interpreters would be to serialize them and use a form of IPC (network, disk or shared memory). 
All options are fairly inefficient But: PEP 574 proposes a new pickle protocol (v5) which has support for allowing memory buffers to be handled separately from the rest of the pickle stream. When? Pickle v5 and shared memory for multiprocessing will likely be Python 3.8 (October 2019) and sub-interpreters will be between 3.8 and 3.9. Extras Brian: PyCon 2019 videos are available So grateful for this. Already watched a couple, including Ant’s awesome talk about complexity and wily. pytest and hypothesis show up in the new Pragmatic Programmer book. Michael: 100 Days of Web course is out! Effective PyCharm book New release of our Android and iOS apps. Jokes MK → Waiter: Would you like coffee or tea? Programmer: Yes.
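A note on the lru_cache item from Brian #3 above: the show notes elide the function body ("..."), so the following is only a minimal sketch of what a memoized Fibonacci might look like. The recursive body is an assumption for illustration and is not taken from the article.

```python
from functools import lru_cache


@lru_cache(maxsize=512)
def fib_memoization(number: int) -> int:
    # Assumed body: the original notes show only "..." here.
    if number < 2:
        return number
    # Each distinct argument is computed once; repeats are served from the cache.
    return fib_memoization(number - 1) + fib_memoization(number - 2)


result = fib_memoization(30)
# cache_info() reports hits and misses, handy for checking the cache is used.
stats = fib_memoization.cache_info()
```

Without the decorator the same function would recompute the smaller values over and over; with it, the cache statistics should show a nonzero hit count after a single call.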
May 14, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Folks this one is light on notes since we did it live. Enjoy the show! Special guests Emily Morehouse Steve Dower Topics Brian #1: pgcli Michael #2: Papermill Emily #3: Python Language Summit Steve #4: Python in Windows 10
May 6, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Maintaining a Python Project when it’s not your job Paul #2: Python in 1994 youtube.com/watch?v=7NrPCsH0mBU Barry #3: Python leadership in 2019 Michael #4: TextBlob stackabuse.com
May 2, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Solving Algorithmic Problems in Python with pytest Adam Johnson How to utilize pytest to set up quick test cases for coding challenges, like Project Euler or Advent of Code. Moving the specification and examples in the challenge description into test cases. Running the tests with a stub implementation and understanding the failure output. Gradually building up a working solution. Nice demo of how little code it takes to write quick test cases. Also a cool idea to use challenge sites and platforms as TDD/test first practice, as well as practice converting specifications into test cases. Michael #2: DepHell -- project management for Python via @dreigelb Why it is better than all other tools: Format agnostic. You can use DepHell with your favorite format: setup.py, requirements.txt, Pipfile, poetry. DepHell supports them all and much more. Use your favorite tool on any project. Want to install a poetry based project, but don't like poetry? Just say DepHell to convert project meta information into setup.py and install it with pip. Or directly work with the project from DepHell, because DepHell can do everything what you usually want to do with packages. DepHell doesn't try to replace your favorite tools. If you use poetry, you have to use poetry's file formats and commands. However, DepHell can be combined with any other tool or even combine all these tools together through formats converting. You can use DepHell, poetry and pip at the same time. Easily extendable. Pipfile should be just another one supported format for pip. However, pip is really old and big project with many bad decisions, so, PyPA team can't just add new features in pip without fear to broke everything. This is how pipenv has been created, but pipenv has inherited almost all problems of pip and isn't extendable too. DepHell has strong modularity and can be easily extended by new formats and commands. Developers friendly. 
We aren't going to place all our modules into [_internal](https://github.com/pypa/pip/tree/master/src/pip/_internal). Also, DepHell has a big ecosystem of separate libraries to help you use some of DepHell's parts without pain and big dependencies for your project. All-in-one-solution. DepHell can manage dependencies, virtual environments, tests, CLI tools, packages, generate configs, show licenses for dependencies, make security audits, get download statistics from PyPI, search packages and much more. None of your tools can do it all. Smart dependency resolution. Sometimes pip and pipenv can't lock your dependencies. Try to execute pipenv install oslo.utils==1.4.0. Pipenv can't handle it, but DepHell can: dephell deps add --from=Pipfile oslo.utils==1.4.0 to add a new dependency and dephell deps convert --from=Pipfile --to=Pipfile.lock to lock it. Asyncio based. DepHell doesn't support Python 2.7, and that allows us to use modern features to make network and filesystem requests as fast as possible. Multiple environments. You can have as many environments for a project as you want. Separate sphinx dependencies from your main and dev environment. Other tools like pipenv and poetry don't support it. Brian #3: Python rant: from foo import * is bad Mike Croucher I’m glad to see this post because I’m still seeing this practice a lot, even in tutorial blog posts! This is meaningless: result = sqrt(-1) Is it: math.sqrt(-1)? or numpy.sqrt(-1) or cmath.sqrt(-1)? or scipy? or sympy? Recommendation: Never do from x import *. Use import math, or import numpy as np, or even from scipy import sqrt. Michael #4: Dask Dask natively scales Python Have numpy, pandas, and scikit-learn code that needs to go faster? Run these on smart clusters of servers Or just on your laptop Process more data than will fit into RAM Supported by… interesting to see proper support there. 
Matthew Rocklin was on Talk Python 207 to discuss Brian #5: Animations with Matplotlib Parul Pandey The raindrop simulation is mesmerizing. Tutorial on using FuncAnimation to animate a sine wave although, I’m not sure what the x axis means during an animation Also: live updates based on changing data animate turning a 3D plot using celluloid package to animate simple example animating subplots changing legend during animation Michael #6: PEP 554 -- Multiple Interpreters in the Stdlib This proposal introduces the stdlib interpreters module. The module will be provisional. It exposes the basic functionality of subinterpreters already provided by the C-API, along with new (basic) functionality for sharing data between interpreters. Sharing data centers around "channels", which are similar to queues and pipes. Examples and use-cases: Running isolated code In process, true parallelism Versioning of modules (?) Plugin systems Extras Michael: iOS Talk Python Training app is out: training.talkpython.fm/apps Find us at PyCon! Blessings terminal API (from Erik Rose, via Prayson Daniel) Jokes via Topher Chung Knock knock. Race condition. Who's there?
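To make the "from x import *" rant from Brian #3 concrete, here is a small sketch using only the standard library. It shows why a bare sqrt(-1) is ambiguous: the math and cmath versions behave differently on the same input. The helper name real_sqrt_or_none is hypothetical and exists only for this illustration.

```python
import cmath
import math


def real_sqrt_or_none(value):
    """Return math.sqrt(value), or None when math.sqrt rejects the input."""
    try:
        return math.sqrt(value)
    except ValueError:
        # math.sqrt only works on non-negative real numbers.
        return None


# Explicit module prefixes make it obvious which sqrt is meant.
real_root = real_sqrt_or_none(-1)
complex_root = cmath.sqrt(-1)
```

With qualified names the reader can see at the call site which behavior to expect, which is the whole point of the recommendation.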
April 25, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Kenneth Reitz Brian #1: inline_python (for rust) “I just made a Frankenstein's monster: Python code embedded directly in rustlang code. Should I kill it before it escapes the lab?” - Mara Bos Writing some rust, and need a little Python? Maybe want to pop open a matplotlib window? This may be just the thing you need. see also: https://pypi.org/project/bash/ Kenneth #2: Requests3: Under Way! Requests 2.x that you know and love is going into CVE-only mode (which it has been for a long time). Requests III is a new project which will bring async/await keywords to Requests. installable as requests3. Type-Annotations Python 3.6+ Michael #3: 🔥 Pyflame: A Ptracing Profiler For Python Pyflame is a high performance profiling tool that generates flame graphs for Python. Pyflame is implemented in C++, and uses the Linux ptrace(2) system call to collect profiling information. It can take snapshots of the Python call stack without explicit instrumentation Capable of profiling embedded Python interpreters like uWSGI. Fully supports profiling multi-threaded Python programs. Why use it? Pyflame usually introduces significantly less overhead than the builtin profile (or cProfile) modules, and emits richer profiling data. The profiling overhead is low enough that you can use it to profile live processes in production. Brian #4: flit + src Currently a WIP PR. flit is easy. Given a module or a source package. flit init creates pyproject.toml and LICENSE files. commit those to git flit build creates a wheel flit publish (builds and) publishes to whatever you have in your [.pypirc](https://docs.python.org/3/distutils/packageindex.html#the-pypirc-file) Changes in this PR The flit project already has 2 types of projects. just a module, like foo.py a package (directory with __init__.py), like foo/__init__.py This would add a 3rd and 4th. 
just a module, but in src, like src/foo.py a package in src, like src/foo/__init__.py May be cracking open a can of worms, but I’m ok with that. Kenneth #5: $ pipx install pipenv Michael #6: cheat.sh via Jon Bultmeyer Nothing to install, but works on the CLI $ http cht.sh/python/sort+list $ http cht.sh/python/connect+to+database Has a CLI client too with a proper shell Get started with http cht.sh/python/:learn Has a funky stealth mode too Editor integration VS Code & Vim cheat.sh uses selected community driven cheat sheet repositories and information sources, maintained by thousands of users, developers and authors all over the world Extras Brian: vi is good for beginners - fun read, for all you haters out there. But use vim, not vi. Better yet, IdeaVim for PyCharm or VSCodeVim for VS Code. nbstripout - command line tool to strip output from Jupyter Notebook files. We covered pyodide on episode 93, but here’s a cool article on it Pyodide: Bringing the scientific Python stack to the browser Michael: PyCon AU CFP LIGO Blackhole collision follow up: https://www.youtube.com/watch?v=BXID4teFfDc via Dave Kirby and Matthew Feickert https://github.com/kylebebak/questionnaire like Bullet but for windows too via Sander Teunissen Kenneth (optional): PyColorado CFP PyOhio CFP PyRemote! Jokes Don’t know if I’ll do all of these, but I like them. 🙂 Brian and Kenneth, feel free to add yours if you have some! MK: Ubuntu users are apt to get these jokes. MK: How many programmers does it take to kill a cockroach? Two: one holds, the other installs Windows on it. MK: A programmer had a problem. He thought to himself, 'I know, I'll solve it with threads!'. has Now problems. two he (mildly offensive) KR: What’s the difference between a musician and a pizza? A pizza can feed a family of four. (In collaboration with Jonatan Skogsfors) Python used to be directed by the BDFL, Guido. Now it’s directed by a steering council, GUIDs[0:4].
April 19, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guest: Cecil Philip Brian #1: Python Used to Take Photo of Black Hole Lots of people talking about this. The link I’m including is a quick write up by Mike Driscoll. From now on these conversations can happen: “So, what can you do with Python?” “Well, it was used to help produce the world’s first image of a black hole. Your particular problem probably isn’t as complicated as that, so Python should work fine.” Projects listed in the paper: “First M87 Event Horizon Telescope Results. III. Data Processing and Calibration”: Numpy (van der Walt et al. 2011) Scipy (Jones et al. 2001) Pandas (McKinney 2010) Jupyter (Kluyver et al. 2016) Matplotlib (Hunter 2007). Astropy (The Astropy Collaboration et al. 2013, 2018) Cecil #2: Wasmer - Python Library for executing WebAssembly binaries WebAssembly (Wasm) enables high level languages to target a portable format that runs on the web Tons of languages compile down to Wasm but Wasmer enables the consumption of Wasm in Python This enables an interesting use case for using Wasm as a way to leverage code between languages Michael #3: Cooked Input cooked_input is a Python package for getting, cleaning, converting, and validating command line input. Name comes from input / raw_input (unvalidated) and cooked input (validated) Beginners can use the provided convenience classes to get simple inputs from the user. More complicated command line application (CLI) input can take advantage of cooked_input’s ability to create commands, menus and data tables. 
All sorts of cool validates and cleaners Examples cap_cleaner = ci.CapitalizationCleaner(style=ci.ALL_WORDS_CAP_STYLE) ci.get_string(prompt="What is your name?", cleaners=[cap_cleaner]) >>> ci.get_int(prompt="How old are you?", minimum=1) How old are you?: abc "abc" cannot be converted to an integer number How old are you?: 0 "0" too low (min_val=1) How old are you?: 67 67 Brian #4: JetBrains and PyCharm officially collaborating with Anaconda PyCharm 2019.1.1 has some improvements for using Conda environments. Fixed various bugs related to creating Conda envs and installing packages into them. Special distribution of PyCharm: PyCharm for Anaconda with enhanced Anaconda support. I’m using PyCharm Pro with vim emulation this week to edit a notebook based presentation. I might run them in Jupyter, or just run it in PyCharm, but editing with all my normal keyboard shortcuts is awesome. Cecil #5: Building a Serverless IoT Solution with Python Azure Functions and SignalR Interesting blog post on using serverless, IoT, real-time messaging to create a live dashboard Shows how to create a serverless function in Python to process IoT data There’s tons of DIY applications for using this technique at home The Dashboard is a static website using D3 for charting. Michael #6: multiprocessing.shared_memory — Provides shared memory for direct access across processes New in Python 3.8 This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine. The ShareableList looks nice to use. Extras Brian: Getting ready for PyCon with STICKERS. Yeah, baby. Come see us at PyCon. I’ll also be bringing some copies of Python Testing with pytest, if anyone doesn’t already have a copy. Lots of interviews going on for Test & Code, and some will happen at PyCon. 
Cecil: Attendee Detector Workshop Talk Python training app on Android Michael: Guido van Rossum interviewed on MIT’s AI podcast via Tony Cappellini Visual Studio IntelliCode for VS & VS Code Showing a Craigslist scammer who's boss using Python via Dan Koster Jokes Brian: To understand recursion you must first understand recursion. Michael: A programmer was found dead in the shower. Next to their body was a bottle of shampoo with the instructions 'Lather, Rinse and Repeat'.
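For Michael #6 above (multiprocessing.shared_memory), here is a minimal sketch of the ShareableList mentioned in the notes. It assumes Python 3.8 or newer and, to stay short, attaches to the list by name from the same process rather than from a second one; the function name shareable_list_demo and the sample values are made up for this example.

```python
from multiprocessing import shared_memory


def shareable_list_demo():
    """Create a ShareableList, attach to it by name, and mutate it."""
    original = shared_memory.ShareableList([1, 2.5, "three"])
    try:
        # Another process would normally attach using the block's name.
        attached = shared_memory.ShareableList(name=original.shm.name)
        attached[0] = 42  # visible through every attachment to the block
        snapshot = list(original)
        attached.shm.close()
        return snapshot
    finally:
        # Release this handle, then free the shared block itself.
        original.shm.close()
        original.shm.unlink()


values = shareable_list_demo()
```

The close and unlink calls matter: the shared block outlives the Python objects unless it is explicitly released.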
April 13, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: My How and Why: pyproject.toml & the 'src' Project Structure Brian Skinn pyproject.toml but with setuptools, instead of flit or poetry with a src dir and tox and black all the bits and pieces to make all of this work Michael #2: The Deadlock Empire: Slay dragons, master concurrency! A game to test your thread safety and skill! Deadlocks occur in code when two threads end up trying to enter two or more locks (RLocks please!) Consider lock_a and lock_b Thread one enters lock_a and will soon enter lock_b Thread two enters lock_b and will soon enter lock_a Imagine transferring money between two accounts, each with a lock, and each thread does this in opposite order. Brian #3: Cog 3.0 Ned Batchelder’s cog gets an update (last one was a few years ago). “Cog … finds snippets of Python in text files, executes them, and inserts the result back into the text. It’s good for adding a little bit of computational support into an otherwise static file.” Development moved from Bitbucket to GitHub. Travis and Appveyor CI. The biggest functional change is that errors during execution now get reasonable tracebacks that don’t require you to reverse-engineer how cog ran your code. mutmut mutation testing added. Cool. What I want to know more about is this statement: “…now I use it for making all my presentations”. Very cool idea. Michael #4: StackOverflow 2019 Developer Survey Results More good news for Python Lots of focus on gender in this one Contributing to Open Source About 65% of professional developers on Stack Overflow contribute to open source projects once a year or more. Involvement in open source varies with language. Developers who work with Rust, WebAssembly, and Elixir contribute to open source at the highest rates, while developers who work with VBA, C#, and SQL do so at about half those rates. 
Competence and Experience We see evidence here among the most junior developers for impostor syndrome, pervasive patterns of self-doubt, insecurity, and fear of being exposed as a fraud. Among our respondents, men grew more confident much more quickly than gender minorities. Programming, Scripting, and Markup Languages Python edges out Java, second only to JavaScript (and two non-programming languages) Databases MySQL, Postgres, Microsoft SQL Server, SQLite, MongoDB Most Loved, Dreaded, and Wanted Languages Loved: Rust, Python Wanted: Python, JavaScript Dreaded: VBA, ObjectiveC Most Loved, Dreaded, and Wanted Databases Loved: Postgres Wanted: MongoDB Most Popular Development Environments VS Code is crushing it How Technologies Are Connected is just interesting Brian #5: Cuv’ner “A commanding view of your test-coverage" Coverage visualizations on the console. Michael #6: Mobile apps launched The tech (sadly only 50% Python) Xamarin, Mono, and C# on the device-side Python, Pyramid, and MongoDB on the server-side 90% code sharing or higher Native applications Build the prototype myself on Windows Hired Giorgi via TopTal Get your own developer or get some freelancing work and support my app progress with my referral code: toptal.com/#we-annexed-perfect-engineers Dear mobile app developers: You have my sympathy! Try the app at training.talkpython.fm/apps Comes with 2 free courses for anyone who logs in. Android only at the moment but not for long Extras Brian: Python Bytes Patreon page is up: patreon.com/pythonbytes Michael: PyCon Booth XKCD Plots in Matplotlib with examples via Tim Harrison Fira Code Retina and Font Ligatures The EuroSciPy 2019 Conference will take place from September 2 to September 6 in Bilbao, Spain Jokes “When your hammer is C++, everything begins to look like a thumb.” “Why don't jokes work in octal? Because 7 10 11” Over explained: Why is 6 afraid of 7. Cuz 7 8 9. Follow on: Why did 7 eat 9? He was trying to eat 3^2 meals. 
I've been using Vim for a long time now, mainly because I can't figure out how to exit.
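The deadlock scenario from Michael #2 above (two threads taking lock_a and lock_b in opposite order while transferring money) can be sketched as follows. This is one common way to avoid it, always acquiring locks in a fixed order; it is not taken from The Deadlock Empire itself. The Account class, the account_id field, and the function names are assumptions for illustration.

```python
import threading


class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.RLock()


def transfer(source, target, amount):
    # Always lock the account with the smaller id first, so two opposite
    # transfers can never each hold one lock while waiting for the other.
    first, second = sorted((source, target), key=lambda acct: acct.account_id)
    with first.lock:
        with second.lock:
            source.balance -= amount
            target.balance += amount


def run_opposite_transfers(rounds=1000):
    a, b = Account(1, 100), Account(2, 100)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return a.balance, b.balance


final_balances = run_opposite_transfers()
```

If each thread instead locked its own source account first, the two threads could block each other forever, which is exactly the situation the game asks you to provoke.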
April 5, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: pytest 4.4.0 Lots of amazing new features here (at least for testing nerds) testpaths displayed in output, if used. pytest.ini setting that allows you to specify a list of directories or tests (relative to test rootdir) to test. (can speed up test collection). Lots of goodies for plugin writers. Internal changes to allow subtests to work with a new plugin, pytest-subtests. Just started playing with it, but I’m excited already. Planning on a full Test & Code episode after I play with it a bit more. # unittest example: class T(unittest.TestCase): def test_foo(self): for i in range(5): with self.subTest("custom message", i=i): self.assertEqual(i % 2, 0) # pytest example: def test(subtests): for i in range(5): with subtests.test(msg="custom message", i=i): assert i % 2 == 0 Michael #2: requests-async async-await support for requests Just finished talking with Kenneth Reitz, native async coming to requests, but a while off Nice interim solution Requires modern Python (3.6) Interesting Flask, Quart, Starlette, etc. framework wrapper for testing Brian #3: Reasons why PyPI should not be a service Dustin Ingram’s article: PyPI as a Service “Layoffs at JavaScript package registry raise questions about fate of community resource” - The Register article Apparently PyPI gets requests for a private form of their service regularly, but there are problems with that. Currently a non-profit project under the PSF. That may be hard to maintain if they have a for-profit part. Donated services and infrastructure of more than $1M/year would be hard to replace. There are already other package repository options. Although there is probably room for others to compete. Currently run by volunteers for the most part. (
March 29, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Deconstructing xkcd.com/1987/ Brett Cannon Breakdown of the infamous xkcd comic poking fun at the author's Python environment on his computer. The interpreters listed Homebrew description python.org binaries A discussion of pip, easy_install The paths and the $PATH and $PYTHONPATH Actually quite an educational history lesson, and the abuse some people put their computers through. “So the next time someone decides to link to this comic as proof that Python has a problem, you can say that it's actually Randall's problem.” Michael #2: Python package as a CLI option Wanted to make this little app available via a CLI as a dedicated command. Really tired of python3 script.py or ./script.py Turns out, pip and Python already solve this problem, if you structure your package correctly Thanks to everyone on Twitter! The trick turns out to be to have entrypoints in your package entry_points = { "console_scripts": ['bootstrap = bootstrap.bootstrap:main'] } ... This should even register it with pipx install package ;) Brian #3: pyright a Microsoft static type checker for the Python language. “Pyright was created to address gaps in existing Python type checkers like mypy.” 5x faster than mypy; meant for large code bases; written in TypeScript and runs within node. Michael #4: Refactoring Python Applications for Simplicity If you can write and maintain clean, simple Python code, then it’ll save you lots of time in the long term. You can spend less time testing, finding bugs, and making changes when your code is well laid out and simple to follow. Is your code complex? Metrics for Measuring Complexity Lines of Code Cyclomatic complexity is the measure of how many independent code paths there are through your application. Maintainability Index Refactoring: The technique of changing an application (either the code or the architecture) so that it behaves the same way on the outside, but internally has improved. 
Nice overview of tooling (PyCharm, VS Code plugins, etc) Anti-patterns and ways out of them (best part of the article IMO) Brian #5: FastAPI Thanks Colin Sullivan for suggesting the topic “FastAPI framework, high performance, easy to learn, fast to code, ready for production” “Sales pitch / key features: Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic). One of the fastest Python frameworks available. Fast to code: Increase the speed to develop features by about 200% to 300%. (estimated) Fewer bugs: Reduce about 40% of human (developer) induced errors. (estimated) Intuitive: Great editor support. Completion everywhere. Less time debugging. Easy: Designed to be easy to use and learn. Less time reading docs. Short: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs. Robust: Get production-ready code. With automatic interactive documentation. Standards-based: Based on (and fully compatible with) the open standards for APIs: OpenAPI(previously known as Swagger) and JSON Schema.” uses: Starlette for the web parts. Pydantic for the data parts. document REST apis with both Swagger ReDoc looks like quite a fun contender in the “put together a REST API quickly” set of solutions out there. Just the front page demo is quite informative. There’s also a tutorial that seems like it might be a crash course in API best practices. Michael #6: Bleach: stepping down as maintainer by Will Kahn-Greene Bleach is a Python library for sanitizing and linkifying text from untrusted sources for safe usage in HTML. A retrospective on OSS project maintenance Picked up maintenance of the project because I was familiar with it current maintainer really wanted to step down Mozilla was using it on a bunch of sites I felt an obligation to make sure it didn't drop on the floor and I knew I could do it. 
Never really liked working on Bleach He did a bunch of work on a project I don't really use, but felt obligated to make sure it didn't fall on the floor, that has a pain-in-the-ass problem domain. Did that for 3+ years. Is [he] getting paid to work on it? Not really. Does [he] like working on it? No. Seems like [he] shouldn't be working on it anymore. Extras Brian sleepsort Michael: Passbolt Python 3.7.3 is now available stackroboflow via Alexander Allori Joke
March 22, 2019
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Combining and separating dictionaries PEP 584 -- Add + and - operators to the built-in dict class. Steven D'Aprano Draft status, just created 1-March-2019 d1 + d2 would merge d2 into d1 like {**d1, **d2} or on two lines d = d1.copy() d.update(d2) of note, (d1 + d2) != (d2 + d1) Currently no subtraction equivalent Guido’s preference of + over | Related, Why operators are useful - also by Guido Michael #2: Why I Avoid Slack by Matthew Rocklin I avoid interacting on Slack, especially for technical conversations around open source software. Instead, I encourage colleagues to have technical and design conversations on GitHub, or some other system that is public, permanent, searchable, and cross-referenceable. Slack is fun but, internal real-time chat systems are, I think, bad for productivity generally, especially for public open source software maintenance. Prefer GitHub because I want to Engage collaborators that aren’t on our Slack Record the conversation in case participants change in the future. Serve the silent majority of users who search the web for answers to their questions or bugs. Encourage thoughtful discourse. Because GitHub is a permanent record it forces people to think more before they write. Cross reference issues. Slack is siloed. It doesn’t allow people to cross reference people or conversations across Slacks Brian #3: Hunting for Memory Leaks in Python applications Wai Chee Yau Conquering memory leaks and spikes in Python ML products at Zendesk. A quick tutorial of some useful memory tools The memory_profiler package and matplotlib to visualize memory spikes. Using muppy to heap dump at certain places in the code. objgraph to help memory profiling with object lineage. Some tips when memory leak/spike hunting: strive for quick feedback run memory intensive tasks in separate processes debugger can add references to objects watch out for packages that can be leaky pandas? really? 
Michael #4: Give Me Back My Monolith by Craig Kerstiens Feels like we’re starting to pass the peak of the hype cycle of microservices We’ve actually seen some migrations from micro-services back to a monolith. Here is a rundown of all the things that were simple that you now get to re-visit Setup went from intro chem to quantum mechanics Onboarding a new engineer, at least for an initial environment, would be done in the first day. As we ventured into micro-services, onboarding time skyrocketed So long for understanding our systems Back when we had monolithic apps, if you had an error you had a clear stacktrace to see where it originated from and could jump right in and debug. Now we have a service that talks to another service, that queues something on a message bus, that another service processes, and then we have an error. If we can’t debug them, maybe we can test them All the trade-offs are for a good reason. Right? Brian #5: Famous Laws Of Software Development Tim Sommer 13 “laws” of software development, including Hofstadter’s Law: “It always takes longer than you expect, even when you take into account Hofstadter's Law.” Conway’s Law: “Any piece of software reflects the organizational structure that produced it.” The Peter Principle: “In a hierarchy, every employee tends to rise to his level of incompetence.” Ninety-ninety rule: “The first 90% of the code takes 10% of the time. The remaining 10% takes the other 90% of the time” Michael #6: Beer Garden Plugins A powerful plugin framework for converting your functions into composable, discoverable, production-ready services with minimal overhead. Beer Garden makes it easy to turn your functions into REST interfaces that are ready for production use, in a way that’s accessible to anyone that can write a function. 
Based on MongoDB, Rabbit MQ, & modern Python Nice docker-compose option too Extras Michael: Firefox Send Ethical ads on Python Bytes (and Talk Python) Brian: T&C 69: The Pragmatic Programmer — Andy Hunt not up yet, but will be before this episode is released Jokes From Derrick Chambers “What do you call it when a python programmer refuses to implement custom objects? self deprivation! Sorry, that joke was really classless.” via pyjokes: I had a problem so I thought I'd use Java. Now I have a ProblemFactory.
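For Brian #1 above (PEP 584), the two existing spellings that the notes compare the proposed d1 + d2 to can be tried today. The sketch below uses made-up sample dictionaries and only syntax that already exists; it does not use the proposed operator, which was still a draft at the time of the episode.

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

# One-line merge: later keys win, so d2's value for "b" replaces d1's.
merged_unpack = {**d1, **d2}

# Two-line equivalent from the notes: copy, then update in place.
merged_update = d1.copy()
merged_update.update(d2)

# Swapping the operands changes which value survives for shared keys,
# which is why the notes point out that (d1 + d2) != (d2 + d1).
reversed_merge = {**d2, **d1}
```

Neither form modifies d1 or d2, which is also the behavior the notes describe for the proposed merge.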
March 16, 2019
Sponsored by Datadog: pythonbytes.fm/datadog Brian #1: Futurize and Auto-Futurize Staged automatic conversion from Python 2 to Python 3 with futurize from python-future.org pip install future Stages: 1: safe fixes: exception syntax, print function, object base class, iterator syntax, key checking in dictionaries, and more 2: Python 3 style code with wrappers for Python 2 more risky items to change separating text from bytes, quite a few more very modular and you can be more aggressive and more conservative with flags. Do that, but between each step, run tests, and only continue if they pass, with auto-futurize from Timothy Hopper. a shell script that uses git to save staged changes and tox to test the code. Michael #2: Tech blog writing live stream via Anthony Shaw Live stream on "technical blog writing" Talking about how I put articles together, research, timing and other things about layouts and narratives. Covers “Modifying the Python language in 6 minutes”, deep article Listicles, “5 Easy Coding Projects to Do with Kids” A little insight into what is popular. Question article: Why is Python Slow? Tourists guide to the CPython source code Brian #3: Try out walrus operator in Python 3.8 Alexander Hultnér The walrus operator is the assignment expression that is coming in thanks to PEP 572. # From: https://www.python.org/dev/peps/pep-0572/#syntax-and-semantics # Handle a matched regex if (match := pattern.search(data)) is not None: # Do something with match # A loop that can't be trivially rewritten using 2-arg iter() while chunk := file.read(8192): process(chunk) # Reuse a value that's expensive to compute [y := f(x), y**2, y**3] # Share a subexpression between a comprehension filter clause and its output filtered_data = [y for x in data if (y := f(x)) is not None] This article walks through trying this out with the 3.8 alphas now available. Using pyenv and brew to install 3.8, but you can also just download it and try it out. 
3.8.0a1: https://www.python.org/downloads/release/python-380a1/ 3.8.0a2: https://www.python.org/downloads/release/python-380a2/ Ends with a demonstration of the walrus operator working in a (I think) very likely use case, grabbing a value from a dict if the key exists for entry in sample_data: if title := entry.get("title"): print(f'Found title: "{title}"') That code won’t fail if the title key doesn’t exist. Michael #4: bullet : Beautiful Python Prompts Made Simple Have you ever wanted a dropdown select box for your CLI? Bullet! Lots of design options Also Password “boxes” Yes/No Numbers Looking for contributors, especially Windows support. Brian #5: Hosting private pip packages using Azure Artifacts Interesting idea to utilize artifacts as a private place to store built packages to pip install elsewhere. Walkthrough is assuming you are working with a data pipeline. You can package some of the work in earlier stages for use in later stages by packaging them and making them available as artifacts. Includes a basic tutorial on setuptools packaging and building an sdist and a wheel. Need to use CI in the Azure DevOps tool and use that to build the package and save the artifact Now in a later stage where you want to install the package, there are some configs needed to get the pip credentials right, included in the article. Very fun article/hack to beat Azure into a use model that maybe it wasn’t designed for. Could be useful for non data pipeline usage, I’m sure. Speaking of Azure, we brought up Anthony Shaw’s pytest-azurepipelines pytest plugin last week. Well, it is now part of the recommended Python template from Azure. Very cool. Michael #6: Async/await for wxPython via Andy Bulka Remember asyncio and PyQt from last week? Similar project called wxasync which does the same thing for wxPython! 
He’s written a Medium article about it https://medium.com/@abulka/async-await-for-wxpython-c78c667e0872 with links to that project, and shares some real-life usage scenarios and fun demo apps. wxPython is important because it's free, even for commercial purposes (unlike PyQt). His article even contains a slightly controversial section entitled "Is async/await an anti-pattern?" which refers to the phenomenon of the async keyword potentially spreading through one's codebase, and some thoughts on how to mitigate that. Extras Michael: Mongo license followup Will S. told me I was wrong! And I was. :) The main clarification I wanted to make above was that the AGPL has been around for a while, and it is the new SSPL from MongoDB that targets cloud providers. Also, one other point I didn't mention -- the reason the SSPL isn't considered open source is that it places additional conditions on providing the software as a service and the OSI's open source definition requires no discrimination based on field of endeavor. Michael: python2 becomes self-aware, enters fifth stage of grief Funny thread I started python2 -m pip list DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. Michael: PyDist — Simple Python Packaging Your private and public dependencies, all in one place. Looks to be paid, but with free beta? It mirrors the public PyPI index, and keeps packages and releases that have been deleted from PyPI. It allows organizations to upload their own private dependencies, and seamlessly create private forks of public packages. And it integrates with standard Python tools almost as well as PyPI does. Joke A metajoke: pip install --user pyjokes or even better pipx install pyjokes. Then: $ pyjoke [hilarity ensues! …]
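For the walrus operator item above (Brian #3), here is a minimal runnable sketch of the dict lookup pattern. The sample_data list and the found_titles helper are made up for illustration and are not taken from the article. It needs Python 3.8 or later.

```python
# Requires Python 3.8+ for the := (assignment expression) syntax.
# sample_data is invented for this sketch.
sample_data = [
    {"title": "First entry", "id": 1},
    {"id": 2},               # no "title" key at all
    {"title": "", "id": 3},  # key exists, but the value is falsy
]


def found_titles(entries):
    """Collect truthy titles, binding and testing each one in a single step."""
    titles = []
    for entry in entries:
        # dict.get() returns None for a missing key instead of raising KeyError,
        # so the walrus binds None and the if-branch is simply skipped.
        if title := entry.get("title"):
            titles.append(title)
    return titles


print(found_titles(sample_data))
```

Note that the truthiness test also skips the entry whose title is an empty string. If empty titles should count, compare against None explicitly: if (title := entry.get("title")) is not None.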
March 5, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: The Ultimate Guide To Memorable Tech Talks Nina Zakharenko 7 part series that covers choosing a topic, writing a talk proposal, tools, planning, writing, practicing, and delivering the talk I’ve just read the tools section, and am looking forward to the rest of the series. From the tools section: “I noticed I’d procrastinate on making the slides look good instead of focusing my time on making quality content.” Michael #2: Running Flask on Kubernetes via TestDriven.io & Michael Herman What is Kubernetes? A step-by-step tutorial that details how to deploy a Flask-based microservice (along with Postgres and Vue.js) to a Kubernetes cluster. Goals of tutorial Explain what container orchestration is and why you may need to use an orchestration tool Discuss the pros and cons of using Kubernetes over other orchestration tools like Docker Swarm and Elastic Container Service (ECS) Explain the following Kubernetes primitives - Node, Pod, Service, Label, Deployment, Ingress, and Volume Spin up a Python-based microservice locally with Docker Compose Configure a Kubernetes cluster to run locally with Minikube Set up a volume to hold Postgres data within a Kubernetes cluster Use Kubernetes Secrets to manage sensitive information Run Flask, Gunicorn, Postgres, and Vue on Kubernetes Expose Flask and Vue to external users via an Ingress Brian #3: Changes in the CI landscape Travis CI joins the Idera family - TravisCI blog #travisAlums on Twitter “TravisCI is laying off a bunch of senior engineers and other technical staff. 
Look at the #travisAlums hashtag and hire them!” - alicegoldfuss options: GitHub lists 17 options for CI, including GitLab & Azure Pipelines Some relevant articles, resources: The CI/CD market consolidation - GitLab article Azure Pipelines with Python — by example - Anthony Shaw pytest-azurepipelines - Anthony Shaw Azure Pipelines Templates - Anthony Sottile Michael #4: Python server setup for macOS 🍎 what: hello world for Python server setup on macOS why: most guides show setup on a Linux server (which makes sense) but macOS is useful for learning and for local dev STEP 1: NGINX ➡️ STATIC ASSETS STEP 2: GUNICORN ➡️ FLASK STEP 3: NGINX ➡️ GUNICORN Brian #5: Learn Enough Python to be Useful: argparse How to Get Command Line Arguments Into Your Scripts - Jeff Hale “argparse is the ‘recommended command-line parsing module in the Python standard library.’ It’s what you use to get command line arguments into your program.” “I couldn’t find a good intro guide for argparse when I needed one, so I wrote this article.” Michael #6: AWS, MongoDB, and the Economic Realities of Open Source Related podcast: https://soundcloud.com/exponentfm/episode-159-inverted-pyramids Last week, from the AWS blog: Today we are launching Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that is designed to be compatible with your existing MongoDB applications and tools. Amazon DocumentDB uses a purpose-built SSD-based storage layer, with 6x replication across 3 separate Availability Zones. The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads. Like an increasing number of such projects, MongoDB is open source…or it was anyways. MongoDB Inc., a venture-backed company that IPO’d in October, 2017, made its core database server product available under the GNU Affero General Public License (AGPL). 
AGPL extended the GPL to apply to software accessed over a network; since the software is only being used, not copied, the GPL would not be triggered. MongoDB’s Business Model: “We believe we have a highly differentiated business model that combines the developer mindshare and adoption benefits of open source with the economic benefits of a proprietary software subscription business model.” MongoDB Enterprise and MongoDB Atlas Basically, MongoDB sells three things on top of its open source database server: Additional tools for enterprise companies to implement MongoDB A hosted service for smaller companies to use MongoDB Legal certainty What AWS Sells: The value of software is typically realized in three ways: First is hardware. Second is licenses. This was Microsoft’s core business for decades: licenses sold to OEMs (for the consumer market) or to companies directly (for the enterprise market). Third is software-as-a-service. AWS announced last week: > The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads. AWS is not selling MongoDB: what they are selling is “performance, scalability, and availability.” DocumentDB is just one particular area of many where those benefits are manifested on AWS. Thus we have arrived at a conundrum for open source companies: MongoDB leveraged open source to gain mindshare. MongoDB Inc. built a successful company selling additional tools for enterprises to run MongoDB. More and more enterprises don’t want to run their own software: they want to hire AWS (or Microsoft or Google) to run it for them, because they value performance, scalability, and availability. This leaves MongoDB Inc. not unlike the record companies after the advent of downloads: what they sold was not software but rather the tools that made that software usable, but those tools are increasingly obsolete as computing moves to the cloud. 
And now AWS is selling what enterprises really want. This tradeoff is inescapable, and it is fair to wonder if the golden age of VC-funded open source companies will start to fade (although not open source generally). The monetization model depends on the friction of on-premise software; once cloud computing is dominant, the economic model is much more challenging. Extras: PyTexas 2019 at #Austin on Apr 13th and 14th. Registrations now open. More info at pytexas.org/2019/ Michael: Sorry Ant! Michael: RustPython follow up: https://rustpython.github.io/demo/ Joke: Q: Why was the developer unhappy at their job? A: They wanted arrays. Q: Where did the parallel function wash its hands? A: Async
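For the argparse item above (Brian #5), a small sketch of getting command line arguments into a script. The greeting program, its argument names, and the build_parser and greet helpers are invented for illustration and do not come from Jeff Hale's article. Only standard-library argparse calls are used.

```python
import argparse


def build_parser():
    """Build a parser with one positional argument and two optional flags."""
    parser = argparse.ArgumentParser(description="Greet someone from the command line.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--times", type=int, default=1, help="how many greetings to print")
    parser.add_argument("--shout", action="store_true", help="upper-case the greeting")
    return parser


def greet(args):
    """Turn the parsed arguments into a list of greeting lines."""
    message = f"Hello, {args.name}!"
    if args.shout:
        message = message.upper()
    return [message] * args.times


# Passing an explicit list keeps the sketch runnable anywhere;
# calling parse_args() with no arguments reads sys.argv instead.
args = build_parser().parse_args(["Ada", "--times", "2", "--shout"])
for line in greet(args):
    print(line)
```

In a real script you would call parse_args() with no list, so that something like python greet.py Ada --times 2 picks the values up from the shell.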
February 26, 2019
Sponsored by pythonbytes.fm/datadog Special guests Eric Chou Dan Bader Trey Hunner Michael #1: Incrementally migrating over one million lines of code from Python 2 to Python 3 Weighing in at over 1 million lines of Python logic, we had a massive surface area for potential issues in our migration from Python 2 to Python 3 First Py3 commit, hack week 2015 Unfortunately, it was clear that many features were completely broken by the upgrade Official start H1 2017 Armed with Mypy, a static type-checking tool that we had adopted in the interim year, they made substantial strides towards enabling the Python 3 migration: Ported our custom fork of Python to version 3.5 Upgraded some Python dependencies to Python 3-compatible versions, and forked some others (e.g. babel) Modified some Dropbox client code to be Python 3 compatible Set up automated jobs in our continuous integration (CI) to run the existing unit tests with the Python 3 interpreter, and Mypy type-checking in Python 3 mode Crucially, the automated tests meant that we could be certain that the limited Python 3 compatibility that existed would not have regressed when the project was picked up again. Prerequisites Before we could begin working on migrating any of our application logic, we had to ensure that we could load the Python 3 interpreter and run until the entry point of the application. In the past, we had used “freezer” scripts to do this for us. However, none of these had support for Python 3 around this time, so in late 2016, we built a custom, more native solution which we internally referred to as “Anti-freeze” (more on that in the initial Python 3 migration blog post). Incrementally enabling unit tests and type-checking ‘Straddling’ Python 2 and Python 3 Letting it bake Learnings (tl;dr) Unit tests and typing are invaluable. String encoding in Python is hard. Incrementally migrate to Python 3 for great profit. 
Eric #2: Network Automation Development with Python (for fun and for profit) Terms: NetDevOps (Cisco), NRE (Network Reliability Engineer) Libraries: Netmiko, NAPALM, Nornir Free Lab Resources: NRE Labs, dCloud, DevNet Conferences: AnsibleFest (network automation track), Cisco DevnetCreate Trey #3: Alkali file as DB If you have structured data you want to query (like RSS feed, CSV, JSON, or any custom format of your own creation) you can use a Django ORM-like syntax to query it Save it to the same format or a different format because you control both the reading and the writing Kurt is at PyCascades so I got to chat with him about this Dan #4: Carnegie Mellon Launches Undergraduate Degree in Artificial Intelligence Carnegie Mellon University's School of Computer Science will offer a new undergraduate degree in artificial intelligence beginning this fall The first offered by a U.S. university "Specialists in artificial intelligence have never been more important, in shorter supply or in greater demand by employers," said Andrew Moore, dean of the School of Computer Science. The bachelor's degree in AI will focus more on how complex inputs — such as vision, language and huge databases — are used to make decisions or enhance human capabilities Michael #5: asyncio + PyQt5/PySide2 via Florian Dahlitz asyncqt is an implementation of the PEP 3156 event-loop with Qt. This package is a fork of quamash focusing on modern Python versions, with some extra utilities, examples and simplified CI. Allows wiring events to Qt’s event loop that run on asyncio and leverage it internally. Example: https://github.com/gmarull/asyncqt/blob/master/examples/aiohttp_fetch.py Dan #6: 4 things I want to see in Python 4.0 JIT as a first class feature A stable .0 release Static type hinting A GPU story for multiprocessing More community contributions Extras: Michael: My Python Async webcast recording is now available. 
Michael: PyCon Israel in the first week of June (https://il.pycon.org/2019/), and the CFP opened today: https://cfp.pycon.org.il/conference/cfp Dan: Python Basics Book Joke: Q: Why did the developer ground their kid? A: They weren't telling the truthy
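On the "String encoding in Python is hard" learning from the Python 2 to 3 migration item above (Michael #1): the snippet below is only an illustration of the text versus bytes split that Python 3 enforces. It is not code from the migration post, and the sample string is made up.

```python
# str holds Unicode code points; bytes holds the encoded form.
text = "na\u00efve caf\u00e9"   # "naïve café", written with escapes to avoid ambiguity
data = text.encode("utf-8")     # what actually goes to disk or over the network

# Python 2 would silently try to coerce between the two types;
# Python 3 refuses to mix them.
try:
    text + data
except TypeError as exc:
    print("cannot mix str and bytes:", exc)

# Each of the two accented characters takes two bytes in UTF-8,
# so the byte length is larger than the character count.
print(len(text), len(data))

# Decoding with the same codec round-trips back to the original text.
round_tripped = data.decode("utf-8")
print(round_tripped == text)
```

Code that straddles both versions generally has to decide, at every I/O boundary, which of the two types it is holding.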
February 22, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: Frozen-Flask “Frozen-Flask freezes a Flask application into a set of static files. The result can be hosted without any server-side software other than a traditional web server.” 2012 tutorial, Dead easy yet powerful static website generator with Flask Some of it is out of date, but it does point to the power of Frozen-Flask, as well as highlight a cool plugin, Flask-FlatPages, which allows pages from markdown. Michael #2: pipx by Chad Smith Last week we spoke about pythonloc Execute binaries from Python packages in isolated environments "binary" to describe a CLI application that can be run directly from the command line Features Safely install packages to isolated virtual environments, while globally exposing their CLI applications so you can run them from anywhere Easily list, upgrade, and uninstall packages that were installed with pipx Run the latest version of a CLI application from a package in a temporary virtual environment, leaving your system untouched after it finishes Run binaries from the __pypackages__ directory per PEP 582 as companion tool to pythonloc Runs with regular user permissions, never calling sudo pip install ... (you aren't doing that, are you? 😄). You can globally install a CLI application by running: pipx install PACKAGE "Just the “pipx upgrade-all” command is already a huge win over pipsi" Check out How does this compare to pipsi? Brian #3: Data science is different now Vicki Boykis There’s lots of buzz around data science. This has resulted in loads of new data scientists looking for junior level positions. Coming from boot camps, MOOCs, self taught, remote degrees, and other training. “.. 
now that data science has changed from a buzzword to something even larger companies outside of the Silicon Valley bubble hire for, positions have not only become more codified, but with more rigorous entry requirements that will prefer people with previous data science experience every time.” “ … the market can be very hard, and very discouraging for the flood of beginners.” Data science is a misleading job req “The reality is that “data science” has never been as much about machine learning as it has about cleaning, shaping data, and moving it from place to place.” Advice: Don’t get into data science (this amuses me). “Don’t do what everyone else is doing, because it won’t differentiate you.” “It’s much easier to come into a data science and tech career through the “back door”, i.e. starting out as a junior developer, or in DevOps, project management, and, perhaps most relevant, as a data analyst, information manager, or similar, than it is to apply point-blank for the same 5 positions that everyone else is applying to. It will take longer, but at the same time as you’re working towards that data science job, you’re learning critical IT skills that will be important to you your entire career.” Learn the skills needed for data science today Creating Python packages Putting R in production Optimizing Spark jobs so they run more efficiently Version controlling data Making models and data reproducible Version controlling SQL Building and maintaining clean data in data lakes Tooling for time series forecasting at scale Scaling sharing of Jupyter notebooks Thinking about systems for clean data Lots of JSON Data science is turning more and more into a mostly engineering field. Data scientists need to have “good generalist engineering skills with a data background.” Michael #4: RustPython via Fredrik Averpil A Python-3 (CPython >= 3.5.0) Interpreter written in Rust. 
Seems pretty active: Latest commit ac95b61 an hour ago… Goals Full Python-3 environment entirely in Rust (not CPython bindings) A clean implementation without compatibility hacks Contributing To start contributing, there are a lot of things that need to be done. Most tasks are listed in the issue tracker. Check issues labeled with good first issue if you wish to start coding. Rust does have direct WebAssembly support… Brian #5: Jupyter Notebook: An Introduction Mike Driscoll on RealPython Not the “all the cool things you can do with it”, but the “really, how do I start” tutorial. I think it should have included a mention of installing it in a venv and how to use %pip install, so I’ll include those things in these notes. Installing with pip install jupyter . Also a note that Jupyter is included with the Anaconda distribution. Note: Like everything else, I always install it in a virtual environment, if using pip, so the real installation instructions I recommend are: python3 -m venv venv --prompt jupyter source venv/bin/activate OR venv\scripts\activate.bat if windows pip install jupyter pip install [HTML_REMOVED] jupyter notebook That will launch a localhost web interface. Creating a new notebook within the web interface. Changing the “Untitled” name by clicking on the name. This was not obvious to me. Running cells, including the shift-enter keyboard shortcut. A run through the menu, stopping at non-obvious places “File” has “Save and Checkpoint” which is super cool. “Edit” has cell cut, copy, paste. But also has delete, split, merge, and cell movement. “Cell” menu has lots of cool run options, like “Run all above” and “Run all below” and others. Not just Python, but you can have terminal sessions and more from within Jupyter. A look at the “Running” tab. 
Quick overview of the markdown support for markdown cells Exporting notebooks using jupyter nbconvert Extra notes on installing packages from Jupyter: To pip install from the notebook, do this: %pip install numpy in a code cell. Michael #6: Python Developers Survey 2018 Results Python usage as a main language is up 5 percentage points from 79% in 2017 when Python Software Foundation conducted its previous survey. What do you use Python for? (2018/2017) 59%/51% Data analysis 56%/54% Web dev 39%/32% ML Web development is the only category with a large gap (56% vs. 36%) separating those using Python as their main language vs. as a supplementary language. For other types of development, the differences are much smaller. What do you use Python for the most? (single answer) 29%/29% web dev 17%/17% data analysis 11%/8% ML Like last year: 27% (Web development) ≈ 28% (Scientific development) Science = 17% + 11% for Data analysis + Machine learning Python 3 vs Python 2 84% Python 3 vs 16% Python 2. The use of Python 3 continues to grow rapidly. According to the latest research in 2017, 75% were using Python 3 compared with 25% for Python 2. Top 4 web frameworks (majority to the first two): Flask Django Tornado Pyramid Databases PostgreSQL MySQL SQLite MongoDB ORMs SQLAlchemy and Django ORM tied Extras: “Mentored sprints for diverse beginners” at PyCon “A newcomer’s introduction to contributing to an open source project” https://us.pycon.org/2019/hatchery/mentoredsprints/ Call for applications for projects open Feb 8 to March 14 Call for contributors, participants in the sprint also open Feb 8 to March 14 “If you are wondering if this event is for you: it definitely is and we would love to have you taking part in this sprint.” “This mentored sprint will take place on Saturday, May 4th, 2019 from 2:35pm to 6:30pm” Joke: via Florian Q: If you have some pseudo code (say in sample.txt) how do you most easily convert it to Python? 
A: Change the extension to .py Extra Joke: Python Song (with chapters!)
February 14, 2019
Sponsored by pythonbytes.fm/datadog Brian #1: Goodbye Virtual Environments? by Chad Smith venvs are great but they introduce some problems as well: Learning curve: explaining “virtual environments” to people who just want to jump in and code is not always easy Terminal isolation: Virtual Environments are activated and deactivated on a per-terminal basis Cognitive overhead: Setting up, remembering installation location, activating/deactivating PEP 582 — Python local packages directory This PEP proposes to add to Python a mechanism to automatically recognize a __pypackages__ directory and prefer importing packages installed in this location over user or global site-packages. This will avoid the steps to create, activate or deactivate “virtual environments”. Python will use the __pypackages__ from the base directory of the script when present. Try it now with pythonloc pythonloc is a drop in replacement for python and pip that automatically recognizes a __pypackages__ directory and prefers importing packages installed in this location over user or global site-packages. If you are familiar with node, __pypackages__ works similarly to node_modules. Instead of running python you run pythonloc and the __pypackages__ path will automatically be searched first for packages. And instead of running pip you run piploc and it will install/uninstall from __pypackages__. Michael #2: webassets Bundles and minifies CSS & JS files Been doing a lot of work to rank higher on the sites That led me to Google’s Lighthouse Despite 25ms response time to the network, Google thought my site was “kinda slow”, yikes! webassets has integration for the big three: Django, Flask, & Pyramid. 
But I prefer to just generate them and serve them off disk def build_asset(env: webassets.Environment, files: List[str], filters: str, output: str): bundle = webassets.Bundle( *files, filters=filters, output=output, env=env ) bundle.build(force=True) Brian #3: Bernat on Python Packaging 3 part series by Bernat Gabor Maintainer of tox and virtualenv Python packages. The State of Python Packaging Python packaging - Past, Present, Future Python packaging - Growing Pains Michael #4: What the mock? — A cheatsheet for mocking in Python Nice introduction Some examples @mock.patch('work.os') def test_using_decorator(self, mocked_os): work_on() mocked_os.getcwd.assert_called_once() And def test_using_context_manager(self): with mock.patch('work.os') as mocked_os: work_on() mocked_os.getcwd.assert_called_once() Brian #5: Transitions: The easiest way to improve your tech talk By Saron Yitbarek Jeff Atwood of CodingHorror noted “The people who can write and communicate effectively are, all too often, the only people who get heard. They get to set the terms of the debate.” Effectively presenting is part of effective communication. I love the focus of this article. Focused on one little aspect of improving the performance of a tech talk. Michael #6: Steering council announced Our new leaders are Barry Warsaw Brett Cannon Carol Willing Guido van Rossum Nick Coghlan Via Joe Carey We both think it’s great Guido is on the council.
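For the mocking cheatsheet item above (Michael #4), here is a self-contained variant of the two patching styles. The cheatsheet patches 'work.os' in its own work module. Since that module is not available here, this sketch defines a hypothetical work_on() inline and patches os.getcwd directly, which is an assumption made only so the example runs on its own.

```python
import os
from unittest import mock


def work_on():
    """Hypothetical stand-in for the cheatsheet's work.work_on()."""
    return f"{os.getcwd()}/output"


def test_using_context_manager():
    # The patch is active only inside the with-block.
    with mock.patch("os.getcwd", return_value="/fake/dir") as mocked_getcwd:
        result = work_on()
    mocked_getcwd.assert_called_once()
    assert result == "/fake/dir/output"


@mock.patch("os.getcwd", return_value="/fake/dir")
def test_using_decorator(mocked_getcwd):
    # The decorator passes the mock in as an extra argument
    # and undoes the patch when the function returns.
    result = work_on()
    mocked_getcwd.assert_called_once()
    assert result == "/fake/dir/output"


test_using_context_manager()
test_using_decorator()
print("both tests passed")
```

Under pytest the two explicit calls at the bottom are unnecessary, since the test_ functions are collected automatically.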
February 6, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: Inside python dict — an explorable explanation Interactive tutorial on dictionaries Searching efficiently in a list Why are hash tables called hash tables? Putting it all together to make an “almost”-Python-dict How Python dict really works internally Yes this is a super deep dive, but wow it’s cool. Tons of the code is runnable right there in the web page, including moving visual representations, highlighted code with current line of code highlighted. Some examples allow you to edit values and play with stuff. Michael #2: Embed Python in Unreal Engine 4 You may notice a theme throughout my set of picks on this episode Games built on Unreal Engine 4 include Fortnite: Save the World Gears of War 4 Marvel vs. Capcom: Infinite Moto Racer 4 System Shock (remake) Plugin embedding a whole Python VM in Unreal Engine 4 (both the editor and runtime). This means you can use the plugin to write other plugins, to automate tasks, to write unit tests and to implement gameplay elements. Here is an example usage. It’s a really nice overview and tutorial for the editor. For game elements, check out this section. Brian #3: Redirecting stdout with contextlib When I want to test the stdout output of some code, that’s easy, I grab the capsys fixture from pytest. But what if you want to grab the stdout of a method NOT while testing? Enter [contextlib.redirect_stdout(new_target)](https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stdout) so cool. And very easy to read. ex: f = io.StringIO() with redirect_stdout(f): help(pow) s = f.getvalue() also a version for stderr Michael #4: Panda3D via Kolja Lubitz Panda3D is an open-source, completely free-to-use engine for realtime 3D games, visualizations, simulations, experiments Not just games, could be science as well! The full power of the graphics card is exposed through an easy-to-use API. 
Panda3D combines the speed of C++ with the ease of use of Python to give you a fast rate of development without sacrificing on performance. Features: Platform Portability Flexible Asset Handling: Panda3D includes command-line tools for processing and optimizing source assets, allowing you to automate and script your content production pipeline to fit your exact needs. Library Bindings: Panda3D comes with out-of-the-box support for many popular third-party libraries, such as the Bullet physics engine, Assimp model loader, OpenAL Performance Profiling: Panda3D includes pstats — an over-the-network profiling system designed to help you understand where every single millisecond of your frame time goes. Brian #5: Why PyPI Doesn't Know Your Projects Dependencies Some questions you may have asked: > How can I produce a dependency graph for Python packages? > Why doesn’t PyPI show a project’s dependencies on its project page? > How can I get a project’s dependencies without downloading the package? > Can I search PyPI and filter out projects that have a certain dependency? If everything is in requirements.txt, you just might be able to, but… setup.py is dynamic. You gotta run it to see what’s needed. Dependencies might be environment specific. Windows vs Linux vs Mac, as an example. Nothing stopping someone from putting random.choice() for dependencies in a setup.py file. But that would be kinda evil. But could be done. (Listener homework?) The wheel format is way more predictable because it limits some of this freedom. wheels don’t get run when they install, they really just get unpacked. More info on wheels: Kind of a tangent, but why not: From: https://pythonwheels.com “Advantages of wheels Faster installation for pure Python and native C extension packages. Avoids arbitrary code execution for installation. (Avoids setup.py) Installation of a C extension does not require a compiler on Linux, Windows or macOS. 
Allows better caching for testing and continuous integration. Creates .pyc files as part of installation to ensure they match the Python interpreter used. More consistent installs across platforms and machines.” Michael #6: PyGame series via Matthew Ward Learn how to program in Python by building a simple dice game Build a game framework with Python using the PyGame module How to add a player to your Python game Using PyGame to move your game character around What's a hero without a villain? How to add one to your Python game Put platforms in a Python game with PyGame Also: Shout out to Mission Python book: Code a Space Adventure Game!
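For the contextlib item above (Brian #3), a short runnable sketch that captures both streams outside of a test. The noisy() function is a made-up stand-in for whatever code you want to capture. redirect_stdout and redirect_stderr are both in the standard library.

```python
import io
import sys
from contextlib import redirect_stderr, redirect_stdout


def noisy():
    """Hypothetical function that writes to both output streams."""
    print("to stdout")
    print("to stderr", file=sys.stderr)


out, err = io.StringIO(), io.StringIO()

# Inside the with-block, sys.stdout and sys.stderr point at the buffers;
# the original streams are restored on exit.
with redirect_stdout(out), redirect_stderr(err):
    noisy()

captured_out = out.getvalue()
captured_err = err.getvalue()
print(repr(captured_out), repr(captured_err))
```

Both context managers swap sys.stdout and sys.stderr at the Python level, so output written directly to the underlying file descriptors (for example by a subprocess) is not captured this way.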
February 2, 2019
Sponsored by pythonbytes.fm/datadog Special guest: Nina Zakharenko Brian #1: Great Expectations A set of tools intended for batch time testing of data pipeline data. Introduction to the problem doc: Down with Pipeline debt / Introducing Great Expectations expect_[something]() methods that return json formatted descriptions of whether or not the passed in data matches your expectations. Can be used programmatically or interactively in a notebook. (video demo). For programmatic use, I’m assuming you have to put code in place to stop a pipeline stage if expectations aren’t met, and write failing json result to a log or something. Examples, just a few, full list is big: Table shape: expect_column_to_exist, expect_table_row_count_to_equal Missing values, unique values, and types: - expect_column_values_to_be_unique, expect_column_values_to_not_be_null Sets and ranges expect_column_values_to_be_in_set String matching expect_column_values_to_match_regex Datetime and JSON parsing Aggregate functions expect_column_stdev_to_be_between Column pairs Distributional functions expect_column_chisquare_test_p_value_to_be_greater_than Nina #2: Using CircuitPython and MicroPython to write Python for wearable electronics and embedded platforms I’ve been playing with electronics projects as a hobby for the past two years, and a few months ago turned my attention to Python on microcontrollers MicroPython is a lean and efficient implementation of Python3 that can run on microcontrollers with just 256k of code space, and 16k of RAM. CircuitPython is a port of MicroPython, optimized for Adafruit devices. Some of the devices that run Python are as small as a quarter. My favorite Python hardware platform for beginners is Adafruit’s Circuit PlayGround Express. It has everything you need to get started with programming hardware without soldering. All you’ll need is alligator clips for the conductive pads. 
The board features NeoPixel LEDs, buttons, switches, temperature, motion, and sound sensors, a tiny speaker, and lots more. You can even use it to control servos, tiny motor arms. Best of all, it only costs $25. If you want to program the Circuit PlayGround Express with a drag-n-drop style scratch-like interface, you can use Microsoft’s MakeCode. It’s perfect for kids and you’ll find lots of examples on their site. Best of all, there are tons of guides for Python projects to build on their website, from making your own synthesizers, to jewelry, to silly little robots. Check out the repo for my Python-powered earrings, see a photo, or a demo. Sign up for the Adafruit Python for Microcontrollers mailing list here, or see the archives here. Michael #3: Data class CSV reader Map CSV to Data Classes You probably know about reading CSV files Maybe as tuples Better with csv.DictReader This library is similar but maps Python 3.7’s data classes to rows of CSV files Includes type conversions (say string to int) Automatic type conversion. DataclassReader supports str, int, float, complex and datetime DataclassReader uses the type annotations to validate the data in the CSV file. Helps you troubleshoot issues with the data in the CSV file. DataclassReader will show exactly which line of the CSV file contains errors. Extract only the data you need. It will only parse the properties defined in the dataclass It uses dataclass features that let you define metadata properties so the data can be parsed exactly the way you want. Make the code cleaner. No more extra loops to convert data to the correct type, perform validation, set default values, the DataclassReader will do all this for you Default fallback values, more. Brian #4: How to Rock Python Packaging with Poetry and Briefcase Starts with a discussion of the packaging (for those readers that don’t listen to Python Bytes, I guess.) 
However, it also puts flit, pipenv, and poetry in context with each other, which is nice. Runs through a tutorial of how to build a pyproject.toml based project using poetry and briefcase. We’ve talked about Poetry before, on episode 100. pyproject.toml is discussed extensively on Test & Code 52. briefcase is new, though, it’s a project for creating standalone native applications for Mac, Windows, Linux, iOS, Android, and more. The tutorial also discusses using poetry directly to publish to the test-pypi server. This is a nice touch. Use the test-pypi before pushing to the real pypi. Very cool. Nina #5: awesome-python-security *🕶🐍🔐, a collection of tools, techniques, and resources to make your Python more secure* All of your production and client-facing code should be written with security in mind This list features a few resources I’ve heard of such as Anthony Shaw’s excellent 10 common security gotchas article which highlights problems like input injection and depending on assert statements in production, and a few that are new to me: OWASP (Open Web Application Security Project) Python Resources at pythonsecurity.org bandit a tool to find common security issues in Python bandit features a lot of useful plugins, that test for issues like: hardcoded password strings leaving flask debug on in production using exec() in your code & more detect-secrets, a tool to detect secrets left accidentally in a Python codebase & lots more like resources for learning about security concepts like cryptography See the full list for more Michael #6: pydbg Python implementation of the Rust dbg macro Best seen with an example. 
Rather than printing things you want to inspect, you: a = 2 b = 3 dbg(a+b) def square(x: int) -> int: return x * x dbg(square(a)) outputs: [testfile.py:4] a+b = 5 [testfile.py:9] square(a) = 4 Extras: Brian: pathlib + pytest tmpdir → tmp_path & tmp_path_factory https://docs.pytest.org/en/latest/tmpdir.html These two new fixtures (as of pytest 3.9) act like the good old tmpdir and tmpdir_factory, but return pathlib Path objects. Awesome. Michael: The Art of Python is a miniature arts festival at PyCon North America 2019, focusing on narrative, performance, and visual art. We intend to encourage and showcase novel art that helps us share our emotionally charged experiences of programming (particularly in Python). We hope that by attending, our audience will discover new aspects of empathy and rapport, and find a different kind of delight and perspective than might otherwise be expected at a large conference. StackOverflow Survey is Open! https://stackoverflow.az1.qualtrics.com/jfe/form/SV_1RGiufc1FCJcL6B NumPy Is Awaiting Fix for Critical Remote Code Execution Bug via Doug Sheehan The issue was raised on January 16 and affects NumPy versions 1.10 (released in 2015) through 1.16, which is the latest release at the moment, released on January 14 The problem is with the 'pickle' module, which is used for transforming Python object structures into a format that can be stored on disk or in databases, or that allows delivery across a network. The issue was reported by security researcher Sherwel Nan, who says that if a Python application loads malicious data via the numpy.load function an attacker can obtain remote code execution on the machine. Get your google data All google docs in MS Office format via https://takeout.google.com/settings/takeout All Gmail in MBOX format from there as well Hint: Start with nothing selected ;) Nina: I’m teaching a two day Intro and Intermediate Python course on March 19th and 20th. 
The class will live-stream for free here on each day of or join in-person from downtown Minneapolis. All of the course materials will be released for free as well. I recently recorded a series of videos with Carlton Gibson (Django maintainer) on developing Django Web Apps with VS Code, deploying them to Azure with a few clicks, setting up a Continuous Integration / Continuous Delivery pipeline, and creating serverless apps. Watch the series here: https://aka.ms/python-videos I’ll be a mentor at a brand new hatchery event at PyCon US 2019, mentored sprints for diverse beginners organized by Tania Allard. The goal is to help underrepresented folks at PyCon contribute to open source in a supportive environment. The details will be located here (currently a placeholder) when they’re finalized. Catch my talk about electronics projects in Python with LEDs at PyCascades in Seattle on February 24th. Currently tickets are still for sale. If you haven’t tried the Python extension for VS Code, now is a great time. The December release included some killer features, such as remote Jupyter support, and exporting Python files as Jupyter notebooks. Keep up with future releases at the Python at Microsoft blog.
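Going back to Michael #3 (the data class CSV reader): the core idea can be sketched with nothing but the standard library. This is not the library's DataclassReader API; `User`, `read_as`, and the sample data are hypothetical names made up for illustration, and the line numbering assumes a plain CSV with a single header row and no multi-line fields.

```python
import csv
import io
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class User:  # hypothetical record type for the demo
    name: str
    age: int
    score: float


def read_as(cls, csv_file):
    """Yield one `cls` instance per CSV row, converting each column
    with the type named in the dataclass annotation."""
    hints = get_type_hints(cls)
    reader = csv.DictReader(csv_file)
    # start=2 because line 1 of the file is the header row
    for line_no, row in enumerate(reader, start=2):
        kwargs = {}
        # only columns declared on the dataclass are read; extras are ignored
        for field in fields(cls):
            raw = row[field.name]
            try:
                kwargs[field.name] = hints[field.name](raw)
            except ValueError as exc:
                raise ValueError(
                    f"line {line_no}: column {field.name!r} "
                    f"has invalid value {raw!r}"
                ) from exc
        yield cls(**kwargs)


SAMPLE = "name,age,score,ignored\nAda,36,9.5,x\nGrace,45,8.0,y\n"
users = list(read_as(User, io.StringIO(SAMPLE)))
print(users)
```

The type annotations do double duty here: they document the row shape and drive the string-to-int/float conversion, which is the part that usually ends up as an extra hand-written loop.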
January 26, 2019
Sponsored by pythonbytes.fm/digitalocean Brian #1: What should be in the Python standard library? on lwn.net by Jake Edge There was a discussion recently about what should be in the standard library, triggered by a request to add LZ4 compression. Kinda hard to summarize but we’ll try: Jonathan Underwood proposed adding LZ4 compression to stdlib. Can of worms opened zlib and bz2 already in stdlib Brett proposed making something similar to hashlib for compression algorithms. Against adding it: lz4 not needed for stdlib, and actually, bz2 isn’t either, but it’s kinda late to remove. PyPI is easy enough. put stuff there. Led to a discussion of the role of stdlib. If it’s batteries included, shouldn’t we add new batteries Some people don’t have access to PyPI easily Do we never remove elements? really? Maybe we should have a lean stdlib and a thicker standard distribution of selected packages who would decide? same problem exists then of depending on it. How to remove stuff? Steve Dower would rather see a smaller standard library with some kind of "standard distribution" of PyPI modules that is curated by the core developers. A leaner stdlib could speed up Python version schedules and reduce burden on core devs to maintain seldom used packages. See? can of worms. In any case, all this would require a PEP, so we have to wait until we have a PEP process decided on. Michael #2: Data Science portal for Home Assistant launched via Paul Cutler Home Assistant is launching a data science portal to teach you how you can learn from your own smart home data. In 15 minutes you setup a local data science environment running reports. A core principle of Home Assistant is that a user has complete ownership of their personal data. A users data lives locally, typically on the SD card in their Raspberry Pi The Home Assistant Data Science website is your one-stop-shop for advice on getting started doing data science with your Home Assistant data. 
To accompany the website, we have created a brand new Hass.io Add-on JupyterLab lite, which allows you to run a data science IDE called JupyterLab directly on your Raspberry Pi hosting Home Assistant. You do your data analysis locally; your data never leaves your local machine. When you build something cool, you can share the notebook without the results, so people can run it at their homes too. We have also created a Python library called the HASS-Data-Detective which makes it super easy to get started investigating your Home Assistant data using modern data science tools such as Pandas. Check out the Getting Started notebook IoT aside: I finally found my first IoT project: Recording in progress button. Brian #3: What's the future of the pandas library? Kevin Markham over at dataschool.io pandas is gearing up to move towards a 1.0 release. Currently rc-ing 0.24 Plans are to get there “early 2019”. Some highlights: method chaining - encouraged by the core team; to encourage it further, more methods will support chaining Apache Arrow likely to be part of pandas backend sometime after 1.0 Extension arrays - allow you to create custom data types Deprecations: inplace parameter. It doesn’t work with chaining, doesn’t actually prevent copies, and causes codebase complexity ix accessor, use loc and iloc instead Panel data structure. Use MultiIndex instead SparseDataFrame. Just use a normal DataFrame legacy python support Michael #4: PyOxidizer PyOxidizer is a collection of Rust crates that facilitate building libraries and binaries containing Python interpreters. PyOxidizer is capable of producing a single file executable - with all dependencies statically linked and all resources (like .pyc files) embedded in the executable The Oxidizer part of the name comes from Rust: executables produced by PyOxidizer are compiled from Rust and Rust code is responsible for managing the embedded Python interpreter and all its operations.
PyOxidizer is similar in nature to PyInstaller, Shiv, and other tools in this space. What generally sets PyOxidizer apart is Produced executables contain an embedded, statically-linked Python interpreter have no additional run-time dependency on the target system runs everything from memory (as opposed to e.g. extracting Python modules to a temporary directory and loading them from there). Brian #5: Working With Files in Python by Vuyisile Ndlovu on RealPython Very comprehensive write up on working with files and directories Includes legacy and modern methods. Pay attention to pathlib parts if you are using 3.4 plus Also great for “if you used to do x, here’s how to do it with pathlib”. Included: Directory listings getting file attributes creating directories file name pattern matching traversing directories doing stuff with the files in there creating temp directories and files deleting, copying, moving, renaming archiving with zip and tar including reading those looping over files Michael #6: $ python == $ python3? via David Furphy Homebrew tried this recently & got "persuaded" to reverse. Also in recent discussion of edits to PEP394, GvR said absolutely not now, probably not ever. Guido van Rossum RE: python doesn’t exist on macOS as a command: Did you mean python2 there? In my experience macOS comes with python installed (and invoking Python 2) but no python2 link (hard or soft). In any case I'm not sure how this strengthens your argument. I'm also still unhappy with any kind of endorsement of python pointing to python3. When a user gets bitten by this they should receive an apology from whoever changed that link, not a haughty "the PEP endorses this". Regardless of what macOS does I think I would be happier in a future where python doesn't exist and one always has to specify python2 or python3. Quite possibly there will be an age where Python 2, 3 and 4 all overlap, and EIBTI. 
Extras: Michael: A letter to the Python community in Africa via Anthony Shaw Believe the broader international Python and Software community can learn a lot from what so many amazing people are doing across Africa. e.g. The attendance of PyCon NA was 50% male and 50% female. Joke: via Luke Russell: A: “Knock Knock” B: “Who’s There" A: ……………………………………………………………………………………….“Java” Also: Java 4EVER video is amazing: youtube.com/watch?v=kLO1djacsfg
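Back on Brian #5 (Working With Files in Python): here is a small sketch of the "if you used to do x, here's how to do it with pathlib" idea, plus the zip archiving the article covers. The directory layout and file names are made up for the demo, and everything happens inside a throwaway temp directory.

```python
import fnmatch
import os
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # set up a tiny tree to work on (hypothetical file names)
    os.makedirs(os.path.join(tmp, "logs", "old"))
    for rel in ("logs/app.log", "logs/old/app.log", "logs/readme.txt"):
        Path(tmp, rel).write_text("hello\n")

    # legacy: os.walk + fnmatch + os.path.join
    legacy = []
    for dirpath, _dirnames, filenames in os.walk(tmp):
        for name in fnmatch.filter(filenames, "*.log"):
            legacy.append(os.path.join(dirpath, name))

    # modern: one pathlib call does the traversal and the matching
    modern = [str(p) for p in Path(tmp).rglob("*.log")]

    # archiving with zip, then reading the archive back
    archive = Path(tmp, "logs.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        for p in sorted(Path(tmp).rglob("*.log")):
            zf.write(p, arcname=p.relative_to(tmp).as_posix())
    with zipfile.ZipFile(archive) as zf:
        archived = zf.namelist()

print(sorted(legacy) == sorted(modern))
print(archived)
```

Both traversals find the same files; the pathlib version just has fewer moving parts.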
January 18, 2019
Sponsored by https://pythonbytes.fm/digitalocean Brian #1: Advent of Code 2018 Solutions Michael Fogleman Even if you didn’t have time or energy to do the 2018 AoC, you can learn from other people’s solutions. Here’s one set written up in a nice blog post. Michael #2: Python Lands on the Windows 10 App Store Python Software Foundation recently released Python 3.7 as an app on the official Windows 10 app store. Python 3.7 is now available to install from the Microsoft Store, meaning you no longer need to manually download and install the app from the official Python website. There is one limitation: “Because of restrictions on Microsoft Store apps, Python scripts may not have full write access to shared locations such as TEMP and the registry.” Discussed with Steve Dower over on Talk Python 191 Brian #3: How I Built A Python Web Framework And Became An Open Source Maintainer Florimond Manca Bocadillo - “A modern Python web framework filled with asynchronous salsa” “Maintaining an open source project is a marathon, not a sprint.” The end of the article has tips on the following topics, including recommendations and tool choices: Project definition Marketing & Communication Community Project management Code quality Documentation Versioning and releasing Michael #4: Python maintainability score via Wily via Anthony Shaw A Python application for tracking, reporting on timing and complexity in tests Easiest way to calculate it is with wily https://github.com/tonybaloney/wily … the metrics are ‘maintainability.mi’ for a numeric score and ‘maintainability.rank’ for the A-F scale.
Build an index: wily build src Inspect report: wily report file Graph: wily graph file metric Brian #5: A couple fun awesome lists Awesome Python Security resources Tools web framework hardening, ex: secure.py multi tools static code analysis, ex: bandit vulnerabilities and security advisories cryptography app templates Education lots of resources for learning Companies Awesome Flake8 Extensions clean code testing, including flake8-pytest - Enforces to use pytest-style assertions flake8-mock - Provides checking mock non-existent methods security documentation enhancements copyrights Michael #6: fastlogging via Robert Young A faster replacement of the standard logging module with a mostly compatible API. For a single log file it is ~5x faster and for rotating log file ~13x faster. It comes with the following features: (colored, if colorama is installed) logging to console logging to file (maximum file size with rotating/history feature can be configured) old log files can be compressed (the compression algorithm can be configured) count same successive messages within a 30s time frame and log only once the message with the counted value. log domains log to different files writing to log files is done in (per file) background threads, if configured configure callback function for custom detection of same successive log messages configure callback function for custom message formatter configure callback function for custom log writer >> import antigravity
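One of the fastlogging features from Michael #6, counting repeated successive messages inside a time window and logging them only once, can be approximated on top of the standard logging module. To be clear, this is a rough sketch of the idea and not how fastlogging implements it; `DedupFilter`, `ListHandler`, and the logger name are invented for the example.

```python
import logging
import time


class DedupFilter(logging.Filter):
    """Drop a record that repeats the previous message within `window` seconds.

    When a different message ends the run, that next record is annotated
    with how many repeats were suppressed.
    """

    def __init__(self, window=30.0, clock=time.monotonic):
        super().__init__()
        self.window = window
        self.clock = clock
        self.last_msg = None
        self.first_seen = 0.0
        self.suppressed = 0

    def filter(self, record):
        msg = record.getMessage()
        now = self.clock()
        if msg == self.last_msg and now - self.first_seen < self.window:
            self.suppressed += 1
            return False  # swallow the repeat
        if self.suppressed:
            record.msg = (
                f"{msg} (previous message repeated {self.suppressed} more times)"
            )
            record.args = ()
        self.last_msg, self.first_seen, self.suppressed = msg, now, 0
        return True


class ListHandler(logging.Handler):
    """Collects formatted messages in memory so the demo is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


log = logging.getLogger("dedup_demo")
log.setLevel(logging.INFO)
log.propagate = False
handler = ListHandler()
handler.addFilter(DedupFilter(window=30.0))
log.addHandler(handler)

for _ in range(3):
    log.info("disk almost full")
log.info("backup finished")
print(handler.lines)
```

Note one simplification: the count is only reported when a later, different message arrives, so a trailing run of repeats at shutdown would go unreported here.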
January 11, 2019
Sponsored by https://pythonbytes.fm/datadog Brian #1: nbgrader nbgrader: A Tool for Creating and Grading Assignments in the Jupyter Notebook The Journal of Open Source Education, paper accepted 6-Jan-2019 nbgrader documentation, including an intro video From the JOSE article: “nbgrader is a flexible tool for creating and grading assignments in the Jupyter Notebook (Kluyver et al., 2016). nbgrader allows instructors to create a single, master copy of an assignment, including tests and canonical solutions. From the master copy, a student version is generated without the solutions, thus obviating the need to maintain two separate versions. nbgrader also automatically grades submitted assignments by executing the notebooks and storing the results of the tests in a database. After auto-grading, instructors can manually grade free responses and provide partial credit using the formgrader Jupyter Notebook extension. Finally, instructors can use nbgrader to leave personalized feedback for each student’s submission, including comments as well as detailed error information.” CS teaching methods have come a long way since I was turning in floppies and code printouts. Michael #2: profanity-check A fast, robust Python library to check for offensive language in strings. profanity-check uses a linear SVM model trained on 200k human-labeled samples of clean and profane text strings. This makes profanity-check both robust and extremely performant. Other libraries like profanity-filter use more sophisticated methods that are much more accurate but at the cost of performance. profanity-filter runs in 13,000ms vs 24ms for profanity-check in a benchmark Two ways to use: predict(text) → 0 or 1 (1 = bad) predict_prob(text) → a score in [0, 1] (1 = bad) Brian #3: An Introduction to Python Packages for Absolute Beginners Ever tried to explain the difference between module and package?
Between package-in-the-directory-with-init sense and package-you-can-distribute-and-install-with-pip sense? Here’s the article to read beforehand. Modules, packages, using packages, installing, importing, and more. And that’s not even getting into flit and poetry, etc. But it’s a good place to start for people new to Python. Michael #4: Python Dependencies and IoC via Joscha Götzer Open-closed principle is at work with these and is super valuable to testing (one of the SOLID principles): Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification. There is a huge debate around why Python doesn’t need DI or Inversion of Control (IoC), and a quick stackoverflow search yields multiple results along the lines of “python is a scripting language and dynamic enough so that DI/IoC makes no sense”. However, especially in large projects it might reduce the cognitive load and decoupling of individual components Dependency Injector: I couldn’t get this one to work on windows, as it needs to compile some C libraries and some Visual Studio tooling was missing that I couldn’t really install properly. The library looks quite promising though, but sort of static with heavy usage of containers and not necessarily pythonic. Injector: The library that above mentioned article talks about, a little Java-esque pinject: Has been unmaintained for about 5 years, and only recently got new attention from some open source people who try to port it to python3. A product under Google copyright, and looks quite nice despite the lack of python3 bindings. Probably the most feature-rich of the listed libraries. python-inject: I discovered that one while writing this email, not really sure if it’s any good. Nice use of type annotations and testing features di-py: Only works up to python 3.4, so I’ve also never tried it (I’m one of those legacy python haters, I’m sure you can relate 😄). Serum: This one is a little too explicit to my mind. 
It makes heavy use of context managers (literally with Context(...): everywhere 😉) and I’m not immediately sure how to work with it. In this way, it is quite powerful though. Interesting use of class decorators. And now on to my favorite and a repeated recommendation of mine around the internet→ Haps: This lesser-known, lightweight library is sort of the new kid on the block, and really simple to use. As some of the other libraries, it uses type annotations to determine the kind of object it is supposed to instantiate, and automatically discovers the required files in your project folder. Haps is very pythonic and fits into apps of any size, helping to ensure modularization as the only dependency of your modules will be one of the types provided by the library. Pretty good example here. Brian #5: A Gentle Introduction to Pandas Really a gentle introduction to the Pandas data structures Series and DataFrame. Very gentle, with console examples. Create series objects: from an array from an array, and change the indexing from a dictionaries from a scalar, cool. didn’t know you could do that Accessing elements in a series DataFrames sorting, slicing selecting by label, position statistics on columns importing and exporting data Michael #6: Don't use the greater than sign in programming One simple thing that comes up time and time again is the use of the greater than sign as part of a conditional while programming. Removing it cleans up code. Let's say that I want to check that something is between 5 and 10. There are many ways I can do this x > 5 and 10 > x 5 < x and 10 > x x > 5 and x < 10 10 < x and x < 5 x < 10 and x > 5 x < 10 and 5 < x Sorry, one of those is incorrect. 
Go ahead and find out which one If you remove the use of the greater than sign then only 2 options remain x < 10 and 5 < x 5 < x and x < 10 The last is nice because x is literally between 5 and 10 There is also a nice way of expressing that "x is outside the limits of 5 and 10” x < 5 or 10 < x Again, this expresses it nicely because x is literally outside of 5 to 10. Interesting comment: What is cleaner or easier to read comes down to personal taste. But how to express "all numbers greater than 1" without '>'? ans: 1 < allNumbers Extras Michael Teaching Python podcast by Kelly Paredes & Sean Tibor Github private repos (now free) EuroPython 2019 announced South African AWS Data Center coming (via William H.) Pandas is dropping legacy Python support any day now Joke: Harry Potter Parser Tongue via Nick Spirit
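For Michael #6 (the greater than sign item), the six spellings are easy to check by brute force, and Python can go one step further than the article with a chained comparison. A quick sketch; the helper names `between` and `outside` are just for this example.

```python
# The six spellings from the item above, as functions of x.
candidates = {
    "x > 5 and 10 > x": lambda x: x > 5 and 10 > x,
    "5 < x and 10 > x": lambda x: 5 < x and 10 > x,
    "x > 5 and x < 10": lambda x: x > 5 and x < 10,
    "10 < x and x < 5": lambda x: 10 < x and x < 5,
    "x < 10 and x > 5": lambda x: x < 10 and x > 5,
    "x < 10 and 5 < x": lambda x: x < 10 and 5 < x,
}


def between(x):
    # Python's chained comparison reads exactly like the number line
    return 5 < x < 10


def outside(x):
    return x < 5 or 10 < x


samples = range(0, 16)
wrong = [
    text
    for text, check in candidates.items()
    if any(check(x) != between(x) for x in samples)
]
print(wrong)  # ['10 < x and x < 5']
```

That odd one out can never be true, since no number is both above 10 and below 5. Also worth noticing: the endpoints 5 and 10 are neither `between` nor `outside` with these strict comparisons.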
January 5, 2019
Sponsored by https://pythonbytes.fm/datadog Brian #1: loguru: Python logging made (stupidly) simple Finally, a logging interface that is just slightly more syntax than print to do mostly the right thing, and all that fancy stuff like log rotation is easy to figure out. i.e. a logging API that fits in my brain. bonus: README is a nice tour of features with examples. Features: Ready to use out of the box without boilerplate No Handler, no Formatter, no Filter: one function to rule them all Easier file logging with rotation / retention / compression Modern string formatting using braces style Exceptions catching within threads or main Pretty logging with colors Asynchronous, Thread-safe, Multiprocess-safe Fully descriptive exceptions Structured logging as needed Lazy evaluation of expensive functions Customizable levels Better datetime handling Suitable for scripts and libraries Entirely compatible with standard logging Personalizable defaults through environment variables Convenient parser Exhaustive notifier Michael #2: Python gets a new governance model by Brett Cannon July 2018, Guido steps down Python progress has basically been on hold since then ended up with 7 governance proposals Voting was open to all core developers as we couldn't come up with reasonable criteria that we all agreed to as to what defined an "active" core dev And the winner is ... In the end PEP 8016, the steering council proposal, won. It was a decisive win against second place PEP 8016 is heavily modeled on the Django project's organization (to the point that the PEP had stuff copy-and-pasted from the original Django governance proposal). What it establishes is a steering council of five people who are to determine how to run the Python project. Short of not being able to influence how the council itself is elected (which includes how the electorate is selected), the council has absolute power.
result of the vote prevents us from ever having the Python project be leaderless again, it doesn't directly solve how to guide the language's design. What's next? The next step is we elect the council. It's looking like nominations will be from Monday, January 07 to Sunday, January 20 and voting from Monday, January 21 to Sunday, February 03 A key point I hope people understand is that while we solved the issue of project management that stemmed from Guido's retirement, the council will need to be given some time to solve the other issue of how to manage the design of Python itself. Brian #3: Why you should be using pathlib Tour of pathlib from Trey Hunner pathlib combines most of the commonly used file and directory operations from os, os.path, and glob. uses objects instead of strings as of Python 3.6, many parts of stdlib support pathlib since pathlib.Path methods return Path objects, chaining is possible convert back to strings if you really need to for pre-3.6 code Examples: make a directory: Path('src/__pypackages__').mkdir(parents=True, exist_ok=True) rename a file: Path('.editorconfig').rename('src/.editorconfig') find some files: top_level_csv_files = Path.cwd().glob('*.csv') recursively: all_csv_files = Path.cwd().rglob('*.csv') read a file: Path('some/file').read_text() write to a file: Path('.editorconfig').write_text('# config goes here') with open(path, mode) as x works with Path objects as of 3.6 Follow up article by Trey: No really, pathlib is great Michael #4: Altair and Altair Recipes via Antonio Piccolboni (he wrote altair_recipes) Altair: Declarative statistical visualization library for Python Altair is developed by Jake Vanderplas and Brian Granger By statistical visualization they mean: The data source is a DataFrame that consists of columns of different data types (quantitative, ordinal, nominal and date/time). The DataFrame is in a tidy format where the rows correspond to samples and the columns correspond to the observed variables. 
The data is mapped to the visual properties (position, color, size, shape, faceting, etc.) using the group-by data transformation. Nice example that I can get behind # cars = some Pandas data frame alt.Chart(cars).mark_point().encode( x='Horsepower', y='Miles_per_Gallon', color='Origin', ) altair_recipes Altair allows generating a wide variety of statistical graphics in a concise language, but lacks, by design, pre-cooked and ready to eat statistical graphics, like the boxplot or the histogram. Examples: https://altair-recipes.readthedocs.io/en/latest/examples.html They take a few lines only in altair, but I think they deserve to be one-liners. altair_recipes provides that level on top of altair. The idea is not to provide a multitude of creative plots with fantasy names (the way seaborn does) but a solid collection of classics that everyone understands and cover most major use cases: the scatter plot, the boxplot, the histogram etc. Fully documented, highly consistent API (see next package), 90%+ test coverage, maintainability grade A, this is professional stuff if I may say so myself. Brian #5: A couple fun pytest plugins pytest-picked Using git status, this plugin allows you to: Run only tests from modified test files Run tests from modified test files first, followed by all unmodified tests Kinda hard to overstate the usefulness of this plugin to anyone developing or debugging a test. Very, very cool. pytest-clarity Colorized left/right comparisons Early in development, but already helpful. I recommend running it with -qq if you don’t normally run with -v/--verbose since it overrides the verbosity currently. Michael #6: Secure 🔒 headers and cookies for Python web frameworks Python package called Secure, which sets security headers and cookies (as a start) for Python web frameworks. I was listening to the Talk Python To Me episode “Flask goes 1.0” with Flask maintainer David Lord. 
At the end of the interview he was asked about notable PyPI packages and spoke about Flask-Talisman, a third-party package to set security headers in Flask. As a security professional, it was surprising and encouraging to hear the maintainer of the most popular Python web framework speak passionately about a security package. I had recently been experimenting with emerging Python web frameworks and realized there was a gap in security packages. That inspired Caleb to (humbly) see if it were possible to make a package to correct that. I started with Responder and then expanded to support more frameworks. The outcome was Secure with functions to support aiohttp, Bottle, CherryPy, Falcon, hug, Pyramid, Quart, Responder, Sanic, Starlette and Tornado (most of these, if not all, have been featured on Talk Python) and can also be utilized by frameworks not officially supported. The goal is to be minimalistic, lightweight and be implemented in a way that does not disrupt an individual framework’s design. I have had some great feedback and suggestions from the developer and OWASP community, including some awesome discussions with the OWASP Secure Project and the Sanic core team. Added support for Flask and Django too. Secure Cookies is nice in the mix Extras: Michael: SQLite bug impacts thousands of apps, including all Chromium-based browsers See https://twitter.com/mborus/status/1080874700924964864 Since this bug is triggered by an SQL command, general CPython usage should not be affected, as long as you don’t run arbitrary SQL commands provided by the outside. Seems to NOT be a problem in CPython: https://twitter.com/mborus/status/1080883549308362753 Michael: Follow up to our AI and healthcare conversation via Bradley Hintze I found your discussion of deep learning in healthcare interesting, no doubt because that is my area. I am the data scientist for the National Oncology Program at the Veterans Health Administration.
I work directly with clinicians and it is my strong opinion that AI cannot take the job from the MD. It will however make caring for patients much more efficient as AI takes care of the low hanging fruit, if you will. Healthcare, believe it or not, is a science and an art. This is why AI is never going to make doctors obsolete. It will, however, make doctors more efficient and demand a more sophisticated doctor -- one that understands AI enough to not only trust it but, crucially, comprehend its limits. Michael: Upgrade to Python 3.7.2 If you install via Homebrew, it’s time for brew update && brew upgrade Michael: New course! Introduction to Ansible
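The pathlib one-liners from Brian #3 in this episode are easier to try when they are strung together in a scratch directory. A sketch using only the standard library; the CSV file names are made up, and the temp directory stands in for a real project folder.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)

    # make a directory, including any missing parents
    (project / "src" / "__pypackages__").mkdir(parents=True, exist_ok=True)

    # write to a file, then read it back
    config = project / ".editorconfig"
    config.write_text("# config goes here")
    text = config.read_text()

    # rename (move) a file into src/
    moved = project / "src" / ".editorconfig"
    config.rename(moved)
    moved_exists = moved.exists()
    original_gone = not config.exists()

    # find files at the top level, then recursively
    (project / "top.csv").write_text("a,b\n")
    (project / "src" / "nested.csv").write_text("c,d\n")
    top_level_csv_files = sorted(p.name for p in project.glob("*.csv"))
    all_csv_files = sorted(p.name for p in project.rglob("*.csv"))

    # open() accepts Path objects directly as of Python 3.6
    with open(moved) as f:
        first_line = f.readline()

print(top_level_csv_files)
print(all_csv_files)
```

Because the `/` operator returns new Path objects, every intermediate path above can be reused or chained without converting back to strings.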
December 26, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean This episode originally aired on Talk Python at talkpython.fm/192. It's been a fantastic year for Python. Literally, every year is better than the last with so much growth and excitement in the Python space. That's why I've asked two of my knowledgeable Python friends, Dan Bader and Brian Okken, to help pick the top 10 stories from the Python community for 2018. Guests Brian Okken @brianokken Dan Bader @dbader_org 10: Python 3.7: Cool New Features in Python 3.7 9: Changes in versioning patterns ZeroVer: 0-based Versioning Calendar Versioning Semantic Versioning 2.0.0 8: Python is becoming the world’s most popular coding language Economist article 7: 2018 was the year data science Pythonistas == web dev Pythonistas Python Developers Survey Results Covered in depth on Talk Python 176 6: Black Project Soundgarden: “Black Hole Sun” 5: New PyPI launched! Python Package Index 4: Rise of Python in the embedded world Covered at Python Bytes 3: Legacy Python's days are fading? Python 2.7 -- bugfix or security before EOL? Python 2 death clock: https://pythonclock.org/ 2: It's the end of innocence for PyPI Twelve malicious Python libraries found and removed from PyPI 1: Guido stepped down as BDFL python-committers: Transfer of power Proposals for new governance structure
December 18, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Python Descriptors Are Magical Creatures an excellent discussion of understanding @property and Python’s descriptor protocol. discussion includes getter, setter, and deleter methods you can override. Michael #2: Data Science Survey 2018 JetBrains JetBrains polled over 1,600 people involved in Data Science and based in the US, Europe, Japan, and China, in order to gain insight into how this industry sector is evolving Key Takeaways Most people assume that Python will remain the primary programming language in the field for the next 5 years. Python is currently the most popular language among data scientists. Data Science professionals tend to use Keras and Tableau, while amateur data scientists are more likely to prefer Microsoft Azure ML. Most common activities among pros and amateurs: Data processing Data visualization Main programming language for data analysis Python 57% R 15% Julia 0% IDEs and Editors Jupyter 43% PyCharm 38% RStudio 23% … Brian #3: cache.py cache.py is a one file python library that extends memoization across runs using a cache file. memoization is an incredibly useful technique that many self taught or on the job taught developers don’t know about, because it’s not obvious. example: import cache @cache.cache() def expensive_func(arg, kwarg=None): # Expensive stuff here return arg The @cache.cache() function can take multiple arguments. @cache.cache(timeout=20) - Only caches the function for 20 seconds. @cache.cache(fname="my_cache.pkl") - Saves cache to a custom filename (defaults to hidden file .cache.pkl) @cache.cache(key=cache.ARGS[KWARGS,NONE]) - Check against args, kwargs or neither of them when doing a cache lookup. Michael #4: Setting up the data science tools part of a larger video series set up. 
Tools to Keras ultimately Tools: Anaconda, TensorFlow, Jupyter, Keras Good for true beginners Set up and activate a conda venv Start up a notebook and switch envs Use conda, rather than pip Brian #5: chartify “Python library that makes it easy for data scientists to create charts.” from the docs: Consistent input data format: Spend less time transforming data to get your charts to work. All plotting functions use a consistent tidy input data format. Smart default styles: Create pretty charts with very little customization required. Simple API: We've attempted to make the API as intuitive and easy to learn as possible. Flexibility: Chartify is built on top of Bokeh, so if you do need more control you can always fall back on Bokeh's API. Michael #6: CPython byte code explorer JupyterLab extension to inspect Python bytecode via Anton Helm by Jeremy Tuloup You’ll see exactly what it’s about if you watch the GIF movie at the GitHub repo. Can’t think of a better way to understand Python bytecode quickly than to play a little with this Comparing versions of CPython: If you have several versions of Python installed on your machine (let's say in different conda environments), you can use the extension to check how the bytecode might differ. Nice visualization of different performance aspects of while vs. for at the end
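To make the cache.py item (#3 above) a little more concrete, here is a minimal sketch of memoization that survives between runs. It is not cache.py's code: the `disk_cache` decorator, the pickle file layout, and the `slow_square` function are all assumptions made up for illustration, using only the standard library.

```python
import functools
import os
import pickle
import tempfile


def disk_cache(path):
    """Memoize a function and persist results to `path` between runs.

    Hypothetical helper for illustration; this is not cache.py's code.
    """
    def decorator(func):
        # Load results saved by an earlier run, if the file exists.
        if os.path.exists(path):
            with open(path, "rb") as fh:
                store = pickle.load(fh)
        else:
            store = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Arguments must be hashable and picklable to act as a key.
            key = (args, tuple(sorted(kwargs.items())))
            if key not in store:
                store[key] = func(*args, **kwargs)
                # Rewrite the whole file on every miss: simple, not fast.
                with open(path, "wb") as fh:
                    pickle.dump(store, fh)
            return store[key]

        return wrapper
    return decorator


calls = []  # records every real (uncached) invocation
cache_file = os.path.join(tempfile.mkdtemp(), "demo_cache.pkl")


@disk_cache(cache_file)
def slow_square(n):
    calls.append(n)
    return n * n


first = slow_square(12)   # computed, then written to disk
second = slow_square(12)  # served from the cache
```

Rewriting the whole pickle on each miss keeps the sketch short; a real library would likely add timeouts and safer file handling, as the cache.py options listed above suggest.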
December 11, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: pyjanitor - for cleaning data originally a port of an R package called janitor, now much more. “pyjanitor’s etymology has a two-fold relationship to “cleanliness”. Firstly, it’s about extending Pandas with convenient data cleaning routines. Secondly, it’s about providing a cleaner, method-chaining, verb-based API for common pandas routines.” functionality: Cleaning column names (multi-indexes are possible!) Removing empty rows and columns Identifying duplicate entries Encoding columns as categorical Splitting your data into features and targets (for machine learning) Adding, removing, and renaming columns Coalesce multiple columns into a single column Convert Excel date (serial format) into a Python datetime format Expand a single column that has delimited, categorical values into dummy-encoded variables This pandas code:
import pandas as pd
df = pd.DataFrame(...)  # create a pandas DataFrame somehow.
del df['column1']  # delete a column from the dataframe.
df = df.dropna(subset=['column2', 'column3'])  # drop rows that have empty values in column 2 and 3.
df = df.rename(columns={'column2': 'unicorns', 'column3': 'dragons'})  # rename column2 and column3
df['newcolumn'] = ['iterable', 'of', 'items']  # add a new column.
looks like this with pyjanitor:
import pandas as pd
import janitor  # registers the extra DataFrame methods used below
df = (
    pd.DataFrame(...)
    .remove_columns(['column1'])
    .dropna(subset=['column2', 'column3'])
    .rename_column('column2', 'unicorns')
    .rename_column('column3', 'dragons')
    .add_column('newcolumn', ['iterable', 'of', 'items'])
)
Michael #2: What Does It Take To Be An Expert At Python? Presentation at PyData 2017 by James Powell Covers Python Data Model (dunder methods) Covers uses of Metaclasses All done very smoothly as a series of demos Pretty long and in depth, 1.5+ hours Brian #3: Awesome Python Applications PyPI is a great place to find great packages you can use as examples for the packages you write. Where do you go for application examples?
Well, now you can go to Awesome Python Applications. categories of applications included: internet, audio, video, graphics, games, productivity, organization, communication, education, science, CMS, ERP (enterprise resource planning), static site generators, and a whole slew of developer related applications. Mahmoud is happy to have help filling this out, so if you know of a great open source application written in Python, go ahead and contribute to this, or open an issue on this project. Michael #4: Django Core no more Write up by James Bennett If you’re not the sort of person who closely follows the internals of Django’s development, you might not know there’s a draft proposal to drastically change the project’s governance. What’s up: Django the open-source project is OK right now, but difficulty in recruiting and retaining enough active contributors. Some of the biggest open-source projects dodge this by having, effectively, corporate sponsorship of contributions. Django has become sort of a victim of its own success: the types of easy bugfixes and small features that often are the path to growing new committers have mostly been done already in Django. Not managed to bring in new committers at a sufficient rate to replace those who’ve become less active or even entirely inactive, and that’s not sustainable for much longer. Under-attracting women contributors too Governance: Some parallels to what the Python core devs are experiencing now. Project leads BDFLs stepped down. The proposal: what I’ve proposed is the dissolution of “Django core”, and the revocation of almost all commit bits Seems extreme but they were working much more as a team with PRs, etc anyway. Breaks down the barrier to needing to be on the core team to suggest, change anything. Two roles would be formalized — Mergers and Releasers — who would, respectively, merge pull requests into Django, and package/publish releases. 
But rather than being all-powerful decision-makers, these would be bureaucratic roles Brian #5: wemake django template a cookie-cutter template for serious django projects with lots of fun goodies “This project is used to scaffold a django project structure. Just like django-admin.py startproject but better.” features: Always up-to-date with the help of @dependabot (https://dependabot.com/) poetry for managing dependencies mypy for optional static typing pytest for unit testing flake8 and wemake-python-styleguide for linting pre-commit hooks for consistent development docker for development, testing, and production sphinx for documentation Gitlab CI with full build, test, and deploy pipeline configured by default Caddy with https and http/2 turned on by default Michael #6: Django Hunter Tool designed to help identify incorrectly configured Django applications that are exposing sensitive information. Why? March 2018: 28,165 Django servers are exposed on the internet, many are showing secret API keys, database passwords, Amazon AWS keys. Example: https://twitter.com/6IX7ine/status/978598496658960384 Some complained this implied Django was insecure and said it wasn’t. Others thought “There is a reasonable argument to be made that DEBUG should default to False.” One beginner, Peter, chimes in: I probably have one of them, among my early projects that are on Heroku and public GitHub repos. I did accidentally expose my AWS password this way and all hell broke loose. The problem is that as a beginner, it wasn't obvious to me how to separate development and production settings and keep production stuff out of my public repository. Extras: Michael: Thanks for having me on your show Brian: https://blog.michaelckennedy.net/2018/12/08/being-a-great-podcast-guest/ Brian: open source extra: For Christmas, I want a dragon… pic.twitter.com/RmFAEgqpSr — Changelog (@changelog) Michael: Why did the multithreaded chicken cross the road?
road the side get to the other of to to get the side to road the of other the side of to the to road other get to of the road to side other the get
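Coming back to the Django Hunter item (#6 above) and Peter's question about keeping production settings out of a public repository: one common approach is to read them from environment variables. The sketch below is plain Python rather than a real Django settings module, and the variable names `DJANGO_DEBUG` and `DJANGO_SECRET_KEY`, the helper functions, and the placeholder values are all assumptions for illustration.

```python
import os


def env_bool(environ, name, default=False):
    """Read a boolean flag from a mapping of environment variables."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(environ=os.environ):
    """Build settings from the environment (hypothetical helper).

    DEBUG defaults to off, so a forgotten variable fails safe.
    """
    debug = env_bool(environ, "DJANGO_DEBUG", default=False)
    secret_key = environ.get("DJANGO_SECRET_KEY", "")
    if debug and not secret_key:
        # Throwaway value for local development only.
        secret_key = "dev-only-not-secret"
    if not debug and not secret_key:
        # Refuse to start in production without a real secret.
        raise RuntimeError("DJANGO_SECRET_KEY must be set when DEBUG is off")
    return {"DEBUG": debug, "SECRET_KEY": secret_key}


# Plain dicts stand in for the real environment in this demo.
production = load_settings({"DJANGO_SECRET_KEY": "value-from-the-host"})
development = load_settings({"DJANGO_DEBUG": "true"})
```

In a real project the returned values would be assigned to `DEBUG` and `SECRET_KEY` in `settings.py`, with the actual secrets configured on the hosting platform instead of committed to the repository.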
December 7, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: glom: restructuring data, the Python way glom is a new approach to working with data in Python, featuring: Path-based access for nested structure data['a']['b']['c'] → glom(data, 'a.b.c') Declarative data transformation using lightweight, Pythonic specifications glom(target, spec, **kwargs) with options such as a default value if value not found allowed exceptions Readable, meaningful error messages: PathAccessError: could not access 'c', part 2 of Path('a', 'b', 'c') is better than TypeError: 'NoneType' object is not subscriptable Built-in data exploration and debugging features glom.Inspect(*a, **kw) The Inspect specifier type (https://glom.readthedocs.io/en/latest/api.html#glom.Inspect) provides a way to get visibility into glom’s evaluation of a specification, enabling debugging of those tricky problems that may arise with unexpected data. Michael #2: Scientific GUI apps with TraitsUI via Franklin Ventura They support: PyQt, wxPython, PySide, PyQt5 People should be aware of it, and when combined with Chaco (again from Enthought) the graphing and controlling capabilities really are amazing. Tutorial: Writing a graphical application for scientific programming using TraitsUI 6.0 Really simple UI / API for mapping object(s) to GUIs and back.
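To illustrate the path-based access idea from the glom item (#1 above), here is a rough stdlib-only sketch. It is not glom and does not use glom's API: `get_path`, `PathError`, and the sample data are hypothetical names invented for this example, and it handles only nested dictionaries.

```python
class PathError(LookupError):
    """Raised when a dotted path cannot be followed (hypothetical)."""


_MISSING = object()  # sentinel so that None can be a legitimate default


def get_path(data, path, default=_MISSING):
    """Follow a dotted path such as 'a.b.c' through nested dicts."""
    parts = path.split(".")
    current = data
    for index, part in enumerate(parts):
        try:
            current = current[part]
        except (KeyError, TypeError):
            if default is not _MISSING:
                return default
            # Say which step failed instead of a bare TypeError.
            raise PathError(
                f"could not access {part!r}, part {index} of {parts!r}"
            ) from None
    return current


data = {"a": {"b": {"c": "d"}}}
value = get_path(data, "a.b.c")
fallback = get_path({"a": {"b": None}}, "a.b.c", default="n/a")
```

The point of the sketch is the error message: naming the failing step of the path is what makes the PathAccessError quoted above easier to debug than "'NoneType' object is not subscriptable".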
Brian #3: Pampy: The Pattern Matching for Python you always dreamed of “Pampy is pretty small (150 lines), reasonably fast, and often makes your code more readable and hence easier to reason about.” uses _ as the missing info in a pattern simple match signature of match(input, pattern, action) Examples nested lists and tuples from pampy import match, _ x = [1, [2, 3], 4] match(x, [1, [_, 3], _], lambda a, b: [1, [a, 3], b]) # => [1, [2, 3], 4] - dicts: pet = { 'type': 'dog', 'details': { 'age': 3 } } match(pet, { 'details': { 'age': _ } }, lambda age: age) # => 3 match(pet, { _ : { 'age': _ } }, lambda a, b: (a, b)) # => ('details', 3) Michael #4: Google AI better than doctors at detecting breast cancer Google’s deep learning AI called LYNA able to correctly identify tumorous regions in lymph nodes 99 per cent of the time. We think of the impact of AI as killing 'low end' jobs [see poster], but these are "doctor" level positions. The presence or absence of these ‘nodal metastases’ influence a patient’s prognosis and treatment plan, so accurate and fast detection is important. In a second trial, six pathologists completed a diagnostic test with and without LYNA’s assistance. With LYNA’s help, the doctors found it ‘easier’ to detect small metastases, and on average the task took half as long. Brian #5: 2018 Advent of Code Another winter break activity people might enjoy is practicing with code challenges. AoC is a fun tradition. a calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. don't need a computer science background to participate don’t need a fancy computer; every problem has a solution that completes in at most 15 seconds on ten-year-old hardware. There’s a leaderboard, so you can compete if you want. Or just have fun. Past years available, back to 2015. 
Some extra tools and info: awesome-advent-of-code Michael #6: Red Hat Linux 8.0 Beta released, now (finally) updated to use Python 3.6 as default instead of 2.7 First of all, my favorite comment was a correction to the title: legacy python “Python 3.6 is the default Python implementation in RHEL 8; limited support for Python 2.7 is provided. No version of Python is installed by default.“ Red Hat Enterprise Linux 8 is distributed with Python 3.6. The package is not installed by default. To install Python 3.6, use the yum install python3 command. Python 2.7 is available in the python2 package. However, Python 2 will have a shorter life cycle and its aim is to facilitate smoother transition to Python 3 for customers. Neither the default python package nor the unversioned /usr/bin/python executable is distributed with RHEL 8. Customers are advised to use python3 or python2 directly. Alternatively, administrators can configure the unversioned python command using the alternatives command. Python scripts must specify major version in hashbangs at RPM build time In RHEL 8, executable Python scripts are expected to use hashbangs (shebangs) specifying explicitly at least the major Python version. Extras: Michael: We were featured on TechMeme Long Ride Home podcast. Check out their podcast here. Thank you to Brian McCullough, the host of the show. I just learned about their show through this exchange but can easily see myself listening from time to time. It’s like Python Bytes, but for the wider tech world and less developer focused but still solid tech foundations. Brian: First story was about glom. I had heard of glom before, but got excited after interviewing Mahmoud for T&C 55, where we discussed the difficulty in testing if you use glom or DSLs in general. A twitter exchange and GH issue followed the episode, with Anthony Shaw. At one point, Ant shared this great joke from Brenan Keller: A QA engineer walks into a bar. Orders a beer. Orders 0 beers.
Orders 99999999999 beers. Orders a lizard. Orders -1 beers. Orders a ueicbksjdhd. First real customer walks in and asks where the bathroom is. The bar bursts into flames, killing everyone. — Brenan Keller (@brenankeller) November 30, 2018
December 1, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Dependency Management through a DevOps Lens Python Application Dependency Management in 2018 - Hynek An opinionated comparison of one use case and pipenv, poetry, pip-tools “We have more ways to manage dependencies in Python applications than ever. But how do they fare in production? Unfortunately this topic turned out to be quite polarizing and was at the center of a lot of heated debates. This is my attempt at an opinionated review through a DevOps lens.” Best disclaimer in a blog article ever: “DISCLAIMER: The following technical opinions are mine alone and if you use them as a weapon to attack people who try to improve the packaging situation you’re objectively a bad person. Please be nice.” Requirements: Solution needs to meet the following features: Allow me specify my immediate dependencies (e.g. Django), resolve the dependency tree and lock all of them with their versions and ideally hashes (more on hashes), integrate somehow with tox so I can run my tests, and finally allow me to install a project with all its locked dependencies into a virtual environment of my choosing. Seem like reasonable wishes. So far, none of the solutions work perfectly. A good example of pointing out tooling issues with his use case while being respectful of the people involved in creating other tools. 
Michael #2: Plugins made simple with pluginlib makes creating plugins for Python very simple it relies on metaclasses, but the average programmer can easily get lost dealing with metaclasses Main Features: Plugins are validated when they are loaded (instead of when they are used) Plugins can be loaded through different mechanisms (modules, filesystem paths, entry points) Multiple versions of the same plugin are supported (The newest one is used by default) Plugins can be blacklisted by type, name, or version Multiple plugin groups are supported so one program can use multiple sets of plugins that won't conflict Plugins support conditional loading (examples: os, version, installed software, etc) Once loaded, plugins can be accessed through dictionary or dot notation Brian #3: How to Test Your Django App with Selenium and pytest Bob Belderbos “In this article I will show you how to test a Django app with pytest and Selenium. We will test our CodeChalleng.es platform comparing the logged out homepage vs the logged in dashboard. We will navigate the DOM matching elements and more.” Michael #4: Fluent collection APIs (flupy and asq) flupy implements a fluent interface for chaining multiple method calls as a single python expression. All flupy methods return generators and are evaluated lazily in depth-first order. This allows flupy expressions to transform arbitrary size data in extremely limited memory. Example: pipeline = flu(count()).map(lambda x: x**2) \ .filter(lambda x: x % 517 == 0) \ .chunk(5) \ .take(3) for item in pipeline: print(item) The CLI in particular has been great for our data science team. Not everyone is super comfortable with linux-fu so having a cross-platform way to leverage python knowledge on the shell has been an easy win. 
Also if you are LINQ inclined: https://github.com/sixty-north/asq asq is simple implementation of a LINQ-inspired API for Python which operates over Python iterables, including a parallel version implemented in terms of the Python standard library multiprocessing module. # ASQ >>> from asq import query >>> words = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"] >>> query(words).order_by(len).then_by().take(5).select(str.upper).to_list() ['ONE', 'SIX', 'TEN', 'TWO', 'FIVE'] Brian #5: Guido blogging again What to do with your computer science career Answering “A question about whether to choose a 9-5 job or be an entrepreneur” entrepreneurship isn’t for everyone working for someone else can be very rewarding shoot for “better than an entry-level web development job” And “A question about whether AI would make human software developers redundant (not about what I think of the field of AI as a career choice)” AI is about automating tasks that can be boring Software Engineering is never boring. Michael #6: Web apps in pure Python apps with Anvil Design with our visual designer Build with nothing but Python Publish Instant hosting in the cloud or on-site Paid product but has a free version Covered on Talk Python 138 Extras: Second Printing (P2) of “Python Testing with pytest”
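For comparison with the flupy pipeline in item #4 above, the same lazy computation can be sketched with only the standard library. This is not flupy's implementation; the `chunked` helper is a hypothetical name written for this example, and the numbers mirror the flupy snippet (squares divisible by 517, chunks of 5, first 3 chunks).

```python
from itertools import count, islice


def chunked(iterable, size):
    """Yield lists of `size` items; the final list may be shorter."""
    iterator = iter(iterable)
    while True:
        block = list(islice(iterator, size))
        if not block:
            return
        yield block


squares = (x ** 2 for x in count())              # infinite and lazy
matching = (s for s in squares if s % 517 == 0)  # still lazy
pipeline = islice(chunked(matching, 5), 3)       # first three chunks only

# Nothing has been computed yet; list() pulls values through the chain.
result = list(pipeline)
for item in result:
    print(item)
```

Because every stage is a generator, only the values needed for the three chunks are ever produced, which is the same property that lets fluent libraries process large inputs in limited memory. The fluent style reads left to right, whereas the stdlib version reads inside out.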
November 23, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Colorizing and Restoring Old Images with Deep Learning Text interview by Charlie Harrington of Jason Antic, developer of DeOldify A whole bunch of machine learning buzzwords that I don’t understand in the slightest combine into a really cool tool that makes B&W photos look freaking amazing. “This is a deep learning based model. More specifically, what I've done is combined the following approaches: Self-Attention Generative Adversarial Network Training structure inspired by (but not the same as) Progressive Growing of GANs. Two Time-Scale Update Rule. Generator Loss is two parts: One is a basic Perceptual Loss (or Feature Loss) based on VGG16. The second is the loss score from the critic.” Michael #2: PlatformIO IDE for VSCode via Jason Pecor PlatformIO is an open source ecosystem for IoT development Cross-platform IDE and unified debugger. Remote unit testing and firmware updates Built on Visual Studio Code which has a nice extension for Python PlatformIO, combined with the features of VSCode, provides some great improvements for project development over the standard Arduino IDE for Arduino-compatible microcontroller based solutions. Some of these features are paid, but it’s a reasonable price With Python becoming more popular for microcontroller design, as well, this might be a very nice option for designers. And for Jason, specifically, it provides a single environment that can eventually be configured to handle doing the embedded code design, associated Python supporting tools mods, and HDL development. The PlatformIO Core is written in Python. Python 2.7 (hiss…) Jason’s test drive video from Tuesday: Test Driving PlatformIO IDE for VSCode Brian #3: Python Data Visualization 2018: Why So Many Libraries? Nice overview of visualization landscape, by Anaconda team Differentiating factors, API types, and emerging trends Related: Drawing Data with Flask and matplotlib Finally!
A really simple example app in Flask that shows how to both generate and display matplotlib plots. I was looking for something like this about a year ago and didn’t find it. Michael #4: coder.com - VS Code in the cloud Full Visual Studio Code, but in your browser Code in the browser Access up to 96 cores VS Code + extensions, so all the languages and features Collaborate in real time, think google docs Access linux from any OS Note: They sponsored an episode of Talk Python To Me, but this is not an ad here... Brian #5: By Welcoming Women, Python’s Founder Overcomes Closed Minds In Open Source Forbes’s article about Guido and the Python community actively working to get more women involved in core development as well as speaking at conferences. Good lessons for other projects, and work teams, about how you cannot just passively “let people join”, you need to work to make it happen. Michael #6: Machine Learning Basics From Anna-Lena Popkes Plain python implementations of basic machine learning algorithms Repository contains implementations of basic machine learning algorithms in plain Python (modern Python, yay!) All algorithms are implemented from scratch without using additional machine learning libraries. Goal is to provide a basic understanding of the algorithms and their underlying structure, not to provide the most efficient implementations. Most of the algorithms Linear Regression Logistic Regression Perceptron k-nearest-neighbor k-Means clustering Simple neural network with one hidden layer Multinomial Logistic Regression Decision tree for classification Decision tree for regression Anna-Lena was on Talk Python on 186: http://talkpython.fm/186 Extras: Michael: PSF Fellow Nominations are open Michael: Shiboken has no meaning Brian: Python 3.7 runtime now available in AWS Lambda
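In the spirit of the Machine Learning Basics item (#6 above), here is what a from-scratch algorithm in plain Python can look like. This sketch is not taken from Anna-Lena's repository: it uses the closed-form least-squares solution for a single feature, and the `fit_line` name and the toy data are assumptions made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1, invented for this example.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

As the notes say about the repository, the goal of code like this is understanding the algorithm, not efficiency; a library such as NumPy or scikit-learn would be the practical choice for real data.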
November 17, 2018
Python Bytes 104 Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Michael #0.1: Chapters and play at Chapters are now in the mp3 file Play at button on the website (doesn’t work on iOS unless you click the play to start it) Michael #0.2: Become a friend of the show https://pythonbytes.fm/friends-of-the-show Or just click “friends of the show” in the navbar Brian #1: wily: A Python application for tracking, reporting on timing and complexity in tests and applications. Anthony Shaw (aka “Friend of the Show”, aka “Ant”) (if listing 2 “aliases”, do you just put one “aka” or one per alias?) I should cover this on Test & Code for the content of the package. But it’s the actual packaging that I want to talk about today. Wily is a code base that can be used as an example of embracing pyproject.toml (pyproject.toml discussed on PB 100 and T&C 52) A real nice clean project using newer packaging tools that also has some frequently used bells and whistles NO setup.py file wily’s pyproject.toml includes flit packaging, metadata, scripts tox configuration black configuration project also has testing done on TravisCI rst based docs and readthedocs updates code coverage black pre-commit for wily pre-commit hook for your project to run wily CONTRIBUTING.md that includes code of conduct HISTORY.md with a nice format tests using pytest Michael #2: Latest VS Code has Jupyter support In this release, closed a total of 49 issues, including: Jupyter support: import notebooks and run code cells in a Python Interactive window Use new virtual environments without having to restart Visual Studio Code Code completions in the debug console window Improved completions in language server, including recognition of namedtuple, and generic types The extension now contains new editor-centric interactive programming capabilities built on top of Jupyter. Have Jupyter installed in your environment (e.g. set your environment to Anaconda) and type #%% into a Python file to define a Cell.
You will notice a “Run Cell” code lens will appear above the #%% line: Cells in the Jupyter Notebook will be converted to cells in a Python file by adding #%% lines. You can run the cells to view the notebook output in Visual Studio code, including plots Brian #3: API Evolution the Right Way A. Jesse Jiryu Davis adding features removing features adding parameters changing behavior Michael #4: PySimpleGUI now on Qt Project by Mike B Covered back on https://pythonbytes.fm/episodes/show/90/a-django-async-roadmap Simple declarative UI “builder” Looking to take your Python code from the world of command lines and into the convenience of a GUI? Have a Raspberry Pi with a touchscreen that's going to waste because you don't have the time to learn a GUI SDK? Look no further, you've found your GUI package. Now supports Qt Modern Python only More frameworks likely coming Brian #5: Comparison of the 7 governance PEPs Started by Victor Stinner The different PEPs are compared by: hierarchy number of people involved requirements for candidates to be considered for certain positions elections: who votes, and how term limits no confidence vote teams/experts PEP process core dev promotion and ejection how governance will be updated code of conduct PEP 8000, Python Language Governance Proposal Overview: PEP 8010 - The Technical Leader Governance Model continue status quo (ish) PEP 8011 - Python Governance Model Lead by Trio of Pythonistas like status quo but with 3 co-leaders PEP 8012 - The Community Governance Model no central authority PEP 8013 - The External Governance Model non-core oversight PEP 8014 - The Commons Governance Model core oversight PEP 8015 - Organization of the Python community push most decision-making to teams PEP 8016 - The Steering Council Model bootstrap iterating on governance Michael #6: Shiboken (from Qt for Python project) From PySide2 (AKA Qt for Python) project Generate Python bindings from arbitrary C/C++ code Has a Typesystem (based on XML) which 
allows modifying the obtained information to properly represent and manipulate the C++ classes into the Python World. Can remove and add methods to certain classes, and even modify the arguments of each function, which is really necessary when both C++ and Python collide and a decision needs to be made to properly handle the data structures or types. Qt for Python: under the hood Write your own Python bindings Other options include: CFFI (example dbader.org) Cython (example: via shamir.stav) Extras: Michael: Mission Python: Code a Space Adventure Game! book Michael: PyCon tickets are on sale Michael: PyCascade tickets are on sale
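As a small illustration of the `#%%` cell markers from the VS Code item (#2 above): the file below is ordinary Python, so it also runs unchanged from the command line, while an editor that understands the markers can run each cell separately. The cell titles and the data are made up for this sketch.

```python
#%% Cell 1: set up some data (values are made up for illustration)
readings = [3, 1, 4, 1, 5, 9, 2, 6]

#%% Cell 2: compute a summary; rerun only this cell after editing it
total = sum(readings)
average = total / len(readings)

#%% Cell 3: display the result
print(f"total={total}, average={average:.3f}")
```

Since the markers are just comments, the file stays friendly to version control and code review, which is one reason to prefer this layout over a notebook for some workflows.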
November 8, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: FEniCS “FEniCS is a popular open-source (LGPLv3) computing platform for solving partial differential equations (PDEs). FEniCS enables users to quickly translate scientific models into efficient finite element code. With the high-level Python and C++ interfaces to FEniCS, it is easy to get started, but FEniCS offers also powerful capabilities for more experienced programmers. FEniCS runs on a multitude of platforms ranging from laptops to high-performance clusters.” Solves partial differential equations efficiently with a combination of C++ and Python. Can be run on a desktop/laptop or deployed to a supercomputer with thousands of parallel processes. is a NumFOCUS fiscally supported project “makes the implementation of the mathematical formulation of a system of partial differential equations almost seamless.” - Sébastien Brisard “FEniCS is in fact a C++ project with a full-featured Python interface. The library itself generates C++ code on-the-fly, that can be called (on-the-fly) from python. It's almost magical... Under the hood, it used to use SWIG, and recently moved to pybind11. I guess the architecture that was set up to achieve this level of automation might be useful in other situations.” - Sébastien Brisard Michael #2: cursive_re via Christopher Patti, created by Bogdan Popa Readable regular expressions for Python 3.6 and up. It’s a tiny Python library made up of combinators that help you write regular expressions you can read and modify six months down the line. Best understood via an example: >>> hash = text('#') >>> hexdigit = any_of(in_range('0', '9') + in_range('a', 'f') + in_range('A', 'F')) >>> hexcolor = ( ... beginning_of_line() + hash + ... group(repeated(hexdigit, exactly=6) | repeated(hexdigit, exactly=3)) + ... end_of_line() ... 
) >>> str(hexcolor) '^\\#([a-f0-9]{6}|[a-f0-9]{3})$' Has automatic escaping for [ and \ etc: str(any_of(text("[]"))) → '[\\[\\]]' Easily testable / inspectable. Just call str on any expression. Brian #3: pyimagesearch Adrian Rosebrock is focused on teaching OpenCV with Python Just a really cool resource of integrating computer vision and Python. Both free and paid resources. He had one of the most successful tech learning kickstarters (ever?) on this topic: https://www.kickstarter.com/projects/adrianrosebrock/deep-learning-for-computer-vision-with-python-eboo Michael #4: Visualization of Python development up till 2012 via Ophion Group (on twitter) mercurial (hg) source code repository commit history August 1990 - June 2012 (cpython 3.3.0 alpha) Watch the first minute, then click ahead minute at a time and watch for a few seconds to get the full feel Really interesting to see a visual representation of the growth of an open source ecosystem Built with Gource: https://gource.io/ Amazing video of the history gource and its visualization of various projects: https://vimeo.com/15943704 Who wants to build this for 2012-present? Would make an amazing lightning talk! Brian #5: Getting to 10x (Results): What Any Developer Can Learn from the Best Forget the “10x” bit if that term is fighting words. - Brian’s advice How about just “ways to improve your effectiveness as a developer”? “… there is a clear path to excellence. People aren’t born great developers. 
They get there through focused, deliberate practice.” traits of great developers: problem solver skilled mentor/teacher excellent learner passionate traits to avoid: incompetent arrogant uncooperative unmotivated stubborn Focus on your strengths more than your weaknesses Pick 1 thing to improve on this week and focus on it relentlessly Michael #6: Chaos Toolkit Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system's capability to withstand turbulent conditions in production. Netflix uses the chaos monkey (et al.) on their systems. Covered on https://talkpython.fm/episodes/show/16/python-at-netflix The Chaos Toolkit aims to be the simplest and easiest way to explore building, and automating, your own Chaos Engineering Experiments. Integrates with Kubernetes, AWS, Google Cloud, Microsoft Azure, etc. To give you an idea, here are some things it can do to AWS: lambda: delete_function_concurrency Removes concurrency limit applied to the specified Lambda stop_instance Stop a single EC2 instance. You may provide an instance id explicitly or, if you only specify the AZ, a random instance will be selected. Extras: MK: Malicious Python Libraries Found & Removed From PyPI MK: Some really long type names Brian: Deep dive into pyproject.toml and the future of Python packaging with Brett Cannon, a follow-up from Python Bytes episode 100
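Returning to the cursive_re item (#2 above): whatever builds the pattern, the result is a normal regular expression string that the standard `re` module can use. The sketch below takes the pattern exactly as printed in the notes and wraps it in a hypothetical `is_hex_color` helper; it does not use cursive_re itself, and the sample inputs are invented.

```python
import re

# Pattern as printed in the notes above; as written it accepts
# lowercase hex digits only.
HEX_COLOR = re.compile(r"^\#([a-f0-9]{6}|[a-f0-9]{3})$")


def is_hex_color(text):
    """Return True when `text` looks like '#abc' or '#aabbcc'."""
    return HEX_COLOR.match(text) is not None


samples = ["#fff", "#1a2b3c", "#ffff", "fff", "#FFF"]
checks = {sample: is_hex_color(sample) for sample in samples}
for sample, ok in checks.items():
    print(f"{sample!r}: {ok}")
```

Small checks like this are what "easily testable" means in practice: the readable combinators produce a string, and the string can be exercised against known good and bad inputs.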
October 31, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: QuantEcon “Open source code for economic modeling” “QuantEcon is a NumFOCUS fiscally sponsored project dedicated to development and documentation of modern open source computational tools for economics, econometrics, and decision making.” Educational resource that includes: Lectures, workshops, and seminars Cheatsheets for scientific programming in Python and Julia Notebooks QuantEcon.py: open source Python code library for economics Michael #2: Structure of a Flask Project Flask is very flexible, it has no certain pattern of a project folder structure. Here are some suggestions. I always keep this one certain rule when writing modules and packages: “Don't backward import from root __init__.py.” Candidate structure: project/ __init__.py models/ __init__.py users.py posts.py ... routes/ __init__.py home.py account.py dashboard.py ... templates/ base.html post.html ... services/ __init__.py google.py mail.py Love it! To this, I would rename routes to views or controllers and add a viewmodels folder and viewmodels themselves. Brian, see anything missing? ya. tests. :) Another famous folder structure is app based structure, which means things are grouped by application I (Michael) STRONGLY recommend Flask blueprints Brian #3: Overusing lambda expressions in Python lambda expressions vs defined functions They can be immediately passed around (no variable needed) They can only have a single line of code within them They return automatically They can’t have a docstring and they don’t have a name They use a different and unfamiliar syntax misuses: naming them. Just write a function instead calling a single function with a single argument: just use that func instead overuse: if they get complex, even a little bit, they are hard to read has to be all on one line, which reduces readability map and filter: use comprehensions instead using custom lambdas instead of using operators from the operator module.
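The lambda misuses listed in item #3 above can be shown side by side with their replacements. This is a small sketch with invented toy data; the variable names are assumptions, and each comment shows the lambda form being replaced.

```python
from operator import itemgetter

# Toy data, invented for this example.
words = ["Flask", "asyncio", "pip"]
scores = [("ada", 3), ("bob", 1), ("cy", 2)]


# Instead of naming a lambda (square = lambda x: x ** 2), write a function.
def square(x):
    """Return x squared; a def gets a name and a docstring."""
    return x ** 2


# Instead of wrapping a single call: sorted(words, key=lambda w: len(w))
by_length = sorted(words, key=len)

# Instead of map(lambda ...) over filter(lambda ...), use a comprehension.
even_squares = [square(n) for n in range(10) if n % 2 == 0]

# Instead of key=lambda pair: pair[1], use the operator module.
by_score = sorted(scores, key=itemgetter(1))

print(by_length, even_squares, by_score)
```

None of this makes lambda wrong in general; a short, obvious lambda passed directly as an argument is still fine, which matches the article's point that the problem is overuse.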
Michael #4: Asyncio in Python 3.7 by Cris Medina The release of Python 3.7 introduced a number of changes into the async world. Some may affect you even if you don’t use asyncio. New Reserved Keywords: The async and await keywords are now reserved. There are already quite a few modules broken because of this. However, the fix is easy: rename any variables and parameters. Context Variables: Version 3.7 now allows the use of context variables within async tasks. If this is a new concept to you, it might be easier to picture it as global variables whose values are local to the currently running coroutines. Python has similar constructs for doing this very thing across threads. However, those were not sufficient in async-world. New asyncio.run() function: With a call to asyncio.run(), we can now automatically create a loop, run a task on it, and close it when complete. Simpler Task Management: Along the same lines, there’s a new asyncio.create_task() function that creates tasks inside the current loop, instead of having to get the loop first and call create_task() on it. Simpler Event Loop Management: The new asyncio.get_running_loop() returns the active event loop and raises a RuntimeError if there’s no loop running. Async Context Managers: Another quality-of-life improvement. We now have the asynccontextmanager() decorator for producing async context managers without the need for a class that implements __aenter__() and __aexit__(). Performance Improvements: Several functions are now optimized for speed, some were even reimplemented in C. Here’s the list: asyncio.get_event_loop() is now up to 15 times faster. asyncio.gather() is up to 15% faster. asyncio.sleep() is up to two times faster when the delay is zero or negative. asyncio.Future callback management is optimized. Reduced overhead for asyncio debug mode.
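The 3.7 additions listed above can be tried together in a few lines. This is a minimal sketch, assuming Python 3.7 or newer; the names request_id, tagged, worker and loop_is_missing are hypothetical and only serve the illustration.

```python
import asyncio
import contextvars
from contextlib import asynccontextmanager

# A context variable: looks global, but each task sees its own value.
request_id = contextvars.ContextVar("request_id", default="none")


@asynccontextmanager
async def tagged(value):
    """Async context manager written without __aenter__/__aexit__."""
    token = request_id.set(value)
    try:
        yield
    finally:
        request_id.reset(token)


async def worker(name, results):
    async with tagged(name):
        await asyncio.sleep(0)  # give the other task a chance to run
        results[name] = request_id.get()


async def main():
    loop = asyncio.get_running_loop()  # works because a loop is running here
    results = {}
    # create_task() schedules on the current loop; no loop object needed.
    tasks = [asyncio.create_task(worker(n, results)) for n in ("a", "b")]
    await asyncio.gather(*tasks)
    return results, loop.is_running()


def loop_is_missing():
    """Outside a coroutine, get_running_loop() raises RuntimeError."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return True
    return False


# asyncio.run() creates the loop, runs main() on it, and closes it afterwards.
results, was_running = asyncio.run(main())
```

Each task receives a copy of the context when it is created, so the two workers do not overwrite each other's request_id.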
Lots lots more Brian #5: Giving thanks with **pip thank** proposal: https://github.com/pypa/pip/issues/5970 Michael #6: Getting Started With Testing in Python by Anthony Shaw, 33 minutes reading time according to Instapaper Automated vs. Manual Testing Unit Tests vs. Integration Tests: A unit test is a smaller test, one that checks that a single component operates in the right way. A unit test helps you to isolate what is broken in your application and fix it faster. Compares unittest, nose or nose2, pytest Covers things like: Writing Your First Test Where to Write the Test How to Structure a Simple Test How to Write Assertions Dangers of Side Effects Testing in PyCharm and VS Code Testing for Web Frameworks Like Django and Flask Advanced Testing Scenarios Even: Testing for Security Flaws in Your Application Extras: MK: Hack ur name — aka Pivot me bro (done in Python: https://github.com/veekaybee/hustlr ) by Vicki Boykis MK: Python 3.7.1 and 3.6.7 Are Now Available MK: Click-Driven Development (CDD) - via @tombaker Use Python Click package to mock up suite of commands w/options/args. Decorated functions print description of intended results. Replace placeholders with code.
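To make the unit-test idea from Michael #6 concrete, here is a minimal sketch in both unittest and pytest style. The function under test (total) and the test names are hypothetical, not copied from the article.

```python
import unittest


def total(values):
    """Add up an iterable of numbers (hypothetical function under test)."""
    return sum(values)


class TestTotal(unittest.TestCase):
    """unittest style: a TestCase subclass with assert* methods."""

    def test_list_of_integers(self):
        self.assertEqual(total([1, 2, 3]), 6)

    def test_empty_input(self):
        self.assertEqual(total([]), 0)


def test_total_with_tuple():
    """pytest style: a plain function with a bare assert."""
    assert total((1, 2, 3)) == 6


def run_suite():
    """Run the unittest cases programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTotal)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test checks one small component in isolation, which is what lets a failing test point at the broken piece.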
October 24, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Asterisks in Python: what they are and how to use them I just ** love *s Using * and ** to pass arguments to a function * for list, ** for keyword arguments from a dictionary Using * and ** to capture arguments passed into a function Using * to accept keyword-only arguments Using * to capture items during tuple unpacking you can capture the rest if you only want to grab a few Using * to unpack iterables into a list/tuple Using ** to unpack dictionaries into other dictionaries Michael #2: responder web framework From Kenneth Reitz — A familiar HTTP Service Framework Already has 1,393 github stars Flask-like but with async support. A pleasant API, with a single import statement. Class-based views without inheritance. ASGI framework, the future of Python web services. WebSocket support! The ability to mount any ASGI / WSGI app at a subroute. f-string syntax route declaration. Mutable response object, passed into each view. No need to return anything. Background tasks, spawned off in a ThreadPoolExecutor. GraphQL (with GraphiQL) support! OpenAPI schema generation. Single-page webapp support Responder gives you the ability to mount another ASGI / WSGI app at a subroute uvicorn: powers responder and is built on top of uvloop asgi: https://www.encode.io/articles/hello-asgi/ Brian #3: Python Example resource: pythonprogramming.in Lots of examples Python basics including date time, strings, dictionaries pandas, matplotlib, tensorflow basics data structures and algorithms Nice reference, especially for people getting into Python for data science or scientific work.
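Going back to Brian #1 in this episode, the uses of * and ** listed there fit into one short sketch. The function and variable names (report, connect, settings) are invented for the example.

```python
def report(title, *values, sep=", ", **options):
    """*values captures extra positionals, **options captures extra keywords."""
    body = sep.join(str(v) for v in values)
    return f"{title}: {body} {sorted(options)}"


def connect(host, *, port):
    """A bare * makes every parameter after it keyword-only."""
    return f"{host}:{port}"


numbers = [1, 2, 3]
settings = {"sep": " | ", "verbose": True}

# * unpacks the list into positionals, ** unpacks the dict into keywords.
line = report("nums", *numbers, **settings)

# * in tuple unpacking captures "the rest".
first, *rest = numbers
head, *middle, tail = "abcde"

# * and ** inside literals merge iterables and dictionaries.
merged_list = [*numbers, *rest]
merged_dict = {**{"a": 1, "b": 2}, **{"b": 3}}
```

In the dictionary merge, later keys win, so "b" ends up as 3.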
Michael #4: This year’s Nobel Prize in economics was awarded to a Python convert Nordhaus and Romer “have designed methods that address some of our time’s most fundamental and pressing issues: long-term sustainable growth in the global economy and the welfare of the world’s population,” Notably for a 62-year-old economist of his distinction, Romer is a user of the programming language Python. Romer believes in making research transparent. He argues that openness and clarity about methodology are important for scientific research to gain trust. He tried to use Mathematica to share one of his studies in a way that anyone could explore every detail of his data and methods. It didn’t work. He says that Mathematica’s owner, Wolfram Research, made it too difficult to share his work in a way that didn’t require other people to use the proprietary software, too. Romer believes that open-source notebooks are the way forward for sharing research. He believes they support integrity, while proprietary software encourages secrecy. “The more I learn about proprietary software, the more I worry that objective truth might perish from the earth,” he wrote. Michael covered a similar story for the Nobel Prize in Physics at CERN on Talk Python Jake Vanderplas Keynote at PyCon 2017: “The unexpected effectiveness of Python in Science” Brian #5: More in depth TensorFlow Michael #6: MAKERphone - an educational DIY mobile phone MAKERphone is an educational DIY mobile phone designed to bring electronics and programming to the crowd in a fun and interesting way.
A fully functional mobile phone that you can code yourself Games such as space invaders, pong, or snake Apps such as a custom media player that only plays cat videos Programs in Arduino Lines of code in Python Your first working piece of code in Scratch A custom case Extras: MK: Around 62% of all Internet sites will run an unsupported PHP version in 10 weeks The highly popular PHP 5.x branch will stop receiving security updates at the end of the year.
October 19, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Special guests: Anthony Shaw Dan Bader Brett Cannon Nina Zakharenko Brian #1: poetry “poetry is a tool to handle dependency installation as well as building and packaging of Python packages. It only needs one file to do all of that: the new, standardized pyproject.toml. In other words, poetry uses pyproject.toml to replace setup.py, requirements.txt, setup.cfg, MANIFEST.in and the newly added Pipfile.” poetry can be used for both application and library development handles dependencies and lock files strongly encourages virtual environment use (need specifically turn it off) can be used within an existing venv or be used to create a new venv automates package build process automates deployment to PyPI or to another repository CLI and the use model is very different than pipenv. Even if they produced the same files (which they don’t), you’d still want to try both to see which workflow works best for you. For me, I think poetry matches my way of working a bit more than pipenv, but I’m still in the early stages of using either. From Python's New Package Landscape “PEP 517 and PEP 518—accepted in September 2017 and May 2016, respectively—changed this status quo by enabling package authors to select different build systems. Said differently, for the first time in Python, developers may opt to use a distribution build tool other than **distutils** or **setuptools**. The ubiquitous **setup.py** file is no longer mandatory in Python libraries.” PEP 517 -- A build-system independent format for source trees PEP 518 -- Specifying Minimum Build System Requirements for Python Projects Another project that utilizes pyproject.toml is flit, which seems to overlap quite a bit with poetry, but I don’t think it does the venv, dependency management, dependency updating, etc. See also: Clarifying PEP 518 (a.k.a. pyproject.toml) - From Brett Question for @Brett C 517 and 518 still say “provisional” and not “final”. What’s that mean? 
We are still allowed to tweak it as necessary before it Biggest difference is poetry uses pyproject.toml (PEP518) instead of Pipfile. Replaces all others (setup.py, setup.cfg, requirements*.txt, MANIFEST.in) Even its lock file is in TOML Author “does not like” pipenv, or some of the decisions it has made. Note that Kenneth has recently made some calls to introduce more discussion and openness with a PEP-style process called PEEP (PipEnv Enhancement Proposals). E.g. uses a more extensive dependency resolver Pipenv does not support multiple environments (by design) making it useless for library development. Poetry makes this more open. See https://medium.com/@DJetelina/pipenv-review-after-using-in-production-a05e7176f3f0 Wait. Why am I doing your notes for you @Brian O ! (awesome. Thanks Ant.) Brett has had initial discussions on Twitter with both pipenv and poetry about possibly standardizing on a lockfile format so that’s the artifact these tools produce and everything else is tool preference Anthony Shaw #2: pylama and radon Have been investigating tools for measuring complexity and performance of code and how that relates to test If you can refactor your code so the tests still pass, the customers are still happy AND it’s simpler then that’s a good thing - right? Radon is a Python tool that leverages the AST to give statistics on Cyclomatic Complexity (number of decisions — nested if’s are bad), maintainability index (LoC & Halstead) and Halstead (number of operations and complexity of AST). Radon works by adding a ComplexityVisitor to the AST. Another option is Ned Batchelder’s McCabe tool which measures the number of possible branches (similar to cyclomatic) All of these tools are combined in pylama - a code linter for Python and Javascript. Embeds pycodestyle, mccabe, radon, gjslint and pyflakes.
Final goal is to have a pytest plugin that fails tests if you make your code more complicated Nina Zakharenko #3: Tools for teaching Python Teaching Python can come with hurdles — virtual environments, installing python3, pip, working with the command line. Put out a call on twitter asking - “What software and tools do you use to teach Python?”. 50 Responses, 414 votes, learned about lots of new tools. Read the thread. 27% use python or ipython repl 13% use built-in IDLE 39% use an IDE or editor - Visual Studio Code, PyCharm, Atom. 21% use other (mix of local and hosted Jupyter notebooks and other responses) New tools I learned about: Mu editor - simple python editor, great for those completely new to programming. Large buttons with common actions above the editor. Support for educational platforms Integrates with hardware platforms -- adafruit Circuit Playground, micro:bit PyGame Awesome tutorials Neuron plugin for VS Code, Hydrogen plugin for Atom Interactive coding environment, brings a taste of Jupyter notebooks into your editor. Targeted towards data scientists. Show evaluated values, output pane to display charts and graphs Import to/from Jupyter notebooks repl.it - open source hosted cloud repl with reasonable free tier project goal - zero effort setup 3 vertical panes: files, editor, repl, and a button to run the current code. no login, no signup needed to get started visual package installation - no running pip, requirements.txt automatically generated includes a debugger bpython - Used it years ago, still an active project. Fancy curses interface to the Python interactive interpreter. Windows, type hints, expected parameters lists. Really cool feature — you can rewind your session! Pops the last line, and the entire session is reevaluated. Easily reload imported modules. Honorable mentions: Edublocks - Teaching tool for kids, visually drag and drop blocks of Python code. Open source, created by Joshua Lowe, a brilliant 14 year old maker and programmer. 
pythonanywhere, codeskulptor.org, codesters. Dan Bader #4: My favorite tool of 2018: “Black” code formatter by Łukasz Langa Black is the “uncompromising Python code formatter” An opinionated auto-formatter for your code (like YAPF/autopep8 for Python, or gofmt for golang, which popularized the idea) Heard about it in episode #73 by Brian Started using it for some small tools, then rolled it out to the whole realpython.com code base including our public example code repo (https://github.com/realpython/materials) Benefits are: Auto formatting—Not only does it call you out on formatting violations, it auto-fixes them Code style discussions disappear—just use whatever Black does Super easy to make several code bases look consistent (no more mental gymnastics to format new code to match its surroundings) Automatically enforce consistent formatting on CI with “black --check” (I use a combo of flake8 + black because flake8 also catches syntax errors and some other “code smells”) pro-tip: set up a pre-commit hook/rule to automatically run black before committing to Git. Also add it to your editor workflow (reformat on save / reformat on paste) Tool support: Built into the Python extension for VS Code (which Łukasz uses 😉) Plug-in for PyCharm (for Michael and Brian 😁 ) Support in pre-commit For the most part I really like the formatting Black applies; if you’re not a fan you might hate this tool because it makes your code look “ugly” 🙂 Still in beta but found it very useful and helpful as of October 2018. Give it a try! Brett Cannon #5: A Web without JavaScript: Russell Keith-Magee at PyCon AU JavaScript has a monopoly in web browsers for client-side programming Mono-language situations are not good for anyone Can Python somehow break into the client-side web world?
Example implementation of Luhn algorithm: JavaScript: 0.4KB Transcrypt: transpile to 32KB Brython: Python compiler for 0.5KB + 646KB bootstrap Batavia: Eval loop for 1.2KB + 5MB bootstrap Pyodide: CPython compiled to WASM for 0.5KB + 3MB bootstrap WASM as a Python target might make this feasible Example written in C compiled to 22KB (w/ a 65KB bootstrap for older browsers) Maybe easier to target Electron/Node instead of client-side web initially? Scott Hanselman’s blog post https://www.hanselman.com/blog/JavaScriptIsWebAssemblyLanguageAndThatsOK.aspx Hanselminutes interview https://hanselminutes.com/638/c-and-browser-monoculture-with-vivaldis-patricia-aas Michael #6: Async WebDriver implementation for asyncio and asyncio-compatible frameworks You’ve heard of Selenium but in an async world what do we use? Answer: arsenic # Example: Let's run a local Firefox instance. async def example(): # Runs geckodriver and starts a firefox session async with get_session(Geckodriver(), Firefox()) as session: # go to example.com await session.get('http://example.com') # wait up to 5 seconds to get the h1 element from the page h1 = await session.wait_for_element(5, 'h1') # print the text of the h1 element print(await h1.get_text()) Use cases include testing of web applications, load testing, automating websites, web scraping or anything else you need a web browser for. It uses real web browsers using the Webdriver specification. Warning: While this library is asynchronous, web drivers are not. You must call the APIs in sequence. The purpose of this library is to allow you to control multiple web drivers asynchronously or to use a web driver in the same thread as an asynchronous web server. Arsenic with pytest Supported browsers Headless Google Chrome Headless Firefox Everyone’s thoughts on async in Python these days? 
Selenium-Grid https://www.seleniumhq.org/docs/07_selenium_grid.jsp Extra: Take the python survey: https://talkpython.fm/survey2018 3.7.1rc1 is out https://docs.python.org/3.7/whatsnew/changelog.html#python-3-7-1-release-candidate-1 A good review on Python packaging http://andrewsforge.com/article/python-new-package-landscape/ New September release of Python Extension for Visual Studio Code — lots of new features, like automatic environment activation in the terminal, debugging improvements, and more! Submit a talk to PyCascades happening February 2019 in Seattle. Call for proposals closes October 21st. Mentorship available.
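For readers unfamiliar with the benchmark in Brett's item above, this is a minimal Python sketch of the Luhn checksum. It is not the code from the talk; the function name luhn_is_valid is an assumption made for this example.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Non-digit characters such as spaces are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

A function this small is a handy yardstick because the source is tiny, so nearly all of the download size comes from each toolchain's runtime.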
October 16, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Forbes cyber article: Cyber Saturday—Doubts Swirl Around Bloomberg's China Chip Hack Report Brian #1: parse “parse() is the opposite of format()” regex not required for parsing strings. Exports parse(), search(), findall(), and with_pattern() >>> parse("It's {}, I love it!", "It's spam, I love it!") <Result ('spam',) {}> >>> search('Age: {:d}\n', 'Name: Rufus\nAge: 42\nColor: red\n') <Result (42,) {}> >>> ''.join(r.fixed[0] for r in findall(">{}<", "<p>the <b>bold</b> text</p>")) 'the bold text' Can also compile for repeated use. Michael #2: fman Build System FBS lets you create GUI apps for Windows, Mac and Linux via Michael Herrmann Build Python GUIs, with Qt – in minutes Write a desktop application with PyQt or Qt for Python. Use fbs to package and deploy it on Windows, Mac and Linux. Avoid months of painful work with the proven solutions provided by fbs. Easy Packaging: Unlike other solutions, fbs makes packaging easy. Create installers for your app in seconds and distribute them to your users – on Windows, Mac and Linux! Open Source: fbs's source code is available on GitHub. You can use it for free in open source projects licensed under the GPL. Commercial licenses are also offered. Free under the GPL. If that's too restrictive, a commercial license is 250 Euros once. PyQt's licensing is similar (GPL/Commercial). A license for it is € 450 (source). Came from fman, a dual-pane file manager for Mac, Windows and Linux Brian #3: fastjsonschema Validate JSON against a schema, quickly. Michael #4: IPython 7.0, Async REPL via Nick Spirit Article by Matthias Bussonnier We are pleased to announce the release of IPython 7.0, the powerful Python interactive shell that goes above and beyond the default Python REPL with advanced tab completion, syntactic coloration, and more.
Not having to support Python 2 allowed us to make full use of new Python 3 features and bring never before seen capability in a Python Console, see the Python 3 Statement. One of the core features we focused on for this release is the ability to (ab)use the async and await syntax available in Python 3.5+. TL;DR: You can now use async/await at the top level in the IPython terminal and in the notebook, it should — in most of the cases — “just work”. The only thing you need to remember is: If it is an async function you need to await it. Brian #5: molten Michael #6: A Python love letter Dear Python, where have you been all my life? (reddit thread) I am NOT a developer. But, I've tinkered with programming (in BASIC, Visual Basic, Perl, now Python) when needed over the years I decided that I needed to script something, and hoped that learning how to do it in Python was going to take me significantly less time than doing it manually - with the benefit of future timesavings. No, I didn't go from 0 to production in a day. But if my coworkers will leave me alone, I might be in production by the end of the day tomorrow. What I'm working on today isn't super complex — But putting together what I've done so far has just been a complete joy. Overall it feels natural, intuitive, and relatively easy to understand and write the code for the basic things I'm doing - I haven't had this much fun doing stuff with code since the days fooling around with BASIC in my teens. Feedback / comments Welcome to the club. I came up on c++; my job highly trained me in C and assembly but every project I touch I think, wait, "we can do 95% this in python". And we do. I used to have a chip on my shoulder. I wanted to do things the hard way to truly understand them. I went with C++. … I learned that doing things the smart way was better than doing things the hard way and didn't interfere with learning. I felt the exact same way I finally decided to learn it. It's like a breath of fresh air. 
Sadly there are few things in my life that made me feel like this, Python and Bitcoin both give me the same levels of enjoyment. … I've used Java, Groovy, Scala, Objective-C, C, C++, C#, Perl and Javascript in a professional capacity over the years and nothing feels as natural to me as Python does. The developers truly deserve any donations they get for making it. … Hell my next two planned tattoos are bitcoin and python logos on my wrists. I taught myself Python a little over 3 years ago and I quickly went from not being programmer to being a programmer. … However the real popularity of Python comes from the depth and quality of 3rd party libraries and how easy they are to install. Extra: Brian: Power Mode II
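Going back to Brian #1 in this episode (parse): for comparison, here is a sketch of the regular expressions one would otherwise write by hand for the three examples in the notes. It uses only the standard-library re module; the parse calls appear in comments only, and the variable names are illustrative.

```python
import re

# parse("It's {}, I love it!", "It's spam, I love it!")
loved = re.fullmatch(r"It's (.+?), I love it!", "It's spam, I love it!")
loved_thing = loved.group(1)

# search('Age: {:d}\n', 'Name: Rufus\nAge: 42\nColor: red\n')
# The {:d} format spec also converts to int; with re that is a manual step.
age_match = re.search(r"Age: (\d+)\n", "Name: Rufus\nAge: 42\nColor: red\n")
age = int(age_match.group(1))

# findall(">{}<", "<p>the <b>bold</b> text</p>")
# Collect every run of text that sits between a '>' and the next '<'.
pieces = re.findall(r">([^<>]*)<", "<p>the <b>bold</b> text</p>")
plain_text = "".join(pieces)
```

The format-style patterns say the same thing with less punctuation, which is the library's pitch.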
October 8, 2018
Sponsored by DigitalOcean: pythonbytes.fm/digitalocean Brian #1: Making Etch-a-Sketch Art With Python Really nice write up of methodically solving problems with simplifying the problem space, figuring out what parts need solved, grabbing off the shelf bits that can help, and putting it all together. Plus it would be a fun weekend (or several) project with kids helping. Controlling the Etch-a-Sketch Raspberry Pi, motors, cables, wood fixture Software to control the motors Picture simplification with edge detection with Canny edge detection. Lines to motor control with path finding with networkx library. Example results included in article. Pentium song: https://www.youtube.com/watch?v=qpMvS1Q1sos Michael #2: Dropbox moves to Python 3 They just rolled out one of the largest Python 3 migrations ever Dropbox is one of the most popular desktop applications in the world Much of the application is written using Python. In fact, Drew’s very first lines of code for Dropbox were written in Python for Windows using venerable libraries such as pywin32. Though we’ve relied on Python 2 for many years (most recently, we used Python 2.7), we began moving to Python 3 back in 2015. If you’re using Dropbox today, the application is powered by a Dropbox-customized variant of Python 3.5. Why Python 3? Exciting new features: Type annotations and async & await Aging toolchains: As Python 2 has aged, the set of toolchains initially compatible for deploying it has largely become obsolete Embedding Python To solve build and deploy problem, we decided on a new architecture to embed the Python runtime in our native application. Deep integration with the OS (e.g. smart sync) means native apps are required In future posts, we’ll look at: How we report crashes on Windows and macOS and use them to debug both native and Python code. How we maintained a hybrid Python 2 and 3 syntax, and what tools helped. Our very best bugs and stories from the Python 3 migration. 
Brian #3: Resources for PyCon that relate to really any talk venue Speaking page Talk proposal tips and resources And the poster session. Way cooler than I originally understood. Mariatta recently published her set of proposals Nice clean examples that don’t look overwhelming There’s also some links to examples at the talk proposal page. Related, on attending PyCon (or other technical conferences): You don't need to be a Pro @ Python to crack the code of Pycon missing: hang out and talk with, ask questions, and possibly help out with communities as part of the Expo. Michael #4: Electron as GUI of Python Applications via Andy Bulka Electron Python is a template of code where you use Electron (nodejs + chromium) as a GUI talking to Python 3 as a backend via zerorpc. Similar to Eel but much more capable e.g. you get proper native operating system menus — and users don’t need to have Chrome already installed. Needs to run zerorpc server and then start electron separately — can be done via the node backend using Electron as a GUI toolkit gets you native menus, notifications installers, automatic updates to your app debugging and profiling that you are used to, using the Chrome debugger ES6 syntax (a cleaner Javascript with classes, module imports, no need for semicolons etc.). Squint, look sideways, and it kinda looks like Python… ;-) the full power of nodejs and its huge npm package repository the large community and ecosystem of Electron How to package this all? Building a deployable Python-Electron App post by Andy Bulka One of the great things about using Electron as a GUI for Python is that you get to use cutting edge web technologies and you don’t have to learn some old, barely maintained GUI toolkit How much momentum, money, time and how many developer minds are focused on advancing web technologies? Answer: it’s staggeringly huge. Compare this with the number of people maintaining old toolkits from the 90’s e.g. wxPython? 
Answer: perhaps one or two people in their spare time. Which would you rather use? Final quote: And someone please wrap Electron-Python into an IDE so that in the future all we have to do is click a ‘build’ button — like we could 20 years ago. :-) Brian #5: pluggy: A minimalist production ready plugin system docs plugin management and hook system used by pytest A separate package to allow other projects to include plugin capabilities without exposing unnecessary state or behavior of the host project. Michael #6: How China Used a Tiny Chip to Infiltrate U.S. Companies via Eduardo Orochena The attack by Chinese spies reached almost 30 U.S. companies, including Amazon and Apple, by compromising America’s technology supply chain, according to extensive interviews with government and corporate sources. In 2015, Amazon.com Inc. began quietly evaluating a startup called Elemental Technologies, a potential acquisition to help with a major expansion of its streaming video service, known today as Amazon Prime Video. (from Portland!) To help with due diligence, AWS, which was overseeing the prospective acquisition, hired a third-party company to scrutinize Elemental’s security servers were assembled for Elemental by Super Micro Computer Inc., a San Jose-based company (commonly known as Supermicro) that’s also one of the world’s biggest suppliers of server motherboards Nested on the servers’ motherboards, the testers found a tiny microchip, not much bigger than a grain of rice, that wasn’t part of the boards’ original design. Amazon reported the discovery to U.S. authorities, sending a shudder through the intelligence community. Elemental’s servers could be found in Department of Defense data centers, the CIA’s drone operations, and the onboard networks of Navy warships. And Elemental was just one of hundreds of Supermicro customers. 
During the ensuing top-secret probe, which remains open more than three years later, investigators determined that the chips allowed the attackers to create a stealth doorway into any network that included the altered machines. Multiple people familiar with the matter say investigators found that the chips had been inserted at factories run by manufacturing subcontractors in China. One government official says China’s goal was long-term access to high-value corporate secrets and sensitive government networks. No consumer data is known to have been stolen. American investigators eventually figured out who else had been hit. Since the implanted chips were designed to ping anonymous computers on the internet for further instructions, operatives could hack those computers to identify others who’d been affected. Extra: Michael's Async course talkpython.fm/async
September 28, 2018
Sponsored by DataDog -- pythonbytes.fm/datadog Brian #1: Making a PyPI-friendly README twine now checks for rendering problems with README Install the latest version of twine; version 1.12.0 or higher is required: pip install --upgrade twine Build the sdist and wheel for your project as described under Packaging your project. Run twine check on the sdist and wheel: twine check dist/* This command will report any problems rendering your README. If your markup renders fine, the command will output Checking distribution FILENAME: Passed. Michael #2: Java goes paid Oracle's new Java SE subs: Code and support for $25/processor/month Prepare for audit after inevitable change, says Oracle licensing consultant There’s also a little bit of stick to go with the carrot, because come January 2019 Java SE 8 on the desktop won’t be updated any more … unless you buy a sub. The short version is that every commercial enterprise needs to look at their Java SE (Standard Edition) usage to see if they need to do something with licensing. Brian #3: Absolute vs Relative Imports in Python Review of how imports are used, along with subpackages and from ex: from package.sub import func Relative: what does this mean: from .some_module import some_class from ..some_package import some_function from . import some_class Michael #4: pyxel - A retro game engine for Python Thanks to its simple specifications inspired by retro gaming consoles, such as only 16 colors can be displayed and only 4 sounds can be played back at the same time, you can feel free to enjoy making pixel art style games. Run on Windows, Mac, and Linux Code writing with Python3 After installing Pyxel, the examples of Pyxel will be copied to the current directory with the following command: install_pyxel_examples Brian #5: Click 7.0 Released Changelog Drop support for Python 2.6 and 3.3. Add native ZSH autocompletion support. 
Usage errors now hint at the --help option Really long list of changes since the last release at the beginning of 2017 Michael #6: How we spent 30k USD in Firebase in less than 72 hours the largest crowdfunding campaign in Colombia, collecting 3 times more than the previous record so far in only two days! Run on the Vaki platform -- subject of this article We had reached more than 2 million sessions, more than 20 million pages visited and received more than 15 thousand supports. That works out to about a thousand users active on the site on average and more than 20 supports collected per minute. Site was running slow, tried things like upgrading the frontend frameworks Logged into Firebase: had spent $30,356.56 USD in just 72 hours! Going at $600/hr All came down to a very bad implementation of this.loadPayments(). Comments are interesting It could happen to any of us, it happened to me this month. Extras: Dropbox has upgraded from Python 2 → 3! Michael’s async course is live: Async Techniques and Examples in Python 2019 PyCon CFPs open PyCascades CFP is open until mid-Oct
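The absolute and relative import forms from Brian #3 in this episode can be compared with a throwaway package. This sketch builds one in a temporary directory; the package and module names (demo_pkg, helpers, sub) are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a tiny package: demo_pkg/helpers.py plus a demo_pkg/sub subpackage."""
    pkg = root / "demo_pkg"
    sub = pkg / "sub"
    sub.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "helpers.py").write_text("def greet():\n    return 'hello'\n")
    (sub / "__init__.py").write_text("")
    # Absolute import: spelled out from the top-level package.
    (sub / "absolute.py").write_text("from demo_pkg.helpers import greet\n")
    # Relative import: two dots means the parent package of sub.
    (sub / "relative.py").write_text("from ..helpers import greet\n")
    # Relative import: one dot means the current package (sub).
    (sub / "sibling.py").write_text("from .relative import greet\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        absolute = importlib.import_module("demo_pkg.sub.absolute")
        relative = importlib.import_module("demo_pkg.sub.relative")
        sibling = importlib.import_module("demo_pkg.sub.sibling")
    finally:
        sys.path.remove(tmp)

# All three spellings resolve to the same function object.
same_function = absolute.greet is relative.greet is sibling.greet
```

Relative imports only work inside a package, which is why the demo imports the modules by their full dotted names rather than running the files directly.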
September 22, 2018
Sponsored by DigitalOcean -- pythonbytes.fm/digitalocean Brian #1: Plumbum: Shell Combinators and More Toolbox of goodies to do shell-like things from Python. “The motto of the library is “Never write shell scripts again”, and thus it attempts to mimic the shell syntax (shell combinators) where it makes sense, while keeping it all Pythonic and cross-platform.” Example: >>> from plumbum.cmd import ls, grep, wc, cat, head >>> chain = ls["-a"] | grep["-v", "\\.py"] | wc["-l"] >>> print chain /bin/ls -a | /bin/grep -v '\.py' | /usr/bin/wc -l >>> chain() u'13\n' >>> ((cat < "setup.py") | head["-n", 4])() u'#!/usr/bin/env python\nimport os\n\ntry:\n' >>> (ls["-a"] > "file.list")() u'' >>> (cat["file.list"] | wc["-l"])() u'17\n' Michael #2: Windows 10 Linux subsystem for Python developers via Marcus Sherman “One of the hardest days in teaching introduction to bioinformatics material is the first day: Setting up your machine.” While I have seen a very large bias towards Macs in academia, there are plenty of people that keep their Windows machines as a badge of pride... Marcus included. Even though Anaconda is cross platform and helpful, how does this work on Windows? python3 -m venv .env and source .env/bin/activate? Spoiler alert: Not well. Step by step getting Ubuntu on Windows Shows how to set up an X server Brian #3: Type hints cheat sheet (Python 3) Do you remember how to type hint duck types? Something accessed like an array (list or tuple or …) and holds strings → Sequence[str] Something that works like a dictionary mapping integers to strings → Mapping[int, str] As I’m adding more and more typing to interface functions, I keep this cheat sheet bookmarked. Michael #4: Python driving new languages Here are five predictions for what programming will look like 10 years from now.
Programming will be more abstract Trends like serverless technologies, containers, and low code platforms suggest that many developers may work at higher levels of abstraction in the future AI will become part of every developer's toolkit—but won't replace them A universal programming language will arise To reap the benefits of emerging technologies like AI, programming has to be easy to learn and easy to build upon "Python may be remembered as being the great-great-great grandmother of languages of the future, which underneath the hood may look like the English language, but are far easier to use," Every developer will need to work with data Programming will be a core tenet of the education system Brian #5: asyncio documentation rewritten from scratch twitter thread by Yury Selivanov‏ “Big news! asyncio documentation has been rewritten from scratch! Read the new version here: https://docs.python.org/3/library/asyncio.html …. Huge thanks to @WillingCarol, @elprans, and @andrew_svetlov for support, ideas, and reviews!’ “BTW, this is just the beginning. We'll continue to refine and update the documentation. Next up is adding two tutorials: one teaching high-level concepts and APIs, and another teaching how to use protocols and transports. A section about asyncio architecture is also planned.” “And this is just the beginning not only for asyncio documentation, but for asyncio itself. Just for Python 3.8 we plan to add: new streaming API TaskGroups and cancel scopes Supervisors and tracing API new SSL implementation many usability improvements” Michael #6: The 2018 Python Language Summit Here are the sessions: Subinterpreter support for Python: a way to have a better story for multicore scalability using an existing feature of the language. Subinterpreters will allow multiple Python interpreters per process and there is the potential for zero-copy data sharing between them. 
But subinterpreters share the GIL, so that needs to be changed in order to make it multicore friendly. Modifying the Python object model: looking at changes to CPython data structures to increase the performance of the interpreter. - via Instagram and Carl Shapiro - By modifying the Python object model fairly substantially, they were able to roughly double the performance - A little controversial - Shapiro's overall point was that he felt Python sacrificed its performance for flexibility and generality, but the dynamic features are typically not used heavily in performance-sensitive production workloads. A Gilectomy update: a status report on the effort to remove the GIL from CPython. Larry Hastings updated attendees on the status of his Gilectomy project. Since his status report at last year's summit, little has happened, which is part of why the session was so short. He hasn't given up on the overall idea, but it needs a new approach. Using GitHub Issues for Python: a discussion on moving from bugs.python.org to GitHub Issues. Mariatta Wijaya described her reasoning for advocating moving Python away from its current bug tracker to GitHub Issues. it would complete Python's journey to GitHub that started a ways back. Shortening the Python release schedule: a discussion on possibly changing from an 18-month to a yearly cadence. The Python release cycle has an 18-month cadence; a new major release (e.g. Python 3.7) is made roughly on that schedule. But Łukasz Langa, who is the release manager for Python 3.8 and 3.9, would like to see things move more quickly—perhaps on a yearly cadence. Unplugging old batteries: should some older, unloved modules be removed from the standard library? Python is famous for being a "batteries included" language—its standard library provides a versatile set of modules with the language There may be times when some of those batteries have reached their end of life. 
Christian Heimes wanted to suggest a few batteries that may have outlived their usefulness and to discuss how the process of retiring standard library modules should work. Linux distributions and Python 2: with the end of life for Python 2 coming, a look at what distributions are doing to prepare. The goal is to figure out how to help the Python downstreams so that Python 2 can be fully discontinued. Python static typing update: a look at where static typing is now and where it is headed for Python 3.7. The session started off with stub files, which contain type information for libraries and other modules. Right now, static typing is only partially useful for large projects because they tend to use a lot of packages from the Python Package Index (PyPI), which has limited stub coverage. There are only 35 stubs for third-party modules in the typeshed library, which is Python's stub repository. The presenter suggested that perhaps a centralized library for stubs is not the right development model. Some projects have stubs that live outside of typeshed, such as Django and SQLAlchemy. PEP 561 ("Distributing and Packaging Type Information") will provide a way to pip install stubs from packages that advertise that they have them. Python virtual environments: a short session on virtual environments and ideas for other ways to isolate local installations. Steve Dower brought up the shortcomings of Python virtual environments, which are meant to create isolated installations of the language and its modules. Thomas Wouters defended virtual environments in a response: The correct justification is that for the average person, not using a virtualenv all too soon creates confusion, pain, and very difficult to fix breakage. Starting with a virtualenv is the easiest way to avoid that, at very little cost. 
But Beazley and others (including Dower) think that starting Python tutorials or training classes with a 20-minute digression on setting up a virtual environment is wasted time. PEP 572 and decision-making in Python: a discussion of the controversy around PEP 572 and how to avoid the thread explosion that it caused in the future. The "PEP 572 mess" was the topic of a 2018 Python Language Summit session led by benevolent dictator for life (BDFL) Guido van Rossum. Getting along in the Python community: trying to find ways to keep the mailing list welcoming even in the face of rudeness. About tkinter… Mentoring and diversity for Python: a discussion on how to increase the diversity of the core development team. Victor Stinner outlined some work he has been doing to mentor new developers on their path toward joining the core development ranks Mariatta Wijaya gave a very personal talk that described the diversity problem while also providing some concrete action items that the project and individuals could take to help make Python more welcoming to minorities. Extras Listener feedback: CUDA is NVidia only, so no MacBook pro unless you have a custom external GPU.
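The duck-type hints from the type hints cheat sheet item above (Sequence[str] and Mapping[int, str]) can be sketched roughly as follows. This is only an illustration: the function names join_names and label_for are hypothetical and do not come from the cheat sheet.

```python
from typing import Mapping, Sequence


def join_names(names: Sequence[str]) -> str:
    """Accept anything indexable that holds strings: a list, a tuple, ..."""
    return ", ".join(names)


def label_for(code: int, labels: Mapping[int, str]) -> str:
    """Accept any read-only dict-like object mapping integers to strings."""
    return labels.get(code, "unknown")


# Both a list and a tuple satisfy Sequence[str].
print(join_names(["Brian", "Michael"]))
print(join_names(("asyncio", "typing")))

# A plain dict satisfies Mapping[int, str].
print(label_for(404, {200: "OK", 404: "Not Found"}))
```

Hinting the parameter as Sequence or Mapping rather than list or dict tells a type checker that the function only reads from the argument, so callers can pass any compatible container.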
September 15, 2018
Sponsored by DataDog -- pythonbytes.fm/datadog Brian #1: dataset: databases for lazy people dataset provides a simple abstraction layer removes most direct SQL statements without the necessity for a full ORM model - essentially, databases can be used like a JSON file or NoSQL store. A simple data loading script using dataset might look like this: import dataset db = dataset.connect('sqlite:///:memory:') table = db['sometable'] table.insert(dict(name='John Doe', age=37)) table.insert(dict(name='Jane Doe', age=34, gender='female')) john = table.find_one(name='John Doe') Michael #2: CuPy GPU NumPy A NumPy-compatible matrix library accelerated by CUDA How many cores does a modern GPU have? CuPy's interface is highly compatible with NumPy; in most cases it can be used as a drop-in replacement. You can easily make a custom CUDA kernel if you want to make your code run faster, requiring only a small code snippet of C++. CuPy automatically wraps and compiles it to make a CUDA binary PyCon 2018 presentation: Shohei Hido - CuPy: A NumPy-compatible Library for GPU Code example >>> # This will run on your GPU! >>> import cupy as np # This is the only non-NumPy line >>> x = np.arange(6).reshape(2, 3).astype('f') >>> x array([[ 0., 1., 2.], [ 3., 4., 5.]], dtype=float32) >>> x.sum(axis=1) array([ 3., 12.], dtype=float32) Brian #3: Automate Python workflow using pre-commits We covered pre-commit in episode 84, but I still had trouble getting my head around it. This article by LJ Miranda does a great job with the workflow introduction and configuration necessary to get pre-commit working for black and flake8. Includes a nice visual of the flow. Demo of it all in action with a short video. Michael #4: py-spy Sampling profiler for Python programs Written by Ben Frederickson Lets you visualize what your Python program is spending time on without restarting the program or modifying the code in any way. 
Written in Rust for speed Doesn't run in the same process as the profiled Python program Does NOT interrupt the running program in any way. This means py-spy is safe to use against production Python code. The default visualization is a top-like live view of your Python program How does py-spy work? Py-spy works by directly reading the memory of the Python program using the process_vm_readv system call on Linux, the vm_read call on OSX or the ReadProcessMemory call on Windows. Brian #5: SymPy is a Python library for symbolic mathematics “Symbolic computation deals with the computation of mathematical objects symbolically. This means that the mathematical objects are represented exactly, not approximately, and mathematical expressions with unevaluated variables are left in symbolic form.” example: >>> integrate(sin(x**2), (x, -oo, oo)) (√2⋅√π)/2 examples on site are interactive so you can play with it without installing anything. Michael #6: Starlette ASGI web framework The little ASGI framework that shines. It is ideal for building high performance asyncio services, and supports both HTTP and WebSockets. Very Flask-esque Can use ultrajson - Ultra fast JSON decoder and encoder written in C with Python bindings aiofiles for file responses Run using uvicorn Extras: Michael: PyCon 2019 dates out, put them on your calendar! Tutorials: May 1-2 • Wednesday, Thursday Talks and Events: May 3–5 • Friday, Saturday, Sunday Sprints: May 6–9 • Monday through Thursday Listener follow up on git pre-commit hooks util: pre-commit package Matthew Layman, @mblayman Heard the discussion about Git commit hooks at the end. I wanted to bring up pre-commit as an interesting project (written in Python!) that's useful for Git commit hooks. tl;dr: $ pip install pre-commit $ ... create a .pre-commit-config.yaml $ pre-commit install # This is a one time operation. pre-commit's job is to manage a project's Git commit hooks. 
We use this on my team at work and the devs only need to run pre-commit install. This saves us from a bunch of failing CI builds where flake8 or other code style checks would fail. We use pre-commit to run flake8 and black before allowing a commit to proceed. Some projects have a pre-commit configuration to use right out of the box (e.g., black https://github.com/ambv/black#version-control-integration). Listener: You don't need that (pattern) John Tocher PyCon AU Talk Called "You don't need that” - by Christopher Neugebauer, it was an interesting take on why with a modern and powerful language like python, you may not need the conventionally described design patterns, ala the "Gang of four".
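For the Starlette item above, it may help to see what a bare ASGI application looks like underneath a framework. The sketch below uses only the standard library and the plain ASGI callable interface (scope, receive, send); it is not Starlette's own API, and the call_app harness is a hypothetical stand-in for a real server such as uvicorn.

```python
import asyncio


async def app(scope, receive, send):
    """A bare ASGI application: one coroutine taking scope, receive, send."""
    assert scope["type"] == "http"
    body = f"Hello from {scope['path']}".encode()
    # An HTTP response is sent as two messages: the start, then the body.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_app(path):
    """Hypothetical harness that plays the role of the server for one request."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_app("/hello"))
print(messages[0]["status"], messages[1]["body"])
```

Frameworks like Starlette layer routing, request objects, and response classes on top of this interface, so application code rarely has to build these message dictionaries by hand.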
September 6, 2018
Sponsored by DigitalOcean -- pythonbytes.fm/digitalocean Brian #1: Python Patterns @brandon_rhodes vs GOF Michael #2: Arctic: Millions of rows a sec (time data) Arctic is a high-performance datastore for numeric data. It supports Pandas, numpy arrays and pickled objects out-of-the-box, with pluggable support for other data types and optional versioning. Arctic can query millions of rows per second per client, achieves ~10x compression on network bandwidth, ~10x compression on disk, and scales to hundreds of millions of rows per second per MongoDB instance. Arctic has been under active development at Man AHL since 2012. Super fast, some latency numbers: 1xDay Data 4ms for 10k rows (vs 2,210 ms from SQL Server) Tick Data 1s for 3.5 MB (Python) or 15 MB (Java) vs 15-40sec from “other tick” Versioned data Built on MongoDB Slides Based on pandas Tested with pytest Brian #3: PyCon Australia videos How To Publish A Package On PyPI Mark Smith @judy2k Michael #4: GAE: Introducing App Engine Second Generation runtimes and Python 3.7 Today, Google Cloud is announcing the availability of Second Generation App Engine standard runtimes, a significant upgrade to the platform that allows you to easily run web apps using up-to-date versions of popular languages, frameworks and libraries. Python 3.7 is one of the new Second Generation runtimes that we announced at Cloud Next. Based on technology from the gVisor container sandbox, these Second Generation runtimes eliminate many previous App Engine restrictions, giving you the ability to write portable web apps and microservices that take advantage of App Engine's unique auto-scaling, built-in security and pay-per-use billing model. This new runtime allows you to take advantage of Python's vibrant ecosystem of open-source libraries and frameworks. 
While the Python 2 runtime only allowed the use of specific versions of whitelisted libraries, Python 3 supports arbitrary third-party libraries, including those that rely on C code and native extensions. Just add Django 2.0, NumPy, scikit-learn or your library of choice to a requirements.txt file. App Engine will install these libraries in the cloud when you deploy your app. Brian #5: I don’t like notebooks @joelgrus Michael #6: PEP 8000 -- Python Language Governance Proposal Overview This PEP provides an overview of the selection process for a new model of Python language governance in the wake of Guido's retirement. Once the governance model is selected, it will be codified in PEP 13. PEPs in the lower 8000s describe the general process for selecting a governance model. PEP 8001 - Python Governance Voting Process PEP 8002 - Open Source Governance Survey PEPs in the 8010s describe the actual proposals for Python governance. PEP 8010 - The BDFL Governance Model PEP 8011 - The Council Governance Model PEP 8012 - The Community Governance Model Extras Free Brian Granger ACM webcast on Jupyter Friday TIOBE jump to #3: https://www.tiobe.com/tiobe-index/
August 31, 2018
Sponsored by DataDog -- pythonbytes.fm/datadog Brian #1: Replacing Bash Scripting with Python. reading & writing files CLI’s and working with stdin, stdout, stderr Path and shutil replacing sed, grep, awk, with regex running processes dealing with datetime see also: regex search and replace example scripts Michael #2: pyodide Scientific Python in the browser ALL of CPython (allowed in the browser) NumPy MatPlotLib ... Project by Mozilla We asked “Will there be a PyBlazor?” just two weeks ago. I think we are on a path… Brian #3: The subset of reStructuredText worth committing to memory A lot of Python packages document with reStructuredText, a lot of reStructuredText tutorials are overwhelming. This post is the answer. paragraphs are with two newlines headings use a weird underlined method of above and below and =, -, and ~ bulleted lists work with asterisks but spacing is important italics and bold are with one or two surrounding asterisks inline code uses two backticks links and code snippets are weird and I have to always look this up, as with images, and internal references. so I’ll bookmark this link Michael #4: bandit via Anthony Shaw Bandit is a tool designed to find common security issues in Python code. To do this Bandit processes each file, builds an AST from it, and runs appropriate plugins against the AST nodes. Once Bandit has finished scanning all the files it generates a report. Issues detected: B312 telnetlib B307 eval B110 try_except_pass B602 subprocess_popen_with_shell_equals_true Brian #5: Learn Python 3 within Jupyter Notebooks just fun Also shows how to run pytest in a cell. Michael #6: detect-secrets An enterprise friendly way of detecting and preventing secrets in code. From Yelp detect-secrets is an aptly named module for (surprise, surprise) detecting secrets within a code base. 
However, unlike other similar packages that solely focus on finding secrets, this package is designed with the enterprise client in mind: providing a backwards compatible, systematic means of: Preventing new secrets from entering the code base, Detecting if such preventions are explicitly bypassed, and Providing a checklist of secrets to roll, and migrate off to a more secure storage. Allows you to set a baseline set it up as a git commit hook
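The bandit item above describes building an AST from each file and running checks against its nodes. A minimal sketch of that general idea, using only the standard library ast module, might look like the following. It is not Bandit's implementation or plugin API; the find_issues helper and the sample source are hypothetical, and the two checks are only loosely modeled on B307 (eval) and B110 (try_except_pass).

```python
import ast

# Hypothetical sample code to scan; it is parsed, never executed.
SOURCE = '''
import subprocess

def run(cmd, expr):
    try:
        subprocess.Popen(cmd, shell=True)
    except Exception:
        pass
    return eval(expr)
'''


def find_issues(source):
    """Walk the AST and collect (line, description) pairs for risky nodes."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # A call to the builtin eval(), similar in spirit to B307.
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"):
            issues.append((node.lineno, "use of eval"))
        # An except block whose body is only `pass`, similar to B110.
        if (isinstance(node, ast.ExceptHandler)
                and len(node.body) == 1
                and isinstance(node.body[0], ast.Pass)):
            issues.append((node.lineno, "try/except/pass"))
    return sorted(issues)


for line, description in find_issues(SOURCE):
    print(f"line {line}: {description}")
```

Working on the AST rather than on raw text means the checks are not fooled by comments or string literals that merely mention eval.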