Tom Christie is probably best known as the creator of Django REST Framework, but his contributions to the state the web in Python extend well beyond that. In this episode he shares his story of getting involved in web development, his work on various projects to power the asynchronous web in Python, and his efforts to make his open source contributions sustainable. This was an excellent conversation about the state of asynchronous frameworks for Python and the challenges of making a career out of open source.
Video games have been a vehicle for learning to program since the early days of computing. Continuing in that tradition, Paul Craven created the Arcade library as a modern alternative to PyGame for use in his classroom. In this episode he explains his motivations for starting a new framework for video game development, his view on the benefits of games in computer education, and how his students and the broader community are using it to build interesting and creative projects. If you are looking for a way to get new programmers engaged, or just want to experiment with building your own games, then this is the conversation for you. Give it a listen and then give Arcade a try for yourself.
The companies that we entrust our personal data to are using that information to gain extensive insights into our lives and habits while not always making those findings accessible to us. Pascal van Kooten decided that he wanted to have the same capabilities to mine his personal data, so he created the Nostalgia project to integrate his various data sources and query across them. In this episode he shares his motivation for creating the project, how he is using it in his day-to-day, and how he is planning to evolve it in the future. If you're interested in learning more about yourself and your habits using the personal data that you share with the various services you use then listen now to learn more.
A standard feature in most modern web applications is the ability to log in or register using accounts that you already own on other sites such as Google, Facebook, or Twitter. Building your own integrations for each service can be complex and time consuming, distracting you from the features that you and your users actually care about. Fortunately the Python social auth library makes it easy to support third party authentication with a large and growing number of services with minimal effort. In this episode Matías Aguirre discusses his motivation for creating the library, how he has designed it to allow for flexibility and ease of use, and the benefits of delegating identity and authentication to third parties rather than managing passwords yourself.
In order for an organization to be data driven they need easy access to their data and a simple way of sharing it. Arik Fraimovich built Redash as a way to address that need by connecting to any data source and building attractive dashboards on top of them. In this episode he shares the origin story of the project, his experiences running a business based on open source, and the challenges of working with data effectively.
An effective strategy for teaching and learning is to rely on well structured exercises and collaboration for practicing the material. In this episode long time Python trainer Reuven Lerner reflects on the lessons that he has learned in the 5 years since his first appearance on the show, how his teaching has evolved, and the ways that he has incorporated more hands-on experiences into his lessons. This was a great conversation about the benefits of being deliberate in your approach to ongoing education in the field of technology, as well as having some helpful references for ways to keep your own skills sharp.
Python has been part of the standard toolkit for systems administrators since it was created. In recent years there has been a shift in how servers are deployed and managed, and how code gets released due to the rise of cloud computing and the accompanying DevOps movement. The increased need for automation and speed of iteration has been a perfect use case for Python, cementing its position as a powerful tool for operations. In this episode Moshe Zadka reflects on his experiences using Python in a DevOps context and the book that he wrote on the subject. He also discusses the difference in what aspects of the language are useful as an introduction for system operators and where they can continue their learning.
One of the first challenges that new programmers are faced with is figuring out what editing environment to use. For the past 20 years, Python has had an easy answer to that question in the form of IDLE. In this episode Tal Einat helps us explore its history, the ways it is used, how it was built, and what is in store for its future. Even if you have never used the IDLE editor yourself, it is still an important piece of Python's strength and history, and this conversation helps to highlight why that is.
The past two decades have seen massive growth in the language, community, and ecosystem of Python. The career of Pete Fein has occurred during that same period and his use of the language has paralleled some of the major shifts in focus that have occurred. In this episode he shares his experiences moving from a trader writing scripts, through the rise of the web, to the current renaissance in data. He also discusses how his engagement with the community has evolved, why he hasn't needed to use any other languages in his career, and what he is keeping an eye on for the future.
Debugging is a painful but necessary practice in software development. The tools that are available in Python range from the built-in debugger, to tools integrated with your coding environment, to the trusty print function. In this episode Ram Rachum describes his work on PySnooper and how it can be used to speed up your problem solving in complex or legacy applications.
Starting a new project is always exciting because the scope is easy to understand and adding new features is fun and easy. As it grows, the rate of change slows down and the amount of communication necessary to introduce new engineers to the code increases along with the complexity. Thomas Hatch, CTO and creator of SaltStack, didn't want to accept that as an inevitable fact of software, so he created a new paradigm and a proof-of-concept framework to experiment with it. In this episode he shares his thoughts and findings on the topic of plugin oriented programming as a way to build and scale complex projects while keeping them fun and flexible.
Any software project that is worked on or used by multiple people will inevitably reach a point where certain capabilities need to be turned on or off. In this episode Pete Hodgson shares his experience and insight into when, how, and why to use feature flags in your projects as a way to enable that practice. In addition to the simple on and off controls for certain logic paths, feature toggles also allow for more advanced patterns such as canary releases and A/B testing. This episode has something useful for anyone who works on software in any language.
Building well designed and easy to use web applications requires a significant amount of knowledge and experience across a range of domains. This can act as an impediment to engineers who primarily work in so-called back-end technologies such as machine learning and systems administration. In this episode Adrien Treuille describes how the Streamlit framework empowers anyone who is comfortable writing Python scripts to create beautiful applications to share their work and make it accessible to their colleagues and customers. If you have ever struggled with hacking together a simple web application to make a useful script self-service then give this episode a listen and then go experiment with how Streamlit can level up your work.
The internet is rife with bots and bad actors trying to compromise your servers. To counteract these threats it is necessary to diligently harden your systems to improve server security. Unfortunately, the hardening process can be complex or confusing. In this week's episode 18 year old Orhun Parmaksiz shares the story of how he and his friends created the GrapheneX framework to simplify the process of securing and maintaining your servers using the power and flexibility of Python. If you run your own software then this is definitely worth a listen.
Large companies often have a variety of programming languages and technologies being used across departments to keep the business running. Python has been gaining ground in these environments because of its flexibility, ease of use, and developer productivity. In order to accelerate the rate of adoption at Wayfair this week's guest Jonathan Biddle started a team to work with other engineering groups on their projects and show them how best to take advantage of the benefits of Python. In this episode he explains their operating model, shares their success stories, and provides advice on the pitfalls to avoid if you want to follow in his footsteps. This is definitely worth a listen if you are using Python in your work or would like to aid in its adoption.
Quantum computers are the biggest jump forward in processing power that the industry has seen in decades. As part of this revolution it is necessary to change our approach to algorithm design. D-Wave is one of the companies who are pushing the boundaries in quantum processing and they have created a Python SDK for experimenting with quantum algorithms. In this episode Alexander Condello explains what is involved in designing and implementing these algorithms, how the Ocean SDK helps you in that endeavor, and what types of problems are well suited to this approach.
Deep learning is a phrase that is used more often as it continues to transform the standard approach to artificial intelligence and machine learning projects. Despite its ubiquity, it is often difficult to get a firm understanding of how it works and how it can be applied to a particular problem. In this episode Jon Krohn, author of Deep Learning Illustrated, shares the general concepts and useful applications of this technique, as well as sharing some of his practical experience in using it for his work. This is definitely a helpful episode for getting a better comprehension of the field of deep learning and when to reach for it in your own projects.
Software development is a unique profession in many ways, and it has given rise to its own subculture due to the unique sets of challenges that face developers. Andrew Smith is an author who is working on a book to share his experiences learning to program, and understand the impact that software is having on our world. In this episode he shares his thoughts on programmer culture, his experiences with Python and other language communities, and how learning to code has changed his views on the world. It was interesting getting an anthropological perspective from a relative newcomer to the world of software.
Designing and maintaining enterprise networks and the associated hardware is a complex and time consuming task. Network automation tools allow network engineers to codify their workflows and make them repeatable. In this episode Antoine Fourmy describes his work on eNMS and how it can be used to automate enterprise grade networks. He explains how his background in telecom networking led him to build an open source platform for network engineers, how it is architected, and how you can use it for creating your own workflows. This is definitely worth listening to as a way to gain some appreciation for all of the work that goes on behind the scenes to make the internet possible.
Building and sustaining a healthy community requires a substantial amount of effort, especially online. The design and user experience of the digital space can impact the overall interactions of the participants and guide them toward respectful conversation. In this episode Rafał Pitoń shares his experience building the Misago platform for creating community forums. He explains his motivation for creating the project, the lessons he has learned in the process, and how it is being used by himself and others. This was a great conversation about how technology is just a means, and not the end in itself.
There are countless tools and libraries in Python for data scientists to perform powerful analyses, but they often have a setup cost that acts as a barrier to ad-hoc exploration of data. Visidata is a command line application that eliminates the friction involved with starting the discovery process. In this episode Saul Pwanson explains his motivation for creating it, why a terminal environment is a useful place for this work, and how you can use Visidata for your own work. If you have ever avoided looking at a data set because you couldn't be bothered with the boilerplate for a Jupyter notebook, then Visidata is the perfect addition to your toolbox.
The Python community in Argentina is large and active, thanks largely to the motivated individuals who manage and organize it. In this episode Facundo Batista explains how he helped to found the Python user group for Argentina and the work that he does to make it accessible and welcoming. He discusses the challenges of encompassing such a large and distributed group, the types of events, resources, and projects that they build, and his own efforts to make information free and available. He is an impressive individual with a substantial list of accomplishments, as well as exhibiting the best of what the global Python community has to offer.
The internet has made it easier than ever to share information, but at the same time it has increased our ability to track that information. In order to ensure that news agencies are able to accept truly anonymous material submissions from whistelblowers, the Freedom of the Press foundation has supported the ongoing development and maintenance of the SecureDrop platform. In this episode core developers of the project explain what it is, how it protects the privacy and identity of journalistic sources, and some of the challenges associated with ensuring its security. This was an interesting look at the amount of effort that is required to avoid tracking in the modern era.
The ecosystem of tools and libraries in Python for data manipulation and analytics is truly impressive, and continues to grow. There are, however, gaps in their utility that can be filled by the capabilities of a data warehouse. In this episode Robert Hodges discusses how the PyData suite of tools can be paired with a data warehouse for an analytics pipeline that is more robust than either can provide on their own. This is a great introduction to what differentiates a data warehouse from a relational database and ways that you can think differently about running your analytical workloads for larger volumes of data.
Software engineers are frequently faced with problems that have been fixed by other developers in different projects. The challenge is how and when to surface that information in a way that increases their efficiency and avoids wasted effort. DeepCode is an automated code review platform that was built to solve this problem by training a model on a massive array of open sourced code and the history of their bug and security fixes. In this episode their CEO Boris Paskalev explains how the company got started, how they build and maintain the models that provide suggestions for improving your code changes, and how it integrates into your workflow.
PyPI is a core component of the Python ecosystem that most developer's have interacted with as either a producer or a consumer. But have you ever thought deeply about how it is implemented, who designs those interactions, and how it is secured? In this episode Nicole Harris and William Woodruff discuss their recent work to add new security capabilities and improve the overall accessibility and user experience. It is a worthwhile exercise to consider how much effort goes into making sure that we don't have to think much about this piece of infrastructure that we all rely on.
With the increasing role of software in our world there has been an accompanying focus on teaching people to program. There are numerous approaches that have been attempted to achieve this goal with varying levels of success. Nicholas Tollervey has begun a new effort that blends the approach adopted by musicians and martial artists that uses a series of grades to provide recognition for the achievements of students. In this episode he explains how he has structured the study groups, syllabus, and evaluations to help learners build projects based on their interests and guide their own education while incorporating useful skills that are necessary for a career in software. If you are interested in learning to program, teach others, or act as a mentor then give this a listen and then get in touch with Nicholas to help make this endeavor a success.
Computers are excellent at following detailed instructions, but they have no capacity for understanding the information that they work with. Knowledge graphs are a way to approximate that capability by building connections between elements of data that allow us to discover new connections among disparate information sources that were previously uknown. In our day-to-day work we encounter many instances of knowledge graphs, but building them has long been a difficult endeavor. In order to make this technology more accessible Tom Grek built Zincbase. In this episode he explains his motivations for starting the project, how he uses it in his daily work, and how you can use it to create your own knowledge engine and begin discovering new insights of your own.
Docker is a useful technology for packaging and deploying software to production environments, but it also introduces a different set of complexities that need to be understood. In this episode Itamar Turner-Trauring shares best practices for running Python workloads in production using Docker. He also explains some of the security implications to be aware of and digs into ways that you can optimize your build process to cut down on wasted developer time. If you are using Docker, thinking about using it, or just heard of it recently then it is worth your time to listen and learn about some of the cases you might not have considered.
The Python language has seen exponential growth in popularity and usage over the past decade. This has been driven by industry trends such as the rise of data science and the continued growth of complex web applications. It is easy to think that there is no threat to the continued health of Python, its ecosystem, and its community, but there are always outside factors that may pose a threat in the long term. In this episode Russell Keith-Magee reprises his keynote from PyCon US in 2019 and shares his thoughts on potential black swan events and what we can do as engineers and as a community to guard against them.
Project management is a discipline that has been through many incarnations, spawning an entire industry of businesses and tools. The challenge is to build a platform that is sufficiently powerful and adaptable to fit the workflow of your teams, while remaining opinionated enough to be useful. It also helps to have an open and extensible platform that can be customized as needed. In this episode Pablo Ruiz Múzquiz explains the motivation for creating the open source tool Taiga, how it compares to the other options in the market, and how you can use it for your own projects. He also discusses the challenges inherent to project management tools, his philosophies on what makes a project successful, and how to manage your team workflows to be most effective. It was helpful learning from Pablo's long experience in the software industry and managing teams of various sizes.
When your software projects start to scale it becomes a greater challenge to understand and maintain all of the pieces. In this episode Henry Percival shares his experiences working with domain driven design in large Python projects. He explains how it is helpful, and how you can start using it for your own applications. This was an informative conversation about software architecture patterns for large organizations and how they can be used by Python developers.
Machine learning is growing in popularity and capability, but for a majority of people it is still a black box that we don't fully understand. The team at MindsDB is working to change this state of affairs by creating an open source tool that is easy to use without a background in data science. By simplifying the training and use of neural networks, and making their logic explainable, they hope to bring AI capabilities to more people and organizations. In this interview George Hosu and Jorge Torres explain how MindsDB is built, how to use it for your own purposes, and how they view the current landscape of AI technologies. This is a great episode for anyone who is interested in experimenting with machine learning and artificial intelligence. Give it a listen and then try MindsDB for yourself.
One of the secrets of the success of Python the language is the tireless efforts of the people who work with and for the Python Software Foundation. They have made it their mission to ensure the continued growth and success of the language and its community. In this episode Ewa Jodlowska, the executive director of the PSF, discusses the history of the foundation, the services and support that they provide to the community and language, and how you can help them succeed in their mission.
Algorithmic trading is a field that has grown in recent years due to the availability of cheap computing and platforms that grant access to historical financial data. QuantConnect is a business that has focused on community engagement and open data access to grant opportunities for learning and growth to their users. In this episode CEO Jared Broad and senior engineer Alex Catarino explain how they have built an open source engine for testing and running algorithmic trading strategies in multiple languages, the challenges of collecting and serving currrent and historical financial data, and how they provide training and opportunity to their community members. If you are curious about the financial industry and want to try it out for yourself then be sure to listen to this episode and experiment with the QuantConnect platform for free.
The knowledge and effort required for building a fully functional web application has grown at an accelerated rate over the past several years. This introduces a barrier to entry that excludes large numbers of people who could otherwise be producing valuable and interesting services. To make the onramp easier Meredydd Luff and Ian Davies created Anvil, a platform for full stack web development in pure Python. In this episode Meredydd explains how the Anvil platform is built and how you can use it to build and deploy your own projects. He also shares some examples of people who were able to create profitable businesses themselves because of the reduced complexity. It was interesting to get Meredydd's perspective on the state of the industry for web development and hear his vision of how Anvil is working to make it available for everyone.
Serverless computing is a recent category of cloud service that provides new options for how we build and deploy applications. In this episode Raghu Murthy, founder of DataCoral, explains how he has built his entire business on these platforms. He explains how he approaches system architecture in a serverless world, the challenges that it introduces for local development and continuous integration, and how the landscape has grown and matured in recent years. If you are wondering how to incorporate serverless platforms in your projects then this is definitely worth your time to listen to.
One of the biggest pain points when working with data is getting is dealing with the boilerplate code to load it into a usable format. Intake encapsulates all of that and puts it behind a single API. In this episode Martin Durant explains how to use the Intake data catalogs for encapsulating source information, how it simplifies data science workflows, and how to incorporate it into your projects. It is a lightweight way to enable collaboration between data engineers and data scientists in the PyData ecosystem.
Learning to program can be a frustrating process, because even the simplest code relies on a complex stack of other moving pieces to function. When working with a microcontroller you are in full control of everything so there are fewer concepts that need to be understood in order to build a functioning project. CircuitPython is a platform for beginner developers that provides easy to use abstractions for working with hardware devices. In this episode Scott Shawcroft explains how the project got started, how it relates to MicroPython, some of the cool ways that it is being used, and how you can get started with it today. If you are interested in playing with low cost devices without having to learn and use C then give this a listen and start tinkering!
Being able to control a computer with your voice has rapidly moved from science fiction to science fact. Unfortunately, the majority of platforms that have been made available to consumers are controlled by large organizations with little incentive to respect users' privacy. The team at Snips are building a platform that runs entirely off-line and on-device so that your information is always in your control. In this episode Adrien Ball explains how the Snips architecture works, the challenges of building a speech recognition and natural language understanding toolchain that works on limited resources, and how they are tackling issues around usability for casual consumers. If you have been interested in taking advantage of personal voice assistants, but wary of using commercially available options, this is definitely worth a listen.
The U.S. government has a vast quantity of software projects across the various agencies, and many of them would benefit from a modern approach to development and deployment. The U.S. Digital Services Agency has been tasked with making that happen. In this episode the current director of engineering for the USDS, David Holmes, explains how the agency operates, how they are using Python in their efforts to provide the greatest good to the largest number of people, and why you might want to get involved. Even if you don't live in the U.S.A. this conversation is worth listening to so you can see an interesting model of how to improve government services for everyone.
Most programming is deterministic, relying on concrete logic to determine the way that it operates. However, there are problems that require a way to work with uncertainty. PyMC3 is a library designed for building models to predict the likelihood of certain outcomes. In this episode Thomas Wiecki explains the use cases where Bayesian statistics are necessary, how PyMC3 is designed and implemented, and some great examples of how it is being used in real projects.
Managing an event is rife with inherent complexity that scales as you move from scheduling a meeting to organizing a conference. Indico is a platform built at CERN to handle their efforts to organize events such as the Computing in High Energy Physics (CHEP) conference, and now it has grown to manage booking of meeting rooms. In this episode Adrian Mönnich, core developer on the Indico project, explains how it is architected to facilitate this use case, how it has evolved since its first incarnation two decades ago, and what he has learned while working on it. The Indico platform is definitely a feature rich and mature platform that is worth considering if you are responsible for organizing a conference or need a room booking system for your office.
The CPython interpreter has been the primary implementation of the Python runtime for over 20 years. In that time other options have been made available for different use cases. The most recent entry to that list is RustPython, written in the memory safe language Rust. One of the added benefits is the option to compile to WebAssembly, offering a browser-native Python runtime. In this episode core maintainers Windel Bouwman and Adam Kelly explain how the project got started, their experience working on it, and the plans for the future. Definitely worth a listen if you are curious about the inner workings of Python and how you can get involved in a relatively new project that is contributing to new options for running your code.
Version control has become table stakes for any software team, but for machine learning projects there has been no good answer for tracking all of the data that goes into building and training models, and the output of the models themselves. To address that need Dmitry Petrov built the Data Version Control project known as DVC. In this episode he explains how it simplifies communication between data scientists, reduces duplicated effort, and simplifies concerns around reproducing and rebuilding models at different stages of the projects lifecycle. If you work as part of a team that is building machine learning models or other data intensive analysis then make sure to give this a listen and then start using DVC today.
Ecommerce is an industry that has largely faded into the background due to its ubiquity in recent years. Despite that, there are new trends emerging and room for innovation, which is what the team at Mirumee focuses on. To support their efforts, they build and maintain the open source Saleor framework for Django as a way to make the core concerns of online sales easy and painless. In this episode Mirek Mencel and Patryk Zawadzki discuss the projects that they work on, the current state of the ecommerce industry, how Saleor fits with their technical and business strategy, and their predictions for the near future of digital sales.
Naomi Ceder was fortunate enough to learn Python from Guido himself. Since then she has contributed books, code, and mentorship to the community. Currently she serves as the chair of the board to the Python Software Foundation, leads an engineering team, and has recently completed a new draft of the Quick Python Book. In this episode she shares her story, including a discussion of her experience as a technical author and a detailed account of the role that the PSF plays in supporting and growing the community.
Python has become one of the dominant languages for data science and data analysis. Wes McKinney has been working for a decade to make tools that are easy and powerful, starting with the creation of Pandas, and eventually leading to his current work on Apache Arrow. In this episode he discusses his motivation for this work, what he sees as the current challenges to be overcome, and his hopes for the future of the industry.
The current buzz in data science and big data is around the promise of deep learning, especially when working with unstructured data. One of the most popular frameworks for building deep learning applications is PyTorch, in large part because of their focus on ease of use. In this episode Adam Paszke explains how he started the project, how it compares to other frameworks in the space such as Tensorflow and CNTK, and how it has evolved to support deploying models into production and on mobile devices.
The Redis database recently celebrated its 10th birthday. In that time it has earned a well-earned reputation for speed, reliability, and ease of use. Python developers are fortunate to have a well-built client in the form of redis-py to leverage it in their projects. In this episode Andy McCurdy and Dr. Christoph Zimmerman explain the ways that Redis can be used in your application architecture, how the Python client is built and maintained, and how to use it in your projects.
Any time that your program needs to interact with other systems it will have to deal with serializing and deserializing data. To prevent duplicate code and provide validation of the data structures that your application is consuming Steven Loria created the Marshmallow library. In this episode he explains how it is built, how to use it for rendering data objects to various serialization formats, and some of the interesting and unique ways that it is incorporated into other projects.
Chaos engineering is the practice of injecting failures into your production systems in a controlled manner to identify weaknesses in your applications. In order to build, run, and report on chaos experiments Sylvain Hellegouarch created the Chaos Toolkit. In this episode he explains his motivation for creating the toolkit, how to use it for improving the resiliency of your systems, and his plans for the future. He also discusses best practices for building, running, and learning from your own experiments.
Music is a part of every culture around the world and throughout history. Musicology is the study of that music from a structural and sociological perspective. Traditionally this research has been done in a manual and painstaking manner, but the advent of the computer age has enabled an increase of many orders of magnitude in the scope and scale of analysis that we can perform. The music21 project is a Python library for computer aided musicology that is written and used by MIT professor Michael Scott Cuthbert. In this episode he explains how the project was started, how he is using it personally, professionally, and in his lectures, as well as how you can use it for your own exploration of musical analysis.
Software development is a career that attracts people from all backgrounds, and Python in particular helps to make it an approachable occupation. Because of the variety of paths that can be taken it is becoming increasingly common for practitioners to bypass the traditional computer science education. In this episode David Kopec discusses some of the classic problems that he has found most useful to understand in his work as a professor and practitioner of software engineering. He shares his motivation for writing the book "Classic Computer Science Problems In Python", the practical approach that he took, and an overview of how the contents can be used in your day-to-day work.
As a developer and user of open source code, you interact with software and digital media every day. What is often overlooked are the rights and responsibilities conveyed by the intellectual property that is implicit in all creative works. Software licenses are a complicated legal domain in their own right, and they can often conflict with each other when you factor in the web of dependencies that your project relies on. In this episode Luis Villa, Co-Founder of Tidelift, explains the catagories of software licenses, how to select the right one for your project, and what to be aware of when you contribute to someone else's code.
As we build software projects, complexity and technical debt are bound to creep into our code. To counteract these tendencies it is necessary to calculate and track metrics that highlight areas of improvement so that they can be acted on. To aid in identifying areas of your application that are breeding grounds for incidental complexity Anthony Shaw created Wily. In this episode he explains how Wily traverses the history of your repository and computes code complexity metrics over time and how you can use that information to guide your refactoring efforts.
Computers have found their way into virtually every area of human endeavor, and archaeology is no exception. To aid his students in their exploration of digital archaeology Shawn Graham helped to create an online, digital textbook with accompanying interactive notebooks. In this episode he explains how computational practices are being applied to archaeological research, how the Online Digital Archaeology Textbook was created, and how you can use it to get involved in this fascinating area of research.
Every day there are satellites collecting sensor readings and imagery of our Earth. To help make sense of that information, developers at the meterological institutes of Sweden and Denmark worked together to build a collection of Python packages that simplify the work of downloading and processing the data gathered by satellites. In this episode one of the core developers of PyTroll explains how the project got started, how that data is being used by the scientific community, and how citizen scientists like you are getting involved.
The web has spawned numerous methods for communicating between applications, including protocols such as SOAP, XML-RPC, and REST. One of the newest entrants is GraphQL which promises a simplified approach to client development and reduced network requests. To make implementing these APIs in Python easier, Syrus Akbary created the Graphene project. In this episode he explains the origin story of Graphene, how GraphQL compares to REST, how you can start using it in your applications, and how he is working to make his efforts sustainable.
Real-time communication over the internet is an amazing feat of modern engineering. The protocol that powers a majority of video calling platforms is WebRTC. In this episode Jeremy Lainé explains why he wrote a Python implementation of this protocol in the form of AIORTC. He also discusses how it works, how you can use it in your own projects, and what he has planned for the future.
Using computers to analyze text can produce useful and inspirational insights. However, when working with multiple languages the capabilities of existing models are severely limited. In order to help overcome this limitation Rami Al-Rfou built Polyglot. In this episode he explains his motivation for creating a natural language processing library with support for a vast array of languages, how it works, and how you can start using it for your own projects. He also discusses current research on multi-lingual text analytics, how he plans to improve Polyglot in the future, and how it fits in the Python ecosystem.
Do you know what your servers are doing? If you have a metrics system in place then the answer should be "yes". One critical aspect of that platform is the timeseries database that allows you to store, aggregate, analyze, and query the various signals generated by your software and hardware. As the size and complexity of your systems scale, so does the volume of data that you need to manage which can put a strain on your metrics stack. Julien Danjou built Gnocchi during his time on the OpenStack project to provide a time oriented data store that would scale horizontally and still provide fast queries. In this episode he explains how the project got started, how it works, how it compares to the other options on the market, and how you can start using it today to get better visibility into your operations.
Keeping up with the work being done in the Python community can be a full time job, which is why Dan Bader has made it his! In this episode he discusses how he went from working as a software engineer, to offering training, to now managing both the Real Python and PyCoders properties. He also explains his strategies for tracking and curating the content that he produces and discovers, how he thinks about building products, and what he has learned in the process of running his businesses.
Digital books are convenient and useful ways to have easy access to large volumes of information. Unfortunately, keeping track of them all can be difficult as you gain more books from different sources. Keeping your reading device synchronized with the material that you want to read is also challenging. In this episode Kovid Goyal explains how he created the Calibre digital library manager to solve these problems for himself, how it grew to be the most popular application for organizing ebooks, and how it works under the covers. Calibre is an incredibly useful piece of software with a lot of hidden complexity and a great story behind it.
Investigative reporters have a challenging task of identifying complex networks of people, places, and events gleaned from a mixed collection of sources. Turning those various documents, electronic records, and research into a searchable and actionable collection of facts is an interesting and difficult technical challenge. Friedrich Lindenberg created the Aleph project to address this issue and in this episode he explains how it works, why he built it, and how it is being used. He also discusses his hopes for the future of the project and other ways that the system could be used.
The Python Community is large and growing, however a majority of articles, books, and presentations are still in English. To increase the accessibility for Spanish language speakers, Maricela Sanchez helped to create the Charlas track at PyCon US, and is an organizer for Python Day Mexico. In this episode she shares her motivations for getting involved in community building, her experiences working on Python Day Mexico and PyCon Charlas, and the lessons that she has learned in the process.
As data science becomes more widespread and has a bigger impact on the lives of people, it is important that those projects and products are built with a conscious consideration of ethics. Keeping ethical principles in mind throughout the lifecycle of a data project helps to reduce the overall effort of preventing negative outcomes from the use of the final product. Emily Miller and Peter Bull of Driven Data have created Deon to improve the communication and conversation around ethics among and between data teams. It is a Python project that generates a checklist of common concerns for data oriented projects at the various stages of the lifecycle where they should be considered. In this episode they discuss their motivation for creating the project, the challenges and benefits of maintaining such a checklist, and how you can start using it today.
The breadth of use cases that Python supports, coupled with the level of productivity that it provides through its ease of use have contributed to the incredible popularity of the language. To explore the ways that it can contribute to the success of a young and growing startup two of the lead engineers at Wanderu discuss their experiences in this episode. Matt Warren, the technical operations lead, explains the ways that he is using Python to build and scale the infrastructure that Wanderu relies on, as well as the ways that he deploys and runs the various Python applications that power the business. Chris Kirkos, the lead software architect, describes how the original Django application has grown into a suite of microservices, where they have opted to use a different language and why, and how Python is still being used for critical business needs. This is a great conversation for understanding the business impact of the Python language and ecosystem.
Many people learn to program because of their interest in building their own video games. Once the necessary skills have been acquired, it is often the case that the original idea of creating a game is forgotten in favor of solving the problems we confront at work. Game jams are a great way to get inspired and motivated to finally write a game from scratch. This week Daniel Pope discusses the origin and format for PyWeek, his experience as a participant, and the landscape of options for building a game in Python. He also explains how you can register and compete in the next competition.
Any application that communicates with other systems or services will at some point require a credential or sensitive piece of information to operate properly. The question then becomes how best to securely store, transmit, and use that information. The world of software secrets management is vast and complicated, so in this episode Brian Kelly, engineering manager at Conjur, aims to help you make sense of it. He explains the main factors for protecting sensitive information in your software development and deployment, ways that information might be leaked, and how to get the whole team on the same page.
There are many aspects of learning how to program and at least as many ways to go about it. This is multiplicative with the different problem domains and subject areas where software development is applied. In this episode William Vincent discusses his experiences learning how web development mid-career and then writing a series of books to make the learning curve for Django newcomers shallower. This includes his thoughts on the business aspects of technical writing and teaching, the challenges of keeping content up to date with the current state of software, and the ever-present lack of sufficient information for new programmers.
Maintaining the health and well-being of your software is a never-ending responsibility. Automating away as much of it as possible makes that challenge more achievable. In this episode Anthony Sottile describes his work on the pre-commit framework to simplify the process of writing and distributing functions to make sure that you only commit code that meets your definition of clean. He explains how it supports tools and repositories written in multiple languages, enforces team standards, and how you can start using it today to ship better software.
How secure are your servers? The best way to be sure that your systems aren't being compromised is to do it yourself. In this episode Daniel Goldberg explains how you can use his project Infection Monkey to run a scan of your infrastructure to find and fix the vulnerabilities that can be taken advantage of. He also discusses his reasons for building it in Python, how it compares to other security scanners, and how you can get involved to keep making it better.
The need to process unbounded and continually streaming sources of data has become increasingly common. One of the popular platforms for implementing this is Kafka along with its streams API. Unfortunately, this requires all of your processing or microservice logic to be implemented in Java, so what's a poor Python developer to do? If that developer is Ask Solem of Celery fame then the answer is, help to re-implement the streams API in Python. In this episode Ask describes how Faust got started, how it works under the covers, and how you can start using it today to process your fast moving data in easy to understand Python code. He also discusses ways in which Faust might be able to replace your Celery workers, and all of the pieces that you can replace with your own plugins.
Writing a book is hard work, especially when you are trying to teach such a broad concept as programming. In this episode Ana Bell discusses her recent work in writing Get Programming: Learn To Code With Python, including her views on how to separate the principles from the implementation, making the book evergreen in its appeal, and how her experience as a lecturer at MIT has helped her maintain the perspectives of beginners. She also shares her views on the values of learning about programming, even when you have no intention of doing it as a career and ways to take the next steps if that is your goal.
Masonite is an ambitious new web framework that draws inspiration from many other successful projects in other languages. In this episode Joe Mancuso, the primary author and maintainer, explains his goal of unseating Django from its position of prominence in the Python community. He also discusses his motivation for building it, how it is architected, and how you can start using it for your own projects.
There are a number of resources available for teaching beginners to code in Python and many other languages, and numerous endeavors to introduce programming to educational environments. Sometimes those efforts yield success and others can simply lead to frustration on the part of the teacher and the student. In this episode Nicholas Tollervey discusses his work as a teacher and a programmer, his work on the micro:bit project and the PyCon UK education summit, as well as his thoughts on the place that Python holds in educational programs for teaching the next generation.
Continuous integration systems are important for ensuring that you don't release broken software. Some projects can benefit from simple, standardized platforms, but as you grow or factor in additional projects the complexity of checking your deployments grows. Zuul is a deployment automation and gating system that was built to power the complexities of OpenStack so it will grow and scale with you. In this episode Monty Taylor explains how he helped start Zuul, how it is designed for scale, and how you can start using it for your continuous delivery systems. He also discusses how Zuul has evolved and the directions it will take in the future.
Michael Foord has been working on building and testing software in Python for over a decade. One of his most notable and widely used contributions to the community is the Mock library, which has been incorporated into the standard library. In this episode he explains how he got involved in the community, why testing has been such a strong focus throughout his career, the uses and hazards of mocked objects, and how he is transitioning to freelancing full time.
Twisted is one of the earliest frameworks for developing asynchronous applications in Python and it has yet to fulfill its original purpose. It can be used to build network servers that integrate a multitude of protocols, increase the performance of your I/O bound applications, serve as the full web stack for your WSGI projects, and anything else that needs a battle tested and performant foundation. In this episode long time maintainer Moshe Zadka discusses the history of Twisted, how it has evolved over the years, the transition to Python 3, some of its myriad use cases, and where it is headed in the future. Try it out today and then send some thanks to all of the people who have dedicated their time to building it.
Mike Driscoll has been writing blogs and books for the Python community for years, including his popular series on the Python Module Of The Week. In his daily work he uses Python to test graphical interfaces written in C++ and QT for embedded platforms. In this episode he explains his work, how he got involved in writing as a regular exercise, and an overview of his recent books.
Hosting your own artifact repositories can have a huge impact on the reliability of your production systems. It reduces your reliance on the availability of external services during deployments and ensures that you have access to a consistent set of dependencies with known versions. Many repositories only support one type of package, thereby requiring multiple systems to be maintained, but Pulp is a platform that handles multiple content types and is easily extendable to manage everything you need for running your applications. In this episode maintainers Bihan Zhang and Austin Macdonald explain how the Pulp project works, the exciting new changes coming in version 3, and how you can get it set up to use for your deployments today.
The future is here, it's just not evenly distributed. One of the places where this is especially true is in sub-Saharan Africa which is a vast region with little to no reliable internet connectivity. To help communities in this region leapfrog infrastructure challenges and gain access to opportunities for education and market information the Ascoderu non-profit has built Lokole. In this episode one of the lead engineers on the project, Clemens Wolff, explains what it is, how it is built, and how the venerable e-mail protocols can continue to provide access cheaply and reliably.
Machine learning models are often inscrutable and it can be difficult to know whether you are making progress. To improve feedback and speed up iteration cycles Benjamin Bengfort and Rebecca Bilbro built Yellowbrick to easily generate visualizations of model performance. In this episode they explain how to use Yellowbrick in the process of building a machine learning project, how it aids in understanding how different parameters impact the outcome, and the improved understanding among teammates that it creates. They also explain how it integrates with the scikit-learn API, the difficulty of producing effective visualizations, and future plans for improvement and new features.
The command line is a powerful and resilient interface for getting work done, but the user experience is often lacking. This can be especially pronounced in database clients because of the amount of information being transferred and examined. To help improve the utility of these interfaces Amjith Ramanujam built PGCLI, quickly followed by MyCLI with the Prompt Toolkit library. In this episode he describes his motivation for building these projects, how their popularity led him to create even more clients, and how these tools can help you in your command line adventures.
Pandas is a swiss army knife for data processing in Python but it has long been difficult to customize. In the latest release there is now an extension interface for adding custom data types with namespaced APIs. This allows for building and combining domain specific use cases and alternative storage mechanisms. In this episode Tom Augspurger describes how the new ExtensionArray works, how it came to be, and how you can start building your own extensions today.
Software development is a skill that can create value and reduce drudgery in a wide variety of contexts. Sometimes the causes that are most in need of software expertise are also the least able to pay for it. By volunteering our time and abilities to causes that we believe in, we can help make a tangible difference in the world. In this episode Eric Schles describes his experiences working on social justice initiatives and the types of work that proved to be the most helpful to the groups that he was working with.
One of the challenges of machine learning is obtaining large enough volumes of well labelled data. An approach to mitigate the effort required for labelling data sets is active learning, in which outliers are identified and labelled by domain experts. In this episode Tivadar Danka describes how he built modAL to bring active learning to bioinformatics. He is using it for doing human in the loop training of models to detect cell phenotypes with massive unlabelled datasets. He explains how the library works, how he designed it to be modular for a broad set of use cases, and how you can use it for training models of your own.
Testing is a critical activity in all software projects, but one that is often neglected in data pipelines. The complexities introduced by the inherent statefulness of the problem domain and the interdependencies between systems contribute to make pipeline testing difficult to manage. To make this endeavor more manageable Abe Gong and James Campbell have created Great Expectations. In this episode they discuss how you can use the project to create tests in the exploratory phase of building a pipeline and leverage those to monitor your systems in production. They also discussed how Great Expectations works, the difficulties associated with pipeline testing and managing associated technical debt, and their future plans for the project.
We take it for granted every day, but creating and displaying vivid colors in our digital media is a complicated and often difficult process. There are different ways to represent color, the ways in which they are displayed can cause them to look different, and translating between systems can cause losses of information. To simplify the process of working with color information in code Thomas Mansencal wrote the Colour project. In this episode we discuss his motiviation for creating and sharing his library, how it works to translate and manage color representations, and how it can be used in your projects.
Many developers enter the market from backgrounds that don't involve a computer science degree, which can lead to blind spots of how to approach certain types of problems. Gary Bernhardt produces screen casts and articles that aim to teach these principles with code to make them approachable and easy to understand. In this episode Gary discusses his views on the state of software education, both in academia and bootcamps, the theoretical concepts that he finds most useful in his work, and some thoughts on how to build better software.
With libraries such as Tensorflow, PyTorch, scikit-learn, and MXNet being released it is easier than ever to start a deep learning project. Unfortunately, it is still difficult to manage scaling and reproduction of training for these projects. Mourad Mourafiq built Polyaxon on top of Kubernetes to address this shortcoming. In this episode he shares his reasons for starting the project, how it works, and how you can start using it today.
One of the biggest issues facing us is the availability of sustainable energy sources. As individuals and energy consumers it is often difficult to understand how we can make informed choices about energy use to reduce our impact on the environment. Electricity Map is a project that provides up to date and historical information about the balance of how the energy we are using is being produced. In this episode Olivier Corradi discusses his motivation for creating Electricity Map, how it is built, and his goals for the project and his other work at Tomorrow Co.
Email is one of the oldest methods of communication that is still in use on the internet today. Despite many attempts at building a replacement and predictions of its demise we are sending more email now than ever. Recognizing that the venerable inbox is still an important repository of information, Christine Spang co-founded Nylas to integrate your mail with the rest of your tools, rather than just replacing it. In this episode Christine discusses how Nylas is built, how it is being used, and how she has helped to grow a successful business with a strong focus on diversity and inclusion.
Most applications require data to operate on in order to function, but sometimes that data is hard to come by, so why not just make it up? Mimesis is a library for randomly generating data of different types, such as names, addresses, and credit card numbers, so that you can use it for testing, anonymizing real data, or for placeholders. This week Nikita Sobolev discusses how the project got started, the challenges that it has posed, and how you can use it in your applications.
Making computers identify and understand what they are looking at in digital images is an ongoing challenge. Recent years have seen notable increases in the accuracy and speed of object detection due to deep learning and new applications of neural networks. In order to make it easier for developers to take advantage of these techniques Tryo Labs built Luminoth. In this interview Joaquin Alori explains how how Luminoth works, how it can be used in your projects, and how it compares to API oriented services for computer vision.
Learning to program is a rewarding pursuit, but is often challenging. One of the roadblocks on the way to proficiency is getting a development environment installed and configured. In order to simplify that process Aivar Annamaa built Thonny, a Python IDE designed for beginning programmers. In this episode he discusses his initial motivations for starting Thonny and how it helps newcomers to Python learn and understand how to write software.
Maintaining a consistent taxonomy for your music library is a challenging and time consuming endeavor. Eventually you end up with a mess of folders and files with inconsistent names and missing metadata. Beets is built to solve this problem by programmatically managing the tags and directory structure for all of your music files and providing a fast lookup when you are trying to find that perfect song to play. Adrian Sampson began the project because he was trying to clean up his own music collection and in this episode he discusses how the project was built, how streaming media is affecting our relationship to digital music, and how he envisions Beets position in the ecosystem in the future.
Determining the best way to manage the capacity and flow of goods through a system is a complicated issue and can be exceedingly expensive to get wrong. Rather than experimenting with the physical objects to determine the optimal algorithm for managing the logistics of everything from global shipping lanes to your local bank, it is better to do that analysis in a simulation. Ruud van der Ham has been working in this area for the majority of his professional life at the Dutch port of Rotterdam. Using his acquired domain knowledge he wrote Salabim as a library to assist others in writing detailed simulations of their own and make logistical analysis of real world systems accessible to anyone with a Python interpreter.