
A lot of things can go (and have gone) wrong when folks try to apply data science in their projects. So how might we prevent that? Maybe what we need to do is look at the medical profession and its practice of running through checklists before surgery.
Jul 17, 2024
1 hr 5 min

Historically, the usual way to store a trained scikit-learn model on disk for deployment has been a pickle file. Pickles make sense because they are so flexible, but they do carry a security concern: loading a pickle file can execute arbitrary code. Adrin has been working on a remedy called skops, which is the main topic of this podcast. To learn more about skops, make sure to check the documentation: https://skops.readthedocs.io/en/stable/
Jun 27, 2024
1 hr 1 min
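The security concern with pickle files mentioned in the skops episode description above can be shown with a minimal sketch that uses only the Python standard library. The `Payload` class is hypothetical and exists purely for illustration; a harmless `eval` of an arithmetic expression stands in for whatever code an attacker might choose to run.

```python
import pickle


class Payload:
    """Hypothetical object that controls what happens when it is unpickled."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # A harmless expression stands in for arbitrary attacker code.
        return (eval, ("6 * 7",))


# Serialising looks innocent: the result is just a bytes object.
blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls eval("6 * 7") instead.
result = pickle.loads(blob)

print(type(result).__name__, result)
```

The loaded value is whatever the embedded call returns, not the object that was saved, which is why a pickle file from an untrusted source should never be loaded. The skops documentation linked above describes how its own persistence format approaches this problem.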

Leland McInnes is known for a lot of packages. There's UMAP, but also PyNNDescent and HDBSCAN. Recently he's also been working on tools to help visualise clusters of data, and he's cooking up something new that's related to nearest neighbor algorithms. This interview touches on all of these topics. If you're interested in learning more about the MoMA exhibition, it was by Refik Anadol: https://refikanadol.com/ and this was the work at MoMA: https://refikanadol.com/works/unsupervised/. The other artist was Kyle McDonald: https://kylemcdonald.net/ and the piece we mentioned was this one: https://www.youtube.com/watch?v=04DqdT0-NtI.
May 30, 2024
57 min

Ibis is a Python library that offers a single dataframe API that can run your queries on many different backends. These include databases like Postgres, but also commercial platforms like BigQuery and Snowflake. This ability to control multiple backends from a single API has a lot of use-cases, as well as maintainer challenges, all of which are discussed in this episode. To learn more about Ibis, check out the docs here: https://ibis-project.org/ If you're attending PyCon US this year, you may be interested in Philip's talk: https://us.pycon.org/2024/schedule/presentation/55/ During the podcast, Philip also mentioned a blogpost about DuckDB, here: https://ibis-project.org/posts/why-duckdb/ There was also a dogfooding blogpost, which is this one: https://ibis-project.org/posts/ci-analysis/
May 2, 2024
1 hr 4 min

In this (first!) episode of Sample Space we talk to Trevor Manz, the creator of anywidget. It's a (neat!) tool to help you build more interactive notebooks by letting you apply just enough JavaScript to get bidirectional communication working in your favorite notebook environment. That means that Python can talk to widgets, but also that widgets can talk to Python. There's a lot to like about these widgets and we're doing a proper deep dive in this first episode. To learn more about anywidget, check out the docs. In particular you may want to glance at the gallery first; it has loads of nice examples. You can also find the project on GitHub, and if you're eager to talk to folks involved with the project, consider joining the Discord here.
Apr 11, 2024
1 hr 11 min

