
In this episode, I speak with Guy Smoilovsky, my friend, Co-Founder, and the CTO of DagsHub. We talk about quantum computing and AGI, concrete approaches for automating ML deployment, and how DagsHub came to be.
Watch the video: https://www.youtube.com/watch?v=67dByhXPT5g
Join our Discord community: https://discord.gg/tEYvqxwhah
Relevant Links:
➡️ Guy Smoilovsky on LinkedIn – https://www.linkedin.com/in/guy-smoilovsky/
➡️ Guy Smoilovsky on Twitter – https://twitter.com/Guy_T_Sky/
TDD in machine learning – https://towardsdatascience.com/tdd-datascience-689c98492fcc
Recommendation Links:
Astral Codex Ten – https://astralcodexten.substack.com/
Don't Worry About the Vase – https://thezvi.wordpress.com/
The Sandman – https://www.imdb.com/title/tt1751634/
Lady Silver – https://www.ladysilverband.com/
🌐 Check Out Our Website! https://dagshub.com
Social Links:
➡️ LinkedIn: https://www.linkedin.com/company/dagshub
➡️ Twitter: https://twitter.com/TheRealDAGsHub
➡️ Dean Pleban: https://twitter.com/DeanPlbn
Oct 18, 2022
1 hr 20 min

In this episode, I speak with Dean Langsam, Data Scientist at SentinelOne and one of the organizers of PyData in Israel. We chat about imposter syndrome, the best field in machine learning, why XGBoost is the best model, and the fact that most organizations have too much data. It was fascinating for me, so I hope you enjoy it too.
🎬 Watch the video: https://www.youtube.com/watch?v=Akz_PpDdLlQ
Join our Discord community: https://discord.gg/tEYvqxwhah
Relevant Links:
⭐️ Join PyData Tel Aviv: https://pydata.org/telaviv2022/ ⭐️
➡️ Dean Langsam on LinkedIn – https://www.linkedin.com/in/deanla/
➡️ Dean Langsam on Twitter – https://twitter.com/dean_la
Recommendation Links:
🌐 Check Out Our Website! https://dagshub.com
Social Links:
➡️ LinkedIn: https://www.linkedin.com/company/dagshub
➡️ Twitter: https://twitter.com/TheRealDAGsHub
➡️ Dean Pleban: https://twitter.com/DeanPlbn
Sep 16, 2022
1 hr 11 min

In this episode, I had the pleasure of speaking with Jacopo Tagliabue, Director of AI at Coveo. We talk about Reasonable Scale MLOps, how to approach building your ML platform, and how quickly you might hit the limits of model deployment (hint: it's pretty surprising).
Join our Discord community: https://discord.gg/tEYvqxwhah
Relevant Links:
➡️ Jacopo on LinkedIn – https://www.linkedin.com/in/jacopotagliabue/
➡️ Jacopo on Twitter – https://twitter.com/jacopotagliabue
Recommendation Links:
📺The Boys – https://www.imdb.com/title/tt1190634/
📚Gödel, Escher, Bach – https://www.goodreads.com/book/show/24113.G_del_Escher_Bach
📚The Three-Body Problem – https://www.goodreads.com/book/show/20518872-the-three-body-problem
🌐 Check Out Our Website! https://dagshub.com
Social Links:
➡️ LinkedIn: https://www.linkedin.com/company/dagshub
➡️ Twitter: https://twitter.com/TheRealDAGsHub
➡️ Dean Pleban: https://twitter.com/DeanPlbn
Aug 22, 2022
1 hr 20 min

In this episode, I had the pleasure of speaking with Goku Mohandas, founder of Made With ML. Goku has an incredible amount of experience building and teaching the community about machine learning and MLOps systems. We dive into system thinking and solving for ML workflows, his journey in the machine learning world, and how he chooses what to learn next. We discuss the most common mistakes he's seen in productionizing ML models and why building models no one will use is not necessarily bad.
Join our Discord community: https://discord.gg/tEYvqxwhah
Relevant Links:
🤩 Check out Made With ML and thank us later – https://madewithml.com/
➡️ Goku on LinkedIn – https://www.linkedin.com/in/goku/
➡️ Goku on Twitter – https://twitter.com/gokumohandas
🌐 Check Out Our Website! https://dagshub.com
Social Links:
➡️ LinkedIn: https://www.linkedin.com/company/dagshub
➡️ Twitter: https://twitter.com/TheRealDAGsHub
➡️ Dean Pleban: https://twitter.com/DeanPlbn
Jul 18, 2022
1 hr 28 min

In this episode, I had the pleasure of speaking with Kyle Gallatin, a Machine Learning Software Engineer at Etsy. We talk about how he built the machine learning platform at Etsy, experimentation in production (yes, you heard right), and how to optimize model performance at very large scales. It was awesome, and I'm sure many of you can learn a ton from this one!
Join our Discord community: https://discord.gg/tEYvqxwhah
Relevant Links:
➡️ Kyle on LinkedIn – https://www.linkedin.com/in/kylegallatin/
🌐Check Out Our Website! https://dagshub.com
Social Links:
LinkedIn: https://www.linkedin.com/company/dagshub
Twitter: https://twitter.com/TheRealDAGsHub
Dean Pleban: https://twitter.com/DeanPlbn
Jun 20, 2022
58 min

In this episode, I'm speaking with Charlene Chambliss, Software Engineer at Aquarium. Charlene has vast experience getting NLP models to production. We dive into the intricacies of these models and how they differ from other ML subfields, the challenges in productionizing them, and how to get excited about data quality issues.
Join our Discord community: https://discord.gg/tEYvqxwhah
Relevant Links:
➡️Charlene on LinkedIn – https://www.linkedin.com/in/charlenechambliss/
➡️Charlene on Twitter – https://twitter.com/blissfulchar
Recommendations:
🎬3blue1brown – Awesome YouTube channel about math & science: https://www.youtube.com/c/3blue1brown
🎙NLP Highlights – Allen AI Institute podcast about NLP research: https://soundcloud.com/nlp-highlights
🎙Software engineering daily: https://softwareengineeringdaily.com/
🎙TWiML – Another great podcast about machine learning and AI: https://twimlai.com/
📰Sebastian Ruder's blog and newsletter about NLP and ML: https://ruder.io/
📰Taming the Tail: Adventures in Improving AI Economics: https://a16z.com/2020/08/12/taming-the-tail-adventures-in-improving-ai-economics/
📰State of AI report (2021): https://www.stateof.ai/
📕Learn to learn – Ultralearning by Scott Young: https://www.scotthyoung.com/
🌐Check Out Our Website! https://dagshub.com
Social Links:
🟦LinkedIn: https://www.linkedin.com/company/dagshub
🐦Twitter: https://twitter.com/TheRealDAGsHub
🐦Dean Pleban: https://twitter.com/DeanPlbn
May 16, 2022
1 hr 1 min

In this episode, I'm speaking with the one and only Yannic Kilcher! We talk about sunglasses 😎 and about the value and methodology behind taking complex machine learning research and making it accessible and digestible. We also discuss reproducibility in machine learning and moving between research and entrepreneurship.
If you haven't seen his videos you should definitely check them out on his YouTube channel (https://www.youtube.com/c/YannicKilcher).
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Relevant Links:
➡️Yannic's amazing YouTube channel – https://www.youtube.com/c/YannicKilcher
➡️Yannic on LinkedIn – https://www.linkedin.com/in/ykilcher/
➡️Yannic on Twitter – https://twitter.com/ykilcher
Recommendations:
🎬Veritasium – YouTube channel about science with really good explanations about complex topics: https://www.youtube.com/c/veritasium
📖The Fifth Season – Good sci-fi fantasy book: https://www.amazon.com/Fifth-Season-Broken-Earth/dp/0316229296
🌐Check Out Our Website! https://dagshub.com
Social Links:
➡️LinkedIn: https://www.linkedin.com/company/dagshub
➡️Twitter: https://twitter.com/TheRealDAGsHub
➡️Dean Pleban: https://twitter.com/DeanPlbn
Apr 18, 2022
1 hr 3 min

In this episode, we dive into the challenging but very important topic of getting data scientists to write better code, how to approach complex machine learning projects and break them down, and why growing unicorns 🦄 is better than hunting them. Check out this awesome conversation with Laszlo Sragner, Founder at 🔥 Hypergolic.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Podcast intro
01:00 Guest introduction
02:34 Why is writing better code important for data scientists?
03:40 How to improve your code
08:17 Don't be afraid of your code.
10:42 Breaking experiments into manageable pieces
12:35 How did your past experiences teach you to strive for better code?
15:21 Proving better code is worth it
18:07 What could be adopted from software development
23:06 What's the most interesting/challenging part of taking models to production?
27:12 What is the hardest part about building a machine learning model?
29:30 How it looks when it works well – a detailed example
36:23 The difference in writing better code in smaller startups compared to larger organizations
39:18 Laszlo's process for the first iteration in a machine learning project
44:33 Breaking data problems down into vertical slices
47:55 End-To-End Platforms vs. Best-of-breed tools
50:30 Obligatory job title discussion...
53:30 Hunting for data science unicorns
56:33 Traits to look for when building a data science team
58:30 Build vs. Buy? What's better?
59:56 What is the most exciting trend in ML and MLOps?
1:00:47 How do you stay up to date?
1:01:40 Recommendations for the audience
---
Relevant Links:
➡️Laszlo's awesome substack – https://laszlo.substack.com/
➡️Laszlo's LinkedIn – https://www.linkedin.com/in/laszlosragner/
➡️Laszlo's Twitter – https://twitter.com/xLaszlo
Recommendations:
👀Explore/Expand/Extract by Kent Beck: https://www.youtube.com/watch?v=FlJN6_4yI2A
👩💻Code Quality – Refactoring by Martin Fowler: https://martinfowler.com/books/refactoring.html
📐Geometric Deep Learning by Bronstein/Velickovic: https://www.youtube.com/watch?v=5h6MbQ_65-o
✍️Online Writing by Nicolas Cole: https://www.youtube.com/watch?v=Od5J2V-Lmlg
📕The Last Shadow by Orson Scott Card: https://www.goodreads.com/en/book/show/7108926-the-last-shadow
🎬7 minutes, 26 seconds, and the Fundamental Theorem of Agile Software Development: https://www.youtube.com/watch?v=WSes_PexXcA
🌐Check Out Our Website! https://dagshub.com
Social Links:
➡️LinkedIn: https://www.linkedin.com/company/dagshub
➡️Twitter: https://twitter.com/TheRealDAGsHub
➡️Dean Pleban: https://twitter.com/DeanPlbn
Feb 14, 2022
1 hr 5 min

In this episode, I'm speaking with Lee Harper, Principal Data Scientist at Catapult Systems. Lee holds a Ph.D. in Physical and Theoretical Chemistry and is a teacher-turned-data scientist. We cover the various entry paths into the world of data science, the value of background diversity, security in ML production, and even AI fairness.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Podcast intro
01:00 Guest introduction
01:39 How did you get into the fields of data science and machine learning?
05:04 Coding boot camps vs. academia & diversity of backgrounds in ML
09:37 How has the process of bringing your work into production changed over the years?
13:02 How has the change in the languages used for data science affected production processes?
16:01 How do you accelerate the timeframes for getting from POC to production in ML?
18:19 Do data scientists reinvent the wheel more often than software developers, and why?
22:14 The value of learning how to Google
23:00 Recurring themes, challenges, and common issues in data science
27:50 Solving for security in ML in production
31:57 ML security considerations for startups
34:30 Data security considerations in ML
35:18 What is the most interesting topic in machine learning right now?
38:05 ML fairness, bias, and responsible AI
41:44 What does it mean to build a fair or unbiased model?
47:15 If you had to choose one challenge in bringing models to production, what would it be?
51:00 What are the tools and processes that you use to make the transition to production easier?
55:35 About "vendor lock-in"
58:00 Your favorite tool recommendations
1:03:35 Recommendations for the audience
---
Relevant Links:
Linux Command Line and Shell Scripting Bible – https://www.amazon.com/Linux-Command-Shell-Scripting-Bible/dp/1119700914
Project Hail Mary – https://www.amazon.com/Project-Hail-Mary-Andy-Weir/dp/0593135202
Social Links:
https://www.linkedin.com/company/dagshub/
https://www.linkedin.com/company/catapult-systems/
https://www.linkedin.com/in/leeharper2425/
https://twitter.com/DeanPlbn
https://twitter.com/TheRealDAGsHub
Nov 4, 2021
1 hr 8 min

🧠 Algorithmic challenges in bringing ML models into production with Roey Mechrez, CTO at BeyondMinds
In this episode, I'm speaking with Roey Mechrez from BeyondMinds. Roey holds a Ph.D. in Electrical Engineering, with vast experience in computer vision and deep learning research. We discuss the challenges of gluing together infrastructure solutions for an end-to-end ML platform, as well as generating monitoring insights for non-technical stakeholders and combating catastrophic forgetting.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Podcast intro
01:00 Guest intro
01:49 What does BeyondMinds do?
06:24 Audience for an end-to-end ML platform
12:14 Communicating with non-technical stakeholders/users
15:03 The future of "AI-powered tools", and human-machine collaboration
20:04 On complex system orchestration, generating insights from monitoring, and catastrophic forgetting – Biggest challenges in production ML
25:23 Why is catastrophic forgetting a hard problem and how do you deal with it?
30:02 "Secret" tips on how to get started with automating the retraining process
33:30 Generating monitoring insights and observations in a user-friendly format
38:12 Making data labeling issues explainable (automatically)
45:07 Customizing complex systems per user – Orchestrating an ML platform
52:58 API design in ML platform components
55:45 Measuring success for researchers, ML engineers, and software developers – Can ML work fit into the Agile workflow?
1:02:22 Is "time to production" a good metric? Gains in time to production in the real world
1:06:02 How do you divide the work between ML researchers and engineers?
1:08:39 Recommendations for the audience
---
Relevant Links:
A16z blog about AI
Data Science work in an agile environment – A talk by Dima Goldenberg
Hayot Kis (Hebrew Podcast) חיות כיס
Data Engineering Podcast
ACX Podcast
Social Links:
https://www.linkedin.com/company/beyondminds/
https://www.linkedin.com/company/dagshub/
https://twitter.com/roeyme
https://twitter.com/DeanPlbn
https://twitter.com/TheRealDAGsHub
Sep 20, 2021
1 hr 13 min
