
Summary
Building machine learning systems and other intelligent applications is a complex undertaking. This often requires retrieving data from a warehouse engine, adding an extra barrier to every workflow. The RelationalAI engine was built as a co-processor for your data warehouse that adds a greater degree of flexibility in the representation and analysis of the underlying information, simplifying the work involved. In this episode CEO Molham Aref explains how RelationalAI is designed, the capabilities that it adds to your data clouds, and how you can start using it to build more sophisticated applications on your data.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Molham Aref about RelationalAI and the principles behind it for powering intelligent applications
Interview
Introduction
How did you get involved in machine learning?
Can you describe what RelationalAI is and the story behind it?
On your site you call your product an "AI Co-processor". Can you explain what you mean by that phrase?
What are the primary use cases that you address with the RelationalAI product?
What are the types of solutions that teams might build to address those problems in the absence of something like the RelationalAI engine?
Can you describe the system design of RelationalAI?
How have the design and goals of the platform changed since you first started working on it?
For someone who is using RelationalAI to address a business need, what does the onboarding and implementation workflow look like?
What is your design philosophy for identifying the balance between automating the implementation of certain categories of application (e.g. NER) vs. providing building blocks and letting teams assemble them on their own?
What are the data modeling paradigms that teams should be aware of to make the best use of the RKGS platform and Rel language?
What are the aspects of customer education that you find yourself spending the most time on?
What are some of the most under-utilized or misunderstood capabilities of the RelationalAI platform that you think deserve more attention?
What are the most interesting, innovative, or unexpected ways that you have seen the RelationalAI product used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on RelationalAI?
When is RelationalAI the wrong choice?
What do you have planned for the future of RelationalAI?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
RelationalAI
Snowflake
AI Winter
BigQuery
Gradient Descent
B-Tree
Navigational Database
Hadoop
Teradata
Worst Case Optimal Join
Semantic Query Optimization
Relational Algebra
HyperGraph
Linear Algebra
Vector Database
Pathway
Data Engineering Podcast Episode
Pinecone
Data Engineering Podcast Episode
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Dec 31, 2023
58 min

Summary
Machine learning and generative AI systems have produced truly impressive capabilities. Unfortunately, many of these applications are not designed with the privacy of end-users in mind. TripleBlind is a platform focused on embedding privacy preserving techniques in the machine learning process to produce more user-friendly AI products. In this episode Gharib Gharibi explains how the current generation of applications can be susceptible to leaking user data and how to counteract those trends.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Gharib Gharibi about the challenges of bias and data privacy in generative AI models
Interview
Introduction
How did you get involved in machine learning?
Generative AI has been gaining a lot of attention and speculation about its impact. What are some of the risks that these capabilities pose?
What are the main contributing factors to their existing shortcomings?
What are some of the subtle ways that bias in the source data can manifest?
In addition to inaccurate results, there is also a question of how user interactions might be re-purposed and potential impacts on data and personal privacy. What are the main sources of risk?
With the massive attention that generative AI has created and the perspectives that are being shaped by it, how do you see that impacting the general perception of other implementations of AI/ML?
How can ML practitioners improve and convey the trustworthiness of their models to end users?
What are the risks for the industry if generative models fall out of favor with the public?
How does your work at TripleBlind help to encourage a conscientious approach to AI?
What are the most interesting, innovative, or unexpected ways that you have seen data privacy addressed in AI applications?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on privacy in AI?
When is TripleBlind the wrong choice?
What do you have planned for the future of TripleBlind?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
TripleBlind
ImageNet Geoffrey Hinton Paper
BERT language model
Generative AI
GPT == Generative Pre-trained Transformer
HIPAA Safe Harbor Rules
Federated Learning
Differential Privacy
Homomorphic Encryption
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Nov 22, 2023
46 min

Summary
Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI powered assistant for software engineers. In this episode Eran Yahav shares the journey that he has taken in building this product, the ways that it enhances the ability of humans to get their work done, and the cases where humans have to adapt to the tool.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Eran Yahav about building an AI powered developer assistant at Tabnine
Interview
Introduction
How did you get involved in machine learning?
Can you describe what Tabnine is and the story behind it?
What are the individual and organizational motivations for using AI to generate code?
What are the real-world limitations of generative AI for creating software? (e.g. size/complexity of the outputs, naming conventions, etc.)
What are the elements of skepticism/oversight that developers need to exercise while using a system like Tabnine?
What are some of the primary ways that developers interact with Tabnine during their development workflow?
Are there any particular styles of software for which an AI is more appropriate/capable? (e.g. webapps vs. data pipelines vs. exploratory analysis, etc.)
For natural languages there is a strong bias toward English in the current generation of LLMs. How does that translate into computer languages? (e.g. Python, Java, C++, etc.)
Can you describe the structure and implementation of Tabnine?
Do you rely primarily on a single core model, or do you have multiple models with subspecialization?
How have the design and goals of the product changed since you first started working on it?
What are the biggest challenges in building a custom LLM for code?
What are the opportunities for specialization of the model architecture given the highly structured nature of the problem domain?
For users of Tabnine, how do you assess/monitor the accuracy of recommendations?
What are the feedback and reinforcement mechanisms for the model(s)?
What are the most interesting, innovative, or unexpected ways that you have seen Tabnine's LLM powered coding assistant used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI assisted development at Tabnine?
When is an AI developer assistant the wrong choice?
What do you have planned for the future of Tabnine?
Contact Info
LinkedIn
Website
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
TabNine
Technion University
Program Synthesis
Context Stuffing
Elixir
Dependency Injection
COBOL
Verilog
MidJourney
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Nov 13, 2023
1 hr 4 min

Summary
Software systems power much of the modern world. For applications that impact the safety and well-being of people there is an extra set of precautions that need to be addressed before deploying to production. If machine learning and AI are part of that application then there is a greater need to validate the proper functionality of the models. In this episode Erez Kaminski shares the work that he is doing at Ketryx to make that validation easier to implement and incorporate into the ongoing maintenance of software and machine learning products.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Erez Kaminski about using machine learning in safety critical and highly regulated medical applications
Interview
Introduction
How did you get involved in machine learning?
Can you start by describing some of the regulatory burdens placed on ML teams who are building solutions for medical applications?
How do these requirements impact the development and validation processes of model design and development?
What are some examples of the procedural and record-keeping aspects of the machine learning workflow that are required for FDA compliance?
What are the opportunities for automating pieces of that overhead?
Can you describe what you are doing at Ketryx to streamline the development/training/deployment of ML/AI applications for medical use cases?
What are the ideas/assumptions that you had at the start of Ketryx that have been challenged/updated as you work with customers?
What are the most interesting, innovative, or unexpected ways that you have seen ML used in medical applications?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Ketryx?
When is Ketryx the wrong choice?
What do you have planned for the future of Ketryx?
Contact Info
Email
LinkedIn
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
Ketryx
Wolfram Alpha
Mathematica
TensorFlow
SBOM == Software Bill Of Materials
Air-gapped Systems
AlexNet
Shapley Values
SHAP
Podcast.__init__ Episode
Bayesian Statistics
Causal Modeling
Prophet
FDA Principles Of Software Validation
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Nov 8, 2023
51 min

Summary
Large language models have gained a substantial amount of attention in the area of AI and machine learning. While they are impressive, there are many applications where they are not the best option. In this episode Piero Molino explains how declarative ML approaches allow you to make the best use of the available tools across use cases and data formats.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Piero Molino about the application of declarative ML in a world being dominated by large language models
Interview
Introduction
How did you get involved in machine learning?
Can you start by summarizing your perspective on the effect that LLMs are having on the AI/ML industry?
In a world where LLMs are being applied to a growing variety of use cases, what are the capabilities that they still lack?
How does declarative ML help to address those shortcomings?
The majority of current hype is about commercial models (e.g. GPT-4). Can you summarize the current state of the ecosystem for open source LLMs?
For teams who are investing in ML/AI capabilities, what are the sources of platform risk for LLMs?
What are the comparative benefits of using a declarative ML approach?
What are the most interesting, innovative, or unexpected ways that you have seen LLMs used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on declarative ML in the age of LLMs?
When is an LLM the wrong choice?
What do you have planned for the future of declarative ML and Predibase?
Contact Info
LinkedIn
Website
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Links
Predibase
Podcast Episode
Ludwig
Podcast.__init__ Episode
Recommender Systems
Information Retrieval
Vector Database
Transformer Model
BERT
Context Windows
LLaMA
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Oct 24, 2023
46 min

Summary
Artificial Intelligence is experiencing a renaissance in the wake of breakthrough natural language models. With new businesses sprouting up to address the various needs of ML and AI teams across the industry, it is a constant challenge to stay informed. Matt Turck has been compiling a report on the state of ML, AI, and Data for his work at FirstMark Capital. In this episode he shares his findings on the ML and AI landscape and the interesting trends that are developing.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
As more people start using AI for projects, two things are clear: It’s a rapidly advancing field, but it’s tough to navigate. How can you get the best results for your use case? Instead of being subjected to a bunch of buzzword bingo, hear directly from pioneers in the developer and data science space on how they use graph tech to build AI-powered apps. Attend the dev and ML talks at NODES 2023, a free online conference on October 26 featuring some of the brightest minds in tech. Check out the agenda and register today at Neo4j.com/NODES.
Your host is Tobias Macey and today I'm interviewing Matt Turck about his work on the MAD (ML, AI, and Data) landscape and the insights he has gained on the ML ecosystem
Interview
Introduction
How did you get involved in machine learning?
Can you describe what the MAD landscape project is and the story behind it?
What are the major changes in the ML ecosystem that you have seen since you first started compiling the landscape?
How have the developments in consumer-grade AI in recent years changed the business opportunities for ML/AI?
What are the coarse divisions that you see as the boundaries that define the different categories for ML/AI in the landscape?
For ML infrastructure products/companies, what are the biggest challenges that they face in engineering and customer acquisition?
What are some of the challenges in building momentum for startups in AI (existing moats around data access, talent acquisition, etc.)?
For products/companies that have ML/AI as their core offering, what are some strategies that they use to compete with "big tech" companies that already have a large corpus of data?
What do you see as the societal vs. business importance of open source models as AI becomes more integrated into consumer facing products?
What are the most interesting, innovative, or unexpected ways that you have seen ML/AI used in business and social contexts?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on the ML/AI elements of the MAD landscape?
When is ML/AI the wrong choice for businesses?
What are the areas of ML/AI that you are paying closest attention to in your own work?
Contact Info
Website
@mattturck on Twitter
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
MAD Landscape
Data Engineering Podcast Episode
FirstMark Capital
Bayesian Techniques
Hadoop
ChatGPT
AutoGPT
Dataiku
Generative AI
Databricks
MLOps
OpenAI
Anthropic
DeepMind
BloombergGPT
HuggingFace
Jexi Movie
"Her" Movie
Synthesia
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Oct 15, 2023
1 hr 2 min

Summary
A core challenge of machine learning systems is getting access to quality data. This often means centralizing information in a single system, but that is impractical in highly regulated industries, such as healthcare. To address this hurdle Rhino Health is building a platform for federated learning on health data, so that everyone can maintain data privacy while benefiting from AI capabilities. In this episode Ittai Dayan explains the barriers to ML in healthcare and how they have designed the Rhino platform to overcome them.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Ittai Dayan about using federated learning at Rhino Health to bring AI capabilities to the tightly regulated healthcare industry
Interview
Introduction
How did you get involved in machine learning?
Can you describe what Rhino Health is and the story behind it?
What is federated learning and what are the trade-offs that it introduces?
What are the benefits to healthcare and pharmacological organizations from using federated learning?
What are some of the challenges that you face in validating that patient data is properly de-identified in the federated models?
Can you describe what the Rhino Health platform offers and how it is implemented?
How have the design and goals of the system changed since you started working on it?
What are the technological capabilities that are needed for an organization to be able to start using Rhino Health to gain insights into their patient and clinical data?
How have you approached the design of your product to reduce the effort to onboard new customers and solutions?
What are some examples of the types of automation that you are able to provide to your customers? (e.g. medical diagnosis, radiology review, health outcome predictions, etc.)
What are the ethical and regulatory challenges that you have had to address in the development of your platform?
What are the most interesting, innovative, or unexpected ways that you have seen Rhino Health used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Rhino Health?
When is Rhino Health the wrong choice?
What do you have planned for the future of Rhino Health?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
Rhino Health
Federated Learning
Nvidia Clara
Nvidia DGX
Melloddy
Flair NLP
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Sep 11, 2023
49 min

Summary
Satellite imagery has given us a new perspective on our world, but it is limited by the field of view for the cameras. Synthetic Aperture Radar (SAR) allows for collecting images through clouds and in the dark, giving us a more consistent means of collecting data. In order to identify interesting details in such a vast amount of data it is necessary to use the power of machine learning. ICEYE has a fleet of satellites continuously collecting information about our planet. In this episode Tapio Friberg shares how they are applying ML to that data set to provide useful insights about fires, floods, and other terrestrial phenomena.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Tapio Friberg about building machine learning applications on top of SAR (Synthetic Aperture Radar) data to generate insights about our planet
Interview
Introduction
How did you get involved in machine learning?
Can you describe what ICEYE is and the story behind it?
What are some of the applications of ML at ICEYE?
What are some of the ways that SAR data poses a unique challenge to ML applications?
What are some of the elements of the ML workflow that you are able to use "off the shelf" and where are the areas that you have had to build custom solutions?
Can you share the structure of your engineering team and the role that the ML function plays in the larger organization?
What does the end-to-end workflow for your ML model development and deployment look like?
What are the operational requirements for your models? (e.g. batch execution, real-time, interactive inference, etc.)
In the model definitions, what are the elements of the source domain that create the largest challenges? (e.g. noise from backscatter, variance in resolution, etc.)
Once you have an output from an ML model how do you manage mapping between data domains to reflect insights from SAR sources onto a human understandable representation?
Given that SAR data and earth imaging is still a very niche domain, how does that influence your ability to hire for open positions and the ways that you think about your contributions to the overall ML ecosystem?
How can your work on using SAR as a representation of physical attributes help to improve capabilities in e.g. LIDAR, computer vision, etc.?
What are the most interesting, innovative, or unexpected ways that you have seen ICEYE and SAR data used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on ML for SAR data?
What do you have planned for the future of ML applications at ICEYE?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
ICEYE
SAR == Synthetic Aperture Radar
Transfer Learning
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
Jun 17, 2023
42 min

Summary
The focus of machine learning projects has long been the model that is built in the process. As AI powered applications grow in popularity and power, the model is just the beginning. In this episode Josh Tobin shares his experience from his time as a machine learning researcher up to his current work as a founder at Gantry, and the shift in focus from model development to machine learning systems.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Josh Tobin about the state of industry best practices for designing and building ML models
Interview
Introduction
How did you get involved in machine learning?
Can you start by describing what a "traditional" process for building a model looks like?
What are the forces that shaped those "best practices"?
What are some of the practices that are still necessary/useful and what is becoming outdated?
What are the changes in the ecosystem (tooling, research, communal knowledge, etc.) that are forcing teams to reconsider how they think about modeling?
What are the most critical practices/capabilities for teams who are building services powered by ML/AI?
What systems do they need to support them in those efforts?
Can you describe what you are building at Gantry and how it aids in the process of developing/deploying/maintaining models with "modern" workflows?
What are the most challenging aspects of building a platform that supports ML teams in their workflows?
What are the most interesting, innovative, or unexpected ways that you have seen teams approach model development/validation?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Gantry?
When is Gantry the wrong choice?
What are some of the resources that you find most helpful to stay apprised of how modeling and ML practices are evolving?
Contact Info
LinkedIn
Website
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
Gantry
Full Stack Deep Learning
OpenAI
Kaggle
NeurIPS == Neural Information Processing Systems Conference
Caffe
Theano
Deep Learning
Regression Model
scikit-learn
Large Language Model
Foundation Models
Cohere
Federated Learning
Feature Store
dbt
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast
May 29, 2023
46 min

Summary
Machine learning models have predominantly been built and updated in a batch modality. While this is operationally simpler, it doesn't always provide the best experience or capabilities for end users of the model. Tecton has been investing in the infrastructure and workflows that enable building and updating ML models with real-time data to allow you to react to real-world events as they happen. In this episode CTO Kevin Stumpf explores the benefits of real-time machine learning and the systems that are necessary to support the development and maintenance of those models.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Kevin Stumpf about the challenges and promise of real-time ML applications
Interview
Introduction
How did you get involved in machine learning?
Can you describe what real-time ML is and some examples of where it might be applied?
What are the operational and organizational requirements for being able to adopt real-time approaches for ML projects?
What are some of the ways that real-time requirements influence the scale/scope/architecture of an ML model?
What are some of the failure modes for real-time vs analytical or operational ML?
Given the low latency between source/input data being generated or received and a prediction being generated, how does that influence susceptibility to e.g. data drift?
Data quality and accuracy also become more critical. What are some of the validation strategies that teams need to consider as they move to real-time?
What are the most interesting, innovative, or unexpected ways that you have seen real-time ML applied?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on real-time ML systems?
When is real-time the wrong choice for ML?
What do you have planned for the future of real-time support for ML in Tecton?
Contact Info
LinkedIn
@kevinmstumpf on Twitter
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
Tecton
Podcast Episode
Data Engineering Podcast Episode
Uber Michelangelo
Reinforcement Learning
Online Learning
Random Forest
ChatGPT
XGBoost
Linear Regression
Train-Serve Skew
Flink
Data Engineering Podcast Episode
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Sponsored By:
Data Council: Join us at the event for the global data community, Data Council Austin. From March 28-30th 2023, we'll play host to hundreds of attendees, 100 top speakers, and dozens of startups that are advancing data science, engineering and AI. Data Council attendees are amazing founders, data scientists, lead engineers, CTOs, heads of data, investors and community organizers who are all working together to build the future of data. As a listener to the Data Engineering Podcast you can get a special discount off tickets by using the promo code dataengpod20. Don't miss out on our only event this year! Visit: [themachinelearningpodcast.com/data-council](https://www.themachinelearningpodcast.com/data-council) Promo Code: dataengpod20
Support The Machine Learning Podcast
Mar 9, 2023
34 min
