Data Science Deployed
@dsdeployed
Data Versioning for Data Science
51 minutes Posted Oct 20, 2021 at 12:49 pm.
Show notes

Today we talk about data versioning: why you should do it, what to do about humans in the loop, and how to minimize mistakes.

 

Tools mentioned:

 

DVC - https://dvc.org/

Quilt Data Versioning - https://quiltdata.com/

Apache Airflow - https://airflow.apache.org/

Apache Superset - https://superset.apache.org/

OpenProject - https://www.openproject.org/

 

----------------------------------------

 

Follow the podcast on Twitter: @dsdeployed

https://twitter.com/dsdeployed

 

----------------------------------------

 

Donny Winston

 

I help researchers do data-intensive science together.

Twitter: https://twitter.com/donnywinston @donnywinston

Email: [email protected]

Website: https://polyneme.xyz/

LinkedIn: https://www.linkedin.com/in/donnywinston/

 

Ben Cook

I help data science teams deploy their algorithms because a machine learning model is only as good as the system that delivers it.

 

Twitter: @jbencook https://twitter.com/jbencook

LinkedIn: https://www.linkedin.com/in/jbencook/

Email: [email protected]

Website: https://sparrow.dev/

 

Jillian Rowe

I help biotech startups deploy scalable high performance compute infrastructure on AWS.

 

Email: [email protected] 

Website: https://www.dabbleofdevops.com

Twitter: www.twitter.com/jillianerowe

LinkedIn: https://www.linkedin.com/in/jillian-rowe-9410437a/