From encoding to machine learning a story of managing media data at Netflix
Have you heard of “software 2.0”? If so, you probably know that managing your dataset is as important as managing your code. Netflix is building a modern media data management system, that is called to address this and some other problems. Curiously enough, it originated form world of video encoding.
Sometime in 2017 Netflix encoding team realized there was neither a common format, not a storage service to access media (spatio-temporal) metadata. Zoo of custom formats for every other team/project/service was getting out of hands. We looked around Netflix, at other companies, and, of course, open source. Nothing seemed to fit the bill. After NMDB became a reality it git some more curious use cases. In this talk I am going to point lessons I learned while implementing NMDB and argue a platform like it could address dataset management issues as well.