Scaling ML from 0 to millions of users

Talk video

Talk presentation

Data scientists and machine learning engineers use a variety of open source projects in their everyday tasks: scikit-learn, SparkML, TensorFlow, Apache MXNet, PyTorch, etc. These projects make it very easy to get started, but as models become more complex and datasets become larger, training time and prediction latency become significant concerns. Here too, containers can help, especially when used with elastic on-demand compute services.

In this session, I'll show you how to scale machine learning workloads using containers on AWS (Deep Learning AMI and containers, ECS, EKS, SageMaker). We'll discuss the pros and cons of these different services from technical, operational and cost perspectives. Of course, we'll run some demos :)

Julien Simon

Amazon Web Services

Global AI & Machine Learning Evangelist.
Frequently speaks at conferences and blogs actively on Medium.
Prior to joining AWS, Julien served for 10 years as CTO/VP Engineering in top-tier web startups, where he led large software and ops teams.
In the process, he fought his way through a wide range of technical, business and procurement issues, which helped him gain a deep understanding of physical infrastructure, its limitations and how cloud computing can help.
Holds ten AWS certifications.
Twitter, GitLab

