Active learning and weak supervision in NLP projects

Successful artificial intelligence solutions require large amounts of high-quality labeled data, which most projects do not have. Weak supervision and active learning can help you optimize the labeling process and address this shortage of labels.

First, we will review how active learning can significantly reduce the amount of labeled data needed to train classic models, and show how active learning methods can be tailored to a specific NLP task by using text embeddings.
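The talk itself does not include code, but the core active learning loop can be sketched briefly. A minimal, assumed example of uncertainty sampling: the model scores the unlabeled pool (in an NLP setting, a classifier over text embeddings would produce these probabilities), and the least confident examples are sent for labeling. The function name and toy data are illustrative, not from the talk.

```python
import numpy as np

def uncertainty_sampling(probs: np.ndarray, k: int) -> np.ndarray:
    """Pick the k unlabeled examples whose top-class probability is lowest.

    probs: (n_samples, n_classes) predicted class probabilities from the
    current model; k: labeling budget for this round.
    Returns indices of the examples to send to annotators.
    """
    confidence = probs.max(axis=1)      # confidence in the predicted class
    return np.argsort(confidence)[:k]   # least confident first

# Toy round: the model is least sure about row 1, so it is queried first.
probs = np.array([[0.95, 0.05],
                  [0.55, 0.45],
                  [0.80, 0.20]])
query = uncertainty_sampling(probs, k=1)
```

Each round, the newly labeled examples are added to the training set, the model is retrained, and the pool is re-scored, so the labeling budget goes to the examples the model finds hardest.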

With weak supervision, we will see how a handful of simple rules can automatically produce a large training dataset and strong model performance without any manual labeling.
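To make the "simple rules" idea concrete, here is a minimal, assumed weak-supervision sketch in the spirit of labeling-function frameworks such as Snorkel: each rule votes a label or abstains, and votes are combined by majority to produce a noisy training label. The rules and task (spam vs. ham) are hypothetical examples, not from the talk.

```python
from collections import Counter

ABSTAIN = -1
HAM, SPAM = 0, 1

def lf_contains_free(text):
    # Rule: the word "free" hints at spam.
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_contains_meeting(text):
    # Rule: the word "meeting" hints at legitimate mail.
    return HAM if "meeting" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    # Rule: three or more "!" hints at spam.
    return SPAM if text.count("!") >= 3 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_contains_meeting, lf_many_exclamations]

def weak_label(text):
    """Combine labeling-function votes by simple majority; abstain if no rule fires."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]

labels = [weak_label(t) for t in ["Free money!!!", "Team meeting at 3pm"]]
```

Frameworks like Snorkel replace the majority vote with a learned label model that weights rules by their estimated accuracy, but the workflow is the same: write rules once, label the whole corpus automatically.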

Finally, we will combine active learning and weak supervision, taking advantage of both techniques to achieve the best metrics.

Mariia Havrylovych
Wix
  • Data scientist on the machine learning team at Wix
  • Specializes in NLP solutions; has experience building scalable NLP-based recommendation systems and end-to-end ML applications
  • Interested in solving low-resource problems and in unsupervised techniques