NLP Resources
Here are some links to resources about the core concepts of Natural Language Processing (NLP) that will help you get started with Haystack.
What is NLP?
Learn what becomes possible when computational power is applied to language processing.
Title | Type | Author | Description | Level |
---|---|---|---|---|
Natural Language Processing (NLP) | Blog | IBM | High-level introduction to the tasks, tools, and use cases of NLP. | Beginner |
Introduction to NLP | Video | Data Science Dojo | Covers many NLP tasks, from part-of-speech tagging to the creation of word embeddings. Contains some probabilistic notation. | Intermediate |
Text Classification with NLP: Tf-Idf vs Word2Vec vs BERT | Blog with Code | Mauro Di Pietro | Hands-on, in-depth dive into text classification using TF-IDF, Word2Vec, and BERT. | Intermediate |
Search and Question Answering
There are many different flavors of search. Learn the differences between them and understand how the task of question answering can improve the search experience.
Title | Type | Author | Description | Level |
---|---|---|---|---|
Question Answering at Scale With Haystack | Blog | Branden Chan (deepset) | High-level description of the Retriever-Reader pipeline that gives some intuition about how it works and how it can be deployed. | Beginner |
Understanding Semantic Search | Blog | Branden Chan (deepset) | Disambiguates search jargon and explains the differences between various styles of search. | Beginner |
Haystack: The State of Search in 2021 | Blog | Branden Chan (deepset) | Description of the Retriever-Reader pipeline and an introduction to some complementary tasks. | Beginner |
Modern Question Answering Systems Explained | Blog | Branden Chan (deepset) | Illustrated deeper dive into the inner workings of the Reader model. | Beginner |
How to Build an Open-Domain Question Answering System? | Blog | Lilian Weng | Comprehensive look into the inner workings of a Question Answering system. Contains a lot of mathematical notation. | Advanced |
Text Vectorization and Embeddings
In NLP, text is often converted into a sequence of numbers called an embedding. Learn how embeddings are generated and why they are useful.
Title | Type | Author | Description | Level |
---|---|---|---|---|
What Is Text Vectorization? Everything You Need to Know | Blog | Branden Chan (deepset) | High-level overview of text vectorization, from TF-IDF to Transformers. | Beginner |
Word Embeddings for NLP | Blog | Renu Khandelwal | Gives a good intuition for what word embeddings are and how they are used. Contains some helpful illustrations. | Intermediate |
Introduction to Word Embedding and Word2Vec | Blog | Dhruvil Karani | A deeper dive into the CBOW and Skip-gram versions of Word2Vec. | Advanced |
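
To make the idea of turning text into numbers concrete, here is a minimal sketch of TF-IDF vectorization in plain Python. It is an illustration only, not how Haystack or any of the linked resources implement it: the function names `tfidf_vectors` and `cosine_similarity`, the whitespace tokenizer, and the sample sentences are all assumptions made for this example. TF-IDF produces sparse, count-based vectors rather than the learned dense embeddings of Word2Vec or BERT, but it shows the same basic principle that similar texts end up with similar vectors.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors): one TF-IDF vector per input document."""
    # Naive tokenization (an assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))

    vectors = []
    for tokens in tokenized:
        if not tokens:
            # An empty document maps to the zero vector.
            vectors.append([0.0] * len(vocabulary))
            continue
        counts = Counter(tokens)
        vectors.append([
            # Term frequency times inverse document frequency.
            (counts[term] / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "transformers encode text as vectors",
]
vocabulary, vectors = tfidf_vectors(docs)
similar = cosine_similarity(vectors[0], vectors[1])
unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"similar pair: {similar:.3f}, unrelated pair: {unrelated:.3f}")
```

The first two sentences share several words, so their vectors point in a similar direction, while the third shares none and scores zero. Learned embeddings go further by placing texts close together even when they use different words with related meanings.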
BERT and Transformers
Most recent NLP systems use a machine learning architecture called the Transformer. BERT is one of the first models of this kind. Learn why these models were so revolutionary and how they work.
Title | Type | Author | Description | Level |
---|---|---|---|---|
From Language Model to Haystack Reader | Documentation | deepset | High-level overview of how language models, Readers, and prediction heads are related. | Beginner |
Intuitive Explanation of BERT- Bidirectional Transformers for NLP | Blog | Renu Khandelwal | Touches upon many of the concepts that are essential to understanding how Transformers work. | Beginner |
A dummy’s guide to BERT | Blog | Nicole Nair | A good high-level summary of the BERT paper. | Beginner |
Learn About Transformers: A Recipe | Blog | Elvis Saravia | Links to many other resources that give explanations or implementations of the Transformer architecture. | Intermediate |
The Illustrated Transformer | Blog | Jay Alammar | Excellent visualization of the inner workings of Transformer models. Gets quite deep into details. | Advanced |
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) | Blog | Jay Alammar | Excellent visualization of the inner workings of language models. Gets quite deep into details. | Advanced |