Have you ever wondered how Google can give a precise answer to your question?
Conventional search engines could only return a list of relevant web pages for a query. Thanks to advances in Machine Reading Comprehension, Transfer Learning, and Language Modeling, however, modern search engines can answer many questions directly. In this article, I will introduce Machine Reading Comprehension.
Machine Reading Comprehension is the task of building a system that understands a passage well enough to answer questions about it. The input to a Reading Comprehension model is a question and a context/passage. The output of the model is the answer, extracted as a span of text from the passage.
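To make the input/output contract concrete, here is a toy illustration (the passage, question, and answer below are made up for demonstration):

```python
# Input: a question and a context passage (made-up toy example).
passage = (
    "Machine Reading Comprehension systems read a passage "
    "and answer questions about it."
)
question = "What do Machine Reading Comprehension systems read?"

# Output: an answer that is a contiguous span of the passage.
answer = "a passage"
assert answer in passage  # the answer must come from the passage itself
```

The key constraint in this (extractive) setting is that the answer is not generated freely; it must appear verbatim in the passage.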
Building any good machine learning model requires a relevant, high-quality dataset. Since 2015, datasets containing more than 100,000 samples (question, passage, and answer triples) have been released.
The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of 100,000+ question-answer pairs posed by crowdworkers on a set of Wikipedia articles. The answer to every question is a segment of text from the corresponding passage.
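A SQuAD record stores the answer text together with its character offset in the passage, which makes the answer span directly recoverable. The snippet below mimics that structure (the context and question here are illustrative, not an actual SQuAD entry):

```python
# A SQuAD-style record (illustrative values, not copied from SQuAD itself).
record = {
    "context": "The Amazon rainforest covers much of the Amazon basin of South America.",
    "question": "What does the Amazon rainforest cover?",
    "answers": [
        {"text": "much of the Amazon basin of South America", "answer_start": 29}
    ],
}

# The answer_start character offset lets us slice the span out of the context.
ans = record["answers"][0]
span = record["context"][ans["answer_start"] : ans["answer_start"] + len(ans["text"])]
assert span == ans["text"]
```

Storing the character offset (rather than just the answer string) removes ambiguity when the answer text occurs more than once in the passage.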
A neural question-answering model (QA model) can be built using SQuAD. The model takes the question and its corresponding passage and predicts, for every token in the passage, the probability of being the start and the end of the answer. The red block in the picture below indicates the index with the highest probability.
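The span-selection step can be sketched in plain Python: given per-token start and end scores (the toy numbers below are hand-picked, not real model output), we take the highest-scoring start index and then the highest-scoring end index at or after it:

```python
# Toy per-token scores for a 6-token passage (made-up numbers, not model output).
tokens = ["BERT", "was", "developed", "by", "Google", "."]
start_scores = [0.1, 0.0, 0.2, 0.1, 3.5, 0.0]
end_scores   = [0.0, 0.1, 0.3, 0.2, 3.1, 0.4]

# Pick the most likely start, then the most likely end at or after the start.
start = max(range(len(tokens)), key=lambda i: start_scores[i])
end = max(range(start, len(tokens)), key=lambda j: end_scores[j])

answer = " ".join(tokens[start : end + 1])
print(answer)  # -> Google
```

Constraining the end index to come at or after the start index is what guarantees the predicted span is well-formed; production systems typically also cap the maximum span length.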
To help the model understand the context of the text, we convert the words in the question and passage into embeddings (vectors) using Language Models.
Language models are probabilistic (statistical) models trained on large corpora, such as Google Books, Wikipedia dumps, or scraped websites, using deep neural network architectures. They learn to capture context and to distinguish between different meanings of the same word.
The idea of Transfer Learning is to take a pre-trained Language Model and fine-tune it on the dataset relevant to our task (in our case, SQuAD).
BERT (Bidirectional Encoder Representations from Transformers) is a Transformer-based language model developed by Google. It stacks 12 encoder layers (BERT Base) or 24 encoder layers (BERT Large) and is pre-trained on the BooksCorpus and English Wikipedia.
To fine-tune BERT for question answering, we add a single linear layer on top of the BERT Base or Large architecture. For each token, this layer produces two outputs: one logit for the answer's start position and one for its end position.
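That QA head amounts to one weight matrix of shape (hidden size x 2) applied to every token's final hidden state. The sketch below uses a hidden size of 4 instead of BERT's 768, with hand-picked weights purely for illustration:

```python
# Toy QA head: hidden size 4 instead of BERT's 768 (illustrative weights only).
hidden_states = [        # one final hidden vector per passage token
    [0.2, 0.1, 0.0, 0.3],
    [1.0, 0.5, 0.2, 0.1],
    [0.0, 0.9, 0.4, 0.2],
]
W = [  # shape (hidden_size x 2): column 0 -> start logit, column 1 -> end logit
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.0, 0.0],
]

def qa_head(h):
    """Project one hidden vector to a (start_logit, end_logit) pair."""
    return [sum(h[i] * W[i][k] for i in range(len(h))) for k in range(2)]

logits = [qa_head(h) for h in hidden_states]
start_logits = [pair[0] for pair in logits]
end_logits = [pair[1] for pair in logits]

best_start = max(range(len(start_logits)), key=lambda i: start_logits[i])
best_end = max(range(len(end_logits)), key=lambda i: end_logits[i])
```

During fine-tuning, the training loss is simply the sum of cross-entropy losses over the start logits and the end logits, using the gold start and end token indices as labels.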
During fine-tuning, the weights of the final layers change the most, since they must adapt to the task (predicting the start and end indices). We can expect smaller changes in the initial layers' weights, as they hold the general language representations BERT learned during pre-training on the BooksCorpus and Wikipedia.
Kudos, you have completed this Introduction to Machine Reading Comprehension. Try the Machine Reading Comprehension demo from AllenNLP.
In the next article, we will discuss scaling Machine Reading Comprehension to long documents. Stay tuned for more articles in the Open Domain Question Answering series!