
All About Google's new NLP Model BERT

Kheng Guan Toh / Shutterstock.com

The lack of data needed for model training is a significant challenge in the field of natural language processing (NLP).

To learn a specific task, deep learning-based NLP models need vast numbers of human-annotated training examples.

Researchers have been looking for ways to overcome this challenge by tapping the mountains of unlabeled text available on the web. Their answer is pre-training.

Models are first pre-trained on large text dumps (like Wikipedia), then fine-tuned for a specific NLP task using labeled data. Enter Google's NLP model BERT and its new lite version, ALBERT.

Read More: 3 Amazing Natural Language Processing Applications

A pre-trained NLP model already understands the language and doesn't have to start from scratch on the task at hand. And because pre-training is unsupervised, these language models can learn from raw text alone, then outperform traditional models once fine-tuned.

In October 2018, Google open-sourced a new method for NLP model pre-training called BERT, short for Bidirectional Encoder Representations from Transformers.
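BERT's pre-training objective is masked language modelling: some input tokens are hidden, and the model learns to predict them from the context on both sides (hence "bidirectional"). Below is a minimal sketch of the masking step, not Google's implementation; tokenization is simplified, and the 15 percent masking rate is the only detail taken from BERT itself:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """BERT-style masking: hide a fraction of tokens; the model is
    trained to recover them from the surrounding left AND right context."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)  # hide the token from the model
            labels.append(tok)         # ...but keep it as the training target
        else:
            masked.append(tok)
            labels.append(None)        # no prediction needed at this position
    return masked, labels

sentence = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(sentence)
```

Because this objective needs no human labels, any raw text corpus, from Wikipedia to web crawls, can serve as training data.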

BERT, said Google, allows anyone in the world to train their own state-of-the-art NLP model for a variety of tasks in a few hours using a single GPU. Using a Cloud Tensor Processing Unit (TPU), training takes even less time, about 30 minutes.

Meet Google’s new NLP Model – ALBERT

Now, Google has launched a lite version of BERT, called ALBERT, introduced as A Lite BERT for Self-Supervised Learning of Language Representations.

With this upgrade to BERT, Google’s new deep learning-based NLP model achieved SOTA (state-of-the-art) performance on 12 popular NLP tasks, like question-answering and reading comprehension.

In a paper, the BERT team presented “two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better compared to the original BERT.”
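The first of those two techniques is factorized embedding parameterization: instead of one large vocabulary-by-hidden-size embedding matrix, ALBERT splits it into two smaller matrices. Here is a back-of-the-envelope parameter count, using illustrative BERT-base-like sizes (30k vocabulary, 768 hidden units, 128-dimensional factorized embeddings):

```python
def embedding_params(vocab_size, hidden_size, factor_size=None):
    """Parameter count of the token-embedding block.

    BERT ties the embedding width to the hidden size: one V x H matrix.
    ALBERT factorizes it into V x E plus E x H, a big saving when E << H.
    """
    if factor_size is None:  # BERT-style: a single V x H matrix
        return vocab_size * hidden_size
    return vocab_size * factor_size + factor_size * hidden_size

bert_emb = embedding_params(30_000, 768)         # 23,040,000 parameters
albert_emb = embedding_params(30_000, 768, 128)  #  3,938,304 parameters
```

With these sizes, the factorization alone trims roughly 19 million parameters from the embedding block.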

As the name suggests, ALBERT is a leaner version of BERT. Basically, it’s the same language representation model, with about the same accuracy, but much faster and with 89 percent fewer parameters.

Thanks to the two optimization techniques, the base ALBERT model comes in at only 12M parameters, while the comparable BERT model has 108M. ALBERT averages 80.1% accuracy across several NLP benchmarks, only slightly below BERT's 82.3%.
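The second technique is cross-layer parameter sharing: every layer of the encoder reuses the same weights, so a 12-layer stack stores only one layer's parameters. A toy illustration, with plain matrices standing in for real transformer layers:

```python
import math
import random

def toy_layer(x, w):
    """One toy 'encoder layer': matrix multiply then a tanh nonlinearity."""
    return [math.tanh(sum(xi * wij for xi, wij in zip(x, col)))
            for col in zip(*w)]

def run_stack(x, layers):
    """Apply the layers in sequence, like a transformer encoder stack."""
    for w in layers:
        x = toy_layer(x, w)
    return x

random.seed(0)
def rand_matrix(n):
    return [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]

# BERT-style: 12 layers, each with its own weights -> 12 matrices stored
bert_stack = [rand_matrix(4) for _ in range(12)]

# ALBERT-style: one weight matrix reused at every depth -> 1 matrix stored
shared = rand_matrix(4)
albert_stack = [shared] * 12

bert_params = sum(len(w) * len(w[0]) for w in bert_stack)  # 192
albert_params = len(shared) * len(shared[0])               # 16
```

Sharing keeps the network just as deep while storing a fraction of the weights, which, together with the embedding factorization, is where ALBERT's parameter cut comes from.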

The team also trained an ALBERT-xxlarge (double-extra-large) model, which achieved an overall 30% parameter reduction and performed significantly better on benchmarks than the BERT-large model.

“The success of ALBERT,” said Google, “demonstrates the importance of identifying the aspects of a model that give rise to powerful contextual representations. By focusing improvement efforts on these aspects of the model architecture, it is possible to greatly improve both the model efficiency and performance on a wide range of NLP tasks.”

Besides the English language-based version of ALBERT, Google has also released Chinese-language ALBERT models.

To power further NLP research, Google has made ALBERT open-source; both ALBERT’s code and pre-trained models are available on this GitHub page.

Read More: Google BERT: All you Need to Know About Google Latest Update


Zayan Guedim

Trilingual poet, investigative journalist, and novelist. Zed loves tackling the big existential questions and all-things quantum.
