Meet Vespa, an Open Source Coronavirus Search Engine

Jonathan Weiss /

Verizon Media has launched a new coronavirus search engine built with Vespa, its open-source big data processing software.

Access to information is essential during the current COVID-19 pandemic. Not only are we interested in how the virus makes us ill, but we also want to know what to do about it.

Luckily, researchers have published over 50,000 articles that address these questions. Yes, that’s a lot of information, and it raises the question: how do we make sense of it all?

That’s where Verizon Media’s Vespa comes in. Vespa is open-source big data processing software, and the company has used it to build a search engine for academic research on the coronavirus.

In a statement to the press, Verizon Media CTO Rathi Murthy said:

“Given our experience with big data at Yahoo, we thought the best way to help was to index the data set and develop a search engine that lets researchers filter and search the 45,000 plus scholarly articles using keywords and simple search terms.”

Here’s how it works.

Using Vespa to Make Sense of Coronavirus Research

The engine works on top of the COVID-19 Open Research Dataset (CORD-19).

With this dataset, medical researchers can conveniently find existing work and develop new insights on ways to fight the virus. The dataset is updated as new papers appear in peer-reviewed publications and archival services.

These include bioRxiv, which hosts biological sciences preprints, and medRxiv, which hosts health sciences preprints. Other documents in the database also link to PubMed, Microsoft Academic, and the WHO COVID-19 database of publications.

Unlike some search engines on the internet today, Vespa combines several methods to find the best answer. Among them, the Verizon search engine uses a pre-trained language model called scibert-nli to search the texts.
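To give a feel for what combining several methods can mean, here is a minimal Python sketch that blends a keyword-overlap score with the similarity of embedding vectors. It is an illustration only, not Vespa's actual ranking: the function names, the `alpha` weight, and the tiny hand-made vectors are all assumptions, and a real system would get its vectors from a model such as scibert-nli.

```python
import math

def keyword_score(query, text):
    """Fraction of query words found in the text (a toy stand-in for keyword matching)."""
    q_terms = query.lower().split()
    doc_terms = set(text.lower().split())
    if not q_terms:
        return 0.0
    return sum(term in doc_terms for term in q_terms) / len(q_terms)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank docs by alpha * keyword score + (1 - alpha) * embedding similarity."""
    scored = []
    for doc in docs:
        score = (alpha * keyword_score(query, doc["title"])
                 + (1 - alpha) * cosine(query_vec, doc["vec"]))
        scored.append((score, doc["title"]))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Hypothetical documents with made-up 3-dimensional "embeddings".
docs = [
    {"title": "Soil chemistry of alpine meadows", "vec": [0.0, 0.1, 0.9]},
    {"title": "Airborne spread of respiratory pathogens", "vec": [0.8, 0.2, 0.1]},
    {"title": "Coronavirus transmission in hospitals", "vec": [0.9, 0.1, 0.0]},
]

ranked = hybrid_rank("coronavirus transmission", [1.0, 0.0, 0.0], docs)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```

In this toy data, the paper on airborne spread shares no words with the query, yet it still ranks above the unrelated soil paper because its vector is close to the query vector. That is the practical benefit of pairing keyword matching with a language model.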

Normally, Verizon uses Vespa for applications that range from article recommendations to ad targeting. However, the company has now keyword-indexed COVID-19 articles to provide easy access to information related to the disease.

More tech-savvy researchers can also access the data directly through the CORD-19 application programming interface (API).
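The article does not describe the API itself, so the following Python sketch rests on an assumption: that the data is served through Vespa's standard HTTP query interface, which accepts `yql`, `query`, and `hits` parameters on a `/search/` path. The host below is a placeholder rather than the real endpoint, and `build_query_url` is a hypothetical helper. The sketch only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Placeholder host (assumption): replace with the endpoint you actually have access to.
BASE_URL = "https://example.invalid/search/"

def build_query_url(user_query, hits=5, base_url=BASE_URL):
    """Build a GET URL in the style of Vespa's query API for a free-text search."""
    params = {
        # userQuery() tells Vespa to use the text passed in the 'query' parameter.
        "yql": "select * from sources * where userQuery();",
        "query": user_query,
        "hits": hits,  # maximum number of results to return
    }
    return base_url + "?" + urlencode(params)

url = build_query_url("coronavirus transmission", hits=3)
print(url)
```

A researcher could paste the resulting URL into a browser or an HTTP client and would typically get the matching documents back as JSON, provided the assumption about the interface holds.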

Sumbo Bello

Sumbo Bello is a creative writer who enjoys creating data-driven content for news sites. In his spare time, he plays basketball and listens to Coldplay.
