Researchers develop question-answering dataset for evaluating NLP tools for COVID-19

April 28, 2020
Very important

In a preprint paper, researchers from the University of Waterloo and other organizations used the COVID-19 Open Research Dataset (CORD-19) to develop a question-answering dataset for evaluating the information retrieval capabilities of NLP tools. The dataset included 124 question-answer pairs, each consisting of a keyword query and a natural language question linked to the text in a document that contains the answer. Because the dataset was too small to train models from scratch, the best-performing models tested against the benchmark relied on transfer learning, fine-tuning NLP models that had been pretrained without supervision. Clients should expect efforts like this one, along with the forthcoming effort from NIST, to play a key role in improving AI tools used for COVID-19 research.


Further Reading

Data cleaning adjacencies: Identifying automation opportunities in the broader data preparation process

Analyst Insight | June 29, 2020

In our earlier insights "Data Cleaning 101" and "Emerging data cleaning solutions," Lux discussed the foundational elements of dirty data as well as the various solutions organizations can implement to automate data cleaning. However, as evident from these insights, only a handful of successful ...

Deep Learning

Technology | January 07, 2021

Advanced machine learning techniques making use of neural networks for much‑improved computer vision, speech recognition, natural language processing, filtering, and more.

Six key AI technology trends emerging in 2021

Analyst Insight | May 12, 2021

Artificial intelligence (AI) technology trends evolve quickly, and staying on top of these trends is a critical part of planning for those developing, adopting, or investing in AI. From our research and conversations with many AI and machine learning stakeholders, we've identified six key trends ...