Finding the Good with Watson Discovery and NLU

Gaurav Kumbhat
IBM Data Science in Practice
4 min read · Apr 21, 2020

Co-authored with Evaline Ju

Whenever we check the news at any level, from local to international, we see headlines about the COVID-19 pandemic. Most of these articles report new cases, death counts, and government policies, all of which contribute to an increasingly negative atmosphere. Amid this coverage, it is often hard to find signs of progress in fighting the coronavirus, such as research into cures and vaccines or patient recovery stories.

We wanted to find more positive news about the pandemic using the sentiment analysis capability of Watson Natural Language Understanding (NLU). However, as developers, we know that the stock sentiment models are not trained on healthcare data, which can limit their ability to identify positive news articles related to the pandemic. Our sentiment models are trained on social media and user review data, where any mention of “coronavirus” may trigger negative sentiment. Domains like healthcare have their own vocabulary and usage conventions, which differ from those of social media. We therefore decided to use our new sentiment customization capability to train a model that performs better on healthcare data and can identify these positive articles.

Combining Watson Discovery and NLU

We gathered our corpus of articles using Watson Discovery News, accessible through an Advanced instance of Watson Discovery.

For timeliness, we filtered for articles published on a given day that mention “COVID” or “Coronavirus”. Below is an example query with the Discovery client:

# Add imports
from datetime import datetime
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson import DiscoveryV1
# Today's date in a format like 2020-04-13
today = datetime.today().strftime('%Y-%m-%d')
# Supply your Discovery instance API key here
discovery_apikey = "<YOUR-DISCOVERY-API-KEY>"
# Supply your Discovery instance URL here
discovery_url = 'https://gateway.watsonplatform.net/discovery/api'
# Authenticate
# https://github.com/watson-developer-cloud/python-sdk#iam
discovery_client = DiscoveryV1(
    version='2020-04-07',
    authenticator=IAMAuthenticator(discovery_apikey))
# Set the service URL for the client
discovery_client.set_service_url(discovery_url)
# Supply the environment ID with news we are using
environment_id = 'system'
# Supply the collection ID with news we are using
collection_id = 'news-en'
# Send a query, filtering on documents with a publication date of
# today and with mentions of coronavirus or covid.
# Adjacent string literals are joined, so each expression is one string.
result = discovery_client.query(
    environment_id,
    collection_id,
    filter='publication_date:"%s",'
           '(text:"coronavirus"|text:"covid")' % today,
    query='(enriched_text.entities.text:"covid"|'
          'enriched_text.entities.text:"coronavirus")',
    offset=50,  # skip the first 50 matches, i.e. fetch the second page
    count=50)   # maximum number of results per request

Watson Discovery News returns at most 50 results per request, so we needed to use the offset parameter to page through the results.
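The paging described here can be sketched roughly as follows. This is a minimal illustration, not the exact code we ran: `iter_results`, `fetch_page`, and `max_results` are hypothetical names introduced for the example. In practice, `fetch_page` would wrap a call such as `discovery_client.query(..., offset=offset, count=count).get_result()["results"]`.

```python
from typing import Callable, Dict, Iterator, List

PAGE_SIZE = 50  # Discovery News returns at most 50 results per request


def iter_results(fetch_page: Callable[[int, int], List[Dict]],
                 max_results: int = 500) -> Iterator[Dict]:
    """Yield results page by page.

    fetch_page(offset, count) is a hypothetical callable that returns
    the list of results for one request. Paging stops when a page comes
    back shorter than requested or when max_results has been reached.
    """
    offset = 0
    while offset < max_results:
        count = min(PAGE_SIZE, max_results - offset)
        page = fetch_page(offset, count)
        yield from page
        if len(page) < count:
            break  # last page reached
        offset += len(page)
```

Passing the request in as a function keeps the loop independent of the client, so it can be tried out with a stub before spending API calls. The `max_results` cap is an assumption added to keep the loop bounded.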

Next, we trained and deployed a custom sentiment model in a separate NLU instance using annotated health data. We then analyzed the URLs from our Discovery News query results with the custom NLU sentiment model to identify positive news articles.
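As a rough sketch of this last step, the filtering might look like the code below. It assumes that each analysis returns a dictionary in the shape of an NLU sentiment response, with the label under `sentiment.document.label`. The names `positive_urls` and `analyze` are hypothetical; `analyze` stands in for a wrapper around the NLU client's `analyze` call with the custom sentiment model, which is not shown here.

```python
from typing import Callable, Dict, Iterable, List


def positive_urls(urls: Iterable[str],
                  analyze: Callable[[str], Dict]) -> List[str]:
    """Return the URLs whose document-level sentiment label is positive.

    analyze(url) is a hypothetical callable that returns an NLU-style
    response dictionary, or raises if the page cannot be analyzed.
    """
    positives = []
    for url in urls:
        try:
            response = analyze(url)
        except Exception:
            # Assumption: pages that cannot be analyzed are skipped
            continue
        document = response.get("sentiment", {}).get("document", {})
        if document.get("label") == "positive":
            positives.append(url)
    return positives
```

Skipping failures rather than stopping is a simplification for the sketch; a real run would probably log which URLs were dropped.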

Some Positive News

Among the URLs we were able to analyze, our collection included research results and on-the-ground stories from hospitals.

Here are some examples of news that our model identified as expressing positive sentiment, all published on April 13:

Buffalo police showing support to health care workers

“It was a humbling sight on Friday that will make your heart full. Over two dozen police officers in their cars lined the parking lot at ECMC to show their support of the doctors and nurses that are working 24/7 to care for people, not only affected by the Coronavirus, but by all of the other emergencies that continue to happen.”

Beginning studies of baricitinib as a treatment for COVID-19 by Eli Lilly and Co.

“Eli Lilly and Co. has entered into an agreement with the National Institute of Allergy and Infectious Diseases (NIAID), to study baricitinib as an arm in NIAID’s Adaptive COVID-19 Treatment Trial. The study will evaluate safety and efficacy of baricitinib as a potential treatment for hospitalized patients with COVID-19….Given the inflammatory cascade seen in COVID-19, baricitinib’s anti-inflammatory activity has been hypothesized to have a potential benefit in COVID-19 and warrants further study in patients with this infection…..”

Patient recovery stories in Odisha, India

“Bhubaneswar: Two COVID-19 survivors, who have been discharged from hospitals, after their complete recovery urged people Monday not to lose hope on the system….Apart from abiding the guidelines, one should develop self-confidence to defeat the disease, the Jajpur survivor informed and added that doctors, nurses and family members boosted his mental courage which helped him a lot…”

Through these challenging times, we hope that shedding some light on positive news from around the world can ease some of the fear and isolation people may be feeling.

Interested in how to do analyses like this yourself? You can find more information on how to create your own custom sentiment model with NLU here. You can also check out Discovery News here!

Want to share what you would analyze with a custom sentiment model? Leave us a comment!
