Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource. / Krawczyk, Konrad; Chelkowski, Tadeusz; Laydon, Daniel J.; Mishra, Swapnil; Xifara, Denise; Flaxman, Seth; Flaxman, Seth; Mellan, Thomas; Schwämmle, Veit; Röttger, Richard; Hadsund, Johannes T.; Bhatt, Samir.

In: Journal of Medical Internet Research, Vol. 23, No. 6, e28253, 06.2021.

Harvard

Krawczyk, K, Chelkowski, T, Laydon, DJ, Mishra, S, Xifara, D, Flaxman, S, Flaxman, S, Mellan, T, Schwämmle, V, Röttger, R, Hadsund, JT & Bhatt, S 2021, 'Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource', Journal of Medical Internet Research, vol. 23, no. 6, e28253. https://doi.org/10.2196/28253

APA

Krawczyk, K., Chelkowski, T., Laydon, D. J., Mishra, S., Xifara, D., Flaxman, S., Flaxman, S., Mellan, T., Schwämmle, V., Röttger, R., Hadsund, J. T., & Bhatt, S. (2021). Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource. Journal of Medical Internet Research, 23(6), Article e28253. https://doi.org/10.2196/28253

Vancouver

Krawczyk K, Chelkowski T, Laydon DJ, Mishra S, Xifara D, Flaxman S et al. Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource. Journal of Medical Internet Research. 2021 Jun;23(6):e28253. https://doi.org/10.2196/28253

Author

Krawczyk, Konrad; Chelkowski, Tadeusz; Laydon, Daniel J.; Mishra, Swapnil; Xifara, Denise; Flaxman, Seth; Flaxman, Seth; Mellan, Thomas; Schwämmle, Veit; Röttger, Richard; Hadsund, Johannes T.; Bhatt, Samir. / Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource. In: Journal of Medical Internet Research. 2021; Vol. 23, No. 6.

Bibtex

@article{8e3533daf24c46c78d934317db33431b,
title = "Quantifying online news media coverage of the COVID-19 pandemic: Text mining study and resource",
abstract = "Background: Before the advent of an effective vaccine, nonpharmaceutical interventions, such as mask-wearing, social distancing, and lockdowns, have been the primary measures to combat the COVID-19 pandemic. Such measures are highly effective when there is high population-wide adherence, which requires information on current risks posed by the pandemic alongside a clear exposition of the rules and guidelines in place. Objective: Here we analyzed online news media coverage of COVID-19. We quantified the total volume of COVID-19 articles, their sentiment polarization, and leading subtopics to act as a reference to inform future communication strategies. Methods: We collected 26 million news articles from the front pages of 172 major online news sources in 11 countries (available online at SciRide). Using topic detection, we identified COVID-19–related content to quantify the proportion of total coverage the pandemic received in 2020. The sentiment analysis tool VADER was employed to stratify the emotional polarity of COVID-19 reporting. Further topic detection and sentiment analysis were performed on COVID-19 coverage to reveal the leading themes in pandemic reporting and their respective emotional polarizations. Results: We found that COVID-19 coverage accounted for approximately 25.3\% of all front-page online news articles between January and October 2020. Sentiment analysis of English-language sources revealed that overall COVID-19 coverage was not exclusively negatively polarized, suggesting wide heterogeneous reporting of the pandemic. Within this heterogeneous coverage, 16\% of COVID-19 news articles (or 4\% of all English-language articles) can be classified as highly negatively polarized, citing issues such as death, fear, or crisis. Conclusions: The goal of COVID-19 public health communication is to increase understanding of distancing rules and to maximize the impact of governmental policy. The extent to which the quantity and quality of information from different communication channels (eg, social media, government pages, and news) influence public understanding of public health measures remains to be established. Here we conclude that a quarter of all reporting in 2020 covered COVID-19, which is indicative of information overload. In this capacity, our data and analysis form a quantitative basis for informing health communication strategies along traditional news media channels to minimize the risks of COVID-19 while vaccination is rolled out.",
keywords = "COVID-19, Infoveillance, Public health, Sentiment analysis, Text mining",
author = "Konrad Krawczyk and Tadeusz Chelkowski and Laydon, {Daniel J.} and Swapnil Mishra and Denise Xifara and Seth Flaxman and Seth Flaxman and Thomas Mellan and Veit Schw{\"a}mmle and Richard R{\"o}ttger and Hadsund, {Johannes T.} and Samir Bhatt",
note = "Publisher Copyright: {\textcopyright}Konrad Krawczyk, Tadeusz Chelkowski, Daniel J Laydon, Swapnil Mishra, Denise Xifara, Seth Flaxman, Seth Flaxman, Thomas Mellan, Veit Schw{\"a}mmle, Richard R{\"o}ttger, Johannes T Hadsund, Samir Bhatt.",
year = "2021",
month = jun,
doi = "10.2196/28253",
language = "English",
volume = "23",
journal = "Journal of Medical Internet Research",
issn = "1439-4456",
publisher = "JMIR Publications",
number = "6",
pages = "e28253",
}

RIS

TY - JOUR

T1 - Quantifying online news media coverage of the COVID-19 pandemic

T2 - Text mining study and resource

AU - Krawczyk, Konrad

AU - Chelkowski, Tadeusz

AU - Laydon, Daniel J.

AU - Mishra, Swapnil

AU - Xifara, Denise

AU - Flaxman, Seth

AU - Flaxman, Seth

AU - Mellan, Thomas

AU - Schwämmle, Veit

AU - Röttger, Richard

AU - Hadsund, Johannes T.

AU - Bhatt, Samir

N1 - Publisher Copyright: ©Konrad Krawczyk, Tadeusz Chelkowski, Daniel J Laydon, Swapnil Mishra, Denise Xifara, Seth Flaxman, Seth Flaxman, Thomas Mellan, Veit Schwämmle, Richard Röttger, Johannes T Hadsund, Samir Bhatt.

PY - 2021/6

Y1 - 2021/6

AB - Background: Before the advent of an effective vaccine, nonpharmaceutical interventions, such as mask-wearing, social distancing, and lockdowns, have been the primary measures to combat the COVID-19 pandemic. Such measures are highly effective when there is high population-wide adherence, which requires information on current risks posed by the pandemic alongside a clear exposition of the rules and guidelines in place. Objective: Here we analyzed online news media coverage of COVID-19. We quantified the total volume of COVID-19 articles, their sentiment polarization, and leading subtopics to act as a reference to inform future communication strategies. Methods: We collected 26 million news articles from the front pages of 172 major online news sources in 11 countries (available online at SciRide). Using topic detection, we identified COVID-19–related content to quantify the proportion of total coverage the pandemic received in 2020. The sentiment analysis tool VADER was employed to stratify the emotional polarity of COVID-19 reporting. Further topic detection and sentiment analysis were performed on COVID-19 coverage to reveal the leading themes in pandemic reporting and their respective emotional polarizations. Results: We found that COVID-19 coverage accounted for approximately 25.3% of all front-page online news articles between January and October 2020. Sentiment analysis of English-language sources revealed that overall COVID-19 coverage was not exclusively negatively polarized, suggesting wide heterogeneous reporting of the pandemic. Within this heterogeneous coverage, 16% of COVID-19 news articles (or 4% of all English-language articles) can be classified as highly negatively polarized, citing issues such as death, fear, or crisis. Conclusions: The goal of COVID-19 public health communication is to increase understanding of distancing rules and to maximize the impact of governmental policy. The extent to which the quantity and quality of information from different communication channels (eg, social media, government pages, and news) influence public understanding of public health measures remains to be established. Here we conclude that a quarter of all reporting in 2020 covered COVID-19, which is indicative of information overload. In this capacity, our data and analysis form a quantitative basis for informing health communication strategies along traditional news media channels to minimize the risks of COVID-19 while vaccination is rolled out.

KW - COVID-19

KW - Infoveillance

KW - Public health

KW - Sentiment analysis

KW - Text mining

U2 - 10.2196/28253

DO - 10.2196/28253

M3 - Journal article

C2 - 33900934

AN - SCOPUS:85107431618

VL - 23

JO - Journal of Medical Internet Research

JF - Journal of Medical Internet Research

SN - 1439-4456

IS - 6

M1 - e28253

ER -
