Keywords: COVID-19; Twitter; coronavirus; infodemic; infodemiology; infoveillance; latent Dirichlet allocation; risk communication; topic modeling; topic phase detection

MeSH: COVID-19 / epidemiology, virology; Communication; Humans; Pandemics; SARS-CoV-2 / isolation & purification; Social Media / statistics & numerical data

Source: DOI:10.2196/23272; PDF (PubMed)

Abstract:
COVID-19, caused by SARS-CoV-2, has led to a global pandemic. The World Health Organization has also declared an infodemic (ie, a plethora of information regarding COVID-19 containing both false and accurate information circulated on the internet). Hence, it has become critical to test the veracity of information shared online and analyze the evolution of discussed topics among citizens related to the pandemic.
This research analyzes the public discourse on COVID-19. It characterizes risk communication patterns in four Asian countries with outbreaks at varying degrees of severity: South Korea, Iran, Vietnam, and India.
We collected tweets on COVID-19 from four Asian countries in the early phase of the disease outbreak from January to March 2020. The data set was collected by relevant keywords in each language, as suggested by locals. We present a method to automatically extract a time-topic cohesive relationship in an unsupervised fashion based on natural language processing. The extracted topics were evaluated qualitatively based on their semantic meanings.
This research found that each government's official phases of the epidemic were not well aligned with the degree of public attention represented by the daily tweet counts. Inspired by the issue-attention cycle theory, the presented natural language processing model can identify meaningful transition phases in the discussed topics among citizens. The analysis revealed an inverse relationship between the tweet count and topic diversity.
This paper compares similarities and differences of pandemic-related social media discourse in Asian countries. We observed multiple prominent peaks in the daily tweet counts across all countries, indicating multiple issue-attention cycles. Our analysis identified which topics the public concentrated on; some of these topics were related to misinformation and hate speech. These findings and the ability to quickly identify key topics can empower global efforts to fight against an infodemic during a pandemic.
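The inverse relationship between tweet count and topic diversity reported above can be illustrated with a minimal sketch. The abstract does not say how topic diversity was measured, so Shannon entropy over per-tweet topic labels is an assumption made here, and the toy data, `shannon_entropy`, and `pearson` are hypothetical names introduced only for illustration; this is not the authors' pipeline.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the topic labels observed on one day.

    Assumption: entropy stands in for "topic diversity"; the source
    abstract does not define the measure it used.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy data: one topic label per tweet, grouped by day.
# Busier days are deliberately dominated by fewer topics.
daily_topics = {
    "day1": ["a", "b", "c", "d"],            # few tweets, many topics
    "day2": ["a", "a", "b", "b", "c", "c"],
    "day3": ["a"] * 6 + ["b"] * 2,
    "day4": ["a"] * 10,                      # many tweets, single topic
}

tweet_counts = [len(labels) for labels in daily_topics.values()]
diversity = [shannon_entropy(labels) for labels in daily_topics.values()]

# A negative coefficient indicates that diversity falls as volume rises.
r = pearson(tweet_counts, diversity)
print(f"tweet counts: {tweet_counts}")
print(f"topic diversity (bits): {[round(d, 3) for d in diversity]}")
print(f"Pearson r: {r:.3f}")
```

On real data, the per-tweet labels would come from a fitted topic model (the keywords mention latent Dirichlet allocation), and the sign and strength of the correlation would need to be checked per country rather than assumed.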