New data platform captures tweets about Covid-19

Researchers in the School of English have developed a new open access platform comprising millions of tweets relating to Covid-19, to capture public opinion, moderate misinformation and provide a resource to other researchers.


Covid conversations

Andrew Kehoe, Matt Gee, Mark McGlashan, Tatiana Tkacukova, Selina Schmidt and Robert Lawson have created the Trust and Communication: Coronavirus Online Visual Dashboard, otherwise known as TRAC:COVID.

The team were inspired to develop this resource due to the volume of social media updates posted on Twitter from a broad cross section of people.

“Twitter has seen journalists, politicians, doctors, nurses, and members of the general public share their thoughts about lockdown, furlough, quarantine, testing, masks, vaccination, social distancing, recovery, and, in the worst cases, loss and bereavement,” Robert says.

“We wanted to make sense of the kinds of Covid-related conversations people have been having – what worried them, how they viewed lockdown – as well as charting how these conversations shifted over time.”

TRAC:COVID will also, the team hopes, serve as an important tool in addressing misinformation, which has been identified as a key concern ever since Covid-19 became prominent.

For example, studies have found that tweets with completely false claims spread faster than those with partially false claims, and that unverified personal Twitter accounts feature the highest rate of Covid-19 misinformation.

“Although these studies have been essential in tackling the problem of misinformation, it’s not always easy for people to explore the underlying datasets,” Robert says.

“We wanted to develop something that was publicly accessible to researchers, academics, and members of the general public – hence the creation of TRAC:COVID.”

Exploring the data

The TRAC:COVID dashboard enables the analysis of over 84 million UK tweets which contain words and hashtags related to the pandemic. 

The dashboard is built on methods from a discipline known as corpus linguistics, which helps people quickly analyse millions of words. Corpus linguistic methods have recently been used to analyse healthcare communications, including work on NHS patient feedback.

The dataset currently covers January 2020 to April 2021, and there are hopes to extend this in the future.

“Through the dashboard, users can chart how language use changed during the pandemic and how particular words have acquired new meanings, to when certain words stop being used altogether, all without requiring specialist knowledge or language analysis skills,” Robert says.
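For readers curious about what charting language use over time can involve, the following is a minimal sketch of one common corpus linguistic measure: the relative frequency of a word per month. It is an illustration only, not the TRAC:COVID implementation. The function names, the simple tokeniser and the three sample tweets are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately simple tokeniser: keeps words, hashtags and apostrophes.
TOKEN_RE = re.compile(r"[#\w']+")


def tokenize(text):
    """Split a tweet into lowercase tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def monthly_relative_frequency(tweets, term):
    """Return {YYYY-MM: occurrences of `term` per million tokens}.

    `tweets` is an iterable of (iso_date, text) pairs, for example
    ("2020-03-24", "Lockdown starts today"). This input shape is an
    assumption made for the sketch.
    """
    term = term.lower()
    hits = Counter()    # occurrences of the term in each month
    totals = Counter()  # total tokens seen in each month
    for iso_date, text in tweets:
        month = iso_date[:7]  # "YYYY-MM"
        tokens = tokenize(text)
        totals[month] += len(tokens)
        hits[month] += tokens.count(term)
    # Normalise by corpus size so months with more tweets remain comparable.
    return {
        month: hits[month] / totals[month] * 1_000_000
        for month in sorted(totals)
        if totals[month]
    }


# Invented sample data, for illustration only.
sample = [
    ("2020-03-24", "Lockdown starts today, stay home"),
    ("2020-03-25", "First day of lockdown done"),
    ("2021-04-12", "Shops reopen today at last"),
]
freqs = monthly_relative_frequency(sample, "lockdown")
for month, per_million in freqs.items():
    print(month, round(per_million))
```

Normalising to occurrences per million tokens, rather than using raw counts, means that a month in which people simply tweeted more does not look like a month in which a word became more popular.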

Understanding pandemic perspectives

The team hopes that TRAC:COVID will also prove beneficial in tackling ongoing issues related to the pandemic, such as engagement with the vaccination programme.

“Studies have shown vaccine hesitancy is fuelled by misinformation, particularly on Twitter,” explains Mark McGlashan, the author of the team’s case study about COVID-19 misinformation.

“Using TRAC:COVID, we – and the public – have a good idea of the scale and diversity of ‘anti-vax’ stances on Twitter.”

The team also charted public reaction to the range of public health measures implemented over the past 18 months, including masks, social distancing, and the vaccination programme.

“People on Twitter were generally supportive of these public health measures, in some cases arguing for stronger restrictions to drive down rates of infection quicker,” said Tatiana Tkacukova, who led the public reaction strand of the project.

The team’s work has already resulted in an article in The Conversation and an appearance on the Pandemic and Beyond podcast, while parts of the case study on public reactions were submitted to the call for evidence on Initial Learning from the Government’s Response to the COVID-19 pandemic and, more recently, to the call for evidence by the All-Party Parliamentary Group.

Findings from the research form part of a newly published report from the House of Commons Committee of Public Accounts titled Initial lessons from the government’s response to the COVID-19 pandemic.

Written evidence submitted to the Committee found that messaging from the government and public health bodies: 

  • Lacked clarity about who the messages were directed to 
  • Used long sentences with complex vocabulary, grammar and syntax, which made it difficult to understand 
  • Used language which was ambiguous 
  • Used terminology which was likely to elicit negative reactions 
  • Used terms which could have excluded some intended recipients of messages (e.g. using the term ‘house’ rather than ‘home’) 

“We hope that these reports will contribute to planning the communication strategy on the safety measures required to address the ongoing risks to public health,” Tatiana added.

“Tweets about Covid-19 represent an important cultural artefact,” Robert says. “With our new resource, people will be able to explore an archive of pandemic perspectives, deepening their understanding of what preoccupied UK Twitter users during this time.”

Find out more via the TRAC:COVID project page.