PaRI - NeIC’s response to the COVID-19 pandemic

It will soon be two years since the COVID-19 pandemic, caused by the SARS-CoV-2 virus, erupted. The pandemic has affected people in all parts of the world, and its impacts are in the news every day. As of 23 September 2021, according to the World Health Organization’s COVID-19 Dashboard, almost 230 million confirmed cases of infection with the new coronavirus have been reported.

NeIC is a collaboration that puts great emphasis on contributing to Nordic societies’ well-being and competitiveness. It was therefore clear that NeIC should do what it does best in the fight against COVID-19: create platforms, improve workflows and find other solutions that facilitate research and help research communities achieve better results by working together. At its April 2020 meeting, the NeIC Board decided to launch a fast-track call for proposals for projects supporting COVID-19-related research. The Nordic Pandemic Research Infrastructure (PaRI) project was launched to complement the NeIC project portfolio by focusing on the collection and use of data in connection with research on pandemics, and in particular the COVID-19 pandemic.

Collaboration is key

By embracing and aligning with other European initiatives that contribute to the struggle against COVID-19, such as ELIXIR and EOSC-Life, PaRI facilitates pandemic research for Nordic researchers. The one-year project has built a secure platform for managing and executing tools and workflows for cross-border pandemic data processing. Data processing here means collecting, quality controlling, analysing and securely storing data. When multiple organisations in the Nordic region follow the same processes, collaboration in further research becomes more seamless and efficient. One of the PaRI project’s objectives was to implement Nordic pilots and best practices for FAIR data that benefit from these technical results. The project has benefited greatly from the outcomes of NeIC’s Tryggve project.

The Nordic collaboration on sensitive bio- and medical data facilitated through NeIC is today widely recognized as a unique, fit-for-purpose competence network of software developers and system administrators.
– PaRI project manager Abdulrahman Azab

The PaRI project team works between and across international and national initiatives and health institutes. For example, ELIXIR Norway is in continuous dialogue with the Norwegian Institute of Public Health (NIPH) to facilitate open sharing of viral genome sequences sampled in Norway, and offers NIPH both technical and data sharing competence. Close collaboration with ELIXIR Sweden allows PaRI to learn and benefit from the experiences of Swedish initiatives, in particular the SciLifeLab National COVID-19 Research Program. In Estonia, a consortium called KoroGenoEst-3 has brought the projects on COVID-19 genome sequencing under a single umbrella with the help of the Estonian Scientific Computing Infrastructure (ETAIS) and ELIXIR Estonia, and these projects now use common workflows to monitor and publish their data.

Rapid ramp-up and results

As a NeIC project, PaRI is unusual in two respects. Firstly, it is a collaboration between five NeIC member countries and Germany. This is the first time a German institution has participated in a NeIC project. In addition, Riga Technical University in Latvia has joined the project as an observer.

Secondly, the initiation phase was exceptionally quick: the project started on 1 November 2020, after a short but efficient preparation phase, and produced a first pilot of one project deliverable as early as the beginning of 2021. In one year, the PaRI team has successfully encouraged researchers and data producers to upload their data to the European Nucleotide Archive (ENA), the primary European research database for nucleotide sequence information, which includes raw sequencing data. As a result, more data has become available to researchers studying the SARS-CoV-2 virus and its behaviour.

Among the most important achievements of the project and its collaborators is the Galaxy Nordic COVID-19 portal. The portal monitors the public output that viral genome sequencing projects submit to the ENA and makes the data more accessible for further use. The portal also demonstrates the feasibility of tracking the output of national genome surveillance projects. The Nordic COVID-19 portal is part of the international Galaxy network. Galaxy helps researchers perform computational analyses on their data without needing expertise in informatics, programming or data visualization. The platform is open source and enables accessible, reproducible and transparent computational research.

Another output of the PaRI project is the PaRI dashboard, a visualisation tool designed to help stakeholders such as epidemiologists, state institutions and researchers monitor the pandemic locally, down to regional and municipality levels. The dashboard is based on the Nextstrain tool and visualizes Nordic COVID-19 data (raw sequencing data brokered into ENA or consensus sequences from the Galaxy pipeline). It displays the data together with any connected metadata, which can be used to filter the resulting visualisations. Furthermore, users can add sensitive metadata to the sequences. The key feature is that this additional metadata is stored only in the user’s browser, so its use stays within local legislation on sharing sensitive data.
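The idea of keeping sensitive metadata on the user's side can be illustrated with a minimal Python sketch. This is not the dashboard's actual code: the function name `overlay_local_metadata` and the field names `strain`, `country` and `municipality` are hypothetical, chosen only for illustration. The sketch joins public records with locally held annotations by sequence name, so the sensitive fields are combined with the public data only on the user's own machine.

```python
import csv
import io


def overlay_local_metadata(public_records, local_csv_text, key="strain"):
    """Return copies of the public records enriched with local-only fields.

    public_records: list of dicts, e.g. loaded from a public dataset.
    local_csv_text: CSV text kept on the user's machine; it must contain
                    a column named `key` that identifies each sequence.
    (All names here are hypothetical, not part of any real dashboard API.)
    """
    # Index the local rows by sequence name for quick lookup.
    local_rows = {}
    for row in csv.DictReader(io.StringIO(local_csv_text)):
        local_rows[row[key]] = row

    merged = []
    for record in public_records:
        combined = dict(record)  # copy, so the public data is not mutated
        extra = local_rows.get(record[key], {})
        for field, value in extra.items():
            # Public values win on conflict; only new fields are added.
            combined.setdefault(field, value)
        merged.append(combined)
    return merged


# Hypothetical example data.
public = [
    {"strain": "sample-001", "country": "Estonia"},
    {"strain": "sample-002", "country": "Norway"},
]
local = "strain,municipality\nsample-001,Tartu\n"

enriched = overlay_local_metadata(public, local)
```

Here only `sample-001` gains a `municipality` field, because it is the only sequence named in the local file; the public records themselves are left unchanged. In the dashboard this kind of merging happens in the browser, and the Python above only mirrors the logic.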

The PaRI dashboard emphasizes the need for harmonized data analysis and is built in close collaboration with the metadata harmonization and Galaxy teams in PaRI. The dashboard can be seen and used as an example of how to make data FAIR and how to use FAIRified data. The PaRI dashboard team is currently working on adding a subsampling scheme similar to the one used in the dashboard at covid19dataportal.org, as well as adding the Pango lineages as a background tree for all submissions.

The PaRI dashboard can be found at https://auspice.biit.cs.ut.ee/ncov/est.

Helping researchers worldwide

The Galaxy community, in collaboration with PaRI, organised a workshop on SARS-CoV-2 Data Analysis and Monitoring with Galaxy. The goal was to build capacity on both the analysis and the data management sides of COVID-19 research, based on the expertise acquired during the pandemic. More broadly, sharing experience in COVID-19 data analysis and fostering the principles of open data, open science and open infrastructure are key in the current global public health situation.

The four-day workshop consisted of multiple demos, hands-on exercises and lectures, with topics ranging from general information on Galaxy to more specific ones, such as analysis of public datasets and data export to public and open archives. The workshop was conducted asynchronously with pre-recorded videos, complemented by live support and Q&A sessions. It was attended by researchers at all career stages and in different roles, and in total there were 767 registrants from 106 countries. The PaRI project is helping both Nordic researchers and researchers around the world to better understand and fight the SARS-CoV-2 virus by providing the e-infrastructure (platforms, interoperability, tools and training) that facilitates their work. The infrastructure also makes it easier for researchers to benefit from each other’s work and to work more seamlessly across projects and countries.

In the project manager’s view, motivated project partners and collaborators have been key to the success of PaRI. More importantly, working together across several teams has helped the entire research community to understand the virus better. According to feedback collected by the project partners, the PaRI project has been a key factor in enabling knowledge exchange in the Nordics on technical and data brokering infrastructure issues for pandemic data.