TEN SUCCESS STORIES FROM TEN YEARS OF SUCCESS: Our work on sensitive health data

The year 2022 marks NeIC’s 10-year anniversary. To celebrate this milestone, we will publish ten stories that showcase how NeIC has contributed to developing best-in-class e-infrastructure services beyond national capabilities and enhanced the productivity of research in the Nordic Region.

The Nordic e-Infrastructure Collaboration, also known as NeIC, was established in 2012. NeIC facilitates collaboration on digital infrastructure within the Nordic countries and Estonia by giving experts from different countries, organisations and fields opportunities to work together. Nordic collaboration on digital infrastructure had already started before NeIC was established. Since 2003, the Nordic countries have been collaborating on the Worldwide Large Hadron Collider Computing Grid (WLCG) at CERN, providing research computing and storage for high-energy physicists worldwide. The successful collaboration that started with the services offered by the Nordic Data Grid Facility (NDGF) was expanded after some years into NeIC, which was tasked with running the Nordic WLCG Tier-1 facility, also known as NT1. The establishment of NeIC made it possible to facilitate collaborations that benefit other areas of science.

Since 2013, health research has been one of the major fields of science to which NeIC contributes. However, working with health data poses a problem: how can it be shared and used for research while maintaining confidentiality? This is the problem NeIC’s sensitive data activities have tried to solve. To dig deeper into how these activities started, we interviewed Joel Hedlund, a former Bio- and medical sciences Area coordinator in NeIC’s executive team, who was a key person in setting up the health data collaboration.

Sharing sensitive health data

Sensitive data is information, often personal information, that reveals something confidential. Such data could, for example, concern genetics and health, religion or sexual orientation.¹ Sensitive health data refers to everything that relates to an individual’s health. All of this information is deemed sensitive by law because it is potentially harmful if misused. At the same time, sensitive health data has to be used to provide healthcare and to do research that improves healthcare.

Due to the sensitive nature of personal health data, working with it needs to be regulated and done very carefully and in a secure way to honour its confidentiality. This applies within the borders of a country, but even more so when collaborating with other countries. In the Nordic Region, where the countries are not very large and the ways of working as well as mindsets are similar, all research benefits from collaborating. Health data research is no exception.

The many phases of Tryggve

NeIC’s Tryggve project started in October 2014 and ran for two phases and a total of six years, until October 2020. In June 2021, the follow-up project Heilsa Tryggvedottir was launched with all the same project partners plus one new one, ELIXIR Estonia.

At the time of the interview with Hedlund, the Federated European Genome-phenome Archive (FEGA) had just been launched in Barcelona. He was there, as were the Nordic ELIXIR node heads, all of whom referred to Tryggve and NeIC in their speeches as the starting point for FEGA. Hedlund says it is widely recognised that Tryggve and its successor, Heilsa, have done a great job and influenced many initiatives over the years, and that the technologies developed in them have been used in many European projects.

How did this success story begin, and what has the project achieved?

Nordic collaboration on health data

The efforts to initiate the Tryggve project began in 2013, only a year after NeIC had been established. NeIC’s board had decided that, in addition to high-energy physics, biosciences and medical sciences should be a top priority. Health data organisations in the Nordics sent NeIC letters of interest in collaborating on sharing sensitive data, and shortly after, in April 2013, Hedlund was headhunted to coordinate these efforts. After an investigation of the organisations’ common interests, Tryggve was initiated a year and a half later. Hedlund reveals that he was the one who came up with the name Tryggve. The name stems from the Old Norse “tryggr”, which was used to mean “trusty, faithful, true, safe”. Heilsa Tryggvedottir roughly translates to “health, the daughter of Tryggve”.

The Nordic countries have collaborated on health data for a long time. According to Hedlund, collaboration is fairly easy because the national laws are very similar due to coordinated policy-making and law-making in the region. Different actors in the Nordic countries had also worked together on the technical aspects, but when it comes to e-infrastructure for research, the collaboration started in 2013 with Hedlund’s preparations for Tryggve. Hedlund adds that there might have been initiatives similar to Tryggve in the Nordics and Europe, but none at the same scale or with such an impact. Since Tryggve entered the execution phase in 2014, NeIC has continuously supported activities focusing on sensitive health data.

If you are familiar with Tryggve, you probably know of ELIXIR. According to its website, “ELIXIR is an intergovernmental organisation that brings together life science resources from across Europe. These resources include databases, software tools, training materials, cloud storage and supercomputers.” There are ELIXIR nodes in Denmark, Estonia, Finland, Norway and Sweden, and Hedlund adds that ELIXIR was among the health data organisations that prompted NeIC’s interest in health data in the first place. The Nordic ELIXIR nodes, together with the Nordic sensitive data service providers (TSD, Bianca, Computerome, ePouta), took on the execution of Tryggve from the very beginning, and most of the developers who have worked in Tryggve and now Heilsa come from the ELIXIR nodes.

Before Tryggve, there were only two sensitive data systems for the developers to work with, and, for example, TSD (University of Oslo’s Services for sensitive data) and ePouta (CSC’s system for processing sensitive data) were being built separately at the time. Tryggve made it possible for the developers to coordinate and learn design ideas from each other, to find ways to collaborate and to make the systems interoperate. For example, the Swedish system, Bianca, was developed based on ideas from TSD.

Federated EGA and other outcomes

Both the first and the second phase of Tryggve aimed at strengthening transnational collaboration on sensitive personal data for research in the Nordics, with a focus on large, high-volume, computation-intensive projects. Archiving sensitive personal data, so that we in the Nordics can pool data to gain statistical power, was important to Hedlund as the coordinator of Tryggve. In the first phase, there wasn’t enough money to build such an archive, but as the budget was doubled for the second phase, sensitive data archiving was finally possible.

Hedlund says that even though there were similar initiatives on the European level when Tryggve started, none of them had the federated aspect: the repository was centralised. This meant that all of the sensitive data had to be sent somewhere else, in most cases to the United Kingdom. With the federated setup, it is possible to keep the sensitive data in your own jurisdiction, which means fewer issues with the law and with meeting requirements. The above-mentioned Federated EGA technology has been driven heavily by the partners in Tryggve and Heilsa. During the projects, the technology has been made technically ready and, even more importantly, politically ready so that the businesses, policy-makers and lawyers have been able to agree on it.

Many important use cases for research were supported in Tryggve. Hedlund mentions use cases in psychology and inflammatory diseases, where it is crucial to be able to pool large amounts of sensitive personal data and to collaborate on the findings. In addition, through Tryggve, NeIC has successfully supported big Nordic efforts on analysing highly sensitive data where the information is so sensitive that it can’t leave the country: genotypes and phenotypes, recordings of psychological therapy sessions, generational death registers, and so on. Tryggve facilitated this type of research and made it possible to do large studies without disclosing sensitive information by developing pipelines that can be executed separately in different countries, so that in the end, only the non-sensitive, partial findings are exchanged across borders.
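
The idea of running the same pipeline separately in each country and exchanging only partial findings can be illustrated with a minimal Python sketch. It is not taken from Tryggve: the names PartialResult, local_step and pool are hypothetical, the measurements are made up, and a pooled mean and variance stand in for what would be far more involved analyses in a real study.

```python
from dataclasses import dataclass
from math import fsum


@dataclass(frozen=True)
class PartialResult:
    """Aggregates that cross the border; no individual records are included."""
    n: int
    total: float
    total_sq: float


def local_step(measurements: list[float]) -> PartialResult:
    """Hypothetical step run inside one country's secure environment."""
    return PartialResult(
        n=len(measurements),
        total=fsum(measurements),
        total_sq=fsum(x * x for x in measurements),
    )


def pool(partials: list[PartialResult]) -> tuple[float, float]:
    """Combine partial results into a pooled mean and sample variance."""
    n = sum(p.n for p in partials)
    total = fsum(p.total for p in partials)
    total_sq = fsum(p.total_sq for p in partials)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Made-up measurements held separately at three hypothetical sites.
site_a = [5.1, 4.8, 6.0]
site_b = [5.5, 5.9]
site_c = [4.7, 5.2, 5.0, 5.6]

# Each site runs local_step on its own data; only the aggregates are shared.
partials = [local_step(site) for site in (site_a, site_b, site_c)]
pooled_mean, pooled_variance = pool(partials)
```

In this sketch, the pooled statistics match what a single analysis over all records would give, even though no site ever sees another site’s individual measurements. Whether an aggregate counts as non-sensitive would in practice still have to be assessed case by case, for example when a site holds very few records.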

– If NeIC had not initiated and hosted Tryggve, I don’t think there would have been a celebration in Barcelona last week because there would be no Federated EGA. We would probably also be missing the model for collaborating that will now be used in high-profile EU projects, such as Bigpicture and Beyond One Million Genomes. I don’t think we would have as good systems for processing sensitive data in the Nordics as we do today, Hedlund says.

The legacy lives on

What is the effect of NeIC focusing on sensitive health data? Hedlund says that from a practical point of view, the projects have provided enough resources for a significant number of developers to meet up and coordinate on important matters. This has also made it possible to include many stakeholders from different communities active in the Nordic Region and to discuss priorities with them, which Hedlund believes has guided concrete design choices for the national services. The collaboration has helped the different actors in the Nordics to better understand the needs of international research. The efforts that have followed are huge.

Today, the legacy of the Federated EGA part of Tryggve can most concretely be seen in the Heilsa Tryggvedottir project, which runs under the NeIC umbrella until June 2024. Heilsa aims to further improve and productise the technology developed in Tryggve and to create a federated network with interoperable services. Soon, a researcher with the appropriate authorisations may be able to combine data from two or more datasets in different Nordic locations. Collaboration between the Nordic countries is flourishing today, and it is broader than before now that ELIXIR Estonia is an official partner in Heilsa.

At the same time, some of the Heilsa partners in Finland and Sweden are participating in the EU project Bigpicture. Bigpicture takes the technologies from Federated EGA and other services for sensitive personal data that were developed in Tryggve and adapts them for use in another field, digital pathology, while focusing on collecting and standardising pathology data. The project has been able to get more communities involved and to extend the standards developed for genomic and genetic data to also cover pathologies and diagnostics, which, Hedlund adds, is extremely hard and would most likely not be possible without the technologies that were matured during Tryggve. Another initiative called EUCAIM (for European Cancer Imaging) will be using the Federated EGA software integrated into Bigpicture but opened up for use in radiology.

– Tryggve’s outcomes are being broadened to serve also other research communities and to enable more integrational, multimodal, multi-centre research. To me, that is a great legacy, Hedlund concludes.

¹ Read more about sensitive data on the European Commission’s website