2017-06-27

Saving lives with a secure system for sharing biomedical data

Before: Biomedical researchers often exchanged sensitive data about people’s health between institutions or across borders via poorly protected means such as USB drives or CDs sent via registered mail. Now: Researchers can exchange data at great speed – and save more lives – via a secure Nordic data sharing service.

Hundreds of Nordic research groups and projects are already making use of data sharing systems that have been developed with assistance from NeIC’s project Tryggve, which takes its name from an old Norse word meaning “trusty, faithful, true, safe”. The modern form “trygg” is still used in most Nordic languages, so anybody who hears about the project immediately understands that this is about building something trustworthy.

Tryggve started in 2013 with a stakeholder meeting at Arlanda airport in Sweden. Four years later, the project is offering a secure means of transferring data between researchers and across borders.

“The transfer itself is encrypted, and no temporary copies are generated along the way. Most users gain access to the secure data environments by logging in to a remote desktop, similar to logging in for online banking. The system also gives them access to a range of scientific software and reference datasets,” explains NeIC senior advisor Joel Hedlund. He is Tryggve’s project owner and an application expert in bioinformatics at the Swedish National Supercomputer Centre at Linköping University. He cooperates closely with Tryggve’s project leader Antti Pursula at the CSC – IT Center for Science in Finland.

Protection from mistakes

“The system gives collaborating researchers access to a shared computer where they can store and analyse their data. Access is only granted to one’s own data, and only the person who is legally responsible for the research project can delegate permission to take data out of the system. Thus you can easily enter data into the system, but the system ensures that you cannot copy-paste data out by mistake. The only thing Tryggve cannot protect you from is someone taking a photograph of your computer screen,” Dr Hedlund adds.

Tryggve focuses on biomedical research, often just called medical research, which is all about finding the causes of disease, diagnosing diseases and preferably curing them. One of the characteristics of modern biomedical research is that researchers usually need to gather massive amounts of data from large population groups in order to uncover both the causes of disease and the possible cures.

These data are sensitive because they contain information about the health of individual persons, and their use is regulated by strict national and international provisions on privacy. Researchers have traditionally spent a lot of time navigating through complicated legal waters before obtaining permission to gather and share data or gain access to already existing data sets.

Time is of the essence

The swine flu epidemic that hit the Nordic countries in 2009 illustrates the importance of speed when it comes to biomedical research and sharing of sensitive data. The Danish health authorities decided to recommend public vaccination because the epidemic was expected to escalate. Later, however, when doctors reported miscarriages among pregnant women who had been vaccinated, these authorities decided to start a research project.

Biomedical researchers in the Nordic countries have found that it can take months or years to gain access to all the data they need in such cases, mainly because the data are distributed among several research groups at different institutions. But the Danes had already integrated their health data in the Danish Health Data Authority (Sundhedsdatastyrelsen), so researchers could access the necessary data as soon as the ethics approvals were in place.

“They found out within just a few weeks that the vaccinated mothers had fewer miscarriages than the other mothers, so the vaccination programme was continued,” explains Dr Hedlund. “It was of course important to ascertain this fact before the flu had passed.”

Saving more lives, faster

“NordForsk and NeIC started the Tryggve project together with the Nordic nodes of the European research infrastructure for life science information (ELIXIR), the European Biobanking and Molecular Research Infrastructure (BBMRI) and Euro-Bioimaging, because we all believed that a safe system for data sharing will help biomedical researchers in the Nordics in their efforts to save lives,” says Dr Hedlund. “And I really believe that we have helped them, because we have made it possible to perform biomedical research faster than before, without compromising the important legal and ethical perspectives.”

In the past, there have been incidents where researchers who were unaware of the legal and ethical requirements have exchanged packages containing sensitive health data with other collaborators by means which are extremely unsafe – like sending unencrypted CDs or USB drives by registered mail. There is a risk that the personally identifiable sensitive data may be misused or compromised, as such mail may get lost on its way or even be intentionally stolen.

“We have demonstrated that sensitive data can now be processed across Nordic borders via secure cloud resources. This is an important achievement because of the legal, ethical and technical complexities around cross-border use of sensitive data. The Tryggve project also demonstrated that software portfolios could be portable between sensitive data systems,” Dr Hedlund points out.

Tryggve was started as a three-year project with the intention of establishing a Nordic platform for collaboration on sensitive data. Joel Hedlund and Antti Pursula started planning the successor project, Tryggve2, in 2016 with the intention of scaling out to benefit more people.

“Tryggve2 has perspectives far wider than the Nordic countries. The project is funded by NeIC and ELIXIR, and the end game is to develop a system that can be used across all European countries and beyond,” Joel Hedlund explains.

Personalised medicine and rare diseases

International cooperation is especially beneficial when it comes to personalised medicine and research on rare diseases.

“The Nordic countries are all quite small, and researchers can’t study rare diseases in small countries because they need a lot of cases to identify causes and search for cures. But if the Nordic countries pool their resources, we are suddenly similar in population size to a large European country. And if we pool our resources in Europe, we can really make a global impact while maintaining the high ethical standards that give us Europeans a competitive advantage,” Dr Hedlund states.

“Tryggve2 is going to incorporate pretty much everything that NeIC is about: Nordic infrastructure collaboration, increasing stakeholder dialogue, sharing resources, pooling competencies and securing long term funding. And in addition, we are helping researchers in their efforts on saving lives,” Dr Hedlund concludes.

Planning a databank that can make biobanks even more valuable

Nordic human biobanks are of great value to researchers trying to find a cure for diseases such as cancer, diabetes, cardiovascular disease and more. Imagine if researchers could analyse readymade datasets instead of biological samples from the biobanks: That would save lots of time and money, and make biobanks even more valuable for public health.

The Nordic countries have a large number of biobanks which contain biological samples taken from healthy people and people suffering from various types of illness. Research groups are constantly requesting withdrawals of samples in order to analyse them for new links between genetic and environmental factors and disease. But most biological samples are small, and each withdrawal reduces the content of the biobank.

Giving away samples, receiving data

Bartlomiej Wilkowski is the Head of IT at the Danish National Biobank, Statens Serum Institut (SSI). He is a member of the SSI team which is exploring a new idea that can save researchers a lot of time and money.

“The idea is that biobanks shouldn’t just give away biological samples: They should also subsequently receive and store the data that researchers generate from analysing the samples. This will serve to empower researchers as a group, because the same data can in most cases be used by other researchers for other purposes in the future,” Dr Wilkowski explains.

“Imagine that a research group wants to sequence or genotype a batch of samples in order to study a specific issue. The data generated from such studies can in most cases also be used to study a lot of other issues. The next time a research group requests the same samples in order to sequence or genotype them, we could provide them with data instead of biological material,” he adds.

Developing a pilot

Bartlomiej Wilkowski is now working on developing the pilot for precisely such a databank together with NeIC’s Joel Hedlund and other partners. The databank should be directly linked to a biobank, so that it is easy to see how specific samples have been studied before.

“We are planning to build a pilot on the Danish National Supercomputer for Life Sciences – the Computerome – in such a way that other Nordic biobanks can copy this setup later. We should be able to build a common platform for researchers in the Nordic region,” Wilkowski suggests.

“In fact,” he continues, “we expect that our collaboration with the LocalEGA efforts, driven by ELIXIR, will ensure that we will be interoperable with databanks across all of Europe. EGA is the European Genome-phenome Archive, and ELIXIR is the European research infrastructure for life science information. In the future, these databanks are probably not going to export data in most cases – because it is a safer solution to give researchers an account that lets them configure and access their own, private space on a supercomputer and perform all the computations inside it.”

There are a number of issues that need to be solved, in terms of both technology and organisation, before this idea becomes a reality.

“The biological samples can, of course, give rise to many different kinds of data, and we want to make sure these data are stored in a way such that they can later be used by as many researchers as possible. This is going to be a challenge, and that is one of the reasons why we need the help of the infrastructure experts in NeIC and their national partners,” Dr Wilkowski states.

“My hope is that this idea can help to advance biomedical research in the Nordic countries. The benefits for researchers are that the planned databank can reduce the need for time-consuming procedures and analyses of biological samples, while also delivering a secure way of handling personal information,” Bartlomiej Wilkowski concludes.

Photos: Terje Heiestad, SSI