2022-06-23

TEN SUCCESS STORIES FROM TEN YEARS OF SUCCESS: The Nordic Data Grid Facility

The year 2022 marks the 10-year anniversary of NeIC. To celebrate this milestone, we will publish ten stories that showcase how NeIC has contributed to developing best-in-class e-infrastructure services beyond national capabilities and enhanced the productivity of research in the Nordic Region.

The Nordic e-Infrastructure Collaboration, also known as NeIC, was established in 2012. NeIC facilitates collaboration on digital infrastructure within the Nordic countries and Estonia by giving experts from different countries, organisations and fields opportunities to work together. This Nordic collaboration on digital infrastructure had already started before NeIC was established. Since 2003, the Nordic countries have been collaborating on the Worldwide Large Hadron Collider Computing Grid (WLCG) at CERN, providing research computing and storage for high-energy physicists worldwide. The successful collaboration that started with the services offered by the Nordic Data Grid Facility (NDGF) was after some years expanded into NeIC, which was tasked to run the Nordic WLCG Tier-1 facility, also known as NT1. The initiation of NeIC made it possible to facilitate collaborations that benefit other science areas.

We have now published four stories that have presented various activities and the background of NeIC. This time, the topic is more or less the reason for NeIC’s existence: the Nordic Data Grid Facility project that is nowadays known as the operational Nordic Tier-1 activity. Oxana Smirnova is one of the few people who were involved in the project from the very beginning and still work with us. She is a particle physicist with a Ph.D. from Lund University and a team member in the Nordic Tier-1 activity hosted by NeIC.

(Very brief) introduction to particle physics

As a particle physicist, Smirnova studies the most elementary particles in the universe. Almost everything that exists consists of smaller parts. Humans are made of cells, cells have different proteins, proteins can be broken down to molecules, and finally, molecules consist of atoms. Many are familiar with the water molecule H₂O, often illustrated as an upside-down Mickey Mouse, which consists of two (2) Hydrogen atoms (H) and one Oxygen atom (O). However, as Smirnova reminds us, even atoms have structure. An atom has a nucleus that can contain some positively charged protons, some neutral neutrons, and then around the nucleus, some negatively charged electrons. For instance, the Hydrogen atom in H₂O contains one proton in the nucleus and one electron spinning around it.

Electrons are very small and light particles. An electron is elementary, which means it doesn’t have further structure or consist of smaller particles. It is stable: it doesn’t decay to anything, and it’s not radioactive. Smirnova adds that electrons are also the reason we have electric current and electricity. Protons, on the other hand, are not elementary. A proton is much heavier than an electron and consists of even smaller particles, which are called quarks. Quarks cannot exist freely: they have to come together to produce protons, neutrons or other particles. Quarks bind together very strongly, and the force that causes the binding is called the strong (nuclear) force. An equally strong force needs to be applied to break the bond and to have a look inside the particles.

An accelerator is a device that applies an electric field to charged particles, such as protons. A collider is an accelerator that makes the particles smash into each other. When the particles move at a high speed in the electromagnetic field in the accelerator, they acquire a very high energy. When they collide, they interact with each other, and it’s possible to see if there is any structure inside them, and even to create new particles not seen in nature.

– It’s all very exciting. Not only can we finally see what’s inside the existing particles, we may also actually create new ones that don’t otherwise exist. This is possible because quarks can produce new quarks and recombine into completely new kinds of particles and do so in many different ways. They can even create particles that do not consist of quarks, Smirnova explains.

Accelerator science started around the mid-1900s. In the early stages, many universities had an accelerator because everyone wanted to study the subject, and a huge range of new particles was discovered. Physicists have figured out patterns in how most quarks combine with each other and have come up with a good theory to predict this. Still, the theory doesn’t describe everything. There are some very simple things that can’t be described yet, for example, why a proton doesn’t decay.

– Neutrons, which are very similar to protons and consist of similar quarks, do decay, something that is seen in nuclear energy production. From the theory’s point of view, there is no significant difference between protons and neutrons. We simply don’t know why one decays and the other doesn’t. This is why we need more experiments: to test new theories and better describe the world around us, Smirnova states.

CERN, LHC and WLCG

CERN (European Organization for Nuclear Research) is a huge research facility located in Geneva, Switzerland. CERN was founded in 1954, and it is the largest particle physics laboratory in the world. It is a joint European venture, and Sweden, Denmark and Norway were among the co-founders from the very beginning. Finland joined later on and is now one of the 23 member states of CERN.

– CERN is a really good investment, and it’s not only strictly particle physics that the member states are paying for. For example, the World Wide Web was invented at CERN in 1989, and there are many other technologies, innovations and applications that benefit societies on a wider scale, Smirnova says.

The Large Hadron Collider (LHC) is one of the accelerators located in CERN’s facilities. It was put into operation in 2008 to test the physicists’ theories by colliding particles, and according to Smirnova, it is the biggest machine ever built. Since physicists already know so much, previously unseen experimental results are extremely rare. A huge number of experiments have to be conducted to detect new phenomena, and all the data produced is important. Therefore, a lot of computing is required. Smirnova says that CERN always used to have the latest computers to keep up with the growing computing needs. Over time, supercomputers became more and more expensive, and at some point, CERN could no longer afford the computing and storage infrastructure at the required scale. The internet had developed enough by that time to allow transfer of data away from CERN, and that’s when the idea of the grid was born.

WLCG stands for the Worldwide LHC Computing Grid, and as Smirnova explains, it was created for distributed computing of the data coming from the LHC experiments at CERN. Smirnova says that before the idea of the WLCG was born, there were already some computing centres outside CERN that supported its computing needs. However, none of them could scale up to handle, on its own, the amount of data from the new accelerator, the LHC. CERN investigated its options to handle the massive computing needs and came up with the idea of WLCG. It is the largest computing grid in the world and has multiple levels, so-called tiers. Tier-0 is at CERN: it is the “top level” of the grid and the starting point of the data that is being distributed. From tier-0, the data is first transferred to tier-1 supercomputing centres for storage and first processing; then to tier-2 sites, such as universities; and maybe even to tier-3 facilities, for instance Smirnova’s computer. As of today, there are 13 tier-1 grid facilities in the world.

Establishing a Nordic Data Grid Facility

Smirnova, who at the time was a post-doc researcher at Lund, was one of the first people to start looking into how to do distributed computing with the data coming from CERN. In 2001, a large group of Nordic LHC physicists submitted a project application to the Nordunet2 programme to receive funding, and the application was accepted. This new project, NorduGrid, started looking into the theoretical aspects and existing software.

– One of the developers working on the project came up with the wonderful idea of doing it a little differently. This is how we developed ARC, the Advanced Resource Connector software. We showed that the idea of distributed computing works and that it can be implemented on the Nordic scale, but this was of course only a testbed, Smirnova says.

At this stage, the project involved test computers in Oslo, Copenhagen, Uppsala and Lund. When the testing proved to be successful, physicists in the Nordic Region asked why not have a distributed Nordic facility, a computing centre located in several places instead of one. The idea was, according to Smirnova, really out of the box. Such distributed operations had not worked for anyone, and the Nordics had only tried it as a testbed for WLCG. However, people were ambitious and wanted to try. There were no data centres big enough in the Nordics that would have been open for international use, and for CERN computing, the facility needed to be open to scientists from around the world without any special allocations. The Nordic Data Grid Facility project was lucky to receive some funding from the national funding agencies. Little by little, as the software improved and more competent people from different countries joined, a distributed Nordic computing centre was created, working at least as efficiently as traditional tier-1 computing centres. Smirnova adds that the Nordic countries didn’t have to provide a tier-1 facility for the WLCG (not all European countries do), but they wanted to, and they knew it was possible.

– Having a tier-1 facility in the Nordics brings us a lot of benefits. So much competence is developed, which helps everyone up here. Also, the way we are operating our facility is well known and respected these days. ARC is the only European grid computing software that is still in use, and it is used all over the world. It is community-developed, open-source software, so as long as the community is there, there is always someone to keep it running, Smirnova says.

In the Nordic Tier-1, it is not only the hardware that is distributed: the people operating the hardware are also physically located at different sites and in different countries. The team consists mostly of systems experts, who know how to run the computers, storage systems and tapes, and software developers. Smirnova herself works as a CERN liaison and makes sure that what is done meets the physicists’ needs. Despite having only a couple of face-to-face meetings a year, Smirnova says that there is a strong feeling of working together in the team every day.

NT1 and the Nordic e-Infrastructure Collaboration

The Large Hadron Collider at CERN will be operational at least until the year 2038. However, Smirnova points out that the use of the LHC data doesn’t stop there. The NDGF was first started as a three-year pilot project in 2003 to test setting up a distributed grid facility on the Nordic level, but it was soon understood that the project needed more time and a host organisation. The first host was NORDUnet. The Nordic research funding agencies quickly saw that the need for NDGF was increasing, and that to evolve, it would need to be organised differently. Around the same time, Gudmund Høst was involved in the project’s steering group, and the rest is history.

– The Nordic Tier-1 activity was the first activity within NeIC. Other projects come and go, but NT1 stays operational because it has to. If NeIC ceased to exist, we would have to find another host, because we simply have to keep the tier-1 facility running. The Nordic countries are committed to supporting CERN and its computing needs, Smirnova explains.

Smirnova says it has been very beneficial for the Nordic Tier-1 activity to be under the NeIC umbrella because NeIC provides the structure and the framework. It also helps with administration, such as hiring new staff. She adds that for them, NeIC is also a great way to reach out to other communities and let them know about using and operating shared e-infrastructure in scientific research. Not every community needs as many computing or storage resources, but she believes that in the future, there will be an increasing need for the expertise that has been gathered during the almost twenty years the distributed Nordic Data Grid Facility has been operational.