TEN SUCCESS STORIES FROM TEN YEARS OF SUCCESS: Cross-border collaboration on high-performance computing
The year 2022 marks the 10-year anniversary of NeIC. To celebrate this milestone, we will publish ten stories that showcase how NeIC has contributed to developing best-in-class e-infrastructure services beyond national capabilities and enhanced the productivity of research in the Nordic Region.
The Nordic e-Infrastructure Collaboration, also known as NeIC, was established in 2012. NeIC facilitates collaboration on digital infrastructure within the Nordic countries and Estonia by giving experts from different countries, organisations and fields opportunities to work together. Nordic collaboration on digital infrastructure had already started before NeIC was established. Since 2003, the Nordic countries have been collaborating on the Worldwide Large Hadron Collider Computing Grid (WLCG) at CERN, providing research computing and storage for high-energy physicists worldwide. The successful collaboration that started with the services offered by the Nordic Data Grid Facility (NDGF) was expanded after some years into NeIC, which was tasked with running the Nordic WLCG Tier-1 facility, also known as NT1. The establishment of NeIC also made it possible to facilitate collaborations that benefit other areas of science.
High-performance computing (HPC) solves advanced computation problems and is crucial in enabling high-quality research and the development of society: it makes data handling more efficient and speeds up production of results. NeIC has facilitated collaboration on high-performance computing for six years through two projects: Dellingr and Puhuri. Both of these projects have worked on resource sharing in the Nordic Region, but their goals and achievements have differed from one another.
Sharing resources to enable world-class research
Dellingr: exploring resource exchange possibilities in the Nordics
The Dellingr project was initiated to explore and test whether high-performance computing resources, such as computing, storage and support, could be shared and exchanged between countries and across borders. The approach was to work with the national e-infrastructure providers to define a functional framework for resource sharing that would recognise and build upon the unique strengths of each provider. This would advance research in each country individually, but also within the Nordic Region overall.
The first phase began in 2017 by investigating the readiness for and interest in sharing e-infrastructure resources in the Nordic Region. Based on the positive outcome of the first phase, a second phase followed and ran until March 2020. The focus of Dellingr was mainly on cross-border exchange of computational resources, and it was discovered that while such exchange is technically feasible, there are often political, legislative and financial restrictions or requirements that hinder or prevent sharing of resources on a long-term basis.
In Dellingr, two pilot projects were run to test resource-sharing methods and policies by offering e-infrastructure resources to researchers free of charge. The pilots produced several documents that can serve as a basis for further discussions on HPC resource sharing in the Nordic Region. Dellingr’s resource-sharing framework proved to be a useful tool for registering resources in compliance with the EOSC service description template, which would reduce future efforts in relation to participation in EU-funded e-infrastructure projects. The outcomes of Dellingr contributed to the Puhuri and EOSC-Nordic projects and in particular to the work on policies, legal issues and sustainability.
Puhuri: building a bridge between users and resources
The Puhuri project was initiated in June 2020 with all six NeIC countries represented in both the project team and the steering group. The project built on the experiences from the Dellingr project, and its key driver was the EuroHPC LUMI supercomputer, one of the world’s fastest supercomputers, which was to be built in Kajaani, Finland. The machine is hosted by a consortium of ten countries, including the Nordics and Estonia.
Puhuri was set up to solve the problem of distributed resource allocation: how to enable seamless cross-border access for researchers and scientists to the resources they need, with the primary focus being on LUMI. The project’s aim was to create solutions for every step between the user and the resources, from user authentication to resource allocation and tracking; to build a bridge between the new supercomputer and the users. One key feature of Puhuri was the creation of portals where allocation, access, and reporting could be handled as well as the ability to integrate with already existing national portals. Pilot users for LUMI were successfully registered and resources were allocated with the Puhuri service.
– Quite a lot was done, but it was clear that, for example, the portals, reporting and user identity vetting need to be further developed, says Jarno Laitinen, the project manager for the first phase.
Challenges of resource sharing
When publicly funded resources are shared between nations, legal considerations must be taken into account. Education, science and research ministries in the Nordic countries mostly fund their own researchers nationally. From the e-infrastructure point of view, some of the most relevant questions about sharing computing resources across borders are:
- Who is accessing the resource?
- Can the data be moved across borders?
- Are users from abroad allowed to use the national resources?
- Who will pay for cross-border usage and how?
The Dellingr project studied the fundamental political and legal issues related to the topic. The team managed to deploy practical implementations on a temporary basis, but no long-term solutions could be found. The same issues were also encountered in NeIC’s Glenna project, a collaboration on cloud computing and cloud services that ran from 2014 to 2020. To comply with most national regulations, full cost coverage must be implemented if non-nationals are to use the resources; in other words, the question of economic sustainability must be solved.
In the Puhuri project, legal and political issues related to resource sharing are not directly in scope. The service provides the technical means for distributed resource allocation, and the resource provider defines the shares that the resource allocators can get. However, the work done in Dellingr, as well as the community that was built around it, paved the way for the formation of the Puhuri project, which was necessitated by the inherent resource sharing of the LUMI consortium.
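The idea that a resource provider fixes the shares and each allocator then hands out resources within its own share can be illustrated with a minimal Python sketch. This is not the Puhuri implementation: the class name, the allocator names, the unit of capacity and the ledger format are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SharedResource:
    """Hypothetical model of a resource whose provider fixes each allocator's share."""

    name: str
    capacity: float                   # total units to hand out (unit is an assumption)
    shares: dict[str, float]          # allocator -> fraction of capacity, set by the provider
    used: dict[str, float] = field(default_factory=dict)
    ledger: list[tuple[str, str, float]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The provider cannot promise more than the whole resource.
        if sum(self.shares.values()) > 1.0 + 1e-9:
            raise ValueError("shares must not exceed 100% of capacity")

    def quota(self, allocator: str) -> float:
        """Units this allocator may distribute; zero if it has no share."""
        return self.capacity * self.shares.get(allocator, 0.0)

    def allocate(self, allocator: str, project: str, amount: float) -> bool:
        """Grant `amount` to `project` only if the allocator stays within its quota."""
        if amount <= 0:
            return False
        already = self.used.get(allocator, 0.0)
        if already + amount > self.quota(allocator):
            return False
        self.used[allocator] = already + amount
        self.ledger.append((allocator, project, amount))  # kept for later reporting
        return True
```

In this sketch, a resource with a capacity of 1000 units and shares of 50% and 25% for two allocators lets the first grant at most 500 units in total, while an allocator without a share cannot grant anything.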
The LUMI supercomputer
EuroHPC LUMI (Large Unified Modern Infrastructure) is the fastest supercomputer in Europe and the third fastest on the Top500 list. It was also ranked as the third greenest supercomputer on the Green500 list. LUMI is located in the data centre of CSC - IT Center for Science in Kajaani, Finland, and was inaugurated this summer. The computing power of the machine is equivalent to the combined performance of 1.5 million of the latest laptops. LUMI is also one of the world’s leading platforms for artificial intelligence. (Source: The LUMI website: LUMI supercomputer)
LUMI is one of the three European supercomputers built by the European High Performance Computing Joint Undertaking (EuroHPC JU), a joint initiative between the EU, European countries and private partners to develop a world-class supercomputing ecosystem in Europe. LUMI is hosted by a consortium of ten partners from ten countries: Belgium, the Czech Republic, Denmark, Estonia, Finland, Iceland, Norway, Poland, Sweden, and Switzerland. The supercomputer is a key resource for researchers all over Europe and one of the best-known scientific instruments in the world. For the Nordic Region, it brings clear benefits, as it enables easier access to better resources and provides extensive user support to enhance research and innovation.
– The Nordic collaboration through NeIC has been important for developing joint high-quality e-infrastructure solutions beyond national capabilities and boosting synergies between stakeholders. This has made a tangible contribution to research excellence, created Nordic added value and paved the way for the Nordics to seize opportunities on a European level as well. Projects like Puhuri demonstrate this, as it enables efficient use of infrastructures where all the Nordic countries are present, such as LUMI, states Kimmo Koski, Managing Director of CSC - IT Center for Science.
Finland is a great location for data centres due to low operating costs, safe environmental and political conditions, and an extremely high level of security. It is also ranked as the most stable and sustainable country in the world, and CSC’s data centre in Kajaani is one of the most eco-efficient centres there is. LUMI’s energy consumption is covered by hydroelectric power, and its waste heat accounts for about 20% of the district heating in the city of Kajaani, which substantially reduces the entire city’s annual carbon footprint. (Source: The LUMI website: Sustainable future)
Second phase of Puhuri
After the first two successful years of Puhuri came to an end in May 2022, a second phase of the project was initiated. For the next three years, the focus of the project team is on sustainability. The team is now led by Anders Sjöström.
The Puhuri solution developed for EuroHPC LUMI is equally suited to serving other resources. It is therefore planned to offer the service in the future for other supercomputers managed by the European High-Performance Computing Joint Undertaking as well as for other resources. For example, the Puhuri solution will be used in NeIC’s new project NordIQuEst, which aims to build a Nordic-Estonian e-infrastructure, and also for the European quantum computer resource, LUMI-Q.
Expanding the scope
Puhuri2 aims to consolidate and improve the existing services and functionalities as well as to expand and add capabilities. The project also aims to widen the customer base to resources within and outside the Nordics, for example to other interested EuroHPC sites. The scope of Puhuri2 has been expanded to include European research infrastructures as an important activity. As the project enters this phase of expansion in terms of both capabilities and customers, it is of utmost importance to clearly describe what added value Puhuri offers its customers and how they will benefit from the services offered. The project is in dialogue with multiple service providers interested in using the Puhuri system and expects to add several resources to the service in the coming years.
At the core of the Puhuri project is federated access to resources. To provide seamless access to resources for researchers across Europe, the level of assurance (LoA) offered by the Identity Providers (IdP) must be high enough that the user can be sufficiently identified to be granted access to the resource. Together with GÉANT (the collaboration of European National Research and Education Networks), the project has developed the MyAccessID federation, which is in the process of solving the question of LoA on a European level. This would remove one of the major obstacles in the provisioning of resources.
To sustain the project’s outcome, the Puhuri service, the team needs to develop a sustainable business model that balances income and costs. At the same time, the service itself is constantly being improved to make accessing resources even more seamless and efficient. The long-term sustainability of Puhuri depends not only on the existence of a viable business model, but also on the long-term stability of the owner of the Puhuri system. As NeIC cannot sustain projects indefinitely, Puhuri2 is investigating possible recipients of the system after the project ends in 2025.
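The role of the level of assurance in granting access can be illustrated with a short Python sketch. It assumes a simple ordered scale with three hypothetical levels; the level names, the example resource names and their requirements are invented for illustration and do not describe how MyAccessID or any real assurance framework defines or evaluates LoA.

```python
from enum import IntEnum


class Assurance(IntEnum):
    """Hypothetical ordered assurance levels; real frameworks define their own."""

    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical minimum level each resource demands of the user's identity provider.
REQUIRED_LOA = {
    "test-cluster": Assurance.LOW,
    "production-supercomputer": Assurance.HIGH,
}


def may_access(resource: str, idp_loa: Assurance) -> bool:
    """Allow access only if the IdP vouches for the user at the required level or higher."""
    required = REQUIRED_LOA.get(resource)
    if required is None:
        return False  # unknown resources are denied by default
    return idp_loa >= required
```

Under these assumptions, a user whose identity provider offers only a medium level of assurance could reach the test cluster but not the production machine, which is the kind of gap a common European answer to LoA is meant to close.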