2022-09-15

TEN SUCCESS STORIES FROM TEN YEARS OF SUCCESS: Training that connects people

The year 2022 marks the 10-year anniversary of NeIC. To celebrate this milestone, we will publish ten stories that showcase how NeIC has contributed to developing best-in-class e-infrastructure services beyond national capabilities and enhanced the productivity of research in the Nordic Region.

The Nordic e-Infrastructure Collaboration, also known as NeIC, was established in 2012. NeIC facilitates collaboration on digital infrastructure within the Nordic countries and Estonia by giving experts from different countries, organisations and fields opportunities to work together. This Nordic collaboration on digital infrastructure had already started before NeIC was established. Since 2003, the Nordic countries have been collaborating on the Worldwide Large Hadron Collider Computing Grid (WLCG) at CERN, providing research computing and storage for high-energy physicists worldwide. The successful collaboration that started with the services offered by the Nordic Data Grid Facility (NDGF) was expanded after some years into NeIC, which was tasked with running the Nordic WLCG Tier-1 facility, also known as NT1. The establishment of NeIC made it possible to facilitate collaborations that benefit other areas of science.

NeIC strives to be a meeting point for experts around the Nordic region and to help them achieve more by collaborating across borders. This Nordic-Estonian pool of great minds also gives individuals new opportunities to develop their own skill sets and, at the same time, increase the competence of their home organisations. NeIC’s training activities have been exemplary in connecting people and building communities. We interviewed Radovan Bast, the manager of the CodeRefinery project, to learn more about them.

Training activities over the years

When NeIC was established as part of NordForsk in 2012, it was tasked with expanding the collaboration on digital infrastructure beyond what was done through the NDGF. The first development projects that started under the NeIC umbrella focused on sensitive data, cloud computing and high-performance computing services. These fields were critical and of strategic importance to the Nordic countries. At the same time, there were other fields where the countries would benefit greatly from sharing expertise and working together to improve e-infrastructure.

In 2016, NeIC started its first project that focused on training. This project was the first phase of CodeRefinery, and today, the project is in its third phase with a more focused scope and a wider impact than ever. CodeRefinery was and is led by Bast, who works at UiT - The Arctic University of Norway. In 2017, a training programme named Ratatosk started and ran for two years. Ratatosk was a mobility enhancement programme that aimed to raise the competence of staff and end users, leading to more effective and productive work by the Nordic e-infrastructure community. There were some synergies between the two projects, but their functions were different: Ratatosk brought people together and enabled, and even sponsored, training events arranged in the Nordics; CodeRefinery was and still is mostly about teaching.

In addition to CodeRefinery and Ratatosk, NeIC has since 2019 trained some 160 people in FAIR data management. NeIC’s efforts around FAIR data management impact science in the Nordics by boosting the quality of research data. The EU-funded EOSC-Nordic project, which is coordinated by NeIC, has also arranged several workshops and training events.

CodeRefinery – a cross-border teaching community

CodeRefinery is a project under the NeIC umbrella: it is partially funded by NeIC and the management is supported by NeIC. Project manager Bast says CodeRefinery is a project that does many different things, but its main function is to organise and deliver training events on tools and techniques that many researchers need but most don’t have access to in their regular curricula.

– These days, many researchers need to work with data and code. In most cases, this, and especially the tools they might need, is not covered in their training. CodeRefinery teaches researchers to use tools such as version control, GitHub and different notebooks, and covers reproducible and modular software development as well as documenting and testing code, Bast explains.

For the training events, the project team develops the lesson material themselves. CodeRefinery collaborates with new events to pool competencies and facilitate new connections. In addition, CodeRefinery provides services such as source code repository hosting on GitLab. Still, providing cross-border, accessible training and training material is the core of CodeRefinery.

– The scope of CodeRefinery is similar to that of The Carpentries, which is a big, international organisation consisting of different communities teaching data, coding, and library skills. The Carpentries are an inspiration for a number of tools and techniques which we use in our training. The difference between us and The Carpentries is that in our workshops, a typical participant is a student or researcher who already writes code, and we teach them how to collaborate and use tools; The Carpentries and Data Carpentry workshops are highly recommended for getting started with programming and programming tools. CodeRefinery is a good next step for someone who has been to a Carpentries workshop, Bast says.

CodeRefinery teaches how to make software findable, accessible, interoperable and reusable in practice. These days, the FAIR principles get a lot of recognition and, consequently, funding, but usually only when we talk about FAIR data. Bast says that if we want data to be FAIR, the software needs to match that - after all, it is the software that we use to produce, post-process, read and connect data. Both FAIR data and FAIR software are needed to make science more open and useful for future projects.

From a local initiative to training hundreds at a time

CodeRefinery grew out of courses organised by Swedish e-Science Education. Bast, who at the time was located in Stockholm, was involved in preparing them. The courses focused on software development tools and best practices, and participants would come to Stockholm from all parts of Sweden. Two courses were held, the first in 2014 and the second in 2015, and they were very popular. However, it was clear that the problems the courses were trying to solve were not national. Any country would benefit from such training, and it could probably also be done at the Nordic level. Bast’s colleague Rossen Apostolov decided to write a project proposal and reach out to NeIC.

The CodeRefinery proposal was submitted in 2015, three years after NeIC was established. At the time, NeIC did not have specific calls for new development projects, and the proposal was sent directly to NeIC’s board to discuss and decide on. The proposal was accepted, and CodeRefinery started as a NeIC project in October 2016. They held their first training event two months later, and since then, over 30 training events have been arranged.

The Swedish e-Science Education courses were a starting point for CodeRefinery’s training material, but Bast says that both the content and the form of events have completely changed over the years. The scope of the project has been adjusted, and it has evolved more towards training and collaboration and less towards infrastructure. In the last two and a half years, CodeRefinery had to undergo a major change when they moved from in-person teaching to online teaching. According to Bast, this has opened up the project a lot. Not only have they reached more participants than before, but they have also been able to involve many more volunteers in delivering the workshops. Workshops have also been live streamed and recorded so that anyone can follow them.

– Our focus is still on the Nordics because most of our volunteers come from the Nordics, but once again, it has become clear that these problems are not confined to this region. Since the pandemic, the workshops can be and have been attended by people from other countries. We also have really good collaborations with partners from the Netherlands, for example, he says.

Sustainability plans

The third phase of CodeRefinery relies more on in-kind contributions than the previous two phases did. Thus, the community consists more of people and organisations that volunteer to help than of those who are paid to do so. Bast says that motivating organisations to join and ensuring that it is beneficial for them is very important for the project. After CodeRefinery’s third phase ends in 2025, their plan is to continue the great efforts and keep the community running as a non-profit organisation. This requires formulating and communicating a clear message to organisations, telling them why they should join CodeRefinery as volunteers and what they get out of it. Some coordination will probably be required, and this could be funded with some kind of sponsorship or membership model. The future is still open, and Bast hopes that as phase three proceeds, the right path towards sustainability becomes clear.

As the project manager, Bast sees that CodeRefinery has over the years produced many outcomes worth mentioning - and sustaining. The greatest of them are the active community around the project, the massive number of people trained over the years (roughly 1,500!), the brand that has been created and strengthened over the years, and lastly, their extensive operational manual for documentation. Other impacts and some statistics of CodeRefinery can be seen here.

Bast also lists some challenges that CodeRefinery will have to tackle in the coming years. The project needs to work on making their open-source material easily citable, so that the people who contribute get the credit they deserve. They should also communicate better than before about their needs: what the project’s short- and long-term goals are, and what exactly volunteers and organisations can do to help achieve them.

– In ten years, I hope that our teaching material has been absorbed into courses and training that all students and researchers have access to. I hope that we have grown to a diverse, welcoming, and inclusive community around research software engineering, training, and learning, where community members - both individual and institutional - find value in participating and contributing, Bast says.

Training leads to more efficient use of infrastructure

Sharing software and making it easier to share is an essential part of the CodeRefinery training. Even though CodeRefinery is not really focusing on the infrastructure part, Bast states that training and infrastructure go hand in hand. CodeRefinery wants to teach students and researchers who use research infrastructure in their work to use it efficiently. Research infrastructure is often publicly funded, and when it is used efficiently, it maximises the return on investment - our investment. The sooner researchers get their results, the more they get done and the more research they publish.

– It’s very difficult to measure the number of papers published per 1,000 NOK invested, but whatever that number is, we are trying to make it bigger. Training and mentoring are very important in this, especially training on efficient use of computation, data, storage, sensitive data, the tools, programming languages and approaches. This goes beyond what we do in our project, but we hope that through our network, we can attract and bring forward training efforts that focus on aspects other than programming tools, Bast says.

If a community is only given an infrastructure without any training, documentation, mentoring or support, the infrastructure is not very useful. Bast adds that the staff also need to be trained on the latest tools and approaches, and on how to train the users. This is something CodeRefinery does by training instructors with the goal of making teaching more inclusive, accessible and useful. Training is a part of research and a part of being a researcher, needed not only to be able to use the infrastructure but also to learn about and navigate the FAIR landscape.

Connecting people

When asked about the most important outcomes of NeIC’s training activities in general, connecting people comes to Bast’s mind first. He says Ratatosk was particularly effective at this, as it connected national training coordinators and managed to pool competencies. CodeRefinery has also arranged many workshops and other events and facilitated new networks and collaborations such as the Nordic Research Software Engineers. Ratatosk also managed to bootstrap The Carpentries workshops in the Nordics, which has been very beneficial for CodeRefinery.