Scientific OpenStack is a variant of the OpenStack cloud computing platform, customised to support scientific research and data-intensive applications. Research and scientific disciplines were some of the earliest and most prevalent use cases for OpenStack clouds. Today OpenStack provides compelling solutions for the challenges of delivering flexible infrastructure for high-performance computing (HPC) and high-throughput computing (HTC), and its development community is rapidly expanding existing and new services to meet future demands.
Scientific OpenStack provides a range of features and tools that enable researchers to store, manage, and perform complex analysis of large volumes of data. Some of the key features of Scientific OpenStack include:
High-performance computing capabilities – Scientific OpenStack is designed to support high-performance computing workloads, such as simulation and modelling, data analysis, and machine learning.
Customisable infrastructure – Scientific OpenStack allows researchers to customise the infrastructure to meet their specific needs, including compute, storage, and networking resources.
Containers – Scientific OpenStack supports container technologies such as Docker and Kubernetes. This enables researchers to build and run complex workflows involving multiple applications in a reproducible and scalable manner.
Security – Scientific OpenStack is designed to meet the highest standards of data security and confidentiality.
An OpenStack cloud (named 'Arcus') is currently the base infrastructure provider for all services deployed by the Research Computing division at the West Cambridge Data Centre, including provisioning of new high-performance CSD3 HPC clusters, Secure Research Computing environments, storage, hypervisor and web application platform services.
Principal Investigators, research group members and group IT support staff can apply to rent a portion of the available cloud resources, on which to build their own research computing applications without first needing to provision physical hardware in their home department. If you would like to investigate this possibility, email us at firstname.lastname@example.org
ExCALIBUR (Exascale Computing Algorithms & Infrastructures Benefiting UK Research) is a UK research programme that aims to deliver the next generation of high-performance simulation software for the highest-priority fields in UK research. It started in October 2019 and will run until March 2025, redesigning high-priority computer codes and algorithms to meet the demands of both advancing technology and UK research.
RCS is actively working on ExCALIBUR’s ExCALIData, a project which has a number of milestones and deliverables anticipated in the next three years.
Redesigning codes and workflows to take advantage of exascale simulation requires addressing I/O and workflow problems which arise throughout a sequence of activities that may begin and end with deep storage and will involve both simulation and analysis phases.
Many simulations begin by exploiting data which need to be extracted from deep storage. Such data provide ancillary data, initial conditions, and boundary conditions for simulations. In weather and climate use cases, some of these data may need to be extracted from within large datasets, and even from within ‘files’ inside such datasets. Some processing may be required to do the extraction, possibly including decompression, subsetting, regridding (‘re-meshing’) and reformatting. Depending on volumes and available tools, such processing may be best handled adjacent to (and/or within) the storage system, or handled during the simulation workflow itself.
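The extraction steps named above (decompression, subsetting and regridding) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the tooling used in the project: the gzip-compressed JSON "archive", the index-based subsetting, the block-average regridding, and the function names `subset`, `coarsen` and `extract` are all hypothetical stand-ins for the formats and libraries a real workflow would use.

```python
import gzip
import json


def subset(field, row_range, col_range):
    """Keep only the rows and columns inside the given half-open index ranges."""
    (r0, r1), (c0, c1) = row_range, col_range
    return [row[c0:c1] for row in field[r0:r1]]


def coarsen(field, factor):
    """Regrid to a coarser mesh by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(field), len(field[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [field[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def extract(compressed, row_range, col_range, factor):
    """Decompress an archived field, cut out a region, and regrid it."""
    field = json.loads(gzip.decompress(compressed))
    return coarsen(subset(field, row_range, col_range), factor)


# Toy 8x8 "archived" field (hypothetical data, for illustration only).
grid = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
archived = gzip.compress(json.dumps(grid).encode("utf-8"))

# Extract rows 2-5 and columns 0-3, then halve the resolution.
initial_conditions = extract(archived, (2, 6), (0, 4), 2)
print(initial_conditions)
```

Whether a function like `extract` runs next to the storage system or inside the simulation workflow is exactly the placement choice described above; the sketch only fixes what the processing does, not where it runs.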
We address storage interfaces at the application level and tools for managing information about what is where across tiers and between institutions; we examine the possibilities for accelerating I/O using both network fabric and advanced burst buffers; and we compare and contrast examples of generic and domain-specific I/O middleware.
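To make "information about what is where across tiers and between institutions" concrete, the following Python sketch shows one possible shape for such a record. It is purely illustrative: the `Replica` and `Catalogue` names, the tier labels, the tier ordering and the example paths are all assumptions, and none of them describes the project's actual tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a dataset: where it lives and on which storage tier."""
    site: str
    tier: str
    path: str


# Assumed ordering from fastest to slowest tier (hypothetical labels).
TIER_RANK = {"burst_buffer": 0, "disk": 1, "tape": 2}


class Catalogue:
    """In-memory index of dataset replicas across tiers and institutions."""

    def __init__(self):
        self._replicas = {}

    def register(self, dataset, replica):
        self._replicas.setdefault(dataset, []).append(replica)

    def fastest(self, dataset, site=None):
        """Return the replica on the fastest tier, optionally limited to one site."""
        candidates = [r for r in self._replicas.get(dataset, [])
                      if site is None or r.site == site]
        if not candidates:
            return None
        return min(candidates, key=lambda r: TIER_RANK[r.tier])


catalogue = Catalogue()
catalogue.register("ocean_run_01", Replica("site_a", "tape", "/archive/ocean_run_01"))
catalogue.register("ocean_run_01", Replica("site_b", "disk", "/scratch/ocean_run_01"))

best_anywhere = catalogue.fastest("ocean_run_01")
best_at_site_a = catalogue.fastest("ocean_run_01", site="site_a")
print(best_anywhere.site, best_anywhere.tier)
print(best_at_site_a.site, best_at_site_a.tier)
```

A workflow could consult a lookup like `fastest` before staging data, so that users do not have to track replica locations by hand.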
This work recognises the difficulties that users have in managing the location of their data across tiered storage, and in configuring their workflows to take advantage of newly available acceleration options which otherwise require rather arcane knowledge. Recognising that the analysis and simulation phases pose different problems, we target both phases. We specifically look at the impact of the various technologies on synthetic tests, and on real use cases from both Weather and Climate, and Fusion (including techniques for AI/ML and other analysis on the huge datasets expected).