Advanced Computer Science and Data Visualization
Fermilab is continually expanding its capabilities for handling, storing, processing, and analyzing scientific data to meet the needs of the U.S. community’s exciting and ambitious particle physics research program. To keep the community at the forefront of discovery and enable new directions in physics research, Fermilab pursues an advanced R&D program for software and computing solutions, specifically targeting the application of industry standards and emerging technologies. This strategy is based on a close partnership with the DOE Advanced Scientific Computing Research (ASCR) program, enabling the particle physics community to take advantage of the latest developments in High-Performance Computing (HPC) and to contribute to the evolution of the National Strategic Computing Initiative toward exascale computing. The lab’s strategy for Advanced Computer Science, Visualization, and Data consists of the following four major initiatives.
art Software Framework
Software is an integral part of the scientific process and is at the heart of every experiment. From recording, storing, and retrieving detector data to performing physics reconstruction and analysis, software must be efficient and user-friendly so that experiments can extract physics results. The scientific workflow framework art provides the community with a common software layer to store and access scientific data and to schedule algorithms efficiently and conveniently. The system grows to meet the needs of individual experiments through minimal experiment-specific customizations that do not require expert knowledge. All of Fermilab’s operating neutrino and muon experiments now use the art framework for their experiment software: NOvA uses art for offline data handling and analysis, and MicroBooNE uses the framework extensively for both online and offline data handling and processing.
Fermilab will continue to evolve art to keep pace with major software and computing hardware developments, such as the increasing diversity of grid, cloud, and supercomputer resources. These developments will require changes in the underlying framework but will benefit many experiments simultaneously, minimizing the effort required from each individual project. The community has also started to benefit from derived products based on art, such as artdaq, a common data acquisition toolkit, and LArSoft, a common suite of reconstruction software for liquid-argon neutrino detectors. The evolution of art and its software ecosystem is a significant cornerstone of Fermilab’s strategy to enable future scientific workflows.
Fermilab’s approach of providing common solutions for the community has been successful across a wide range of scientific software. In addition to art, Fermilab provides common solutions for workload management and data management for many experiments. Together with HTCondor and the Open Science Grid, Fermilab also provides a common solution for resource provisioning (glideinWMS).
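The module-scheduling pattern at the heart of art can be illustrated with a small sketch. The following toy Python model is not art's actual API (art is a C++ framework); the class names, product labels, and configuration keys are invented for illustration. It shows the general idea: experiment code lives in small configurable modules that read and write named data products on an event, and the framework schedules those modules over each event.

```python
# Illustrative toy of a framework-scheduled event-processing pattern.
# All names here are hypothetical; art's real interfaces differ.

class Event:
    """Holds named data products for one detector event."""
    def __init__(self):
        self._products = {}

    def put(self, label, product):
        self._products[label] = product

    def get(self, label):
        return self._products[label]

class HitCalibrator:
    """Hypothetical experiment-specific module: turns raw hits into
    calibrated hits using a small configuration, so the experiment
    never touches the scheduling machinery."""
    def __init__(self, config):
        self.offset = config.get("offset", 0.0)

    def produce(self, event):
        raw = event.get("raw_hits")
        event.put("calib_hits", [h + self.offset for h in raw])

class Scheduler:
    """The framework's job: run the configured modules, in order,
    over every event."""
    def __init__(self, modules):
        self.modules = modules

    def process(self, events):
        for event in events:
            for module in self.modules:
                module.produce(event)
        return events

# Minimal usage: one event, one configured module.
evt = Event()
evt.put("raw_hits", [1.0, 2.0, 3.0])
Scheduler([HitCalibrator({"offset": 0.5})]).process([evt])
```

The point of the pattern is the separation of concerns the text describes: the framework owns data access and scheduling, while each experiment supplies only small, configurable modules.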
HEPCloud
Particle physics requires copious computing resources to extract physics results. Such resources are delivered by a variety of systems: local batch farms, grid sites, private and commercial clouds, and supercomputing centers. Grid resources will continue to play a central role in Fermilab’s strategy for providing experiments with access to computing. Elasticity, the ability to temporarily and significantly increase the resources available to an individual experiment, is becoming more and more important. Sharing unused grid resources opportunistically is vital for elasticity, but in the future it will not be sufficient, and new forms of computing resources such as clouds and HPC centers will need to be integrated. Fermilab’s strategy is to provide easy access to all of these resources. Historically, expert knowledge was required to access and concurrently use them efficiently. Fermilab is therefore pursuing a new paradigm in particle physics computing: a single managed portal (HEPCloud) that will allow more scientists, experiments, and projects to use more resources, and extract more science, without the need for expert knowledge. HEPCloud will provide cost-effective access by optimizing usage across all available types of computing resources and will elastically expand the resource pool on short notice (e.g., by renting temporary resources on commercial clouds). This elasticity, together with transparent access to resources, will change the way experiments use computing to produce physics results.
The CMS collaboration was among the first users of HEPCloud, which is now also used by Mu2e and NOvA. CMS exploited elasticity at high scale in a SuperComputing 2016 demonstration in partnership with Google: using the Google cloud, CMS doubled its computing resources to over 300,000 cores during peak periods of eight hours. The samples produced were used to generate scientific results presented at the 2017 winter conferences.
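The kind of cost-aware brokering such a portal performs can be sketched in a few lines. This is a deliberately simplified illustration, not HEPCloud's actual decision logic; the resource names, capacities, relative costs, and the greedy cheapest-first policy are all invented for the example.

```python
# Illustrative sketch of cost-aware resource brokering: fill a request
# from the cheapest pools first, bursting to elastic (e.g. commercial
# cloud) capacity only when cheaper pools are exhausted.
# Resource names and costs are hypothetical.

def allocate(request_cores, resources):
    """resources: list of (name, free_cores, relative_cost).
    Returns an allocation plan and any unmet remainder."""
    plan = {}
    remaining = request_cores
    for name, free, _cost in sorted(resources, key=lambda r: r[2]):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take:
            plan[name] = take
            remaining -= take
    return plan, remaining

resources = [
    ("local_batch", 5_000, 0.0),        # owned capacity: effectively free
    ("osg_opportunistic", 20_000, 0.1), # shared, unused grid cycles
    ("commercial_cloud", 300_000, 1.0), # elastic but most expensive
]
plan, unmet = allocate(100_000, resources)
```

Here a 100,000-core request drains the owned and opportunistic pools first and rents only the shortfall from the cloud, which is the elasticity-on-short-notice behavior the text describes.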
The partnership with ASCR HPC facilities is becoming a cornerstone of the HEPCloud resource portfolio. Through HEPCloud, experiments gain access to HPC resources and are using them successfully for a variety of processing tasks; CMS and Mu2e were among the first to exploit these capabilities. Both the U.S. neutrino and muon physics programs and the U.S. CMS program received allocations on ASCR HPC facilities during 2017, which they plan to use through HEPCloud. It is Fermilab’s continuing goal to engage ASCR researchers and facility experts to deepen the partnerships needed both for porting HEP applications to these resources and for developing the software and security infrastructure to access them.
Active Archive Facility
Fermilab is a world leader in handling many petabytes of scientific data for the national and international particle physics community. This includes transferring large amounts of data into the Fermilab facility, archiving the data on Fermilab’s exabyte-scale tape system, and using high-performance disk caches to provide efficient and secure access to the data, both locally and from off-site. A similar need for large-scale data handling has recently arisen in other scientific disciplines. Fermilab’s Active Archive Facility (AAF) will be developed to enable experiments and collaborations outside particle physics to benefit from Fermilab’s leadership, expertise, and facilities. The AAF will open Fermilab’s storage, archiving, and data access capabilities to other research communities, with methods implemented to recover costs. The Simons Foundation continues to be one of the first science organizations outside particle physics to benefit from AAF capabilities: additional genomics research data have been stored on tape media at the AAF and are regularly and securely accessed from around the world.
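The access pattern described above, a fast disk cache in front of a large tape archive, can be sketched with a simple least-recently-used (LRU) policy. This is an illustrative toy, not the AAF's actual implementation; the class, cache size, and eviction policy are invented for the example.

```python
from collections import OrderedDict

# Illustrative toy: a bounded disk cache in front of a tape archive.
# A read is served from disk when the file is "hot"; otherwise it is
# staged from tape and the least recently used file is evicted.

class CachedArchive:
    def __init__(self, tape, cache_slots):
        self.tape = tape             # full archive: filename -> bytes
        self.cache = OrderedDict()   # hot copies on disk, in LRU order
        self.cache_slots = cache_slots
        self.tape_reads = 0          # count of (slow) tape stages

    def read(self, name):
        if name in self.cache:               # cache hit: serve from disk
            self.cache.move_to_end(name)     # mark as most recently used
            return self.cache[name]
        self.tape_reads += 1                 # cache miss: stage from tape
        data = self.tape[name]
        self.cache[name] = data
        if len(self.cache) > self.cache_slots:
            self.cache.popitem(last=False)   # evict least recently used
        return data

# Minimal usage: three archived files, room for two on disk.
archive = CachedArchive({"a": b"A", "b": b"B", "c": b"C"}, cache_slots=2)
for name in ["a", "b", "a", "c", "a"]:
    archive.read(name)
```

In this access sequence only three of the five reads touch tape; repeated access to the popular file "a" is served from the disk cache, which is how such a facility keeps access to archived data efficient.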
Exascale Computing
New computing architectures are emerging from advances in computing hardware, and the exascale era of computing will drive these innovative hardware and software technologies. Particle physics will need to adapt so that researchers can benefit from the increased performance of these architectures. Synergies between the DOE/HEP and DOE/ASCR programs will be essential for the U.S. particle physics community to take advantage of this new computing era. Fermilab will seed, cultivate, and coordinate cross-cutting development efforts between DOE/HEP experimental programs and DOE/ASCR institutions to maximize the benefit from exascale computing. With a recently funded LDRD effort on “Exascale-era computing and HEP”, Fermilab is starting down the path of evolving particle physics software and computing to be part of the exascale future.