In today’s landscape, High Performance Computing (hashtag#hpc) plays a pivotal role across industries from finance to pharma and from advertising to aerospace, enabling intricate simulations, mathematical modeling, and data-intensive tasks.
Nevertheless, legacy HPC infrastructure, comprising outdated server, storage, and interconnect architectures hosted in end-of-life
hashtag#datacentres, can significantly impede an organization’s capacity to meet the escalating demands of today’s business environment. In the world of Time To Results, there are no prizes for second place.
But how to plan a successful a pathway for your organisation to modernise? Read on to find out.
Legacy HPC Challenges
The limitations of legacy HPC infrastructure are evident:
- Older bare-metal servers lacking scalability often result in queuing and delays due to increased workloads.
- Storage systems are unable to keep up with the surge in data, leading to performance bottlenecks.
- Network interconnects, like InfiniBand or OmniPath, hamper overall system performance.
- Limited ability to leverage modern technologies such as
- Maintenance costs and energy inefficiencies rise as aging components require more attention.
- On-premise data centre power and cooling inadequacies, expense and uptime performance
These limitations translate into inefficient resource utilization, an inability to cater to user needs, and decreased competitiveness as the technology gap widens. As exemplified by a recent
BNP Paribas
case study, upgrading from an outdated commodity cluster to contemporary HPC infrastructure yielded “an increase of nearly 30% in total capacity, reduced energy consumption by more than 50 percent, and decreased CO2 output by 85 percent.”
HPC Modernization Approaches
Transitioning to modern infrastructure is imperative, yet it presents considerable technological challenges due to existing dependencies. There are five primary approaches:
- Refresh
- Consolidate
- Gradually
- Cloud migration
- Adopt
HPC Data Center and Cooling Considerations
The density and power requirements of new HPC architectures, incorporating advanced CPUs, GPUs, FPGAs, and interconnects, present cooling and space challenges in conventional data centers. High-performance data centers, purpose-built for HPC, offer specialized infrastructure to support these dense configurations.
Key features include high cooling capacity using chilled water or two-phase immersion solutions, robust support for heavy rack loads, redundant power delivery to each rack, and scalable space for future expansion. For instance,
direct liquid cooling, provided by companies like
CoolIT Systems, can handle heat densities exceeding 60kW per rack while enhancing energy efficiency and enabling heat reuse for other stakeholders eg district heating providers.
By deploying modern HPC systems in suitable high-performance data centers, organizations can fully exploit the performance and efficiency gains promised by new technologies. The choice of data center environment significantly influences HPC modernization planning.
Each approach involves trade-offs that must align with modernization objectives and timelines. Often, a hybrid strategy is necessary to balance cost, risk, and business impact.
Efficient Job Scheduling and Workload Management
A key component of any HPC environment is the job scheduler, which queues, prioritizes, and allocates compute jobs across the available resources. Legacy HPC systems often used basic schedulers like PBS, LSF, or SGE.
Modern workloads and infrastructure require more advanced scheduling capabilities:
- Integration with cloud burst capabilities for dynamic scaling
- Support for diverse architectures like GPUs and FPGAs
- Optimizing job distribution across complex interconnect topologies
- Machine learning algorithms to automate optimization
- Integrated monitoring dashboards and reporting
When assessing modernization, scheduling needs to be evaluated. Upgrading the legacy scheduler or implementing an advanced solution like
Altair
Grid Engine may significantly boost throughput and utilization. If you are selecting an HPC partner their scheduling expertise is key for properly complementing your policies, queues, and algorithms.
Executing a Successful HPC Transition
Migrating from legacy to modern HPC infrastructure involves:
- Auditing existing systems and workloads to understand dependencies, risks, and requirements.
- Defining clear objectives, metrics, and milestones aligned with business needs.
- Creating a detailed migration plan and roadmap, including a business case.
- Piloting proposed changes on test systems to validate performance, stability, and integrity.
- Executing a phased rollout of new architectures while maintaining legacy access.
- Providing extensive technical support for troubleshooting and co-optimization of software stacks.
Well-defined validation criteria and rollback plans are essential to de-risk transition and ensure continuity of production workloads.
Evaluating HPC Service Providers
For organizations lacking internal HPC expertise, partnering with experienced HPCaaS providers can accelerate transition while mitigating costs and risks. When evaluating providers, consider:
- Demonstrated HPC technical expertise across architecture, administration, optimization, scheduling and configuration.
- Experience with successful legacy HPC migrations in your industry.
- Access to cutting-edge infrastructure, including servers, interconnects, storage, and accelerators.
- Support for hybrid cloud and on-premises infrastructure.
- Responsiveness to custom engagements and strong performance track record.
Leveraging external expertise enables IT leaders to focus on HPC strategy rather than technical implementation.
Conclusion
Legacy HPC infrastructure hampers competitiveness as demands escalate. A proactive modernization strategy is imperative to transition to flexible, scalable platforms harnessing cloud, AI, and accelerated computing. The process requires careful planning and execution to minimize disruptions. Partnering with experienced service providers offers access to turnkey solutions and expertise that complement internal capabilities. Organizations that modernize HPC infrastructure today will be best positioned for competitive advantages in the future.