Contact - 8871646344/ 8817345514
HPC Engineer On-Prem & Cloud
Job Description
Role Summary/Purpose
The HPC Engineer works with other Engineering Team members and collaborates in the design, development, installation, and maintenance of simulation software & job scheduler software for the High-Performance Computing (HPC) systems.
The HPC Engineer is responsible for supporting the planning, implementation, availability, performance, security, maintenance, and repair of high-performance computing infrastructure (On Prem & Cloud).
The HPC Engineer participates in multi-vendor management, security, and network/Internet protocols for the Wabtec Engineering IT organization.
Roles and Responsibilities
- Supports day-to-day operations for the HPC team by monitoring computing resource performance, managing configurations, and addressing security administration. Applies revisions to system firmware and software. Engages and collaborates with vendors to assist with support activities as required.
- Develops new HPC software deployment plans, custom scripts, and testing procedures to ensure operational reliability for the Wabtec Engineering Team. Trains HPC Team members and Engineering Teams in the use of new software and hardware.
- Maintains and manages HPC user accounts. Installs, modifies, and maintains various software applications for access on HPC clusters. Provides support and documentation for software applications and programs.
- Designs, installs, configures, and maintains documentation for cluster infrastructure, including operating systems, job schedulers, resource managers, provisioning managers, configuration managers, network devices, and other components.
- Investigates, debugs, and addresses Engineering user inquiries and requests efficiently through a customer issue ticketing system. Communicates complex technical concepts in simple, straightforward language.
- Explores emerging technologies and technical developments to address expanding analytical requirements. Identifies new services and develops implementation plans. Stays current with best practices in the HPC field. Maintains collaborative relationships with peer Engineering and IT teams.
- Contributes to an inclusive environment that values differences by building and maintaining collaborative relationships with team members, peers and leaders. Actively embodies values and behaviors including accountability, ethics, and best-in-class customer service. Contributes to a culture of trust and transparency by sharing information broadly, openly, and deliberately.
Desired Candidate Profile
- Bachelor’s degree in a relevant field such as computer science, computer information systems, etc., or equivalent combination of education, training, and experience.
- Two years of experience in one of the following fields: information technology, system administration, or high-performance computing and cloud technologies.
- Familiarity with low-latency/high-bandwidth, interconnected infrastructure (including InfiniBand, 10/100 GigE, and others).
- Expertise with HPC system software, cluster management tools, job schedulers, and other HPC tools including Slurm, Ansible, Ansys, Altair, and LS-DYNA.
- Proficiency in fundamental programming (Bash, Python, C/C++ or similar languages). Expertise in administering, monitoring, and maintaining secure Linux/Unix operating systems (CentOS).
- Knowledge of HPC storage (FC, SAS) principles, file systems (NFS, Lustre, BeeGFS, ZFS, etc.), and compute node storage.
- Familiarity with shared and distributed memory parallelism (OpenMP, MPI), and accelerators (GPUs).
- Excellent written and oral communication skills, and the ability to establish strong, positive working relationships and rapport with diverse groups of team members.
- Ability to drive technical leadership and management of complex, large-scale computing system projects.
- Proficiency with multi-vendor management, security and network/Internet protocols.
- Demonstrated expertise in design, configuration, and planning, with excellent organization skills, and the ability to identify and resolve problems and manage performance.
- Excellent written and oral communication skills, with experience presenting technical topics to nontechnical audiences.
- Ability to establish processes for maintaining system performance and managing best-in-class standards.
Desired Characteristics
Business Acumen:
- Demonstrates the initiative to explore alternate technology and approaches to solving problems.
- Skilled in breaking down problems, documenting problem statements and estimating efforts.
- Demonstrates awareness about competitors and industry trends.
- Has the ability to analyze impact of technology choices.
Role: Configuration and Deployment Management
Salary: Not Disclosed by Recruiter
Department: IT & Information Security
Role Category: IT Infrastructure Services
Employment Type: Full Time, Permanent
Sigma Allied Services
Product Based Company
150 Years Old
Work In Locomotive Segment
Provide Digital Expertise
Contact Company: Sigma Allied Services