IT InfiniBand/GPU - Sr Staff Systems Engineer

Company:

Cadence Design Systems


Job details

Cadence is looking for a Sr Staff Systems Engineer at our San Jose location to accelerate strategic customer deployments, ensure on-time bring-up, deployment, and troubleshooting of HPC infrastructure, and support technical roles working with HPC, InfiniBand, and GPU.
The successful candidate will be a hands-on technical contributor within the infrastructure team and will be exposed to customer interfaces involving the Windows and Linux operating systems.
The Systems Engineer will need experience in Linux environments and proficiency in tasks such as shell scripting.

Must Haves

15+ years of experience in system administration and engineering.
Minimum five years overall experience in technical roles supporting HPC, InfiniBand, and GPU.
Strong knowledge of Linux operating systems and of networking and security concepts.
Document and drive acceptance and qualification test plans, procedures, and reports.
Support customer deployments and ensure on-time bring-up of GPU servers.
InfiniBand fabric bring-up, configuration, and subnet management on the IB switch.
Participate in engagements with various SW and FW (BMC/SBIOS/OS/drivers, etc.) teams to develop best-in-class practices and tools; you will be analyzing, debugging, and resolving critical firmware and software issues affecting workload performance at scale.
Provide engineering solutions to enable large-scale performance strategies for Datacenter GPU Computing products and software stacks, ensure technical relationships with internal and external engineering teams, and assist systems engineers in building creative solutions.

Requirements

Accelerate strategic customer deployments and ensure on-time bring-up and deployment of HPC infrastructure.
Participate in engagements with various SW and FW (BMC/SBIOS/OS/drivers, etc.) teams to develop best-in-class practices and tools; you will be analyzing, debugging, and resolving critical firmware and software issues affecting workload performance at scale.
Provide engineering solutions to enable large-scale performance strategies for Datacenter GPU Computing products and software stacks.
Develop and implement server and rack-level telemetry; collaborate on and establish continuous improvements in our design flows.
Recent experience in critical data center technologies such as server architectures, software containers, job schedulers, and parallel computing.
Manage HPC clusters, actively connect with management regarding any problems with the equipment, and propose resolutions.
Establish and maintain IT infrastructure and procedures for customer-facing and internal systems.
Actively establish the technical relationship with our customers' engineers, management, and architects at focus accounts.
Create and develop test plans for new features on each product.
Recommend improvements to enable automated scripting for testing and archiving of results.
Provide remote cluster support to large environments, including scalability/flexibility and troubleshooting end-user issues involving job submission, runtime, and resource access.
InfiniBand fabric configuration and administration on Red Hat/CentOS/Linux, with experience in configuring PKeys and troubleshooting the end-to-end InfiniBand environment.
InfiniBand fabric bring-up, configuration, subnet management, and monitoring on the IB switch and client side for multi-tenancy setup.
Performance comparison of the InfiniBand network with cluster interconnects and debugging of InfiniBand performance-related issues.
Automate configuration management, software updates, and system availability maintenance and monitoring using modern DevOps tools (Ansible, GitLab, etc.).
Be a technical specialist on GPU computing and networking products, directly supporting GPU customers.
Direct experience and strong knowledge of parallel programming, GPU CUDA/ROCm development, and applications.
Actively partner with the R&D teams delivering services to our infrastructure to gather their service requirements to live within this infrastructure.
Automate repetitive tasks and implement custom solutions using scripting/programming languages such as Bash or Python.
Configure and troubleshoot a heterogeneous (QDR, FDR, EDR) InfiniBand network and associated subnet manager.
Experience with high-performance computer interconnects (e.g., 10 and 40 Gigabit Ethernet, InfiniBand).
Able to move 50+ pounds.

The annual salary range for California is $130,200 to $241,800. You may also be eligible to receive incentive compensation: bonus, equity, and benefits. Sales positions generally offer a competitive On Target Earnings (OTE) incentive compensation structure. Please note that the salary range is a guideline and compensation may vary based on factors such as qualifications, skill level, competencies and work location. Cadence is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, basis of disability, or any other protected class.



Source: Jobleads



