Infra Support Engineer

Fuku · Taiwan, TW · 2 months ago

Infra Support Engineer – GMI Global Infrastructure Team

Preferred Location

  • Taiwan
  • Malaysia

Responsibilities

  • Provide first- and second-line technical support to customers for AI infrastructure, including GPU/CPU nodes, networking, storage, orchestration, and platform services. Support is delivered via ticketing systems, email, Slack, or other messaging platforms.
  • Support GPU cluster delivery, including system provisioning, image deployment, network validation, BIOS/firmware updates, and GPU driver/runtime installation.
  • Monitor system health and service-level indicators using alerts and dashboards; respond to alerts as part of a scheduled 24/7 rotation.
  • Triage incidents by gathering context, verifying scope and impact, and following standard operating procedures and runbooks to perform immediate mitigations.
  • Escalate incidents to global SRE engineers with clear, concise incident notes and relevant logs/traces.
  • Maintain incident logs, update status pages, and communicate timely updates to stakeholders during incidents.
  • Perform routine operational tasks such as log checks, health checks, capacity checks, and simple automated fixes.
  • Participate in postmortems and contribute actionable follow-ups to reduce recurrence of incidents.
  • Help maintain and improve standard operating procedures (SOPs), run periodic runbook validation, and document new procedures.
  • Work collaboratively with developers and SRE teams to improve system reliability.

Qualifications

  • Bachelor’s degree in Computer Science or a related field.
  • More than two years of experience in IT operations, server administration, SRE, DevOps, or technical support.
  • Hands-on Linux experience, including shell, kernel, and log management.
  • Basic networking knowledge, including TCP/IP, DNS, HTTP, and VLANs.
  • Familiarity with monitoring, alerting, and logging tools such as Prometheus, Grafana, and Alertmanager.
  • Experience with NVIDIA GPU infrastructure and Kubernetes.
  • Comfortable collecting diagnostics, reading logs, and interpreting traces.
  • Strong troubleshooting mindset and ability to follow runbooks under pressure.
  • Excellent written and verbal communication skills for customer-facing incident handling.
  • Willingness to work shifts and participate in on-call rotations.
  • Bilingual in English and Chinese.

Headquarters

Taiwan

Work Location

On-site

Job Category

IT - Network / Systems / DB Admin

Application Deadline

Not specified

Job Type

Full-time

Experience Level

Entry-level

Application Method

Apply via Website

Salary

Not specified
