Bristol, Cambridge or London, United Kingdom

We’re hiring customer-focused Systems Engineers and Developers to apply infrastructure support and site reliability engineering approaches to significant projects embracing emerging ML compute technology.

This team helps customers extend their data center and cloud provisioning ecosystems to incorporate our ML compute products, helps define and build pipelines for migration and production operations, and provides site reliability engineering expertise, automation and technical support throughout.

As a platform provider, we offer you the opportunity to work across all sectors – including research organisations, universities, technology vendors and enterprises – encountering a diversity of ecosystems and best practices. You’ll empower others to develop new capabilities and accomplish things that were not previously possible, embracing emerging advances in machine intelligence.

We’re open to shaping a role around you that builds upon your strengths and interests – just let us know.

This may entail aspects such as: integrating with and adapting to the inner workings of network and IP infrastructure; supporting and iterating on solutions in production; and analysis, debugging and telemetry up to the transport or application layers. You may stretch your product coding skills too.

Of particular interest are skills applied to domains such as: site reliability engineering at scale; OpenStack admin or development; data center, SDN/NFV, HPC, scientific, grid or cloud computing infrastructure; and/or developing Linux-based systems for novel IP-based protocols.

We’re hiring an all-new team, including a lead engineer, and we’ll be pleased to explore the possibilities with you.

A flavour of work within this team

  • Interfacing between customers, industry partners and our domain experts
  • Defining and building effective infrastructure provisioning solutions
  • Designing and implementing compute workload migration pathways
  • Guiding on adapting and optimising software for new processors and systems
  • Designing, building and refining production pipelines and tooling
  • Contributing to aspects of our SDK product and virtual-IPU tools in Python and/or C++

Salary and benefits:

  • Compelling salary – talk with us about what you need
  • Stock options in a start-up with high growth potential
  • Flexible and inclusive working environment – UK hours, work at the times that suit you
  • Discretionary relocation assistance
  • Optional four-day week or part-time working
  • Flexible amount of holiday + UK national/public holidays
  • 10% CPD time in your calendar, with supporting budget – in addition to the L&D of your role

Matched personal pension | healthcare | life assurance | dental | health cash plan | income protection

We’re looking for:

  • Someone customer-focused and solution-oriented
  • A solid understanding of Computing, Maths or Engineering – accrued through formal education or equivalent applied practice
  • Linux configuration and management with shell scripting, Python or similar
  • Optionally: strong Python and/or C++ applied to Linux systems, infrastructure, or back-end development
  • Experience of configuring and managing hardware platforms, and infrastructure for clusters
  • Knowledge of Ethernet and IP networking standards
  • Production admin skills with two or more of: Kubernetes, Docker, Grid Engine, Slurm, OpenStack, public/private cloud etc.
  • Comfortable debugging across multi-layer solutions
  • Familiarity with modern CI/CD and orchestration methods
  • An aptitude for troubleshooting and a pragmatic application of engineering rigour: from the basic symptoms through to analysis and resolution with code fixes, workarounds, improved documentation, tutorials, and collaboration with other teams

You may also bring – or may optionally like to gain – skills around:

  • Running novel protocols on IP fabrics
  • HPC or hardware acceleration technologies
  • Data center infrastructure, storage, network, security, virtualisation
  • Compilers and Linux kernel driver development, debugging and system configuration
  • Linux operating systems and memory management

Source: Python.org Jobs Feed