ECMWF is both a research institute and a 24/7 operational service, producing global numerical weather predictions and other data for its Member and Co-operating States and the broader community. ECMWF carries out scientific and technical research to improve its forecasts, runs one of the largest supercomputer facilities in Europe and manages a long-term archive of meteorological data.
For details, see www.ecmwf.int
The Platforms and Services Section forms part of ECMWF’s Computing Department, and is responsible for delivering a wide range of services including mission-critical virtual and bare-metal server infrastructure, data centre and wide area networks, security, monitoring and analytics, enterprise ICT, as well as the toolchain to support software development, integration and testing, and automated deployment for in-house or community software developers.
Within the Platforms and Services Section, the CD Applications Team is responsible for containerised environments, service monitoring, centralised logging and analytics, identity and access management, continuous integration and deployment, and web application hosting.
Summary of the role
Our Reliability Engineers (RE) are responsible for ensuring that ECMWF and community-developed applications operate reliably and with good performance on our infrastructure. The RE will engage with, advise, steer and support all functions involved in the lifecycle of application deployment and hosting, including business, technical strategy, design, infrastructure, software development, tooling, service transition, service operation, and use.
This role requires experience of both IT systems and software development, with a focus on maintaining effective operations. Key skills include cloud-native technologies, automation, continuous integration and deployment, service monitoring, and application performance.
Day to day, you will act as a bridge between the Computing Department and in-house and community application developers, advocating good practice and building a greater understanding of architecture and design to enable reliable and performant operations. You will also develop services and provide on-call support for health monitoring and our containerised virtual environment.
Main duties and key responsibilities
- Providing engineering support for cloud platforms and applications
- Providing operational input to the development of cloud platforms and software systems
- Advocating for Reliability Engineering with ECMWF and community application developers
- Deploying open source, commercial, and proprietary software to containers, VMs, or bare metal
- Participating in regular 24-hour on-call rotas for mission-critical services
Personal attributes
- Excellent interpersonal and communication skills
- Strong analytical and problem-solving skills, with a proactive approach
- Self-motivated, and able to work with minimal supervision
- Dedication and enthusiasm to work in a geographically distributed team
- Ability to work efficiently and complete diverse tasks in a timely manner
Qualifications and experience required
- A university education to degree standard or equivalent industry experience.
- Demonstrated relevant professional experience.
- Experience in designing and developing applications in an operational Linux-based cloud environment.
- Experience in configuring network, server and storage infrastructures.
- Experience in deploying operational monitoring and performance analysis systems.
- This role would suit IT professionals with either a software development or IT operations background who are prepared to develop into a Reliability Engineering role.
Knowledge and skills:
Demonstrable experience of the following is required:
- Cloud Native (Kubernetes, Docker)
- Cloud IaaS (Amazon, Google, OpenStack, VMware, Terraform)
- DevOps: Continuous Integration, Software Engineering, and Automation pipelines
- Monitoring and analytics applications
- Puppet, Ansible, or similar modular configuration management
- Linux system administration
- Programming (any language) or scripting (Python, Ruby, Perl)
- The server, storage, and networking components required to support Cloud applications.
A working knowledge of some of the following is desirable:
- Go, Python, PHP, Java, Perl, Django, Tomcat.
- NoSQL (MongoDB), SQL (PostgreSQL or MySQL)
- Service-oriented architecture (e.g. RabbitMQ, Istio, service mesh)
- Geographic Information Systems and web mapping
- VMware vSphere, Vagrant, Containers (e.g. Docker, Kubernetes)
- git, Atlassian Bamboo
- AJAX, JSON, XML, HTTP protocol
- Web content management systems (e.g. Drupal)
- Microsoft Active Directory
- Identity and Access Management (e.g. OpenID Connect, SAML)
Please provide clear examples of your knowledge and experience in the space provided on the application form.
Candidates must be able to work effectively in English and interviews will be conducted in English.
Some knowledge of one of the Centre’s other working languages (French or German) is desirable but not essential.
The successful candidate will be recruited at the A2 grade, according to the scales of the Co-ordinated Organisations, and the annual basic salary will be £60,590.64 net of tax. This position is assigned to the employment category STF-C as defined in the Staff Regulations.
Full details of salary scales and allowances are available on the ECMWF website at www.ecmwf.int/en/about/jobs, including the Centre’s Staff Regulations regarding the terms and conditions of employment.
Starting date: As soon as possible.
Length of contract: Four years, with the possibility of a further contract.
Location: The position will be based in the Reading area, in Berkshire, United Kingdom.
Successful applicants and members of their family forming part of their households will be exempt from immigration restrictions.
Please make your application via the apply button.
At ECMWF, we consider an inclusive environment key to our success. We are dedicated to ensuring a workplace that embraces diversity and provides equal opportunities for all, without distinction as to race, gender, age, marital status, social status, disability, sexual orientation, religion, personality, ethnicity and culture. We value the benefits derived from a diverse workforce and are committed to having staff that reflect the diversity of the countries that are part of our community, in an environment that nurtures equality and inclusion.
Applications are invited from nationals of ECMWF Member States and Co-operating States:
Austria, Belgium, Bulgaria, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Hungary, Germany, Greece, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Luxembourg, Montenegro, Morocco, the Netherlands, Norway, North Macedonia, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey and the United Kingdom.
Applications from nationals of other countries may be considered in exceptional cases.