
Ops Engineer




Job details


To achieve highly available, highly reliable and performance-efficient production systems, an SRE is given a very broad mandate and authority to operate. SRE responsibilities cover:


1. Monitoring & Troubleshooting

a. Monitoring the performance of our production systems using a host of monitoring tools

b. Proactively identifying and troubleshooting issues such as software bugs, misconfigurations and performance bottlenecks, and coordinating their fixes

2. Availability & Reliability

a. Increasing availability and reliability of our production systems

b. Coordinating Chaos Testing

3. Capacity Planning

a. Coordinating capacity assessment and capacity planning with IT Engineering and IT Architecture

4. Technical Risk and Health Assessment

a. Continuously running technical health assessments on production infrastructure and systems to identify configuration items (CIs) deviating from baseline

5. Service Level Management

a. Actively monitoring SLAs and ensuring that services perform within promised SLAs

b. Holding IT Engineering, Security and Architecture accountable for the remediation of any SLA degradation

6. IT Key Controls

a. Ensuring that IT is ‘in CONTROL’ by holding IT groups accountable for adherence to the IT key controls

b. Collating and providing necessary evidence to Auditors for these controls

7. Runners, Automation & Tooling

a. Architecting, creating and automatically managing an army of ‘runners’ or ‘bots’ that fully automate tasks across infrastructure and applications (e.g. extracting production data, generating production reports, triggering event responses)

b. Identifying and automating manual operational tasks

c. Building and integrating tools that will assist in improving system availability, reliability and performance

8. Incident & Problem Management

a. Coordinating incident management and service restoration

b. Serving on the on-call team of engineers that supports production systems

c. Working with BizDevOps squads on post-mortems and assisting in identifying and fixing reliability issues

9. Disaster Recovery (DR) & Business Continuity Planning (BCP)

a. Planning and managing the Disaster Recovery (DR) runbook and DR testing

10. Production Reporting

a. Gathering relevant data and providing accurate production reporting on availability, reliability, performance and capacity

11. Service Request Management

a. Coordinating the response to occasional service requests from our business partners, a small part of the job; for example, when a business unit requests the restore of a particular backup

If you are good at:

Systems administration

  • RHEL Linux

Virtualisation

  • VMware

Cloud Concepts, Platforms, Technologies & Tools

  • OpenShift
  • Containers & Orchestration – Docker & Kubernetes

Web/App Servers

  • NGINX, JBoss etc.

CI/CD

  • Git, Jenkins, Ansible, etc.

Scripting

  • Bash, Go, Python etc.

Programming Languages

  • Java

Databases

  • SQL server administration
  • Ability to write complex queries

Logging & Analytics

  • ELK

Networking

  • Load Balancing – F5, HAProxy
  • Firewalls

Monitoring

  • Prometheus, Grafana, Datadog, Statuspage etc.

If you are:

  • Enthusiastic, curious and self-driven about improving system reliability, availability and performance.
  • Able to analyze trends to proactively prevent incidents, and to understand and capture key data from logs.
  • Able to understand traffic flows and key dependencies between services.
  • Driven by a strong sense of ownership of problems.
  • Able to solve complex problems while remaining cool under pressure.
  • An effective collaborator and communicator.

If you’d like to work with:

  • IT Engineering Team
  • Experts from Europe

Apply now

Questions? Just ask
ING Recruitment team


At ING, we want to unlock the full potential of our employees, in particular through an inclusive culture where everyone can grow and have an impact on our customers and on society. We make diversity, equity and inclusion a priority. As an equal opportunity employer, we do not tolerate any form of discrimination, whether based on age, sex, gender identity, cultural background, experience, religion, race, ethnic origin, disability, family responsibilities, sexual orientation, social origin or any other status protected by law. If you need assistance during the application and/or interview process, please contact the recruiter for the position concerned. We will be happy to support you to ensure a fair and accessible process. Learn more about our commitment to diversity, inclusion and belonging.
