Site Reliability Engineer/ System Administrator
2025-06-26T16:15:21+00:00
ENGIE Energy Access
https://cdn.greatzambiajobs.com/jsjobsdata/data/employer/comp_8841/logo/ENGIE%20Energy%20Access.png
https://www.engie.com/en/taxonomy/term/1467
FULL_TIME
Lusaka
Lusaka
10101
Zambia
Business Management and Administration
Computer & IT
2025-07-10T17:00:00+00:00
Zambia
8
Job Purpose/Mission
We are seeking a talented and experienced System Administrator/Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems and services. You will collaborate with cross-functional teams to implement and maintain robust infrastructure solutions, focusing on automation, monitoring, and incident response. The ideal candidate is passionate about optimizing and enhancing system reliability, possesses strong problem-solving skills, and is committed to driving excellence in operational practices.
Responsibilities
1. Infrastructure Automation:
- Develop and maintain automation tools and scripts for provisioning, configuration, and deployment.
- Implement infrastructure as code (IaC) practices to ensure consistency and reproducibility.
2. Monitoring and Incident Response:
- Set up and maintain monitoring systems to detect and respond to performance issues and outages.
- Participate in on-call rotations, respond promptly to incidents, troubleshoot issues, and implement solutions to prevent recurrence.
3. Performance Optimization:
- Optimize system performance through continuous analysis and tuning.
4. Reliability Engineering:
- Implement best practices for reliability, such as error budgeting, SLIs/SLOs, and blameless post-mortems.
- Work towards minimizing manual intervention through automation.
5. System Administration:
- Manage and maintain server infrastructure, including installation, configuration, and troubleshooting of operating systems.
- Implement and maintain security measures, such as firewalls and intrusion detection systems.
- Perform regular system backups and recovery procedures.
6. Collaboration and Communication:
- Collaborate with cross-functional teams to align infrastructure and operational requirements.
- Provide technical guidance and support to colleagues in areas related to reliability.
Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- Proven experience as a Site Reliability Engineer or System Administrator.
- Strong Linux and Bash scripting skills.
- Proficiency in cloud platforms (e.g., AWS, Azure, GCP, Linode, DigitalOcean).
- Experience with containerization and orchestration tools (e.g., Kubernetes, Docker, LXD).
- In-depth knowledge of networking, security, and system administration.
- Familiarity with infrastructure as code tools (e.g., Terraform, Ansible).
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.
Preferred Qualifications:
- Experience with CI/CD pipelines and related tools.
- Knowledge of distributed systems and microservices architecture.
- Familiarity with observability tools (e.g., Prometheus, Grafana, ELK stack).
- Familiarity with programming languages (e.g., Python, Ruby).
We thank all applicants for their interest; however, due to the large volume of applications we receive, only shortlisted candidates will be contacted.
No Requirements
JOB-685d7219e6607
Vacancy title:
Site Reliability Engineer/ System Administrator
[Type: FULL_TIME, Industry: Business Management and Administration, Category: Computer & IT]
Jobs at:
ENGIE Energy Access
Deadline of this Job:
Thursday, July 10 2025
Duty Station:
Lusaka | Lusaka | Zambia
Summary
Date Posted: Thursday, June 26 2025, Base Salary: Not Disclosed
Work Hours: 8
Experience: No Requirements
Level of Education: Bachelor’s degree
Job application procedure
- ENGIE is an equal opportunity employer, promoting diversity and committed to creating an inclusive environment for all. All applications are screened based on business needs, job requirements and individual qualifications, without any regard to origin, age, name, sexual identity, orientation or preference, religion, marital status, health, disability, political opinions, union involvement or citizenship. Our differences are our strengths!
- Business Unit: GBU Flexible Gen & Retail
- Division: Energy Access
- Legal Entity: FENIX INTERNATIONAL UGANDA LIMITED COMPANY
- Professional Experience: Junior (experience < 3 years)
- Education Level: Bachelor’s Degree
- To apply for this job please visit jobs.engie.com.