Site Reliability Engineer (P673)
About Us:
As a Senior Site Reliability Engineer at Kenility, you’ll join a tight-knit family of creative developers, engineers, and designers who strive to develop and deliver the highest-quality products to market.
Technical Requirements:
- Bachelor's degree in Computer Science or Information Technology, or a comparable qualification.
- Extensive hands-on experience with Linux systems, including system optimization and troubleshooting.
- Strong programming background with Python, focusing on automation, scripting, and system management.
- Deep understanding of container orchestration with Kubernetes (K8s); experience with Slurm is a plus.
- Hands-on experience with IaC tools like Terraform, Helm, and Ansible.
- Proficiency in containerization technologies such as Docker and Podman for reliable deployments.
- Experience working with CI/CD pipelines (GitLab preferred; experience with GitHub and Git is also relevant).
- Familiarity with monitoring and logging solutions like Prometheus, Grafana, and the ELK stack to track system health and performance.
- Strong knowledge of relational and distributed databases (PostgreSQL, DynamoDB, Cassandra), including performance tuning.
- Exposure to cloud platforms like AWS or GCP is highly beneficial.
- Understanding of networking principles, distributed systems, and security best practices.
Soft Skills:
- Responsibility
- Proactivity
- Flexibility
- Great communication skills