JobSearchGulf

Senior Site Reliability Engineer - Jobs in United Arab Emirates, Dubai

0.00 to 2.00 Years   United Arab Emirates, Dubai   29 Nov, 2023
Job Location: United Arab Emirates, Dubai
Education: Not Mentioned
Salary: Not Mentioned
Industry: Other
Functional Area: Not Mentioned

Job Description

About DWS

Decentralized Web Services (DWS) is a cloud infrastructure platform that unites the best of both worlds: affordable prices and high security. It provides developers with an easy-to-use cloud platform where they can deploy GPU, Compute and Storage resources quickly at highly competitive prices. DWS storage is 30 times more affordable than traditional cloud, compute is 80% cheaper and GPUs are half the cost, whilst also having wider geographic coverage than traditional cloud providers.

Our mission is to democratize access to decentralized cloud infrastructure.

The company was founded in 2022, raised its first seed round in 2023 and was selected as part of the Intel Ignite accelerator programme. DWS is building an ambitious product with global expansion in its sights, which is why we're seeking entrepreneurial-minded people to join our mission as part of the initial founding team and become a key part of our success in overtaking the cloud market.

DWS is a dynamic start-up company, and our successful candidate must have the ability and desire to work in a fast-paced environment. As a distributed team, we hire anywhere in the world, and at various levels of experience (entry, senior, staff). We look for people with unique perspectives and diverse backgrounds.

Who are we looking for

The Senior Site Reliability Engineer will be the most senior member of our SRE team and will act as the guardian of our digital products and platforms. Our customers expect a reliable and efficient experience whilst powering through the thousands of amazing use cases for the DWS cloud platform. We already have a global reach, but we're growing that even more, and we need expertise to support us in building the best cloud experiences.

And of course, when things go wrong, because nothing is perfect, you'll be on hand as the escalation point for any major incidents.

You'll play the lead role in developing patterns for infrastructure builds that speed up our Product teams' ability to deploy services repeatably.

You have a focus on delivering operational excellence and leading reviews of platforms to look for continual improvements to service and stability, using automation wherever it gives an advantage.

You thrive working in a continuous delivery environment where high rates of technology change are the norm, and you're able to engage with product teams as they develop new products with a focus on building supportable products and systems.

Responsibilities

  • Working with the DWS engineering team to support the infrastructure they need and the platforms on which their services run
  • Observing our platforms and services to measure reliability, find areas for improvement, and discover any risks to the stability or security of our systems
  • Maintaining new and existing infrastructure with code, by writing well-designed Terraform modules, to make the best use of our cloud
  • Doing proof-of-concepts on new and emerging tech to see how it could fit at DWS
  • Taking part in honest and transparent blame-free post-mortems on incidents we have, so we can learn from them and prevent them from happening again
  • Sharing your work and talking about it within the Platform and Engineering team, to spread knowledge and be an ambassador for good site reliability practices
  • Deploying innovative new tools to help accelerate engineers and make their lives easier, giving them more time to focus on what they are building
  • Documenting and driving the adoption of engineering best practices across the wider Engineering team
  • Demonstrating ownership of all initiatives from concept to launch, and embodying unwavering commitment and reliability, with a genuine willingness to contribute and address challenges
Requirements
  • Demonstrated experience in deploying, managing, and operating scalable and fault-tolerant Linux/Kubernetes/JVM-based infrastructure in AWS, GCP, Microsoft Azure and other public clouds
  • Expertise in Linux Operating Systems, Networking, and Database concepts
  • Expertise in Longhorn, OpenStack, Kubernetes, Helm
  • Expertise in virtualization: KVM, OpenVZ, VMware, VStack
  • Expertise in cloud providers, such as Amazon Web Services, Microsoft Azure, and GCP
  • Experience with configuration management systems such as Ansible or Puppet
  • Experience in Go, Ruby or Python
  • Excellent problem-solving, critical thinking, and communication skills
  • BSc or MSc in Computer Science, related field, or equivalent professional experience
Nice to have
  • Expertise in CNI: Calico / Cilium
  • Experience with GitHub Actions, ArgoCD/FluxCD
  • Deep knowledge in SQL and Redis
  • Expertise in HAProxy, Keepalived, MetalLB
What do we have to offer you
  • Hybrid work and relocation, depending on your preference
  • Generous bonuses tied to your performance
  • 24 days PTO
  • Become part of the founding team
  • Real career opportunities with the opportunity to grow quickly in seniority as the team scales
  • Disrupting the industry and being part of the AI/ML & Web3 revolution
  • Colleagues who are smart, hardworking and driven, with backgrounds from FAANG companies and leading universities
  • Transparent company culture, open to feedback where you can wear multiple hats at once
  • Support in Learning and Development

Key skills:
JVM, SQL, Redis, GCP, Linux, Ansible, Microsoft Azure, Puppet, Ruby, Kubernetes, Python, HAProxy, AWS, Go

About Company

Cresco Holding Ltd
