Sr. Systems Development Engineer, AWS Managed Operations (MO) ID-5034

About the position

Do you love decomposing problems to develop products that impact millions of people around the world? Would you enjoy identifying, defining, and building software solutions that revolutionize how businesses operate? AWS Utility Computing (UC) provides product innovations - from foundational services such as Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS's services and features apart in the industry. As a member of the UC organization, you'll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS. Within AWS UC, Amazon Dedicated Cloud (ADC) roles engage with AWS customers who require specialized security solutions for their cloud services.

Would you enjoy diving deep into operating and improving some of the largest software systems humanity has ever built? Do the challenges that come with driving technical, business, and cultural change to improve reliability, performance, and efficiency excite you? The AWS Managed Operations (MO) organization was founded in April 2023 with the objective of reducing operational load and toil through long-term engineering projects. MO is building a best-in-class engineering and operations team that will own the day-to-day operations for AWS Regions, improving the availability, reliability, latency, performance, and efficiency of operating those Regions.

Amazon is looking for highly motivated Senior Systems Development Engineers who can balance the day-to-day operations of AWS's software systems with long-term software engineering to reduce operational toil. We need engineers who enjoy constantly learning and diving deep into the wide range of systems and technologies that make up one of the world's largest cloud providers.

Responsibilities

  • Operate production systems and make long-term improvements to the reliability, availability, and performance of software systems.
  • Perform root cause analysis of failed deployments and implement fixes.
  • Design solutions for common problems identified during operations.
  • Investigate and update Service Level Objectives (SLOs) to ensure they remain useful.
  • Develop software to optimize hardware types for better performance and reduced carbon emissions.
  • Execute time-critical changes to production systems.
  • Collaborate with team members to drive improvements and reduce human error in operations.

Requirements

  • 6+ years of experience deploying and operating in a Linux/Unix environment
  • Experience with Linux/Unix
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
  • Experience leading the design, automation, deployment, and support of large-scale infrastructure
  • Experience with CI/CD pipelines and build processes

Benefits

  • Mentorship and career growth resources
  • Work-life balance initiatives
  • Employee-led affinity groups fostering inclusion
  • Ongoing learning experiences and events