Amazon Search creates powerful, customer-focused search and advertising solutions and technologies. Whenever a customer visits an Amazon site worldwide and types in a query or browses through product categories, Product Search services go to work. We design, develop, and deploy high-performance, fault-tolerant distributed search systems used by millions of Amazon customers.
Our Search Operations team operates one of the Internet's largest product search infrastructures, made up of thousands of servers, serving millions of customers performing hundreds of millions of queries - all delivered in milliseconds. Members of this team continuously work to maximize resilience and availability and to find innovative ways to monitor, detect, and analyze complex production issues.
We are hiring a software developer to enhance the site reliability of Search through internal automation, tooling, and simplification of processes.
· Develop software to simplify and automate the function of operations and site reliability – this includes tooling, automated response systems, building/improving frameworks, etc.
· Partner with engineering teams on Search software projects – everything from design, coding and testing to operational readiness.
· Own engineering initiatives from inception, actively engaging in design reviews and development efforts to ensure a sound deployment plan and mitigation of operational burden.
· Lead investigation efforts and propose high-impact development initiatives and projects, then lead the effort by working with other ops or search development engineers.
· Diagnose and mitigate critical failures in high-pressure situations. Communicate and update status on high-severity events while on call.
· Perform troubleshooting deep-dives on system and application issues, driving root cause resolution with a sense of urgency.
· Provide daytime on-call support, monitoring, and deployment management as part of a worldwide shared rotation. This is 8x7 daytime on-call once every 5-7 weeks.