Hundreds of millions of customers, billions of transactions, petabytes of data… How do you use the world’s richest collection of e-commerce and device usage data to acquire new customers, target existing customers, and predict customer behavior? Amazon’s Consumer Marketing Analytics team seeks a Data Engineer to lead the Marketing Attribution team, which is at the forefront of building a large-scale, algorithmically driven foundation for marketing attribution. You will build data analytics solutions that address increasingly complex business questions.
We are a high-energy and innovative group that drives hundreds of millions of dollars in sales on Amazon.com. Our goal is to bring customers to the Amazon web site and provide them with the very best experience. We also deliver business intelligence solutions on profitability, cash flow, margin and operational performance to a diverse community of internal customers.
Amazon.com has a culture of data-driven decision-making and demands business intelligence that is timely, accurate, and actionable. This team provides a fast-paced environment where every day brings new challenges and new opportunities.
As a Data Engineer you will be working in one of the world's largest and most complex data warehouse environments. You should be passionate about working with huge data sets and be someone who loves to bring datasets together to answer business questions. You should have deep expertise in the creation and management of datasets.
You should be an expert at implementing and operating stable, scalable data flow solutions from production systems into end-user facing applications and reports. These solutions will be fault-tolerant, self-healing and adaptive.
You will develop solutions that address unique challenges of space, size and speed. You will implement data analytics using cutting-edge analytics patterns and technologies, including but not limited to Star Schema and Hive. You will extract huge volumes of data from various sources and message streams and construct complex analyses. You will write scalable queries and tune the performance of queries running over billions of rows of data. You will implement data flow solutions that process data in real time on message streams from source systems.
You should be detail-oriented and must have an aptitude for solving unstructured problems. You should be able to work in a self-directed environment, own tasks and drive them to completion.
You should have excellent business and communication skills so that you can work with business owners to develop and define key business questions and build data sets that answer those questions. You will own the customer relationship around data and carry out the tasks that come with that ownership, such as ensuring high data availability and low latency, documenting data details and transformations, and handling user notifications and training.
You will work with distributed machine learning and statistical algorithms on a large EMR cluster to harness enormous volumes of online data to serve our customers.