Interested in Amazon Echo and the future of intelligent interfaces? Alexa Data Services (ADS) is looking for a Data Engineer to join our Data and Annotation Science team in the area of speech and language data processing. ADS provides high-quality labeled data at high speed and low cost for the Machine Learning technologies that enable Alexa’s expansion across Devices, Countries, Languages, and Domains. Our mission is to help Alexa AI invent foundational machine learning technologies for anyone to build intelligent conversational interfaces for any device, application, language, and environment. We’re building the speech and language solutions behind Amazon Echo and other Amazon products and services. We’re working hard, having fun, and making history; come join us!
We are seeking a talented, self-directed Data Engineer to build a scalable data platform and scalable, self-service spoken language understanding analytics tools for use by the Alexa Machine Learning organization. He/she should be driven to provide inventive, simple solutions to complex problems and have a passion for language and language data analysis.
The ideal candidate relishes working with large volumes of data, enjoys the challenge of highly complex technical contexts, and, above all else, is passionate about data and analytics. He/she is an expert in data modeling, ETL design, and business intelligence tools, and passionately partners with the business to identify strategic opportunities where improvements in data infrastructure create outsized business impact. He/she is a self-starter, comfortable with ambiguity, able to think big (while paying careful attention to detail), and enjoys working in a fast-paced team. The ideal candidate needs to possess exceptional technical expertise in large-scale data warehouse and BI systems, with hands-on knowledge of SQL, distributed/MPP data storage, and AWS services (S3, Redshift, EMR, RDS).
Specifically, the Data Engineer will:
· Design, implement, and support a platform providing ad hoc access to large datasets
· Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL
· Implement data structures using best practices in data modeling, ETL/ELT processes, SQL, and Redshift
· Model data and metadata for ad hoc and pre-built reporting
· Interface with business customers, gathering requirements and delivering complete reporting solutions
· Build robust and scalable data integration (ETL) pipelines using SQL, Python, and Spark
· Build and deliver high-quality datasets to support business analyst and customer reporting needs
· Continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for customers
· Participate in strategic & tactical planning discussions, including annual budget processes