Data Engineer
Amgen
Job Description
- Design, develop, and maintain complex ETL/ELT data pipelines in Databricks using PySpark, Scala, and SQL to process large-scale datasets.
- Build highly efficient data pipelines to migrate and deploy complex data across systems, drawing on an understanding of biotech, pharma, manufacturing, or related domains.
- Design and implement solutions to enable unified data access, governance, and interoperability across hybrid cloud environments.
- Ingest and transform structured and unstructured data from databases (PostgreSQL, MySQL, SQL Server, MongoDB, etc.), APIs, logs, event streams, images, PDFs, and third-party platforms.
- Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring.
- Innovate, explore, and implement new tools and technologies to make data processing more efficient.
- Proactively identify and implement opportunities to automate tasks and develop reusable frameworks.
- Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value.
- Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories.
- Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle.
- Collaborate and communicate effectively with product and cross-functional teams to understand business requirements and translate them into technical solutions.