Data Engineer

Tavant

Hyderabad | 7 Years Exp | Posted 1d ago

Job Description

Lakehouse Architecture & Engineering

  • Design and implement scalable end-to-end Lakehouse solutions using Azure, Databricks, Python, and PySpark.
  • Develop enterprise-grade data pipelines sourcing data from ERP, CRM, and field-device and on-field operational applications.
  • Implement Medallion Architecture (Bronze, Silver, Gold) with a focus on scalability, governance, and maintainability.
  • Design robust Silver Layer semantic data models aligned with enterprise common data models.

Metadata & Semantic Layer Engineering

  • Develop and maintain semantic data layers including:
    • Data Catalogs
    • Data Dictionaries
    • Business Glossaries
  • Implement metadata-driven transformation and engineering frameworks.
  • Manage metadata lifecycle, lineage, and governance processes.
  • Handle schema drift and schema evolution through regenerative metadata and automated data dictionary approaches.

AI-Driven Automation & Productivity Engineering

  • Design AI-assisted solutions for extracting and generating metadata, data dictionaries, and semantic mappings using:
    • Source datasets
    • Application code
    • Defined business rules
  • Develop Databricks-based automation frameworks capable of dynamically generating transformation notebooks using metadata definitions.
  • Leverage AI capabilities to improve engineering productivity, automation, governance, and operational efficiency.
  • Continuously explore and adopt emerging Azure and Databricks AI capabilities.

Data Processing, Modeling & Optimization

  • Develop scalable data transformations, functions, procedures, and reusable business computation logic.
  • Build curated aggregates, summaries, snapshots, and analytics-ready datasets for downstream consumption.
  • Optimize distributed processing workloads for:
    • Performance
    • Scalability
    • Cost efficiency
  • Implement data quality frameworks including validation, reconciliation, observability, and governance controls.

Self-Service BI & Consumption Enablement

  • Enable governed Self-Service BI capabilities using Databricks Genie, APIs, and embedding tools.
  • Securely expose semantic and Silver Layer datasets to business and functional stakeholders.
  • Capture and analyze usage telemetry and logs to continuously enhance data platform usability and adoption.

Required Skills & Expertise

Core Technical Skills

  • Strong expertise in:
    • SQL
    • Data Modeling
    • Python
    • PySpark
    • Azure
    • Databricks
  • Strong understanding of:
    • Medallion Architecture
    • Lakehouse Architecture
    • Metadata-driven engineering
    • Distributed data processing
    • Semantic data layers
  • Experience integrating data from:
    • ERP systems
    • CRM platforms
    • Field-device / operational applications
  • Experience with Databricks system tables and platform observability.

AI & Advanced Engineering Capabilities

  • Exposure to AI/ML-enabled data engineering solutions.
  • Experience leveraging AI capabilities for engineering productivity and automation.
  • Understanding of metadata automation and AI-assisted semantic enrichment approaches (preferred).
