Data Engineer

Tavant

Location: Hyderabad | Experience: 7 years | Posted 1d ago

Lakehouse Architecture & Engineering

Design and implement scalable end-to-end Lakehouse solutions using Azure, Databricks, Python, and PySpark.
Develop enterprise-grade data pipelines that source data from ERP, CRM, and field-device or on-field operational applications.
Implement Medallion Architecture (Bronze, Silver, Gold) with a focus on scalability, governance, and maintainability.
Design robust Silver Layer semantic data models aligned with enterprise common data models.

Metadata & Semantic Layer Engineering

Develop and maintain semantic data layers including:
- Data Catalogs
- Data Dictionaries
- Business Glossaries
Implement metadata-driven transformation and engineering frameworks.
Manage metadata lifecycle, lineage, and governance processes.
Handle schema drift and schema evolution through regenerative metadata and automated data dictionary approaches.

AI-Driven Automation & Productivity Engineering

Design AI-assisted solutions that extract and generate metadata, data dictionaries, and semantic mappings from:
- Source datasets
- Application code
- Defined business rules
Develop Databricks-based automation frameworks capable of dynamically generating transformation notebooks using metadata definitions.
Leverage AI capabilities to improve engineering productivity, automation, governance, and operational efficiency.
Continuously explore and adopt emerging Azure and Databricks AI capabilities.

Data Processing, Modeling & Optimization

Develop scalable data transformations, functions, procedures, and reusable business computation logic.
Build curated aggregates, summaries, snapshots, and analytics-ready datasets for downstream consumption.
Optimize distributed processing workloads for:
- Performance
- Scalability
- Cost efficiency
Implement data quality frameworks including validation, reconciliation, observability, and governance controls.

Self-Service BI & Consumption Enablement

Enable governed Self-Service BI capabilities using Databricks Genie, APIs, and embedding tools.
Securely expose semantic and Silver Layer datasets to business and functional stakeholders.
Capture and analyze usage telemetry and logs to continuously improve data platform usability and adoption.

Required Skills & Expertise

Core Technical Skills

Strong expertise in:
- SQL
- Data Modeling
- Python
- PySpark
- Azure
- Databricks
Strong understanding of:
- Medallion Architecture
- Lakehouse Architecture
- Metadata-driven engineering
- Distributed data processing
- Semantic data layers
Experience integrating data from:
- ERP systems
- CRM platforms
- Field-device / operational applications
Experience with Databricks system tables and platform observability.

AI & Advanced Engineering Capabilities

Exposure to AI/ML-enabled data engineering solutions.
Experience leveraging AI capabilities for engineering productivity and automation.
Understanding of metadata automation and AI-assisted semantic enrichment approaches is preferred.