Job Description
Position Overview
We are seeking an exceptional Principal / Head of Data Engineering to establish and lead our data engineering function from the ground up. This role reports to the Head of Data and AI Engineering and is responsible for the complete design, development, and implementation of a world-class modern data platform. You will drive the strategic evolution of our data infrastructure, enabling both structured and unstructured data workflows at scale. You will spearhead the upgrade and modernization of our existing Azure Data Factory pipelines to next-generation orchestration tools, implement efficient data ingress and egress patterns, establish AI/LLM-native data capabilities through advanced vector indexing and streaming architectures, and build a strong data engineering organization from the ground up. You will collaborate closely with cloud engineering, network engineering, and data products teams to architect a unified data lake and comprehensive data governance framework that supports diverse analytical and operational needs across our portfolio.
Key Responsibilities
Organization Building & Team Leadership
Build and scale the data engineering organization from inception, defining team structure, roles, and responsibilities across the function
Establish engineering culture emphasizing technical excellence, collaboration, ownership, and continuous learning
Recruit, mentor, and develop high-performing data engineers with expertise in modern data platforms, ETL/ELT, orchestration, streaming, and vector databases
Partner with Human Resources on recruitment strategy, hiring processes, and organizational scaling as the firm grows
Strategic Vision & Roadmap
Establish a comprehensive, multi-year data engineering strategy aligned with firm objectives
Define technical roadmaps for data infrastructure, platform capabilities, and technology adoption
Establish governance frameworks for data engineering decisions, standards, and best practices
Lead technology evaluation and vendor selection processes with clear ROI and strategic fit
Platform Architecture & Modernization
Design and architect a modern, scalable data platform leveraging Databricks on Azure that supports both structured and unstructured data at petabyte scale
Lead the modernization of legacy Azure Data Factory (ADF) pipelines to production-grade orchestration platforms such as Prefect or Apache Airflow
Develop a comprehensive upgrade and migration roadmap for ETL/ELT pipelines, ensuring zero data loss, minimal downtime, and improved observability
Lead the implementation of serverless and Zero ETL patterns to eliminate infrastructure management overhead and reduce time-to-insight
Own cost optimization initiatives across the data platform, balancing performance, reliability, and operational efficiency
ETL/ELT & Orchestration Excellence
Build deep expertise in Directed Acyclic Graph (DAG) principles and modern workflow orchestration patterns for reliable, scalable pipeline management
Evaluate, select, and implement best-in-class orchestration tools (Prefect, Airflow) that provide superior visibility, error handling, and data lineage tracking
Establish patterns for dynamic DAG generation, conditional execution, and advanced error recovery strategies
Design and enforce data quality frameworks within orchestration tools to catch issues at the pipeline level
Create monitoring, alerting, and observability solutions for full visibility into pipeline health and data freshness
Data Movement & Integration Patterns
Architect efficient data ingress patterns supporting high-volume, real-time, and batch data inflows from diverse sources (APIs, databases, cloud services, SaaS platforms)
Design sophisticated data egress patterns enabling secure, efficient data distribution to downstream systems, analytics tools, and external stakeholders
Implement change data capture (CDC) patterns and incremental processing strategies to optimize resource usage and reduce latency
Establish governance frameworks for data movement including encryption, authentication, and audit trails
Streaming & Real-Time Data Capabilities
Evaluate and implement streaming platforms (Kafka, Event Hubs, Kinesis) to support real-time analytics and operational use cases
Design event-driven architectures that enable low-latency decision-making and automated workflows
Build streaming ingestion pipelines that efficiently funnel data into the lakehouse while maintaining data quality and lineage
AI & LLM-Native Data Infrastructure
Design and build vector database infrastructure to support LLM applications, including efficient indexing, similarity search, and retrieval-augmented generation (RAG) workflows
Establish patterns for embedding generation, vector storage optimization, and integration with vector databases
Build data pipelines that prepare unstructured data (documents, images, audio) for embedding and LLM consumption
Create governance and provenance tracking for embeddings and vector data to ensure transparency and compliance
Data Lake & Catalog Implementation
Lead the development and governance of a unified data lake, establishing data quality standards, lineage tracking, and compliance frameworks
Support implementation of a modern data catalog solution that enables data discovery, governance, and self-service analytics across the enterprise
Establish data engineering best practices, testing frameworks, production deployment pipelines, and operational standards
Cross-Functional Collaboration & Stakeholder Management
Partner with cloud engineering and infrastructure teams to define overall data and technology strategy
Work closely with cloud engineering teams to optimize Azure cloud utilization, cost efficiency, security, and operational resilience
Collaborate with network engineering to design network architecture supporting high-throughput data flows, low-latency access patterns, and hybrid connectivity
Partner with data products leadership to translate business requirements into technical implementations for analytics, AI/ML, and real-time intelligence
Communicate data engineering strategy and priorities to executive leadership and the broader organization
Required Qualifications
Technical Expertise
Advanced proficiency with Databricks, including Delta Lake, Unity Catalog, and Apache Spark optimization
Deep expertise in Microsoft Azure, including Azure Data Factory, Synapse Analytics, Azure Storage (Data Lake Storage Gen2), Azure Event Hubs, and Azure compute services
Production experience migrating and modernizing Azure Data Factory pipelines to modern orchestration platforms
Expert-level understanding of Directed Acyclic Graphs (DAGs), workflow orchestration concepts, and production DAG-based platforms
Deep hands-on experience with Prefect, Apache Airflow, or similar orchestration tools in enterprise environments
Strong experience designing data ingress and egress patterns for diverse data sources and consumers
Demonstrated expertise in streaming architectures (Kafka, Event Hubs, Kinesis) and event-driven data processing
Experience building and optimizing vector databases and similarity search solutions for LLM/AI applications
Strong understanding of embedding generation, vector indexing strategies, and RAG (Retrieval-Augmented Generation) pipelines
Proficiency with data engineering technologies: Python, SQL, Scala, and hands-on experience with modern data transformation tools
Experience with data governance, metadata management, and data catalog solutions
Leadership & Organization Building
10+ years of data engineering experience, with at least 5 years in senior leadership or principal technical roles
Proven track record building and scaling data engineering organizations from the ground up, developing talent and establishing technical culture
Experience successfully leading enterprise platform migrations and large-scale modernization initiatives
Demonstrated ability to define strategic vision, communicate priorities to executive stakeholders, and execute on multi-year roadmaps
Strong track record designing and implementing enterprise-scale data platforms supporting 100+ users and petabyte-scale datasets
Demonstrated ability to partner effectively across infrastructure, security, networking, product, and executive teams
Excellent communication skills; ability to explain complex technical concepts to both engineers and non-technical executives
Preferred Qualifications
Hands-on experience building and operating AI/ML platforms and the data engineering to support machine learning workflows
Expertise in change data capture (CDC) patterns and incremental processing strategies
Experience with cost optimization strategies for cloud data platforms
Background in data quality frameworks, testing strategies, and observability for data pipelines
Experience with unstructured data processing, computer vision, or natural language processing pipelines
Reporting Relationships
Head of Data and Analytics
Compensation
The anticipated base salary range for this position is listed below. Total compensation may also include a discretionary performance-based bonus. Note that the range takes into account a broad spectrum of qualifications, including, but not limited to, years of relevant work experience, education, and other relevant qualifications specific to the role.
$300,000 - $350,000
The firm also offers robust benefits. Ares U.S. Core Benefits include Comprehensive Medical/Rx, Dental and Vision plans; 401(k) program with company match; Flexible Savings Accounts (FSA); Healthcare Savings Accounts (HSA) with company contribution; Basic and Voluntary Life Insurance; Long-Term Disability (LTD) and Short-Term Disability (STD) insurance; Employee Assistance Program (EAP); and Commuter Benefits plan for parking and transit.
Ares offers a number of additional benefits, including access to a world-class medical advisory team, a mental health app that includes coaching, therapy and psychiatry, a mindfulness and wellbeing app, a financial wellness benefit that includes access to a financial advisor, new parent leave, reproductive and adoption assistance, emergency backup care, a matching gift program, an education sponsorship program, and much more.
There is no set deadline to apply for this job opportunity. Applications will be accepted on an ongoing basis until the search is no longer active.