Staff Data Engineer


Job Description

Cority is looking for an experienced Staff Data Engineer to lead the design, development, and optimization of our data infrastructure and analytics platforms. As a technical leader, you'll architect scalable data solutions, mentor engineering teams, and drive strategic decisions around our data technology stack. This role requires deep technical expertise combined with the ability to translate business needs into robust data systems that support analytics, machine learning, and operational workloads.

Key Responsibilities:
You will lead the end-to-end data architecture, designing and implementing data pipelines, warehouses, and lakes that handle petabyte-scale datasets. Leading cross-functional initiatives, you'll collaborate with product teams to enable data-driven decision-making across the organization. Your role includes establishing best practices for data quality, governance, and security while mentoring senior engineers and conducting technical reviews. You'll evaluate and adopt emerging technologies, optimize performance and cost efficiency, and ensure our data infrastructure scales with business growth. 

Required Qualifications:
We're looking for someone with 10+ years of software engineering experience, including 5+ years specifically in data engineering roles, with at least 2 years in a staff or principal capacity. You should have proven experience architecting and operating production data systems at scale, along with deep expertise in distributed systems, database internals, and data modeling. You must have demonstrable experience making significant architectural contributions to a large-scale data platform, with measurable business impact such as improved performance, reduced costs, better data quality, or new capabilities. Strong programming skills in Python, C#, or Java are essential, as is experience with both SQL and NoSQL databases. You'll need excellent communication skills to work with both technical and non-technical stakeholders, and a track record of leading technical initiatives from conception through delivery.



Technical Qualifications:

Core Data Storage & Processing
  • Knowledge of data lake architectures and lakehouse patterns
  • Apache Iceberg or Delta Lake for table formats with ACID transactions
  • PostgreSQL for operational databases and metadata stores
  • ClickHouse or Apache Druid for real-time analytics and OLAP queries
  • Familiarity with real-time processing frameworks (Apache Flink, Spark Streaming)

Cloud Platforms & Infrastructure
  • Google Cloud Platform (BigQuery, Cloud Storage, Dataflow, Pub/Sub, Cloud Composer)
  • AWS (S3, Redshift, EMR, Glue, Kinesis) or Azure alternatives
  • Terraform or Pulumi for infrastructure as code
  • Kubernetes for containerized workload orchestration

Data Integration & Orchestration
  • Apache Airflow or NiFi for workflow orchestration
  • dbt for transformation and analytics engineering
  • Airbyte, Fivetran, or custom connectors for data ingestion
  • Kafka or Pulsar for real-time streaming

Data Quality & Governance
  • Great Expectations or similar for data quality testing
  • Apache Atlas or DataHub for metadata management and lineage
  • Git-based workflows for version control and CI/CD

Analytics & Business Intelligence
  • Experience with modern BI tools (Looker, Tableau, Power BI)
  • Familiarity with embedded analytics frameworks (Cube.js, Superset, Metabase)

Programming
  • Python (pandas, PySpark) and Java for JVM-based data processing
  • C# for reading and understanding data structures in other applications
  • SQL and understanding of columnar formats such as Parquet (see the brief sketch after this list)
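For illustration only (not an additional requirement), here is a minimal sketch of the kind of pipeline work this role involves: reading columnar Parquet data with PySpark and appending it to a Delta Lake table for ACID guarantees. It assumes a Spark environment with the delta-spark package available; the bucket paths and the event_id/event_ts columns are hypothetical placeholders, not a description of our actual stack.

# Minimal, illustrative sketch: Parquet in, Delta Lake out.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("events-ingest")
    # Delta Lake support assumes the delta-spark package is on the classpath.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Read raw events from a columnar Parquet landing zone (hypothetical path).
events = spark.read.parquet("s3://example-bucket/raw/events/")

# Light transformation: deduplicate and derive a partition column.
daily = (
    events.dropDuplicates(["event_id"])
    .withColumn("event_date", F.to_date("event_ts"))
)

# Append to a Delta table so downstream readers get ACID guarantees.
daily.write.format("delta").mode("append").partitionBy("event_date").save(
    "s3://example-bucket/curated/events/"
)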

Nice to Have:
  • Experience designing HIPAA-compliant data systems (encryption at rest/in transit, audit logging, access controls, BAA management)
  • Familiarity with data privacy regulations (GDPR, CCPA)
  • Experience with vector and graph databases (PostgreSQL pgvector, Neo4j)
  • Experience with ML infrastructure (MLflow, feature stores)
  • Understanding of cost optimization for cloud data platforms
  • Experience with multi-tenant data warehouses or data lakes

The ideal candidate will have hands-on experience with several technologies from each category rather than superficial knowledge of everything, along with the ability to evaluate and adopt new tools as the ecosystem evolves.