🚀 Data Engineer (Python, SQL, ETL, Airflow, Snowflake, BigQuery)
Full-Time | Remote | U.S. Business Hours
💡 About the Role
We’re hiring a highly technical Data Engineer to build and maintain scalable data pipelines, cloud data infrastructure, and analytics-ready datasets that power business decision-making.
This role is focused on:
✅ ETL/ELT pipeline development
✅ Data warehouse architecture
✅ SQL optimization
✅ Cloud-based data infrastructure
✅ Pipeline reliability & monitoring
✅ Scalable analytics systems
You’ll work closely with:
- Data Analysts
- Data Scientists
- Engineering Teams
- BI & Leadership Teams
to ensure the organization always has accurate, clean, and trustworthy data.
If you:
- enjoy building robust data systems,
- love optimizing pipelines and queries,
- and care deeply about data quality and scalability,
this role is a strong fit.
🔥 What You’ll Own
ETL / ELT Pipeline Development
- Build and maintain scalable ETL/ELT pipelines in Python and SQL
- Ingest data from:
  - APIs
  - SaaS platforms
  - relational databases
  - cloud applications
  - streaming systems
- Develop reliable workflows for:
  - data extraction
  - transformation
  - loading
  - validation
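The extract, transform, load, and validate steps above compose into a simple loop. A minimal sketch in plain Python (the function names, sample payload, and in-memory "warehouse" are illustrative, not from this posting):

```python
def extract(records):
    """Pull raw rows from a source (an in-memory stand-in for an API here)."""
    return list(records)

def transform(rows):
    """Normalize field names and types."""
    return [{"user_id": int(r["id"]), "email": r["email"].strip().lower()}
            for r in rows]

def validate(rows):
    """Drop rows that would corrupt downstream datasets."""
    return [r for r in rows if r["user_id"] > 0 and "@" in r["email"]]

def load(rows, warehouse):
    """Append clean rows to the target table; return the count loaded."""
    warehouse.setdefault("users", []).extend(rows)
    return len(rows)

raw = [{"id": "1", "email": " Ana@Example.com "}, {"id": "0", "email": "bad"}]
warehouse = {}
loaded = load(validate(transform(raw)), warehouse)
print(loaded)  # 1 valid row loaded; the malformed row is rejected
```

In production each stage would read from and write to real systems, but the separation of stages is what makes pipelines testable and monitorable.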
Workflow Orchestration & Automation
- Manage orchestration platforms such as:
  - Apache Airflow
  - Prefect
  - Dagster
  - Luigi
- Monitor:
  - pipeline health
  - failed jobs
  - scheduling reliability
- Build automated workflows with:
  - retries
  - alerting
  - dependency management
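Orchestrators like Airflow and Prefect provide retries and alerting as built-in task settings; the underlying pattern they automate looks roughly like this pure-Python sketch (the flaky task and the `alert` callback are hypothetical):

```python
import time

def run_with_retries(task, retries=3, delay=0.0, alert=print):
    """Run a task, alerting on each failure and re-raising once retries are spent."""
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception as exc:
            alert(f"attempt {attempt} failed: {exc}")
            if attempt == retries:
                raise
            time.sleep(delay)

calls = {"n": 0}
def flaky_extract():
    """Simulated task that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient API error")
    return "ok"

result = run_with_retries(flaky_extract)
print(result)  # succeeds on the third attempt
```

In Airflow the equivalent is declarative (e.g. per-task retry counts and failure callbacks) rather than hand-rolled, which is exactly why orchestration platforms are worth managing well.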
Data Warehousing & Modeling
- Design and optimize cloud data warehouses using:
  - Snowflake
  - BigQuery
  - Redshift
- Develop:
  - star schemas
  - snowflake schemas
  - analytics-ready data models
- Improve:
  - query performance
  - clustering
  - partitioning
  - warehouse efficiency
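A star schema centers a fact table of measures on surrogate keys into dimension tables. A minimal in-memory illustration (table and column names are made up for the example):

```python
# Dimension table: one row per unique customer, keyed by a surrogate key.
dim_customer = {1: {"name": "Acme", "region": "US"},
                2: {"name": "Globex", "region": "EU"}}

# Fact table: measures plus foreign keys into the dimension.
fact_orders = [{"customer_key": 1, "amount": 120.0},
               {"customer_key": 1, "amount": 80.0},
               {"customer_key": 2, "amount": 200.0}]

# A typical analytics query: revenue by region, i.e. a fact-to-dimension
# join followed by a group-by.
revenue_by_region = {}
for row in fact_orders:
    region = dim_customer[row["customer_key"]]["region"]
    revenue_by_region[region] = revenue_by_region.get(region, 0.0) + row["amount"]

print(revenue_by_region)  # {'US': 200.0, 'EU': 200.0}
```

In a real warehouse the same shape is expressed as SQL tables, and clustering or partitioning the fact table on frequently filtered columns is what keeps these joins fast at scale.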
Data Quality & Governance
- Implement:
  - validation checks
  - anomaly detection
  - logging systems
  - lineage tracking
- Ensure:
  - consistent naming conventions
  - clean transformations
  - audit-ready datasets
- Support compliance requirements:
  - GDPR
  - HIPAA
  - industry-specific governance standards
Streaming & Real-Time Data
- Build and maintain streaming pipelines
- Support:
  - real-time ingestion
  - event-driven processing
  - low-latency analytics workflows
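Event-driven processing typically means consuming events from a queue and updating derived state incrementally, rather than recomputing from scratch in batches. A minimal in-memory sketch (the event shapes are made up; production systems would consume from a message broker instead of a local deque):

```python
from collections import deque

# Stand-in for a message queue of incoming events.
events = deque([{"type": "page_view", "user": "u1"},
                {"type": "purchase", "user": "u1", "amount": 30.0},
                {"type": "page_view", "user": "u2"}])

# Incrementally maintained aggregates, updated once per event —
# this is what makes low-latency analytics possible.
counts = {}
revenue = 0.0
while events:
    event = events.popleft()
    counts[event["type"]] = counts.get(event["type"], 0) + 1
    if event["type"] == "purchase":
        revenue += event["amount"]

print(counts, revenue)  # {'page_view': 2, 'purchase': 1} 30.0
```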
Infrastructure & DevOps
- Containerize services
- Build CI/CD workflows with:
  - GitHub Actions
  - Jenkins
  - GitLab CI
- Manage cloud infrastructure
- Improve scalability, reliability, and deployment automation
Cross-Functional Collaboration
- Partner with:
  - analysts
  - data scientists
  - BI teams
  - product teams
- Deliver curated datasets for:
  - dashboards
  - analytics
  - machine learning workflows
- Support BI tools with clean, well-modeled data
- Maintain documentation for:
  - pipelines
  - schemas
  - workflows
  - data definitions
✅ Required Experience & Skills
- 3+ years of data engineering or backend engineering experience
- Strong proficiency with Python and SQL
- Experience with:
  - Snowflake
  - BigQuery
  - Redshift
- Familiarity with:
  - Airflow
  - Prefect
  - other workflow orchestration tools
- Strong understanding of:
  - ETL pipelines
  - data modeling
  - cloud infrastructure
  - warehouse optimization
⭐ Ideal Experience
- Experience using:
  - dbt
  - Great Expectations
  - data lineage tools
- Experience with streaming data pipelines
- Experience with:
  - AWS Glue
  - GCP Dataflow
  - Azure Data Factory
- Background in:
  - healthcare
  - fintech
  - other regulated environments
- Experience optimizing large-scale warehouse costs and performance
🧠 What Makes You a Great Fit
- You care deeply about clean and reliable data
- You enjoy debugging complex pipeline and infrastructure issues
- You think about scalability and long-term maintainability
- You combine engineering rigor with analytical thinking
- You communicate effectively across technical and non-technical teams
📅 What a Typical Day Looks Like
- Review Airflow/Prefect pipeline health and resolve failures
- Build connectors for new APIs or SaaS platforms
- Optimize SQL queries and warehouse performance
- Collaborate with analysts and data scientists on datasets
- Improve validation and monitoring systems
- Document pipelines and warehouse structures
- Reduce warehouse costs and improve pipeline reliability
In short:
You build the data infrastructure that powers analytics, reporting, automation, and business intelligence across the organization.
📊 Key Success Metrics (KPIs)
- Pipeline uptime ≥ 99%
- Data freshness within SLA
- Zero critical data quality issues reaching production
- Query performance & warehouse cost optimization
- Reliable and scalable pipeline infrastructure
- Positive feedback from analysts, BI teams, and leadership
🌟 Why This Role Stands Out
- Work on modern cloud-native data infrastructure
- Build scalable ETL and analytics systems
- Exposure to:
  - streaming pipelines
  - cloud data platforms
  - orchestration frameworks
  - warehouse optimization
- Opportunity to grow into:
  - Senior Data Engineer
  - Analytics Engineering
  - Platform Engineering
  - Data Architecture
- Fully remote flexibility with collaborative engineering teams
🧪 Interview Process
- Initial Phone Screen
- Video Interview with Pavago Recruiter
- Technical Task (build a small ETL pipeline or optimize a SQL query)
- Client Interview with Engineering/Data Team
- Offer & Background Verification
👉 Apply Now
If you:
- love building scalable data systems,
- enjoy solving complex pipeline problems,
- and want to work with modern data infrastructure,
this role is a strong fit for you.