Senior Data Engineer – GCP, DBT

🕒 May 22

🗣️🇧🇷🇵🇹 Portuguese Required

Leega

201 - 500 employees

Founded 2010

API • Artificial Intelligence • Cloud Solutions

Leega is a leading technology solutions provider in Latin America, specializing in data analytics and cloud solutions. As the first company in the region certified by Google Cloud for Data Analytics, Leega offers a range of services including application development, machine learning, and risk management analytics. The firm partners with major cloud services such as AWS and Microsoft Azure to help businesses enhance their data management and transition effectively to the cloud, ultimately driving digital transformation and innovation.

📋 Description

• **Analysis and Planning of Loads/Pipelines:**
  • Evaluate the data warehouse architecture and requirements.
  • Map data, transformations and processes to GCP services (Cloud Storage, BigQuery, Dataproc).
  • Define the data migration strategy (full load, incremental, CDC).
  • Develop a data architecture plan on GCP.
• **Design and Data Modeling on GCP:**
  • Design table schemas in BigQuery, considering performance, cost and scalability.
  • Define partitioning and clustering strategies for BigQuery.
  • Model data zones in Cloud Storage (Bronze, Silver and Gold).
• **ELT/ETL Pipeline Development:**
  • Create data transformation routines using Dataproc (Spark) or Dataflow to load data into BigQuery.
  • Translate business logic and existing transformations to GCP.
  • Implement data validation and data quality mechanisms.
• **Provisioning and Infrastructure Management:**
  • Use IaC tools (Terraform) to provision and manage GCP resources (BigQuery datasets/tables, Cloud Storage buckets, Dataproc clusters).
  • Configure and optimize Dataproc clusters for different workloads.
  • Manage networking, security (IAM) and access on GCP.
• **Performance and Cost Optimization:**
  • Optimize queries in BigQuery to reduce costs and improve performance.
  • Tune and optimize Spark jobs on Dataproc.
  • Monitor and optimize GCP resource usage to control costs.
• **Data Security and Governance:**
  • Implement and ensure data security in transit and at rest.
  • Define and apply IAM policies to control access to data and resources.
  • Ensure compliance with data governance policies.
• **Monitoring and Support:**
  • Troubleshoot performance and functional issues in data pipelines and GCP resources.
• **Documentation:**
  • Document the architecture, data pipelines, data models and operational procedures.
• **Communication:**
  • Communicate effectively with team members, stakeholders and other business areas.
  • Ensure clear communication between architecture definitions and software components, and support the evolution and quality of the team's developments.
• **Jira / Agile Methodologies:**
  • Familiarity with agile methodologies and ceremonies, and proficiency with Jira.

🎯 Requirements

• **Google Cloud Platform (GCP):**
  • **BigQuery:** Deep knowledge of data modeling, query optimization, partitioning, clustering, data loading (streaming and batch), security and data governance.
  • **Cloud Storage:** Experience managing buckets, storage classes, lifecycle policies, access control (IAM) and data security.
  • **Dataproc:** Skills in provisioning, configuring and managing Spark/Hadoop clusters, job optimization, and integration with other GCP services.
  • **Dataflow/Composer/DBT:** Knowledge of orchestration and data-processing tools for ELT/ETL pipelines.
  • Minimum of 3 years of proven experience with GCP.
  • Minimum of 3 years of proven experience with DBT (preferred).
  • Minimum of 3 years of proven experience with PySpark.
  • Proven experience with GitFlow.
  • **Cloud IAM (Identity and Access Management):** Implementation of security policies and fine-grained access control.
  • **VPC, Networking and Security:** Understanding of networks, subnets, firewall rules and cloud security best practices.
• **Programming Languages:**
  • **Python and PySpark:** Essential for automation scripts, pipeline development and integration with GCP APIs.
  • **SQL (advanced):** For BigQuery, DBT and data transformations.
  • **Shell Scripting:** For task automation.
• **Version Control:**
  • Git / GitHub / Bitbucket.

🏖️ Benefits

• 🏥 Porto Seguro Health Plan
• 🦷 Porto Seguro Dental Plan
• 💰 Profit Sharing (PLR)
• 👶 Childcare Assistance
• 🍽️ Alelo Meal and Food Vouchers
• 💻 Home Office Allowance
• 📚 Partnerships with Educational Institutions
• 🚀 Support for Certifications, including Cloud
• 🎁 Livelo Points
• 🏋️‍♂️ TotalPass
• 🧘‍♂️ Mindself
