2 days ago
• Develop, optimize, and maintain data processes
• Troubleshoot issues to ensure data accuracy and integrity
• Assist in driving data initiatives by designing, implementing, and maintaining effective data solutions that align with user requirements and organizational goals
• Design, build, and optimize scalable ETL pipelines using Apache Spark on Amazon EMR
• Work closely with data scientists, analysts, and other engineering teams to define, implement, and maintain high-performance data infrastructure
• Develop and maintain automated data workflows and processes for efficient data ingestion, transformation, and loading
• Implement best practices for data engineering, including monitoring, logging, and alerting for data pipelines
• Collaborate with stakeholders to understand business requirements and translate them into technical solutions
• Optimize performance of data processing jobs and troubleshoot issues with large-scale distributed systems
• Drive innovation in data infrastructure, evaluating and integrating new tools, frameworks, and approaches
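The pipeline work described above follows the standard extract-transform-load pattern. As a rough illustration only, here is a minimal sketch of that pattern in plain Python; the field names, cleaning rules, and in-memory sources are invented for this example, and a production pipeline of the kind the posting describes would use Spark DataFrames on EMR reading from and writing to S3:

```python
import csv
import io

def extract(source):
    """Read raw rows from a CSV source into dicts."""
    return list(csv.DictReader(source))

def transform(rows):
    """Drop rows missing an id and cast amount to float."""
    cleaned = []
    for row in rows:
        if not row.get("id"):
            continue  # enforce data integrity: skip incomplete records
        row["amount"] = float(row["amount"])
        cleaned.append(row)
    return cleaned

def load(rows, sink):
    """Write cleaned rows back out as CSV."""
    writer = csv.DictWriter(sink, fieldnames=["id", "amount"])
    writer.writeheader()
    writer.writerows(rows)

# Hypothetical input: one record is missing its id and gets dropped.
raw = io.StringIO("id,amount\n1,9.50\n,3.00\n2,4\n")
rows = transform(extract(raw))
out = io.StringIO()
load(rows, out)
print(rows)  # the two valid records, with amount cast to float
```

Each stage is a separate, independently testable function, which mirrors how ETL steps are typically structured before being ported to distributed execution.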
• 5+ years of experience in data engineering, with at least 3 years working with Apache Spark and Amazon EMR
• Strong programming skills in Python and Scala, with a focus on performance tuning and optimization of Spark jobs
• Proven experience with SQL for data management, querying, and optimization
• Deep understanding of distributed computing concepts, data partitioning, and resource management in large-scale data processing systems
• Proficiency in building and maintaining ETL pipelines for structured and unstructured data
• Hands-on experience with AWS services such as S3, Lambda, EMR, Glue, and RDS
• Strong problem-solving skills and the ability to debug complex systems
Preferred Qualifications:
• Experience with DevOps practices, including CI/CD, infrastructure as code (e.g., Terraform, CloudFormation), and containerization (e.g., Docker)
• Experience with Kubernetes and container orchestration for Spark jobs
• Familiarity with streaming data processing using tools such as Kafka, Kinesis, or Flink
• Experience with modern data lake architectures, including Delta Lake or Iceberg
• An AWS certification (e.g., AWS Certified Big Data – Specialty, AWS Certified Solutions Architect) is a plus
• Professional development opportunities with international customers
• Collaborative work environment
• Career path and mentorship programs