October 17
Airflow
Apache
Bash
Cloud
Distributed Systems
Docker
ETL
Google Cloud Platform
Java
Kafka
Kubernetes
NoSQL
OpenShift
Python
RDBMS
Scala
Scikit-Learn
Spark
SQL
Subversion
TensorFlow
Go
Join our team as a Data Engineer and help build cutting-edge data pipelines that drive smarter business decisions. If you're passionate about optimizing data flows, we want you!
Must Have:
• Relevant experience implementing and designing scalable, optimized data pipelines for (pre-)processing and ETL for machine learning models
• Hands-on experience with ML technologies and frameworks, e.g. scikit-learn, MLflow, TensorFlow
• Experience building complex data pipelines, e.g. ETL
• Experience working in cloud environments and with data cloud platforms (e.g. GCP)
• Understanding of code management repositories such as Git/SVN
• Familiarity with software engineering practices such as versioning, testing, documentation, and code review
• Experience with Apache Airflow
• Experience setting up both SQL and NoSQL databases
• Experience with monitoring and observability (ELK stack)
• Deployment and provisioning with automation tools, e.g. Docker, Kubernetes, OpenShift, CI/CD
• Knowledge of MLOps architecture and practices
• Relevant work experience on ML projects
• Knowledge of data manipulation and transformation, e.g. SQL; setting up and troubleshooting SQL and NoSQL databases

Nice to Have:
• Affinity with advanced analytics, data science, and NLP
• Experience with distributed systems and clusters for both batch and streaming data (S3/Spark/Kafka/Flink)
• Programming in Python
• System design and architecture
• Bash scripting and Linux systems administration
• Programming in a statically typed language, e.g. Scala or Java
• Experience building distributed, large-scale, secure applications
• Good understanding of databases, including RDBMS, NoSQL, and time-series databases
• Experience working in an Agile/Scrum environment
• Being a committer to open-source projects is a strong plus