Machine Learning • Data Engineering • Data Analytics • Artificial Intelligence • Data Science
51 - 200
June 1, 2023
Airflow
Apache
AWS
Azure
Big Data
Cassandra
Cloud
DevOps
ETL
GCP
Git
Hadoop
HTML
JavaScript
Kafka
LESS
MongoDB
NoSQL
Postgres
Python
Spark
SQL
• Create and maintain optimal data pipeline architecture across multiple data sources, including licensed and scraped data.
• Assemble large, complex data sets that meet functional needs across Data Teams.
• Design and develop optimal data processing techniques: automating manual processes, data delivery, data validation and data augmentation.
• Develop any ETL processes needed to optimize analysis and performance (a minimal sketch of one such pipeline step follows this list).
• Manage analytics tools that provide actionable insights into usage, customer acquisition, operational efficiency and other key business performance metrics.
• Design and develop API integrations to feed different data models.
• Architect and implement new features from scratch, partnering with AI/ML engineers to identify data sources, gaps and dependencies.
• Identify bugs and performance issues across the stack, using performance monitoring and testing tools to ensure data integrity and a quality user experience.
• Build highly scalable infrastructure using SQL and AWS big data technologies.
• Keep data secure and compliant with international data handling rules.
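For illustration, here is a minimal Python sketch of the kind of extract-validate-load step described above. It is not the company's actual pipeline: the API endpoint, table name and connection string are hypothetical, and it assumes the third-party `requests` and `psycopg2` packages.

```python
# Illustrative sketch only, not the company's actual pipeline: the API
# endpoint, table name and connection string below are hypothetical.
# Assumes the third-party `requests` and `psycopg2` packages.
import psycopg2
import requests

API_URL = "https://api.example.com/metrics"  # hypothetical source


def extract():
    """Fetch raw records from an upstream API."""
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()  # expected: list of {"id": ..., "value": ...}


def validate(records):
    """Keep only records carrying the fields downstream models expect."""
    return [r for r in records
            if r.get("id") is not None and r.get("value") is not None]


def load(records):
    """Upsert validated records into Postgres (placeholder DSN and table)."""
    with psycopg2.connect("dbname=analytics") as conn:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO metrics (id, value) VALUES (%s, %s) "
                "ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value",
                [(r["id"], r["value"]) for r in records],
            )


if __name__ == "__main__":
    load(validate(extract()))
```

The upsert keeps the load idempotent, so re-running the job after a partial failure does not duplicate rows.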
• 3+ years of professional experience shipping high-quality, production-ready code.
• Strong computer science foundations, including data structures and algorithms, operating systems, computer networks, databases and object-oriented programming. Experience in Python.
• Experience setting up data pipelines using relational SQL and NoSQL databases, including Postgres, Cassandra or MongoDB.
• Experience with cloud services for handling data infrastructure, such as Snowflake, GCP, Azure, Databricks or AWS.
• Proven success manipulating, processing and extracting value from large heterogeneous datasets.
• Strong analytical skills for working with unstructured datasets.
• Experience extracting and ingesting data from websites using web crawling tools.
• Experience with big data tools, including Hadoop, Spark and Kafka.
• Expertise with version control systems, such as Git.
• Excellent English communication skills and the ability to have in-depth technical discussions with both the engineering team and business people.
• Self-starter, comfortable working in an early-stage environment.
• Strong project management and organizational skills.
Nice To Have:
• BSc in Computer Science, Mathematics or a similar field; a Master's or PhD is a plus.
• Experience designing ETLs using Apache Airflow (see the DAG sketch after this list).
• Experience with real-time scenarios, low-latency systems and data-intensive environments.
• Experience extracting and ingesting data from Google Analytics or Twitter.
• Understanding of AI/ML models and experience working with AI or Data Science teams.
• Experience developing scalable RESTful APIs.
• Proficiency in HTML, CSS and JavaScript.
• Experience with consumer applications and data handling.
• Familiarity with data privacy regulations and best practices.
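As a hedged example of the Airflow nice-to-have, here is a minimal TaskFlow-style DAG wiring the same extract → validate → load stages. It assumes Apache Airflow 2.4+ (for the `schedule` argument); the DAG id and task bodies are placeholders, not a real pipeline.

```python
# Minimal illustrative DAG, assuming Apache Airflow 2.4+; the dag_id,
# schedule and task bodies are placeholders, not a real pipeline.
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    dag_id="example_etl",            # hypothetical name
    start_date=datetime(2023, 6, 1),
    schedule="@daily",
    catchup=False,
)
def example_etl():
    @task
    def extract():
        # Placeholder: pull raw records from a source system.
        return [{"id": 1, "value": 42}, {"id": 2, "value": None}]

    @task
    def validate(records):
        # Placeholder rule: drop records missing required fields.
        return [r for r in records if r.get("value") is not None]

    @task
    def load(records):
        # Placeholder sink: would write to the warehouse in practice.
        print(f"loading {len(records)} records")

    load(validate(extract()))


example_etl()
```

TaskFlow passes each task's return value to the next via XCom, so the hand-off between stages needs no explicit operator wiring.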
• Support for and investment in career growth and professional development
• Transparent and collaborative workplace culture
• Opportunity to work with passionate and talented professionals
• Fun and friendly work environment
• Opportunity to work on meaningful projects with a positive impact