April 25
• Build data pipelines, text analysis algorithms, query engines, and decision-making engines
• Apply robust and fault-tolerant approaches to create scalable ingestion and data-processing systems
• Debug, profile, and optimize distributed data-intensive applications, improving their latency, accuracy, resource consumption, and throughput
• Work with existing applications built with Spark, S3, Timescale, Python, and Rust
• Directly implement services and features that leverage the results of your data pipelines
• Implement and improve machine learning and data pipelines
• 5+ years of experience as an engineer with a strong understanding of key concepts in distributed systems
• 3+ years of extensive experience in building and deploying data applications
• Fluency in at least one, and ideally more than one, of these languages: Java/Scala/Kotlin, Python, Go, Rust, or C++
• Good understanding of the following concepts: partitioning, replication, map-reduce, indexing, and the CAP theorem
• Experience with distributed storage systems (S3, HDFS, Hive, ClickHouse, Elastic, etc.), distributed processing engines (Spark, etc.), and message queues (Kafka, SQS, etc.)
• Passion for building large-scale ML applications and improving software engineers' productivity
• Some understanding of key concepts in natural language processing, machine learning, or statistical analysis
• Some experience with the machine learning stack (pandas, PyTorch, NumPy, scikit-learn, transformers, etc.)
• Unlimited PTO
• Competitive salary and equity
• Work-life balance
• Flexibility to be fully or partly remote
• Few meetings, so you can ship fast and focus on building
• One Medical membership on us!
• Top-notch medical, dental, vision, short-term disability, long-term disability, and life insurance
• All insurance is 100% company-paid ($0 premiums) for employees and highly subsidized for dependants
• FSA, HSA with company contributions, and pre-tax commuter benefits
• 401(k) plan
• Paid parental leave (up to 12 weeks)