Engineering Member - Pre-training / Data

October 31, 2024

Poolside

Poolside is an accelerator specifically designed for web3 founders and builders. It provides support for projects in decentralized finance, gaming, governance, infrastructure, and NFTs. With a robust ecosystem of 20,000 members, including mentors, investors, and web3 builders, Poolside has co-launched and supported over 110 projects. The accelerator offers unique access to mentorship and technical expertise to help web3 projects scale and achieve successful launches. Poolside also engages with leading companies and protocols to drive growth and innovation in the web3 space.

Blockchain • Accelerator • Incubator • Hub • Fundraising

📋 Description

• About Poolside: Building AI for valuable work and progress.
• Remote-first team across Europe and North America.
• Hands-on role in improving pretraining dataset quality.
• Collaborate with teams on data quality.
• Stay updated with research in dataset design and pretraining.
• Deliver high-quality natural language and source code datasets for AI training.

🎯 Requirements

• Strong machine learning and engineering background
• Experience with Large Language Models (LLM)
• Good knowledge of Transformers is a must
• Knowledge/Experience with cutting-edge training tricks
• Knowledge/Experience of distributed training
• Trained LLMs from scratch
• Knowledge of deep learning fundamentals
• Experience in building trillion-scale pretraining datasets, in particular:
  • Ingest, filter and deduplicate large amounts of web and code data
  • Familiar with concepts making SOTA pretraining datasets: multi-linguality, curriculum learning, data augmentation, data packing, etc
  • Run data ablations, tokenization and data-mixture experiments
  • Develop prompt engineering pipelines to generate synthetic data at scale
  • Fine-tuning small models for data filtering purposes
• Experience working with large-scale GPU clusters and distributed data pipelines
• Strong obsession with data quality
• Research experience
  • Author of scientific papers on any of the topics: applied deep learning, LLMs, source code generation, etc, is a nice to have
  • Can freely discuss the latest papers and descend to fine details
  • Is reasonably opinionated
• Programming experience
  • Strong algorithmic skills
  • Linux
  • Git, Docker, k8s, cloud managed services
  • Data pipelines and queues
  • Python with PyTorch or Jax
  • Nice to have:
    • Prior experience in non-ML programming, especially not in Python
    • C/C++, CUDA, Triton

🏖️ Benefits

• Fully remote work & flexible hours
• 37 days/year of vacation & holidays
• Health insurance allowance for you and dependents
• Company-provided equipment
• Wellbeing, always-be-learning and home office allowances
• Frequent team get-togethers
• Great diverse & inclusive people-first culture
