Halo Media LLC is a company that specializes in solving complex problems by combining creative talent with subject matter expertise. They showcase their work through various case studies, indicating a focus on tailored solutions for their clients.
Web Design • Web Development • Internet • iPhone • Applications
November 5, 2024
• We are seeking an experienced AI/LLM Data Engineer to build and maintain the data pipeline for our Generative AI platform.
• The ideal candidate will be well-versed in the latest Large Language Model (LLM) technologies and have a strong background in data engineering, with a focus on Retrieval-Augmented Generation (RAG) and knowledge-base techniques.
• This role sits in the AI COE within DX Tech & Digital.
• You will work on highly visible strategic projects, collaborating with cross-functional teams to define requirements and deliver high-quality AI solutions.
• The ideal candidate will have a passion for Generative AI and LLMs, with a proven track record of delivering innovative AI applications.
Responsibilities:
• Design, implement, and maintain an end-to-end multi-stage data pipeline for LLMs, including Supervised Fine Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) data processes.
• Identify, evaluate, and integrate diverse data sources and domains to support the Generative AI platform.
• Develop and optimize data processing workflows for chunking, indexing, ingestion, and vectorization for both text and non-text data.
• Benchmark and implement various vector stores, embedding techniques, and retrieval methods.
• Create a flexible pipeline supporting multiple embedding algorithms, vector stores, and search types (e.g., vector search, hybrid search).
• Implement and maintain auto-tagging systems and data preparation processes for LLMs.
• Develop tools for text and image data crawling, cleaning, and refinement.
• Collaborate with cross-functional teams to ensure data quality and relevance for AI/ML models.
• Work with data lakehouse architectures to optimize data storage and processing.
• Integrate and optimize workflows using Snowflake and various vector store technologies.
• Master's degree in Computer Science, Data Science, or a related field
• 3-5 years of work experience in data engineering, preferably in AI/ML contexts
• Proficiency in Python, JSON, HTTP, and related tools
• Strong understanding of LLM architectures, training processes, and data requirements
• Experience with RAG systems, knowledge base construction, and vector databases
• Familiarity with embedding techniques, similarity search algorithms, and information retrieval concepts
• Hands-on experience with data cleaning, tagging, and annotation processes (both manual and automated)
• Knowledge of data crawling techniques and associated ethical considerations
• Strong problem-solving skills and ability to work in a fast-paced, innovative environment
• Familiarity with Snowflake and its integration in AI/ML pipelines
• Experience with various vector store technologies and their applications in AI
• Understanding of data lakehouse concepts and architectures
• Excellent communication, collaboration, and problem-solving skills
• Ability to translate business needs into technical solutions
• Passion for innovation and a commitment to ethical AI development
• Experience building LLM pipelines using frameworks such as LangChain, LlamaIndex, Semantic Kernel, and OpenAI functions
• Familiarity with LLM parameters such as temperature, top-k, and repeat penalty, and with the metrics and methodologies used to evaluate LLM output
• US employee benefits package.
November 4, 2024
Support Snowflake data warehouse performance and optimize queries at Continuus Technologies.
November 3, 2024
Data Engineer role at IMCS Group, requiring expertise in Java and big data.
October 17, 2024
Develop data solutions using Google Cloud Platform and Python scripting.
September 19, 2024
Data Engineer for Grass, building data pipelines and scalable infrastructure.
United States - Remote
$100k - $140k / year
Full Time
Mid-level
Senior
Data Engineer
September 17, 2024
Join BayApps, Inc. as a Snowflake Data Engineer. Build efficient data storage solutions remotely.
Discover 100,000+ Remote Jobs!
We use powerful scraping tech to scan the internet for thousands of remote jobs daily. It runs 24/7 and costs money to operate, so we charge for access to keep the site running.
Of course! You can cancel your subscription at any time with no hidden fees or penalties. Once canceled, you'll still have access until the end of your current billing period.
Other job boards only have jobs from companies that pay to post. This means that you miss out on jobs from companies that don't want to pay. On the other hand, Remote Rocketship scrapes the internet for jobs and doesn't accept payments from companies. This means we have thousands more jobs!
New jobs are constantly being posted. We check each company website every day to ensure we have the most up-to-date job listings.
Yes! We're always looking to expand our listings and appreciate any suggestions from our community. Just send an email to Lior@remoterocketship.com. I read every request.
Remote Rocketship is a solo project by me, Lior Neu-ner. I built this website for my wife when she was looking for a job! She was having a hard time finding remote jobs, so I decided to build her a tool that would search the internet for her.