ML Engineer - LLM Evaluation

August 27


Description

• Own LLM evaluation processes and methods, with a focus on generating benchmarks representative of real-world usage and safety vulnerabilities.
• Generate high-quality synthetic data, curate labels, and conduct rigorous benchmarking.
• Deliver robust, scalable, and reproducible production code.
• Push the envelope by developing benchmarking methods that revamp how we assess the best LLMs for harmlessness and helpfulness. Your research will directly empower our customers to deploy safe and responsible LLMs more feasibly.
• Co-author papers, patents, and presentations with our research team by integrating other members' work with your vertical.

Requirements

• Domain knowledge in LLM evaluation and data curation techniques.
• Extensive experience designing and implementing LLM benchmarks and extending previous methods. Comfort leading end-to-end projects.
• Adaptability and flexibility. In both the academic and startup worlds, a new finding in the community may necessitate an abrupt shift in focus. You must be able to learn, implement, and extend state-of-the-art research.
• Preferred: past research or projects in benchmarking LLMs.

