January 3
Airflow
Apache
AWS
Azure
Cloud
Distributed Systems
Docker
Google Cloud Platform
Hadoop
IoT
Kubernetes
Neo4j
Python
PyTorch
Spark
TensorFlow
• Oomnitza offers the industry’s most versatile Enterprise Technology Management platform that orchestrates and automates key business processes for IT.
• Our SaaS solution, with agentless integrations, best practices and low-code workflows, enables enterprises to leverage their existing infrastructure systems and automate processes such as offboarding, onboarding, audit readiness, refresh forecasting and more.
• Team Oomnitza is seeking an experienced AI & ML Site Reliability Engineer who is passionate about AI, machine learning, and data science to support our innovations in AI and Data product management.
• In this role, you will be responsible for architecting and maintaining infrastructure that supports machine learning (ML), artificial intelligence (AI), and data-driven solutions.
• You will help stand up the foundational systems that enable large-scale AI deployment, including developing and managing Oomnitza’s big data analytics platform, developing AI architecture, implementing vector databases, building knowledge graphs, and optimizing systems for ML model deployment and inference.
• You will collaborate closely with data scientists, infrastructure engineers, product management teams, and UX designers to ensure our customers realize meaningful business value by streamlining workflows, ensuring scalability, and managing the complete lifecycle of AI systems from development to production.
• Education: Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field
• Experience:
  • 5+ years of experience in site reliability engineering, DevOps, MLOps, or a similar role
  • Experience with cloud platforms such as AWS, GCP, or Azure, including AI/ML services (e.g., SageMaker, Google Colab, Vertex AI)
  • Proficient in deploying machine learning models such as regressions, decision trees, neural networks, and recommendation systems into production and managing the model lifecycle
• Technical Skills:
  • Experience with data processing tools such as Apache Spark, Hadoop, or Airflow for large-scale data processing
  • Experience with AI/ML tools and frameworks (e.g., TensorFlow, PyTorch, LangChain, Hugging Face)
  • Strong understanding of vector databases (e.g., Pinecone, Milvus, Chroma) and knowledge graph tools (e.g., Neo4j, RDF)
  • Experience with RAG (Retrieval-Augmented Generation) techniques and GraphRAG
  • Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes)
  • Proficiency in programming languages such as Python and Bash, and experience with ML tools and libraries
  • Experience implementing CI/CD for ML pipelines and working with ML version control systems (e.g., DVC, MLflow)
  • Experience in on-call incident response in high-uptime environments
• Behavioural Skills:
  • Intellectual curiosity, with a hunger to know how things work and to question established ideas, concepts and frameworks
  • Spirit of service, with a “how can I serve” attitude centered on delivering value to the greater team, the overall company, and our broader community of customers
  • Ability to embrace ambiguity and apply structured thinking and problem-solving skills
  • Entrepreneurial spirit with an enthusiasm to take on new challenges
  • Excellent communication and collaboration skills
• Dental & Vision Insurance
• Employee equity plan
• Health Insurance for your spouse and dependents
• Pension, Life insurance and Income protection
• Remote working & flexible work schedules
• Working from home equipment allowance
• Choice of preferred equipment, Mac or PC.
• Regular, fun social events and workshops.