
Software Engineer II, AI Infra
Clari
Posted 5/15/2025

Job Location
Job Summary
Clari is seeking a highly skilled AI Engineer to join its team in Bengaluru, India. As an AI Engineer at Clari, you will design and ship microservices that combine retrieval-augmented generation (RAG), prompt pipelines, and tool-calling agents for use cases such as summarization, Q&A, topic detection, and sentiment scoring. You will work closely with product managers, designers, and others in a cross-functional environment on multiple projects. The ideal candidate has 2+ years of total engineering experience, including at least 1 year building or integrating LLM/GenAI features, solid Python skills, and familiarity with one ML/LLM framework such as PyTorch, Hugging Face Transformers, or LangChain/LlamaIndex. The candidate should also have hands-on experience with vector databases and search stacks, proven backend fundamentals, and an understanding of prompt engineering, temperature/top-p tuning, embeddings, and token-cost optimization. Clari offers flexible working hours, hybrid work opportunities, life and accidental coverage, mental health support, pre-IPO stock options, well-being and professional development stipends, 100% paid parental leave, discretionary paid time off, monthly 'take a break' days, and Focus Fridays. This is a fully remote opportunity that can be worked from any location in India.
Job Description
Responsibilities
- Applied LLM Engineering: Design and ship microservices that combine retrieval-augmented generation (RAG), prompt pipelines, and tool-calling agents for use cases such as summarization, Q&A, topic detection, and sentiment scoring
- Embeddings & Retrieval: Build ingestion jobs that chunk, embed (OpenAI / Hugging Face / in-house), and index millions of calls and emails into Elasticsearch or a vector store; optimize latency, recall, and cost
- Evaluation & Safety: Own offline and online eval harnesses (BLEU, ROUGE, human-in-the-loop, red-teaming) and put model guardrails in place (PII filtering, toxicity checks)
- Fast Iteration: Pair with PMs & designers to run A/B tests, ship weekly behind feature flags, and iterate based on usage signals
- Technical Depth: Contribute Python production code (FastAPI/Ray) plus tests; review PRs, write design docs, and champion best practices for other product squads integrating LLMs
Qualifications
- 2+ years of total engineering experience, including at least 1 year building or integrating LLM/GenAI features (e.g. OpenAI, Bedrock, Claude, Llama 2)
- Solid Python skills (typing, async, pytest) and familiarity with one ML/LLM framework: PyTorch, Hugging Face Transformers, or LangChain/LlamaIndex
- Hands‑on with vector databases / search stacks (Elasticsearch k‑NN, Pinecone, FAISS, or equivalent)
- Proven backend fundamentals: REST/GraphQL, distributed queues (Kafka/SQS)
- Understanding of prompt engineering, temperature / top-p tuning, embeddings, and token-cost optimization
- Comfort with reading research papers / blog posts and rapidly prototyping ideas
- Clear written & verbal communication; able to explain trade‑offs to both engineers and non‑technical stakeholders
- Exposure to Ray, Triton, or other high‑throughput inference stacks is a plus
Perks and Benefits @ Clari
- Flexible working hours and hybrid work opportunities
- Life and accidental coverage
- Mental health support provided by Silver Oak Health
- Pre-IPO stock options
- Well-being and professional development stipends
- 100% paid parental leave
- Discretionary paid time off, monthly ‘take a break’ days, and Focus Fridays
- Focus on culture: Charitable giving match, plus in-person and virtual events