We are a high-growth organization in the Enterprise Generative AI and Applied Machine Learning sector, building production-grade LLM services, conversational agents, and retrieval-augmented generation (RAG) solutions for enterprise customers. We deliver secure, low-latency GenAI features that combine vector search, prompt orchestration, and API-driven workflows. This is an on-site engineering role based in India, focused on shipping scalable GenAI systems into production.
Primary Title: Generative AI Engineer
Role & Responsibilities
- Architect, implement, and deploy end-to-end GenAI features: embeddings ➜ vector indexing ➜ retrieval ➜ LLM orchestration for production use cases.
- Build and integrate LLM workflows and agents using LangChain (or equivalent), OpenAI/Hugging Face APIs, and custom orchestration logic.
- Fine-tune, evaluate, and optimize models and embedding pipelines to improve relevance, latency, and cost (including quantization and batching strategies).
- Design and operate scalable vector search solutions (FAISS/Pinecone/Milvus) with efficient indexing, sharding, and query optimization.
- Containerize and productionize inference services with monitoring, CI/CD, observability, and GPU-aware deployment patterns.
- Collaborate with product, data, and ML teams to translate requirements into reliable GenAI features and document engineering best practices.
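To make the embeddings ➜ indexing ➜ retrieval ➜ orchestration flow above concrete, here is a minimal, self-contained sketch. It uses a toy hash-based embedder and a brute-force cosine-similarity index in NumPy as hypothetical stand-ins for a real embedding model and a FAISS/Pinecone/Milvus index; the function and class names are illustrative, not part of any library.

```python
# Minimal RAG retrieval sketch. The toy embedder and brute-force index below
# are stand-ins for a real embedding model and a FAISS-style vector store.
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Deterministic toy embedding: hash each token into a fixed-size vector.
    A production system would call a trained embedding model instead."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token.strip(".,?!")) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


class VectorIndex:
    """Brute-force inner-product search over unit vectors (conceptually what
    a flat FAISS index does, without the optimized indexing/sharding)."""

    def __init__(self) -> None:
        self.vectors: list[np.ndarray] = []
        self.docs: list[str] = []

    def add(self, doc: str) -> None:
        self.docs.append(doc)
        self.vectors.append(embed(doc))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        # Inner product equals cosine similarity because vectors are normalized.
        scores = np.array(self.vectors) @ q
        top = np.argsort(-scores)[:k]
        return [self.docs[i] for i in top]


def build_prompt(query: str, context: list[str]) -> str:
    """Orchestration step: assemble retrieved passages into an LLM prompt."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"


index = VectorIndex()
for doc in [
    "FAISS supports approximate nearest-neighbor search.",
    "Docker packages inference services into containers.",
    "Quantization reduces model memory footprint.",
]:
    index.add(doc)

hits = index.search("How does FAISS nearest-neighbor search work?", k=1)
prompt = build_prompt("How does FAISS nearest-neighbor search work?", hits)
```

In the real role, `embed` would be backed by a model (e.g. via Hugging Face or the OpenAI API), `VectorIndex` by FAISS/Pinecone/Milvus, and `build_prompt` by an orchestration layer such as LangChain; the data flow, however, is the same.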
Skills & Qualifications
- Python
- PyTorch
- Hugging Face Transformers
- Prompt engineering
- LangChain
- FAISS
- OpenAI API
- Model fine-tuning
- Docker
Benefits & Culture Highlights
- On-site role with access to GPU resources and hands-on production deployment experience.
- Fast-paced learning environment with opportunities to lead GenAI projects and shape technical direction.
- Competitive compensation, focused mentorship, and budget for technical training and conferences.