Banking · Insurance · Fintech · 7+ Years
I design and build distributed backend systems that handle real financial workloads — concurrent transactions, fraud detection, event-driven pipelines, and AI-augmented decision making. My work sits at the intersection of enterprise Java engineering and applied AI integration.
I specialise in the hard parts of backend engineering — designing systems that stay correct under concurrency, stay available under failure, and stay maintainable as they grow. Most of my production experience is in the BFSI domain, where data integrity and regulatory compliance are non-negotiable.
Over the past year I've been going deep on AI integration — not surface-level API calls, but production patterns: RAG pipelines with vector search, agentic systems with tool-calling, circuit breakers on LLM calls, caching strategies for AI responses. The goal is AI that behaves predictably in a distributed system, not just a chatbot bolted onto an API.
A microservices platform where an AI agent autonomously evaluates insurance claims using RAG — retrieving historical fraud cases, policy coverage rules, and compliance guidelines from a vector store before making a decision. No hardcoded business rules. The agent reasons through context and acts.
The architecture problem I solved:
Most AI integrations in this space are stateless — send claim text, get a risk score back. That doesn't scale when you need decisions grounded in your own historical data, auditable reasoning trails, and downstream action execution. I built a dedicated claims-decision-engine service that owns the entire AI reasoning pipeline, isolated from the claims submission service so each can evolve independently.
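That isolation also means the submission path never waits on AI reasoning. A rough plain-Java analog of the fire-and-forget hand-off (no Spring here; the class and method names are illustrative, standing in for the non-blocking WebClient call):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class FireAndForgetDemo {
    // Signals when the out-of-band decision has been produced (test hook).
    public static final CountDownLatch decisionDelivered = new CountDownLatch(1);

    // Analog of the submission service: acknowledges immediately,
    // while the decision is computed asynchronously.
    public static String submitClaim(String claimId) {
        CompletableFuture.runAsync(() -> evaluateAndNotify(claimId)); // fire-and-forget
        return "SUBMITTED:" + claimId; // caller is never held up by the AI pipeline
    }

    private static void evaluateAndNotify(String claimId) {
        // ... call the decision engine, then notify (email/Kafka) ...
        decisionDelivered.countDown();
    }

    // Blocks briefly so a caller can observe the out-of-band completion.
    public static boolean awaitDecision() {
        try {
            return decisionDelivered.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(submitClaim("CLM-1001")); // printed before the decision exists
        System.out.println("decision delivered: " + awaitDecision());
    }
}
```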
Technical highlights:
- Spring AI 1.0 tool-calling — agent has 7 annotated tools (3 RAG search, 4 action). `@Tool` descriptions drive LLM behaviour — the decision logic lives in the model's reasoning, not in application conditionals
- pgvector with HNSW index — chose pgvector over Pinecone/Weaviate to avoid operational overhead; cosine similarity search across 768-dimension embeddings generated by Ollama locally
- Fully async pipeline — `ai-claims-service` fires a non-blocking WebClient call to the decision engine with `.subscribe()`, returns the submission response immediately. Decision arrives out-of-band via email with complete agent reasoning
- Dual Kafka topics — `claim-events` for submission notifications, `claim-decisions` for agent decision notifications; `notification-service` consumes both independently
- Resilience4j circuit breaker on all Groq calls — rule-based fallback scoring if AI is unavailable, claim submission never blocked
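The HNSW index only accelerates the nearest-neighbour lookup; the ranking metric itself is plain cosine similarity. A minimal sketch of what pgvector computes under the hood (toy 3-dimension vectors stand in for the 768-dimension embeddings):

```java
public class CosineSimilarity {
    // pgvector's <=> operator returns cosine distance; similarity = 1 - distance.
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] claim     = {0.2, 0.7, 0.1};   // embedding of the incoming claim
        double[] fraudCase = {0.21, 0.68, 0.12}; // a semantically similar past case
        double[] unrelated = {0.9, -0.3, 0.4};   // an unrelated document
        System.out.printf("similar:   %.3f%n", cosine(claim, fraudCase));
        System.out.printf("unrelated: %.3f%n", cosine(claim, unrelated));
    }
}
```

Documents whose similarity to the claim embedding is highest are what the RAG search tools return to the agent as context.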
Stack: Java 17 · Spring Boot 3.4 · Spring AI 1.0 · Groq llama-3.3-70b · Ollama nomic-embed-text · pgvector · PostgreSQL · Kafka · Redis · Resilience4j · Docker Compose · JWT
An event-driven insurance claims microservice with real-time AI fraud scoring. This was the foundation project before the agentic system — focused on reliable AI API integration patterns in a production microservice context.
Technical highlights:
- Groq AI fraud scoring (llama-3.3-70b-versatile) — structured prompt returns fraud risk (LOW/MEDIUM/HIGH), priority, and reasoning summary
- Circuit breaker with fallback — if Groq is down, rule-based scoring kicks in automatically. Claim submission never fails due to AI unavailability
- Redis caching on AI responses — identical claim descriptions return cached assessments, reducing Groq API calls by ~60%
- Kafka event pipeline — `CLAIM_SUBMITTED`, `CLAIM_APPROVED`, `CLAIM_SETTLED` events trigger downstream notification service with zero coupling
- Idempotency + rate limiting — prevents duplicate claims and abuse of the AI scoring endpoint
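The fallback path the circuit breaker routes to can be sketched in plain Java. The thresholds and keywords below are purely illustrative (the actual rules are not described here); the point is the shape of a deterministic scorer that returns the same LOW/MEDIUM/HIGH contract as the AI call:

```java
public class FallbackFraudScorer {
    // Hypothetical heuristics standing in for the real rule set:
    // score a claim without any AI dependency so submission never blocks.
    public static String score(double claimAmount, boolean firstClaim, String description) {
        int risk = 0;
        if (claimAmount > 50_000) risk += 2;  // unusually large claims score higher
        if (firstClaim) risk += 1;            // no claim history to compare against
        if (description.toLowerCase().contains("total loss")) risk += 1;
        if (risk >= 3) return "HIGH";
        if (risk >= 1) return "MEDIUM";
        return "LOW";
    }
}
```

Because the fallback honours the same return contract, downstream consumers never need to know whether the score came from the model or the rules.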
Stack: Java 17 · Spring Boot · MySQL · Redis · Kafka · Groq AI · Resilience4j · OpenFeign · Docker · JWT
A production-grade loan lifecycle management system built for the banking domain, designed around the concurrency and data integrity challenges that make financial systems genuinely hard to build correctly.
The architecture problem I solved:
Loan approval is a classic distributed systems problem — two concurrent requests approving the same loan simultaneously must not both succeed, and a user submitting the same application twice must not create duplicate records. I solved both with pessimistic locking at the database level and idempotency keys at the API level, then stress-tested both paths.
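The idempotency half of that can be sketched in plain Java. A `ConcurrentHashMap` here stands in for a database table with a unique constraint on the idempotency key (class and method names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentLoanApplications {
    // Stands in for a DB table with a UNIQUE constraint on idempotency_key.
    private final Map<String, String> applicationsByKey = new ConcurrentHashMap<>();

    // On a duplicate submission, return the existing application id
    // instead of creating a second record. putIfAbsent is atomic,
    // mirroring the atomicity the unique constraint gives at the DB level.
    public String submit(String idempotencyKey, String newApplicationId) {
        String existing = applicationsByKey.putIfAbsent(idempotencyKey, newApplicationId);
        return existing != null ? existing : newApplicationId;
    }
}
```

The atomic insert-or-return is the essential property: two racing submissions with the same key both observe exactly one winning record.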
Technical highlights:
- Pessimistic row locking (`SELECT ... FOR UPDATE`) on loan approval — prevents double disbursement under concurrent load
- Idempotency keys with unique constraint enforcement — duplicate submissions return the existing record, never create a new one
- Redis caching on credit score lookups and eligibility results — eliminates repeated DB calls for the same user profile
- Bucket4j rate limiting — per-user application limits enforced at the API gateway layer
- Full Kubernetes deployment on minikube — Deployments, Services, ConfigMaps, PersistentVolumeClaims; proper resource limits and health probes
- Loan lifecycle state machine: `APPLIED → UNDER_REVIEW → APPROVED → DISBURSED → CLOSED` with audit trail
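The happy-path state machine above can be encoded directly as an enum, so illegal jumps (e.g. approving an already-disbursed loan) are rejected in one place. A minimal sketch covering only the transitions listed (a real system would add rejection/cancellation states):

```java
import java.util.EnumSet;
import java.util.Set;

public enum LoanState {
    APPLIED, UNDER_REVIEW, APPROVED, DISBURSED, CLOSED;

    // Legal forward transitions; anything else is rejected (and audited).
    private static Set<LoanState> nextOf(LoanState s) {
        switch (s) {
            case APPLIED:      return EnumSet.of(UNDER_REVIEW);
            case UNDER_REVIEW: return EnumSet.of(APPROVED);
            case APPROVED:     return EnumSet.of(DISBURSED);
            case DISBURSED:    return EnumSet.of(CLOSED);
            default:           return EnumSet.noneOf(LoanState.class); // CLOSED is terminal
        }
    }

    public boolean canTransitionTo(LoanState target) {
        return nextOf(this).contains(target);
    }
}
```

Centralising the transition table means the audit trail only ever records transitions this method has allowed.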
Stack: Java 17 · Spring Boot · MySQL · Redis · Docker · Kubernetes · Eureka · Resilience4j · Bucket4j · JWT
Distributed Systems: Microservices, service mesh, inter-service communication patterns
Data Integrity: Transactions, ACID guarantees, row locking, idempotency at scale
AI Integration: RAG pipelines, agentic systems, vector search, LLM reliability patterns
Resilience Engineering: Circuit breakers, bulkheads, retries, graceful degradation
Performance: Redis caching, DB indexing, query optimisation, connection pooling
Infrastructure: Docker, Kubernetes, CI/CD, observability
Open to Senior Java Backend Engineer roles in Banking, Insurance, and Fintech.
