Building production AI systems at Nobility RCM — architecting agentic workflows, medical compliance automation, and LLM integrations that replace hours of manual work with a single pipeline.
AI Engineer | Nobility RCM | Islamabad, Pakistan | March 2025 – Present
- Developed a medical compliance agent that reads clinical notes and auto-extracts ICD-10/CPT/HCPCS billing codes using a fine-tuned Phi-3 model with a 4-agent LangGraph architecture and 3-layer anti-hallucination validation — replacing hours of manual medical coding with a single pipeline.
- Built LLM-powered browser automation agents (Playwright + Browser Use) that navigate billing portals, extract claim data, and auto-fill forms — turning repetitive manual workflows into agentic AI processes, cutting processing time by 15% and boosting accuracy by 10%.
- Designed RAG pipelines with hybrid retrieval (FAISS vectors + BM25 keyword matching) and MCP server integrations to connect LLMs with internal billing systems, enabling natural-language querying across claim databases and accelerating decision-making by 10%.
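As a rough illustration of the hybrid retrieval idea in the last bullet, the sketch below merges a vector ranking and a keyword ranking with reciprocal rank fusion. The fusion method, the `k=60` constant, and the claim IDs are assumptions chosen for the example; the source does not say how the FAISS and BM25 results are actually combined.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists standing in for FAISS and BM25 output.
vector_hits = ["claim_17", "claim_03", "claim_42"]
keyword_hits = ["claim_03", "claim_99", "claim_17"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Rank-based fusion avoids having to normalize cosine distances against BM25 scores, which live on different scales.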
class AIEngineer:
    def __init__(self):
        self.mission = "Democratizing AI accessibility"
        self.passion = "Building intelligent systems that augment human potential"
        self.mantra = "The best way to predict the future is to invent it!"

    def current_focus(self):
        return [
            "Advanced Transformer Architectures",
            "Multimodal AI Systems",
            "Production MLOps",
            "Open Source Contributions",
        ]
LinkedIn's API is locked down and most job tools are half-baked wrappers. I built a full MCP (Model Context Protocol) server that gives any LLM-based assistant a structured tool interface to LinkedIn: search jobs, generate tailored resumes and cover letters, and manage applications programmatically. It plugs directly into Claude Desktop, Cursor, or any MCP-compatible client. No scraping, no hacks, just clean protocol-level access.
Most OCR tools choke on messy scans: broken tables, mixed layouts, multiple records per page. Built an enterprise extraction system that auto-detects document type, generates adaptive schemas on the fly (zero-shot), and extracts multiple records per page using a local VLM (Qwen3-VL 8B). Features cross-page duplicate detection, checkpoint/resume for large batches, 4x parallel speedup, and consolidated Excel + JSON + Markdown exports. Integrates Mem0, so the system learns from corrections and improves over time.
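One possible shape for the cross-page duplicate detection mentioned above is to fingerprint each extracted record after normalizing case and whitespace, then keep only the first occurrence. This is a minimal sketch under that assumption; the function names, the `_page` field, and the sample records are hypothetical and not taken from the project.

```python
import hashlib
import json


def record_key(record):
    """Build a stable fingerprint from a record's normalized fields."""
    normalized = {
        str(k).strip().lower(): " ".join(str(v).split()).lower()
        for k, v in record.items()
    }
    payload = json.dumps(normalized, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def drop_cross_page_duplicates(pages):
    """Keep the first occurrence of each record across all pages."""
    seen = set()
    unique = []
    for page_number, records in enumerate(pages, start=1):
        for record in records:
            key = record_key(record)
            if key not in seen:
                seen.add(key)
                # Remember which page the surviving copy came from.
                unique.append({**record, "_page": page_number})
    return unique


# Hypothetical extraction output: the same person appears on two pages.
pages = [
    [{"Name": "Jane Doe", "ID": "A-100"}],
    [{"name": "jane  doe ", "id": "a-100"}, {"name": "John Roe", "id": "B-200"}],
]
unique = drop_cross_page_duplicates(pages)
print(len(unique))
```

Exact-match hashing only catches duplicates that differ in case or spacing; OCR noise would call for fuzzy matching instead.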
Job applications are repetitive: same forms, same clicks, hundreds of times. Built a full-stack AI platform (FastAPI + React + Redis) that discovers jobs, scores resumes against postings using FAISS vector matching, tailors documents via LLM, and submits applications through Playwright browser automation. Features a human-in-the-loop review workflow, batch processing queue, real-time WebSocket progress tracking, ATS scoring engine, and analytics dashboard. Supports OpenAI, Groq, and Gemini via LiteLLM routing.
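The resume-to-posting scoring can be pictured as a similarity search over embeddings. The sketch below uses plain-Python cosine similarity on toy three-dimensional vectors as a stand-in for the FAISS index; the vectors, job names, and function names are made up for illustration, and a real system would use model-generated embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def rank_postings(resume_vec, postings):
    """Return (job_id, score) pairs sorted from best to worst match."""
    scored = [
        (job_id, cosine_similarity(resume_vec, vec))
        for job_id, vec in postings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed); real ones would have hundreds of dimensions.
resume_vec = [1.0, 0.0, 1.0]
postings = {
    "backend": [1.0, 0.0, 1.0],
    "design": [0.0, 1.0, 0.0],
    "ml": [1.0, 1.0, 1.0],
}
ranked = rank_postings(resume_vec, postings)
print([job_id for job_id, _ in ranked])
```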
A prediction model is useless if it silently goes stale. Built a full production ML system that fetches live EUR/USD forex data, engineers 33 features, and retrains automatically every 2 hours via Airflow. Serves predictions through FastAPI with real-time drift detection on every request. Full observability: Prometheus metrics (latency P95/P99, drift ratio), Grafana dashboards with 13 alert rules, and MLflow experiment tracking. Runs as an 8-service Docker Compose stack with DVC data versioning, 5 CI/CD workflows, and 37 unit tests.
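A per-request drift check like the one described could be as simple as counting how many incoming features fall far outside their training distribution. The following is a sketch under that assumption: the z-score rule, the threshold of 3, and the feature names are illustrative choices, not the project's actual drift detector.

```python
def drift_ratio(live_features, baseline, z_threshold=3.0):
    """Fraction of features whose live value lies more than
    z_threshold standard deviations from the training mean.

    baseline maps feature name -> (training_mean, training_std).
    """
    if not live_features:
        return 0.0
    drifted = 0
    for name, value in live_features.items():
        mean, std = baseline[name]
        if std == 0:
            # A constant training feature drifts if it changes at all.
            is_drifted = value != mean
        else:
            is_drifted = abs(value - mean) / std > z_threshold
        drifted += int(is_drifted)
    return drifted / len(live_features)


# Hypothetical training statistics and one incoming request.
baseline = {"close_lag_1": (1.08, 0.01), "rsi_14": (50.0, 10.0)}
request = {"close_lag_1": 1.15, "rsi_14": 55.0}

ratio = drift_ratio(request, baseline)
print(ratio)
```

A ratio computed this way is a single number per request, which makes it straightforward to export as a gauge metric and alert on.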
- 🔒 Closed issue #5 in Rayyan9477/Solace-AI





