Vinay Kumar Reddy Budideti
I build full-stack applications and ship AI into them. React · Java Spring Boot · Python · LangChain · RAG · Docker · GCP — owning every layer from database to UI, end-to-end, production-grade.
Professional Highlights
Summary
Full-Stack Engineer with hands-on experience across the complete product lifecycle: React/TypeScript frontends, Java Spring Boot APIs, AI integration with LangChain and RAG pipelines, and cloud deployment on GCP and AWS. Shipped production features at Vosyn and the University of Kansas NCCS Research Lab, owning each feature end-to-end across Agile sprint cycles and delivering measurable results.
35+ Features Shipped
End-to-end, requirements to release
40% Fewer Bugs
Via versioned APIs & test coverage
15 Agile Sprints
Delivered across full sprint cycles
4 Research Papers
Published peer-reviewed research
Skills & Tech Stack
Frontend
Backend
AI & NLP
DevOps & Cloud
Work Methodology
How I approach software development
Discovery & Planning
Understanding business goals before writing code.
Clean Architecture
Maintainable, testable, and scalable systems.
Async Collaboration
Clear documentation & written communication.
Continuous Improvement
Code reviews, refactoring, performance tuning.
Work Experience
Software Engineer (Full-Stack & AI) · Internship
Vosyn — Illinois, United States · Remote
- Architected and deployed RAG pipelines using LangChain, GPT-4, and FAISS/Pinecone vector databases, reducing document processing time by 40% across 6 enterprise clients and enabling sub-second semantic search over 50,000+ documents
- Built Python backend services integrating LLM APIs with retry logic, token management, and structured error handling — maintained 99.5% pipeline uptime across production workloads processing 10,000+ daily requests
- Developed a shared React/TypeScript component library adopted by 3 product teams, cutting UI development time by ~30% and enforcing consistent design patterns across the platform
- Designed and shipped versioned REST APIs with backward compatibility guarantees, reducing integration breakages by ~40% and eliminating cross-team deployment conflicts
- Engineered n8n automation pipelines orchestrating 5+ webhook endpoints for event-driven workflows, eliminating 4+ hours/week of manual data processing, and Dockerized all services for reproducible deployments
- Implemented structured logging and observability via Splunk, reducing root-cause isolation time from hours to under 30 minutes and enabling proactive alerting on pipeline anomalies
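As a rough illustration of the retry logic mentioned above, here is a minimal sketch of exponential backoff with jitter around an LLM call. The names `TransientLLMError` and `call_with_retry`, the attempt count, and the delay values are hypothetical and chosen for the example; they are not taken from the Vosyn codebase.

```python
import random
import time


class TransientLLMError(Exception):
    """Hypothetical error type standing in for a retryable API failure."""


def call_with_retry(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on TransientLLMError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientLLMError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The base delay doubles on each attempt; jitter spreads out concurrent retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Passing `sleep` as a parameter keeps the helper easy to test, since a no-op can be injected in place of real waiting.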
Information Technology Support Specialist
The University of Kansas — Lawrence, Kansas, United States · On-site
- Resolved 100+ support tickets per week over 6 months, diagnosing software, hardware, and account issues for 500+ university users
- Maintained a 95% first-call resolution rate across the full 6-month tenure by applying structured troubleshooting to every ticket (tracked via ServiceNow resolution metrics)
- Supported 20+ users daily via LogMeIn remote sessions, resolving technical issues without requiring in-person escalation
- Managed complex multi-issue tickets simultaneously while maintaining full SLA compliance and quality across all cases
- Received consistent positive feedback from users and supervisors by maintaining professional, customer-first communication in every interaction
Full-Stack Software Developer & Research Assistant
The University of Kansas — Lawrence, Kansas, United States · Hybrid
- Developed internal research web applications using React.js, integrating RESTful APIs to display real-time service data with async state management and comprehensive error handling
- Implemented structured logging and Splunk-based monitoring that reduced average debugging time by 35% across the research team's application suite
- Co-authored 4 peer-reviewed publications in deep learning and natural language processing, contributing novel neural architecture experiments and evaluation frameworks
- Built IoT security monitoring dashboards and embedded systems interfaces, processing real-time sensor data for cybersecurity research applications
Selected Projects
AgentFuse — LLM Agent Cost Optimization Runtime
Production-grade Python SDK that enforces per-run LLM budgets with semantic caching, graduated cost policies, and unified observability — across 12 LLM providers including OpenAI, Anthropic, Gemini, and DeepSeek. Published on PyPI as agentfuse-runtime.
- 87.5% cache hit rate on repeated and paraphrased prompts via two-tier Redis + FAISS semantic cache
- Graduated budget policies: auto-downgrade the model at 80%, compress context at 90%, and terminate gracefully at 100%
- Supports 22+ models across 12 providers with hot-reloadable pricing
- 260 unit tests, 86% core coverage
- Full framework integrations — LangChain, CrewAI, OpenAI Agents SDK
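The graduated budget thresholds listed above (80%, 90%, 100%) can be pictured with a small sketch. This is an illustration only: `RunBudget`, `Action`, and their methods are hypothetical names made up for this example and are not the agentfuse-runtime API.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    DOWNGRADE_MODEL = "downgrade_model"
    COMPRESS_CONTEXT = "compress_context"
    TERMINATE = "terminate"


@dataclass
class RunBudget:
    """Tracks spend for one agent run against a fixed dollar limit (hypothetical type)."""
    limit_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd

    def next_action(self) -> Action:
        used = self.spent_usd / self.limit_usd
        # Check the strictest threshold first so only one action is returned.
        if used >= 1.0:
            return Action.TERMINATE
        if used >= 0.9:
            return Action.COMPRESS_CONTEXT
        if used >= 0.8:
            return Action.DOWNGRADE_MODEL
        return Action.CONTINUE
```

A caller would check `next_action()` before each LLM call and switch models, trim context, or stop the run accordingly.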
TradeFlow — Event-Driven Loan Processing Platform
Production-grade microservices fintech platform handling loan applications end-to-end — from submission through automated underwriting, manual review workflows, real-time notifications, and CQRS analytics — powered by Apache Kafka event streaming.
- Full microservices architecture — API Gateway, Application Service, Underwriting Service, Notification Service, Reporting Service
- Event-driven with Apache Kafka + Avro schemas + Confluent Schema Registry
- CQRS pattern — separate write and read models for high-performance dashboards
- Outbox pattern for guaranteed at-least-once Kafka delivery
- JWT auth at gateway level with role-based access (Admin, Manager, Analyst, Applicant)
- 4-stage CI/CD pipeline via GitHub Actions, pushing images to AWS ECR and rolling out to AWS ECS
- Kubernetes-ready with HPA auto-scaling (2-10 replicas at 70% CPU)
- AI-powered underwriting explanations via Anthropic Claude API
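To show the idea behind the outbox pattern mentioned above, here is a minimal sketch that uses SQLite in place of the real database and a plain callback in place of a Kafka producer. The table names, the topic name `loan.application.submitted`, and the function names are assumptions made for the example, not TradeFlow's actual schema or code.

```python
import json
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE applications (id INTEGER PRIMARY KEY, applicant TEXT, amount REAL)"
    )
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "topic TEXT, payload TEXT, published INTEGER DEFAULT 0)"
    )


def submit_application(conn, applicant, amount):
    """Write the business row and its event in one transaction."""
    with conn:  # commits on success, rolls back both inserts on error
        cur = conn.execute(
            "INSERT INTO applications (applicant, amount) VALUES (?, ?)",
            (applicant, amount),
        )
        app_id = cur.lastrowid
        payload = json.dumps({"application_id": app_id, "amount": amount})
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("loan.application.submitted", payload),
        )
    return app_id


def relay_once(conn, publish):
    """Publish pending outbox rows in order and return how many were sent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, topic, payload in rows:
        # If publish raises, the row stays unpublished and is retried next time.
        publish(topic, payload)
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Because a row is marked as published only after `publish` returns, a crash between the two steps leads to a resend rather than a lost event, which is why delivery is at-least-once and consumers need to tolerate duplicates.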
Intent Atoms — Semantic LLM Caching Engine
Semantic caching system that minimizes LLM API costs through a three-tier matching architecture. Instead of caching full queries, Intent Atoms decomposes compound requests into atomic semantic units, enabling granular cache reuse and achieving an 87.5% cache hit rate and 71.8% cost reduction on a benchmark of 100 real Anthropic API calls.
- Three-tier hybrid matching — direct hits (>0.85 similarity), adapted responses (0.70-0.85 via Haiku), and atomic decomposition for novel queries
- 87.5% cache hit rate with 71.8% cost savings on 100 real Anthropic API calls
- Atomic intent decomposition — breaks compound queries into reusable semantic units for granular cache reuse
- FAISS vector search with sentence-transformers all-mpnet-base-v2 embeddings (768 dimensions)
- React dashboard with Recharts visualizations for real-time cache performance monitoring
- REST API with query processing, stats, atom browsing, eviction, and health endpoints
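The three-tier routing described above can be sketched with the stated thresholds (above 0.85 for direct hits, 0.70 to 0.85 for adapted responses). This is a simplified illustration: it uses brute-force cosine similarity over toy vectors instead of FAISS and sentence-transformers embeddings, and the names `route`, `DIRECT_HIT`, and `ADAPT_FLOOR` are hypothetical.

```python
import math

DIRECT_HIT = 0.85   # above this: return the cached response as-is
ADAPT_FLOOR = 0.70  # from here up to DIRECT_HIT: adapt the cached response


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_vec, cache):
    """cache is a list of (embedding, response); returns (tier, response or None)."""
    best_score, best_response = 0.0, None
    for vec, response in cache:
        score = cosine(query_vec, vec)
        if score > best_score:
            best_score, best_response = score, response
    if best_score > DIRECT_HIT:
        return "direct", best_response
    if best_score >= ADAPT_FLOOR:
        return "adapt", best_response
    # No close match: fall through to atomic decomposition of the query.
    return "decompose", None
```

In a full system the "adapt" tier would hand the cached response to a smaller model for rewriting, and the "decompose" tier would split the query into atoms and look each one up separately.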
CodeMind — Chat with Any GitHub Repository
Production-grade RAG application that lets you have natural language conversations with any GitHub codebase. Ingests repositories, chunks and embeds code with context-aware strategies, and retrieves precise answers grounded in the actual source — not hallucinated guesses.
- RAG pipeline with context-aware code chunking and pgvector embeddings
- NestJS backend with Redis caching for fast retrieval across large repos
- React frontend with real-time streaming responses
- Supports any public GitHub repository — paste a link and start chatting
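One way to picture context-aware code chunking is to split a file along its syntax rather than by fixed character counts. The sketch below does this for Python files using the standard `ast` module. It is an assumption about what such a strategy could look like, written in Python for brevity, and is not CodeMind's actual chunker; `chunk_python_source` and the chunk fields are hypothetical.

```python
import ast


def chunk_python_source(source, path):
    """Split a Python file into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            body = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append({
                "path": path,
                "symbol": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Prefix the file path and symbol so the embedded text carries its context.
                "text": f"# {path} :: {node.name}\n{body}",
            })
    return chunks
```

Each chunk's `text` would then be embedded and stored alongside its path and line range, so a retrieved answer can point back to the exact source location.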
JobRadar — AI-Powered Job Search Agent
Personal AI job board that aggregates positions from multiple job APIs, scores each listing against my profile, and ranks opportunities by H-1B sponsorship likelihood. Built to automate the most painful parts of the international job search.
- Multi-source job aggregation from major job listing APIs
- AI-driven profile matching and relevance scoring per listing
- H-1B sponsorship likelihood prediction based on company hiring patterns
- Dashboard with filters for role type, location, sponsorship, and match score
NutriBot — AI Nutrition Assistant
AI-powered personalized nutrition assistant built with Next.js 15 and the Vercel AI SDK. Provides real-time macro tracking, meal recommendations, and nutritional insights through a conversational interface — evolved from an MS capstone into a polished, deployed product.
- Next.js 15 with Vercel AI SDK for streaming conversational responses
- Personalized nutrition tracking with real-time macro data
- Clean, responsive chat UI built with React and Tailwind CSS
- Deployed and live on Vercel — production-ready
Tools & Collaboration Stack
How I work and communicate effectively
Version Control
GitHub · GitLab · Bitbucket
Project Management
Jira · Linear · Trello
Communication
Slack · Zoom · Notion
Design & Testing
Figma · Postman · Jest
Education
Master of Science · Computer Science
The University of Kansas
Jul 2023 – Aug 2025
NxtWave CCBP 4.0 Intensive · Fellow
NxtWave
Jan 2024 – Dec 2024
Bachelor of Technology · Computer Science
Sree Vidyanikethan Engineering College
2019 – 2023
Open to Full-Time & Remote Opportunities
Remote · Hybrid · On-Site · Full-Time
Let's build something amazing together