Vinay Kumar Reddy Budideti
I build full-stack applications and ship AI into them. React · Java Spring Boot · Python · LangChain · RAG · Docker · GCP — owning every layer from database to UI, end-to-end, production-grade.
Professional Highlights
Summary
Full-Stack Engineer with hands-on experience across the complete product lifecycle, from React/TypeScript frontends and Java Spring Boot APIs to AI integration with LangChain and RAG pipelines and cloud deployment on GCP and AWS. Shipped production features at Vosyn, the University of Kansas NCCS Research Lab, and Cognizant, owning features end-to-end across Agile sprint cycles and delivering measurable results.
35+ Features Shipped
End-to-end, requirements to release
40% Fewer Bugs
Via versioned APIs & test coverage
15 Agile Sprints
Delivered across full sprint cycles
4 Research Papers
Published peer-reviewed research
Skills & Tech Stack
Frontend
Backend
AI & NLP
DevOps & Cloud
Work Methodology
How I approach software development
Discovery & Planning
Understanding business goals before writing code.
Clean Architecture
Maintainable, testable, and scalable systems.
Async Collaboration
Clear documentation & written communication.
Continuous Improvement
Code reviews, refactoring, performance tuning.
Work Experience
AI Software Engineer
Vosyn — Illinois, United States · Remote
- Delivered 4 AI user stories in 2 months by building LangChain chains and RAG pipelines that use internal product docs and support tickets as the knowledge source
- Improved LLM groundedness by ~35% by building 4 RAG pipelines using vector embeddings and semantic retrieval
- Reduced average LLM response latency by ~25% by optimizing LangChain chain architecture and retrieval chunk sizing
- Cut out-of-scope AI outputs by ~40% by implementing schema-based output formatting and post-generation validators across all LLM workflows
- Connected 4 AI services to existing REST APIs and frontend by wiring LangChain agents into full-stack product architecture with clean integration boundaries
- Maintained reliable pipeline runs by applying structured logging and execution tracing across every LangChain chain
Full-Stack Engineering Intern
Vosyn — Illinois, United States · Remote
- Shipped 12 end-to-end features in 3 months, owning the full workflow from requirements to release in a remote Agile team
- Reduced UI dev time by ~30% by building a reusable React/TypeScript component library
- Cut integration breakages by ~40% by implementing versioned REST APIs with input validation and standardized error handling
- Eliminated 4+ hrs/week of manual work by building n8n automation pipelines across 5+ webhooks with idempotent steps and retry logic
- Accelerated root-cause isolation from hours to under 30 mins by introducing structured logging and traceable execution paths via Splunk
- Maintained zero regressions in owned flows throughout the internship by adding unit and integration test coverage for every critical flow before each release
NCCS Lab · Full-Stack Software Developer
The University of Kansas — Lawrence, KS
- Delivered 10+ full-stack dashboard features from requirements to production across 15 Agile sprint cycles, serving 5–15 active research staff
- Reduced UI feature delivery time by ~25% by building reusable React component modules with consistent state handling
- Supported 4+ core daily research workflows by building Java Spring Boot REST endpoints with full validation, exception handling, and service-layer logic
- Improved data correctness across 3+ dashboard screens by designing integrated data flows across MongoDB for document storage and SQL for reporting
- Reduced system triage time by 40% by instrumenting structured logs and building Splunk observability dashboards
- Built and integrated a Rasa conversational assistant into the dashboard for guided in-app help for 5–15 daily users
- Enabled context-backed self-service answers by building a RAG prototype, connected to the React UI, that retrieves answers from internal research documents in real time
- Maintained regression-safe deployments across all sprints by writing JUnit/Mockito and Jest/RTL tests covering all critical application flows
Information Technology Support Specialist
The University of Kansas — Lawrence, KS
- Resolved 100+ support tickets/week across 6 months, diagnosing software, hardware, and account issues for 500+ university users
- Maintained a 95% first-call resolution rate consistently across the full tenure using structured troubleshooting tracked via ServiceNow
- Supported 20+ users daily via LogMeIn remote sessions, resolving issues without in-person escalation
- Maintained 100% SLA compliance while handling complex multi-issue tickets in parallel
Full Stack Developer
Cognizant Technology Solutions — Andhra Pradesh, India
- Engineered end-to-end Java enterprise web applications supporting 10,000+ active business users, delivering a 35% improvement in application performance through backend optimization and API redesign
- Built and maintained 15+ RESTful APIs using Spring Boot and Hibernate, reducing average API response time from ~800ms to under 200ms — a 75% improvement in responsiveness
- Developed responsive front-end interfaces across 5+ application modules, reducing customer-reported UI defects by 40% within the first two release cycles
- Optimized complex SQL queries and indexing strategies on Oracle and MySQL databases, improving data retrieval speeds by 45% and cutting report generation time by 50%
- Drove a 60% reduction in production-critical bugs by introducing JUnit and Mockito unit testing, achieving 80%+ code coverage across core business logic
- Partnered with DevOps to deploy 3 major production releases via Jenkins CI/CD pipelines, reducing deployment time by 50%
- Consistently delivered across 12+ two-week Agile sprints, maintaining a 95% on-time delivery rate
- Mentored 2 junior developers on Java best practices, improving team code review pass rates by 25%
Programming Analyst Intern
Cognizant Technology Solutions — Andhra Pradesh, India
- Contributed to 3 live production modules within the first 60 days, delivering code that passed QA with zero critical defects
- Identified and resolved 20+ legacy code bugs, reducing application downtime by 15%
- Built internal utility tools automating manual validation tasks, saving the team 5+ hours/week
- Converted to full-time before internship completion based on performance
Software Developer — Co-op
Sree Vidyanikethan Engineering College — Rangampeta, India
- Developed 8+ Java backend modules using OOP and Spring Boot principles, improving application workflow efficiency by 45% across core business processes
- Wrote Python automation scripts eliminating repetitive manual data tasks, saving the team 10+ hours/week in operational effort
- Designed and implemented RESTful APIs enabling frontend-backend communication, reducing integration errors by 30%
- Optimized MySQL database queries improving data retrieval speed by 40%, directly impacting application response times
- Delivered 100% of assigned features on time across all 3 project milestone phases
- Maintained 90%+ code quality scores across all peer reviews following clean code standards
- Collaborated within a team of 4 developers across daily standups, code reviews, and sprint deliverables
Selected Projects
AgentFuse — LLM Agent Cost Optimization Runtime
Production-grade Python SDK that enforces per-run LLM budgets with semantic caching, graduated cost policies, and unified observability — across 12 LLM providers including OpenAI, Anthropic, Gemini, and DeepSeek. Published on PyPI as agentfuse-runtime.
- 87.5% cache hit rate on repeated and paraphrased prompts via two-tier Redis + FAISS semantic cache
- Graduated budget policies — auto-downgrade model at 80%, compress context at 90%, graceful terminate at 100%
- Supports 22+ models across 12 providers with hot-reloadable pricing
- 260 unit tests, 86% core coverage
- Full framework integrations — LangChain, CrewAI, OpenAI Agents SDK
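The graduated budget policy listed above can be illustrated with a short sketch. This is not the agentfuse-runtime API: the function name `budget_action` and the action labels are hypothetical, and only the 80%, 90%, and 100% thresholds come from the description.

```python
def budget_action(spent_usd: float, budget_usd: float) -> str:
    """Map current spend to a policy step (hypothetical labels).

    Thresholds follow the project description:
    80% -> downgrade model, 90% -> compress context, 100% -> terminate.
    """
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")

    used = spent_usd / budget_usd  # fraction of the per-run budget consumed

    # Check the strictest threshold first so the highest tier wins.
    if used >= 1.0:
        return "terminate"
    if used >= 0.9:
        return "compress_context"
    if used >= 0.8:
        return "downgrade_model"
    return "continue"


if __name__ == "__main__":
    for spent in (0.50, 0.85, 0.95, 1.20):
        print(f"spent ${spent:.2f} of $1.00 ->", budget_action(spent, 1.00))
```

Checking the thresholds from highest to lowest keeps the policy monotonic: a run that has crossed 90% never falls back to the milder 80% action.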
TradeFlow — Event-Driven Loan Processing Platform
Production-grade microservices fintech platform handling loan applications end-to-end — from submission through automated underwriting, manual review workflows, real-time notifications, and CQRS analytics — powered by Apache Kafka event streaming.
- Full microservices architecture — API Gateway, Application Service, Underwriting Service, Notification Service, Reporting Service
- Event-driven with Apache Kafka + Avro schemas + Confluent Schema Registry
- CQRS pattern — separate write and read models for high-performance dashboards
- Outbox pattern for guaranteed at-least-once Kafka delivery
- JWT auth at gateway level with role-based access (Admin, Manager, Analyst, Applicant)
- 4-stage CI/CD pipeline: GitHub Actions → AWS ECR → AWS ECS rolling deployment
- Kubernetes-ready with HPA auto-scaling (2-10 replicas at 70% CPU)
- AI-powered underwriting explanations via Anthropic Claude API
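To make the HPA setting above concrete (2-10 replicas at 70% CPU), the sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The function name and defaults are illustrative assumptions, and the sketch leaves out the autoscaler's tolerance band and stabilization windows.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 70.0,  # target utilization from the description
    min_replicas: int = 2,         # lower bound from the description
    max_replicas: int = 10,        # upper bound from the description
) -> int:
    """Approximate the HPA replica calculation (simplified sketch)."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    # Clamp the result to the configured replica range.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(desired_replicas(4, 105.0))  # load above target: scale out
    print(desired_replicas(2, 10.0))   # light load: held at the minimum
    print(desired_replicas(8, 140.0))  # heavy load: capped at the maximum
```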
Intent Atoms — Semantic LLM Caching Engine
Semantic caching system that minimizes LLM API costs through a three-tier matching architecture. Instead of caching full queries, Intent Atoms decomposes compound requests into atomic semantic units, enabling granular cache reuse. It achieved an 87.5% cache hit rate and a 71.8% cost reduction on a benchmark of 100 real Anthropic API calls.
- Three-tier hybrid matching — direct hits (>0.85 similarity), adapted responses (0.70-0.85 via Haiku), and atomic decomposition for novel queries
- 87.5% cache hit rate with 71.8% cost savings on 100 real Anthropic API calls
- Atomic intent decomposition — breaks compound queries into reusable semantic units for granular cache reuse
- FAISS vector search with sentence-transformers all-mpnet-base-v2 embeddings (768 dimensions)
- React dashboard with Recharts visualizations for real-time cache performance monitoring
- REST API with query processing, stats, atom browsing, eviction, and health endpoints
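The three-tier routing described above can be sketched as a similarity check followed by a threshold decision. This is an illustration, not the project's code: `cosine_similarity`, `choose_tier`, and the tier labels are hypothetical names, plain Python lists stand in for the FAISS index and 768-dimension embeddings, and treating a score of exactly 0.85 as "adapted" is an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def choose_tier(similarity: float) -> str:
    """Pick a cache tier from the best-match similarity score."""
    if similarity > 0.85:
        return "direct_hit"   # serve the cached response as-is
    if similarity >= 0.70:
        return "adapted"      # adapt the cached response with a small model
    return "decompose"        # novel query: split into atomic intents


if __name__ == "__main__":
    cached = [1.0, 0.0]
    query = [1.0, 0.0]
    score = cosine_similarity(cached, query)
    print(score, choose_tier(score))
    print(choose_tier(0.75), choose_tier(0.40))
```

Only the first tier avoids a model call entirely, so the thresholds trade cost savings against the risk of serving a near-miss answer.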
CodeMind — Chat with Any GitHub Repository
Production-grade RAG application that lets you have natural language conversations with any GitHub codebase. Ingests repositories, chunks and embeds code with context-aware strategies, and retrieves precise answers grounded in the actual source — not hallucinated guesses.
- RAG pipeline with context-aware code chunking and pgvector embeddings
- NestJS backend with Redis caching for fast retrieval across large repos
- React frontend with real-time streaming responses
- Supports any public GitHub repository — paste a link and start chatting
JobRadar — AI-Powered Job Search Agent
Personal AI job board that aggregates positions from multiple job APIs, scores each listing against my profile, and ranks opportunities by H-1B sponsorship likelihood. Built to automate the most painful parts of the international job search.
- Multi-source job aggregation from major job listing APIs
- AI-driven profile matching and relevance scoring per listing
- H-1B sponsorship likelihood prediction based on company hiring patterns
- Dashboard with filters for role type, location, sponsorship, and match score
NutriBot — AI Nutrition Assistant
AI-powered personalized nutrition assistant built with Next.js 15 and the Vercel AI SDK. Provides real-time macro tracking, meal recommendations, and nutritional insights through a conversational interface — evolved from an MS capstone into a polished, deployed product.
- Next.js 15 with Vercel AI SDK for streaming conversational responses
- Personalized nutrition tracking with real-time macro data
- Clean, responsive chat UI built with React and Tailwind CSS
- Deployed and live on Vercel — production-ready
Tools & Collaboration Stack
How I work and communicate effectively
Version Control
GitHub · GitLab · Bitbucket
Project Management
Jira · Linear · Trello
Communication
Slack · Zoom · Notion
Design & Testing
Figma · Postman · Jest
Education
Master of Science · Computer Science
The University of Kansas
Jul 2023 – Aug 2025
NxtWave CCBP 4.0 Intensive · Fellow
NxtWave
Jan 2024 – Dec 2024
Bachelor of Technology · Computer Science
Sree Vidyanikethan Engineering College
2019 – 2023
Open to Full-Time & Remote Opportunities
Remote · Hybrid · On-Site · Full-Time
Let's build something amazing together