Vinay Kumar Reddy Budideti

Full-Stack Engineer · AI Integration

Open to Full-Time · Remote · Hybrid · On-Site
3+ Years Experience
50+ Features Shipped
4 Research Publications

I build full-stack applications and ship AI into them, owning every layer from database to UI through to production. React · Java Spring Boot · Python · LangChain · RAG · Docker · GCP.

Lawrence, Kansas, US · Remote · Hybrid · On-Site
US Timezone (CST) · Remote-First

Professional Highlights

Summary

Full-Stack Engineer with hands-on experience across the complete product lifecycle: React/TypeScript frontends, Java Spring Boot APIs, AI integration with LangChain and RAG pipelines, and cloud deployment on GCP and AWS. Shipped production features at Vosyn, the University of Kansas NCCS Research Lab, and Cognizant, owning each feature end-to-end across Agile sprint cycles and delivering measurable results.

35+ Features Shipped

End-to-end, requirements to release

40% Fewer Bugs

Via versioned APIs & test coverage

15 Agile Sprints

Delivered across full sprint cycles

4 Research Papers

Published peer-reviewed research

Skills & Tech Stack

Frontend

React & TypeScript 90%
HTML5 & CSS3 88%
JavaScript 87%
Next.js 85%
Tailwind CSS 85%

Backend

REST APIs 92%
Java & Spring Boot 90%
Hibernate & JPA 87%
FastAPI 85%
PostgreSQL & MongoDB 84%
MySQL & Oracle 84%
Python & Flask 82%
NestJS 82%
Redis 80%
Node.js 78%

AI & NLP

LangChain & RAG 88%
OpenAI & LLM APIs 85%
Rasa NLU 83%
FAISS & Vector Search 82%
Vector Embeddings 80%
NLP Pipelines 80%
Vercel AI SDK 78%

DevOps & Cloud

Git & CI/CD 88%
Jenkins 85%
Docker & GCP 82%
Vercel 82%
AWS & Cloud 80%
Splunk & Observability 78%
Maven & Gradle 78%

Work Methodology

How I approach software development

Discovery & Planning

Understanding business goals before writing code.

Clean Architecture

Maintainable, testable, and scalable systems.

Async Collaboration

Clear documentation & written communication.

Continuous Improvement

Code reviews, refactoring, performance tuning.

Work Experience

AI Software Engineer

Vosyn — Illinois, United States · Remote

Oct 2025 – Nov 2025
  • Delivered 4 AI user stories in 2 months, building LangChain chains and RAG pipelines that use internal product docs and support tickets as the knowledge source
  • Improved LLM groundedness by ~35% by building 4 RAG pipelines using vector embeddings and semantic retrieval
  • Reduced average LLM response latency by ~25% by optimizing LangChain chain architecture and retrieval chunk sizing
  • Cut out-of-scope AI outputs by ~40% by implementing schema-based output formatting and post-generation validators across all LLM workflows
  • Connected 4 AI services to the existing REST APIs and frontend by wiring LangChain agents into the full-stack product architecture with clean integration boundaries
  • Maintained reliable pipeline runs by applying structured logging and execution tracing across every LangChain chain
LangChain · RAG · Vector Embeddings · REST APIs · Python

Full-Stack Engineering Intern

Vosyn — Illinois, United States · Remote

Aug 2025 – Oct 2025
  • Shipped 12 end-to-end features in 3 months, owning the full workflow from requirements to release on a remote Agile team
  • Reduced UI dev time by ~30% by building a reusable React/TypeScript component library
  • Cut integration breakages by ~40% by implementing versioned REST APIs with input validation and standardized error handling
  • Eliminated 4+ hrs/week of manual work by building n8n automation pipelines across 5+ webhooks with idempotent steps and retry logic
  • Accelerated root-cause isolation from hours to under 30 mins by introducing structured logging and traceable execution paths via Splunk
  • Kept every owned flow regression-free during the internship by adding unit and integration test coverage for each critical flow before every release
React · TypeScript · REST APIs · n8n · Splunk · Agile
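
The "idempotent steps and retry logic" idea from the automation bullet above can be sketched in a few lines. This is an illustrative Python sketch only, not the n8n workflows themselves: the `IdempotentStep` class, the `event_id` key, and the in-memory result store are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict


class IdempotentStep:
    """Hypothetical helper: runs a webhook handler at most once per event ID."""

    def __init__(self, max_attempts: int = 3, base_delay_s: float = 0.5) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be >= 1")
        self.max_attempts = max_attempts
        self.base_delay_s = base_delay_s
        # Assumption: an in-memory dict stands in for a durable store.
        self._results: Dict[str, str] = {}

    def run(self, event_id: str, handler: Callable[[], str]) -> str:
        # Idempotency: a replayed event returns the stored result
        # instead of executing the handler a second time.
        if event_id in self._results:
            return self._results[event_id]
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = handler()
            except Exception:
                if attempt == self.max_attempts:
                    raise  # out of attempts: surface the failure
                # Exponential backoff between attempts.
                time.sleep(self.base_delay_s * 2 ** (attempt - 1))
            else:
                self._results[event_id] = result
                return result
        raise RuntimeError("unreachable")
```

A webhook that is delivered twice with the same `event_id` then triggers the handler only once, and a transient failure is retried before the step gives up.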

NCCS Lab · Full-Stack Software Developer

The University of Kansas — Lawrence, KS

Jan 2023 – Mar 2024
  • Delivered 10+ full-stack dashboard features, from requirements to production, across 15 Agile sprint cycles for 5–15 active research staff
  • Reduced UI feature delivery time by ~25% by building reusable React component modules with consistent state handling
  • Supported 4+ core daily research workflows by building Java Spring Boot REST endpoints with full validation, exception handling, and service-layer logic
  • Improved data correctness across 3+ dashboard screens by designing integrated data flows across MongoDB for document storage and SQL for reporting
  • Reduced system triage time by 40% by instrumenting structured logs and building Splunk observability dashboards
  • Built and integrated a Rasa conversational assistant into the dashboard for guided in-app help for 5–15 daily users
  • Enabled context-backed self-service answers by building a RAG prototype, connected to the React UI, that retrieves answers from internal research documents in real time
  • Maintained regression-safe deployments across all sprints by writing JUnit/Mockito and Jest/RTL tests covering all critical application flows
React · Java · Spring Boot · MongoDB · SQL · Rasa · RAG · Splunk · JUnit

Information Technology Support Specialist

The University of Kansas — Lawrence, KS

Apr 2024 – Aug 2024
  • Resolved 100+ support tickets/week across 6 months, diagnosing software, hardware, and account issues for 500+ university users
  • Maintained a 95% first-call resolution rate consistently across the full tenure using structured troubleshooting tracked via ServiceNow
  • Supported 20+ users daily via LogMeIn remote sessions, resolving issues without in-person escalation
  • Maintained 100% SLA compliance while handling complex multi-issue tickets in parallel
ServiceNow · LogMeIn · Technical Support · Troubleshooting · SLA Management

Full Stack Developer

Cognizant Technology Solutions — Andhra Pradesh, India

Jul 2022 – Jul 2023
  • Engineered end-to-end Java enterprise web applications supporting 10,000+ active business users, delivering a 35% improvement in application performance through backend optimization and API redesign
  • Built and maintained 15+ RESTful APIs using Spring Boot and Hibernate, reducing average API response time from ~800ms to under 200ms — a 75% improvement in responsiveness
  • Developed responsive front-end interfaces across 5+ application modules, reducing customer-reported UI defects by 40% within the first two release cycles
  • Optimized complex SQL queries and indexing strategies on Oracle and MySQL databases, improving data retrieval speeds by 45% and cutting report generation time by 50%
  • Drove a 60% reduction in production-critical bugs by introducing JUnit and Mockito unit testing, achieving 80%+ code coverage across core business logic
  • Partnered with DevOps to deploy 3 major production releases via Jenkins CI/CD pipelines, reducing deployment time by 50%
  • Consistently delivered across 12+ two-week Agile sprints, maintaining a 95% on-time delivery rate
  • Mentored 2 junior developers on Java best practices, improving team code review pass rates by 25%
Java · Spring Boot · Spring MVC · Hibernate · JPA · REST APIs · React.js · MySQL · Oracle · JUnit · Mockito · Jenkins · CI/CD · Maven · Git · Agile · Swagger

Programming Analyst Intern

Cognizant Technology Solutions — Andhra Pradesh, India

Jan 2022 – Jun 2022
  • Contributed to 3 live production modules within the first 60 days, delivering code that passed QA with zero critical defects
  • Identified and resolved 20+ legacy code bugs, reducing application downtime by 15%
  • Built internal utility tools automating manual validation tasks, saving the team 5+ hours/week
  • Converted to full-time before internship completion based on performance
Java · Spring Boot · REST APIs · MySQL · JUnit · Git · Agile

Software Developer — Co-op

Sree Vidyanikethan Engineering College — Rangampeta, India

Jun 2020 – Dec 2020
  • Developed 8+ Java backend modules using OOP and Spring Boot principles, improving application workflow efficiency by 45% across core business processes
  • Wrote Python automation scripts eliminating repetitive manual data tasks, saving the team 10+ hours/week in operational effort
  • Designed and implemented RESTful APIs enabling frontend-backend communication, reducing integration errors by 30%
  • Optimized MySQL database queries improving data retrieval speed by 40%, directly impacting application response times
  • Delivered 100% of assigned features on time across all 3 project milestone phases
  • Maintained 90%+ code quality scores across all peer reviews following clean code standards
  • Collaborated within a team of 4 developers across daily standups, code reviews, and sprint deliverables
Java · Python · Spring Boot · REST APIs · MySQL · OOP · Git · Agile

Selected Projects

AgentFuse — LLM Agent Cost Optimization Runtime

Production-grade Python SDK that enforces per-run LLM budgets with semantic caching, graduated cost policies, and unified observability — across 12 LLM providers including OpenAI, Anthropic, Gemini, and DeepSeek. Published on PyPI as agentfuse-runtime.

Team: Solo Project
Duration: 2025 — Active · v0.2.0
Stack: Python, FAISS, Redis, LangChain, OpenAI, Anthropic, OpenTelemetry, Prometheus
Impact: 71.8% cost reduction — same workload costs $0.24 vs $0.87 without AgentFuse. 179,445 tokens saved per 100 API calls.
PyPI · agentfuse-runtime
  • 87.5% cache hit rate on repeated and paraphrased prompts via two-tier Redis + FAISS semantic cache
  • Graduated budget policies — auto-downgrade model at 80%, compress context at 90%, graceful terminate at 100%
  • Supports 22+ models across 12 providers with hot-reloadable pricing
  • 260 unit tests, 86% core coverage
  • Full framework integrations — LangChain, CrewAI, OpenAI Agents SDK
Python · FAISS · Redis · LangChain · OpenAI · Anthropic · Semantic Caching · Vector Search · OpenTelemetry · Prometheus · PyPI

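
The graduated budget policy listed above (downgrade at 80%, compress at 90%, terminate at 100%) can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the agentfuse-runtime API: `RunBudget`, `Action`, and their methods are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    DOWNGRADE_MODEL = "downgrade_model"
    COMPRESS_CONTEXT = "compress_context"
    TERMINATE = "terminate"


@dataclass
class RunBudget:
    """Hypothetical per-run budget tracker (not the real SDK interface)."""

    limit_usd: float
    spent_usd: float = 0.0

    def __post_init__(self) -> None:
        if self.limit_usd <= 0:
            raise ValueError("limit_usd must be positive")

    def record(self, cost_usd: float) -> None:
        """Add the cost of one LLM call to the running total."""
        self.spent_usd += cost_usd

    def next_action(self) -> Action:
        """Map the fraction of budget used to the graduated policy step."""
        used = self.spent_usd / self.limit_usd
        if used >= 1.0:
            return Action.TERMINATE
        if used >= 0.9:
            return Action.COMPRESS_CONTEXT
        if used >= 0.8:
            return Action.DOWNGRADE_MODEL
        return Action.CONTINUE
```

Checking the thresholds from highest to lowest keeps each step exclusive, so a run at 92% of budget compresses context rather than only downgrading the model.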
TradeFlow — Event-Driven Loan Processing Platform

Production-grade microservices fintech platform handling loan applications end-to-end — from submission through automated underwriting, manual review workflows, real-time notifications, and CQRS analytics — powered by Apache Kafka event streaming.

Team: Solo Project
Duration: 2025 — Active
Stack: Java, Spring Boot, Apache Kafka, React, TypeScript, PostgreSQL, Redis, Docker, Kubernetes, AWS
Impact: Full microservices loan processing pipeline with event-driven architecture, CQRS, and AI-powered underwriting
AWS ECS · Docker · Kubernetes
  • Full microservices architecture — API Gateway, Application Service, Underwriting Service, Notification Service, Reporting Service
  • Event-driven with Apache Kafka + Avro schemas + Confluent Schema Registry
  • CQRS pattern — separate write and read models for high-performance dashboards
  • Outbox pattern for guaranteed at-least-once Kafka delivery
  • JWT auth at gateway level with role-based access (Admin, Manager, Analyst, Applicant)
  • 4-stage CI/CD pipeline via GitHub Actions to AWS ECR to AWS ECS rolling deployment
  • Kubernetes-ready with HPA auto-scaling (2-10 replicas at 70% CPU)
  • AI-powered underwriting explanations via Anthropic Claude API
Java · Spring Boot · Apache Kafka · React · TypeScript · PostgreSQL · Redis · Docker · Kubernetes · AWS ECS · AWS ECR · CI/CD · CQRS · Microservices · JWT · Avro

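
To make the HPA setting above concrete (2-10 replicas at a 70% CPU target), here is a small worked calculation. It uses the scaling rule from the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), clamped to the replica bounds. It is a simplified sketch: the function name is hypothetical, and the real controller's tolerance band and stabilization window are ignored.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 70.0,
    min_replicas: int = 2,
    max_replicas: int = 10,
) -> int:
    """Simplified HPA rule: scale proportionally to CPU, then clamp."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas at 90% CPU: 4 * 90 / 70 = 5.14, rounded up to 6 replicas.
scaled_up = desired_replicas(4, 90.0)
# 2 replicas at 20% CPU: the raw value is 1, clamped to the minimum of 2.
at_floor = desired_replicas(2, 20.0)
# 8 replicas at 140% CPU: the raw value is 16, clamped to the maximum of 10.
at_ceiling = desired_replicas(8, 140.0)
```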
Intent Atoms — Semantic LLM Caching Engine

Semantic caching system that minimizes LLM API costs through a three-tier matching architecture. Instead of caching full queries, Intent Atoms decomposes compound requests into atomic semantic units, enabling granular cache reuse — achieving 87.5% cache hit rate and 71.8% cost reduction on production benchmarks.

Team: Solo Project
Duration: 2025 — Active · V3
Stack: Python, FAISS, Sentence-Transformers (MPNet), FastAPI, React, Anthropic Claude API
Impact: 71.8% cost reduction ($0.24 vs $0.87) with 87.5% cache hit rate and 179,445 tokens saved per 100 API calls
Vercel · FastAPI Backend
  • Three-tier hybrid matching — direct hits (>0.85 similarity), adapted responses (0.70-0.85 via Haiku), and atomic decomposition for novel queries
  • 87.5% cache hit rate with 71.8% cost savings on 100 real Anthropic API calls
  • Atomic intent decomposition — breaks compound queries into reusable semantic units for granular cache reuse
  • FAISS vector search with sentence-transformers all-mpnet-base-v2 embeddings (768 dimensions)
  • React dashboard with Recharts visualizations for real-time cache performance monitoring
  • REST API with query processing, stats, atom browsing, eviction, and health endpoints
Python · FAISS · Sentence-Transformers · FastAPI · React · Anthropic Claude · Vector Search · Semantic Caching

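
The three-tier matching described above can be sketched as a routing function over similarity scores. This is a minimal illustration, not the Intent Atoms implementation: it uses plain-Python cosine similarity in place of FAISS, the function and tier names are hypothetical, and treating exactly 0.70 as "adapt" is an assumption about the boundary.

```python
import math
from typing import Sequence

DIRECT_HIT_ABOVE = 0.85  # similarity > 0.85: serve the cached answer as-is
ADAPT_FLOOR = 0.70       # 0.70 to 0.85: adapt a cached answer


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_vec: Sequence[float], cached_vecs: Sequence[Sequence[float]]) -> str:
    """Pick a tier from the best similarity against cached embeddings."""
    best = max(
        (cosine_similarity(query_vec, vec) for vec in cached_vecs),
        default=0.0,  # empty cache: every query is novel
    )
    if best > DIRECT_HIT_ABOVE:
        return "direct_hit"
    if best >= ADAPT_FLOOR:
        return "adapt"
    return "decompose"  # novel query: fall through to atomic decomposition
```

In a real deployment the linear scan would be replaced by a vector index lookup, but the threshold logic stays the same.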
CodeMind — Chat with Any GitHub Repository

Production-grade RAG application that lets you have natural language conversations with any GitHub codebase. Ingests repositories, chunks and embeds code with context-aware strategies, and retrieves precise answers grounded in the actual source — not hallucinated guesses.

Team: Solo Project
Duration: 2025-2026
Stack: NestJS, React, PostgreSQL + pgvector, Redis, AWS, LangChain, OpenAI
Impact: Enables developers to query and understand unfamiliar codebases through natural conversation
Vercel · AWS
  • RAG pipeline with context-aware code chunking and pgvector embeddings
  • NestJS backend with Redis caching for fast retrieval across large repos
  • React frontend with real-time streaming responses
  • Supports any public GitHub repository — paste a link and start chatting
NestJS · React · PostgreSQL · pgvector · Redis · AWS · LangChain · OpenAI · TypeScript

JobRadar — AI-Powered Job Search Agent

Personal AI job board that aggregates positions from multiple job APIs, scores each listing against my profile, and ranks opportunities by H-1B sponsorship likelihood. Built to automate the most painful parts of the international job search.

Team: Solo Project
Duration: 2025-2026
Stack: Python, FastAPI, React, OpenAI, PostgreSQL
Impact: Automated job discovery and ranking — turning hours of manual searching into minutes
Self-hosted
  • Multi-source job aggregation from major job listing APIs
  • AI-driven profile matching and relevance scoring per listing
  • H-1B sponsorship likelihood prediction based on company hiring patterns
  • Dashboard with filters for role type, location, sponsorship, and match score
Python · FastAPI · React · OpenAI · PostgreSQL · REST APIs

NutriBot — AI Nutrition Assistant

AI-powered personalized nutrition assistant built with Next.js 15 and the Vercel AI SDK. Provides real-time macro tracking, meal recommendations, and nutritional insights through a conversational interface — evolved from an MS capstone into a polished, deployed product.

Team: Solo Project · MS Capstone
Duration: Master's Capstone Project · 2024–2025
Stack: Next.js 15, Vercel AI SDK, React, Tailwind CSS, Vercel
Impact: End-to-end AI chatbot delivering real-time personalized nutrition guidance
Vercel
  • Next.js 15 with Vercel AI SDK for streaming conversational responses
  • Personalized nutrition tracking with real-time macro data
  • Clean, responsive chat UI built with React and Tailwind CSS
  • Deployed and live on Vercel — production-ready
Next.js 15 · Vercel AI SDK · React · Tailwind CSS · TypeScript · Vercel

Tools & Collaboration Stack

How I work and communicate effectively

Version Control

GitHub · GitLab · Bitbucket

Project Management

Jira · Linear · Trello

Communication

Slack · Zoom · Notion

Design & Testing

Figma · Postman · Jest

Education

Master of Science · Computer Science

The University of Kansas

Jul 2023 – Aug 2025

NxtWave CCBP 4.0 Intensive · Fellow

NxtWave

Jan 2024 – Dec 2024

Bachelor of Technology · Computer Science

Sree Vidyanikethan Engineering College

2019 – 2023

Open to Full-Time & Remote Opportunities

Remote · Hybrid · On-Site · Full-Time

Let's build something amazing together