Vinay Kumar Reddy Budideti

Full-Stack Engineer · AI Integration

Open to Full-Time · Remote · Hybrid · On-Site
50+ Features Shipped
4 Research Publications

I build full-stack applications and ship AI into them. React · Java Spring Boot · Python · LangChain · RAG · Docker · GCP — owning every layer from database to UI, end-to-end, production-grade.

Lawrence, Kansas, US · Remote · Hybrid · On-Site
US Timezone (CST) · Remote-First

Professional Highlights

Summary

Full-Stack Engineer with hands-on experience across the complete product lifecycle — React/TypeScript frontends to Java Spring Boot APIs, AI integration with LangChain & RAG pipelines, and cloud deployment on GCP & AWS. Shipped production features at Vosyn and the University of Kansas NCCS Research Lab — owning every feature end-to-end, working across Agile sprint cycles, and consistently delivering measurable results.

35+ Features Shipped

End-to-end, requirements to release

40% Fewer Bugs

Via versioned APIs & test coverage

15 Agile Sprints

Delivered across full sprint cycles

4 Research Papers

Published peer-reviewed research

Skills & Tech Stack

Frontend

React & TypeScript 90%
HTML5 & CSS3 88%
JavaScript 87%
Next.js 85%
Tailwind CSS 85%

Backend

REST APIs 92%
Java & Spring Boot 90%
Hibernate & JPA 87%
FastAPI 85%
PostgreSQL & MongoDB 84%
MySQL & Oracle 84%
Python & Flask 82%
NestJS 82%
Redis 80%
Node.js 78%

AI & NLP

LangChain & RAG 88%
OpenAI & LLM APIs 85%
Rasa NLU 83%
FAISS & Vector Search 82%
Vector Embeddings 80%
NLP Pipelines 80%
Vercel AI SDK 78%

DevOps & Cloud

Git & CI/CD 88%
Jenkins 85%
Docker & GCP 82%
Vercel 82%
AWS & Cloud 80%
Splunk & Observability 78%
Maven & Gradle 78%

Work Methodology

How I approach software development

Discovery & Planning

Understanding business goals before writing code.

Clean Architecture

Maintainable, testable, and scalable systems.

Async Collaboration

Clear documentation & written communication.

Continuous Improvement

Code reviews, refactoring, performance tuning.

Work Experience

Software Engineer (Full-Stack & AI) · Internship

Vosyn — Illinois, United States · Remote

Aug 2025 – Nov 2025
  • Architected and deployed RAG pipelines using LangChain, GPT-4, and FAISS/Pinecone vector databases, reducing document processing time by 40% across 6 enterprise clients and enabling sub-second semantic search over 50,000+ documents
  • Built Python backend services integrating LLM APIs with retry logic, token management, and structured error handling — maintained 99.5% pipeline uptime across production workloads processing 10,000+ daily requests
  • Developed a shared React/TypeScript component library adopted by 3 product teams, cutting UI development time by ~30% and enforcing consistent design patterns across the platform
  • Designed and shipped versioned REST APIs with backward compatibility guarantees, reducing integration breakages by ~40% and eliminating cross-team deployment conflicts
  • Engineered n8n automation pipelines orchestrating 5+ webhook endpoints for event-driven workflows, eliminating 4+ hours/week of manual data processing, and Dockerized all services for reproducible deployments
  • Implemented structured logging and observability via Splunk, reducing root-cause isolation time from hours to under 30 minutes and enabling proactive alerting on pipeline anomalies
Python · TypeScript · React · LangChain · GPT-4 · FAISS · Pinecone · Docker · REST APIs · n8n · Splunk · Agile

Information Technology Support Specialist

The University of Kansas — Lawrence, Kansas, United States · On-site

Apr 2024 – Aug 2024
  • Resolved 100+ support tickets per week, diagnosing software, hardware, and account issues for 500+ university users
  • Maintained a 95% first-call resolution rate throughout the role (tracked via ServiceNow resolution metrics) by applying structured troubleshooting to every ticket
  • Supported 20+ users daily via LogMeIn remote sessions, resolving technical issues without in-person escalation
  • Managed complex multi-issue tickets simultaneously while maintaining full SLA compliance and quality across all cases
  • Received consistent positive feedback from users and supervisors by maintaining professional customer-first communication on every interaction
ServiceNow · LogMeIn · Technical Support · Troubleshooting · SLA Management

Full-Stack Software Developer & Research Assistant

The University of Kansas — Lawrence, Kansas, United States · Hybrid

Jan 2024 – Mar 2024
  • Developed internal research web applications using React.js, integrating RESTful APIs to display real-time service data with async state management and comprehensive error handling
  • Implemented structured logging and Splunk-based monitoring that reduced average debugging time by 35% across the research team's application suite
  • Co-authored 4 peer-reviewed publications in deep learning and natural language processing, contributing novel neural architecture experiments and evaluation frameworks
  • Built IoT security monitoring dashboards and embedded systems interfaces, processing real-time sensor data for cybersecurity research applications
React.js · JavaScript · REST APIs · Python · Splunk · Git · IoT · PyTorch · TensorFlow

Selected Projects

AgentFuse — LLM Agent Cost Optimization Runtime

Production-grade Python SDK that enforces per-run LLM budgets with semantic caching, graduated cost policies, and unified observability — across 12 LLM providers including OpenAI, Anthropic, Gemini, and DeepSeek. Published on PyPI as agentfuse-runtime.

Team: Solo Project
Duration: 2025 — Active · v0.2.0
Stack: Python, FAISS, Redis, LangChain, OpenAI, Anthropic, OpenTelemetry, Prometheus
Impact: 71.8% cost reduction — same workload costs $0.24 vs $0.87 without AgentFuse. 179,445 tokens saved per 100 API calls.
PyPI · agentfuse-runtime
  • 87.5% cache hit rate on repeated and paraphrased prompts via two-tier Redis + FAISS semantic cache
  • Graduated budget policies — auto-downgrade model at 80%, compress context at 90%, graceful terminate at 100%
  • Supports 22+ models across 12 providers with hot-reloadable pricing
  • 260 unit tests, 86% core coverage
  • Full framework integrations — LangChain, CrewAI, OpenAI Agents SDK
Python · FAISS · Redis · LangChain · OpenAI · Anthropic · Semantic Caching · Vector Search · OpenTelemetry · Prometheus · PyPI
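
The graduated budget policy described for AgentFuse (downgrade the model at 80% of budget, compress context at 90%, terminate at 100%) can be illustrated with a small threshold check. This is a minimal sketch of the idea only, not the actual agentfuse-runtime API: the `BudgetAction` enum and the `budget_action` function are hypothetical names introduced here.

```python
from enum import Enum


class BudgetAction(Enum):
    """Hypothetical actions a per-run budget policy could request."""
    CONTINUE = "continue"
    DOWNGRADE_MODEL = "downgrade_model"
    COMPRESS_CONTEXT = "compress_context"
    TERMINATE = "terminate"


def budget_action(spent_usd: float, budget_usd: float) -> BudgetAction:
    """Map the fraction of budget used to a graduated action.

    Thresholds follow the bullets above: 80% -> downgrade,
    90% -> compress context, 100% -> graceful terminate.
    """
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    used = spent_usd / budget_usd
    # Check the strictest threshold first so the highest tier wins.
    if used >= 1.0:
        return BudgetAction.TERMINATE
    if used >= 0.9:
        return BudgetAction.COMPRESS_CONTEXT
    if used >= 0.8:
        return BudgetAction.DOWNGRADE_MODEL
    return BudgetAction.CONTINUE


if __name__ == "__main__":
    for spent in (0.10, 0.85, 0.95, 1.20):
        print(spent, budget_action(spent, budget_usd=1.00).value)
```

Checking the strictest threshold first keeps the tiers mutually exclusive, so a run that jumps from 70% to 95% in one call still lands on the context-compression tier rather than the downgrade tier.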

TradeFlow — Event-Driven Loan Processing Platform

Production-grade microservices fintech platform handling loan applications end-to-end — from submission through automated underwriting, manual review workflows, real-time notifications, and CQRS analytics — powered by Apache Kafka event streaming.

Team: Solo Project
Duration: 2025 — Active
Stack: Java, Spring Boot, Apache Kafka, React, TypeScript, PostgreSQL, Redis, Docker, Kubernetes, AWS
Impact: Full microservices loan processing pipeline with event-driven architecture, CQRS, and AI-powered underwriting
AWS ECS · Docker · Kubernetes
  • Full microservices architecture — API Gateway, Application Service, Underwriting Service, Notification Service, Reporting Service
  • Event-driven with Apache Kafka + Avro schemas + Confluent Schema Registry
  • CQRS pattern — separate write and read models for high-performance dashboards
  • Outbox pattern for guaranteed at-least-once Kafka delivery
  • JWT auth at gateway level with role-based access (Admin, Manager, Analyst, Applicant)
  • 4-stage CI/CD pipeline: GitHub Actions → AWS ECR → AWS ECS rolling deployment
  • Kubernetes-ready with HPA auto-scaling (2-10 replicas at 70% CPU)
  • AI-powered underwriting explanations via Anthropic Claude API
Java · Spring Boot · Apache Kafka · React · TypeScript · PostgreSQL · Redis · Docker · Kubernetes · AWS ECS · AWS ECR · CI/CD · CQRS · Microservices · JWT · Avro
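
The autoscaling rule quoted for TradeFlow (2-10 replicas at 70% CPU) corresponds to the standard Kubernetes Horizontal Pod Autoscaler calculation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The sketch below reproduces that arithmetic in Python purely for illustration. The function name and parameters are hypothetical, and it ignores the tolerance band and stabilization windows that the real controller applies.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 70.0,  # target from the TradeFlow bullet
    min_replicas: int = 2,         # lower bound from the TradeFlow bullet
    max_replicas: int = 10,        # upper bound from the TradeFlow bullet
) -> int:
    """Approximate the HPA scaling rule for a CPU utilization target."""
    if current_replicas < 1 or target_cpu_pct <= 0:
        raise ValueError("replicas must be >= 1 and target must be positive")
    # Core HPA formula: scale proportionally to how far the metric is from target.
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    # Clamp to the configured replica range.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(desired_replicas(4, 90.0))   # above target: scale out
    print(desired_replicas(2, 10.0))   # far below target: held at the minimum
    print(desired_replicas(8, 140.0))  # heavy load: capped at the maximum
```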

Intent Atoms — Semantic LLM Caching Engine

Semantic caching system that minimizes LLM API costs through a three-tier matching architecture. Instead of caching full queries, Intent Atoms decomposes compound requests into atomic semantic units, enabling granular cache reuse — achieving 87.5% cache hit rate and 71.8% cost reduction on production benchmarks.

Team: Solo Project
Duration: 2025 — Active · V3
Stack: Python, FAISS, Sentence-Transformers (MPNet), FastAPI, React, Anthropic Claude API
Impact: 71.8% cost reduction ($0.24 vs $0.87) with 87.5% cache hit rate and 179,445 tokens saved per 100 API calls
Vercel · FastAPI Backend
  • Three-tier hybrid matching — direct hits (>0.85 similarity), adapted responses (0.70-0.85 via Haiku), and atomic decomposition for novel queries
  • 87.5% cache hit rate with 71.8% cost savings on 100 real Anthropic API calls
  • Atomic intent decomposition — breaks compound queries into reusable semantic units for granular cache reuse
  • FAISS vector search with sentence-transformers all-mpnet-base-v2 embeddings (768 dimensions)
  • React dashboard with Recharts visualizations for real-time cache performance monitoring
  • REST API with query processing, stats, atom browsing, eviction, and health endpoints
Python · FAISS · Sentence-Transformers · FastAPI · React · Anthropic Claude · Vector Search · Semantic Caching
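
The three-tier matching described for Intent Atoms comes down to routing each query by its best cache similarity score. A minimal sketch under stated assumptions: the similarity is taken to be a cosine score in [0, 1], the 0.85 and 0.70 cut-offs come from the bullets above, and the `match_tier` function and tier labels are hypothetical names, not the project's actual interface.

```python
DIRECT_HIT_THRESHOLD = 0.85  # above this: serve the cached response as-is
ADAPT_THRESHOLD = 0.70       # from here up to 0.85: adapt a cached response


def match_tier(similarity: float) -> str:
    """Pick a cache tier from the best similarity score for a query."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity is assumed to lie in [0, 1]")
    if similarity > DIRECT_HIT_THRESHOLD:
        return "direct_hit"
    if similarity >= ADAPT_THRESHOLD:
        # Close but not identical: a smaller model rewrites the cached answer.
        return "adapted"
    # Novel query: fall back to atomic decomposition and fresh generation.
    return "decompose"


if __name__ == "__main__":
    for score in (0.93, 0.85, 0.72, 0.40):
        print(score, match_tier(score))
```

A score of exactly 0.85 is routed to the adapted tier here because the direct-hit condition is written as strictly greater than 0.85.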

CodeMind — Chat with Any GitHub Repository

Production-grade RAG application that lets you have natural language conversations with any GitHub codebase. Ingests repositories, chunks and embeds code with context-aware strategies, and retrieves precise answers grounded in the actual source — not hallucinated guesses.

Team: Solo Project
Duration: 2025-2026
Stack: NestJS, React, PostgreSQL + pgvector, Redis, AWS, LangChain, OpenAI
Impact: Enables developers to query and understand unfamiliar codebases through natural conversation
Vercel · AWS
  • RAG pipeline with context-aware code chunking and pgvector embeddings
  • NestJS backend with Redis caching for fast retrieval across large repos
  • React frontend with real-time streaming responses
  • Supports any public GitHub repository — paste a link and start chatting
NestJS · React · PostgreSQL · pgvector · Redis · AWS · LangChain · OpenAI · TypeScript

JobRadar — AI-Powered Job Search Agent

Personal AI job board that aggregates positions from multiple job APIs, scores each listing against my profile, and ranks opportunities by H-1B sponsorship likelihood. Built to automate the most painful parts of the international job search.

Team: Solo Project
Duration: 2025-2026
Stack: Python, FastAPI, React, OpenAI, PostgreSQL
Impact: Automated job discovery and ranking — turning hours of manual searching into minutes
Self-hosted
  • Multi-source job aggregation from major job listing APIs
  • AI-driven profile matching and relevance scoring per listing
  • H-1B sponsorship likelihood prediction based on company hiring patterns
  • Dashboard with filters for role type, location, sponsorship, and match score
Python · FastAPI · React · OpenAI · PostgreSQL · REST APIs

NutriBot — AI Nutrition Assistant

AI-powered personalized nutrition assistant built with Next.js 15 and the Vercel AI SDK. Provides real-time macro tracking, meal recommendations, and nutritional insights through a conversational interface — evolved from an MS capstone into a polished, deployed product.

Team: Solo Project · MS Capstone
Duration: Master's Capstone Project · 2024–2025
Stack: Next.js 15, Vercel AI SDK, React, Tailwind CSS, Vercel
Impact: End-to-end AI chatbot delivering real-time personalized nutrition guidance
Vercel
  • Next.js 15 with Vercel AI SDK for streaming conversational responses
  • Personalized nutrition tracking with real-time macro data
  • Clean, responsive chat UI built with React and Tailwind CSS
  • Deployed and live on Vercel — production-ready
Next.js 15 · Vercel AI SDK · React · Tailwind CSS · TypeScript · Vercel

Tools & Collaboration Stack

How I work and communicate effectively

Version Control

GitHub · GitLab · Bitbucket

Project Management

Jira · Linear · Trello

Communication

Slack · Zoom · Notion

Design & Testing

Figma · Postman · Jest

Education

Master of Science · Computer Science

The University of Kansas

Jul 2023 – Aug 2025

NxtWave CCBP 4.0 Intensive · Fellow

NxtWave

Jan 2024 – Dec 2024

Bachelor of Technology · Computer Science

Sree Vidyanikethan Engineering College

2019 – 2023

Open to Full-Time & Remote Opportunities

Remote · Hybrid · On-Site · Full-Time

Let's build something amazing together