Open to Full-Time · Remote · Hybrid · On-Site

Vinay Kumar Reddy Budideti

I build full-stack applications and ship AI features into them, owning every layer from database to UI in production. React · Java Spring Boot · Python · LangChain · RAG · Docker · GCP

Lawrence, Kansas, US · Remote · Hybrid · On-Site

US Timezone (CST) · Remote-First

vinaykumarreddy.budideti@gmail.com

github.com/vinaybudideti

Professional Highlights

Summary

Full-Stack Engineer with hands-on experience across the complete product lifecycle: React/TypeScript frontends, Java Spring Boot APIs, AI integration with LangChain and RAG pipelines, and cloud deployment on GCP and AWS. Shipped production features at Vosyn, the University of Kansas NCCS Research Lab, and Cognizant, owning each feature end to end across Agile sprint cycles.

35+ Features Shipped

End-to-end, requirements to release

40% Fewer Bugs

Via versioned APIs & test coverage

15 Agile Sprints

Delivered across full sprint cycles

4 Research Papers

Published peer-reviewed research

Skills & Tech Stack

Frontend

React & TypeScript 90%

HTML5 & CSS3 88%

JavaScript 87%

Next.js 85%

Tailwind CSS 85%

Backend

REST APIs 92%

Java & Spring Boot 90%

Hibernate & JPA 87%

FastAPI 85%

PostgreSQL & MongoDB 84%

MySQL & Oracle 84%

Python & Flask 82%

NestJS 82%

Redis 80%

Node.js 78%

AI & NLP

LangChain & RAG 88%

OpenAI & LLM APIs 85%

Rasa NLU 83%

FAISS & Vector Search 82%

Vector Embeddings 80%

NLP Pipelines 80%

Vercel AI SDK 78%

DevOps & Cloud

Git & CI/CD 88%

Jenkins 85%

Docker & GCP 82%

Vercel 82%

AWS & Cloud 80%

Splunk & Observability 78%

Maven & Gradle 78%

Work Methodology

How I approach software development

Discovery & Planning

Understanding business goals before writing code.

Clean Architecture

Maintainable, testable, and scalable systems.

Async Collaboration

Clear documentation & written communication.

Continuous Improvement

Code reviews, refactoring, performance tuning.

Work Experience

AI Software Engineer

Vosyn — Illinois, United States · Remote

Oct 2025 – Nov 2025

Delivered 4 AI user stories in 2 months, building LangChain chains and RAG pipelines that use internal product docs and support tickets as their knowledge source
Improved LLM groundedness by ~35% by building 4 RAG pipelines using vector embeddings and semantic retrieval
Reduced average LLM response latency by ~25% by optimizing LangChain chain architecture and retrieval chunk sizing
Cut out-of-scope AI outputs by ~40% by implementing schema-based output formatting and post-generation validators across all LLM workflows
Connected 4 AI services to the existing REST APIs and frontend by wiring LangChain agents into the full-stack product architecture with clean integration boundaries
Maintained reliable pipeline runs by applying structured logging and execution tracing across every LangChain chain

LangChain · RAG · Vector Embeddings · REST APIs · Python
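
To illustrate the schema-based output formatting and post-generation validators mentioned above, here is a minimal Python sketch. It is an assumption-laden illustration, not the code used at Vosyn: the field names (`answer`, `sources`), the `validate_llm_output` helper, and the rule that an answer without sources is rejected are all hypothetical.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: required field name -> expected Python type.
REQUIRED_FIELDS = {"answer": str, "sources": list}


@dataclass
class ValidationResult:
    ok: bool
    errors: list


def validate_llm_output(raw: str) -> ValidationResult:
    """Check that raw model output is JSON matching the assumed schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ValidationResult(False, [f"not valid JSON: {exc.msg}"])
    if not isinstance(data, dict):
        return ValidationResult(False, ["top-level value must be an object"])

    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            errors.append(f"wrong type for field: {name}")
    # Assumed rule: an answer that cites no sources counts as out of scope.
    if not errors and not data["sources"]:
        errors.append("answer cites no sources")
    return ValidationResult(not errors, errors)
```

A caller would run this on every model response and either retry the generation or return a fallback message when `ok` is false.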

Full-Stack Engineering Intern

Vosyn — Illinois, United States · Remote

Aug 2025 – Oct 2025

Shipped 12 end-to-end features in 3 months, owning the full workflow from requirements to release on a remote Agile team
Reduced UI dev time by ~30% by building a reusable React/TypeScript component library
Cut integration breakages by ~40% by implementing versioned REST APIs with input validation and standardized error handling
Eliminated 4+ hrs/week of manual work by building n8n automation pipelines across 5+ webhooks with idempotent steps and retry logic
Accelerated root-cause isolation from hours to under 30 mins by introducing structured logging and traceable execution paths via Splunk
Kept every owned flow regression-free throughout the internship by adding unit and integration test coverage for each critical flow before each release

React · TypeScript · REST APIs · n8n · Splunk · Agile

NCCS Lab · Full-Stack Software Developer

The University of Kansas — Lawrence, KS

Jan 2023 – Mar 2024

Delivered 10+ full-stack dashboard features, from requirements to production, across 15 Agile sprint cycles for 5–15 active research staff
Reduced UI feature delivery time by ~25% by building reusable React component modules with consistent state handling
Supported 4+ core daily research workflows by building Java Spring Boot REST endpoints with full validation, exception handling, and service-layer logic
Improved data correctness across 3+ dashboard screens by designing integrated data flows across MongoDB for document storage and SQL for reporting
Reduced system triage time by 40% by instrumenting structured logs and building Splunk observability dashboards
Built and integrated a Rasa conversational assistant into the dashboard for guided in-app help for 5–15 daily users
Enabled context-backed self-service answers by building a RAG prototype, connected to the React UI, that retrieves answers from internal research documents in real time
Maintained regression-safe deployments across all sprints by writing JUnit/Mockito and Jest/RTL tests covering all critical application flows

React · Java · Spring Boot · MongoDB · SQL · Rasa · RAG · Splunk · JUnit

Information Technology Support Specialist

The University of Kansas — Lawrence, KS

Apr 2024 – Aug 2024

Resolved 100+ support tickets/week across 6 months, diagnosing software, hardware, and account issues for 500+ university users
Maintained a 95% first-call resolution rate consistently across the full tenure using structured troubleshooting tracked via ServiceNow
Supported 20+ users daily via LogMeIn remote sessions resolving issues without in-person escalation
Maintained 100% SLA compliance while handling multiple complex, multi-issue tickets at the same time

ServiceNow · LogMeIn · Technical Support · Troubleshooting · SLA Management

Full Stack Developer

Cognizant Technology Solutions — Andhra Pradesh, India

Jul 2022 – Jul 2023

Engineered end-to-end Java enterprise web applications supporting 10,000+ active business users, delivering a 35% improvement in application performance through backend optimization and API redesign
Built and maintained 15+ RESTful APIs using Spring Boot and Hibernate, reducing average API response time from ~800ms to under 200ms — a 75% improvement in responsiveness
Developed responsive front-end interfaces across 5+ application modules, reducing customer-reported UI defects by 40% within the first two release cycles
Optimized complex SQL queries and indexing strategies on Oracle and MySQL databases, improving data retrieval speeds by 45% and cutting report generation time by 50%
Drove a 60% reduction in production-critical bugs by introducing JUnit and Mockito unit testing, achieving 80%+ code coverage across core business logic
Partnered with DevOps to deploy 3 major production releases via Jenkins CI/CD pipelines, reducing deployment time by 50%
Consistently delivered across 12+ two-week Agile sprints, maintaining a 95% on-time delivery rate
Mentored 2 junior developers on Java best practices, improving team code review pass rates by 25%

Java · Spring Boot · Spring MVC · Hibernate · JPA · REST APIs · React.js · MySQL · Oracle · JUnit · Mockito · Jenkins · CI/CD · Maven · Git · Agile · Swagger

Programming Analyst Intern

Cognizant Technology Solutions — Andhra Pradesh, India

Jan 2022 – Jun 2022

Contributed to 3 live production modules within the first 60 days, delivering code that passed QA with zero critical defects
Identified and resolved 20+ legacy code bugs, reducing application downtime by 15%
Built internal utility tools automating manual validation tasks, saving the team 5+ hours/week
Converted to full-time before internship completion based on performance

Java · Spring Boot · REST APIs · MySQL · JUnit · Git · Agile

Software Developer — Co-op

Sree Vidyanikethan Engineering College — Rangampeta, India

Jun 2020 – Dec 2020

Developed 8+ Java backend modules using OOP and Spring Boot principles, improving application workflow efficiency by 45% across core business processes
Wrote Python automation scripts eliminating repetitive manual data tasks, saving the team 10+ hours/week in operational effort
Designed and implemented RESTful APIs enabling frontend-backend communication, reducing integration errors by 30%
Optimized MySQL database queries improving data retrieval speed by 40%, directly impacting application response times
Delivered 100% of assigned features on time across all 3 project milestone phases
Maintained 90%+ code quality scores across all peer reviews following clean code standards
Collaborated within a team of 4 developers across daily standups, code reviews, and sprint deliverables

Java · Python · Spring Boot · REST APIs · MySQL · OOP · Git · Agile

Selected Projects

AgentFuse — LLM Agent Cost Optimization Runtime

Production-grade Python SDK that enforces per-run LLM budgets with semantic caching, graduated cost policies, and unified observability — across 12 LLM providers including OpenAI, Anthropic, Gemini, and DeepSeek. Published on PyPI as agentfuse-runtime.

Team: Solo Project

Duration: 2025 — Active · v0.2.0

Stack: Python, FAISS, Redis, LangChain, OpenAI, Anthropic, OpenTelemetry, Prometheus

Impact: 71.8% cost reduction — same workload costs $0.24 vs $0.87 without AgentFuse. 179,445 tokens saved per 100 API calls.

PyPI · agentfuse-runtime

87.5% cache hit rate on repeated and paraphrased prompts via two-tier Redis + FAISS semantic cache
Graduated budget policies — auto-downgrade model at 80%, compress context at 90%, graceful terminate at 100%
Supports 22+ models across 12 providers with hot-reloadable pricing
260 unit tests, 86% core coverage
Full framework integrations — LangChain, CrewAI, OpenAI Agents SDK

Python · FAISS · Redis · LangChain · OpenAI · Anthropic · Semantic Caching · Vector Search · OpenTelemetry · Prometheus · PyPI
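
The graduated budget policy listed above (downgrade at 80%, compress at 90%, terminate at 100%) can be pictured with a short Python sketch. This is not the agentfuse-runtime API: the function name `budget_action` and the returned action strings are hypothetical, chosen only to show how the thresholds map to actions.

```python
def budget_action(spent_usd: float, budget_usd: float) -> str:
    """Map the fraction of the run budget already spent to a policy action."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    used = spent_usd / budget_usd
    # Check the strictest threshold first so each range maps to one action.
    if used >= 1.0:
        return "terminate"
    if used >= 0.9:
        return "compress_context"
    if used >= 0.8:
        return "downgrade_model"
    return "continue"
```

A runtime would call a check like this before each LLM request, so a run degrades step by step instead of failing abruptly when the budget runs out.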

TradeFlow — Event-Driven Loan Processing Platform

Production-grade microservices fintech platform handling loan applications end-to-end — from submission through automated underwriting, manual review workflows, real-time notifications, and CQRS analytics — powered by Apache Kafka event streaming.

Team: Solo Project

Duration: 2025 — Active

Stack: Java, Spring Boot, Apache Kafka, React, TypeScript, PostgreSQL, Redis, Docker, Kubernetes, AWS

Impact: Full microservices loan processing pipeline with event-driven architecture, CQRS, and AI-powered underwriting

AWS ECS · Docker · Kubernetes

Full microservices architecture — API Gateway, Application Service, Underwriting Service, Notification Service, Reporting Service
Event-driven with Apache Kafka + Avro schemas + Confluent Schema Registry
CQRS pattern — separate write and read models for high-performance dashboards
Outbox pattern for guaranteed at-least-once Kafka delivery
JWT auth at gateway level with role-based access (Admin, Manager, Analyst, Applicant)
4-stage CI/CD pipeline via GitHub Actions to AWS ECR to AWS ECS rolling deployment
Kubernetes-ready with HPA auto-scaling (2-10 replicas at 70% CPU)
AI-powered underwriting explanations via Anthropic Claude API

Java · Spring Boot · Apache Kafka · React · TypeScript · PostgreSQL · Redis · Docker · Kubernetes · AWS ECS · AWS ECR · CI/CD · CQRS · Microservices · JWT · Avro
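
For the HPA setting above (2 to 10 replicas at a 70% CPU target), the replica count follows the scaling rule from the Kubernetes documentation: desired = ceil(current × currentMetric / targetMetric), clamped to the configured bounds. The Python sketch below works through that arithmetic. The function name is hypothetical, and the sketch ignores the autoscaler's tolerance band and stabilization windows.

```python
import math


def desired_replicas(current: int, cpu_percent: float,
                     target_percent: float = 70.0,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """Apply the HPA scaling formula, then clamp to the replica bounds."""
    raw = math.ceil(current * cpu_percent / target_percent)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas averaging 90% CPU give ceil(4 × 90 / 70) = 6 replicas, while 8 replicas at 140% would ask for 16 and are capped at 10.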

Intent Atoms — Semantic LLM Caching Engine

Semantic caching system that minimizes LLM API costs through a three-tier matching architecture. Instead of caching full queries, Intent Atoms decomposes compound requests into atomic semantic units, enabling granular cache reuse. On a benchmark of 100 real Anthropic API calls, it achieved an 87.5% cache hit rate and a 71.8% cost reduction.

Team: Solo Project

Duration: 2025 — Active · V3

Stack: Python, FAISS, Sentence-Transformers (MPNet), FastAPI, React, Anthropic Claude API

Impact: 71.8% cost reduction ($0.24 vs $0.87) with 87.5% cache hit rate and 179,445 tokens saved per 100 API calls

Vercel · FastAPI Backend

Three-tier hybrid matching — direct hits (>0.85 similarity), adapted responses (0.70-0.85 via Haiku), and atomic decomposition for novel queries
87.5% cache hit rate with 71.8% cost savings on 100 real Anthropic API calls
Atomic intent decomposition — breaks compound queries into reusable semantic units for granular cache reuse
FAISS vector search with sentence-transformers all-mpnet-base-v2 embeddings (768 dimensions)
React dashboard with Recharts visualizations for real-time cache performance monitoring
REST API with query processing, stats, atom browsing, eviction, and health endpoints

Python · FAISS · Sentence-Transformers · FastAPI · React · Anthropic Claude · Vector Search · Semantic Caching
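
The three-tier routing described above can be sketched in a few lines of Python. This is a simplified illustration, not the Intent Atoms source: it uses brute-force cosine similarity over plain lists in place of FAISS and MPNet embeddings, and the names `route`, `direct_hit`, `adapt`, and `decompose` are hypothetical. Only the 0.85 and 0.70 thresholds come from the project description.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_vec, cache):
    """Pick a tier for a query.

    cache is a list of (vector, response) pairs.
    Returns (tier, cached response or None).
    """
    best_score, best_response = 0.0, None
    for vec, response in cache:
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_score, best_response = score, response
    if best_score > 0.85:
        return "direct_hit", best_response   # reuse the cached answer as is
    if best_score >= 0.70:
        return "adapt", best_response        # adapt the cached answer with a small model
    return "decompose", None                 # novel query: split into atomic intents
```

Only the third tier requires a full-price model call, which is where the cost savings in a design like this would come from.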

CodeMind — Chat with Any GitHub Repository

Production-grade RAG application that lets you have natural language conversations with any GitHub codebase. Ingests repositories, chunks and embeds code with context-aware strategies, and retrieves precise answers grounded in the actual source — not hallucinated guesses.

Team: Solo Project

Duration: 2025–2026

Stack: NestJS, React, PostgreSQL + pgvector, Redis, AWS, LangChain, OpenAI

Impact: Enables developers to query and understand unfamiliar codebases through natural conversation

Vercel · AWS

RAG pipeline with context-aware code chunking and pgvector embeddings
NestJS backend with Redis caching for fast retrieval across large repos
React frontend with real-time streaming responses
Supports any public GitHub repository — paste a link and start chatting

NestJS · React · PostgreSQL · pgvector · Redis · AWS · LangChain · OpenAI · TypeScript
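
One common way to make code chunking context-aware is to split a file along its top-level definitions instead of at fixed character counts, so each embedded chunk is a complete function or class. The Python sketch below shows that idea for Python source using the standard `ast` module. It is an assumption about the general technique, not CodeMind's actual chunker, which runs on a NestJS backend; the function name `chunk_python_source` is hypothetical.

```python
import ast


def chunk_python_source(source: str):
    """Return one chunk per top-level function or class in a Python file."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,            # symbol name, useful as retrieval metadata
                "start_line": node.lineno,    # 1-based line where the definition starts
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

Storing the symbol name and start line next to each embedding lets the retrieval step cite the exact location an answer came from.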

JobRadar — AI-Powered Job Search Agent

Personal AI job board that aggregates positions from multiple job APIs, scores each listing against my profile, and ranks opportunities by H-1B sponsorship likelihood. Built to automate the most painful parts of the international job search.

Team: Solo Project

Duration: 2025–2026

Stack: Python, FastAPI, React, OpenAI, PostgreSQL

Impact: Automated job discovery and ranking — turning hours of manual searching into minutes

Self-hosted

Multi-source job aggregation from major job listing APIs
AI-driven profile matching and relevance scoring per listing
H-1B sponsorship likelihood prediction based on company hiring patterns
Dashboard with filters for role type, location, sponsorship, and match score

Python · FastAPI · React · OpenAI · PostgreSQL · REST APIs

NutriBot — AI Nutrition Assistant

AI-powered personalized nutrition assistant built with Next.js 15 and the Vercel AI SDK. Provides real-time macro tracking, meal recommendations, and nutritional insights through a conversational interface — evolved from an MS capstone into a polished, deployed product.

Team: Solo Project · MS Capstone

Duration: Master's Capstone Project · 2024–2025

Stack: Next.js 15, Vercel AI SDK, React, Tailwind CSS, Vercel

Impact: End-to-end AI chatbot delivering real-time personalized nutrition guidance

Vercel

Next.js 15 with Vercel AI SDK for streaming conversational responses
Personalized nutrition tracking with real-time macro data
Clean, responsive chat UI built with React and Tailwind CSS
Deployed and live on Vercel — production-ready

Next.js 15 · Vercel AI SDK · React · Tailwind CSS · TypeScript · Vercel

Tools & Collaboration Stack

How I work and communicate effectively

Version Control

GitHub · GitLab · Bitbucket

Project Management

Jira · Linear · Trello

Communication

Slack · Zoom · Notion

Design & Testing

Figma · Postman · Jest

Education

Master of Science · Computer Science

The University of Kansas

Jul 2023 – Aug 2025

NxtWave CCBP 4.0 Intensive · Fellow

NxtWave

Jan 2024 – Dec 2024

Bachelor of Technology · Computer Science

Sree Vidyanikethan Engineering College

2019 – 2023

Open to Full-Time & Remote Opportunities

Let's build something amazing together
