GCP vLLM hosting prices
OpenAI-compatible vLLM serving usually starts on 24GB GPUs and moves to GPUs with 80GB or more for larger models, heavier batching, and long prompts. The cheapest tracked GCP row in this slice is the L40 at $0.66/hr.
Current GCP rows for vLLM hosting
GCP has 2 qualifying GPU models across 2 current price rows for this workload. The overall tracked market floor is $0.40/hr on Vast.ai RTX 4090, making GCP $0.26/hr above that floor.
GCP vLLM hosting pricing overview
This page narrows GCP's GPU catalog to the rows most likely to matter for vLLM hosting. It is built for buyers who already know the provider they are evaluating and need a workload-specific price shortlist.
It is aimed at teams running open-weight inference APIs, chat completions, and model serving who are comparing provider-specific GPU options. Use the table below to move from the cheapest visible row into adjacent provider/GPU pages, broader guides, and market-floor comparisons.
Cheapest GCP row for vLLM hosting
GCP currently starts at $0.66/hr for vLLM hosting, on the L40 at on-demand pricing. The overall tracked market floor is $0.40/hr on a Vast.ai RTX 4090, which puts GCP $0.26/hr above that floor.
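The gap quoted above is a plain subtraction of the market floor from the provider's cheapest row. A minimal sketch of that arithmetic follows; the function name `gap_above_floor` is hypothetical and not part of any published tooling. It uses `Decimal` rather than floats so the result stays at exactly two decimal places.

```python
from decimal import Decimal


def gap_above_floor(provider_price: str, market_floor: str) -> Decimal:
    """Return how far a provider's cheapest hourly price sits above the market floor."""
    return Decimal(provider_price) - Decimal(market_floor)


# Prices taken from this page: GCP L40 at $0.66/hr, Vast.ai RTX 4090 at $0.40/hr.
gap = gap_above_floor("0.66", "0.40")
print(f"GCP is ${gap}/hr above the floor")
```

A negative result would mean the provider itself sets the floor or undercuts it.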
How this workload slice is computed
We filter the live compare payload to on-demand rows with at least 24GB of VRAM, then sort qualifying GCP rows by current median hourly price.
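The filter-then-sort step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the site's actual pipeline: the `PriceRow` fields, the `workload_slice` function, and the `"on-demand"` label are all hypothetical names, and every sample row except the GCP L40 and Vast.ai RTX 4090 prices is a made-up placeholder.

```python
from dataclasses import dataclass

MIN_VRAM_GB = 24  # threshold stated on this page


@dataclass(frozen=True)
class PriceRow:
    provider: str
    gpu: str
    vram_gb: int
    pricing_type: str          # assumed labels, e.g. "on-demand" or "spot"
    median_hourly_usd: float


def workload_slice(rows: list[PriceRow], provider: str) -> list[PriceRow]:
    """Keep one provider's on-demand rows with enough VRAM, cheapest first."""
    qualifying = [
        r for r in rows
        if r.provider == provider
        and r.pricing_type == "on-demand"
        and r.vram_gb >= MIN_VRAM_GB
    ]
    return sorted(qualifying, key=lambda r: r.median_hourly_usd)


rows = [
    PriceRow("GCP", "L40", 48, "on-demand", 0.66),            # from this page
    PriceRow("Vast.ai", "RTX 4090", 24, "on-demand", 0.40),   # from this page
    PriceRow("GCP", "PLACEHOLDER-80GB", 80, "on-demand", 2.50),  # made-up row
    PriceRow("GCP", "PLACEHOLDER-16GB", 16, "on-demand", 0.30),  # made-up, too little VRAM
    PriceRow("GCP", "PLACEHOLDER-SPOT", 48, "spot", 0.20),       # made-up, not on-demand
]

gcp_rows = workload_slice(rows, "GCP")
for row in gcp_rows:
    print(f"{row.gpu}: ${row.median_hourly_usd:.2f}/hr")
```

In this sketch the 16GB and spot rows drop out even though they are cheaper, which is why the cheapest row on a workload page can differ from the cheapest row in the provider's full catalog.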
GCP vLLM hosting pricing FAQ
What is the cheapest GCP option for vLLM hosting?
GCP currently starts at $0.66/hr for vLLM hosting, on the L40 at on-demand pricing.
Is GCP cheapest for vLLM hosting?
No. The overall tracked market floor is $0.40/hr on a Vast.ai RTX 4090, which puts GCP's cheapest row $0.26/hr above that floor.
What GPUs count toward this vLLM hosting page?
This page filters to GPUs with at least 24GB of VRAM, a set that includes the RTX 4090, RTX 5090, L4, A10G, L40, and others, and it uses on-demand pricing only.
How fresh is this GCP vLLM hosting page?
The rows are recalculated from the latest stored provider snapshot. The freshest qualifying row visible here is from Jun 21, 2026.
Next searches after GCP vLLM hosting
These links move sideways into the full provider catalog, workload guide, market floor, and adjacent provider workload pages.