## vllm.entrypoints.openai.chat_completion.api_router

FastAPI router module for the OpenAI-compatible chat completions endpoint. Source code in vllm/entrypoints/openai/chat_completion/api_router.py.
### ENDPOINT_LOAD_METRICS_FORMAT_HEADER_LABEL (module attribute)

Module-level constant. Judging by its name, it holds the header label used to indicate the format of endpoint load metrics; its value is not shown here.
### attach_router

Attach this module's chat completion router to the serving application, as sketched below.
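A minimal wiring sketch, assuming `attach_router` accepts the FastAPI application (the exact signature is not shown on this page):

```python
# Sketch only: assumes attach_router(app) registers this module's
# routes on the given FastAPI application.
from fastapi import FastAPI

from vllm.entrypoints.openai.chat_completion.api_router import attach_router

app = FastAPI()
attach_router(app)  # assumed signature; mounts the chat completion routes
```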
### chat

`chat(request: Request) -> OpenAIServingChat | None`

Resolve the `OpenAIServingChat` handler for the incoming request, returning `None` when chat serving is not available.
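vLLM conventionally stores per-endpoint serving handlers on the application state; a hedged sketch of how such a resolver typically looks (the attribute name is an assumption, not the verbatim implementation):

```python
from __future__ import annotations

from fastapi import Request

from vllm.entrypoints.openai.serving_chat import OpenAIServingChat


def chat(request: Request) -> OpenAIServingChat | None:
    # Sketch of the common vLLM pattern: serving handlers live on
    # app.state. The attribute name "openai_serving_chat" is assumed.
    return getattr(request.app.state, "openai_serving_chat", None)
```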
### create_chat_completion (async)

`create_chat_completion(request: ChatCompletionRequest, raw_request: Request)`

Handle an OpenAI-compatible chat completion request (the POST /v1/chat/completions endpoint).
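Since `create_chat_completion` backs the OpenAI-compatible endpoint, it is normally exercised through an OpenAI client rather than called directly; a usage sketch (the base URL and model name are placeholders for your running vLLM server):

```python
from openai import OpenAI

# Placeholder base URL and model; point these at your vLLM deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```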
### render_chat_completion (async)

`render_chat_completion(request: ChatCompletionRequest, raw_request: Request)`

Render a chat completion request and return the conversation and engine prompts without generating any output.
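To illustrate what "rendering without generating" means, here is a hedged, self-contained sketch using a Hugging Face tokenizer chat template; it mirrors the idea (turning messages into the raw prompt the engine would see) but is not this function's implementation:

```python
from transformers import AutoTokenizer

# Illustrative model choice, not tied to this endpoint.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is vLLM?"},
]

# Apply the chat template to build the prompt string without running
# any generation step.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```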