LLM inference in C/C++
A high-throughput and memory-efficient inference and serving engine for LLMs
SGLang is a fast serving framework for large language models and vision language models.
Is this only suitable for Linux environments?