native-llm - v0.2.0

    Interface EngineOptions

    Options for engine initialization

    interface EngineOptions {
        model: string;
        gpuLayers?: number;
        contextSize?: number;
        huggingFaceToken?: string;
        enableThinking?: boolean;
    }

    Properties

    model: string

    Model to use (model ID, alias, or path to .gguf file)

    gpuLayers?: number

    GPU layers to offload (-1 = all, 0 = CPU only)

    contextSize?: number

    Context size override

    huggingFaceToken?: string

    Hugging Face access token for gated models (like Gemma 3). Can also be set via the HF_TOKEN environment variable

    enableThinking?: boolean

    Enable thinking/reasoning mode for models that support it (Qwen3, DeepSeek R1)

    • When false (default): Disables thinking for faster responses
    • When true: Shows chain-of-thought reasoning (slower but more detailed)
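The options above can be sketched in use as follows. The `withDefaults` helper is hypothetical (not part of native-llm) and the `contextSize` fallback of 4096 is an assumption; it only illustrates how the documented defaults (`gpuLayers: -1`, `enableThinking: false`) would resolve:

```typescript
// EngineOptions as declared in this package (copied from the docs above).
interface EngineOptions {
    model: string;
    gpuLayers?: number;
    contextSize?: number;
    huggingFaceToken?: string;
    enableThinking?: boolean;
}

// Hypothetical helper that fills in the documented defaults before
// handing options to an engine. The token would normally fall back to
// the HF_TOKEN environment variable; an empty string stands in here.
function withDefaults(opts: EngineOptions): Required<EngineOptions> {
    return {
        model: opts.model,
        gpuLayers: opts.gpuLayers ?? -1,              // -1 = offload all layers
        contextSize: opts.contextSize ?? 4096,        // assumed default; the docs only say "override"
        huggingFaceToken: opts.huggingFaceToken ?? "",
        enableThinking: opts.enableThinking ?? false, // docs: false is the default
    };
}

const opts = withDefaults({ model: "qwen3-4b" });
```

With only `model` supplied, the resolved options offload all layers to the GPU and leave thinking mode off for faster responses.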