▶️ Use Model
To help developers easily find and run the right model, the Nexa AI Model Hub provides a comprehensive filter system and SDK.
Explore in Model Hub
The goal of the Nexa Model Hub is to help developers find the most suitable models. To achieve this, we provide the following filter options:
Model Type
- Computer Vision
  - Image-to-Text
  - Image-to-Image
- Audio
  - Text-to-Speech
  - Automatic Speech Recognition
- Multimodal
  - Image-Text-to-Text
- NLP
  - Text Generation
  - Chat Completion
  - Question Answering
File Format Tag
GGUF
GGUF is an optimized binary format designed for efficient model loading and saving, particularly suited for inference tasks. It is compatible with GGML and other executors. Developed by @ggerganov, the creator of llama.cpp (a widely-used C/C++ LLM inference framework), GGUF forms the foundation of the Nexa SDK's GGML component.
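As a small illustration of why GGUF loads efficiently: per the GGUF specification, every file begins with a fixed header, the 4-byte magic `GGUF` followed by a little-endian version number, so a loader can validate a file before parsing it. The sketch below checks that magic (the helper name and the in-memory header are ours for illustration):

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def looks_like_gguf(header: bytes) -> bool:
    """Return True if the given file header starts with the GGUF magic."""
    return header[:4] == GGUF_MAGIC

# Simulate the first 8 bytes of a model file: magic + version 3 (little-endian u32)
fake_header = GGUF_MAGIC + struct.pack("<I", 3)
print(looks_like_gguf(fake_header))  # True
```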
ONNX
ONNX is an open standard format for representing machine learning models. It establishes a common set of operators and a unified file format, enabling AI developers to use models across various frameworks, tools, runtimes, and compilers. ONNX shows unique performance advantages on devices with limited RAM (e.g., mobile and IoT devices). The Nexa SDK's ONNX component is built on the onnxruntime framework.
Parameters
The Nexa Model Hub specializes in on-device models with fewer than 10 billion parameters.
RAM
This metric indicates the minimum random access memory (RAM) necessary for local model execution.
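A rough rule of thumb for this metric (not the Hub's exact calculation, just a heuristic for intuition): RAM is roughly parameter count times bytes per weight at the chosen quantization, plus runtime overhead for the KV cache and buffers. A hypothetical estimator:

```python
def approx_model_ram_gb(n_params_billion: float,
                        bits_per_weight: int,
                        overhead: float = 1.2) -> float:
    """Rough RAM estimate in GB: parameters x bytes per weight, plus ~20%
    overhead for the KV cache and runtime buffers. Illustrative heuristic
    only; the Hub lists measured values."""
    weight_bytes = n_params_billion * 1e9 * (bits_per_weight / 8)
    return round(weight_bytes * overhead / 1e9, 2)

# A ~7B-parameter model at 4-bit quantization:
print(approx_model_ram_gb(7, 4))  # 4.2
```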
File Size
Displays the total storage space required for the model.
Use Model
Download Nexa SDK
Follow the Installation guide to download the appropriate SDK for your operating system.
Run Model using SDK
The Nexa SDK enables developers to run a model that fits their specific requirements locally with one line of code, following this pattern: `nexa run MODEL_PATH`.
For examples and popular MODEL_PATH values, see the Supported Popular Models table below:
Model | Type | Format | Command
---|---|---|---
Octopus-v2 | NLP | GGUF | `nexa run octopus-v2`
Octopus-v4 | NLP | GGUF | `nexa run octopus-v4`
TinyLlama | NLP | GGUF | `nexa run tinyllama`
Llama2 | NLP | GGUF/ONNX | `nexa run llama2`
Llama3 | NLP | GGUF/ONNX | `nexa run llama3`
Llama3.1 | NLP | GGUF/ONNX | `nexa run llama3.1`
Gemma | NLP | GGUF/ONNX | `nexa run gemma`
Gemma2 | NLP | GGUF | `nexa run gemma2`
Qwen1.5 | NLP | GGUF | `nexa run qwen1.5`
Qwen2 | NLP | GGUF/ONNX | `nexa run qwen2`
Qwen2.5 | NLP | GGUF | `nexa run qwen2.5`
MathQwen | NLP | GGUF | `nexa run mathqwen`
Mistral | NLP | GGUF/ONNX | `nexa run mistral`
CodeGemma | NLP | GGUF | `nexa run codegemma`
CodeLlama | NLP | GGUF | `nexa run codellama`
CodeQwen | NLP | GGUF | `nexa run codeqwen`
DeepSeek-Coder | NLP | GGUF | `nexa run deepseek-coder`
Dolphin-Mistral | NLP | GGUF | `nexa run dolphin-mistral`
Phi-2 | NLP | GGUF | `nexa run phi2`
Phi-3 | NLP | GGUF/ONNX | `nexa run phi3`
Llama2-Uncensored | NLP | GGUF | `nexa run llama2-uncensored`
Llama3-Uncensored | NLP | GGUF | `nexa run llama3-uncensored`
Llama2-Function-Calling | NLP | GGUF | `nexa run llama2-function-calling`
NanoLLaVA | Multimodal | GGUF | `nexa run nanollava`
LLaVA-Phi-3 | Multimodal | GGUF | `nexa run llava-phi3`
LLaVA-Llama3 | Multimodal | GGUF | `nexa run llava-llama3`
LLaVA1.6-Mistral | Multimodal | GGUF | `nexa run llava1.6-mistral`
LLaVA1.6-Vicuna | Multimodal | GGUF | `nexa run llava1.6-vicuna`
Stable Diffusion 1.4 | Computer Vision | GGUF | `nexa run sd1-4`
Stable Diffusion 1.5 | Computer Vision | GGUF/ONNX | `nexa run sd1-5`
LCM-Dreamshaper | Computer Vision | GGUF/ONNX | `nexa run lcm-dreamshaper`
Hassaku-LCM | Computer Vision | GGUF | `nexa run hassaku-lcm`
Anything-LCM | Computer Vision | GGUF | `nexa run anything-lcm`
Faster-Whisper Tiny | Audio | BIN | `nexa run faster-whisper-tiny`
Faster-Whisper Small | Audio | BIN | `nexa run faster-whisper-small`
Faster-Whisper Medium | Audio | BIN | `nexa run faster-whisper-medium`
Faster-Whisper Base | Audio | BIN | `nexa run faster-whisper-base`
Faster-Whisper Large | Audio | BIN | `nexa run faster-whisper-large`
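The one-liner can also be launched from a script. The sketch below (our own helper names, assuming the SDK's `nexa` CLI is on your PATH after installation) builds the `nexa run MODEL_PATH` command for any MODEL_PATH from the table and invokes it only if the CLI is present:

```python
import shutil
import subprocess

def nexa_run_command(model_path: str) -> list:
    """Build the one-line Nexa SDK command: `nexa run MODEL_PATH`."""
    return ["nexa", "run", model_path]

cmd = nexa_run_command("llama3")  # any MODEL_PATH from the table above
print(" ".join(cmd))              # nexa run llama3

# Invoke only if the Nexa SDK CLI is actually installed:
if shutil.which("nexa") is not None:
    subprocess.run(cmd, check=True)
```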
Download more official models from the Nexa Model Hub
To find the right command for a model, there are two ways:
Find the model in the Model Hub using search and filters, then click "run this model" to copy the command for running the model locally.
Follow the run a model section in the CLI Reference.