BitNet

Official Inference Framework for 1-bit LLMs

BitNet is Microsoft's inference framework for 1-bit Large Language Models such as BitNet b1.58. It targets a smaller memory footprint and faster inference than full-precision models while preserving model quality.

25.8k GitHub Stars
2.1k Forks
MIT License

Key Features

High Performance

Optimized inference kernels for fast, efficient execution on CPU, with GPU kernels also available.

Memory Efficient

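BitNet b1.58 models use ternary weights (-1, 0, +1), about 1.58 bits of information per weight, which sharply reduces weight memory compared with 16-bit formats while aiming to maintain model quality.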

Easy Integration

Simple installation and straightforward API for seamless integration into your projects.

Multiple Models

Support for various BitNet model architectures including BitNet-b1.58 and Falcon3 variants.

Benchmark Tools

Comprehensive benchmarking utilities to evaluate model performance and throughput.

Extensive Documentation

Complete documentation, tutorials, and examples to help you get started quickly.
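
To put the Memory Efficient claim above in rough numbers, the sketch below estimates weight storage for a 2-billion-parameter model at 16 bits per weight and at 2 bits per weight. The 2-bit figure is an assumption about how a ternary weight might be packed on disk, the helper name `weight_mib` is hypothetical, and the estimate ignores embeddings, scaling factors, activations, and the KV cache.

```shell
# Hypothetical helper: approximate weight storage in MiB for a model with
# a given parameter count and a given number of bits per weight.
# Uses integer arithmetic, so results are rounded down.
weight_mib() {
  params=$1
  bits=$2
  echo $(( params * bits / 8 / 1024 / 1024 ))
}

weight_mib 2000000000 16   # 16-bit weights: prints 3814
weight_mib 2000000000 2    # assumed 2-bit packing: prints 476
```

Under these assumptions the packed weights take roughly one eighth of the 16-bit size.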


Quick Start

Installation
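# Clone the repository with its submodules
git clone --recursive https://github.com/microsoft/BitNet.git
cd BitNet

# Create conda environment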
conda create -n bitnet-cpp python=3.9
conda activate bitnet-cpp

# Install dependencies
pip install -r requirements.txt

# Download model
huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf \
  --local-dir models/BitNet-b1.58-2B-4T

# Setup environment
python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s
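
Before running the installation commands, it can help to confirm that the required command-line tools are on the PATH. The sketch below is one way to do that. The function name `missing_tools` is hypothetical, and the tool list is an assumption based on the commands above plus a typical C++ build toolchain (`cmake`, `clang`), so adjust it to your setup.

```shell
# Hypothetical preflight check: print every tool from the argument list
# that cannot be found, and return non-zero if any are missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# The tool list is an assumption, not an official requirements list.
if missing_tools git conda pip huggingface-cli cmake clang; then
  echo "all tools found"
fi
```

Any names printed before the final message are tools to install first.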

Run Inference
python run_inference.py \
  -m models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf \
  -p "You are a helpful assistant" \
  -cnv
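
The inference command above fails if the model file is missing, for example when the setup step was skipped. A minimal sketch of a guard around it follows. The wrapper name `run_chat` is hypothetical, and it uses only the flags shown above (`-m`, `-p`, `-cnv`).

```shell
# Hypothetical wrapper: start a conversation only if the model file exists.
run_chat() {
  model=$1
  prompt=$2
  if [ ! -f "$model" ]; then
    echo "model file not found: $model" >&2
    echo "check the Installation steps before running inference" >&2
    return 1
  fi
  python run_inference.py -m "$model" -p "$prompt" -cnv
}

# Example call (commented out so the snippet has no side effects):
# run_chat models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf "You are a helpful assistant"
```

The function returns 1 without invoking Python when the file is absent, so it can be used safely in scripts that chain several steps.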

Supported Models

BitNet supports several 1-bit model architectures, including the models listed below.

BitNet-b1.58-2B-4T

2B parameter model trained on 4 trillion tokens

BitNet-b1.58-3B

3B-parameter BitNet b1.58 model

Falcon3-1B-Instruct

1B parameter instruction-tuned model

Llama3-8B-1.58

8B-parameter model trained on 100B tokens

Join the Community

BitNet is open source and actively developed. Join the developers using BitNet for efficient LLM inference.