---
title: "Getting Started with mlx-lm for Python"
execute:
  warning: false
  error: false
  eval: false
format:
  html:
    toc: true
    toc-location: right
    code-fold: show
    code-tools: true
    number-sections: true
    code-block-bg: true
    code-block-border-left: "#31BAE9"
---

# Introduction to mlx-lm
`mlx-lm` is a powerful Python library for running and fine-tuning large language models (LLMs) on Apple silicon. It is built on top of Apple's MLX framework, which is optimized for high-performance machine learning on Apple hardware. `mlx-lm` makes it easy to download, run, and customize a wide range of LLMs.

This guide walks you through setting up your environment, installing the necessary packages, and running a pre-trained model.
# Environment Setup
We will use `uv` to create a virtual environment and manage our dependencies.
```bash
# Initialize a new uv project (uv sets up the virtual environment
# automatically on the first `uv add` or `uv run`)
uv init
```
```bash
# Install the required packages
uv add mlx-lm nbclient nbformat
```
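To confirm that the installation succeeded, you can import the package and print its version. This assumes the installed release exposes `__version__`, which recent `mlx-lm` releases do.
```{python}
# Sanity check: the import works and reports the installed version
import mlx_lm
print(mlx_lm.__version__)
```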
Let's verify the Python version we are using.
```{python}
import sys
print(sys.version)
```
# Running a Pre-trained Model
Now that our environment is set up, we can download and run a pre-trained LLM.
## Load the Model and Tokenizer
We will use the `load` function from `mlx_lm` to download and load the model and its corresponding tokenizer. In this example, we are using a 4-bit quantized version of the Mistral-7B-Instruct-v0.3 model from the `mlx-community` Hugging Face organization. The first call downloads the weights (roughly 4 GB for this model) from the Hugging Face Hub and caches them locally, so subsequent loads are fast.
```{python}
from mlx_lm import load, generate
# Load the pre-trained model and tokenizer
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
```
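The same call works for any model that has been converted to the MLX format. For a quicker first test, the `mlx-community` organization hosts many smaller quantized models; the repository name below is one example and listings change over time, so browse the organization page on the Hugging Face Hub for what is currently available.
```{python}
# Optional: a smaller model for faster downloads and experiments.
# The repository name is an example; check
# https://huggingface.co/mlx-community for current listings.
model, tokenizer = load("mlx-community/Llama-3.2-1B-Instruct-4bit")
```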
## Generate Text
Once the model and tokenizer are loaded, we can use the `generate` function to generate text based on a prompt.
```{python}
# Define the prompt
prompt = "Write a story about Einstein"
# Format the prompt using the chat template
messages = [{"role": "user", "content": prompt}]
formatted_prompt = tokenizer.apply_chat_template(
messages, add_generation_prompt=True
)
# Generate the text
text = generate(model, tokenizer, prompt=formatted_prompt, verbose=True)
text
```
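By default, `generate` returns the completed text in one piece. For longer outputs you will usually want to cap the response length with `max_tokens` and print tokens as they arrive. Below is a minimal sketch using `stream_generate`, which `mlx_lm` also exports; note that recent releases yield response objects with a `.text` field (older releases yielded plain strings), so adjust for your installed version.
```{python}
from mlx_lm import stream_generate

# Stream the response token by token, capped at 512 new tokens
for response in stream_generate(
    model, tokenizer, prompt=formatted_prompt, max_tokens=512
):
    print(response.text, end="", flush=True)
print()
```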
# Conclusion
`mlx-lm` provides a simple and efficient way to work with large language models on Apple silicon. Because it is built directly on the MLX framework, it takes full advantage of Apple hardware, making it a strong choice for developers and researchers who want to run and fine-tune LLMs on their Apple devices.
# References
- [mlx-lm on GitHub](https://github.com/ml-explore/mlx-lm)
- [mlx-vlm on GitHub](https://github.com/Blaizzy/mlx-vlm)