---
title: "本地运行AI模型"
subtitle: "Run AI model on local machine"
author: "Tony D"
date: "2025-03-18"
categories:
- AI
- R
- Python
image: "images/1719563355.png"
execute:
warning: false
error: false
eval: false
---
Running AI models on a local machine with Ollama, Hugging Face, and more.
# 1. Ollama

## Download and install the Ollama app
Download the installer from https://ollama.com/download and open the app on your computer.
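To confirm the install worked, check the CLI from a terminal:
```{bash}
ollama --version
```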
## Run LLM models with Ollama
::: panel-tabset
### Run in R with the ollamar package
#### Download the package and check the connection
```{r}
pak::pak("ollamar")
pak::pkg_deps_tree("ollamar")
```
```{r}
library(ollamar)
test_connection()
```
Download a model:
```{r}
ollamar::pull("llama3.1")
```
List downloaded models:
```{r}
list_models()
```
Show model details:
```{r}
ollamar::show("gemma3")
```
Run a model:
```{r}
resp <- generate("gemma3", "tell me a 5-word story")
resp
```
```{r}
# get just the text from the response object
resp_process(resp, "text")
```
```{r}
# get the text as a tibble dataframe
resp_process(resp, "df")
```
Using multiple models:
```{r}
(list_models())$name
```
```{r}
# drop the first model from the list
models_name <- (list_models())$name[-1]
models_name
```
```{r}
input_prompt <- "tell me a 5-word story"
```
```{r}
all_model <- c()
for (i in models_name) {
  resp <- generate(i, input_prompt)
  print(resp_process(resp, "text"))
  # collect each model's response as a row of a data frame
  all_model <- rbind(all_model, resp_process(resp, "df"))
}
```
```{r}
all_model
```
### Run in R with the ellmer package
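ellmer provides a chat interface over many backends, including local Ollama. A minimal sketch, assuming the package is installed and llama3.1 has already been pulled:
```{r}
library(ellmer)

# create a chat object backed by the local Ollama server
chat <- chat_ollama(model = "llama3.1")
chat$chat("tell me a 5-word story")
```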
### Run in terminal
```{bash}
ollama pull llama3.1
```
```{bash}
ollama run llama3.1 "tell me a 5-word story"
```
### Run in Python
Install the package:
```{bash}
pip install ollama
```
Load the packages:
```{python}
import pandas as pd

import ollama
from ollama import ChatResponse  # used for type annotations below
```
Download a model:
```{python}
# uncomment to download the model
# ollama.pull('llama3.2:1b')
```
List all downloaded models:
```{python}
ollama_model=ollama.list()
```
```{python}
# Extract data from the ListResponse
data = []
for model in ollama_model.models:
    model_data = {
        'model': model.model,
        'modified_at': model.modified_at,
        'digest': model.digest,
        'size': model.size / 1e9,  # bytes -> GB
        'parent_model': model.details.parent_model,
        'format': model.details.format,
        'family': model.details.family,
        'families': model.details.families,
        'parameter_size': model.details.parameter_size,
        'quantization_level': model.details.quantization_level
    }
    data.append(model_data)

# Convert the list of dictionaries into a pandas DataFrame
ollama_model_df = pd.DataFrame(data)
# Show the DataFrame
print(ollama_model_df)
```
Show model details:
```{python}
ollama.show('deepseek-r1:7b-qwen-distill-q4_K_M')
```
Delete a model:
```{python}
# uncomment to delete the model
# ollama.delete('llama3.2:1b')
```
Run a model:
```{python}
response: ChatResponse = ollama.chat(
    model='deepseek-r1:7b-qwen-distill-q4_K_M',
    messages=[
        # system prompt: "You are a poet; you may only reply in Chinese"
        {'role': 'system',
         'content': '你是一个诗人,你只能输出中文'},
        {'role': 'assistant',
         'content': ''},
        {'role': 'user',
         'content': 'give me a 3-line story'}
    ]
)
```
```{python}
print(response.message.content)
```
```{python}
response: ChatResponse = ollama.chat(
    model='gemma3',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}]
)
```
```{python}
print(response.message.content)
```
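The client can also stream tokens as they arrive instead of waiting for the full reply:
```{python}
# stream the response chunk by chunk
stream = ollama.chat(
    model='gemma3',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
```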
Create a custom model:
```{python}
ollama.create(model='example_model', from_='llama3.2', system="You are Mario from Super Mario Bros.")
```
Push a model to ollama.com (requires an account; the model name must be prefixed with your username):
```{python}
ollama.push('user/example_model')
```
:::
# 2. Hugging Face

::: panel-tabset
## Run in Python with the Hugging Face transformers library
Using DeepSeek-R1-Distill-Qwen-1.5B as an example:
https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
### Using a pipeline
```{python}
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
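The pipeline forwards standard generation arguments, so you can cap the reply length or enable sampling (the values here are illustrative):
```{python}
# limit output length and sample instead of greedy decoding
pipe(messages, max_new_tokens=128, do_sample=True, temperature=0.7)
```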
### Load model directly
```{python}
# Load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

# build a text-generation pipeline from the objects loaded above
direct_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
text = "tell me a 5-word story"
direct_pipeline(text)
```
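With the tokenizer and model in hand, you can also skip the pipeline helper and call generate() directly:
```{python}
# tokenize, generate, and decode by hand
inputs = tokenizer("Why is the sky blue?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```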
:::
# 3. Terminal
::: panel-tabset
## Run in terminal
Some tools worth exploring from the terminal:

- olmOCR: https://github.com/allenai/olmocr
- Gradio: https://github.com/gradio-app/gradio
- Video walkthrough: https://www.youtube.com/watch?v=XF3Q_ZjwfaI
## Run in R
Running mlx_whisper as an example:
```{r}
# placeholder: path to your audio file
file_name <- "audio.m4a"
command <- paste0("mlx_whisper '", file_name, "' --model mlx-community/whisper-turbo --language 'Chinese' --initial-prompt '以下是普通話的句子,請以繁體輸出'")
command
```
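The command string can then be executed from R directly:
```{r}
# run the shell command from R
system(command)
```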
## Run in Python
```{python}
import os

# the R chunk above exposes `command` to Python via reticulate as r.command
os.system(r.command)
```
:::
# 4. vLLM
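vLLM serves models behind an OpenAI-compatible HTTP API. A minimal sketch, assuming vLLM is installed; the model name is just an example:
```{bash}
pip install vllm
# start an OpenAI-compatible server on localhost:8000
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
```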
# 5. mall package
https://mlverse.github.io/mall/
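mall runs LLM-powered NLP over the rows of a data frame, using Ollama as a backend. A minimal sketch following the package's documented pattern; `reviews` and its `review` column are illustrative names:
```{r}
library(mall)

# point mall at a local Ollama model
llm_use("ollama", "llama3.2")

# add an LLM-computed sentiment column
reviews |>
  llm_sentiment(review)
```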