[AI] D2000 (arm64/aarch64), Ubuntu 22.04.3 LTS (Jammy Jellyfish): building llama.cpp and running the chinese-alpaca-2-7b model (CPU version)

Download and build llama.cpp

cd ~/Downloads/ai/
git clone --depth=1 https://gh.api.99988866.xyz/https://github.com/ggerganov/llama.cpp
cd llama.cpp
make -j8
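
The later steps call the quantize and main binaries from the llama.cpp directory, so it can help to confirm that the build produced both. A minimal sketch; check_build is a hypothetical helper name, not part of llama.cpp:

```shell
# Check that the binaries used later in this post exist and are executable.
# check_build is a hypothetical helper, not something llama.cpp ships.
check_build() {
  dir=$1
  for bin in main quantize; do
    if [ ! -x "$dir/$bin" ]; then
      echo "missing: $dir/$bin" >&2
      return 1
    fi
  done
  echo "build OK"
}

# Example call after the build:
#   check_build ~/Downloads/ai/llama.cpp
```

If a binary is reported missing, rerunning make without -j usually makes the first compile error easier to spot.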

Download the model into /home/yeqiang/Downloads/ai/chinese-alpaca-2-7b

Model source: hfl/chinese-alpaca-2-7b (main branch) on Hugging Face

Convert the model

Install venv

sudo apt install python3.10-venv

Configure a pip mirror in China (Alibaba Cloud)

Create ~/.pip/pip.conf with the following content:

[global]
index-url=http://mirrors.aliyun.com/pypi/simple
[install]
trusted-host=mirrors.aliyun.com
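
The same file can be written from the shell. A minimal sketch; write_pip_conf is a hypothetical helper name, and the trusted-host value is written so that it matches the host in index-url (mirrors.aliyun.com), which pip needs in order to accept the plain-http index:

```shell
# Write the pip mirror config to the given path, creating the directory first.
# write_pip_conf is a hypothetical helper name.
write_pip_conf() {
  conf=$1
  mkdir -p "$(dirname "$conf")"
  cat > "$conf" <<'EOF'
[global]
index-url=http://mirrors.aliyun.com/pypi/simple
[install]
trusted-host=mirrors.aliyun.com
EOF
}

# Example call (overwrites an existing file at that path):
#   write_pip_conf "$HOME/.pip/pip.conf"
```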

Install dependencies (failed)

cd ~/Downloads/ai/llama.cpp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

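pip could not find a matching version. To save time, I gave up on installing the Python dependencies on arm64 and went back to an x86_64 Ubuntu machine to convert the model.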

Generate the quantized models

First convert the original weights to an FP16 GGUF model:

(venv) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/llama.cpp$ python convert.py ~/Downloads/ai/chinese-alpaca-2-7b/

Then quantize the FP16 model to 4-bit (q4_0):

(venv) yeqiang@yeqiang-MS-7B23:~/Downloads/ai/llama.cpp$ ~/Downloads/ai/llama.cpp/quantize /home/yeqiang/Downloads/ai/chinese-alpaca-2-7b/ggml-model-f16.gguf /home/yeqiang/Downloads/ai/chinese-alpaca-2-7b/ggml-model-f16-q4_0.bin q4_0

Generate a q8_0 version as well.
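
The q8_0 command is not shown in the original steps. A sketch that follows the same pattern as the q4_0 command, assuming the same input file and an output name of ggml-model-f16-q8_0.bin (the name is my assumption, as are the helper names quant_output and quantize_model):

```shell
# Directories from this post; override via the environment if yours differ.
MODEL_DIR="${MODEL_DIR:-$HOME/Downloads/ai/chinese-alpaca-2-7b}"
LLAMA_DIR="${LLAMA_DIR:-$HOME/Downloads/ai/llama.cpp}"

# Print the output path for a quantization type (naming mirrors the q4_0 file).
quant_output() {
  printf '%s/ggml-model-f16-%s.bin\n' "$MODEL_DIR" "$1"
}

# Run llama.cpp's quantize tool on the FP16 GGUF for the given type.
quantize_model() {
  "$LLAMA_DIR/quantize" "$MODEL_DIR/ggml-model-f16.gguf" "$(quant_output "$1")" "$1"
}

# Example call:
#   quantize_model q8_0
```

q8_0 produces a larger file than q4_0 and is generally slower on CPU, but it stays closer to the FP16 weights, so having both allows a quality and speed comparison on the D2000.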

Copy the quantized models to the D2000 laptop.
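
Multi-gigabyte files can be damaged in transit, so one option is to record checksums before copying and verify them on the laptop. A minimal sketch, assuming sha256sum from GNU coreutils on both machines; make_checksums, verify_checksums, the SHA256SUMS file name, and the d2000-laptop host name in the comment are all hypothetical:

```shell
# Write SHA-256 checksums for every *.bin file in the given directory.
make_checksums() {
  (cd "$1" && sha256sum -- *.bin > SHA256SUMS)
}

# Verify the files in the given directory against SHA256SUMS.
# Returns non-zero if any file is missing or does not match.
verify_checksums() {
  (cd "$1" && sha256sum --check --quiet SHA256SUMS)
}

# On the x86_64 machine:
#   make_checksums ~/Downloads/ai/chinese-alpaca-2-7b
#   scp ~/Downloads/ai/chinese-alpaca-2-7b/*.bin \
#       ~/Downloads/ai/chinese-alpaca-2-7b/SHA256SUMS \
#       yeqiang@d2000-laptop:Downloads/ai/chinese-alpaca-2-7b/   # hypothetical host
# On the D2000 laptop:
#   verify_checksums ~/Downloads/ai/chinese-alpaca-2-7b
```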

Test

Create chat.sh in the llama.cpp project directory:

#!/bin/bash

# temporary script to chat with Chinese Alpaca-2 model
# usage: ./chat.sh alpaca2-ggml-model-path your-first-instruction

SYSTEM_PROMPT='You are a helpful assistant. 你是一个乐于助人的助手。'
# SYSTEM_PROMPT='You are a helpful assistant. 你是一个乐于助人的助手。请你提供专业、有逻辑、内容真实、有价值的详细回复。' # Try this one, if you prefer longer response.
MODEL_PATH=$1
FIRST_INSTRUCTION=$2

./main -m "$MODEL_PATH" \
--color -i -c 4096 -t 8 --temp 0.5 --top_k 40 --top_p 0.9 --repeat_penalty 1.1 \
--in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -p \
"[INST] <<SYS>>
$SYSTEM_PROMPT
<</SYS>>

$FIRST_INSTRUCTION [/INST]"
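
The string passed with -p wraps the system prompt and the first instruction in the [INST] / <<SYS>> template. To see exactly what the model receives, the same string can be built on its own. A minimal sketch; build_prompt is a hypothetical helper and not part of chat.sh:

```shell
# Assemble the initial prompt in the same layout chat.sh passes to ./main -p.
# $1 is the system prompt, $2 is the first instruction.
build_prompt() {
  printf '[INST] <<SYS>>\n%s\n<</SYS>>\n\n%s [/INST]' "$1" "$2"
}

# Example call:
#   build_prompt 'You are a helpful assistant.' 'Introduce yourself.'
```

To run the script itself, make it executable with chmod +x chat.sh and pass the model path and a first instruction, as its usage comment describes. Since -t 8 matches the eight cores of the D2000, the thread count can stay as it is.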