Introduction

Dalai - The simplest way to run LLaMA on your computer | GeekNews
Alpaca and the acceleration of on-device LLM development | GeekNews
KoAlpaca - A Korean Alpaca model | GeekNews
[How to run Alpaca on a Mac/PC (Dalai) - YouTube](https://www.youtube.com/watch?v=kN-mIYKC2vA)
Presenting a ChatGPT alternative - Stanford Alpaca code review (feat. LLaMA + GPT3.5 vs ChatGPT) - YouTube

Dalai - The simplest way to run LLaMA on your computer | GeekNews

Create a directory and run the installer as follows.

mkdir dalai && cd dalai && npx dalai llama install 7B

The install log shows that the llama.cpp code is built from source.

I llama.cpp build info:
I UNAME_S:  Darwin
I UNAME_P:  arm
I UNAME_M:  arm64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:   -framework Accelerate
I CC:       Apple clang version 14.0.0 (clang-1400.0.29.102)
I CXX:      Apple clang version 14.0.0 (clang-1400.0.29.102)

However, the model download took a long time and never completed.

As the YouTube video "How to run Alpaca on a Mac/PC (Dalai)" also points out, it still seems to have too many problems for practical use.

install

The dalai installation includes a postinstall script.

It runs setup.js, which in turn calls Dalai.setup to set up Python and the build environment for each operating system.

There are currently two implementations: alpaca and llama.

setup

Node.js 18 or later is required.
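Before installing, it can save time to confirm the Node.js version. The following is a minimal sketch, not part of dalai: the function names `parse_node_major` and `node_is_new_enough` are my own, and it assumes `node --version` prints a string of the form `v18.16.0`.

```python
import re
import shutil
import subprocess

MIN_NODE_MAJOR = 18  # taken from the requirement stated above


def parse_node_major(version_text):
    """Return the major version from text such as 'v18.16.0', or None."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def node_is_new_enough(min_major=MIN_NODE_MAJOR):
    """Check the node binary found on PATH; False if node is missing."""
    node = shutil.which("node")
    if node is None:
        return False
    result = subprocess.run([node, "--version"], capture_output=True, text=True)
    major = parse_node_major(result.stdout)
    return major is not None and major >= min_major
```

Calling `node_is_new_enough()` before `npx dalai ...` gives a quick yes/no answer without reading through the install log.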

On Windows, if Python is not present, a runtime is installed from the following location.

https://github.com/indygreg/python-build-standalone/releases/

The ibrod83/nodejs-file-downloader package is used for the download.

On Linux, build tools such as build-essential are installed if they are missing.

Python packages such as torch are also installed.
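To see in advance what the setup step would have to install, a rough check like the one below may help. It is only a sketch under my own assumptions: the tool list (`gcc`, `g++`, `make`) is what build-essential typically provides on Debian/Ubuntu, and the helper names are hypothetical rather than anything dalai exposes.

```python
import importlib.util
import shutil

# Assumption: these are the tools build-essential typically provides.
BUILD_TOOLS = ("gcc", "g++", "make")


def find_missing_tools(tools=BUILD_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def has_module(name):
    """Return True if a top-level Python module (e.g. 'torch') is importable."""
    return importlib.util.find_spec(name) is not None
```

For example, `find_missing_tools()` returning an empty list and `has_module("torch")` returning `True` would suggest the environment is already prepared.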

alpaca

The model is downloaded into the ~/dalai folder.

The source is downloaded from https://github.com/ItsPi3141/alpaca.cpp.

The downloaded source is then built.

The file https://huggingface.co/Pi3141/alpaca-7B-ggml/resolve/main/ggml-model-q4_0.bin is then downloaded.

serve

After running serve, I sent a prompt but got no response.

exec: /Users/swcho/dalai/alpaca/main --seed -1 --threads 4 --n_predict 200 --model models/7B/ggml-model-q4_0.bin --top_k 40 --top_p 0.9 --temp 0.8 --repeat_last_n 64 --repeat_penalty 1.3 -p "write pythone script" in /Users/swcho/dalai/alpaca

Running the command above directly shows that loading the model fails:

/Users/swcho/dalai/alpaca/main --seed -1 --threads 4 --n_predict 200 --model models/7B/ggml-model-q4_0.bin --top_k 40 --top_p 0.9 --temp 0.8 --repeat_last_n 64 --repeat_penalty 1.3 -p "write pythone script"
main: seed = 1684053767
llama_model_load: loading model from 'models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: failed to open 'models/7B/ggml-model-q4_0.bin'
main: failed to load model from 'models/7B/ggml-model-q4_0.bin'
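To repeat this experiment without retyping the long command, the invocation can be rebuilt in a small script. This is a sketch only: the flags and values are copied from the exec line logged by dalai, while the function names and the `home` parameter are my own. Note that the model path is relative, so the working directory has to be the alpaca folder.

```python
import subprocess
from pathlib import Path


def build_alpaca_command(home, prompt,
                         model="models/7B/ggml-model-q4_0.bin",
                         threads=4, n_predict=200):
    """Rebuild the argument list that dalai logged for the alpaca binary."""
    main = Path(home) / "alpaca" / "main"
    return [
        str(main),
        "--seed", "-1",
        "--threads", str(threads),
        "--n_predict", str(n_predict),
        "--model", model,
        "--top_k", "40",
        "--top_p", "0.9",
        "--temp", "0.8",
        "--repeat_last_n", "64",
        "--repeat_penalty", "1.3",
        "-p", prompt,
    ]


def run_alpaca(home, prompt):
    """Run the binary from the alpaca folder and capture stdout and stderr."""
    cmd = build_alpaca_command(home, prompt)
    return subprocess.run(cmd, cwd=Path(home) / "alpaca",
                          capture_output=True, text=True)
```

With `result = run_alpaca(Path.home() / "dalai", "write python script")`, the load errors shown above would be expected to appear in `result.stderr`.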

This error occurs because the model binary's magic number does not match 0x67676d6c. https://github.com/ItsPi3141/alpaca.cpp/blob/779a873fb2ac2c40b4595c8ad4e93bf6ce133b14/main.cpp#L107-L110
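The check can be reproduced outside the C++ code to see which situation applies to a given file. The sketch below is my own, not code from alpaca.cpp: it assumes the magic number is stored as a little-endian 32-bit integer in the first four bytes (as it would be when written natively on x86 or arm64), and the function names are hypothetical. It also separates "file cannot be opened" from "file opens but has a different magic number", since the two need different fixes.

```python
import struct

GGML_MAGIC = 0x67676D6C  # the value quoted above; spells "ggml" in ASCII


def read_magic(path):
    """Return the first 4 bytes as a little-endian uint32, or None if too short."""
    with open(path, "rb") as f:
        head = f.read(4)
    if len(head) < 4:
        return None
    return struct.unpack("<I", head)[0]


def diagnose_model(path):
    """Describe why a model file would or would not pass the magic check."""
    try:
        magic = read_magic(path)
    except OSError as exc:
        return f"cannot open: {exc.strerror}"
    if magic is None:
        return "file too short"
    if magic == GGML_MAGIC:
        return "ok"
    return f"bad magic: 0x{magic:08x}"
```

Running `diagnose_model("models/7B/ggml-model-q4_0.bin")` from the alpaca folder should report either a path problem or the magic value actually found in the file.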

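The problem appears to come from a difference between the alpaca model version that existed when dalai was written and the newer version.

The https://github.com/ItsPi3141/alpaca.cpp project is a fork of the https://github.com/antimatter15/alpaca.cpp project.

Fortunately, I found a hint about where to get a binary of the matching version here: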

https://github.com/antimatter15/alpaca.cpp#get-started-7b
ggml-alpaca-7b-q4.bin

Download that file into the ~/dalai/alpaca/models/7B folder.
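Placing the file can be scripted as well. The sketch below rests on an assumption the steps above do not spell out: because the command line logged by dalai passes `--model models/7B/ggml-model-q4_0.bin`, the downloaded `ggml-alpaca-7b-q4.bin` probably has to be stored under that name to be picked up. The function name and parameters are hypothetical, and any existing file at the target path is overwritten.

```python
import shutil
from pathlib import Path


def install_model(downloaded, dalai_home=Path.home() / "dalai",
                  target_name="ggml-model-q4_0.bin"):
    """Copy a downloaded model into <dalai_home>/alpaca/models/7B.

    Assumption: target_name matches the --model path in dalai's exec line.
    An existing file with the same name is overwritten.
    """
    target_dir = Path(dalai_home) / "alpaca" / "models" / "7B"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / target_name
    shutil.copyfile(downloaded, target)
    return target
```

For example, `install_model(Path.home() / "Downloads" / "ggml-alpaca-7b-q4.bin")` would copy the file into place, where the Downloads path is just an illustration.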

With that file in place, I was able to get a result.

Things worth trying next

It seems worth trying to apply the changes made in alpaca.cpp.

I could also develop prompts that suit my own needs and apply them to a variety of uses.