[AI] LLM RAG Environment Setup - 2. Running Ollama and Registering a Model
Hello.
This is 꿈꾸는여행자.
This series covers setting up an LLM RAG environment.
In this part, we will practice running Ollama and registering a model.
The details are below.
Thank you.
> Below
________________
2.3. HuggingFace-Hub
* Loading a HuggingFace GGUF file into Ollama
2.3.1. Installing HuggingFace-Hub
* Installation
* Set up huggingface-hub so that GGUF files can be downloaded
* Takes about 3 minutes
pip install huggingface-hub
(env) [lds@llm llm]$ pip install huggingface-hub
Collecting huggingface-hub
Using cached huggingface_hub-0.22.2-py3-none-any.whl (388 kB)
Collecting filelock
Using cached filelock-3.14.0-py3-none-any.whl (12 kB)
Collecting fsspec>=2023.5.0
Using cached fsspec-2024.3.1-py3-none-any.whl (171 kB)
Collecting packaging>=20.9
Using cached packaging-24.0-py3-none-any.whl (53 kB)
Collecting pyyaml>=5.1
Downloading PyYAML-6.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (757 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 757.7/757.7 kB 2.2 MB/s eta 0:00:00
Collecting requests
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Collecting tqdm>=4.42.1
Using cached tqdm-4.66.2-py3-none-any.whl (78 kB)
Collecting typing-extensions>=3.7.4.3
Using cached typing_extensions-4.11.0-py3-none-any.whl (34 kB)
Collecting charset-normalizer<4,>=2
Downloading charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (140 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.3/140.3 kB 1.4 MB/s eta 0:00:00
Collecting idna<4,>=2.5
Using cached idna-3.7-py3-none-any.whl (66 kB)
Collecting urllib3<3,>=1.21.1
Using cached urllib3-2.2.1-py3-none-any.whl (121 kB)
Collecting certifi>=2017.4.17
Using cached certifi-2024.2.2-py3-none-any.whl (163 kB)
Installing collected packages: urllib3, typing-extensions, tqdm, pyyaml, packaging, idna, fsspec, filelock, charset-normalizer, certifi, requests, huggingface-hub
Successfully installed certifi-2024.2.2 charset-normalizer-3.3.2 filelock-3.14.0 fsspec-2024.3.1 huggingface-hub-0.22.2 idna-3.7 packaging-24.0 pyyaml-6.0.1 requests-2.31.0 tqdm-4.66.2 typing-extensions-4.11.0 urllib3-2.2.1
[notice] A new release of pip available: 22.3.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
(env) [lds@llm llm]$
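To confirm that the install landed in the active virtual environment, the installed version can be read back from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of huggingface-hub.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("huggingface-hub")
    # Prints the version when the package is present in this environment.
    print(version if version else "huggingface-hub is not installed in this environment")
```

In the environment shown above, this would be expected to report the version that pip listed in its "Successfully installed" line.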
2.3.2. Downloading the GGUF File
* Files used
* The example below uses EEVE-Korean-Instruct-10.8B-v1.0
* HF: https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0
* GGUF: https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF
* To get the GGUF file, download the .gguf model you want from https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF
* Takes about 30 minutes (the 7.65 GB file took about 19 minutes in the run below)
huggingface-cli download \
heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
ggml-model-Q5_K_M.gguf \
--local-dir [PATH_DOWNLOAD] \
--local-dir-use-symlinks False
huggingface-cli download \
heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
ggml-model-Q5_K_M.gguf \
--local-dir ./ \
--local-dir-use-symlinks False
(env) [lds@llm llm]$ huggingface-cli download \
heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
ggml-model-Q5_K_M.gguf \
--local-dir ./ \
--local-dir-use-symlinks False
Consider using `hf_transfer` for faster downloads. This solution comes with some limitations. See https://huggingface.co/docs/huggingface_hub/hf_transfer for more details.
downloading https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF/resolve/main/ggml-model-Q5_K_M.gguf to /home/lds/.cache/huggingface/hub/tmpaz24aq7z
ggml-model-Q5_K_M.gguf: 100%|████████████████████████████████████████████████████| 7.65G/7.65G [18:59<00:00, 6.72MB/s]
./ggml-model-Q5_K_M.gguf
(env) [lds@llm llm]$
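After a long download it can be worth checking that the file really is a GGUF file and not, for example, a saved error page. GGUF files begin with the four bytes `GGUF` followed by a little-endian 32-bit format version. The sketch below reads only that header; the function names are hypothetical, and it does not verify the rest of the file.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def gguf_version(header):
    """Return the GGUF format version encoded in `header`, or None if it is not GGUF."""
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The magic is followed by a little-endian unsigned 32-bit version number.
    return int.from_bytes(header[4:8], "little")


def gguf_version_of_file(path):
    """Read the first 8 bytes of `path` and interpret them as a GGUF header."""
    with Path(path).open("rb") as f:
        return gguf_version(f.read(8))
```

Calling `gguf_version_of_file("./ggml-model-Q5_K_M.gguf")` from the download directory should return an integer for a valid file and `None` otherwise.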
2.3.3. Preparing the Modelfile
* Example for EEVE-Korean-Instruct-10.8B-v1.0
* Each GGUF model has its own prompt template
* If you use the SOLAR template, see https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0
* vi Modelfile
FROM ggml-model-Q5_K_M.gguf
TEMPLATE """{{- if .System }}
<s>{{ .System }}</s>
{{- end }}
<s>Human:
{{ .Prompt }}</s>
<s>Assistant:
"""
SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."""
PARAMETER stop <s>
PARAMETER stop </s>
(env) [lds@llm llm]$ pwd
/home/lds/Works/llm
(env) [lds@llm llm]$
(env) [lds@llm llm]$ vi Modelfile
(env) [lds@llm llm]$ cat Modelfile
FROM ggml-model-Q5_K_M.gguf
TEMPLATE """{{- if .System }}
<s>{{ .System }}</s>
{{- end }}
<s>Human:
{{ .Prompt }}</s>
<s>Assistant:
"""
SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."""
PARAMETER stop <s>
PARAMETER stop </s>
(env) [lds@llm llm]$
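To make the TEMPLATE and the stop parameters more concrete, the following sketch approximates in Python what the Modelfile above describes. Ollama itself expands the template with Go's template engine, so exact whitespace may differ slightly from this approximation. Both function names are hypothetical.

```python
def render_prompt(prompt, system=""):
    """Approximate the text that the TEMPLATE above builds for a single turn."""
    parts = []
    if system:
        # Corresponds to the {{- if .System }} ... {{- end }} section.
        parts.append(f"\n<s>{system}</s>")
    parts.append(f"\n<s>Human:\n{prompt}</s>\n<s>Assistant:\n")
    return "".join(parts)


def cut_at_stop(text, stops=("<s>", "</s>")):
    """Illustrate PARAMETER stop: keep only the text before the first stop sequence."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


if __name__ == "__main__":
    print(render_prompt("Hello", system="You are a helpful assistant."))
    print(cut_at_stop("Hi there!</s><s>Human:"))
```

The stop sequences matter because the template marks every turn with `<s>` and `</s>`. Without them the model could keep generating and start writing the next "Human:" turn on its own.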
2.4. Run Ollama
2.4.1. List
* List the Ollama models
* ollama list
(env) [lds@llm llm]$ ollama list
NAME ID SIZE MODIFIED
(env) [lds@llm llm]$
2.4.2. Create
* Create the Ollama model
* Takes about 3 minutes
ollama create EEVE-Korean-10.8B -f EEVE-Korean-Instruct-10.8B-v1.0-GGUF/Modelfile
ollama create EEVE-Korean-10.8B -f ./Modelfile
ollama list
(env) [lds@llm llm]$ ollama create EEVE-Korean-10.8B -f ./Modelfile
transferring model data
creating model layer
creating template layer
creating system layer
creating parameters layer
creating config layer
using already created layer sha256:b9e3d1ad5e8aa6db09610d4051820f06a5257b7d7f0b06c00630e376abcfa4c1
writing layer sha256:6b70a2ad0d545ca50d11b293ba6f6355eff16363425c8b163289014cf19311fc
writing layer sha256:1fa69e2371b762d1882b0bd98d284f312a36c27add732016e12e52586f98a9f5
writing layer sha256:fc44d47f7d5a1b793ab68b54cdba0102140bd358739e9d78df4abf18432fb3ea
writing layer sha256:fd9b55ebd209ef8ef07f55be3fa0f624f8781cf08d6f0931875cffd76fcdbcd7
writing manifest
success
(env) [lds@llm llm]$
(env) [lds@llm llm]$ ollama list
NAME ID SIZE MODIFIED
EEVE-Korean-10.8B:latest 254d40e8ab74 7.7 GB About a minute ago
(env) [lds@llm llm]$
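If the registration step needs to be checked from a script, the table printed by `ollama list` can be parsed. This is a minimal sketch that assumes the column layout shown above (NAME, ID, SIZE as a value plus a unit, then MODIFIED); the function names are hypothetical, and the layout may differ between Ollama versions.

```python
import subprocess


def parse_ollama_list(output):
    """Parse the text printed by `ollama list` into a list of dicts."""
    models = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # Assumption: SIZE is always two tokens, such as "7.7 GB".
        name, model_id, size_value, size_unit = fields[:4]
        models.append({
            "name": name,
            "id": model_id,
            "size": f"{size_value} {size_unit}",
            "modified": " ".join(fields[4:]),
        })
    return models


def list_models():
    """Run `ollama list` and return the parsed rows (requires the ollama CLI)."""
    result = subprocess.run(["ollama", "list"], capture_output=True, text=True, check=True)
    return parse_ollama_list(result.stdout)
```

A script could then test `any(m["name"] == "EEVE-Korean-10.8B:latest" for m in list_models())` before moving on to the run step.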
2.4.3. Run
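* Run Ollama
* You can ask the model questions and get answers through the CLI
* Answers come back within a minute, but the text is printed slowly
* Measured on a 4-core CPU with 16 GB of memory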
ollama run EEVE-Korean-10.8B:latest
(env) [lds@llm app]$ ollama run EEVE-Korean-10.8B:latest
>>> Hello
Hello! I'm here to assist you in any way I can. Please go ahead and ask your question or provide a topic for me
to help with. If I don't know the answer to something, I will do my best to point you in the right direction or
find the information together. I strive to provide accurate and positive responses while ensuring safety and
respect for all individuals involved. Let's start our conversation!
>>> /?
Available Commands:
/set Set session variables
/show Show model information
/load <model> Load a session or model
/save <model> Save your current session
/bye Exit
/?, /help Help for a command
/? shortcuts Help for keyboard shortcuts
Use """ to begin a multi-line message.
>>> /bye
(env) [lds@llm app]$
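The same question can also be sent without the interactive prompt, through the HTTP API that the Ollama server exposes. The sketch below uses only the Python standard library. It assumes the server is listening on its default address `http://localhost:11434` and that the model name matches the one created above. The helper names are hypothetical, and the generous timeout is a guess based on the slow CPU-only responses noted above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def build_generate_request(model, prompt):
    """Build a POST request for Ollama's /api/generate endpoint (non-streaming)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt, timeout=300):
    """Send the prompt and return the generated text (requires a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, `ask("EEVE-Korean-10.8B:latest", "Hello")` should return the answer as a single string instead of streaming it token by token.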