
[AI] LLM RAG Environment Setup - 2: Running Ollama and Registering a Model

Author: 꿈꾸는여행자 | Posted: 24-06-27 15:10

Hello, this is 꿈꾸는여행자.

This series covers setting up an LLM RAG environment.

In this part, we walk through running Ollama and registering a model in it.

The details are below.

Thank you.

 

 


________________




2.3. HuggingFace-Hub

* Load a HuggingFace GGUF file into Ollama



2.3.1. Installing HuggingFace-Hub

* Installation

   * Set up huggingface-hub so that GGUF files can be downloaded

      * Takes about 3 minutes

pip install huggingface-hub

(env) [lds@llm llm]$ pip install huggingface-hub

Collecting huggingface-hub

  Using cached huggingface_hub-0.22.2-py3-none-any.whl (388 kB)

Collecting filelock

  Using cached filelock-3.14.0-py3-none-any.whl (12 kB)

Collecting fsspec>=2023.5.0

  Using cached fsspec-2024.3.1-py3-none-any.whl (171 kB)

Collecting packaging>=20.9

  Using cached packaging-24.0-py3-none-any.whl (53 kB)

Collecting pyyaml>=5.1

  Downloading PyYAML-6.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (757 kB)

     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 757.7/757.7 kB 2.2 MB/s eta 0:00:00

Collecting requests

  Using cached requests-2.31.0-py3-none-any.whl (62 kB)

Collecting tqdm>=4.42.1

  Using cached tqdm-4.66.2-py3-none-any.whl (78 kB)

Collecting typing-extensions>=3.7.4.3

  Using cached typing_extensions-4.11.0-py3-none-any.whl (34 kB)

Collecting charset-normalizer<4,>=2

  Downloading charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (140 kB)

     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.3/140.3 kB 1.4 MB/s eta 0:00:00

Collecting idna<4,>=2.5

  Using cached idna-3.7-py3-none-any.whl (66 kB)

Collecting urllib3<3,>=1.21.1

  Using cached urllib3-2.2.1-py3-none-any.whl (121 kB)

Collecting certifi>=2017.4.17

  Using cached certifi-2024.2.2-py3-none-any.whl (163 kB)

Installing collected packages: urllib3, typing-extensions, tqdm, pyyaml, packaging, idna, fsspec, filelock, charset-normalizer, certifi, requests, huggingface-hub

Successfully installed certifi-2024.2.2 charset-normalizer-3.3.2 filelock-3.14.0 fsspec-2024.3.1 huggingface-hub-0.22.2 idna-3.7 packaging-24.0 pyyaml-6.0.1 requests-2.31.0 tqdm-4.66.2 typing-extensions-4.11.0 urllib3-2.2.1



[notice] A new release of pip available: 22.3.1 -> 24.0

[notice] To update, run: pip install --upgrade pip

(env) [lds@llm llm]$ 
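
To confirm the installation from Python instead of reading the pip log, a minimal sketch using only the standard library might look like the following. The helper name `installed_version` is hypothetical, and the version it reports depends on your environment.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the huggingface-hub version found in the active environment,
    # or None if the package has not been installed there.
    print(installed_version("huggingface-hub"))
```

If this prints `None`, the virtual environment that ran `pip install` is probably not the one that is currently active.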





2.3.2. Downloading the GGUF File

* Files used

   * The example below uses EEVE-Korean-Instruct-10.8B-v1.0

      * HF: https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0

      * GGUF: https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF

* To get the GGUF file, download the .gguf model you want from https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF

   * Takes about 30 minutes (the 7.65 GB download in the log below took about 19 minutes)

huggingface-cli download \
  heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
  ggml-model-Q5_K_M.gguf \
  --local-dir [PATH_DOWNLOAD] \
  --local-dir-use-symlinks False

huggingface-cli download \
  heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
  ggml-model-Q5_K_M.gguf \
  --local-dir ./ \
  --local-dir-use-symlinks False

(env) [lds@llm llm]$ huggingface-cli download \
  heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF \
  ggml-model-Q5_K_M.gguf \
  --local-dir ./ \
  --local-dir-use-symlinks False
Consider using `hf_transfer` for faster downloads. This solution comes with some limitations. See https://huggingface.co/docs/huggingface_hub/hf_transfer for more details.
downloading https://huggingface.co/heegyu/EEVE-Korean-Instruct-10.8B-v1.0-GGUF/resolve/main/ggml-model-Q5_K_M.gguf to /home/lds/.cache/huggingface/hub/tmpaz24aq7z
ggml-model-Q5_K_M.gguf: 100%|████████████████████████████████████████████████████| 7.65G/7.65G [18:59<00:00, 6.72MB/s]
./ggml-model-Q5_K_M.gguf
(env) [lds@llm llm]$ 
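
Because the download is large and slow, it can be worth checking that the file really is a GGUF model before building on it. The sketch below relies on the GGUF header layout (the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number). The function name `read_gguf_version` is hypothetical, and the check says nothing about whether the rest of the file is complete.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The magic is followed by a little-endian uint32 version number.
    (version,) = struct.unpack("<I", header[4:8])
    return version


if __name__ == "__main__":
    # File name taken from the download command in this section.
    model = Path("ggml-model-Q5_K_M.gguf")
    if model.exists():
        print("size in bytes:", model.stat().st_size)
        print("GGUF version:", read_gguf_version(model))
    else:
        print("model file not found in the current directory")
```

A result of `None` usually means the download was interrupted or an error page was saved in place of the model.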



2.3.3. Preparing the Modelfile

* Example for EEVE-Korean-Instruct-10.8B-v1.0

   * Each GGUF model has its own prompt template

      * For the SOLAR template, see https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0

      * vi Modelfile





FROM ggml-model-Q5_K_M.gguf

TEMPLATE """{{- if .System }}
<s>{{ .System }}</s>
{{- end }}
<s>Human:
{{ .Prompt }}</s>
<s>Assistant:
"""

SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."""

PARAMETER stop <s>
PARAMETER stop </s>

(env) [lds@llm llm]$ pwd
/home/lds/Works/llm
(env) [lds@llm llm]$ 
(env) [lds@llm llm]$ vi Modelfile 
(env) [lds@llm llm]$ cat Modelfile 
FROM ggml-model-Q5_K_M.gguf

TEMPLATE """{{- if .System }}
<s>{{ .System }}</s>
{{- end }}
<s>Human:
{{ .Prompt }}</s>
<s>Assistant:
"""

SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."""

PARAMETER stop <s>
PARAMETER stop </s>

(env) [lds@llm llm]$ 
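
If you would rather generate the Modelfile than type it in an editor, the same content can be assembled from Python. This is only a sketch: the function name `build_modelfile` is hypothetical, and the template, system prompt, and stop parameters are copied from the Modelfile in this section, so they apply to this model only.

```python
# Prompt template copied from the Modelfile in this section.
TEMPLATE = (
    "{{- if .System }}\n"
    "<s>{{ .System }}</s>\n"
    "{{- end }}\n"
    "<s>Human:\n"
    "{{ .Prompt }}</s>\n"
    "<s>Assistant:\n"
)

SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_modelfile(gguf_path: str) -> str:
    """Return Modelfile text that points at the given GGUF file."""
    lines = [
        f"FROM {gguf_path}",
        "",
        f'TEMPLATE """{TEMPLATE}"""',
        "",
        f'SYSTEM """{SYSTEM}"""',
        "",
        "PARAMETER stop <s>",
        "PARAMETER stop </s>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the result; redirect it to a file named Modelfile to use it.
    print(build_modelfile("ggml-model-Q5_K_M.gguf"), end="")
```

A relative path in the `FROM` line assumes the GGUF file sits in the same directory as the Modelfile, as it does in the session above.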




2.4. Run Ollama

2.4.1. List

* List the models registered in Ollama

   * ollama list

(env) [lds@llm llm]$ ollama list

NAME        ID        SIZE        MODIFIED 

(env) [lds@llm llm]$ 


2.4.2. Create  

* Create the Ollama model

   * Takes about 3 minutes

ollama create EEVE-Korean-10.8B -f EEVE-Korean-Instruct-10.8B-v1.0-GGUF/Modelfile



ollama create EEVE-Korean-10.8B -f ./Modelfile



ollama list

(env) [lds@llm llm]$ ollama create EEVE-Korean-10.8B -f ./Modelfile

transferring model data 

creating model layer 

creating template layer 

creating system layer 

creating parameters layer 

creating config layer 

using already created layer sha256:b9e3d1ad5e8aa6db09610d4051820f06a5257b7d7f0b06c00630e376abcfa4c1 

writing layer sha256:6b70a2ad0d545ca50d11b293ba6f6355eff16363425c8b163289014cf19311fc 

writing layer sha256:1fa69e2371b762d1882b0bd98d284f312a36c27add732016e12e52586f98a9f5 

writing layer sha256:fc44d47f7d5a1b793ab68b54cdba0102140bd358739e9d78df4abf18432fb3ea 

writing layer sha256:fd9b55ebd209ef8ef07f55be3fa0f624f8781cf08d6f0931875cffd76fcdbcd7 

writing manifest 

success 

(env) [lds@llm llm]$ 

(env) [lds@llm llm]$ ollama list

NAME                            ID                  SIZE          MODIFIED           

EEVE-Korean-10.8B:latest        254d40e8ab74        7.7 GB        About a minute ago        

(env) [lds@llm llm]$ 
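
When the registration step is scripted, the `ollama list` output can be parsed to check that the new model is present. The sketch below assumes the column layout shown above (a header row, with columns separated by a tab or by two or more spaces), which may differ between Ollama versions. The function name `parse_ollama_list` is hypothetical.

```python
import re
import shutil
import subprocess

# Columns are assumed to be separated by a tab or by two or more whitespace characters.
COLUMN_SPLIT = re.compile(r"\s{2,}|\t")


def parse_ollama_list(output):
    """Turn the text printed by `ollama list` into a list of dicts keyed by column name."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if not lines:
        return []
    keys = [h.lower() for h in COLUMN_SPLIT.split(lines[0])]
    return [dict(zip(keys, COLUMN_SPLIT.split(ln))) for ln in lines[1:]]


if __name__ == "__main__":
    if shutil.which("ollama") is None:
        print("ollama executable not found on PATH")
    else:
        result = subprocess.run(["ollama", "list"], capture_output=True, text=True)
        names = [row.get("name") for row in parse_ollama_list(result.stdout)]
        print("EEVE-Korean-10.8B:latest" in names)
```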







2.4.3. Run

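* Run the model with Ollama

   * You can ask the model questions and read its answers directly from the CLI

      * A response starts within a minute, but the text is generated slowly

         * Measured on a 4-core CPU with 16 GB of memory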

ollama run EEVE-Korean-10.8B:latest

(env) [lds@llm app]$ ollama run EEVE-Korean-10.8B:latest

>>> Hello

Hello! I'm here to assist you in any way I can. Please go ahead and ask your question or provide a topic for me 

to help with. If I don't know the answer to something, I will do my best to point you in the right direction or 

find the information together. I strive to provide accurate and positive responses while ensuring safety and 

respect for all individuals involved. Let's start our conversation!



>>> /?

Available Commands:

  /set            Set session variables

  /show           Show model information

  /load <model>   Load a session or model

  /save <model>   Save your current session

  /bye            Exit

  /?, /help       Help for a command

  /? shortcuts    Help for keyboard shortcuts



Use """ to begin a multi-line message.



>>> /bye

(env) [lds@llm app]$ 
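
Besides the interactive CLI, Ollama serves a local HTTP API, which is the more likely entry point for a RAG application. The sketch below sends one prompt to the `/api/generate` endpoint using only the standard library. It assumes the server is listening on its default address `http://localhost:11434`; the function names and the 300-second timeout are hypothetical choices, the latter picked because CPU-only generation is slow.

```python
import json
import urllib.error
import urllib.request

# Default local endpoint; adjust if your Ollama server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt):
    """Build a non-streaming generate request for the given model and prompt."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=300):
    """Send the request and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


if __name__ == "__main__":
    try:
        print(generate("EEVE-Korean-10.8B:latest", "Hello"))
    except (urllib.error.URLError, OSError) as exc:
        print("could not reach the Ollama server:", exc)
```

Setting `stream` to `False` returns the whole answer in a single JSON object, which keeps the example short at the cost of waiting for generation to finish.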


