<!---
Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Installation[[installation]]
Install πŸ€— Transformers for whichever deep learning library you're working with, set up your cache, and optionally configure πŸ€— Transformers to run offline.
πŸ€— Transformers is tested on Python 3.6+, PyTorch 1.1.0+, TensorFlow 2.0+, and Flax. Follow the installation instructions below for the deep learning library you are using:
* [PyTorch](https://pytorch.org/get-started/locally/) installation instructions
* [TensorFlow 2.0](https://www.tensorflow.org/install/pip) installation instructions
* [Flax](https://flax.readthedocs.io/en/latest/) installation instructions
## Install with pip[[install-with-pip]]
You should install πŸ€— Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're unfamiliar with Python virtual environments, take a look at this [guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/). A virtual environment makes it easier to manage different projects and avoid compatibility issues between dependencies.
Start by creating a virtual environment in your project directory:
```bash
python -m venv .env
```
가상 ν™˜κ²½μ„ ν™œμ„±ν™”ν•΄μ£Όμ„Έμš”. Linuxλ‚˜ MacOS의 경우:
```bash
source .env/bin/activate
```
On Windows:
```bash
.env/Scripts/activate
```
Now you're ready to install πŸ€— Transformers with the following command:
```bash
pip install transformers
```
For CPU-support only, you can conveniently install πŸ€— Transformers and a deep learning library in one line. For example, install πŸ€— Transformers and PyTorch with:
```bash
pip install transformers[torch]
```
πŸ€— Transformers and TensorFlow 2.0:
```bash
pip install transformers[tf-cpu]
```
πŸ€— Transformers and Flax:
```bash
pip install transformers[flax]
```
Finally, check whether πŸ€— Transformers has been properly installed by running the following command. It will download a pretrained model:
```bash
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"
```
라벨과 μ μˆ˜κ°€ 좜λ ₯되면 잘 μ„€μΉ˜λœ κ²ƒμž…λ‹ˆλ‹€.
```bash
[{'label': 'POSITIVE', 'score': 0.9998704791069031}]
```
## μ†ŒμŠ€μ—μ„œ μ„€μΉ˜ν•˜κΈ°[[install-from-source]]
πŸ€— Transformersλ₯Ό μ†ŒμŠ€μ—μ„œ μ„€μΉ˜ν•˜λ €λ©΄ μ•„λž˜ λͺ…령을 μ‹€ν–‰ν•˜μ„Έμš”.
```bash
pip install git+https://github.com/huggingface/transformers
```
This command installs the bleeding edge `main` version rather than the latest `stable` version. The `main` version is useful for staying up-to-date with the latest developments, for instance when a bug has been fixed since the last official release but a new release hasn't been rolled out yet. However, this also means the `main` version may not always be stable. We strive to keep the `main` version operational, and most issues are usually resolved within a few hours or a day. If you run into a problem, please open an [Issue](https://github.com/huggingface/transformers/issues) so we can fix it even sooner!
As before, check whether πŸ€— Transformers has been properly installed:
```bash
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I love you'))"
```
## Editable install[[editable-install]]
You will need an editable install if you'd like to:
* Use the `main` version of the source code.
* Contribute to πŸ€— Transformers and test changes in the code.
Clone the repository and install πŸ€— Transformers with the following commands:
```bash
git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .
```
These commands link the folder you cloned the repository to with your Python library paths. Python will now look inside the cloned folder in addition to the normal library paths. For example, if your Python packages are typically installed in `~/anaconda3/envs/main/lib/python3.7/site-packages/`, Python will also search the folder you cloned to, `~/transformers/`.
<Tip warning={true}>
You must keep the `transformers` folder if you want to keep using the library.
</Tip>
λ³΅μ œλ³Έμ€ μ΅œμ‹  λ²„μ „μ˜ πŸ€— Transformers둜 μ‰½κ²Œ μ—…λ°μ΄νŠΈν•  수 μžˆμŠ΅λ‹ˆλ‹€.
```bash
cd ~/transformers/
git pull
```
Python ν™˜κ²½μ„ λ‹€μ‹œ μ‹€ν–‰ν•˜λ©΄ μ—…λ°μ΄νŠΈλœ πŸ€— Transformers의 `main` 버전을 μ°Ύμ•„λ‚Ό κ²ƒμž…λ‹ˆλ‹€.
## Install with conda[[install-with-conda]]
Install from the conda channel `huggingface`:
```bash
conda install -c huggingface transformers
```
## μΊμ‹œ κ΅¬μ„±ν•˜κΈ°[[cache-setup]]
μ‚¬μ „ν›ˆλ ¨λœ λͺ¨λΈμ€ λ‹€μš΄λ‘œλ“œλœ ν›„ 둜컬 경둜 `~/.cache/huggingface/hub`에 μΊμ‹œλ©λ‹ˆλ‹€. μ…Έ ν™˜κ²½ λ³€μˆ˜ `TRANSFORMERS_CACHE`의 κΈ°λ³Έ λ””λ ‰ν„°λ¦¬μž…λ‹ˆλ‹€. Windows의 경우 κΈ°λ³Έ λ””λ ‰ν„°λ¦¬λŠ” `C:\Users\username\.cache\huggingface\hub`μž…λ‹ˆλ‹€. μ•„λž˜μ˜ μ…Έ ν™˜κ²½ λ³€μˆ˜λ₯Ό (μš°μ„  μˆœμœ„) μˆœμ„œλŒ€λ‘œ λ³€κ²½ν•˜μ—¬ λ‹€λ₯Έ μΊμ‹œ 디렉토리λ₯Ό 지정할 수 μžˆμŠ΅λ‹ˆλ‹€.
1. μ…Έ ν™˜κ²½ λ³€μˆ˜ (κΈ°λ³Έ): `HUGGINGFACE_HUB_CACHE` λ˜λŠ” `TRANSFORMERS_CACHE`
2. μ…Έ ν™˜κ²½ λ³€μˆ˜: `HF_HOME`
3. μ…Έ ν™˜κ²½ λ³€μˆ˜: `XDG_CACHE_HOME` + `/huggingface`
<Tip>
πŸ€— Transformers will use the shell environment variables `PYTORCH_TRANSFORMERS_CACHE` or `PYTORCH_PRETRAINED_BERT_CACHE` if they are set from an earlier iteration of this library, unless you specify the shell environment variable `TRANSFORMERS_CACHE`.
</Tip>
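The precedence order above can be illustrated with a small sketch. The `resolve_cache_dir` helper below is hypothetical (it is not part of πŸ€— Transformers); it only mirrors the lookup order described in this section:

```python
import os

def resolve_cache_dir(env):
    """Hypothetical helper mirroring the cache lookup order above."""
    if "TRANSFORMERS_CACHE" in env:
        return env["TRANSFORMERS_CACHE"]
    if "HUGGINGFACE_HUB_CACHE" in env:
        return env["HUGGINGFACE_HUB_CACHE"]
    if "HF_HOME" in env:
        return os.path.join(env["HF_HOME"], "hub")
    if "XDG_CACHE_HOME" in env:
        return os.path.join(env["XDG_CACHE_HOME"], "huggingface", "hub")
    # Fall back to the default location.
    return os.path.expanduser(os.path.join("~", ".cache", "huggingface", "hub"))

print(resolve_cache_dir({"HF_HOME": "/data/hf"}))
```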
## μ˜€ν”„λΌμΈ λͺ¨λ“œ[[offline-mode]]
πŸ€— Transformersλ₯Ό 둜컬 파일만 μ‚¬μš©ν•˜λ„λ‘ ν•΄μ„œ λ°©ν™”λ²½ λ˜λŠ” μ˜€ν”„λΌμΈ ν™˜κ²½μ—μ„œ μ‹€ν–‰ν•  수 μžˆμŠ΅λ‹ˆλ‹€. ν™œμ„±ν™”ν•˜λ €λ©΄ `TRANSFORMERS_OFFLINE=1` ν™˜κ²½ λ³€μˆ˜λ₯Ό μ„€μ •ν•˜μ„Έμš”.
<Tip>
Add [πŸ€— Datasets](https://huggingface.co/docs/datasets/) to your offline training workflow by setting the environment variable `HF_DATASETS_OFFLINE=1`.
</Tip>
For example, you would normally run a program on a networked machine, firewalled from external instances, with the following command:
```bash
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```
μ˜€ν”„λΌμΈ κΈ°κΈ°μ—μ„œ λ™μΌν•œ ν”„λ‘œκ·Έλž¨μ„ λ‹€μŒκ³Ό 같이 μ‹€ν–‰ν•  수 μžˆμŠ΅λ‹ˆλ‹€.
```bash
HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
```
이제 μŠ€ν¬λ¦½νŠΈλŠ” 둜컬 νŒŒμΌμ— ν•œν•΄μ„œλ§Œ 검색할 κ²ƒμ΄λ―€λ‘œ, μŠ€ν¬λ¦½νŠΈκ°€ μ€‘λ‹¨λ˜κ±°λ‚˜ μ‹œκ°„μ΄ 초과될 λ•ŒκΉŒμ§€ λ©ˆμΆ°μžˆμ§€ μ•Šκ³  잘 싀행될 κ²ƒμž…λ‹ˆλ‹€.
### μ˜€ν”„λΌμΈμš© λͺ¨λΈ 및 ν† ν¬λ‚˜μ΄μ € λ§Œλ“€μ–΄λ‘κΈ°[[fetch-models-and-tokenizers-to-use-offline]]
Another option for using πŸ€— Transformers offline is to download the files ahead of time, and then point to their local path when you need to use them offline. There are three ways to do this:
πŸ€— Transformersλ₯Ό μ˜€ν”„λΌμΈμœΌλ‘œ μ‚¬μš©ν•˜λŠ” 또 λ‹€λ₯Έ 방법은 νŒŒμΌμ„ 미리 λ‹€μš΄λ‘œλ“œν•œ λ‹€μŒ, μ˜€ν”„λΌμΈμΌ λ•Œ μ‚¬μš©ν•  둜컬 경둜λ₯Ό μ§€μ •ν•΄λ‘λŠ” κ²ƒμž…λ‹ˆλ‹€. 3가지 쀑 νŽΈν•œ 방법을 κ³ λ₯΄μ„Έμš”.
* [Model Hub](https://huggingface.co/models)의 UIλ₯Ό 톡해 νŒŒμΌμ„ λ‹€μš΄λ‘œλ“œν•˜λ €λ©΄ ↓ μ•„μ΄μ½˜μ„ ν΄λ¦­ν•˜μ„Έμš”.
![download-icon](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/download-icon.png)
* Use the [`PreTrainedModel.from_pretrained`] and [`PreTrainedModel.save_pretrained`] workflow:
1. Download your files ahead of time with [`PreTrainedModel.from_pretrained`]:
```py
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")
```
2. Save your files to a specified directory with [`PreTrainedModel.save_pretrained`]:
```py
>>> tokenizer.save_pretrained("./your/path/bigscience_t0")
>>> model.save_pretrained("./your/path/bigscience_t0")
```
3. 이제 μ˜€ν”„λΌμΈμΌ λ•Œ [`PreTrainedModel.from_pretrained`]둜 μ €μž₯ν•΄λ’€λ˜ νŒŒμΌμ„ μ§€μ •λœ κ²½λ‘œμ—μ„œ λ‹€μ‹œ λΆˆλŸ¬μ˜€μ„Έμš”.
```py
>>> tokenizer = AutoTokenizer.from_pretrained("./your/path/bigscience_t0")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("./your/path/bigscience_t0")
```
* Programmatically download files with the [huggingface_hub](https://github.com/huggingface/huggingface_hub/tree/main/src/huggingface_hub) library:
1. Install the `huggingface_hub` library in your virtual environment:
```bash
python -m pip install huggingface_hub
```
2. Use the [`hf_hub_download`](https://huggingface.co/docs/hub/adding-a-library#download-files-from-the-hub) function to download a file to a specific path. For example, the following command downloads the `config.json` file from the [T0](https://huggingface.co/bigscience/T0_3B) model to your desired path:
```py
>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download(repo_id="bigscience/T0_3B", filename="config.json", cache_dir="./your/path/bigscience_t0")
```
νŒŒμΌμ„ λ‹€μš΄λ‘œλ“œν•˜κ³  λ‘œμ»¬μ— μΊμ‹œ 해놓고 λ‚˜λ©΄, λ‚˜μ€‘μ— λΆˆλŸ¬μ™€ μ‚¬μš©ν•  수 μžˆλ„λ‘ 둜컬 경둜λ₯Ό μ§€μ •ν•΄λ‘μ„Έμš”.
```py
>>> from transformers import AutoConfig
>>> config = AutoConfig.from_pretrained("./your/path/bigscience_t0/config.json")
```
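Before pointing `from_pretrained` at a local path, it can help to sanity-check the downloaded file. The `is_valid_config` helper below is hypothetical (not part of any library); it only verifies that the file exists and parses as JSON:

```python
import json
import os

def is_valid_config(path):
    """Hypothetical check: does `path` exist and contain parseable JSON?"""
    if not os.path.isfile(path):
        return False
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
    except (json.JSONDecodeError, OSError):
        return False
    return True

# Prints False unless the file has actually been downloaded to this path.
print(is_valid_config("./your/path/bigscience_t0/config.json"))
```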
<Tip>
See the [How to download files from the Hub](https://huggingface.co/docs/hub/how-to-downstream) section for more details on downloading files stored on the Hub.
</Tip>