language:
  - en
  - ja
license: other
license_name: plamo-100b-license
license_link: https://huggingface.co/pfnet/plamo-100b/tree/main/LICENSE
library_name: transformers
pipeline_tag: text-generation
extra_gated_prompt: >-
  ### PLaMo Non-Commercial License Agreement


  The PLaMo Non-Commercial License Agreement hereby sets forth the licensing
  terms and conditions the User must comply with for the non-commercial use of
  the foundational large language model PLaMo-100B, provided by Preferred
  Networks, Inc. By agreeing to this Agreement or by using the Model, the User
  consents to be legally bound by all terms and conditions stipulated herein.


  Article 1: Definitions

  (1) "Agreement" shall mean this PLaMo Non-Commercial License Agreement.

  (2) "PFN" stands for Preferred Networks, Inc.

  (3) "Model" shall mean  the model code named "PLaMo-100B", including its
  training scripts, Tokenizer, pre-trained weights, and any associated
  components or resources provided by PFN. 

  (4) "User" is the person or legal entity that uses the Model.

  (5) "License" shall mean the permission granted by PFN to User to use the
  Model under the terms of this Agreement.

  (6) "Derivative Model" shall mean any model code created through modifications
  of the Model, such as, fine-tuning, downsizing by quantization, code editing,
  and parameter tuning. The Derivative Model includes the weights of fine-tuning
  and other associated components and resources of the created model. 

  (7) "Outputs" shall mean the results generated by the Model or Derivative
  Model.

  (8) "Models and Outputs" shall collectively refer to the Model, Derivative
  Models, and Outputs.


  Article 2: User

  The User must be at least 18 years of age, or of legal age to independently
  enter into an agreement in their country of residence. Notwithstanding, this
  requirement does not apply if the User’s parent or legal proxy provides their
  consent for the User to enter into this Agreement.


  Article 3: License

  (1) PFN grants the User permission to use the Model under the terms and
  conditions of this Agreement, and to the extent stipulated herein, provided
  that the User agrees to and abides by all these terms and conditions.

  (2) The License provided shall be non-exclusive, worldwide, revocable,
  non-sublicensable, non-transferrable, and royalty-free.

  (3) The User shall only use the Models and Outputs for personal or academic
  applications.

  (4) The User is prohibited from using the Models and Outputs for any of the
  following purposes or any other commercial purposes:

    (a) For any business of the User or a third party.

    (b) For the development or research of models or services intended for commercial applications

  (5) The User shall not provide the Model or any Derivative Models to any third
  parties, nor shall the User allow third parties to use them, regardless of
  whether the use is for commercial or non-commercial purposes.


  Article 4: Derivative Model

  (1) The User may create a Derivative Model from the Model through methods such
  as fine-tuning, downsizing by quantization, code editing, and parameter
  modification. However, the creation of a Derivative Model for purposes set
  forth in Paragraph 4 of the preceding Article or for any other commercial
  purposes is strictly prohibited.

  (2) Upon creating any Derivative Models, the User must include and clearly
  display the prefix "PLaMo" in the names of these Derivative Models.


  Article 5: Output

  (1) The User may publicize Outputs, provided that it is clearly stated that
  they are the outputs generated by the Model or Derivative Models.

  (2) The User is strictly prohibited from utilizing the Outputs for the purpose
  of developing, training, or enhancing any other large language models that are
  not the Model or Derivative Models.


  Article 6: Other Usage Restrictions

  In relation to the usage of the Model, Derivative Models, or Outputs
  (collectively defined as the "Models and Outputs"), the User is strictly
  prohibited from committing any of the listed acts:

  (1) Violating any laws and regulations or disrupting public order and societal
  norms

  (2) Infringing upon the rights or interests of PFN or any third party

  (3) Tarnishing the reputation or credibility of PFN or any third party

  (4) Inflicting financial damage upon PFN or any third party

  (5) Making intimidations, racial discrimination, or defamatory remarks

  (6) Inputting personal information as defined by Japanese law, specifically
  Paragraph 1 of Article 2 of the Act on the Protection of Personal Information
  (Act No. 57 of 2003), or sensitive personal information as similarly defined
  by this statute.

  (7) Stalking, harassing, trolling, or doxxing other users

  (8) Developing, endorsing, or using computer viruses, malicious software,
  automated software or bots, or harmful programs

  (9) Engaging in any communication, action, or expression that can incite or
  encourage harmful actions such as  suicide, self-abuse, violence, and drug use

  (10) Communicating false information

  (11) Circulating information implying the Outputs to be the official view and
  opinion of PFN

  (12) Using the Models and Outputs in finance, education, employment, housing,
  insurance, legal, medical, or any other areas where such usage could have a
  legal or significant impact on any individual or business entity

  (13) Relying on the Models and Outputs as the only source of information or as
  an alternative for expert advice

  (14) Utilizing the Models and Outputs for vehicle navigation or for automated
  driving systems

  (15) Engaging in, threatening to commit, participating in, or assisting in any
  criminal activities or any activities related thereto

  (16) Engaging in money laundering or similar financial malpractices

  (17) Providing direct or indirect benefits to anti-social forces

  (18) Circulating obscene content or materials detrimental to the healthy
  development of young individuals

  (19) Utilizing the Models and Outputs for political activities or activities
  of similar nature

  (20) Acquiring the Model through methods other than the interface provided by
  PFN

  (21) In addition to the aforementioned, any conduct deemed reasonably
  inappropriate as per PFN’s discretion


  Article 7: Disclaimer of Warranty

  THE MODEL AND OUTPUTS ARE PROVIDED ON AN "AS IS" BASIS. PFN MAKES NO
  GUARANTEES OR ASSURANCES OF ANY KIND IN RELATION TO THEM, INCLUDING BUT NOT
  LIMITED TO THEIR ACCURACY, AUTHENTICITY, MERCHANTABILITY, QUALITY,
  PERFORMANCE, APPLICABILITY FOR A PARTICULAR USE, OR  NON-INFRINGEMENT OF ANY
  RIGHTS. IT FALLS ON THE USER TO DISCERN THE APPROPRIATENESS OF USING THE
  MODELS AND OUTPUTS, AND THE USER WILL ASSUME FULL RESPONSIBILITY FOR ALL
  CONSEQUENCES AS A RESULT OF THE USE OF THE MODELS AND OUTPUTS.


  Article 8: Limitation of Liability

  (1) PFN'S LIABILITY FOR ANY DAMAGE INCURRED BY THE USER, IN RELATION TO THIS
  AGREEMENT AND THE MODELS AND OUTPUTS, WHETHER ARISING FROM CONTRACT, TORT,
  PRODUCT LIABILITY OR ANY OTHER LEGAL CLAIM, WILL BE LIMITED TO DIRECT AND
  GENERAL DAMAGES ONLY (PFN WILL NOT BE HELD RESPONSIBLE FOR ANY LOSS OF
  PROFITS, SPECIAL, INDIRECT, OR ANY OTHER DAMAGES, WHETHER SUCH DAMAGES WERE
  FORESEEABLE OR UNFORESEEABLE.), AND THE MAXIMUM LIABILITY OF DAMAGES SHALL BE
  500 YEN. THIS PROVISION, HOWEVER, DOES NOT APPLY IF PFN IS DETERMINED TO HAVE
  ACTED WITH DELIBERATE INTENT OR GROSS NEGLIGENCE.

  (2) REGARDLESS OF THE PREVIOUS PARAGRAPH, SHOULD THE USER USE THE MODELS AND
  OUTPUTS FOR BUSINESS PURPOSES, PFN WILL NOT BEAR ANY LIABILITY TO THE USER FOR
  ANY DAMAGES OR OTHER LIABILITIES REGARDING THIS AGREEMENT AND THE MODELS AND
  OUTPUTS. 


  Article 9: User's Responsibility

  (1) The User must ensure their acquisition and use of the Models and Outputs
  is in compliance with all relevant laws and regulations, including but not
  limited to those concerning import, export, and trade, in addition to the
  terms and conditions of this Agreement. 

  (2) The User shall compensate for any damages incurred by PFN as a result of
  the User's violation of this Agreement or use of the Models and Outputs.

  (3) Should PFN be subject to any third-party claims for damages or any other
  liabilities resulting from the User's usage of the Models and Outputs, it is
  the User’s responsibility to absolve PFN from such claims and safeguard PFN
  against any potential liabilities.


  Article 10: Ownership of Rights

  (1) All rights pertaining to the Model shall belong to PFN or third parties
  who have licensed the Model to PFN.

  (2) The User holds the rights to the modifications they make to the Model when
  creating Derivative Models, while PFN retains the rights in all remaining
  parts of the Derivative Models.

  (3) The User holds all rights pertaining to the Outputs.


  Article 11: Termination of Agreement

  PFN reserves the right to terminate this Agreement at any given time and at
  its sole discretion.


  Article 12: Duration of Agreement

  (1) This Agreement will take effect when the User either agrees to its terms
  or accesses the Model, whichever occurs first, and will remain in effect until
  it is terminated.

  (2) Upon the termination of the Agreement, regardless of the reasons, the User
  shall immediately cease all use of the Model and Derivative Models and delete
  all of them.


  Article 13: Revision of Agreement

  PFN may revise this Agreement (including the rules and regulations concerning
  the Models and Outputs; the same shall apply hereinafter in this Article). PFN
  shall announce any revisions to this Agreement, including the details of the
  changes and their effective date, in a prescribed manner by PFN, and prior to
  the implementation of the changes. 


  Article 14: Governing Law and Court of Jurisdiction

  (1) This Agreement shall be governed by the laws of Japan.

  (2) Any conflicts arising out of or in connection with this Agreement or the
  Models and Outputs shall be settled under the exclusive jurisdiction of the
  Tokyo District Court.
extra_gated_heading: Agree to our license to download PLaMo-100B
extra_gated_description: >-
  To download PLaMo-100B, you have to agree to our license. PLaMo-100B is
  released under both commercial and non-commercial licenses. For non-commercial
  use, please check the
  [LICENSE](https://huggingface.co/pfnet/plamo-100b/tree/main/LICENSE). For
  commercial use, please contact us via this
  [form](https://forms.gle/J96Fu9RPQ96uGTd88)
extra_gated_button_content: agree to PLaMo-100B license

PLaMo-100B

Model Description

PLaMo-100B is a 100B-parameter model pre-trained on English and Japanese open datasets, developed by Preferred Elements, Inc. It is released under both commercial and non-commercial licenses. For non-commercial use, please check the LICENSE, which is available in both Japanese and English. For commercial use, please contact us via this form (Japanese only).

NOTE: This model has NOT been instruction-tuned for chat dialog or other downstream tasks. We provide an instruction-tuned version of PLaMo-100B via our API and solution packages. Please check our official PLaMo website (Japanese only) for details.

Usage

Requirements

  • numpy
  • sentencepiece
  • torch
  • transformers

Use a pipeline as a high-level helper

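import transformers

# trust_remote_code=True allows transformers to run the custom model code shipped in the model repository.
pipeline = transformers.pipeline("text-generation", model="pfnet/plamo-100b", trust_remote_code=True)
print(pipeline("The future of artificial intelligence technology is ", max_new_tokens=32))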

Load model directly

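from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; both rely on custom code from the model repository.
tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-100b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("pfnet/plamo-100b", trust_remote_code=True)

# Japanese prompt: "The future of artificial intelligence technology is"
text = "これからの人工知能技術は"
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Sample up to 32 new tokens; [0] selects the single sequence in the batch.
generated_tokens = model.generate(
    inputs=input_ids,
    max_new_tokens=32,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=1.0,
)[0]

# The decoded text contains the prompt followed by the generated continuation.
generated_text = tokenizer.decode(generated_tokens)
print(generated_text)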
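
The sampling arguments passed to generate above (temperature, top_k, top_p) can be illustrated with a small, library-free sketch. This is a simplified illustration of the general technique, not the transformers implementation; the function name filter_top_k_top_p and the toy logits are hypothetical and exist only for this example.

```python
import math
import random


def filter_top_k_top_p(logits, top_k=50, top_p=0.95, temperature=1.0):
    """Return {token_id: probability} after temperature, top-k, then top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable token ids.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = ranked[:top_k]
    kept_mass = sum(probs[i] for i in kept)

    # Top-p (nucleus): keep the smallest prefix of the ranked tokens whose
    # renormalized cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for i in kept:
        nucleus.append(i)
        cumulative += probs[i] / kept_mass
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so their probabilities sum to 1.
    final_mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / final_mass for i in nucleus}


def sample_token(distribution, rng=random):
    """Draw one token id from a {token_id: probability} mapping."""
    ids = list(distribution)
    weights = [distribution[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Toy vocabulary of four tokens (hypothetical logits).
toy_logits = [2.0, 1.0, 0.0, -1.0]
dist = filter_top_k_top_p(toy_logits, top_k=3, top_p=0.9, temperature=1.0)
print(sorted(dist))  # token ids that survive filtering
print(sample_token(dist))
```

Lower values of top_k or top_p restrict sampling to fewer high-probability tokens, while a temperature below 1.0 sharpens the distribution before filtering.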

Model Details

  • Model size: 100B
  • Trained tokens: 2T tokens (English: 1.3T tokens, Japanese: 0.7T tokens)
  • Developed by: Preferred Elements, Inc.
  • Model type: Causal decoder-only
  • Language(s): English, Japanese
  • License: Commercial and Non-Commercial
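
As a rough guide to hardware planning, the memory needed just to hold the weights can be estimated from the model size listed above. The sketch below assumes a nominal 100 billion parameters (the exact count may differ) and common bytes-per-parameter values; it ignores activations, the KV cache, and framework overhead. The helper name weight_memory_gb is hypothetical.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

# Assumption: "100B" is treated as exactly 100 billion parameters.
NOMINAL_PARAMS = 100_000_000_000


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: about {weight_memory_gb(NOMINAL_PARAMS, dtype):.0f} GB")
```

Under these assumptions the weights alone occupy several hundred gigabytes in 16-bit precision, so the model generally has to be spread across multiple accelerators or offloaded.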

Training Dataset

We trained PLaMo-100B in two phases: phase 1 with 1.5T tokens and phase 2 with 0.5T tokens. The share of each dataset in each phase is shown in the following table.

| Dataset | 1.5T (phase 1) | 0.5T (phase 2) |
|---|---|---|
| RefinedWeb (English) | 42% | 17% |
| Other English Dataset | 28% | 33% |
| Proprietary CommonCrawl-JP | 18% | 46% |
| Other Japanese Dataset | 12% | 4% |
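
The percentages in the table can be converted into approximate token counts per language. The short sketch below does this arithmetic using only the figures given in this README; the variable and function names are illustrative.

```python
# Tokens per phase, in trillions, as stated above.
PHASE_TOKENS_T = {"phase1": 1.5, "phase2": 0.5}

# (language, dataset) -> percentage of each phase, copied from the table.
MIX_PERCENT = {
    ("English", "RefinedWeb (English)"): {"phase1": 42, "phase2": 17},
    ("English", "Other English Dataset"): {"phase1": 28, "phase2": 33},
    ("Japanese", "Proprietary CommonCrawl-JP"): {"phase1": 18, "phase2": 46},
    ("Japanese", "Other Japanese Dataset"): {"phase1": 12, "phase2": 4},
}


def tokens_by_language():
    """Sum tokens (in trillions) per language across both phases."""
    totals = {}
    for (language, _dataset), shares in MIX_PERCENT.items():
        for phase, percent in shares.items():
            tokens = PHASE_TOKENS_T[phase] * percent / 100
            totals[language] = totals.get(language, 0.0) + tokens
    return totals


for language, tokens in tokens_by_language().items():
    print(f"{language}: {tokens:.2f}T tokens")
```

The totals come to about 1.3T English tokens and 0.7T Japanese tokens, matching the figures listed under Model Details.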

Tokenizer

PLaMo-100B uses a SentencePiece tokenizer trained on a subset of the model's pre-training datasets.

Tech Blog

https://tech.preferred.jp/ja/blog/plamo-100b/

Bias, Risks, and Limitations

PLaMo-100B is a new technology that carries risks with use. Testing conducted to date has been in English and Japanese, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, PLaMo-100B’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of PLaMo-100B, developers should perform safety testing and tuning tailored to their specific applications of the model.

How to cite

@article{plamo100b,
    author    = {Preferred Elements, Inc. and Kenshin Abe and Kaizaburo Chubachi and Yasuhiro Fujita and Yuta Hirokawa and Kentaro Imajo and Toshiki Kataoka and Hiroyoshi Komatsu and Hiroaki Mikami and Tsuguo Mogami and Shogo Murai and Kosuke Nakago and Daisuke Nishino and Toru Ogawa and Daisuke Okanohara and Yoshihiko Ozaki and Shotaro Sano and Shuji Suzuki and Tianqi Xu and Toshihiko Yanase},
    title     = {PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency},
    year      = {2024},
    url       = {https://arxiv.org/abs/2410.07563},
    journal   = {arXiv}
}

Acknowledgement

This model was trained under the project “Research and Development Project of the Enhanced Infrastructures for Post 5G Information and Communication System” (JPNP 20017), subsidized by the New Energy and Industrial Technology Development Organization (NEDO).