Release of LLM-jp-3 172B beta2

Achievements

The Research and Development Center for Large Language Models (LLMC) at the National Institute of Informatics has been developing “LLM-jp-3 172B,” an open GPT-3-class large language model optimized for Japanese.

We are pleased to announce the release of “LLM-jp-3 172B beta2,” a model trained on approximately 1.4 trillion tokens, about two-thirds of our target dataset. This release also includes an instruction-tuned variant, together with the newly created synthetic data used for the instruction tuning.
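
For readers who want to try the instruction-tuned model, below is a minimal sketch using the Hugging Face Transformers library. The repository ID shown is an assumption for illustration only; please check the official release links for the actual model ID and recommended settings. Note that a 172B-parameter model requires multiple GPUs or offloading to run.

```python
# Minimal sketch: loading the instruction-tuned model with Hugging Face
# Transformers. The model ID below is hypothetical; verify it against the
# official release page before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/llm-jp-3-172b-beta2-instruct"  # assumed ID, verify before use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half-precision to reduce memory footprint
    device_map="auto",            # requires `accelerate`; shards across GPUs
)

# Build a chat-style prompt via the tokenizer's chat template, if provided.
messages = [{"role": "user", "content": "自然言語処理とは何ですか?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```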

In an evaluation with “llm-jp-eval v1.4.1,” a framework that assesses models across a broad range of existing Japanese language resources, the instruction-tuned model scored 0.547, surpassing the 0.538 achieved by GPT-3.5 (gpt-35-turbo-16k-0613).

For more details on the models and data, please visit the following links: