Release of LLM-jp-3 Series Instruct2 and Instruct3

Achievements

The Research and Development Center for Large Language Models (LLMC) at the National Institute of Informatics is dedicated to developing open, Japanese-optimized large language models. Since September 2024, we have been releasing the “LLM-jp-3” series, trained using llm-jp-corpus v3. So far, we have released models with 1.8B, 3.7B, 13B, and 172B parameters.

In this release, we are adding base models with 150M, 440M, 980M, and 7.2B parameters, trained on the same pre-training corpus. We are also releasing “Instruct2” models, supervised fine-tuned (SFT) versions of all LLM-jp-3 base models, including the previously released 1.8B, 3.7B, and 13B models. On top of that, we are introducing “Instruct3” models, which apply Direct Preference Optimization (DPO) to the Instruct2 models.

The “Instruct2” models offer improved utility over the previously released “Instruct” models for the 1.8B, 3.7B, and 13B sizes. The “Instruct3” models go further, significantly enhancing safety while maintaining nearly the same level of utility as Instruct2.

With this release, we now offer eight models of different parameter sizes (150M, 440M, 980M, 1.8B, 3.7B, 7.2B, 13B, and 172B), all trained on the same pre-training and fine-tuning datasets.

Alongside the model releases, we are also making publicly available the datasets and code used for fine-tuning. Additionally, we have published technical blog posts that provide an overview of the Instruct2 and Instruct3 models, as well as evaluation results for each model.
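
As a minimal illustration of how the released fine-tuning data can be accessed, the sketch below pulls one of the SFT datasets listed at the end of this post from the Hugging Face Hub. It assumes the `datasets` library is installed, that the repository is not gated, and that it exposes a “train” split; check each dataset card for the actual configuration and field names.

```python
# Hedged sketch: load one of the released SFT datasets from the Hugging Face Hub.
# Assumes the repository is accessible without gating and exposes a "train" split;
# confirm the split name and field layout on the dataset card.
from datasets import load_dataset

sft_data = load_dataset("llm-jp/AnswerCarefully", split="train")
print(sft_data[0])  # inspect one record to see the prompt/response fields
```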

llm-jp/llm-jp-3-172b-instruct2 is released under the same license as llm-jp/llm-jp-3-172b, while all models with 13B parameters or fewer are provided under the Apache License 2.0. As fully open models, with the data and training processes publicly available, they can be freely used for applications, further fine-tuning, and more.
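
As a rough sketch of that kind of use, the example below loads one of the smaller instruction-tuned checkpoints with Hugging Face Transformers and generates a reply to a Japanese prompt. It assumes a recent Transformers release, a GPU with bfloat16 support, and that the tokenizer ships a chat template; the model name, dtype, and sampling parameters are illustrative and can be adjusted freely.

```python
# Hedged inference sketch: swap the model name for any instruct2/instruct3
# checkpoint listed below. Assumes transformers and torch are installed and
# that the tokenizer provides a chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "llm-jp/llm-jp-3-1.8b-instruct3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a single-turn chat prompt via the tokenizer's chat template.
messages = [{"role": "user", "content": "自然言語処理とは何ですか？"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.inference_mode():
    output = model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
    )

print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```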

For more details on the resources released this time, please refer to the links below.

  • Models

  ・Base models
    ・llm-jp/llm-jp-3-150m
    ・llm-jp/llm-jp-3-440m
    ・llm-jp/llm-jp-3-980m
    ・llm-jp/llm-jp-3-7.2b

  ・Instruction-tuned models
    ・llm-jp/llm-jp-3-150m-instruct2
    ・llm-jp/llm-jp-3-150m-instruct3
    ・llm-jp/llm-jp-3-440m-instruct2
    ・llm-jp/llm-jp-3-440m-instruct3
    ・llm-jp/llm-jp-3-980m-instruct2
    ・llm-jp/llm-jp-3-980m-instruct3
    ・llm-jp/llm-jp-3-1.8b-instruct2
    ・llm-jp/llm-jp-3-1.8b-instruct3
    ・llm-jp/llm-jp-3-3.7b-instruct2
    ・llm-jp/llm-jp-3-3.7b-instruct3
    ・llm-jp/llm-jp-3-7.2b-instruct
    ・llm-jp/llm-jp-3-7.2b-instruct2
    ・llm-jp/llm-jp-3-7.2b-instruct3
    ・llm-jp/llm-jp-3-13b-instruct2
    ・llm-jp/llm-jp-3-13b-instruct3
    ・llm-jp/llm-jp-3-172b-instruct2

  • Datasets

  ・Datasets for SFT
    ・llm-jp/wizardlm8x22b-logical-math-coding-sft-ja
    ・llm-jp/FLAN
    ・llm-jp/Synthetic-JP-EN-Coding-Dataset
    ・llm-jp/AnswerCarefully

  ・Datasets for DPO
    ・llm-jp/aya-ja-evol-inst
    ・llm-jp/ac-self-inst

  • Code

  ・llm-jp/instruct3