Release of LLM-jp-3.1 Series Instruct4
The Research and Development Center for Large Language Models (LLMC) at the National Institute of Informatics is dedicated to developing open, Japanese-optimized large language models. Since September 2024, we have been releasing the “LLM-jp-3” series, trained using llm-jp-corpus v3. The releases so far include eight dense models (150M, 440M, 980M, 1.8B, 3.7B, 7.2B, 13B, and 172B) and two Mixture-of-Experts (MoE) models (8×1.8B and 8×13B).
Today, we are excited to announce the release of the new “LLM-jp-3.1” series. This series significantly improves instruction-following capabilities by adding a continuous pre-training stage and improving post-training.
The llm-jp-3.1-1.8b-instruct4 model achieves a Japanese MT-Bench score of 6.30, a substantial improvement over its predecessor llm-jp-3-1.8b-instruct3 (4.64), and even surpasses llm-jp-3-13b-instruct3 (6.21). The mid-sized llm-jp-3.1-13b-instruct4 model outperforms Qwen2.5-14B-Instruct, which has a similar parameter scale. Furthermore, our flagship model llm-jp-3.1-8x13b-instruct4 (73B total parameters, 22B active parameters) exceeds the performance of gpt-4-0613.
Along with this model release, we have published a technical blog detailing the continuous pre-training and post-training strategies behind LLM-jp-3.1, evaluation results for each model, and the datasets used for post-training.
For more details on the resources in this release, please refer to the links below; a brief usage sketch follows the list.
- Model
  - Base Model
  - Instruction-tuned Model
- Dataset
- Technical blog
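
The released models can be tried quickly with the Hugging Face transformers library. The sketch below is illustrative rather than an official recipe: it assumes the repository id llm-jp/llm-jp-3.1-1.8b-instruct4 and that the tokenizer ships with a chat template; adjust the model id and generation settings to your environment.

```python
# Minimal sketch (not from the announcement): load an LLM-jp-3.1 instruct model
# and generate a reply. The repository id below is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/llm-jp-3.1-1.8b-instruct4"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Format the conversation with the tokenizer's chat template, then generate.
messages = [{"role": "user", "content": "自然言語処理とは何ですか？"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern applies to the larger 13B and 8x13B instruct models; only the repository id and the hardware requirements change.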