Release of LLM-jp-3 MoE Series
The Research and Development Center for Large Language Models (LLMC) at the National Institute of Informatics is dedicated to developing open, Japanese-optimized large language models. Since September 2024, we have been releasing the “LLM-jp-3” series, trained using llm-jp-corpus v3. So far, we have released models with 150M, 440M, 980M, 1.8B, 3.7B, 7.2B, 13B, and 172B parameters.
We are excited to announce the release of two new models from LLM-jp, the first in our MoE (Mixture of Experts) model series: 8x1.8B and 8x13B. Both models are trained on llm-jp-corpus v3.
In evaluations using llm-jp-eval (v1.4.1) and Japanese MT Bench, the 8x1.8B model, with 9.2B total parameters and 2.9B active parameters, achieves performance comparable to the 7.2B dense model. The 8x13B model, with 73B total parameters and 22B active parameters, outperforms the 172B dense model.
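To see how the active-parameter count can be so much smaller than the total, the sketch below works through the arithmetic for a top-k routed MoE: every token passes through the shared weights, but only k of the experts. The shared/expert split, expert count, and top-2 routing here are illustrative assumptions back-solved to match the reported 9.2B/2.9B figures, not the released models' published configuration.

```python
# Illustrative MoE parameter counting (assumed configuration, not the
# exact LLM-jp-3 architecture). In a top-k routed MoE, each token uses
# the shared weights (attention, embeddings, router) plus only k of the
# n_experts feed-forward experts, so the per-token "active" parameter
# count is much smaller than the total.

def moe_param_counts(shared: float, per_expert: float, n_experts: int, top_k: int):
    """Return (total, active) parameter counts for a top-k routed MoE."""
    total = shared + n_experts * per_expert
    active = shared + top_k * per_expert
    return total, active

# Hypothetical split chosen only so the numbers line up with the 8x1.8B
# figures above under an assumed top-2 routing: ~0.8B shared parameters
# and ~1.05B parameters per expert.
total, active = moe_param_counts(shared=0.8e9, per_expert=1.05e9,
                                 n_experts=8, top_k=2)
print(f"total: {total / 1e9:.1f}B, active: {active / 1e9:.1f}B")
# -> total: 9.2B, active: 2.9B
```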
Alongside the model release, we have also published a technical blog detailing the MoE pretraining methods and evaluation results.
Both the base and fine-tuned models are released under the Apache License 2.0. As fully open models—including data and training processes—they are freely available for application development, further fine-tuning, and more.
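As a quick start, the released checkpoints should be loadable with the Hugging Face transformers library. The snippet below is a minimal sketch; the repository ID is an assumption, so check the LLM-jp model hub page for the exact identifiers and any version requirements.

```python
# Minimal sketch of loading and prompting one of the released models
# with Hugging Face transformers. The repository ID below is an assumed
# placeholder; verify the exact name on the LLM-jp Hugging Face page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/llm-jp-3-8x1.8b-instruct"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # MoE checkpoints are large; bf16 halves memory
    device_map="auto",           # spread weights across available GPUs
)

inputs = tokenizer("自然言語処理とは", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```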
For more details on the released resources, please see the links below:
- Models