lfm25-audio-jp-dialect-lora

Hack the Liquid WAY 2026, Track 2 / Team Yamaguchi

A LoRA fine-tune of LFM2.5-Audio-1.5B-JP: an adapter that converts dialect speech (Kansai and Kumamoto dialects) into standard Japanese text.

Performance

Model                               val_loss  Notes
lora_convert_v4 (LoRA, 1000 steps)  0.934     This adapter
full_ft_v1 (Full FT, 800 steps)     0.908     Reference value (not adopted)

Mean CER: 0.40 (test set, 50 samples) / Inference: 1 to 2 seconds (GPU)
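For readers unfamiliar with the metric, the following is a minimal sketch of how a mean character error rate (CER) can be computed. It assumes CER is the character-level Levenshtein distance divided by the reference length, averaged per sample; the card does not state which normalization or averaging the evaluation used, and the function names here are illustrative, not taken from the repository.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """CER of one sample: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def mean_cer(pairs) -> float:
    """Per-sample average CER over (reference, hypothesis) pairs (an assumption)."""
    scores = [cer(r, h) for r, h in pairs]
    return sum(scores) / len(scores)
```

Under this definition, a mean CER of 0.40 corresponds to roughly four character edits per ten reference characters, averaged over the test samples.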

LoRA configuration

  • r=16, alpha=32, dropout=0.05
  • Target layers: q_proj / k_proj / v_proj / out_proj / w1 / w2 / w3 (FFN)
  • Trainable parameters: 11M / 1,464M = 0.76%
  • Training environment: RTX 5090
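The arithmetic behind these settings can be sketched as follows. This assumes the standard LoRA formulation, in which the update is scaled by alpha / r and each adapted linear layer adds r * (d_in + d_out) parameters; the 2048-wide layer in the usage note is a hypothetical shape, not a dimension taken from the model.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_params_per_linear(d_in: int, d_out: int, r: int) -> int:
    """Parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def trainable_fraction(trainable: float, total: float) -> float:
    """Share of parameters that receive gradients."""
    return trainable / total


if __name__ == "__main__":
    print(lora_scaling(16, 32))                      # scaling for r=16, alpha=32
    print(lora_params_per_linear(2048, 2048, 16))    # hypothetical 2048x2048 layer
    print(f"{trainable_fraction(11e6, 1464e6):.2%}") # from the rounded counts
```

With the rounded counts listed above, the fraction comes out at about 0.75%, so the reported 0.76% presumably reflects the unrounded parameter counts.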

Usage

import torch
import json
from pathlib import Path
from peft import LoraConfig, get_peft_model
import safetensors.torch as st
from liquid_audio import LFM2AudioModel, LFM2AudioProcessor, ChatState
from huggingface_hub import snapshot_download

# Download the adapter
adapter_dir = snapshot_download("YujiYamaguchi/lfm25-audio-jp-dialect-lora")

# Load the base model and processor
MODEL_ID = "LiquidAI/LFM2.5-Audio-1.5B-JP"
processor = LFM2AudioProcessor.from_pretrained(MODEL_ID)
model = LFM2AudioModel.from_pretrained(MODEL_ID, device="cuda", dtype=torch.bfloat16)

# Apply LoRA: rebuild the config from adapter_config.json, then load the adapter weights
cfg = json.loads(Path(adapter_dir, "adapter_config.json").read_text())
lora_cfg = LoraConfig(
    r=cfg["r"], lora_alpha=cfg["lora_alpha"],
    target_modules=cfg["target_modules"],
    lora_dropout=cfg.get("lora_dropout", 0.05),
)
model = get_peft_model(model, lora_cfg)
weights = st.load_file(str(Path(adapter_dir, "adapter_model.safetensors")))
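# strict=False: the file holds only adapter tensors; mismatched key names are also skipped silently
model.load_state_dict(weights, strict=False)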
model.eval()
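Because `strict=False` skips tensors whose names do not match, it can be worth confirming that the adapter weights were actually picked up. The helper below is a minimal sketch of that check on plain key lists; the function name and the example key names are hypothetical, not taken from the adapter file.

```python
def split_checkpoint_keys(checkpoint_keys, model_keys):
    """Return (matched, unexpected) checkpoint keys relative to the model's keys."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    matched = sorted(ckpt & model)      # tensors that will be loaded
    unexpected = sorted(ckpt - model)   # tensors that strict=False silently ignores
    return matched, unexpected


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    model_keys = ["layers.0.q_proj.lora_A.default.weight",
                  "layers.0.q_proj.lora_B.default.weight",
                  "layers.0.q_proj.base_layer.weight"]
    ckpt_keys = ["layers.0.q_proj.lora_A.default.weight",
                 "layers.0.q_proj.lora_B.weight"]  # name mismatch
    matched, unexpected = split_checkpoint_keys(ckpt_keys, model_keys)
    print(len(matched), "matched;", len(unexpected), "unexpected:", unexpected)
```

In the usage code above, the same information is available from the return value of `model.load_state_dict(weights, strict=False)`, which exposes `missing_keys` and `unexpected_keys`; a non-empty `unexpected_keys` would indicate adapter tensors that were not applied.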

See the GitHub repository for details.

Training data
