Speech recognition for Japanese

Why not?

Prepare an app for recording

  1. Keep showing sentences to read
  2. Let me record my voice
  3. Upload the recording to the server
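
The three steps above can be sketched roughly as follows. This is only an illustration, not the actual app: the endpoint `UPLOAD_URL`, the WAV format, the `sentence` query parameter, and the names `build_upload_request` and `recording_session` are all assumptions, and the recording itself is left to a caller-supplied function.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical endpoint; replace with the real server address.
UPLOAD_URL = "https://example.com/recordings"


def build_upload_request(sentence, wav_bytes):
    """Build (but do not send) a POST request carrying one recording."""
    # Assumption: the sentence travels in the query string so the
    # server can store it as the label for this audio.
    url = f"{UPLOAD_URL}?sentence={quote(sentence)}"
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def recording_session(sentences, record):
    """Show each sentence, record it, and yield a request ready to upload."""
    for sentence in sentences:
        print(sentence)       # step 1: show the sentence
        wav_bytes = record()  # step 2: caller-supplied recording function
        yield build_upload_request(sentence, wav_bytes)  # step 3
```

Each yielded request could then be sent with `urllib.request.urlopen(request)` once a real server exists.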

Input data and labels

あいうえお かきくけこ さしすせそ たちつてと なにぬねの はひふへほ まみむめも やゆよ らりるれろ わをん ぁぃぅぇぉ ゃゅょ ゎ っ がぎぐげご ざじずぜぞ だぢづでど ばびぶべぼ ぱぴぷぺぽ ゔ ー <Space>
  1. Some characters share the same sound.
  2. A comma is sometimes used only for clarity, and in that case we don’t take a breathing pause. Where to pause is ultimately up to the reader.
  3. The younger generation uses a character to express prolonged sounds, but this is not traditional Japanese.
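
As a minimal sketch of how the character set above could be turned into training labels, the snippet below gives each character an integer ID and encodes a Hiragana sentence into a list of IDs. The names `LABELS`, `SPACE`, `encode` and `decode` are hypothetical, and treating an ASCII or full-width space in the text as `<Space>` is an assumption, not necessarily how the recordings were labeled.

```python
# Hypothetical label table: every Hiragana character listed above plus <Space>.
HIRAGANA = (
    "あいうえお" "かきくけこ" "さしすせそ" "たちつてと" "なにぬねの"
    "はひふへほ" "まみむめも" "やゆよ" "らりるれろ" "わをん"
    "ぁぃぅぇぉ" "ゃゅょ" "ゎ" "っ"
    "がぎぐげご" "ざじずぜぞ" "だぢづでど" "ばびぶべぼ" "ぱぴぷぺぽ"
    "ゔ" "ー"
)
SPACE = "<Space>"
LABELS = list(HIRAGANA) + [SPACE]
CHAR_TO_ID = {ch: i for i, ch in enumerate(LABELS)}


def encode(text):
    """Convert a Hiragana sentence into a list of label IDs."""
    ids = []
    for ch in text:
        if ch in (" ", "\u3000"):  # ASCII or full-width space marks a pause
            ids.append(CHAR_TO_ID[SPACE])
        elif ch in CHAR_TO_ID:
            ids.append(CHAR_TO_ID[ch])
        else:
            raise ValueError(f"character not in label set: {ch!r}")
    return ids


def decode(ids):
    """Convert label IDs back into text, writing <Space> as a plain space."""
    return "".join(" " if LABELS[i] == SPACE else LABELS[i] for i in ids)
```

Raising an error on unknown characters is a deliberate choice here: it surfaces Kanji or Katakana that slipped into a transcript instead of silently dropping them.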

Record voice



  1. The data is definitely not enough. I may need 100x more.
  2. Where to take a breathing pause is tricky. I’m not sure whether I should include <space>, which represents a breathing pause, at all. Or should I have added <space> manually after recording?
  3. I used Hiragana for the labels, but there are exceptions, and unfortunately some Hiragana characters share the same pronunciation.
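
One way to experiment with points 2 and 3 is to normalize the labels before training. The sketch below merges Hiragana pairs that sound the same in standard Japanese (ぢ/じ, づ/ず, を/お) and can optionally drop the pause marker. The merge table, the name `normalize_label` and the `keep_pauses` flag are assumptions for illustration, not what the current model does. It also assumes pauses are written as spaces in the transcript, and it does not handle context-dependent readings such as the particle は.

```python
# Assumed merge table: characters pronounced alike in standard Japanese.
HOMOPHONES = str.maketrans({"ぢ": "じ", "づ": "ず", "を": "お"})


def normalize_label(text, keep_pauses=True):
    """Merge same-sounding Hiragana and optionally remove pause markers."""
    merged = text.translate(HOMOPHONES)
    if not keep_pauses:
        # Drop both ASCII and full-width spaces used as pause markers.
        merged = merged.replace(" ", "").replace("\u3000", "")
    return merged
```

Training once with `keep_pauses=True` and once with `keep_pauses=False` would be a simple way to check whether the pause label helps or hurts.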


I will develop a better model using your suggestions and more data, and report the results on this blog.




