Inkuntri
Japanese Research, tools & pedagogy

Designing a Japanese Error Corpus From Your Own Mistakes

The reader can build a personal Japanese error corpus from writing, speaking, translation, and comprehension mistakes, then turn it into targeted practice.

Published January 30, 2026 Japanese

Core examples: 誤用, 添削, 助詞, 敬語, コロケーション, 直訳, 自他動詞, 漢字ミス, レジスター, 発音, 修正, 再発.

Your mistakes are not shame. They are data.

A learner gets corrected:

❌ ご確認してください。 ✅ ご確認ください。 / 確認してください。

They fix the sentence and move on.

Two weeks later:

❌ ご返信してください。 ✅ ご返信ください。 / 返信してください。

This is not two separate mistakes. It is a pattern. The learner has not learned how honorific prefixes interact with する verbs and request forms.

That is why an error corpus matters.

The key principle is:

Repeated errors are a curriculum trying to reveal itself.

Do not merely collect corrections. Classify them, revisit them, and convert them into drills.

What is an error corpus?

An error corpus is a structured collection of mistakes.

A personal Japanese error corpus should include:

  • original mistaken sentence,
  • intended meaning,
  • corrected version,
  • context,
  • source task,
  • error type,
  • severity,
  • explanation,
  • native/corpus examples,
  • practice prompt,
  • recurrence count,
  • retest date.

A notebook of corrections is passive. A corpus is diagnostic.

誤用

誤用 means incorrect use or misuse.

Examples of 誤用:

❌ 彼に本を読んだ。 ✅ 彼が本を読んだ。 (particle misuse; intended meaning: “He read a book.”)

❌ ご確認してください。 ✅ ご確認ください。 (keigo/request-form misuse)

雨が強いです。 Not wrong, but depending on context 雨が激しい or 雨が強く降っています may be more natural. (collocation/register issue)

Learner action: every error needs a type, not just a correction.

添削

添削 means correction of writing.

添削 may come from:

  • teacher,
  • tutor,
  • native speaker,
  • editor,
  • AI with verification,
  • language exchange partner,
  • self-correction after dictionary/corpus check.

Not all corrections are equal. A casual native speaker may fix naturalness but not explain. A teacher may explain grammar. An AI may be useful but must be verified for subtle Japanese.

Learner action: record source of correction.

助詞

助詞 means particles.

Particle errors are common and diagnostic.

Examples:

❌ 東京を住んでいます。 ✅ 東京に住んでいます。

❌ 先生で聞きました。 ✅ 先生に聞きました。

❌ 日本語を上手です。 ✅ 日本語が上手です。

Corpus fields:

  • mistaken particle,
  • correct particle,
  • verb/adjective involved,
  • pattern,
  • example.

Learner action: particle errors should be grouped by predicate.
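Grouping by predicate can be sketched in a few lines. The snippet below is a minimal illustration, not a prescribed tool: the tuple layout and the function name `group_by_predicate` are assumptions, and the fourth row is a made-up extra entry added only to show two errors landing under the same verb.

```python
from collections import defaultdict

# Hypothetical log rows: (mistaken particle, correct particle, predicate).
particle_errors = [
    ("を", "に", "住む"),    # 東京を住んでいます
    ("で", "に", "聞く"),    # 先生で聞きました
    ("を", "が", "上手だ"),  # 日本語を上手です
    ("で", "に", "住む"),    # invented extra row for illustration
]


def group_by_predicate(errors):
    """Map each predicate to its list of (mistaken, correct) particle pairs."""
    groups = defaultdict(list)
    for wrong, correct, predicate in errors:
        groups[predicate].append((wrong, correct))
    return dict(groups)


grouped = group_by_predicate(particle_errors)
for predicate, pairs in grouped.items():
    print(predicate, pairs)
```

A predicate that collects several different mistaken particles is a good candidate for a sentence drill built around that one verb or adjective.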

敬語

敬語 means honorific language.

Common learner errors:

❌ ご確認してください ✅ ご確認ください

❌ 先生が参ります ✅ 先生がいらっしゃいます

❌ 私がいらっしゃいます ✅ 私が参ります / 伺います

Keigo errors often show confusion about who is honored.

Error type fields:

  • respectful form,
  • humble form,
  • polite form,
  • subject honored,
  • listener respected,
  • domain: business/service/school.

Learner action: keigo errors need relationship notes.

コロケーション

コロケーション means collocation.

Collocation errors are easy to miss because the sentence can be grammatically correct yet still sound unnatural.

Examples:

❌ 確認を作る ✅ 確認する

❌ 写真を取る ✅ 写真を撮る

強い雨が降る is not wrong (強い雨 does occur), but 激しい雨 or 雨が強く降る may fit some contexts better.

A corpus should record the natural word pair.

Learner action: collocation errors require example mining.

直訳

直訳 means literal translation.

Translationese errors come from English-shaped Japanese.

Example:

私はそれが良いアイデアだと思います。 Possible, but often over-explicit.

Natural alternatives by context:

いいと思います。
それは良い案だと思います。
その案でよいと思います。

直訳 errors often include unnecessary pronouns, unnatural word order, or English idiom transfer.

Learner action: record the English source thought if relevant.

自他動詞

自他動詞 means intransitive and transitive verbs.

Examples:

ドアが開く the door opens

ドアを開ける open the door

Common errors:

❌ ドアを開いた (read あいた) ✅ ドアを開けた (開く read ひらく can take を, so record the reading you intended)

❌ 電気が消した ✅ 電気を消した / 電気が消えた

A corpus should group pairs:

Intransitive | Transitive
開く | 開ける
閉まる | 閉める
消える | 消す
入る | 入れる
出る | 出す

Learner action: practice pairs in context.
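One way to practice pairs in context is to generate a state frame and an action frame from the same noun. This is only a sketch: the dictionary repeats the five pairs listed above, and the helper name `drill_frames` is an assumption.

```python
# Pairs from the list above: intransitive -> transitive.
PAIRS = {
    "開く": "開ける",
    "閉まる": "閉める",
    "消える": "消す",
    "入る": "入れる",
    "出る": "出す",
}


def drill_frames(noun, intransitive):
    """Return (state frame with が, action frame with を) for one verb pair."""
    transitive = PAIRS[intransitive]
    return (f"{noun}が{intransitive}", f"{noun}を{transitive}")


print(drill_frames("ドア", "開く"))
print(drill_frames("電気", "消える"))
```

The frames are dictionary forms only; conjugating them and extending them into full sentences is left to the learner.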

漢字ミス

漢字ミス means kanji mistake.

Types:

  • wrong character,
  • wrong homophone,
  • missing okurigana,
  • wrong variant,
  • lookalike confusion,
  • overuse of kanji,
  • kanji where kana is standard.

Examples:

❌ 以外と ✅ 意外と

❌ 保障 / 保証 / 補償 confusion

❌ 取扱い / 取り扱い style inconsistency

Learner action: record why the wrong kanji seemed plausible.

レジスター

レジスター means register.

A sentence can be correct but wrong for context.

Examples:

❌ お世話になっております。 to a close friend in a casual chat.

❌ これ、マジでヤバいです。 in a formal report.

❌ 何卒よろしく! mixed formal/casual.

Register error fields:

  • intended setting,
  • actual phrase,
  • better phrase,
  • formality mismatch,
  • domain.

Learner action: register errors are usage errors, not grammar errors.

発音

発音 means pronunciation.

Pronunciation errors can be part of an error corpus too.

Fields:

  • word/phrase,
  • target feature,
  • recording link/date,
  • model source,
  • error category,
  • corrected attempt.

Examples:

長音 missing: おばさん / おばあさん

促音 missing: きて / きって

pitch error: 雨 / 飴

Learner action: audio mistakes need recordings, not just notes.

修正

修正 means correction or revision.

A correction entry should include the corrected version and the rule or pattern.

Bad entry:

Mistake: ご確認してください. Correct: ご確認ください.

Better entry:

Error type: keigo/request.
Explanation: ご + する-verb noun + ください can form a polite request; do not add してください after ご確認.
Related: ご返信ください, ご記入ください, ご提出ください.
Drill: Rewrite five “Please X” requests.
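Once the rule is written down, a rough check can flag the pattern in new drafts. The regular expression below is a heuristic sketch, not a keigo parser: it assumes the noun is exactly two kanji and follows ご or お directly, so it will miss other shapes and should not be trusted without a human check. The function name `suggest_fix` is an assumption.

```python
import re

# Heuristic only: honorific prefix + exactly two kanji + してください.
# The kanji range and the two-character limit are simplifying assumptions.
PATTERN = re.compile(r"([ごお])([一-龯]{2})してください")


def suggest_fix(sentence):
    """Return the sentence with してください shortened to ください, or None if no match."""
    if not PATTERN.search(sentence):
        return None
    return PATTERN.sub(r"\1\2ください", sentence)


print(suggest_fix("ご確認してください。"))
print(suggest_fix("確認してください。"))
```

A sentence without the honorific prefix returns `None`, because 確認してください is already a valid request.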

再発

再発 means recurrence or relapse.

In an error corpus, 再発 means the same error pattern appears again.

A repeated error should be counted.

Example:

Date | Error | Type | Recurrence
May 1 | ご確認してください | 敬語 | 1
May 15 | ご返信してください | 敬語 | 2
May 28 | ご記入してください | 敬語 | 3

After recurrence 3, stop merely correcting. Create a micro-drill.
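Counting recurrence is easy to automate. The sketch below uses the three keigo rows from the example plus one invented particle row, and counts by error type; in a real corpus a finer subtype tag would likely separate patterns better. The names `DRILL_THRESHOLD` and `patterns_needing_drill` are assumptions.

```python
from collections import Counter

DRILL_THRESHOLD = 3  # from the rule above: at recurrence 3, create a micro-drill

# (date, error, type); the last row is invented to show a type below threshold.
log = [
    ("May 1", "ご確認してください", "敬語"),
    ("May 15", "ご返信してください", "敬語"),
    ("May 28", "ご記入してください", "敬語"),
    ("May 28", "東京を住んでいます", "助詞"),
]


def patterns_needing_drill(rows, threshold=DRILL_THRESHOLD):
    """Return the error types that have reached the drill threshold."""
    counts = Counter(error_type for _, _, error_type in rows)
    return [t for t, n in counts.items() if n >= threshold]


print(patterns_needing_drill(log))
```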

Severity

Not all errors deserve equal attention.

Severity | Example | Action
high | medical/legal/safety meaning wrong | immediate correction and caution
high | rude register in formal email | targeted practice
medium | particle error affects meaning | pattern drill
medium | collocation unnatural | example mining
low | stylistic awkwardness | note and expose
low | rare kanji issue | defer unless recurring

Learner action: prioritize errors by consequence and recurrence.
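Prioritizing by consequence and recurrence amounts to a two-key sort. The following is a minimal sketch with invented sample entries; the dictionary keys and the choice to rank severity before recurrence are assumptions, not a rule from this article.

```python
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def prioritize(entries):
    """Sort by severity first, then by recurrence (most repeated first)."""
    return sorted(
        entries,
        key=lambda e: (SEVERITY_RANK[e["severity"]], -e["recurrence"]),
    )


# Invented sample data for illustration.
entries = [
    {"pattern": "rare kanji issue", "severity": "low", "recurrence": 1},
    {"pattern": "ご〜してください", "severity": "high", "recurrence": 3},
    {"pattern": "まで vs までに", "severity": "medium", "recurrence": 2},
    {"pattern": "unnatural collocation", "severity": "medium", "recurrence": 4},
]
ordered = [e["pattern"] for e in prioritize(entries)]
print(ordered)
```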

Error corpus template

Use a spreadsheet or database with these columns:

  1. ID.
  2. Date.
  3. Source task.
  4. Original sentence.
  5. Intended meaning.
  6. Corrected sentence.
  7. Error type.
  8. Subtype.
  9. Context/register.
  10. Explanation.
  11. Native examples.
  12. Practice prompt.
  13. Recurrence count.
  14. Retest date.
  15. Resolved? yes/no.
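If you prefer code to a spreadsheet, the fifteen columns map naturally onto a small record type. The sketch below is one possible layout, not a required schema: the attribute names are illustrative, and the values marked as placeholders are invented. The sample entry reuses the まで/までに correction discussed in this article, and the CSV output can be opened in any spreadsheet.

```python
import csv
import datetime as dt
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ErrorEntry:
    """One row of the corpus; attribute names are illustrative."""
    id: int
    date: dt.date
    source_task: str
    original: str
    intended_meaning: str
    corrected: str
    error_type: str
    subtype: str
    context_register: str
    explanation: str
    native_examples: str
    practice_prompt: str
    recurrence_count: int = 1
    retest_date: Optional[dt.date] = None
    resolved: bool = False


def to_csv(entries):
    """Serialize entries to CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ErrorEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


logged = dt.date(2026, 1, 30)  # placeholder date
entry = ErrorEntry(
    id=1,
    date=logged,
    source_task="writing",  # placeholder
    original="来週までレポートを提出します。",
    intended_meaning="I will submit the report by next week.",
    corrected="来週までにレポートを提出します。",
    error_type="助詞",
    subtype="deadline expression",
    context_register="neutral polite",  # placeholder
    explanation="までに marks a deadline; まで marks continuation until a time.",
    native_examples="期限までに提出してください。 / 5時までに連絡します。",
    practice_prompt="Translate five deadline sentences using までに.",
    retest_date=logged + dt.timedelta(weeks=2),
)
print(to_csv([entry]))
```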

Example entry

Original:

❌ 来週までレポートを提出します。

Intended:

I will submit the report by next week.

Correction:

✅ 来週までにレポートを提出します。

Error type:

助詞 / deadline expression

Explanation:

までに marks deadline for completion. まで marks continuation until a time.

Native examples:

期限までに提出してください。
5時までに連絡します。

Practice prompt:

Translate five deadline sentences using までに.

Retest:

two weeks later.

Micro-drills from errors

A micro-drill is small and targeted.

Particle drill

Prompt:

Submit by Friday.

Answer:

金曜日までに提出する。

Keigo drill

Prompt:

Please confirm.

Answer:

ご確認ください。

Collocation drill

Prompt:

take measures

Answer:

対策を講じる / 対策を取る

Register drill

Prompt:

Say “Thanks” to a coworker after shared work.

Answer:

お疲れさまでした / ありがとうございました, depending on context.

Error categories

Category | Examples
助詞 | は/が/を/に/で mistakes
活用 | conjugation
時制・アスペクト | tense/aspect
自他動詞 | transitive/intransitive
敬語 | honorific/humble/polite
コロケーション | unnatural word pairing
直訳 | translationese
漢字ミス | wrong kanji/homophone
語彙選択 | wrong word choice
レジスター | wrong formality/context
語順 | unnatural order
発音 | sound/pitch/timing

Weekly review routine

Once a week:

  1. add new corrections,
  2. tag error types,
  3. sort by recurrence,
  4. pick top 2 patterns,
  5. find 3 native examples each,
  6. create 1 micro-drill per pattern,
  7. retest old patterns,
  8. mark resolved items.

Do not review every error every day. That becomes punishment. Use error data to choose practice.
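The selection steps of the routine (sort by recurrence, pick the top two patterns, retest old ones) can be sketched as a single function. Everything here is illustrative: the dictionary keys, the function name `weekly_plan`, and the sample dates are assumptions.

```python
import datetime as dt


def weekly_plan(patterns, today, top_n=2):
    """Pick the most repeated unresolved patterns and list retests that are due."""
    open_patterns = [p for p in patterns if not p["resolved"]]
    focus = sorted(open_patterns, key=lambda p: p["recurrence"], reverse=True)[:top_n]
    due = [p for p in open_patterns if p["retest"] <= today]
    return [p["name"] for p in focus], [p["name"] for p in due]


# Invented sample data for illustration.
patterns = [
    {"name": "ご〜してください", "recurrence": 3, "retest": dt.date(2026, 2, 10), "resolved": False},
    {"name": "まで/までに", "recurrence": 2, "retest": dt.date(2026, 2, 1), "resolved": False},
    {"name": "写真を取る", "recurrence": 1, "retest": dt.date(2026, 1, 20), "resolved": True},
    {"name": "自他動詞", "recurrence": 1, "retest": dt.date(2026, 2, 3), "resolved": False},
]
focus, due = weekly_plan(patterns, today=dt.date(2026, 2, 3))
print("focus:", focus)
print("retest due:", due)
```

Resolved patterns are skipped entirely, which keeps the weekly list short.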

Example bank walkthrough

  • 誤用 (misuse): record each incorrect use by type.
  • 添削 (correction): note the source and quality of the correction.
  • 助詞 (particles): group errors by predicate and pattern.
  • 敬語 (honorific language): note the relationship and each person's role.
  • コロケーション (collocation): record the natural word pairing.
  • 直訳 (literal translation): identify source-language interference.
  • 自他動詞 (intransitive/transitive verbs): practice the verbs as pairs.
  • 漢字ミス (kanji mistake): note the homophone or lookalike involved.
  • レジスター (register): describe the context mismatch.
  • 発音 (pronunciation): record yourself and compare with a model.
  • 修正 (correction/revision): write down the corrected pattern, not only the sentence.
  • 再発 (recurrence): turn the repeated error into a drill.

Error-corpus workflow

To build your own Japanese error corpus:

  1. Collect mistakes weekly.
  2. Record original and intended meaning.
  3. Add corrected version.
  4. Tag error type.
  5. Write a short explanation.
  6. Add one or two native examples.
  7. Count recurrence.
  8. Create micro-drills for repeated patterns.
  9. Retest after two weeks.
  10. Archive resolved errors.
  11. Keep high-stakes errors visible.

Error type to practice type table

An error corpus becomes useful when each error type triggers the right drill.

Error type | Best remediation
助詞 | predicate-based sentence drills
敬語 | role-map rewrites
コロケーション | native example mining
直訳 | Japanese-first paraphrase practice
自他動詞 | paired action/state drills
漢字ミス | homophone/lookalike contrast cards
レジスター | scenario rewrite ladder
発音 | recording and minimal-pair practice
語順 | sentence skeleton reconstruction
語彙選択 | near-synonym comparison
活用 | focused conjugation cards
読解ミス | source sentence reparse

A correction without a practice type is likely to repeat.
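In a script or app, this table is a plain lookup. The sketch below copies the pairs as given; the fallback message for unknown tags is an assumption added so that untagged errors get classified rather than silently ignored.

```python
REMEDIATION = {
    "助詞": "predicate-based sentence drills",
    "敬語": "role-map rewrites",
    "コロケーション": "native example mining",
    "直訳": "Japanese-first paraphrase practice",
    "自他動詞": "paired action/state drills",
    "漢字ミス": "homophone/lookalike contrast cards",
    "レジスター": "scenario rewrite ladder",
    "発音": "recording and minimal-pair practice",
    "語順": "sentence skeleton reconstruction",
    "語彙選択": "near-synonym comparison",
    "活用": "focused conjugation cards",
    "読解ミス": "source sentence reparse",
}


def practice_for(error_type):
    """Look up the drill type; unknown tags are flagged rather than guessed."""
    return REMEDIATION.get(error_type, "untagged: classify before drilling")


print(practice_for("敬語"))
print(practice_for("???"))
```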

Feedback-source reliability table

Not all corrections should be treated the same way.

Source | Strength | Caution
trained teacher | explanation and pedagogy | may simplify
professional editor | natural written Japanese | may not explain
tutor/native speaker | naturalness feedback | may be intuitive only
language partner | conversational correction | uneven accuracy
corpus examples | usage evidence | requires interpretation
dictionary | sense and collocation | not full context
AI feedback | quick pattern suggestions | verify carefully
self-correction | builds awareness | blind spots remain

Record the source of each correction so the corpus remains trustworthy.

Recurrence dashboard

A useful corpus should surface repeated patterns.

Recurrence count | Action
1 | note and move on
2 | add one native example
3 | create micro-drill
4–5 | suspend related production until retrained
6+ | ask teacher/tutor for diagnosis

The goal is not to shame the learner. The goal is to stop treating the same pattern as a new surprise.
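The dashboard thresholds translate directly into a small function. This is a sketch of that mapping only; the function name `action_for` and the error raised for counts below 1 are assumptions.

```python
def action_for(recurrence):
    """Map a recurrence count to the dashboard action."""
    if recurrence < 1:
        raise ValueError("recurrence count starts at 1")
    if recurrence == 1:
        return "note and move on"
    if recurrence == 2:
        return "add one native example"
    if recurrence == 3:
        return "create micro-drill"
    if recurrence <= 5:
        return "suspend related production until retrained"
    return "ask teacher/tutor for diagnosis"


for count in (1, 3, 5, 6):
    print(count, action_for(count))
```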

Resolved error criteria

Mark an error pattern “resolved” only when you can:

  1. explain the issue in plain language,
  2. recognize the correct pattern in real input,
  3. produce the corrected form in a controlled drill,
  4. avoid the error in new writing/speaking,
  5. pass a retest after at least two weeks.

A corrected sentence is not the same as a resolved pattern.
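These criteria can be checked mechanically once each one is recorded as a yes/no flag. The sketch below assumes four hypothetical flag names for criteria 1 to 4 and treats criterion 5 as a date comparison; none of these names come from an existing tool.

```python
import datetime as dt

# Hypothetical flag names for criteria 1-4.
CRITERIA = (
    "can_explain",
    "recognizes_in_input",
    "produces_in_drill",
    "avoids_in_new_output",
)
MIN_RETEST_GAP = dt.timedelta(weeks=2)


def is_resolved(checks, corrected_on, retest_passed_on):
    """True only if every flag is set and a retest passed at least two weeks later."""
    if retest_passed_on is None:
        return False
    if retest_passed_on - corrected_on < MIN_RETEST_GAP:
        return False
    return all(checks.get(name, False) for name in CRITERIA)


all_yes = {name: True for name in CRITERIA}
print(is_resolved(all_yes, dt.date(2026, 1, 30), dt.date(2026, 2, 13)))
print(is_resolved(all_yes, dt.date(2026, 1, 30), dt.date(2026, 2, 5)))
```

A missing flag counts as "no", so a pattern cannot be marked resolved by omission.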

A tool built around this workflow would turn mistakes into practice.

Suggested functions:

  1. Error-entry form.
  2. Tag system for Japanese-specific error types.
  3. Recurrence counter.
  4. Severity rating.
  5. Native-example slot.
  6. Micro-drill generator.
  7. Retest reminder.
  8. Resolved-pattern archive.

Final rule

A mistake corrected once is a note. A mistake tracked over time is a learning system.

誤用 shows what failed. 添削 gives correction. 助詞, 敬語, コロケーション, 直訳, 自他動詞, 漢字ミス, レジスター, and 発音 tell the error type. 修正 gives the model. 再発 tells you what to train.

Stop being embarrassed by mistakes. Make them work.
