Designing a Japanese Error Corpus From Your Own Mistakes
The reader can build a personal Japanese error corpus from writing, speaking, translation, and comprehension mistakes, then turn it into targeted practice.
Core examples: 誤用, 添削, 助詞, 敬語, コロケーション, 直訳, 自他動詞, 漢字ミス, レジスター, 発音, 修正, 再発.
Your mistakes are not a source of shame. They are data.
A learner gets corrected:
❌ ご確認してください。 ✅ ご確認ください。 / 確認してください。
They fix the sentence and move on.
Two weeks later:
❌ ご返信してください。 ✅ ご返信ください。 / 返信してください。
This is not two separate mistakes. It is a pattern. The learner has not learned how honorific prefixes interact with する verbs and request forms.
That is why an error corpus matters.
The key principle is:
Repeated errors are a curriculum trying to reveal itself.
Do not merely collect corrections. Classify them, revisit them, and convert them into drills.
What is an error corpus?
An error corpus is a structured collection of mistakes.
A personal Japanese error corpus should include:
- original mistaken sentence,
- intended meaning,
- corrected version,
- context,
- source task,
- error type,
- severity,
- explanation,
- native/corpus examples,
- practice prompt,
- recurrence count,
- retest date.
A notebook of corrections is passive. A corpus is diagnostic.
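One way to make the field list above concrete is a small record type. The following is a minimal sketch, not a prescribed schema: the class name `ErrorEntry`, the attribute names, and the sample values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ErrorEntry:
    """One row of a personal error corpus (names are illustrative)."""
    original: str            # the mistaken sentence as produced
    intended_meaning: str    # what the learner meant to say
    corrected: str           # the corrected version
    context: str             # where and to whom it was said or written
    source_task: str         # e.g. "email", "essay", "conversation"
    error_type: str          # e.g. "敬語", "助詞"
    severity: str            # "high", "medium", or "low"
    explanation: str         # the rule or pattern behind the correction
    native_examples: List[str] = field(default_factory=list)
    practice_prompt: str = ""
    recurrence_count: int = 1
    retest_date: Optional[date] = None


# Sample entry built from the opening example; the values are assumptions.
entry = ErrorEntry(
    original="ご確認してください。",
    intended_meaning="Please confirm.",
    corrected="ご確認ください。",
    context="business email",
    source_task="email",
    error_type="敬語",
    severity="medium",
    explanation="Do not add してください after ご確認.",
    native_examples=["ご返信ください。"],
)
```

Keeping recurrence count and retest date on the entry itself is what lets the corpus act as a diagnostic tool rather than a list of fixes.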
誤用
誤用 means incorrect use or misuse.
Examples of 誤用:
❌ 彼に本を読んだ。 ✅ 彼が本を読んだ。 particle misuse (intended: "He read a book"; with に the sentence means "I read a book to him")
❌ ご確認してください。 ✅ ご確認ください。 keigo/request-form misuse
雨が強いです。 Not wrong, but depending on context 雨が激しい / 雨が強く降っています may be more natural. Collocation/register issue.
Learner action: every error needs a type, not just a correction.
添削
添削 means correction of writing.
添削 may come from:
- teacher,
- tutor,
- native speaker,
- editor,
- AI with verification,
- language exchange partner,
- self-correction after dictionary/corpus check.
Not all corrections are equal. A casual native speaker may fix naturalness but not explain. A teacher may explain grammar. An AI may be useful but must be verified for subtle Japanese.
Learner action: record the source of each correction.
助詞
助詞 means particles.
Particle errors are common and diagnostic.
Examples:
❌ 東京を住んでいます。 ✅ 東京に住んでいます。
❌ 先生で聞きました。 ✅ 先生に聞きました。
❌ 日本語を上手です。 ✅ 日本語が上手です。
Corpus fields:
- mistaken particle,
- correct particle,
- verb/adjective involved,
- pattern,
- example.
Learner action: particle errors should be grouped by predicate.
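Grouping by predicate can be sketched in a few lines. This is an illustrative example, assuming each error is stored as a (mistaken particle, correct particle, predicate) tuple; the repeated 住む row is hypothetical and only added to show grouping.

```python
from collections import defaultdict

# (mistaken particle, correct particle, predicate), from the examples above.
particle_errors = [
    ("を", "に", "住む"),
    ("で", "に", "聞く"),
    ("を", "が", "上手だ"),
    ("を", "に", "住む"),  # hypothetical repeat, added to show grouping
]

# Collect every (wrong, right) pair under the predicate it occurred with.
by_predicate = defaultdict(list)
for wrong, right, predicate in particle_errors:
    by_predicate[predicate].append((wrong, right))

for predicate, pairs in by_predicate.items():
    print(predicate, len(pairs), pairs)
```

Seeing two entries under 住む suggests drilling the pattern 〜に住む rather than reviewing the particle に in general.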
敬語
敬語 means honorific language.
Common learner errors:
❌ ご確認してください ✅ ご確認ください
❌ 先生が参ります ✅ 先生がいらっしゃいます
❌ 私がいらっしゃいます ✅ 私が参ります / 伺います
Keigo errors often show confusion about who is honored.
Error type fields:
- respectful form,
- humble form,
- polite form,
- subject honored,
- listener respected,
- domain: business/service/school.
Learner action: keigo errors need relationship notes.
コロケーション
コロケーション means collocation.
Collocation errors are easy to miss because the sentence can be grammatically correct and still unnatural.
Examples:
❌ 確認を作る ✅ 確認する
❌ 写真を取る ✅ 写真を撮る (also a 漢字ミス: same reading, different kanji)
強い雨が降る is natural, but 激しい雨 / 雨が強く降る may fit other contexts better.
A corpus should record the natural word pair.
Learner action: collocation errors require example mining.
直訳
直訳 means literal translation.
Translationese errors come from English-shaped Japanese.
Example:
私はそれが良いアイデアだと思います。 Possible, but often over-explicit.
Natural alternatives by context:
いいと思います。 それは良い案だと思います。 その案でよいと思います。
直訳 errors often include unnecessary pronouns, unnatural word order, or English idiom transfer.
Learner action: record the English source thought if relevant.
自他動詞
自他動詞 means intransitive/transitive verbs.
Examples:
ドアが開く the door opens
ドアを開ける open the door
Common errors:
❌ ドアを開いた (read あいた) ✅ ドアを開けた. Note: read as ひらく, 開く can take を, so this is an error only for the reading あく.
❌ 電気が消した ✅ 電気を消した / 電気が消えた
A corpus should group pairs:
| Intransitive | Transitive |
|---|---|
| 開く | 開ける |
| 閉まる | 閉める |
| 消える | 消す |
| 入る | 入れる |
| 出る | 出す |
Learner action: practice pairs in context.
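The pair table can double as drill data. Below is a minimal sketch, assuming the pairs are kept in a dictionary keyed by the intransitive verb; the function name `drill_prompt` and the prompt wording are hypothetical.

```python
# Pairs copied from the table above: intransitive -> transitive.
PAIRS = {
    "開く": "開ける",
    "閉まる": "閉める",
    "消える": "消す",
    "入る": "入れる",
    "出る": "出す",
}


def drill_prompt(intransitive: str) -> str:
    """Build a contrast prompt for one pair (wording is illustrative)."""
    transitive = PAIRS[intransitive]
    return (
        f"Write one sentence with Nが{intransitive} "
        f"and one with Nを{transitive}."
    )


print(drill_prompt("消える"))
```

Prompting for both members of the pair at once keeps the particle and the verb form tied together, which is where the errors above come from.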
漢字ミス
漢字ミス means kanji mistake.
Types:
- wrong character,
- wrong homophone,
- missing okurigana,
- wrong variant,
- lookalike confusion,
- overuse of kanji,
- kanji where kana is conventional.
Examples:
❌ 以外と ✅ 意外と
❌ 保障 / 保証 / 補償 confusion
❌ 取扱い / 取り扱い style inconsistency
Learner action: record why the wrong kanji seemed plausible.
レジスター
レジスター means register.
A sentence can be correct but wrong for context.
Examples:
❌ お世話になっております。 to a close friend in a casual chat.
❌ これ、マジでヤバいです。 in a formal report.
❌ 何卒よろしく! mixed formal/casual.
Register error fields:
- intended setting,
- actual phrase,
- better phrase,
- formality mismatch,
- domain.
Learner action: register errors are usage errors, not grammar errors.
発音
発音 means pronunciation.
Pronunciation errors can be part of an error corpus too.
Fields:
- word/phrase,
- target feature,
- recording link/date,
- model source,
- error category,
- corrected attempt.
Examples:
長音 missing: おばさん / おばあさん
促音 missing: きて / きって
pitch error: 雨 / 飴
Learner action: audio mistakes need recordings, not just notes.
修正
修正 means correction or revision.
A correction entry should include the corrected version and the rule or pattern.
Bad entry:
Mistake: ご確認してください。 Correct: ご確認ください。
Better entry:
Error type: keigo/request.
Explanation: ご + verbal noun (the noun part of a する verb) + ください forms a polite request; do not add してください after ご確認.
Related: ご返信ください, ご記入ください, ご提出ください.
Drill: Rewrite five “Please X” requests.
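Once a pattern has a written rule, a rough self-check becomes possible. The sketch below is a crude heuristic, not a grammar checker: it assumes the error always surfaces as ご, then one or more kanji, then してください, and the function name `flag_go_shite_kudasai` is hypothetical. It will miss variants such as お〜してください.

```python
import re

# Rough heuristic: ご + one or more kanji (CJK Unified Ideographs,
# U+4E00 to U+9FFF) + してください.
PATTERN = re.compile(r"ご[\u4e00-\u9fff]+してください")


def flag_go_shite_kudasai(sentence: str) -> bool:
    """Return True if the sentence contains the ご〜してください pattern."""
    return PATTERN.search(sentence) is not None


for s in ["ご確認してください。", "ご確認ください。", "確認してください。"]:
    print(s, flag_go_shite_kudasai(s))
```

A check like this is only useful for scanning your own drafts before sending them; the corpus entry and the drill remain the actual remedy.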
再発
再発 means recurrence or relapse.
In an error corpus, 再発 means the same error pattern appears again.
A repeated error should be counted.
Example:
| Date | Error | Type | Recurrence |
|---|---|---|---|
| May 1 | ご確認してください | 敬語 | 1 |
| May 15 | ご返信してください | 敬語 | 2 |
| May 28 | ご記入してください | 敬語 | 3 |
After recurrence 3, stop merely correcting. Create a micro-drill.
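Counting recurrence only works if each entry carries a pattern label, not just the surface sentence. Here is a minimal sketch using the three rows above; the label `ご〜してください` and the constant `DRILL_THRESHOLD` are assumptions for illustration.

```python
from collections import Counter

# (date, error, pattern label); the pattern label is an assumed tag.
log = [
    ("May 1", "ご確認してください", "ご〜してください"),
    ("May 15", "ご返信してください", "ご〜してください"),
    ("May 28", "ご記入してください", "ご〜してください"),
]

DRILL_THRESHOLD = 3  # "after recurrence 3, create a micro-drill"

# Count by pattern, because the surface sentences all differ.
counts = Counter(pattern for _, _, pattern in log)
needs_drill = [p for p, n in counts.items() if n >= DRILL_THRESHOLD]
print(needs_drill)
```

Counting the raw sentences would report three unrelated errors with a count of one each, which is exactly the blind spot the corpus is meant to remove.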
Severity
Not all errors deserve equal attention.
| Severity | Example | Action |
|---|---|---|
| high | medical/legal/safety meaning wrong | immediate correction and caution |
| high | rude register in formal email | targeted practice |
| medium | particle error affects meaning | pattern drill |
| medium | collocation unnatural | example mining |
| low | stylistic awkwardness | note and expose |
| low | rare kanji issue | defer unless recurring |
Learner action: prioritize errors by consequence and recurrence.
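Prioritizing by consequence and recurrence can be approximated with a simple score. This is one possible sketch: the numeric weights, the multiplication, and the recurrence counts in the sample data are all assumptions, not values from a tested method.

```python
# Assumed weights; adjust to taste.
SEVERITY_WEIGHT = {"high": 3, "medium": 2, "low": 1}


def priority(severity: str, recurrence: int) -> int:
    """Higher score means practice this pattern sooner."""
    return SEVERITY_WEIGHT[severity] * recurrence


# (pattern, severity, recurrence); the recurrence counts are hypothetical.
patterns = [
    ("rare kanji issue", "low", 1),
    ("particle error affects meaning", "medium", 2),
    ("rude register in formal email", "high", 2),
]

ranked = sorted(patterns, key=lambda p: priority(p[1], p[2]), reverse=True)
for name, severity, recurrence in ranked:
    print(priority(severity, recurrence), name)
```

Any monotonic combination would do; the point is that a low-severity error has to recur several times before it outranks a high-severity one.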
Error corpus template
Use a spreadsheet or database with these columns:
- ID.
- Date.
- Source task.
- Original sentence.
- Intended meaning.
- Corrected sentence.
- Error type.
- Subtype.
- Context/register.
- Explanation.
- Native examples.
- Practice prompt.
- Recurrence count.
- Retest date.
- Resolved? yes/no.
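If you prefer a plain file to a spreadsheet app, the same columns can be written as CSV. The sketch below uses Python's standard `csv` module; the function name `corpus_to_csv` is hypothetical, and unfilled columns are simply left empty.

```python
import csv
import io

# Column names taken from the list above.
COLUMNS = [
    "ID", "Date", "Source task", "Original sentence", "Intended meaning",
    "Corrected sentence", "Error type", "Subtype", "Context/register",
    "Explanation", "Native examples", "Practice prompt",
    "Recurrence count", "Retest date", "Resolved",
]


def corpus_to_csv(rows) -> str:
    """Serialize a list of dict rows; missing columns become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


text = corpus_to_csv([
    {"ID": "1", "Original sentence": "来週までレポートを提出します。",
     "Error type": "助詞"},
])
print(text)
```

A CSV file opens in any spreadsheet program, so the same data can be sorted by recurrence or filtered by error type without extra tooling.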
Example entry
Original:
❌ 来週までレポートを提出します。
Intended:
I will submit the report by next week.
Correction:
✅ 来週までにレポートを提出します。
Error type:
助詞 / deadline expression
Explanation:
までに marks the deadline by which something is completed. まで marks how long a state or action continues.
Native examples:
期限までに提出してください。 5時までに連絡します。
Practice prompt:
Translate five deadline sentences using までに.
Retest:
two weeks later.
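The retest date is easy to compute at the moment you log the entry. This is a small sketch, assuming a fixed two-week interval; the names `RETEST_INTERVAL` and `retest_date` are illustrative.

```python
from datetime import date, timedelta

RETEST_INTERVAL = timedelta(weeks=2)  # "two weeks later"


def retest_date(logged_on: date) -> date:
    """Return the date on which the pattern should be retested."""
    return logged_on + RETEST_INTERVAL


print(retest_date(date(2024, 5, 1)))  # 2024-05-15
```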
Micro-drills from errors
A micro-drill is small and targeted.
Particle drill
Prompt:
Submit by Friday.
Answer:
金曜日までに提出する。
Keigo drill
Prompt:
Please confirm.
Answer:
ご確認ください。
Collocation drill
Prompt:
take measures
Answer:
対策を講じる / 対策を取る
Register drill
Prompt:
Say “Thanks” to a coworker after shared work.
Answer:
お疲れさまでした / ありがとうございました, depending on context.
Error categories
| Category | Examples |
|---|---|
| 助詞 | は/が/を/に/で mistakes |
| 活用 | conjugation |
| 時制・アスペクト | tense/aspect |
| 自他動詞 | transitive/intransitive |
| 敬語 | honorific/humble/polite |
| コロケーション | unnatural word pairing |
| 直訳 | translationese |
| 漢字ミス | wrong kanji/homophone |
| 語彙選択 | wrong word choice |
| レジスター | wrong formality/context |
| 語順 | unnatural order |
| 発音 | sound/pitch/timing |
Weekly review routine
Once a week:
- add new corrections,
- tag error types,
- sort by recurrence,
- pick top 2 patterns,
- find 3 native examples each,
- create 1 micro-drill per pattern,
- retest old patterns,
- mark resolved items.
Do not review every error every day. That becomes punishment. Use error data to choose practice.
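The "sort by recurrence, pick top 2 patterns" steps of the routine can be sketched as follows. The week of tagged errors is hypothetical, and `top_patterns` is an illustrative name.

```python
from collections import Counter

# Hypothetical week of tagged, unresolved errors (labels are illustrative).
tagged = [
    "ご〜してください",
    "まで/までに",
    "ご〜してください",
    "自他動詞: 消す/消える",
    "まで/までに",
    "ご〜してください",
]


def top_patterns(tags, n=2):
    """Return the n most frequent pattern labels, most frequent first."""
    return [pattern for pattern, _ in Counter(tags).most_common(n)]


print(top_patterns(tagged))
```

Limiting the output to two patterns is the design choice that keeps the weekly review short enough to actually happen.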
Example bank walkthrough
| Term | Meaning | Learner action |
|---|---|---|
| 誤用 | misuse | record each incorrect use by type |
| 添削 | correction | record the source and quality of the correction |
| 助詞 | particle | group errors by predicate and pattern |
| 敬語 | honorific language | note the relationship and roles involved |
| コロケーション | collocation | record the natural word pairing |
| 直訳 | literal translation | identify source-language interference |
| 自他動詞 | transitive/intransitive verbs | practice the pairs |
| 漢字ミス | kanji mistake | note the homophone or lookalike |
| レジスター | register | note the context mismatch |
| 発音 | pronunciation | record and compare |
| 修正 | correction/revision | record the corrected pattern |
| 再発 | recurrence | turn the repeated error into a drill |
Error-corpus workflow
To build your own Japanese error corpus:
- Collect mistakes weekly.
- Record original and intended meaning.
- Add corrected version.
- Tag error type.
- Write a short explanation.
- Add one or two native examples.
- Count recurrence.
- Create micro-drills for repeated patterns.
- Retest after two weeks.
- Archive resolved errors.
- Keep high-stakes errors visible.
Error type to practice type table
An error corpus becomes useful when each error type triggers the right drill.
| Error type | Best remediation |
|---|---|
| 助詞 | predicate-based sentence drills |
| 敬語 | role-map rewrites |
| コロケーション | native example mining |
| 直訳 | Japanese-first paraphrase practice |
| 自他動詞 | paired action/state drills |
| 漢字ミス | homophone/lookalike contrast cards |
| レジスター | scenario rewrite ladder |
| 発音 | recording and minimal-pair practice |
| 語順 | sentence skeleton reconstruction |
| 語彙選択 | near-synonym comparison |
| 活用 | focused conjugation cards |
| 読解ミス | source sentence reparse |
A correction without a practice type is likely to repeat.
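The table above is effectively a lookup, so it can be stored as one. A minimal sketch follows; the fallback message for untagged errors is an assumption, not part of the table.

```python
# Error type -> remediation, copied from the table above.
REMEDIATION = {
    "助詞": "predicate-based sentence drills",
    "敬語": "role-map rewrites",
    "コロケーション": "native example mining",
    "直訳": "Japanese-first paraphrase practice",
    "自他動詞": "paired action/state drills",
    "漢字ミス": "homophone/lookalike contrast cards",
    "レジスター": "scenario rewrite ladder",
    "発音": "recording and minimal-pair practice",
    "語順": "sentence skeleton reconstruction",
    "語彙選択": "near-synonym comparison",
    "活用": "focused conjugation cards",
    "読解ミス": "source sentence reparse",
}


def remediation_for(error_type: str) -> str:
    """Look up the drill type; the fallback text is an assumption."""
    return REMEDIATION.get(error_type, "tag the error type first")


print(remediation_for("敬語"))
```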
Feedback-source reliability table
Not all corrections should be treated the same way.
| Source | Strength | Caution |
|---|---|---|
| trained teacher | explanation and pedagogy | may simplify |
| professional editor | natural written Japanese | may not explain |
| tutor/native speaker | naturalness feedback | may be intuitive only |
| language partner | conversational correction | uneven accuracy |
| corpus examples | usage evidence | requires interpretation |
| dictionary | sense and collocation | not full context |
| AI feedback | quick pattern suggestions | verify carefully |
| self-correction | builds awareness | blind spots remain |
Record the source of each correction so the corpus remains trustworthy.
Recurrence dashboard
A useful corpus should surface repeated patterns.
| Recurrence count | Action |
|---|---|
| 1 | note and move on |
| 2 | add one native example |
| 3 | create micro-drill |
| 4–5 | suspend related production until retrained |
| 6+ | ask teacher/tutor for diagnosis |
The goal is not to shame the learner. The goal is to stop treating the same pattern as a new surprise.
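The dashboard thresholds translate directly into a small function. This sketch reuses the action wording from the table; the function name `recurrence_action` and the error raised for counts below 1 are assumptions.

```python
def recurrence_action(count: int) -> str:
    """Map a recurrence count to the action from the dashboard table."""
    if count < 1:
        raise ValueError("recurrence count starts at 1")
    if count == 1:
        return "note and move on"
    if count == 2:
        return "add one native example"
    if count == 3:
        return "create micro-drill"
    if count <= 5:
        return "suspend related production until retrained"
    return "ask teacher/tutor for diagnosis"


for n in (1, 3, 5, 6):
    print(n, recurrence_action(n))
```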
Resolved error criteria
Mark an error pattern “resolved” only when you can:
- explain the issue in plain language,
- recognize the correct pattern in real input,
- produce the corrected form in a controlled drill,
- avoid the error in new writing/speaking,
- pass a retest after at least two weeks.
A corrected sentence is not the same as a resolved pattern.
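The five criteria can be recorded as an explicit checklist so that "resolved" is never set by feel. The sketch below is illustrative: the class `ResolutionCheck`, its attribute names, and the 14-day constant (standing in for "at least two weeks") are assumptions.

```python
from dataclasses import dataclass

MIN_RETEST_GAP_DAYS = 14  # "at least two weeks"


@dataclass
class ResolutionCheck:
    can_explain: bool           # explain the issue in plain language
    recognizes_in_input: bool   # recognize the pattern in real input
    produces_in_drill: bool     # produce the form in a controlled drill
    avoids_in_new_output: bool  # avoid the error in new writing/speaking
    passed_retest: bool         # passed the retest
    retest_gap_days: int        # days between last error and the retest


def is_resolved(check: ResolutionCheck) -> bool:
    """True only if every criterion holds and the retest came late enough."""
    return (
        check.can_explain
        and check.recognizes_in_input
        and check.produces_in_drill
        and check.avoids_in_new_output
        and check.passed_retest
        and check.retest_gap_days >= MIN_RETEST_GAP_DAYS
    )


print(is_resolved(ResolutionCheck(True, True, True, True, True, 14)))
```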
A tool built around this workflow would turn mistakes into practice.
Suggested functions:
- Error-entry form.
- Tag system for Japanese-specific error types.
- Recurrence counter.
- Severity rating.
- Native-example slot.
- Micro-drill generator.
- Retest reminder.
- Resolved-pattern archive.
Final rule
A mistake corrected once is a note. A mistake tracked over time is a learning system.
誤用 shows what failed. 添削 gives correction. 助詞, 敬語, コロケーション, 直訳, 自他動詞, 漢字ミス, レジスター, and 発音 tell the error type. 修正 gives the model. 再発 tells you what to train.
Stop being embarrassed by mistakes. Make them work.