Mandarin Tone Is Not Four Static Notes: Contour, Timing, and Context

The reader understands tones as dynamic pitch contours shaped by duration, neighboring tones, stress, and sentence context.

Published April 18, 2026 Chinese

Core examples: mā/má/mǎ/mà, 妈/麻/马/骂, 你好, 我想去, 今天很忙, tone-pair drills.

Recommended feature module: tone contour player with waveform and pitch trace, showing isolated-syllable, word, phrase, and sentence versions of the same syllables.

Related internal articles: 025, 037, 038, 044, 045, 055, 058, 063, 064, 065.

The “four notes” metaphor breaks early

Mandarin is often introduced as having four tones:

mā  má  mǎ  mà
妈  麻  马  骂

That beginning is useful. It shows that pitch can distinguish meaning. It gives learners a first map: level, rising, low/dipping, falling.

But the metaphor of “four notes” becomes harmful if it makes tones sound like fixed musical notes sitting on top of syllables. Mandarin tones are not static notes. They are contours: movements through a speaker’s pitch range over time.

A tone has:

starting pitch
ending pitch
movement shape
duration
stress level
position in a word or phrase
neighboring tones
speaker pitch range
sentence intonation

This is why a first tone from one speaker may be lower than a fourth tone from another speaker, yet both are correct in context. Tones are relative within a speaker’s range. A child, an adult man, and an adult woman do not share the same absolute pitch. They can all produce correct Mandarin tones because the contour and contrast are what matter.

Better first principle:

Mandarin tone is controlled pitch movement inside a syllable, interpreted relative to the speaker and the phrase.

1. Citation tones: the clean teaching version

In isolation, the four main tones are commonly taught like this:

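| Tone | Pinyin mark | Teaching shape | Example |
| --- | --- | --- | --- |
| 1st tone | mā | high level | 妈 mā |
| 2nd tone | má | rising | 麻 má |
| 3rd tone | mǎ | low, often taught as dipping | 马 mǎ |
| 4th tone | mà | sharp falling | 骂 mà |
| neutral tone | ma | light/unstressed | 吗 ma |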

A traditional pitch-number shorthand uses a five-level scale:

| Tone | Common pedagogical value | Meaning of value |
| --- | --- | --- |
| 1st | 55 | high to high |
| 2nd | 35 | mid to high |
| 3rd | 214 | low falling then rising, in citation form |
| 4th | 51 | high to low |

This is a teaching map, not a promise that every syllable in real speech is pronounced exactly that way.

The most important correction is the third tone. In ordinary connected speech, third tone is often not a dramatic full “fall-rise.” It is commonly low, short, and incomplete unless it appears before a pause or receives emphasis. Learners who always perform a full dipping third tone may sound slow, theatrical, or unnatural.

2. The first tone is level, not necessarily “high for everyone”

The first tone is usually described as high and level. But “high” means high within the speaker’s comfortable Mandarin pitch range.

Examples:

妈 mā
今天 jīntiān
北京 Běijīng (the second syllable, jīng)

Learner errors:

| Error | What happens |
| --- | --- |
| Singing first tone too high | voice becomes strained and unnatural |
| Letting first tone fall | it may sound closer to fourth tone or unstressed speech |
| Making every first tone equally long | speech becomes robotic |

First tone should feel steady. In phrases, it can be shorter or longer depending on rhythm:

今天很忙。
Jīntiān hěn máng.
(I'm) very busy today.

The jīn in 今天 is first tone, but in normal speech it does not need to be held like a singing exercise. It needs to remain recognizably level while fitting the phrase.

3. The second tone rises; it does not start at the bottom

Second tone rises. A common learner mistake is starting too low and making it a slow scoop. That can blur with third tone.

Better mental model:

Second tone starts around mid pitch and rises with energy.

Examples:

麻 má
忙 máng
学习 xuéxí
没有 méiyǒu

In 学习, both syllables are rising tone:

xuéxí

But fluent speech does not pronounce them as two isolated classroom swoops. The tones are shaped inside a word. The first syllable rises enough to be heard, then the second syllable also rises, often with phrase rhythm controlling duration.

Diagnostic question:

Can a listener hear the rise before the syllable is over?

If the rise comes too late, the second tone may sound hesitant or may be misheard.

4. The third tone is the most misunderstood tone

The textbook third tone is often shown as a dip:

mǎ = low falling-rising

In reality, third tone has several surface forms:

| Context | Common realization |
| --- | --- |
| isolation | full or fuller low-dipping contour |
| before another third tone | changes by sandhi; first becomes second-tone-like |
| before non-third tone | often low or half-third |
| before pause/emphasis | may have a clearer rise |
| fast speech | shortened low contour |

Examples:

我 wǒ
很好 hěn hǎo
我想去 wǒ xiǎng qù

In 我想去, pronouncing both 我 and 想 as full dramatic dips makes the phrase heavy. Natural speech tends to group and reduce. The tones are still present, but their shape is adapted to the phrase.

Learner rule:

Do not learn third tone as “always dip.”
Learn it as “low, with full dip mainly in citation or phrase-final/emphatic contexts.”

Article 037 handles third-tone chains in detail.
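As a rough illustration, the context table in this section can be written as a small lookup. This is a sketch, not a phonological model: the function name `third_tone_form` and its return labels are hypothetical, and real speech varies with grouping, speed, and speaker.

```python
from typing import Optional


def third_tone_form(next_tone: Optional[int] = None, emphatic: bool = False) -> str:
    """Pick the expected surface form of a third-tone syllable.

    next_tone: lexical tone of the following syllable (1-4, 0 = neutral),
               or None when the syllable is in isolation or before a pause.
    emphatic:  True when the syllable carries strong emphasis.
    The returned labels are illustrative, not a standard notation.
    """
    if next_tone == 3:
        # Third-tone sandhi: the first of two third tones sounds second-tone-like.
        return "second-tone-like (sandhi)"
    if next_tone is None or emphatic:
        # Citation form, pre-pause position, or emphasis leaves room for the rise.
        return "fuller dip"
    # Before any non-third tone in connected speech.
    return "low / half-third"


# 我想去 wǒ xiǎng qù has lexical tones 3-3-4.
phrase = [("我", 3), ("想", 3), ("去", 4)]
forms = []
for i, (hanzi, tone) in enumerate(phrase):
    following = phrase[i + 1][1] if i + 1 < len(phrase) else None
    if tone == 3:
        forms.append((hanzi, third_tone_form(following)))
print(forms)
```

The lookup only inspects the immediately following syllable, so longer third-tone chains, where grouping decides the outcome, are outside what it can express.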

5. The fourth tone is falling, but timing matters

Fourth tone falls sharply.

Examples:

骂 mà
去 qù
大 dà
电话 diànhuà
但是 dànshì

Learner errors:

| Error | Effect |
| --- | --- |
| Starting too low | no room to fall |
| Falling too late | sounds like a flat or delayed tone |
| Making it too angry | sounds emotionally harsh when not intended |
| Clipping the syllable too much | consonant/vowel may become unclear |

Fourth tone requires room to fall. It should start relatively high and move downward decisively.

But fourth tone is not always an emotional bark. In ordinary words like 电话, 现在, 但是, and 重要, fourth tones are normal lexical tones, not anger signals.

Practice:

去。        qù.        Go.
我去。      wǒ qù.     I’ll go.
我现在去。  wǒ xiànzài qù.  I’ll go now.

The tone remains falling, but its duration and force adapt to the sentence.

6. Tone and duration belong together

Tones need time. If a syllable is too short, the contour may not have enough space to unfold. If it is too long, speech becomes unnatural.

Consider:

我想去。
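| Syllable | Tone | Natural issue |
| --- | --- | --- |
| 我 wǒ | 3 | short; before 想 it usually takes the second-tone-like sandhi form, not a full dramatic dip |
| 想 xiǎng | 3 | affected by grouping and following tone; often low/half-third before 去 |
| 去 qù | 4 | final fall can be clearer |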

The final syllable in a short phrase may receive more duration. The middle syllable may be compressed. This means tone training cannot stop at isolated syllables.

A good sequence:

syllable → two-syllable word → tone pair → short phrase → sentence → paragraph

Learners who skip directly from isolated syllables to natural conversation often lose control. Learners who stay forever in isolated syllables often sound unnatural.

7. Tone is shaped by neighboring tones

Mandarin tones influence each other in connected speech. Some changes are formal sandhi rules; others are ordinary coarticulation.

Examples:

你好 nǐ hǎo → often pronounced ní hǎo
不去 bù qù → bú qù
一半 yī bàn → yí bàn

But even when there is no named rule, neighboring tones affect timing and pitch range.
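The named rules in the examples above can be sketched as a small function that maps lexical tones to surface tones. This is a minimal sketch under stated assumptions: `surface_tones` is a hypothetical helper, it looks only at the next syllable's lexical tone, and it handles third-tone chains naively from left to right. The branch that turns 一 into fourth tone before first, second, and third tones is the usual textbook rule, not something shown in the examples above, and counting or ordinal uses of 一 (which keep yī) are not modeled.

```python
from typing import List, Tuple


def surface_tones(syllables: List[Tuple[str, int]]) -> List[int]:
    """Apply simplified sandhi rules.

    syllables: (character, lexical tone) pairs, tones 1-4 and 0 for neutral.
    Returns the surface tone of each syllable.
    """
    surface = [tone for _, tone in syllables]
    for i in range(len(syllables) - 1):
        hanzi, tone = syllables[i]
        next_tone = syllables[i + 1][1]  # lexical tone of the next syllable
        if hanzi == "不" and tone == 4 and next_tone == 4:
            surface[i] = 2  # 不去 bù qù -> bú qù
        elif hanzi == "一" and tone == 1 and next_tone == 4:
            surface[i] = 2  # 一半 yī bàn -> yí bàn
        elif hanzi == "一" and tone == 1 and next_tone in (1, 2, 3):
            surface[i] = 4  # textbook rule: yì before tones 1, 2, 3
        elif tone == 3 and next_tone == 3:
            surface[i] = 2  # 你好 nǐ hǎo -> ní hǎo
    return surface


print(surface_tones([("你", 3), ("好", 3)]))
print(surface_tones([("不", 4), ("去", 4)]))
print(surface_tones([("一", 1), ("半", 4)]))
print(surface_tones([("我", 3), ("想", 3), ("去", 4)]))
```

A rule table like this covers only the categorical changes. The gradual effects of coarticulation on timing and pitch range have no place in it, which is why listening practice is still needed.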

Compare:

今天 jīntiān     1-1
学习 xuéxí       2-2
很好 hěn hǎo     3-3 sandhi environment
但是 dànshì      4-4

A fourth tone before another fourth tone does not sound like two isolated falling syllables pasted together. The first fall may be compressed, and the phrase rhythm determines how much pitch space remains.

This is why tone pairs are foundational. Article 044 develops that training method.
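One way to check whether a drill list actually covers the tone pairs is to index words by their pair of lexical tones. The sketch below is illustrative only: `tone_pair_coverage` is a hypothetical helper, the word list is taken from examples in this article, and neutral-tone pairs are deliberately left out of the 16-cell grid.

```python
from itertools import product
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def tone_pair_coverage(words: Dict[str, Pair]) -> Tuple[Dict[Pair, List[str]], List[Pair]]:
    """Group two-syllable words by lexical tone pair.

    Returns (groups, missing): groups maps each of the 16 full-tone pairs
    to its words; missing lists the pairs that have no drill word yet.
    """
    groups: Dict[Pair, List[str]] = {pair: [] for pair in product((1, 2, 3, 4), repeat=2)}
    for word, pair in words.items():
        if pair in groups:  # pairs with a neutral tone (0) are skipped
            groups[pair].append(word)
    missing = [pair for pair, items in groups.items() if not items]
    return groups, missing


drill_words = {
    "今天": (1, 1), "中国": (1, 2), "学习": (2, 2), "没有": (2, 3),
    "北京": (3, 1), "可以": (3, 3), "但是": (4, 4), "电话": (4, 4),
    "骂人": (4, 2), "妈妈": (1, 0),
}
groups, missing = tone_pair_coverage(drill_words)
print(groups[(4, 4)])
print(missing)
```

The pairs reported as missing are the ones a drill set should fill before the learner moves on to phrases.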

8. Sentence intonation does not erase lexical tone

In English, yes/no questions often rise at the end. Mandarin can also use sentence-level pitch, but it must preserve lexical tone enough for words to remain recognizable.

Compare:

你去。      Nǐ qù.       You’re going.
你去吗?    Nǐ qù ma?    Are you going?
你去不去?  Nǐ qù bu qù? Are you going or not?

The question force can come from 吗 ma, A-not-A structure, particles, pitch range, or context. Learners who impose English question intonation may distort the final lexical tone.

For example, 去 qù is fourth tone. In 你去吗?, the 去 should not become a rising English-style question tone. The sentence can sound like a question because of 吗 and overall pitch, while 去 remains falling enough to be identified.

Article 045 covers question intonation more directly.

9. Emotion changes pitch range, not the identity of tones

An excited speaker may use a wider pitch range. A tired speaker may use a narrower range. An angry speaker may increase intensity and shorten syllables. A pleading speaker may lengthen syllables and use particles.

But Mandarin lexical tones still need to survive.

太好了!
Tài hǎo le!
Great!

The emotion may raise the whole phrase, but 太 remains fourth tone, 好 keeps its third-tone behavior in context, and 了 is light.

Learner mistake:

To sound expressive, I can ignore tone contours.

Better:

First keep lexical tone contrasts clear.
Then vary pitch range, duration, volume, and particles for emotion.

10. A practical tone-training sequence

A rigorous tone practice routine should move from controlled to natural.

Stage 1: isolated contrast

mā má mǎ mà
bā bá bǎ bà
shī shí shǐ shì

Goal: hear and produce the contrast.

Stage 2: tone pairs with real words

中国 Zhōngguó  1-2
学习 xuéxí      2-2
可以 kěyǐ       3-3 environment
但是 dànshì     4-4
没有 méiyǒu     2-3

Goal: keep tones under coarticulation.

Stage 3: short phrases

我想去。
今天很忙。
你有没有时间?
这个可以吗?

Goal: maintain tones inside rhythm.

Stage 4: paragraph shadowing

Use a short native-speaker clip. Mark tone problems by word, not by abstract tone number.

Problem word: 没有
Issue: second tone not rising enough; third tone too fully dipped.

Stage 5: communicative stress

Say the same sentence with different focus:

我今天去。  (neutral)         I am going today.
我今天去。  (focus on 今天)   I am going today, not tomorrow.
我今天去。  (focus on 我)     I am the one going today, not someone else.

Goal: keep lexical tone while changing information focus.

11. Tool concept: tone contour player

The Inkuntri module should show the same word at several levels:

妈 mā
妈妈 māma
妈妈来了 māma lái le
我妈妈来了 wǒ māma lái le

For each level, display:

| Layer | Function |
| --- | --- |
| audio | native speaker at slow and natural speed |
| pitch trace | visual contour over time |
| waveform | syllable duration and stress |
| tone labels | underlying tones and surface notes |
| learner recording | comparison with native model |
| warning labels | “third tone is half here,” “neutral tone is light,” “question intonation does not replace tone” |

The tool should never reduce feedback to “tone correct/incorrect” only. It should identify timing and movement:

Second tone starts too low.
Fourth tone falls too late.
Third tone is over-dipped in connected speech.
First tone drifts downward.

That is useful feedback.
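Feedback of this kind could be derived from a pitch trace roughly as follows. This is a hedged sketch, not the module's actual logic: `tone_feedback` is a hypothetical function, the contour is assumed to be already normalized to a 1-5 scale within the speaker's range and sampled evenly across the voiced part of the syllable, and every threshold is an arbitrary placeholder that would need tuning against real recordings.

```python
from typing import List


def tone_feedback(contour: List[float], tone: int, connected: bool = False) -> List[str]:
    """Return timing and movement notes for one syllable.

    contour:   pitch samples on a 1-5 scale, evenly spaced in time.
    tone:      intended lexical tone (1-4).
    connected: True if the syllable sits before a non-third tone in a phrase.
    """
    if len(contour) < 3:
        raise ValueError("need at least three pitch samples")
    start, end = contour[0], contour[-1]
    notes: List[str] = []
    if tone == 1 and start - end > 0.5:  # placeholder drift threshold
        notes.append("First tone drifts downward.")
    if tone == 2 and start < 2.5:  # placeholder: should begin near mid pitch
        notes.append("Second tone starts too low.")
    if tone == 3 and connected and end - min(contour) > 1.0:
        notes.append("Third tone is over-dipped in connected speech.")
    if tone == 4:
        midpoint = contour[len(contour) // 2]
        total_fall = start - end
        # Share of the fall completed by the middle of the syllable.
        if total_fall > 0 and (start - midpoint) / total_fall < 0.3:
            notes.append("Fourth tone falls too late.")
    return notes


print(tone_feedback([5.0, 5.0, 4.9, 4.8, 1.0], tone=4))
print(tone_feedback([5.0, 4.0, 3.0, 2.0, 1.0], tone=4))
print(tone_feedback([1.5, 2.0, 3.0, 4.5], tone=2))
```

Each note describes where in the syllable the movement went wrong, so the learner gets something to adjust instead of a bare pass or fail.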

Tone values are training maps, not physical absolutes

Textbooks often teach Mandarin tones with numbers such as 55, 35, 214, and 51. This is useful, but it has to be understood correctly. These numbers describe relative pitch movement inside a speaker’s range, not fixed musical notes. A child, a bass-voiced adult, and a soprano-voiced adult will not produce the same absolute frequencies. They can still produce the same Mandarin tone category.
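To make "relative to the speaker's range" concrete, a measured pitch in Hz can be mapped onto the five-level scale using that speaker's own minimum and maximum. The sketch below assumes a logarithmic mapping, which is one common choice because pitch perception is closer to logarithmic than linear. The function name `to_five_level` and the example ranges in Hz are illustrative assumptions, not measurements.

```python
import math


def to_five_level(f0_hz: float, speaker_min_hz: float, speaker_max_hz: float) -> float:
    """Map a pitch in Hz to a value from 1.0 to 5.0 within one speaker's range."""
    if not 0 < speaker_min_hz < speaker_max_hz:
        raise ValueError("expected 0 < speaker_min_hz < speaker_max_hz")
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    span = math.log(speaker_max_hz) - math.log(speaker_min_hz)
    relative = (math.log(f0_hz) - math.log(speaker_min_hz)) / span
    relative = min(max(relative, 0.0), 1.0)  # clamp values outside the range
    return 1.0 + 4.0 * relative


# Hypothetical ranges: a low voice (90-180 Hz) and a high voice (180-360 Hz).
low_voice_top = to_five_level(180.0, 90.0, 180.0)
high_voice_bottom = to_five_level(180.0, 180.0, 360.0)
print(low_voice_top, high_voice_bottom)
```

In this example the same 180 Hz sits at the top of one speaker's scale and at the bottom of the other's, which is why a first tone from one speaker can be lower in absolute pitch than the end of a fourth tone from another.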

A practical learner version looks like this:

| Tone | Classroom contour | Better learner question |
| --- | --- | --- |
| 1st | high level | Did I keep it steady and sufficiently high in my own range? |
| 2nd | rising | Did the pitch clearly rise toward the end? |
| 3rd | low / dipping | Did it stay low before non-third tones, and change before another third tone? |
| 4th | falling | Did it fall early and decisively rather than sag late? |
| neutral | unstressed | Did I reduce duration and attach it to the previous syllable? |

This distinction matters because many learners chase the wrong target. They try to imitate the height of a teacher’s voice instead of the shape and timing of the contour. A low-voiced learner can make a good first tone without sounding high in absolute terms. A high-voiced learner can make a bad first tone if the syllable wobbles or drifts downward.

The best self-check is comparative:

Record: mā má mǎ mà
Then record: tā tá tǎ tà
Then record: dā dá dǎ dà

If your first tones are steady relative to your other tones, the category is probably developing. If every tone starts in the same place and only the final part changes, the contrast is still weak.

The “tone plus vowel” problem

Tone is carried by the voiced part of the syllable. That means your vowel and final quality affect how audible the tone is. A learner may think the tone is wrong when the real problem is the final.

Compare:

mā má mǎ mà
jī jí jǐ jì
fēng féng fěng fèng

The open vowel in ma gives plenty of room for pitch movement. The high vowel in ji can make the contour feel tighter. The nasal final in feng requires the tone to survive across a vowel plus nasal ending. This is why tone drills should not use only ma. The syllable ma is a clean demonstration, but it is not enough for real speech.

A stronger tone curriculum rotates syllable types:

| Syllable type | Examples | Why it matters |
| --- | --- | --- |
| Open vowel | ma, ba, ta | Good for hearing contour clearly. |
| High vowel | ji, qi, xu | Tests whether tone survives narrow vowel space. |
| Diphthong/final glide | xiao, kuai, mei | Tests timing across a moving vowel. |
| Nasal final | fang, feng, ming | Tests whether the pitch continues through the nasal. |
| Neutral second syllable | māma, péngyou | Tests rhythm and reduction. |

This is also why pitch graphs can be misleading if the learner does not know what they are looking at. The graph may show messy transitions because the syllable has a glide, a nasal, or a weak ending. The question is not “Is the line beautiful?” The question is “Is the contrast recoverable to a listener?”

Tone practice should move through five layers

A complete pronunciation routine should not jump from isolated syllables straight to conversation. Use five layers:

1. Isolated syllable: mà
2. Real word: 骂人 màrén
3. Short phrase: 不要骂人
4. Sentence: 他不是在骂人。
5. Short discourse: 他声音大,但不是在骂人。他只是有点着急。

Each layer tests a different skill.

| Layer | What it tests |
| --- | --- |
| Isolated syllable | Can you produce the basic category? |
| Real word | Can the tone survive a neighboring syllable? |
| Phrase | Can you maintain rhythm without flattening? |
| Sentence | Can tone and intonation coexist? |
| Discourse | Can tone remain stable while attention shifts to meaning? |

Most learners overpractice layer 1 and underpractice layers 3–5. The result is a familiar problem: the learner can recite tone charts but loses tones in speech.

The fix is not to abandon tone drills. The fix is to make the drills more speech-like earlier.

Diagnostic signs that a tone problem is really timing

Tone errors are often described as pitch errors, but timing is often the hidden cause.

| Symptom | Likely issue | Fix |
| --- | --- | --- |
| Second tone sounds like third tone | Rise starts too low or too late | Begin from mid pitch and rise earlier. |
| Fourth tone sounds angry or clipped | Fall is overforced | Keep the fall sharp but not shouted. |
| Third tone sounds theatrical | Full dip used everywhere | Use low/half-third before non-third tones. |
| First tone sounds uncertain | Pitch drifts downward | Sustain level pitch for the whole syllable. |
| Neutral tone sounds like full tone | Syllable too long | Shorten and attach it to the previous syllable. |

Timing also explains why slow practice can be both helpful and dangerous. Slow practice helps you feel the contour. But if you make every syllable equally long in real speech, Mandarin rhythm becomes unnatural. After slow practice, always do normal-speed practice.

A useful sequence:

slow and exaggerated → normal and clear → natural but controlled

Do not stop at the first stage.

A better audio specification for this article

The Inkuntri tone player should not contain only isolated syllables. It should include a controlled ladder for each example:

ma tone set
real-word set
minimal phrase set
sentence set
natural paragraph set

For each item, include:

slow audio
normal audio
one male speaker
one female speaker
pitch trace
learner recording slot
common failure notes
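The two lists above amount to a content checklist, which could be expressed as a small data structure with completeness checks. This is only a sketch of one possible shape: the class `PlayerItem`, the helper names, and the asset labels are hypothetical and are not taken from any existing Inkuntri code.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Ladder levels and per-item assets, named after the lists above.
LADDER = ["ma tone set", "real-word set", "minimal phrase set",
          "sentence set", "natural paragraph set"]
REQUIRED_ASSETS = {"slow audio", "normal audio", "male speaker", "female speaker",
                   "pitch trace", "learner recording slot", "common failure notes"}


@dataclass
class PlayerItem:
    level: str                      # one of LADDER
    text: str                       # characters or Pinyin shown to the learner
    assets: Set[str] = field(default_factory=set)


def missing_assets(item: PlayerItem) -> List[str]:
    """List required assets that an item does not have yet."""
    return sorted(REQUIRED_ASSETS - item.assets)


def missing_levels(items: List[PlayerItem]) -> List[str]:
    """List ladder levels with no item, in ladder order."""
    present = {item.level for item in items}
    return [level for level in LADDER if level not in present]


draft = PlayerItem("ma tone set", "mā má mǎ mà",
                   {"slow audio", "normal audio", "pitch trace"})
print(missing_assets(draft))
print(missing_levels([draft]))
```

Running checks like these before publishing an example would catch a ladder that stops at isolated syllables, which is the failure this specification is meant to prevent.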

The tool should also let the learner hide Pinyin after the first pass. Pinyin is useful, but tone learning has to become auditory. A good practice flow is:

hear audio → repeat → reveal characters/Pinyin → record → compare → repeat after delay

The article and tool should make one message unavoidable: tones are not decorative marks over Pinyin. They are part of the spoken word.

Final learner takeaway

Mandarin tones are not four musical notes. They are dynamic contours shaped by time, stress, neighboring tones, and sentence context.

Learn the clean citation tones first, but do not stay there.

The serious path is:

hear contrast
produce contour
practice tone pairs
control phrase rhythm
preserve tones inside real sentences

Tone accuracy is not about sounding dramatic. It is about making pitch movement stable enough that words remain recognizable in natural speech.
