Santa has consulted CoolGear for help expanding his operations to China, where a large number of children do not currently receive his gifts.
However, Santa has run into a problem: he does not know how to speak Chinese, and this thwarts his efforts to determine who is naughty and who is nice.
To start gently, let's help Santa understand the basic phonotactics of Mandarin Chinese.
In Hanyu Pinyin, there are 22 possible syllable initials:
b, p, m, f
d, t, n, l
g, k, h
j, q, x
zh, ch, sh, r
z, c, s
∅ (null initial)
and 35 possible syllable finals:
Group a: a,e,ai,ei,ao,ou,an,en,ang,eng,ong,u
Group i: ia,ie,iao,iu,ian,in,iang,ing,iong
Group u: ua,uo,uai,ui,uan,un,uang
Group ü: ü,üe,üan,ün
Misfits: i, o, er
Syllables can be formed by combining an initial with a final. However, not all possible combinations are phonotactically valid.
The rules below form a simplified approximation of the rules of Mandarin Chinese phonotactics and pinyin spelling:
- ∅ (the null initial) can be combined with any of the finals. However, some finals are written differently in a syllable with a null initial.
Namely:
∅+i=yi
∅+ia=ya
∅+iao=yao
∅+ie=ye
∅+iu=you
∅+ian=yan
∅+in=yin
∅+iang=yang
∅+ing=ying
∅+iong=yong
∅+u=wu
∅+ua=wa
∅+uo=wo
∅+uai=wai
∅+ui=wei
∅+uan=wan
∅+un=wen
∅+uang=wang
∅+ü=yu
∅+üe=yue
∅+üan=yuan
∅+ün=yun
- "Group a" finals can be paired with any initial except j,q,x
- "Group i" finals can only be paired with ∅,b,p,m,d,t,n,l,j,q,x
- "Group u" finals can only be paired with ∅,d,t,n,l,g,k,h,zh,ch,sh,r,z,c,s
- "Group ü" finals can only be paired with ∅,n,l,j,q,x.
When j,q,x are paired with ü, the diacritics are dropped. E.g. jü ⇛ ju
- "i" can be paired with any initial except g,k,h
- "o" can only be paired with ∅,b,p,m,f,l
- "er" can only be paired with ∅
Problem 1
According to the rules above, how many possible syllables are there in Hanyu Pinyin?
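One way to approach the count is to enumerate every allowed initial/final pair, spell it, and collect the spellings in a set. The Python below is a minimal sketch of that idea, not a reference solution. All names (INITIALS, NULL_SPELLING, allowed_initials, spell, build_syllables) are made up for illustration. It also assumes that the diacritic-dropping rule for j,q,x applies to every "Group ü" final (so jüe is spelled jue), and that n and l keep the ü.

```python
# The empty string stands for the null initial (∅).
INITIALS = ["b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
            "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s", ""]

GROUP_A = "a e ai ei ao ou an en ang eng ong u".split()
GROUP_I = "ia ie iao iu ian in iang ing iong".split()
GROUP_U = "ua uo uai ui uan un uang".split()
GROUP_UE = "ü üe üan ün".split()          # "Group ü"
MISFITS = ["i", "o", "er"]

# Respellings used when the initial is null, copied from the list above.
NULL_SPELLING = {
    "i": "yi", "ia": "ya", "iao": "yao", "ie": "ye", "iu": "you",
    "ian": "yan", "in": "yin", "iang": "yang", "ing": "ying", "iong": "yong",
    "u": "wu", "ua": "wa", "uo": "wo", "uai": "wai", "ui": "wei",
    "uan": "wan", "un": "wen", "uang": "wang",
    "ü": "yu", "üe": "yue", "üan": "yuan", "ün": "yun",
}


def allowed_initials(final):
    """Return the initials that the rules allow for the given final."""
    if final in GROUP_A:
        return [i for i in INITIALS if i not in ("j", "q", "x")]
    if final in GROUP_I:
        return ["", "b", "p", "m", "d", "t", "n", "l", "j", "q", "x"]
    if final in GROUP_U:
        return ["", "d", "t", "n", "l", "g", "k", "h",
                "zh", "ch", "sh", "r", "z", "c", "s"]
    if final in GROUP_UE:
        return ["", "n", "l", "j", "q", "x"]
    if final == "i":
        return [i for i in INITIALS if i not in ("g", "k", "h")]
    if final == "o":
        return ["", "b", "p", "m", "f", "l"]
    if final == "er":
        return [""]
    raise ValueError(f"unknown final: {final!r}")


def spell(initial, final):
    """Spell one initial/final pair according to the pinyin rules above."""
    if initial == "":
        return NULL_SPELLING.get(final, final)
    # Assumption: j, q, x drop the dots on every Group ü final, not only on bare ü.
    if initial in ("j", "q", "x") and final.startswith("ü"):
        return initial + "u" + final[1:]
    return initial + final


def build_syllables():
    """Return the set of all spelled syllables the rules allow."""
    finals = GROUP_A + GROUP_I + GROUP_U + GROUP_UE + MISFITS
    return {spell(i, f) for f in finals for i in allowed_initials(f)}


syllables = build_syllables()
print(len(syllables))
```

Collecting spellings in a set means that two different initial/final pairs with the same spelling would be counted once, which matters only if the rules ever produce such a collision.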
Problem 2
Some words in English can be reanalyzed as a sequence of pinyin syllables.
For example, "enchilada" can be divided into valid pinyin syllables en-chi-la-da.
However, some English words have combinations of letters that cannot be divided into valid pinyin syllables, such as "chocolate".
What is the longest word in the English language (according to this wordlist) that can be reanalyzed as a series of pinyin syllables?
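Checking whether a word splits into syllables is a word-break problem, which a small dynamic program handles well. The sketch below is one possible approach under stated assumptions: can_segment and longest_segmentable are hypothetical helper names, the syllable set is passed in as a plain set of strings, and the demo set is a tiny hand-picked subset rather than the full syllable inventory. How the wordlist is loaded is left out, since its format is not specified here.

```python
def can_segment(word, syllables):
    """Return True if word can be split entirely into members of syllables."""
    max_len = max(map(len, syllables), default=0)
    # reachable[i] is True when word[:i] can be written as a syllable sequence.
    reachable = [True] + [False] * len(word)
    for end in range(1, len(word) + 1):
        # Only look back as far as the longest syllable.
        for start in range(max(0, end - max_len), end):
            if reachable[start] and word[start:end] in syllables:
                reachable[end] = True
                break
    return reachable[len(word)]


def longest_segmentable(words, syllables):
    """Return the longest non-empty word that can be segmented, or None."""
    candidates = [w for w in words if w and can_segment(w.lower(), syllables)]
    return max(candidates, key=len, default=None)


# Tiny illustrative subset of syllables, not the full inventory.
demo = {"en", "chi", "la", "da"}
print(can_segment("enchilada", demo))
print(longest_segmentable(["lada", "enchilada", "chocolate"], demo))
```

The dynamic program avoids the trap of greedy matching: a prefix such as "chi" may match, yet a shorter or longer first syllable can be the only one that lets the rest of the word divide cleanly. If several words tie for the longest length, max returns the first one it encounters.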