In other languages it’s easy to tell where a word begins and where it ends because of spaces. But how do you define the beginning and the end of a word in Japanese? Sure, particles like が、で、に、を、etc. can help, but there’s probably more to it than that, right?
For example, is 朝鮮民主主義人民共和国 (Democratic People’s Republic of Korea) a single word? If we go by the “there is no hiragana between kanji” logic, then it’s a single word.
15 comments
>For example, is 朝鮮民主主義人民共和国 (Democratic People’s Republic of Korea) a single word?
Sort of (compound noun) – but it does break down as follows…
* 朝鮮 = Korea
* 民主主義 = democracy
* 人民 = the people (citizens)
* 共和国 = republic
Usually the word boundaries are clear because there is a change of script (kanji vs hiragana vs katakana) and you recognize words. See [https://youtu.be/r7a8OjvViwE](https://youtu.be/r7a8OjvViwE) for examples.
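To make the "change of script" idea concrete, here is a rough Python sketch that cuts a sentence wherever the script changes. The function names (`script_of`, `split_by_script`) and the sample sentence are made up for illustration, and the kanji check only covers the basic CJK Unified Ideographs block, so treat it as a heuristic rather than a real tokenizer.

```python
from itertools import groupby


def script_of(ch: str) -> str:
    """Classify one character by Unicode block (simplified on purpose)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:  # includes the long-vowel mark ー
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:  # basic CJK Unified Ideographs only
        return "kanji"
    return "other"


def split_by_script(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that share the same script."""
    return [(script, "".join(chars)) for script, chars in groupby(text, key=script_of)]


for script, run in split_by_script("私はコーヒーを飲みます"):
    print(script, run)
# kanji 私
# hiragana は
# katakana コーヒー
# hiragana を
# kanji 飲
# hiragana みます
```

Note that it cuts 飲みます in two and would return 朝鮮民主主義人民共和国 as a single run, so script changes are a useful hint but not the whole story.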
If you can look them up as separate entities in a dictionary then they’re separate.
However, if you understand them when you see them, the distinction is irrelevant. English didn’t always have spaces or punctuation either.
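The dictionary test can be sketched in Python as a greedy longest-match lookup. `TOY_DICTIONARY` is a tiny hand-made set just for this example, not a real dictionary, and `longest_match_split` is a hypothetical helper name. Real segmenters are considerably more sophisticated than this.

```python
# Hypothetical mini-dictionary; a real one would hold many thousands of entries.
TOY_DICTIONARY = {"朝鮮", "民主", "主義", "民主主義", "人民", "共和", "共和国"}


def longest_match_split(text: str, dictionary: set[str]) -> list[str]:
    """At each position, take the longest dictionary entry that matches."""
    max_len = max(map(len, dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                break
        else:
            # Nothing matched: emit the single character as its own token.
            candidate = text[i]
        words.append(candidate)
        i += len(candidate)
    return words


print(longest_match_split("朝鮮民主主義人民共和国", TOY_DICTIONARY))
# ['朝鮮', '民主主義', '人民', '共和国']
```

Whether 民主主義 comes out as one piece or as 民主 + 主義 depends entirely on what the dictionary lists, which is the same "one word or two" question in another form.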
In practice it’s usually not that complicated.
The confusion only really comes from compound nouns, which your example is. You can stack a bunch of nouns one after another and it creates a longer and longer word. It’s really not that different from a lot of languages.
Some languages, like German, also like to stack them to pretty ridiculous degrees.
Other languages tend to impose more practical limits.
But other than compound nouns, you can usually separate the words fairly easily. As you already note, particles make it much more obvious where one word begins and another ends.
True adjectives in Japanese always end with an “い”, so again, they’re easier to pick out. Verbs in their dictionary form end in a う-row kana (う, く, る, and so on).
Then you have things like て forms and past forms with た. But again, the kana makes it fairly easy to pick out.
So really, nouns can be pure kanji, and their stacking with one another is the only real challenge. Everything else becomes easy enough to identify thanks to particles and the kana they end with.
“word” is not really an easy thing to define linguistically, but languages tend to have a hierarchy of “barriers” between morphemes where some barriers are stronger than others.
In English, for instance, the barrier inside “walker” separating “walk” from the suffix “-er” is perceived as weaker than the barrier inside “grandchild” separating “grand” and “child”, which is in turn weaker than the barrier in “grand child”. Indeed, the choice of barrier can change the meaning: “grand child” and “grandchild” mean two different things and are pronounced slightly differently, though without context even native speakers cannot always tell whether it was “grandchild” or “grand child”. One can debate whether “grandchild” is two words or one.
Japanese too has different barriers. In “民主主義”, for instance, the barrier between “民主” and “主義” is generally perceived as stronger than the one between “民” and “主”. Whether “民主主義” is one word or two is again a matter of semantics.
Even with particles, words such as “男の子” seem to have a very weak barrier in them compared to, say, “男の性器”, as evidenced by the fact that “男の子” also has only one pitch accent pattern. One would assume that historically this was not the case.
Yet another r/learnjapanese “How many pinheads can dance on an angel” discussion….
I’m trying to figure out what you’re really asking here.
Have you just started learning, and are you trying to build some foundational knowledge of Japanese grammar?
Or are you at a native level and being pedantic?
Because the whole system of these ideograms leads to increasingly long compound words that you just sort of have to learn. So what benefit do we get, as a rule, from knowing whether a word like 朝鮮民主主義人民共和国 is a single word or a compound word?
Most of the questions on this sub can be answered with just one word: practice. You’ll get the feel of it the more you read lol
I always think of these connected kanji as one word, because the accent changes.
Like how 東京 and 大学 have a rising (0) pitch accent but 東京大学 has an up-then-down (2) pitch accent.
But I guess it doesn’t really matter anyway.
The lack of spaces between words is why it’s helpful to start learning kanji as soon as possible. A string of hiragana is hard to sort out if you’re a beginner, and tricky even if you’re more experienced!
Someone mentioned it already, but that’s not a single word.
Japanese people know this because they know each word in that line.
That means that you have to know a certain amount of vocabulary to differentiate each word.
If you’re a beginner, it’s obvious that you’ll only see a mess there instead of individual words. Practice more vocabulary and kanji so you can start to see words instead of a line of symbols.
Change “word” to “character”. I’m Chinese. Look at any Chinese article website: a character is like a picture.
Ironically, when you start learning, the beginner’s books overcompensate by putting more hiragana than you would normally encounter in a Japanese text. I find it harder to pick out words in this case.
But to your point, country names in kanji are absolutely ridiculous and one of the exceptions. My Tokyo-born wife cannot read a lot of country names in kanji because they appear so infrequently, so newspapers will often put furigana in such cases. Furigana has the doubly helpful effect of parsing out words.
Japanese was slow to adopt any form of punctuation and has only recently come around to quotes and commas, though many formal publications avoid commas. Spaces you can find in some manga.
I hope more reforms come, because it’s all unnecessarily brutal
I’m not super advanced in Japanese, but at the start it’s kinda easy to tell words apart because a lot of the time they start with a kanji,
and I feel like once the sentences get harder, you’ll be able to tell the words apart more easily even if they don’t start with a kanji.
Well, it’s not much different from asking whether “Democratic People’s Republic of Korea” is a single word in English. The only difference is that this one has spaces, which don’t actually matter.
Japanese doesn’t have as solid a concept of “word” as English does. Your example could be considered as a single word, or a few words.