My apologies, I should have been more precise about the format. I meant to write that I used Notepad++ to convert the file from the encoding it calls "UTF-8 without BOM" to UTF-8.
As for the collates.txt file, I think I know how it works now. The problem is, I haven't the faintest idea how to apply this to Japanese. You see, Japanese uses three different writing systems, or four if you count the Latin alphabet.
Two of these are phonetic: they represent syllables. They're called hiragana and katakana, and they look like this: ひらがな and カタカナ. Each has about 50 characters. Then there are the kanji (漢字). There are literally thousands of these rascals.
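To make the distinction concrete, here is a minimal Python sketch that labels a single character by script, based on its Unicode block. The function name script_of is made up for illustration, and the ranges cover only the main Hiragana, Katakana and CJK Unified Ideographs blocks, so rare kanji in the extension blocks would come out as "other".

```python
def script_of(ch: str) -> str:
    """Label one character by script (hypothetical helper, main blocks only)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:      # Hiragana block
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:      # Katakana block (includes the long-vowel mark ー)
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:      # CJK Unified Ideographs (the common kanji)
        return "kanji"
    if ch.isascii() and ch.isalpha():
        return "latin"
    return "other"


if __name__ == "__main__":
    for ch in "ひカ漢a":
        print(ch, script_of(ch))
```

The point is that the two kana sets are small, contiguous ranges, while the kanji span thousands of code points, which is what makes listing them one by one in a collate file look impractical.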
The entries in the dictionary can be pure hiragana:
Code:
<ar><k>である</k>
(v5r) to be (formal, literary)</ar>
pure katakana:
Code:
<ar><k>ディーゼルエンジン</k>
(n) diesel engine</ar>
or, probably the most frequent case, kanji with the pronunciation written in hiragana in square brackets:
Code:
<ar><k>電波 [でんぱ]</k>
(n) electro-magnetic wave (P)</ar>
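For what it's worth, here is a rough Python sketch of the kind of ordering I imagine for these three forms. It is not the collates.txt format, and I don't know whether the dictionary tool can use anything like it. It assumes the reading, when present, sits in square brackets at the end of the <k> text, as in the entry above. The names READING, kata_to_hira and sort_key are made up for illustration. Sorting by code point only approximates proper Japanese dictionary order (small kana and voiced marks are not treated specially).

```python
import re

# Optional reading in square brackets at the end of the headword (assumed layout)
READING = re.compile(r"\[([^\]]+)\]\s*$")


def kata_to_hira(text: str) -> str:
    """Fold katakana U+30A1..U+30F6 onto hiragana by subtracting 0x60."""
    return "".join(
        chr(ord(c) - 0x60) if 0x30A1 <= ord(c) <= 0x30F6 else c
        for c in text
    )


def sort_key(headword: str) -> str:
    """Use the bracketed reading if there is one, else the headword itself."""
    m = READING.search(headword)
    reading = m.group(1) if m else headword
    return kata_to_hira(reading)


if __name__ == "__main__":
    words = ["電波 [でんぱ]", "である", "ディーゼルエンジン"]
    for w in sorted(words, key=sort_key):
        print(sort_key(w), "<-", w)
```

With a key like this, the kanji never need to be ordered themselves: entries with kanji are placed by their hiragana reading, and katakana entries are folded onto the same kana sequence.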
How do I make a collate file for this?