08-27-2023, 06:35 AM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2023
Device: Tolino Vision 5
|
KOreader dictionaries: Morphology/Inflections?
Dear Forum,
I am currently using AlReaderX on my Tolino device, paired with GoldenDict for English, French and Swedish (various dictionaries + Hunspell files). I tried KOReader for its nicer dictionary interface and converted my own StarDict dictionaries based on dict.cc data, but found the lack of inflection/morphology support to be a dealbreaker, especially for French. I found conflicting answers online, so please excuse me for asking: is there currently any way to support morphologies/inflections in KOReader? Thanks a lot! |
08-27-2023, 09:50 AM | #2 |
Wizard
Posts: 1,084
Karma: 4234828
Join Date: Feb 2012
Location: Cape Canaveral
Device: Kindle Scribe
|
I am afraid it does not. KOReader uses sdcv (StarDict) as its dictionary engine, which does not support inflections, only fuzzy search.
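To illustrate the difference between fuzzy search and inflection support, here is a minimal sketch using Python's standard `difflib`. It is not sdcv's actual matching algorithm, only a stand-in for "find headwords that look similar"; the headword list is made up for the example. Similarity matching can recover a regular form such as "walked", but it cannot connect an irregular form such as French "suis" to its headword "être", because the two share no characters.

```python
import difflib

# Hypothetical headword list, standing in for a dictionary index.
HEADWORDS = ["walk", "être", "aller"]


def fuzzy_lookup(word, headwords=HEADWORDS):
    """Return headwords that are textually similar to `word`.

    This uses difflib's similarity ratio as a stand-in for a fuzzy search;
    it is NOT the algorithm sdcv uses.
    """
    return difflib.get_close_matches(word, headwords, n=1, cutoff=0.6)


if __name__ == "__main__":
    print(fuzzy_lookup("walked"))  # a regular inflection, close to its headword
    print(fuzzy_lookup("suis"))    # an irregular inflection of "être"
```

An inflection-aware lookup needs an explicit mapping from each form to its headword, which is what the rest of this thread is about.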
|
08-27-2023, 09:56 AM | #3 |
Addict
Posts: 358
Karma: 10703708
Join Date: Dec 2020
Device: Kindle Paperwhite 3
|
It doesn't support morphology. The fuzzy search should be enough for English and Swedish.
I've managed to get Spanish to work quite well by using certain Wiktionary dictionaries for it. You could try and see if French is also supported well enough. Try checking out this: https://github.com/BoboTiG/ebook-rea...s/fr/README.md (French-French Wiktionary) Or for a French-English one: https://github.com/Vuizur/Wiktionary...tardict.tar.gz |
09-01-2023, 08:26 AM | #4 |
Junior Member
Posts: 9
Karma: 10
Join Date: Feb 2019
Device: Kobo Glo HD
|
English Hunspell dictionaries work perfectly on Koreader. I got them from https://sourceforge.net/projects/wor...er/2020.12.07/
Just unzip the files and copy them to Koreader dict directory. For other languages, go to https://github.com/wooorm/dictionari...n/dictionaries |
09-06-2023, 02:51 AM | #5 | |
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
|
Quote:
You can use both the Hunspell data under the inflection_data folder and another dictionary as an inflection source. If the language you need is not in the aforementioned folder, I can add it as well. (Generally, free Babylon dictionaries provide good inflection support; you can use them to bolster the dictionaries you have.) |
|
09-13-2023, 09:34 AM | #6 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2023
Device: Tolino Vision 5
|
Quote:
However, I had a look at inflection_data and so far there seems to be no data for Swedish. Maybe that could be added? |
|
09-13-2023, 09:36 AM | #7 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2023
Device: Tolino Vision 5
|
Quote:
Thanks for the reply. I have downloaded all available files (.aff, .ts, .dic, .js, .json) for all three languages and have put them in separate folders within koreader/data/dict. The /dict folder already contains my existing dictionary files, also in separate folders. However, upon opening KOReader there are no changes: when I open the dictionary settings ("manage dictionaries"), there are no additional dictionaries to activate. I tried looking up a basic inflected word, which did not work, suggesting the Hunspell dictionary is not active. Am I missing something? |
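A quick way to check what each folder actually contains is sketched below. It rests on the assumption that a StarDict dictionary is identified by its .ifo file (alongside .idx and .dict or .dict.dz), so a folder holding only Hunspell .aff/.dic files would not be picked up as a dictionary. The function name and the category labels are made up for this example.

```python
from pathlib import Path


def classify_dict_folders(dict_root):
    """Label each subfolder of `dict_root` by the kind of files it holds.

    Returns a dict mapping folder name to "stardict" (has an .ifo file),
    "hunspell-only" (has .aff and .dic but no .ifo), or "unknown".
    """
    report = {}
    for folder in sorted(p for p in Path(dict_root).iterdir() if p.is_dir()):
        has_ifo = any(folder.glob("*.ifo"))
        has_hunspell = any(folder.glob("*.aff")) and any(folder.glob("*.dic"))
        if has_ifo:
            report[folder.name] = "stardict"
        elif has_hunspell:
            report[folder.name] = "hunspell-only"
        else:
            report[folder.name] = "unknown"
    return report
```

Running `classify_dict_folders("koreader/data/dict")` on a copy of the device folder would list which folders contain only Hunspell data.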
|
09-13-2023, 01:46 PM | #8 | |
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
|
Quote:
|
|
09-14-2023, 04:18 AM | #9 |
Addict
Posts: 358
Karma: 10703708
Join Date: Dec 2020
Device: Kindle Paperwhite 3
|
Tried running the script on a stardict dictionary: http://libredict.org/dictionaries/ru...2023-09-07.tgz.
I unzipped the dictionary and ran the script on the directory. However, I got an error message: "[Errno 2] No such file or directory: '(long path)/Wiktionary Russian-Russian.ifo'" The file clearly exists and I've double checked that the path to it is correct. Is it the spaces that are messing something up or something else that I'm not understanding? |
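If spaces in the path are the suspect, one way to rule out shell quoting is to launch the script from Python with the arguments as a list, so each path is passed as a single argument regardless of spaces. This is only a sketch: the script name and the `--dict-file` and `-j` flags are taken from the command quoted in the reply in this thread, while the helper name and the example paths are hypothetical.

```python
from pathlib import Path


def build_add_inflections_command(ifo_path, inflection_json, script="add_inflections.py"):
    """Build an argument list for the script; no shell quoting is involved.

    The flags mirror the command quoted elsewhere in this thread; the
    function itself is a hypothetical helper, not part of the script.
    """
    ifo = Path(ifo_path)
    if ifo.suffix != ".ifo":
        raise ValueError(f"expected a path to an .ifo file, got: {ifo}")
    return ["python", script, "--dict-file", str(ifo), "-j", str(Path(inflection_json))]


if __name__ == "__main__":
    cmd = build_add_inflections_command(
        "dicts/Wiktionary Russian-Russian/Wiktionary Russian-Russian.ifo",
        "inflection_data/Russian.json.gz",
    )
    print(cmd)
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Because the list form bypasses the shell, a "No such file" error that persists here would point to something other than quoting.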
09-14-2023, 07:12 AM | #10 | |
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
|
Quote:
Code:
python .\add_inflections.py --dict-file '.\dicts\Wiktionary Russian-Russian\Wiktionary Russian-Russian.ifo' -j .\inflection_data\Russian.json.gz
With the unmunched data in the inflection_data folder, 1,940,291 synwords have been added to the dictionary. Here is the ifo file of the output: Spoiler:
|
|
09-14-2023, 10:28 AM | #11 | |
Addict
Posts: 358
Karma: 10703708
Join Date: Dec 2020
Device: Kindle Paperwhite 3
|
Quote:
I've tested it briefly and it seems to work rather well in most cases. The dictionary itself isn't the best due to bad formatting and lack of word stress. I realized that the Russian wiktionary that can be downloaded from within Koreader itself seems better with an even bigger .syn file (almost twice as big). However, some words just aren't found for whatever reason. Since that dictionary already comes with a .syn file I suppose it would be superfluous to run this script on it, too? I'll have to experiment further when I have the time. |
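For context on what a .syn file holds: in the StarDict format, each entry is a UTF-8 synonym string terminated by a null byte, followed by a 32-bit big-endian index pointing at the headword's position in the .idx file. The sketch below builds such bytes from an inflection mapping. It is not the script discussed in this thread; the function name and the `{headword: [forms]}` input shape are assumptions made for the example, and the sort key only approximates StarDict's ordering (ASCII case-insensitive, then byte-wise).

```python
import struct


def build_syn_bytes(headwords, inflections):
    """Serialize inflected forms as StarDict .syn entries.

    headwords:   list of headwords in the same order as the .idx file.
    inflections: hypothetical mapping {headword: [inflected forms]}.
    """
    index_of = {word: i for i, word in enumerate(headwords)}
    entries = []
    for lemma, forms in inflections.items():
        if lemma not in index_of:
            continue  # no matching headword, nothing to point at
        for form in forms:
            if form != lemma:
                entries.append((form.encode("utf-8"), index_of[lemma]))
    # bytes.lower() only lowercases ASCII, approximating StarDict's ordering.
    entries.sort(key=lambda e: (e[0].lower(), e[0]))
    return b"".join(form + b"\x00" + struct.pack(">I", idx) for form, idx in entries)
```

A dictionary that already ships a .syn file has entries of this kind; whether adding more helps depends on which forms are missing from it.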
|
09-14-2023, 11:00 AM | #12 | |
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
|
Quote:
|
|
09-16-2023, 11:07 PM | #13 | |
Junior Member
Posts: 9
Karma: 10
Join Date: Feb 2019
Device: Kobo Glo HD
|
Quote:
|
|
09-18-2023, 02:42 AM | #14 |
Enthusiast
Posts: 34
Karma: 11014
Join Date: Feb 2023
Device: Kobo Aura SE
|
AFAIK, fuzzy search has nothing to do with Hunspell files and sdcv has no support for Hunspell morphology.
|
03-30-2024, 01:31 AM | #15 | |
Member
Posts: 19
Karma: 10
Join Date: Sep 2014
Device: Kindle Scribe
|
When I convert my Tabfile to the dictionary file, I get the following output:
Preparing the inflection sources... Done.
Reading the input dictionary... Done.
> Processed 76,280 / ? words. Total inflections found: 78
Writing the output file(s)... Done.
I am not sure what "? words" means, but it says that only 78 inflections were found.
Edit: I was able to make some progress by combining unmunched JSON inflection files from different dictionaries. Up to 304 inflections were found on one dictionary and 869 on the other, but I am still getting a question mark. Is it possible that it's having difficulty with the bilingual aspect of the dictionary?
Edit: using the above method: Quote:
I must admit this is quite a useful script. Very much appreciated. Thank you! Last edited by sricochet; 04-01-2024 at 01:37 AM. |
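Combining inflection files, as described in the first edit above, could look roughly like the sketch below. The layout of the files in inflection_data is not shown in this thread, so the `{headword: [forms]}` shape is an assumption, as are the function names; adjust the loader to whatever structure the real files use.

```python
import gzip
import json


def load_gz_json(path):
    """Read a gzip-compressed JSON file (e.g. a *.json.gz inflection file)."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def merge_inflections(*sources):
    """Merge several {headword: [forms]} mappings, dropping duplicate forms.

    The mapping shape is assumed for this example, not confirmed by the thread.
    """
    merged = {}
    for source in sources:
        for lemma, forms in source.items():
            merged.setdefault(lemma, set()).update(forms)
    return {lemma: sorted(forms) for lemma, forms in sorted(merged.items())}


if __name__ == "__main__":
    a = {"aller": ["vais", "va"]}
    b = {"aller": ["va", "allons"], "être": ["suis"]}
    print(merge_inflections(a, b))
```

The merged mapping can then be written back with `json.dump` into a gzip file and passed to the script in place of a single source.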
|
|