Kanji searching added to JBLite’s kd2 module

I’ve started to implement searching for kanji from the command line. It’s not much so far, but it may be of mild interest.

Japanese search via kunyomi/nanori:

C:\code\projects\jblite>c:\python26\python -m jblite.kd2 data\kd2.sqlite3 ほとり
Searching for "%ほとり%", lang=None...
READINGS:
ID: 2298, literal: 畔
ID: 2396, literal: 瀕
ID: 4368, literal: 滸
ID: 4400, literal: 濆
ID: 5977, literal: 陲
ID: 8495, literal (repr): u'\u6c7b'
ID: 8562, literal (repr): u'\u6d98'
NANORI:
ID: 4, literal: 阿
MEANINGS:
No 'meaning' results found.
Result: [4, 2298, 2396, 4368, 4400, 5977, 8495, 8562]

French search:

C:\code\projects\jblite>c:\python26\python -m jblite.kd2 data\kd2.sqlite3 --lang=fr Asie
Searching for "%Asie%", lang='fr'...
READINGS:
No 'reading' results found.
NANORI:
No 'nanori' results found.
MEANINGS:
ID: 1, literal: 亜
Result: [1]

The values in “Result” come from the “id” column of the “character” table, which is the key needed to look up all data for a given character. Once a lookup-by-character-ID function is in place, it should be possible to print detailed character information to the command line. (Unfortunately, due to console encoding limitations, this is not recommended on Windows; note that some of the literals above could only be printed in repr form.)
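As a rough illustration of the lookup-by-character-ID idea, here is a minimal sketch. It is not JBLite's actual code: the function name `lookup_literals` is hypothetical, the “literal” column name is only guessed from the output above, and the demo runs against a throwaway in-memory database rather than the real kd2.sqlite3. It is written for Python 3 and prints escaped literals so it stays safe on consoles that cannot display kanji.

```python
import sqlite3

def lookup_literals(conn, char_ids):
    """Return {id: literal} for the given character IDs (hypothetical helper)."""
    char_ids = list(char_ids)
    if not char_ids:
        return {}
    # One "?" placeholder per ID, so the values are bound rather than interpolated.
    placeholders = ",".join("?" for _ in char_ids)
    rows = conn.execute(
        'SELECT id, literal FROM "character" WHERE id IN (%s)' % placeholders,
        char_ids,
    ).fetchall()
    return dict(rows)

# Demo schema: an assumption for illustration, the real kd2 schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "character" (id INTEGER PRIMARY KEY, literal TEXT)')
conn.executemany(
    'INSERT INTO "character" VALUES (?, ?)',
    [(1, "亜"), (4, "阿"), (2298, "畔")],
)

found = lookup_literals(conn, [4, 2298])
for char_id in sorted(found):
    # ascii() escapes non-ASCII characters, avoiding console encoding errors.
    print(char_id, ascii(found[char_id]))
```

The same pattern could be extended to join in readings and meanings once the real table layout is pinned down.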

Searching can no doubt be improved: kunyomi “-” and “.” markers in KANJIDIC are not yet handled, onyomi must be searched via katakana input, and western-language lookup is case-sensitive for non-ASCII characters (per Python’s sqlite3 defaults). However, it’s something that can be built on in the future.
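For what it’s worth, each of these three limitations could probably be addressed with a small helper. The following is only a sketch under assumptions, not what JBLite does: the names `strip_kun_markers`, `hira_to_kata` and the SQL function `pylower` are made up for illustration, as are the demo “meaning” table and its row. It is written for Python 3.

```python
import sqlite3

def strip_kun_markers(reading):
    """Drop the KANJIDIC "." (okurigana) and "-" (affix) markers from a kunyomi."""
    return reading.replace(".", "").replace("-", "")

def hira_to_kata(text):
    """Convert hiragana to katakana so onyomi can be searched with hiragana input."""
    # Hiragana U+3041..U+3096 sit exactly 0x60 below their katakana counterparts.
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )

# Case-insensitive matching for non-ASCII text: lower both sides using Python's
# Unicode-aware str.lower, registered as a user-defined SQL function.
conn = sqlite3.connect(":memory:")
conn.create_function("pylower", 1, lambda s: s.lower() if s is not None else None)

# Hypothetical table and row, purely for the demo.
conn.execute("CREATE TABLE meaning (fk INTEGER, value TEXT)")
conn.execute("INSERT INTO meaning VALUES (1, 'École')")

rows = conn.execute(
    "SELECT fk FROM meaning WHERE pylower(value) LIKE pylower(?)",
    ("%école%",),
).fetchall()

print(ascii(strip_kun_markers("-あ.がり")))
print(ascii(hira_to_kata("ほとり")))
print(rows)
```

Calling a Python function for every row is slower than a native comparison, so storing a pre-lowered copy of each meaning might be the better trade-off for a large table.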
