Obviously, the New Year’s J-Ben release didn’t happen. Anyway, I’m going to remove any tentative date on the software’s release, although I am continuing to work on it (as evidenced by the git repository logs). I’ve been doing a lot of code reorg, but things are starting to look good. I’m also rethinking what’s Read more about J-Ben 2: update[…]
The new year has nearly arrived, and with it, a small update. For family, I have some big personal news. I prefer not to say it publicly right now, but if you ask Mom, she can tell you what it is. Otherwise, ask me in an e-mail directly. I’ll update this in a couple of Read more about Happy New Year![…]
J-Ben development has certainly come along recently, with parsers now more-or-less working for KANJIDIC, KANJIDIC2, EDICT and JMdict. Unit tests are written and each of these is searchable, even if they do not yet have all the functionality and optimizations I’ve planned. However, for my own study purposes, I think I’m going to shift focus Read more about Current Happenings[…]
This was a rough one. I’m using Python’s SAX modules for parsing JMdict, and ran into trouble with how the parser handles JMdict’s XML entities. Nothing is wrong with JMdict itself, but the parser expands each entity into its full text, which is rather verbose and does not lend itself well to what I’m doing. I don’t want “word containing irregular kanji usage”; I want the “iK” code.
Unfortunately, the default ExpatParser doesn’t have a clear way to disable this. But if you read the docs closely enough, you can find out about setting a “default handler” for it, which has the side effect of disabling internal expansion.
This isn’t a perfect fix, but here’s the class I used to get this done:
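What follows is a minimal sketch of that approach, not necessarily the original class. The names `NoExpansionParser`, `CodeCollector`, `entity_codes` and the `SAMPLE` document are made up for illustration, and the sketch relies on `_parser` and `_cont_handler`, which are private attributes of the standard library's `ExpatParser` and could change between Python versions. The idea is to override `reset()`, where the underlying pyexpat parser is created, and attach a default handler to it. Once expansion is off, expat reports the unexpanded reference either through its skipped-entity hook (which the SAX driver forwards to `ContentHandler.skippedEntity`) or, if that hook is unavailable, through the default handler, so the sketch routes both to the same place.

```python
import io
from xml.sax.expatreader import ExpatParser
from xml.sax.handler import ContentHandler


class NoExpansionParser(ExpatParser):
    """Hypothetical ExpatParser subclass that leaves internal entities unexpanded."""

    def reset(self):
        # reset() creates the pyexpat parser and stores it in the private
        # attribute _parser, so the default handler is attached right after.
        super().reset()
        # Setting DefaultHandler (as opposed to DefaultHandlerExpand) has the
        # side effect of inhibiting expansion of internal entities.
        self._parser.DefaultHandler = self._on_default

    def _on_default(self, data):
        # Fallback path: if the reference shows up here as "&name;",
        # report it the same way the SAX driver reports skipped entities.
        if data.startswith("&") and data.endswith(";"):
            self._cont_handler.skippedEntity(data[1:-1])


class CodeCollector(ContentHandler):
    """Collects entity names (e.g. "iK") and any character data."""

    def __init__(self):
        super().__init__()
        self.codes = []
        self.text = []

    def characters(self, content):
        self.text.append(content)

    def skippedEntity(self, name):
        self.codes.append(name)


# Tiny stand-in for JMdict: one entity declared in the internal DTD subset.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE JMdict [
<!ENTITY iK "word containing irregular kanji usage">
]>
<JMdict><ke_inf>&iK;</ke_inf></JMdict>
"""


def entity_codes(xml_bytes):
    """Parse xml_bytes and return (entity names seen, concatenated text)."""
    parser = NoExpansionParser()
    handler = CodeCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.codes, "".join(handler.text)


if __name__ == "__main__":
    codes, text = entity_codes(SAMPLE)
    print(codes, repr(text))
```

One consequence of this design is that the entity no longer arrives through `characters()` at all, so a content handler has to track which element it is inside when `skippedEntity()` fires in order to attach the code to the right field.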
It’s been quite some time since I’ve updated this blog, so today I’ll write a few words. My life’s been pretty busy as of late, preparing for job hunting here in Japan. I will be continuing to live here for a while longer; I am unsure how many more years. So, I’m beginning to search Read more about Recent Events[…]