Mikan: A Tool to Auto-Generate Flashcards from a Japanese Text (with optional frequency restriction!)

I’ve had this project kicking around for a while and I finally cleaned it up (sort of) enough to share.

[https://github.com/moniquemurphy/mikan](https://github.com/moniquemurphy/mikan)

The general idea is: grab a Japanese text, run the script, and get a CSV file of all the vocab words in it, glossed (within reason; parsers aren’t foolproof). You can then use that list to treat any text like a guided reader without having to do the work manually.
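To make the text-to-CSV step a bit more concrete, here is a minimal sketch of the last stage only. It is not the code in the repo: it assumes a parser has already turned the text into dictionary-form words, and the names `build_cards`, `cards_to_csv`, and the tiny `glossary` dict are made up for illustration.

```python
import csv
import io


def build_cards(lemmas, glossary):
    """Return (word, gloss) rows for each distinct word that has a gloss.

    Words keep their first-seen order; words missing from the glossary
    are skipped, since no parser or dictionary covers everything.
    """
    seen = set()
    rows = []
    for lemma in lemmas:
        if lemma in seen:
            continue
        seen.add(lemma)
        gloss = glossary.get(lemma)
        if gloss is not None:
            rows.append((lemma, gloss))
    return rows


def cards_to_csv(rows):
    """Serialize the rows as CSV text, one card per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical parser output and a toy glossary (stand-ins for real data).
lemmas = ["猫", "食べる", "魚", "猫"]
glossary = {"猫": "cat", "食べる": "to eat", "魚": "fish"}

csv_text = cards_to_csv(build_cards(lemmas, glossary))
print(csv_text)
```

A word/gloss CSV like this is the kind of file most flashcard apps can import directly.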

If you don’t want highly frequent words because you know them already, you can restrict the output to, say, words less frequent than the top 1000, or words with a frequency value of > XYZ. If you know a lot about linguistic corpora and that’s something you’re into, go for the frequency value; if not, the rank threshold is pretty easy to understand.
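As a rough illustration of the rank threshold, the filter boils down to something like the sketch below. This is an assumption about the idea, not the repo's actual implementation: `filter_by_rank` is a hypothetical name, the ranks are invented rather than taken from BCCWJ, and treating words missing from the frequency list as rare is my own choice here.

```python
def filter_by_rank(words, rank_by_word, min_rank=1000):
    """Keep only words ranked below the top `min_rank` most frequent.

    Rank 1 is the most frequent word. Words absent from the frequency
    table are kept, on the assumption that they are rare.
    """
    kept = []
    for word in words:
        rank = rank_by_word.get(word)
        if rank is None or rank > min_rank:
            kept.append(word)
    return kept


# Invented ranks for illustration only (not real corpus values).
rank_by_word = {"する": 3, "猫": 850, "邂逅": 25000}
words = ["する", "猫", "邂逅", "謎単語"]

rare_words = filter_by_rank(words, rank_by_word, min_rank=1000)
print(rare_words)
```

With a threshold of 1000, the two common words drop out and only the rare word and the unlisted one remain.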

Caveats:

* You’ll need a certain base level of computer savvy to install and run it, but everything necessary is linked, free, and I tried to keep the installation explanations at an intermediate level.
* The frequency corpus data from BCCWJ is a large file, but since it required a pretty advanced level of Japanese to even figure out where the CSV file lived, I included it in the repo. It’s free for non-commercial use.
* I didn’t include kana in the output because parsing it from the JMDict file and matching it with the corresponding kanji was not straightforward for words with multiple readings. If this really gets under your skin, I encourage forking 🙂

If you happen to dabble in local EPWING files, there’s some code to use JSONified versions of them as an alternative to the (free and lovely, but less comprehensive) JMDict. I used [https://github.com/FooSoft/zero-epwing](https://github.com/FooSoft/zero-epwing) to convert them to JSON.

I hope this is helpful to someone!
