Add furigana by command line / plain text

I’m trying to **auto-generate and add** some kind of **furigana** to essentially plain text files. The output formatting doesn’t have to conform to anything in particular; I’ll just post-process it.

I’m OK with text manipulation, and with working on HTML or subtitle files (srt, vtt, ass, etc.). I can take these files and add or remove tags as necessary.

Ultimately I want to take some vtt subtitle files that I downloaded in bulk from Netflix (there are Chrome extensions etc. that allow you to do this) and add furigana. Whether the furigana is added in brackets, in <ruby> tags, or anything else is fine. The point is that I have 100+ essentially plain(ish) text files and I want to add some kind of hiragana/furigana information to supplement the kanji. So working in bulk, with scripting or command-line style processing, would be ideal.
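To make the target concrete, here is a rough Python sketch of the shape of output I’m after. Everything in it is hypothetical: `toy_reader` and its two-word table are stand-ins for whatever real tool generates the readings, and `annotate_line` / `annotate_vtt` are names I made up. It only shows the wrapping step on cue text, not the actual kanji-to-reading conversion, which is the part I’m asking about.

```python
import re
from typing import Callable, Optional

# Runs of kanji (CJK Unified Ideographs plus the iteration mark 々).
KANJI_RUN = re.compile(r"[\u4e00-\u9fff々]+")

# Placeholder readings for illustration only. A real run would get these
# from a morphological analyzer, not a hand-written table.
TOY_READINGS = {"日本語": "にほんご", "勉強": "べんきょう"}


def toy_reader(word: str) -> Optional[str]:
    """Hypothetical stand-in for a real reading generator."""
    return TOY_READINGS.get(word)


def annotate_line(
    line: str,
    reader: Callable[[str], Optional[str]] = toy_reader,
    style: str = "ruby",
) -> str:
    """Attach a reading to each kanji run the reader knows; leave the rest alone."""

    def repl(match: re.Match) -> str:
        word = match.group(0)
        reading = reader(word)
        if reading is None:
            return word  # unknown word: keep the kanji untouched
        if style == "ruby":
            return f"<ruby>{word}<rt>{reading}</rt></ruby>"
        return f"{word}({reading})"  # any other style: plain brackets

    return KANJI_RUN.sub(repl, line)


def annotate_vtt(
    text: str,
    reader: Callable[[str], Optional[str]] = toy_reader,
    style: str = "ruby",
) -> str:
    """Annotate cue text only; pass the header, timing lines and blanks through.

    Simplification: only the first line of a multi-line NOTE block is skipped.
    """
    out = []
    for line in text.splitlines():
        is_cue_text = (
            bool(line.strip())
            and "-->" not in line
            and not line.startswith(("WEBVTT", "NOTE"))
        )
        out.append(annotate_line(line, reader, style) if is_cue_text else line)
    return "\n".join(out)
```

Running this over a folder would just be a loop over the files (for example with `pathlib.Path.glob("*.vtt")`), so the bulk side is not the hard part. What I’m missing is a real replacement for `toy_reader`.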

I know there are websites that do kanji-to-furigana conversion, as well as browser extensions and Anki add-ons. These aren’t what I want, because I’m not dealing with a single webpage or Anki deck, and I don’t want to copy and paste into a website many, many times for online conversion.

And I know there are programming libraries; mostly I see recommendations of MeCab and its variants/wrappers, which are probably used at the back end of the above websites and extensions.

… but is there actually any Windows/Linux command-line script or application that I can feed a plain text file or similar (srt/ass/vtt subtitle) and get furigana output in any format?

I don’t mind a little coding or scripting, but I don’t want to reinvent the wheel and write a lot of code to get multiple libraries working together in a complex way if I don’t need to, for a problem that’s essentially been solved many times before (for websites, Anki add-ons and browser extensions).

Any advice? For instance, if MeCab is the best tool for the job (I’m not sure), is there any simple command-line use of MeCab?

[edit: clarify auto-generate]

by bgaskin
