How does copy paste work with furigana?

I read epub books on the Books app on my iPhone. Sometimes I like to copy the whole sentence out and paste it into a translation app to check if I understood it correctly.

I noticed that in some cases the sentence is copied cleanly, without the furigana. But sometimes the kanji that have furigana are not copied at all, and only the text without furigana comes through.

Does anyone know how it works?

2 comments
  1. epub files are actually zip files that contain HTML files, so it probably depends on how the furigana are marked up in each file. I guess it also depends on whether the furigana are in the file itself or added by the reader, as some apps can do that.

  2. There are many places where furigana (or in general, ruby text) can go wrong, from most likely to least likely:

    1. The author of the book set it up incorrectly
    1. Something went wrong during conversion to epub format
    1. There’s a bug in the reader app
    1. There’s a bug in the translator app
    1. There’s a bug in the iOS copy+paste function

    Since you asked how it works, here’s a very quick rundown:

    In a correct epub ebook, furigana is set up in a way that lets apps know which part is the original text and which part is the furigana. Something like this:

    <ruby>食<rp>(</rp><rt>た</rt><rp>)</rp></ruby>べる

    It’s useful info for apps that can handle text with ruby. Your translator app probably leaves the furigana out and just pastes 食べる. Other apps (like desktop Anki and MS Word) will paste it with the ruby formatting intact unless you tell them not to.

    If a different app (like a notepad app) doesn’t understand ruby text, it will just paste everything without the “code”, i.e. 食(た)べる, so the furigana won’t be lost.

    ***

    As you can see, it’s quite technical and very easy to mess up. Since you mentioned that it sometimes works and sometimes doesn’t, it’s most likely a problem in the epub file(s) rather than the app. Having said that, while handling the furigana part can be inconsistent, I have never encountered a bug where the _original_ text wasn’t copied+pasted correctly.
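
To make the two paste outcomes from the second comment concrete, here is a minimal Python sketch of how an app might flatten ruby markup into plain text. It is only an illustration under assumptions: the names `RubyFlattener` and `flatten_ruby` are made up for this example, and the Books app, iOS, and translator apps may handle the clipboard quite differently. It also assumes every `<rt>` and `<rp>` tag is explicitly closed, as in the example above.

```python
from html.parser import HTMLParser


class RubyFlattener(HTMLParser):
    """Turns HTML with <ruby> markup into plain text (hypothetical helper)."""

    def __init__(self, keep_furigana):
        super().__init__()
        self.keep_furigana = keep_furigana
        self.annotation_depth = 0  # > 0 while inside <rt> or <rp>
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <rt> holds the furigana, <rp> holds the fallback parentheses.
        if tag in ("rt", "rp"):
            self.annotation_depth += 1

    def handle_endtag(self, tag):
        if tag in ("rt", "rp") and self.annotation_depth > 0:
            self.annotation_depth -= 1

    def handle_data(self, data):
        # Base text is always kept; annotation text only when requested.
        if self.annotation_depth == 0 or self.keep_furigana:
            self.parts.append(data)


def flatten_ruby(html, keep_furigana=False):
    """Return the text of `html`, with or without the furigana part."""
    parser = RubyFlattener(keep_furigana)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


SAMPLE = "<ruby>食<rp>(</rp><rt>た</rt><rp>)</rp></ruby>べる"

print(flatten_ruby(SAMPLE))                      # 食べる
print(flatten_ruby(SAMPLE, keep_furigana=True))  # 食(た)べる
```

The first call mimics an app that understands ruby and drops the reading; the second mimics an app that just strips the tags, which is why the `<rp>` parentheses end up in the pasted text. If the markup in the epub is malformed (for example, the kanji placed inside `<rt>` by mistake), a flattener like this would drop the wrong part, which is one way a broken file could produce the behaviour described in the question.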
