やさしい日本語 Corpus: an Excel spreadsheet of 50,000 Japanese sentences manually simplified into a handpicked “core 2000” vocabulary (not the same as the Anki core2k as far as I can tell), with an English translation of each sentence

This looks like it could be a super useful resource! It includes the original sentences, the simplified sentences, and the English translations of those sentences.
It looks like the corpus was originally created as training data for automatic text simplification, that is, to help a machine learning model learn to rewrite complicated written Japanese in a simpler form.
Here’s the published paper describing how the corpus was created:
https://aclanthology.org/L18-1185.pdf
Here’s the Excel file of the 50,000 sentences:
https://www.jnlp.org/cgi-priv/download.cgi?id=SNOW/T15
Here’s the site I got the file from:
https://www.jnlp.org/GengoHouse/snow/t15
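
If you want to pull the sentences out of the spreadsheet for study, for example as flashcards, here is a rough Python sketch. It assumes, without having been checked against the actual file, that the first three columns are the original sentence, the simplified sentence, and the English translation, in that order. The names `SentenceTriple`, `load_triples`, `changed_only`, and `to_tsv` are made up for this example, and the sample sentences in the demo are invented, not taken from the corpus.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class SentenceTriple(NamedTuple):
    original: str    # original Japanese sentence
    simplified: str  # manually simplified Japanese sentence
    english: str     # English translation


def load_triples(path: str) -> List[SentenceTriple]:
    """Read the spreadsheet into triples.

    ASSUMPTION: the first three columns are original, simplified,
    English, in that order. Check the real file and adjust if not.
    Needs pandas plus an Excel reader such as openpyxl.
    """
    import pandas as pd  # imported here so the helpers below work without it

    df = pd.read_excel(path)
    rows = df.iloc[:, :3].dropna().astype(str)
    return [SentenceTriple(*row) for row in rows.itertuples(index=False, name=None)]


def changed_only(triples: Iterable[SentenceTriple]) -> List[SentenceTriple]:
    """Keep only the rows where simplification actually changed the sentence."""
    return [t for t in triples if t.original != t.simplified]


def to_tsv(triples: Iterable[SentenceTriple]) -> str:
    """Build tab-separated text: simplified, original, English (one row per line)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for t in triples:
        writer.writerow([t.simplified, t.original, t.english])
    return buf.getvalue()


if __name__ == "__main__":
    # Invented sample rows, only to show the shape of the data.
    sample = [
        SentenceTriple("彼は多忙だ。", "彼はとても忙しい。", "He is very busy."),
        SentenceTriple("これは本です。", "これは本です。", "This is a book."),
    ]
    print(to_tsv(changed_only(sample)), end="")
```

With the real file you would swap the sample for `load_triples("path/to/downloaded/file.xlsx")`, where the path is wherever you saved the download. Most flashcard apps, Anki included, can import tab-separated text.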
