There have been a number of posts referring to the use of ChatGPT as a resource for learning Japanese. It can indeed serve as one, but doing so comes with several problems. It also helps to understand what ChatGPT actually is and why it can give unreliable (yet convincing) answers.
# What is it?
There are a few videos (linked below) that explain all this in much greater detail, but I’ll try to summarize it for you.
GPT is, at base, a language model. What’s a language model? Well, unsurprisingly, it’s a model of a language. Such a model has many uses, but one particular use you’ll probably be familiar with is the text-prediction function in your phone’s messaging apps. It takes a word or two that you’ve just typed and generates a selection of words you might choose from for the next word. You can actually write an entire message by putting in a starting word and then continuously selecting one of the offered options. Doing this blindly will result in obvious nonsense because the language model on your phone is limited by memory, processing power, data-set size and algorithmic implementation. Despite this, it is fundamentally the same kind of “AI” as GPT.
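The phone-keyboard style of prediction can be sketched as a toy bigram model — a vastly simpler cousin of GPT that just counts which word follows which. The corpus and function names here are purely illustrative:

```python
from collections import Counter, defaultdict

# "Train" a toy bigram model: count which word follows which.
corpus = "the cat sat on the mat and the cat slept on the mat".split()
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def suggest(word, k=3):
    """Return up to k most likely next words, like a keyboard's suggestion bar."""
    return [w for w, _ in follows[word].most_common(k)]

def generate(word, length=8):
    """Blindly accept the top suggestion each time; quickly loops into nonsense."""
    out = [word]
    for _ in range(length):
        options = suggest(out[-1])
        if not options:
            break
        out.append(options[0])
    return " ".join(out)

print(suggest("the"))
print(generate("the"))
```

On this tiny corpus, always taking the top suggestion immediately settles into a repeating loop — the same effect you see when you blindly tap your phone’s middle suggestion. GPT differs in scale and architecture, not in the basic “predict the next token” objective.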
GPT stands for “Generative Pretrained Transformer”. ‘Generative’ in that it generates text to follow some input (like your phone’s app does); ‘Pretrained’ in that it has gone through an intensive training phase to set the weights of its architecture (basically reading a lot of text from a data source – such as the internet – e.g. reddit, wikipedia, and other specialized data sets); ‘Transformer’ in that its algorithmic approach uses the concept of ‘attention’ to what has come before – i.e. it can remember material from the original prompt and its output so far, and use that to make better guesses at what to generate next.
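The ‘attention’ idea can be sketched numerically as scaled dot-product self-attention: each token position computes how relevant every other position is to it, then mixes their representations accordingly. This is a single toy head with random weights — the shapes and values are illustrative, not GPT’s actual parameters (GPT also masks out future positions, omitted here for brevity):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: every position mixes in information
    from every other position, weighted by learned 'relevance'."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # token-to-token relevance scores
    weights = softmax(scores)                # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(weights.round(2))                      # row i: where token i 'looks'
```

Each row of `weights` shows where one token is “paying attention”; it is this mechanism that lets the model keep earlier parts of the prompt in play when predicting the next token.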
The original GPT was quite small and used only 117 million parameters (the number of ‘knobs’ it can twiddle in its ‘brain’ to improve its abilities). As an indication of scale, GPT-2 used 1.5 billion parameters and GPT-3 used 175 billion parameters (over 1000 times as many as GPT-1). In addition, the training data set for each increased significantly in size – more data generally means more opportunities to learn.
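As a quick back-of-the-envelope check on that “over 1000 times” claim, using only the parameter counts quoted above:

```python
# Parameter counts quoted above.
gpt1 = 117e6   # GPT-1: 117 million
gpt2 = 1.5e9   # GPT-2: 1.5 billion
gpt3 = 175e9   # GPT-3: 175 billion

# GPT-3 has roughly 1,500x the parameters of GPT-1.
print(f"GPT-3 / GPT-1 = {gpt3 / gpt1:,.0f}x")
print(f"GPT-3 / GPT-2 = {gpt3 / gpt2:,.0f}x")
```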
GPT can be further trained, or ‘fine-tuned’, for specific tasks – e.g. a fictional-story generator has different accuracy requirements from a medical-advice application. Outputs from the initially trained model can be used for further training toward responses of a desired quality, with poor outputs effectively discarded. ChatGPT is essentially a version of GPT-3 that has been fine-tuned to work well as a chatbot – i.e. good at generating its part of a natural-sounding conversation.
# Problems
As you might imagine, given its parameter count and training data size, GPT-3 is quite an accomplished language model, and ChatGPT is particularly good at producing fluent and convincing output. There are, however, hazards in inappropriate use of ChatGPT that particularly apply to learning Japanese…
* **A misalignment problem**. People asking questions expect a truthful answer – that is (normally) their goal. ChatGPT’s goal is not specifically to provide accurate, truthful answers – its goal is to provide convincing responses based on what it has seen in the corpus of data on which it was trained. To see how this may cause issues, [watch this video](https://youtu.be/w65p_IIp6JY) (which doesn’t specifically mention ChatGPT, but the issues are basically the same). So you can’t tell whether or not you’ll get a factual answer, and the way you phrase the question may get you a different answer.
* **A gullibility problem**. Because ChatGPT is fluent and can seemingly provide explanations for its ‘facts’, it appears to be an expert. In addition, its fast responses can be interpreted by naive users as an indication of established knowledge (because that’s how people answer – if they already know something, the response is quick). This can easily lead to people accepting answers simply because the answers look good and appear to come from an expert.
* **A testability problem**. How do you know whether the answer you get is correct – and correct in the context you asked it in? The data set used to train ChatGPT contains more than factual answers to questions – it includes different answers that are valid only in certain contexts, as well as misunderstandings and deliberate misinformation. ChatGPT is designed not to answer things it doesn’t know, but if it has seen an answer in its training data then, as far as it’s concerned, it “knows” an answer. People usually ask questions because they don’t know the answer (or haven’t found it elsewhere), so the testability problem can be huge – especially for beginners in Japanese who don’t yet have a solid framework of the language that might otherwise flag an answer as flawed.
# Conclusion
I think that AI in general can be a very useful tool in learning Japanese – especially if that AI has been fine tuned to the task. However, users need to recognize the fundamental limitations of these tools and should always take answers from them as ‘*pending verification*’ by some other means. AI can be useful as part of an investigation of an answer but great caution is needed particularly as AI systems become more and more fluent, human-like and convincing.
# Further Information
Robert Miles has done an informative series of videos on language models and GPT. Worth a watch for a greater understanding of the subject.
* [AI Language Models and Transformers](https://youtu.be/rURRYI66E54). Basic stuff on what they are and how they work.
* [Unicorn AI](https://youtu.be/89A4jGvaaKk). A story GPT-2 wrote about English speaking Unicorns. Shows the benefits of ‘attention’ in Transformers.
* [GPT-3: An Even Bigger Language Model](https://youtu.be/_8yVOC4ciXc). Interesting stuff about GPT-3.
# Comments (2)
This particular AI is for sure not at all viable for language learning.
But I am kind of excited about how this technology could be used in the future. Like you said, if someone creates an AI specifically for language learning that has been trained only on accurate data, it could become an amazing resource.
Imagine a chatbot that tailors its responses to your proficiency level and lets you chat in your target language, while peppering in some new vocab, correcting your mistakes and explaining what was wrong. Or just being able to get (factual) answers to your questions. Or what if it wrote sentences using the vocab you’re working on via a WaniKani API key?
The future of these models is very exciting but it’s not here yet.
I literally just asked it if it could help me learn Japanese, and it said no and gave me generic language-learning advice. I don’t think anyone is advertising it as a language-learning assistant.