Learn Language through Lyrics

Scraping the web for Mandarin lyrics and scoring their difficulty

Jye Sawtell-Rickson
7 min read · Oct 13, 2018

One of the hardest parts of learning a language is staying committed. As a beginner Mandarin student, I’m constantly looking for new mediums to practice with, both to stay engaged and to reduce the likelihood of gaps in my knowledge. I recently discovered Chinese music, and there’s a lot that I like, but the lyrics are often complicated enough that learning them is simply too difficult. I searched for recommendations on songs to learn, but most were outdated and gave little indication of the actual difficulty. With this in mind, I set out to classify the songs by difficulty and give myself something to practice with.

This article describes how to crawl the web for Chinese songs, apply basic language processing techniques to judge their difficulty, and finally make recommendations for learning.

If you’re just looking for songs to learn, you can check this article with the results: https://medium.com/@jyesawtellrickson/6-great-songs-for-chinese-beginners-1a679b4c6392

Learning with (Fun) Repetition

Before we proceed any further, we should formulate the problem in a little more detail. To learn efficiently, we need songs which contain new characters, but not too many. New characters learned in the context of familiar ones are far more likely to be remembered, which is what…
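The idea of "new characters, but not too many" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scoring method used in this article: the function names `new_character_ratio` and `pick_songs`, the sample lyrics, and the 5% to 30% thresholds are all hypothetical, and only characters in the basic CJK Unified Ideographs block are counted.

```python
def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def new_character_ratio(lyrics, known_chars):
    """Fraction of a song's unique characters that the learner does not know yet."""
    unique = {ch for ch in lyrics if is_cjk(ch)}
    if not unique:
        return 0.0
    return len(unique - set(known_chars)) / len(unique)


def pick_songs(songs, known_chars, low=0.05, high=0.30):
    """Keep songs whose share of new characters lies in [low, high].

    The thresholds are illustrative assumptions, not tuned values.
    Returns (title, ratio) pairs, easiest first.
    """
    scored = [
        (title, new_character_ratio(lyrics, known_chars))
        for title, lyrics in songs.items()
    ]
    return sorted(
        [(title, ratio) for title, ratio in scored if low <= ratio <= high],
        key=lambda pair: pair[1],
    )


# Hypothetical data: three "songs" and a learner who knows three characters.
known = set("我你爱")
songs = {
    "A": "我爱你，我想你",  # one new character out of four
    "B": "我爱你",          # nothing new to learn
    "C": "天地玄黄",        # everything is new
}
suitable = pick_songs(songs, known)
```

Here only song A is kept: B teaches nothing new, and C consists entirely of unfamiliar characters, so neither falls inside the chosen band.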
