Archive for 'Guides'

Busting Personal SRS Myths, and an Anki Simulation in Python

I am a faithful user of flashcards to study Chinese words, with Anki as my software of choice to take care of the spaced repetition rescheduling. Even though I try to keep my queue empty on a daily basis, there are still days when I feel like I’m swimming against the tide. If I look at my forecast of upcoming cards, the level of daily cards quickly drops to a low baseline after a week or so. Yet, I never seem to reach the level that Anki’s forecast graph promises me. Then there are other days where I get weary of the constant drilling and skip a few days. When I come back to study, I have a large queue of overdue cards waiting for me (as expected). However, once those cards are cleared, Anki’s forecast of future cards is surprisingly good—maybe better than if I hadn’t skipped those days. Am I being punished for my diligence? Is this just my perception of the flashcard experience, or am I encountering something tangible related to SRS scheduling?

One way to test these theories was to build a simulation of Anki’s SRS scheduling.
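To give a rough idea of what such a simulation can look like, here is a minimal sketch. It is not Anki's actual scheduling algorithm and not the simulation from the full post. It assumes a much simpler model: a passed review multiplies a card's interval by a fixed ease factor, and a failed review resets the interval to one day. The function name `simulate` and all of its parameters are invented for this illustration.

```python
import random


def simulate(num_cards=200, days=60, ease=2.5, pass_rate=0.9,
             skip_days=frozenset(), seed=0):
    """Return the number of reviews done on each simulated day.

    Simplified, hypothetical model (not Anki's real scheduler):
    every card starts out due on day 0 with a 1-day interval.
    A passed review multiplies the interval by `ease`; a failed
    review resets it to 1 day. Days listed in `skip_days` are not
    studied, so cards due then stay overdue until the next study day.
    """
    rng = random.Random(seed)        # seeded so runs are repeatable
    due = [0] * num_cards            # day on which each card is next due
    interval = [1.0] * num_cards     # current interval of each card, in days
    reviews_per_day = []

    for day in range(days):
        if day in skip_days:
            reviews_per_day.append(0)
            continue
        count = 0
        for i in range(num_cards):
            if due[i] <= day:        # due today or overdue
                count += 1
                if rng.random() < pass_rate:
                    interval[i] *= ease
                else:
                    interval[i] = 1.0
                due[i] = day + max(1, round(interval[i]))
        reviews_per_day.append(count)
    return reviews_per_day


daily = simulate()
skipping = simulate(skip_days=frozenset({10, 11, 12}))
print("studying every day:", daily[:20])
print("skipping days 10-12:", skipping[:20])
```

Comparing the two printed lists is one way to see how a short break changes the review load in the days that follow, at least within this toy model.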


The Lancaster Corpus of Mandarin Chinese as an SQL Database

In my Chinese studies, the Lancaster Corpus of Mandarin Chinese (LCMC) has been a useful source of data: word and character frequencies, collocations, phrase usage, parts of speech, etc. The corpus is freely available for non-commercial and research use. However, its native form is a set of XML files, which is not an easy format to work with. The XML data is also slow to read, because all the XML tags and the entire data structure need to be parsed. A much better format for the data is an SQL database. Stored in a database, many kinds of queries and reports can be executed very efficiently. Depending on the software, they can return results much faster than working from the XML files.

I have made available a Perl script and some other related tools to assist with extracting the LCMC files into a SQLite database. SQLite is a lightweight relational database management system intended for portability and ease of use. Because it needs no separate server process (it is not client-server), it is easy to install and use. It’s more ubiquitous than you might think. It’s how the Firefox and Chrome browsers store their history, cookies, and preferences. It’s also used, for example, by the Anki program as the storage format for flashcard data, and by the Calibre e-reader program to store information on installed e-books.
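As an illustration of the kind of query that becomes easy once the corpus is in SQLite, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `tokens`, its columns, the part-of-speech codes, and the sample rows are all made up for this example. They are not the schema produced by the Perl script and not real LCMC data.

```python
import sqlite3

# In-memory database with a hypothetical one-row-per-token table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (word TEXT, pos TEXT)")

# Made-up sample rows standing in for corpus tokens.
# The pos values are placeholders, not actual LCMC tags.
sample = [
    ("的", "u"), ("我", "r"), ("的", "u"),
    ("学习", "v"), ("我", "r"), ("的", "u"),
]
conn.executemany("INSERT INTO tokens (word, pos) VALUES (?, ?)", sample)


def word_frequencies(conn, limit=10):
    """Return (word, count) pairs, most frequent first."""
    query = """
        SELECT word, COUNT(*) AS n
        FROM tokens
        GROUP BY word
        ORDER BY n DESC, word
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


frequencies = word_frequencies(conn)
print(frequencies)  # [('的', 3), ('我', 2), ('学习', 1)]
```

A word-frequency list like this is a single `GROUP BY` query in SQL, whereas with the XML files every file would have to be parsed first.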


Creating Audio Flashcards with Transcriber, Audacity, and Anki

Transcriber, Audacity, and Anki are three programs, all free and open source, that are useful for language study. At some point in the future, I hope to write more on each of these. In the meantime, I wanted to announce two export plugins I created for Transcriber. One export creates a label file for Audacity, for splitting an audio file into individual clips, and the other creates an import file for Anki, associating the transcribed text with the audio segments. Below are step-by-step instructions for the 6 steps involved, starting from a raw audio file and finishing with a set of Anki flashcards.
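For readers curious about the two output formats, here is a small Python sketch. It is not the Transcriber plugins themselves, only an illustration of the files they would need to produce. It relies on two formats: an Audacity label track is a text file with one tab-separated `start`, `end`, `label` line per segment (times in seconds), and Anki can import tab-separated text in which a `[sound:...]` tag attaches an audio file to a field. The segment data, the function names, and the `clip01.mp3` naming scheme are assumptions made for this example.

```python
def audacity_labels(segments):
    """Build the text of an Audacity label file.

    `segments` is a list of (start_seconds, end_seconds, text) tuples.
    Each output line is: start <TAB> end <TAB> label.
    """
    return "".join(
        f"{start:.3f}\t{end:.3f}\t{text}\n" for start, end, text in segments
    )


def anki_import(segments, prefix="clip"):
    """Build a tab-separated Anki import file: text <TAB> [sound:file].

    The clip file names (clip01.mp3, clip02.mp3, ...) are a hypothetical
    convention; they must match whatever names the exported clips get.
    """
    lines = []
    for n, (_start, _end, text) in enumerate(segments, start=1):
        lines.append(f"{text}\t[sound:{prefix}{n:02d}.mp3]\n")
    return "".join(lines)


# Made-up transcription segments for illustration.
segments = [(0.0, 2.5, "你好"), (2.5, 5.0, "再见")]

labels_text = audacity_labels(segments)
anki_text = anki_import(segments)
print(labels_text, end="")
print(anki_text, end="")
```

Writing `labels_text` and `anki_text` to files with UTF-8 encoding would give one file to import as a label track in Audacity and one to import as notes in Anki.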


There has been a scarcity of posts on the blog lately, as I’ve been working on a web application for the site. This is a page anyone can use to estimate their knowledge of Chinese words. The start page for the test is here.

Wordtest - screenshot of test page

I recently bought an Amazon Kindle, for the primary purpose of reading more Chinese. It has turned out to be a great investment, since I am no longer tied to my computer screen for reading things I find online. I had been collecting bookmarks to online book sites for a long time without making much use of them. Now that I am a bigger consumer of reading material, I’m starting to make use of them. In particular, I need sites that allow for downloading the raw text, so that I can convert it into a formatted book.


Recording Streaming Radio with VLC

Being able to stream internet radio stations from all over the world is a great opportunity for language learners. Living in a country full of native speakers is the ideal environment for listening practice, but not everyone has the chance to travel or to spend significant time in the target country. For the rest of us, listening to live radio online gives us a touch of authentic culture, whether it’s a call-in show, traditional or pop music, or even commercial ads. In addition to streaming radio, podcasts are another way to listen to native speakers. As podcasts are individual files that can be downloaded, they make suitable material for listening practice, with the ability to repeat or slow down sections that are unclear. But podcasts in Chinese are few in number. Being able to record streaming radio would give the learner a wealth of practice material with massive variety. Or you can just record hours of music for your listening pleasure.
