While I do enjoy studying Chinese words using Anki, I must admit it can feel like a chore much of the time. I tend to focus aggressively on challenging words, which often means that my queue for the day is filled up to the daily limit, and that I’ve given myself a challenging workout. My usual habit at the beginning of a session is to check Anki’s forecast graph to see when my diligence will pay off, even though I have a lot to study today.
It is a real encouragement to see how great the future looks. While I have 99 cards to study today, all I have to do is study hard for about a week, and the number of cards due will be down by half. The graph continues to indicate reductions in the number of cards day by day, leveling off after two weeks to about 25 cards per day. I can imagine how awesome my study routine will be two weeks from now, when the spaced repetition algorithm will just require me to pull up Anki for a few minutes of flashcards a day in order to retain learned material.
The next day when I fire up Anki, I still have a lot of cards due. That’s alright, because I knew I had a few days before the daily load started its decrease. It’s still about two weeks before the magic leveling off at 25 cards. I did have a few lapsed reviews during the session, which probably explains it.
A week later, I’m still waiting for the promise of a smaller daily schedule. The graph below is my forecast a week after the first graph. Not only does the forecast look no better than it did a week earlier, but the number of cards due tomorrow is 35% more than it was a week ago!
After two more weeks, I’m looking forward to the green pasture of 25 reviews a day. But wait, what’s this? What the…??
So after three weeks of studying daily, my daily queue has gone down from 99 cards, to … 90! What is going on with the forecast graph, and why is it so far from reality?
When the Forecast Isn’t a Forecast
Actually, the forecast graph is doing just what it was programmed to do. When it claims that I will have 99 cards to study tomorrow, it is entirely correct. Where it starts to go off the rails is in all the days after tomorrow. The counts for those days do not account for the cards I will study tomorrow that end up lapsing. Cards that lapse tomorrow will be rescheduled for day 2, increasing the actual number of reviews beyond what the graph shows. The counts also don’t account for all the successful reviews tomorrow, which will be rescheduled further in the future, some of them within the 30-day window of the graph. All of these rescheduled cards will add to the original forecast for those future days.
In other words, the forecast graph isn’t really meant to be a “forecast” of the number of cards you are likely to study. The graph is really just showing the number of cards due on each day in the future, as of this date. In fact, the source code for the graph uses terms like Due, IsDue, etc., for its functions, never referring to it as a forecast. A real forecast would be much more complex, and would need to run projections of the probable number of successful and failed cards for each day in the future to adjust the daily counts. That would be a pretty cool feature, but isn’t what the software currently does.
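To make the distinction concrete, here is a minimal sketch in Python of what a "due count" graph amounts to. It is not Anki's actual code; the function name `due_counts` and the sample due dates are made up for illustration. Note that nothing in it looks at lapse rates or at where cards go after they are reviewed.

```python
from collections import Counter

def due_counts(due_days, horizon=30):
    """Count cards per due day as of today, ignoring any future reviews.

    due_days: hypothetical list of due dates, as day offsets from today.
    """
    counts = Counter(d for d in due_days if 0 <= d < horizon)
    return [counts.get(day, 0) for day in range(horizon)]

# A tiny made-up deck: two cards due today, three tomorrow, and so on.
sample = [0, 0, 1, 1, 1, 3, 7, 7, 45]
print(due_counts(sample, horizon=8))  # [2, 3, 0, 1, 0, 0, 0, 2]
```

The card due on day 45 simply falls off the end of the window, and the cards reviewed today never reappear anywhere, which is exactly the blind spot described above.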
An SRS-yphean Task
What about my experience of doing daily reviews, yet not making clear progress in lowering my daily queue? Can this all be from a small number of failed reviews?
Anki has a graph that directly shows the probabilities of success and failure for a particular deck, or for all decks. This is the “Answer Buttons” graph (“The number of times you have pressed each button”). It charts the number of times each answer button (1=Again, i.e. lapsed, 2=Hard, 3=Good, 4=Easy) was pressed for different types of cards. Ignoring the blue Learning section on the left (which just indicates how many items you already knew), the Young and Mature areas of the graph indicate how well you knew items as they were reviewed. The Young subchart represents cards whose current interval is less than 21 days, while the Mature subchart on the right side represents cards whose interval is 21 days or more. The Young subchart includes some items that are difficult to remember and are repeatedly forgotten, so its answer button averages tend to look less favorable than those of the Mature subchart. Mature cards have all been seen and answered successfully many times in a row, so the Mature subchart is more representative of the overall success rate of the deck. The Anki documentation suggests that the success rate (buttons 2 through 4) should typically be around 90%.
In fact, the SRS algorithm is constructed by default to aim for this level of performance, based on a model of memory in which the chance of forgetting a fact increases over time. If you wanted a higher success rate in your SRS sessions, you would need a more aggressive daily schedule because you would be reviewing items more frequently. If you wanted a more relaxed schedule, you would be doing fewer reviews but would have more lapses. The various algorithms in SuperMemo allow fine tuning of many scheduling factors based on empirical data from past reviews. Anki allows a small amount of control via the Interval Modifier deck option.
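As I understand it, the Anki manual suggests estimating a new Interval Modifier as log(desired retention) / log(current retention). The small Python sketch below just wraps that arithmetic; the function name is my own, and the result is a rough starting point rather than a guarantee of the retention you will actually get.

```python
import math

def interval_modifier(current_retention, desired_retention):
    """Estimate an Interval Modifier from two retention rates (0 to 1, exclusive)."""
    return math.log(desired_retention) / math.log(current_retention)

# Going from 85% to 90% success means shorter intervals (modifier below 1).
print(round(interval_modifier(0.85, 0.90), 2))  # 0.65
```

A modifier below 1 shortens every interval, which means more reviews per day; a modifier above 1 does the opposite, trading a lighter schedule for more lapses.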
A success rate of 90% seems great. It’s enough to get an “A” grade here in American schools. But a failure of 1 item in 10, or of 10 items in 100 somehow seems worse. For an SRS future schedule, it’s a significant amount. Here is the effect it has for a hypothetical deck. Say that I have a schedule of reviews as follows:
|Day|Number of cards due|
|---|---|
|0|100|
|1|90|
|2|80|
I have 100 cards due today. After my review session, I have successfully remembered 90 of them but forgot 10. These forgotten cards are repeatedly shown until I remember them, at which point they will be rescheduled for tomorrow. Before my review session today, there were 90 cards due tomorrow. After my session, there are now 90+10 or 100!
The next day, the situation looks slightly better. While I still have 100 cards to review, there are only 80 cards due the following day instead of 90. I am likely to have around 10 lapses again (note that, except for yesterday’s lapses, these are all a different set of items than yesterday, so the chance of forgetting is the same). Thus, on day 2 I will have 90 reviews instead of just 80. In addition to the lapses, the 10 cards that lapsed on day 0 and were successful on day 1 will be rescheduled for day 2, 4, or 5, depending on how easy I marked each card.
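The arithmetic of this example can be written out as a short Python sketch. It tracks expected values only and makes two simplifying assumptions of mine: every lapsed card comes back exactly one day later, and successful reviews are ignored entirely (they are rescheduled somewhere outside the picture).

```python
def actual_reviews(forecast, lapse_rate=0.10):
    """Expected reviews per day when each day's lapses return the next day.

    forecast: cards due on each day as of day 0 (the graph's numbers).
    """
    actual = []
    carried = 0.0  # expected lapses carried over from the previous day
    for due in forecast:
        today = due + carried
        actual.append(today)
        carried = today * lapse_rate
    return actual

# The hypothetical schedule from the table: 100, 90, 80 cards due.
print(actual_reviews([100, 90, 80]))
```

With a 10% lapse rate, the forecast of 100, 90, 80 turns into 100, 100, 90 actual reviews, matching the numbers worked out above.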
Which has a greater influence on increasing the future queue: lapses or rescheduling? The answer wasn’t immediately obvious to me, so I set up a simulation, using the Anki simulation in Python discussed in a previous post. I set up an initial deck to match the cards and review history in my actual deck. The particular deck has a lapse rate of about 15% (more precisely, 14.7%). Once loaded, I let the simulation run through a number of days of reviews, recording lapses and reschedules for cards that were reviewed from the first day (day 0) of the simulation forward. The answer is immediately clear:
The graph above conveniently combines two different but important data sets: the forecast as of day 0 of the simulation and the actual reviews on each day. The bottom area in blue is the number of cards due on future days as of day 0. Thus, it is equivalent to the forecast graph in Anki. You can see that the number of reviews quickly drops, reaching 20 reviews or below as of day 26. The combined area of all three data sets represents the number of items that were actually reviewed on each day. It is the combination of the original forecast plus any items reviewed from day 0 through day N-1 that were rescheduled. The items that were successfully reviewed and rescheduled again within the 30-day window make only a minor contribution. The major contributor to the daily queue is clearly lapsed cards. It’s such a major factor that as of day 25, the majority of reviews are coming from items that had lapsed in the previous 24 days. It’s like a slow, rolling wave of forgetting.
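For readers who want to experiment with this kind of breakdown, here is a toy Python sketch. It is not the simulation used for the graph above: the scheduling rule (multiply the interval by a fixed ease of 2.5 on success, reset to 1 day on a lapse), the tag names, and the randomly generated deck are all simplifying assumptions of mine.

```python
import random

def simulate(cards, days=30, lapse_rate=0.15, ease=2.5, seed=0):
    """cards: list of (due_day, interval_days) pairs as of day 0.

    Returns one dict per day counting that day's reviews by origin:
      'forecast'    card not yet reviewed during the simulation
      'rescheduled' card reviewed before, never lapsed in the simulation
      'lapsed'      card that lapsed at least once during the simulation
    """
    rng = random.Random(seed)
    state = [{"due": d, "ivl": i, "tag": "forecast"} for d, i in cards]
    history = []
    for day in range(days):
        counts = {"forecast": 0, "rescheduled": 0, "lapsed": 0}
        for card in state:
            if card["due"] != day:
                continue
            counts[card["tag"]] += 1
            if rng.random() < lapse_rate:
                card["ivl"] = 1          # forgotten: see it again tomorrow
                card["tag"] = "lapsed"
            else:
                card["ivl"] = max(1, round(card["ivl"] * ease))
                if card["tag"] == "forecast":
                    card["tag"] = "rescheduled"
            card["due"] = day + card["ivl"]
        history.append(counts)
    return history

# A made-up deck of 500 cards with random due dates and intervals.
deck_rng = random.Random(1)
deck = [(deck_rng.randrange(30), deck_rng.randint(1, 20)) for _ in range(500)]
for day, counts in enumerate(simulate(deck)):
    print(day, counts)
```

The 'forecast' column is what a due-count graph would have shown on day 0; the other two columns are the reviews it could not see. How large the 'lapsed' share grows depends heavily on the assumed intervals and lapse rate, so treat the output as an illustration of the mechanism rather than a reproduction of my results.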
What does it look like with a success rate better than 85%, say 90, 95, or even 100%? It’s hard to do a direct comparison because other factors come into play. While a lower rate of lapses will obviously lower the number of rescheduled reviews, all other factors being equal, a deck with fewer lapses will already have fewer scheduled cards on any given day. The graph below simulates the forecast starting from day 105, in a deck made by the addition of 20 cards a day for 100 days, followed by another 5 days of reviews with no further additions. In a deck with a 15% lapse rate, the number of cards due tomorrow is around 140, while in a deck with a 0% lapse rate, there are 80 cards due tomorrow. Two interesting observations are that the forecasts in both decks level off to 25 cards a day after about 3 weeks, and that the forecasts for both are equal after about 8 days. The dramatic difference is that the high-lapse case promises a quick drop in the number of reviews to reach that point. As we now know, that promise is not fulfilled.
If the prospect of a small number of reviews in the near future keeps you motivated to keep up with your flashcard study, don’t let this analysis discourage you! But if you find yourself repeatedly surprised that your daily reviews are higher than Anki’s rosy forecast, know that you’re not doing anything wrong. Just keep plugging away. Your daily reviews will slowly but steadily decrease over time, just not nearly as quickly as Anki’s forecast would have you believe.