Scaffolding Animations

Stills versus animations

Tversky et al. (2002) reviewed a number of studies that examined the relative effectiveness of stills versus animations and found that static images often outperform animated versions in terms of student learning. More recently, Hoffler and Leutner (2007) conducted a meta-analysis of 26 studies, which revealed that instructional animations appear to be more effective when highly realistic animations are employed or when the acquisition of procedural-motor knowledge is required. It has also been argued persuasively that certain subject matter involving complex physical systems is more amenable to instruction through dynamic visualisation (Hegarty 2005). Even so, there is a growing consensus that animations pose a number of challenges for both the multimedia designer and the end-user.

Animations are transient by nature and thus tend to overwhelm the available cognitive resources of the user (Kalyuga 2008). While working memory is processing the current frame, it must also retain information from previous frames and integrate it with the frames that follow. Dynamic visualisations of complex structures may involve several interacting elements that change over time and consequently cause users to split their visual attention between competing sources of information. These factors tend to contribute to a level of cognitive overload not often found in static representations. Nonetheless, animations have the advantage that they can depict essential fine-grain movements and thus represent an efficient way of displaying an extended sequence of graphical images.

Stills, on the other hand, are static and may therefore be viewed concurrently and re-inspected any number of times. They are often better suited to signalling devices such as labels, highlighting and arrows. It has also been suggested that stills encourage the user to “mentally animate” from the given information, thereby inducing a more active level of cognitive processing than animations.

Given that both animations and stills have their respective strengths and weaknesses, it is not surprising that cost-effective stills are sometimes preferable to resource-intensive animations. However, choosing one format over the other is not obligatory, and recent studies have investigated the use of both formats concurrently when dealing with complex information. Arguel and Jamet (2009) conducted studies on first aid procedures in which video plus static pictures produced better results than either format alone. The rationale was that key snapshots presented during the video would effectively leave a trace for working memory and thus help overcome the difficulties associated with the inherent transience of animations.

Designing a multi-stage animation

The need for a combined use of stills, animations and learner-controlled interactivity in a multimedia resource for the assimilation of complex information can be illustrated with a real-world example. The athletic triple jump will serve as the subject matter, but we could equally have chosen a dance routine, a surgical procedure, the rapid assembly of intricate machinery or any number of other tasks requiring complex human movement.

Stage 1: Stills
A good starting point is Kalyuga’s (2008) recommendation that learners move from static to animated formats as their level of expertise increases. We will presume that novices form part of the learning cohort and thus commence with stills. Learners with more expertise have the option to scan the static material quickly and move on to stage 2 if they consider it appropriate.

Figure 1 The use of critical stills with explanatory text and signalling forms stage 1 of the tiered animation.

Why begin with stills?

  • Segmentation and content representation
    Stills that focus on key points of the animation essentially segment the content and act as a visual form of content representation. This type of overview helps orientate the student and facilitates learning by minimising searching behaviour.

  • Signalling as a means of guiding attention
    Stills at key points during a triple jump will provide an opportunity for the content expert to attune the learner to the thematically relevant aspects of the task through the incorporation of signalling devices such as arrows, labels and highlighting.

  • Pre-training and deeper understanding
    Textual explanations can help explain underlying mechanisms and thus promote deeper understanding (Kriz and Hegarty 2007). The identification of key events within the procedural task acts as an essential form of pre-training.

  • Low cognitive load
    Stills may be revisited and viewed concurrently, thus ensuring that the learner is not cognitively overwhelmed by the transience of animations.
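
The segmentation idea above can be sketched in code. The following is a minimal illustration, not the implementation behind the resource described here: it assumes each critical still is stored with a timestamp, a caption and a list of signalling devices, and derives one segment of the clip per still. The names `KeyStill`, `segments` and `TRIPLE_JUMP`, along with all timings and annotations, are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class KeyStill:
    """One critical still: a moment in the clip plus its guidance."""
    time_s: float                                     # position in the clip, in seconds
    caption: str                                      # explanatory text shown with the still
    signals: list[str] = field(default_factory=list)  # e.g. arrows, labels, highlighting


def segments(stills: list[KeyStill], clip_length_s: float) -> list[tuple[float, float, str]]:
    """Split the clip into (start, end, caption) segments, one per still.

    Each segment runs from its still to the next still, or to the end of
    the clip for the last one.
    """
    ordered = sorted(stills, key=lambda s: s.time_s)
    result = []
    for i, still in enumerate(ordered):
        end = ordered[i + 1].time_s if i + 1 < len(ordered) else clip_length_s
        result.append((still.time_s, end, still.caption))
    return result


# Hypothetical triple jump phases; timings and annotations are invented for illustration.
TRIPLE_JUMP = [
    KeyStill(0.0, "Approach run"),
    KeyStill(2.4, "Hop", ["arrow: take-off leg"]),
    KeyStill(3.1, "Step", ["highlight: free knee"]),
    KeyStill(3.8, "Jump", ["label: landing position"]),
]
```

Calling `segments(TRIPLE_JUMP, 5.0)` yields one labelled segment per still, which could serve as the overview that orientates the learner before any motion is shown.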

Stage 2: The scrollbar
The learner is primarily concerned with copying or modelling the behaviour of the subject who is performing the triple jump, so the fine details of the movement are of interest. Unlike many physical or mechanical processes, human movement varies unpredictably, which makes it difficult to mentally animate between the stills. This second phase is therefore a crucial step before viewing the task in real time during the final stage.

Figure 2 The scrollbar allows the user to examine fine-grain movements. Note that the multiple views of the triple jumper are there to accentuate the variation of his movement and do not appear in this manner when scrolling.

Why include scrolling?

  • Self-paced examination of fine-grain movements
    Learner-controlled interactivity allows the user to examine the details of the human movement in an iterative, self-paced manner. Self-regulated scrolling facilitates schema construction.

  • Intermediary component
    If we started with the scrollbar, a novice would be unable to determine the critical points on which to focus attention. On the other hand, animation at full speed would not allow an appropriate examination of the fine details of human movement necessary to perform optimally in a triple jump.

  • Building information around critical steps
    "From research on event cognition, it is known that people conceive of events such as assembling an object as discrete rather than continuous, and as hierarchical, organized at the higher level around objects or large parts and at the fine level around actions on the separate objects or object parts." (Zacks and Tversky 2003)
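
One way to connect the scrollbar to the critical stills is sketched below. It is an illustration under stated assumptions rather than the mechanism used in the resource described here: the scrollbar is assumed to report a position between 0 and 1, the clip is assumed to be available as numbered frames, and the critical stills are assumed to be stored as a sorted list of frame numbers. The function names `frame_for_scroll` and `current_key_frame` are hypothetical.

```python
from __future__ import annotations

from bisect import bisect_right


def frame_for_scroll(position: float, frame_count: int) -> int:
    """Map a scrollbar position in [0, 1] to a frame index in [0, frame_count - 1]."""
    if frame_count < 1:
        raise ValueError("frame_count must be at least 1")
    clamped = min(max(position, 0.0), 1.0)  # ignore overshoot at either end
    return round(clamped * (frame_count - 1))


def current_key_frame(frame: int, key_frames: list[int]) -> int:
    """Return the most recent key frame at or before `frame`.

    `key_frames` must be non-empty and sorted in ascending order. Frames
    that precede the first key frame fall back to the first key frame.
    """
    i = bisect_right(key_frames, frame)
    return key_frames[max(i - 1, 0)]
```

With key frames at 0, 40, 60 and 90 in a 101-frame clip, dragging the scrollbar to its midpoint selects frame 50, and `current_key_frame(50, [0, 40, 60, 90])` identifies frame 40 as the critical still whose caption and signalling could remain on screen while the learner scrolls. This keeps the fine-grain frames organised around the discrete steps introduced in stage 1.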

Stage 3: The animation
Now we are ready for the cognitively demanding task of viewing a complex animation. By this point we have some foundational understanding and have examined the procedural task carefully in a self-paced and informed manner.

Figure 4 The real-time video allows the learner to view the performance objective at speed.

Do we still need the final animation?

  • Real time speed of human movement.
    The relative importance of viewing the video at original speed depends on whether performance of the procedural task is time-critical. A dance routine has a specific rhythm, so seeing the task executed at normal speed would be beneficial. Assembling machinery could be less time-dependent, in that the exact rhythm is not critical to performing the task, and the first two stages may therefore be adequate.

  • Performance objective
    The video at original speed constitutes a representation of the objective to be attained. It could also have been shown at the beginning of the presentation as a form of representing the learning objective for which the learning scenario was designed.

  • Empirical evidence
    Hoffler and Leutner’s (2007) meta-analysis suggests that highly realistic animated material (e.g. video) and animations aimed at the acquisition of procedural-motor knowledge are the types of dynamic visualisation most likely to be more effective than their static counterparts.

The example of an athletic event served to illustrate the need for several stages of presentation in order to guide the novice from a fundamental knowledge through a stage of visual rehearsal and finally to viewing the film clip which would otherwise have imparted little in terms of meaningful learning.


Arguel, A. & Jamet, E. (2009). Using video and static pictures to improve learning of procedural contents. Computers in Human Behavior, 25(2), 354-359.
Hegarty, M. (2005). Multimedia learning about physical systems. In R. E. Mayer (Ed.), The Cambridge Handbook of Multimedia Learning (pp. 447-465). Cambridge: Cambridge University Press.
Hegarty, M. & Kriz, S. (2008). Effects of knowledge and spatial ability on learning from animation. In R. Lowe & W. Schnotz (Eds.), Learning with Animation: Research and Implications for Design (pp. 3-25). New York: Cambridge University Press.
Hoffler and Leutner (2007), cited in Ayres, P. & van Gog, T. (2009). State of the art research into cognitive load theory. Computers in Human Behavior, 25, 253-257.
Kalyuga, S. (2008). Relative effectiveness of animated and static diagrams: An effect of learner prior knowledge. Computers in Human Behavior, 24(3), 852-861.
Kriz, S. & Hegarty, M. (2007). Top-down and bottom-up influences on learning from animations. International Journal of Human-Computer Studies, 65(11), 911-930.
Source of Triple Jump Footage,
Tversky, B., Morrison, J. B. & Betrancourt, M. (2002). Animation: Can it facilitate? International Journal of Human-Computer Studies, 57(4), 247-262.
Zacks, J. M. & Tversky, B. (2003). Structuring information interfaces for procedural learning. Journal of Experimental Psychology: Applied, 9, 88-100.