Chapter 11
Sustained Learning in 4th and 5th Graders but not 7th Graders: Two Experiments with a Talking Pedagogical Agent

Bruce L. Mann
Memorial University, Canada

Henry Schulz
Memorial University, Canada

Jianping Cui
Bow Valley College, Canada

Shannon Adams
Brother Rice High School, Canada

ABSTRACT

In this chapter, agent movement and temporal speech cueing were designated for empirical study. In Experiment 1 an agent presented students in grades 4 and 5 (n = 133) with instruction, practice, and feedback on the proper usage of the apostrophe to show singular and plural ownership. Analyses of the data in Experiment 1 showed that modality effects favoured speech cueing over text cueing but agent animation had no effect. In Experiment 2, a different agent presented students in grade 7 (n = 91) with examples and practice questions on multiplying and dividing fractions. Experiment 2 data showed no effects for modality or agent animation. The data reflects previous findings of inconsistent effects in modality research.

DOI: 10.4018/978-1-4666-0137-6.ch011

Copyright 2012, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

INTRODUCTION

Pedagogical agents are computer-generated characters that can be programmed to move around the screen and provide advice to students in a speech or text bubble. These animated characters can be expected to focus student attention on critical information in a computer program (Dehn & van Mulken, 2000) or even provide individualized scaffolding on an educational website (Lester, Towns, Callaway, Voerman, & FitzGerald, 2000; Moreno, Mayer, & Lester, 2000), though the optimal role for an agent in educational multimedia is unclear (Kim & Baylor, 2006).

LITERATURE REVIEW

In the absence of a comprehensive theory of learning from pedagogical agents, researchers must rely on relevant research in human psychology and education. Theory development in educational psychology has added insight to our understanding of learning from educational media, including, though not limited to, the finding that students comprehend oral and written discourse differently (Hildyard & Olson, 1982), fuzzy trace theory (Brainerd & Reyna, 1990), the interactive-compensatory hypothesis (Stanovich, 1980), dual coding theory (Paivio, 1986), the separate streams hypothesis (Penney, 1989), the split-attention theory of multimedia learning (Chandler & Sweller, 1991), the model of working attention (Baddeley, 1992), the structured sound function model of instructional design (Mann, 1992, 1995, 1997a), the dual processing theory of working memory (Mayer & Moreno, 1998), the cognitive theory of multimedia learning (Mayer, 1997, 2001), and the attentional control theory of multimedia learning (Mann, 2006, 2008a, 2009).

LISTENING AND READING IN ADULTS

From these advances and others, we know that when adults listen and look at educational multimedia, they integrate spatial and verbal sensations in their working memory for a short time as they generate meaningful relationships between the spatial store and the verbal (language) stores.
Most adults can systematically and completely integrate information from listening and reading (Pressley & McCormick, 1995), by self-initiating an executive control of these different mental processes (listening and reading), as suggested in the literature (Halliday, 1987; Higginbotham-Wheat, 1991; Penney, 1989). During listening, adults acquire the gist or meaning from the auditory sensations (Hildyard & Olson, 1982; Penney, 1989; Reyna, 1992; Brainerd, 1993), and from reading text, acquire the details or surface features (Tannen, 1985), sometimes known as verbatim information learning (Martin & Briggs, 1986; Penney, 1989). However, when our attention is overloaded or distracted, features can be combined inappropriately. We know that students learn better when the instructional material does not require them to split their attention between multiple sources of mutually referring information (Chandler & Sweller, 1992; Mayer & Moreno, 1998; Mousavi, Low, & Sweller, 1995). Meaningful learning occurs when adults select relevant information in each store, organize the information in each store into a coherent representation, and make connections between corresponding representations in each store (Mayer, 1997).

AUTOMATIC AND CONTROLLED PROCESSING

Whenever adults read and listen to easy or familiar content, they use automatic processing (Schneider & Shiffrin, 1977), also known as pre-attentive processing (Treisman, 1986). Pre-attentive processing
of easy or familiar content can occur in parallel; that is, we can handle two or more idea elements simultaneously. Adults implement pre-attentive processing on easy tasks or with highly familiar items. On difficult tasks or with unfamiliar items, however, adults will apply controlled processing (Schneider & Shiffrin, 1977), also known as attention focusing (Treisman, 1986). Attention focusing on divided-attention tasks is serial; only one task is handled at a time. Adults will use attention focusing on difficult or unfamiliar divided-attention tasks, once described as “the glue that binds the separate features of a stimulus - such as the colour and shape - into a unitary object” (Matlin, 1989, p. 57). When we pay attention to a visual stimulus, cerebral blood-flow studies show increased activity in the visual and parietal cortex (i.e., the top, back region of the brain) (Robinson & Peterson, 1986). When we shift our attention from the visual to the auditory modality, blood flow increases in the prefrontal cortex (located at the top, front region of the brain) (Matlin, 1989). In this way, we gain and hold our own attention, whether it’s a response to an emergency message or the beginning stages of a lesson. Figure 1 illustrates two models of attention on a difficult or unfamiliar task, one showing (a) high mental effort associated with high cognitive load from an unbalanced input from both channels; the other (b) normalized mental effort associated with normal cognitive load from a balanced input from both channels.

Figure 1. Two models of attention on a difficult or unfamiliar task. Adapted from Mann, Newhouse, Pagram, Campbell, & Schulz (2002).

Adults and older adolescents can usually be expected to examine stimuli systematically and completely, as suggested in Pressley and McCormick (1995). School-age students, however, may have a different experience. At 12 years old, the executive process that makes a child fully conscious of their problem solving abilities, and allows him or her to relate prior knowledge to a current problem in a systematic way, is still maturing. For this reason, student enjoyment of multimedia is either uncorrelated or negatively correlated with their learning outcome (Clark, 2001; Clark & Feldon, 2005). Frequently students find feedback following an error more interesting than feedback following a correct response (Ragsdale, 1988). “Many learners will not notice the option to read directions or will try to save time by skipping them” (Alessi & Trollip, 1991, p. 22). “While aesthetically pleasing, feedback provided in text will go unnoticed by the student” (Alessi & Trollip, 1991, p. 72). With increasing age and experience, a child’s processing becomes more efficient. By the time the child reaches adolescence, the executive process permits him or her to reason in a systematic and logical fashion (Mussen, Conger & Kagan, 1974).

This chapter addresses this persistent educational problem of ignoring or forgetting to read important instructions and feedback presented in text or other visual displays from a computer screen. The general goal was to determine whether school-aged children would learn from “agent animation” (movement or no movement) and “modality” (auditory or on-screen text). Agent “Genie” tutored 4th and 5th graders on the proper usage of the apostrophe to show singular and plural ownership. Agent “Peedy” tutored 7th grade students on the multiplication or division of fractions. The interfaces were designed so that the agents’ movements and speech prompts would activate associations between relevant prior knowledge and the new information.
It was expected that these young learners would retain more knowledge, and find more creative solutions to problems, than students using agents that did not move or talk.

Recent findings suggest that some school-aged students using educational multimedia are unable to generate sufficient gist to solve problems; that is, their under-developed biological capacity to extract gist from speech limits their mental ability to generate sufficient referential connections between the speech prompts and the limited text, and between the speech prompts and diagrams. For example, some researchers (Shilling, 1991; Weiner, 1991) believe that positive findings from using speech in educational multimedia may only be generalizable to particular populations of learners, such as adults and older adolescents. Shilling’s (1991) study with eighty-one kindergartners reported statistically non-significant learning effects between students using conventional writing materials and/or computers with and without available synthesized speech feedback over an eight month period. Similarly, Wiener (1989) investigated the differential effects of two presentation conditions, visual only and visuals cued with speech, on sight-word learning with fifty-five handicapped third graders. The results indicated non-significant statistical differences between the presentation conditions. Two years later, however, Weiner (1991) reported different results with twenty-four junior-high students. A recent study with 7th graders in Australia (Mann, Newhouse, Pagram, Campbell, & Schulz, 2002) reported no differences between speech cueing and on-screen text cueing. Witteman and Segers (2010) speculated that the experimental materials in their study were not difficult enough to split attention, which may have accounted for what they called “a reversed modality effect,” though it may more likely have been attributable to an expertise reversal effect (Kalyuga, Ayres, Chandler, & Sweller, 2003; Kalyuga, Chandler, & Sweller, 1999).
In a within-subjects design by Segers et al. (2008), children received the same four lessons in the same order but in different formats, presented in a mixed research design: (1) written information only; (2) written information accompanied by representational pictures; (3) oral information only; or (4) oral information accompanied by representational pictures. Analysis of the data showed that differences between the presentation conditions were statistically non-significant. It is possible that the design of the study itself contaminated the visual-only treatment with memory
traces from the speech treatment, as suggested previously (Mann, 1997a). Repeated measures in speech research should only be implemented by occasion, and then only after time lapses between occasions.

DESIGN UNCERTAINTY

Most of the published research on multimedia learning calls for clearer directions on whether audio should replace or enhance onscreen instruction and feedback (Barron & Kysilka, 1993; Koroghlanian & Klein, 2004), that is, an audio model for computing (Buxton, 1989). Some authors want purposeful advice with a strong cognitive foundation (Mayer, 1997), others a pragmatic emphasis (Reiser & Gagne, 1983; Smaldino & Lowther, 2007), still others straight narration (Mayer, 2001) or something more than narrated screen text (Bishop, Amankwatia, & Cates, 2008), or something beyond word lists (Gyselinck, Jamet, & Dubois, 2008). Most researchers want “something more,” without saying what something more entails or how this might be achieved. We believe that as long as the function of the audio with the onscreen visuals is left unclear, researchers will keep asking for advice for designing multimedia instruction. Aside from the declared need for purposeful design advice, there is the issue of testing. Multimedia materials are usually tested in factorial experiments to determine their immediate impact (e.g., see Mayer, 2001, for a review), instead of repeated measures to determine the more lasting effects of the treatments (Mann, 1994, 1995, 1997a; Segers, Verhoeven, & Hulstein-Hendrikse, 2008).

DESIGN HEURISTICS

Researchers typically use one of eight different design heuristics for integrating audio into computer-based animations, graphics and onscreen text for learning (Mann, 2008; 2009a). Each heuristic differs from the others in its scope or depth of advice, and carries a different assumption about how people learn from multimedia.
See Mann (2009) for details about each of these design heuristics and corresponding assumptions about multimedia learning. Table 1 outlines the eight different design heuristics for enhancing animations, graphics and onscreen text with audio for learning.

Table 1. A summary of design heuristics and corresponding underlying assumptions about how people learn from multimedia

Design Heuristic            Underlying Assumption
Structured Sound Function   Learning occurs when student attention is self-controlled, where the gist is heard and the details read
Cognitive Load              Learning occurs when extrinsic load is reduced and germane load is sufficient
Maximum Impact              Learning occurs when auditory and visual stimuli are sufficiently strong
Balanced Input              Learning occurs when the logogens and imagens make referential connections
Favorite Method             Learning occurs when the right instructional method is used
Favorite Feature            Learning occurs when the right technology is used
Design-By-Type              Learning occurs when an instructional software taxonomy is followed
Whatever Works              Learning occurs when the developer’s intuition is considered

Adapted from Mann (2009)

The present research is only concerned with the Structured Sound Function (SSF) model of instructional design. The SSF model has five functions and three structures for relating the sound with the visual events. Figure 2 illustrates the structured sound function (SSF) model for designing the modality of instruction with convergent temporal sound cueing highlighted.

Convergent temporal sound cues promote thinking by focusing the learner’s attention in a stepwise procedure toward a specific solution (Mann, 2006, 2008, 2009b) that can set the stage or serve as a signal for specific behaviors to take place (Burton, Moore, & Magliaro, 2004). During convergent goal setting, for example, the learner is encouraged to use a variety of sources to solve a problem (e.g., answer look-up) to produce an acceptable result. Selecting the goal and constancy for a sound cue for a visual event is the most important of the three structural components in the SSF model (Mann, 2006).

Figure 2. The structured sound function (SSF) model for designing the modality of instruction. Adapted from Mann (2008).