Zengo Sayu is an immersive, interactive virtual environment designed to teach Japanese prepositions to students with no prior knowledge of the language. The underlying pedagogical theory is based on ideas expressed in Total Physical Response and other whole language instructional approaches. We show how virtual reality overcomes some of the limitations of these methods while retaining all of the advantages. The immersive aspects of a virtual environment will help students develop an understanding of Japanese through natural, physical interaction, thus strengthening both linguistic acquisition and recall abilities, while reducing the need for translation into the first language. Our prototype application uses full immersion with a head mounted display, digitized voice samples for natural speech reproduction, and voice recognition and body tracking technology to allow the user to interact with and affect the world through combined vocal and gestural commands. Although designed for the Japanese language, our approach can be adapted for a wide range of different language and educational applications.
KEYWORDS Virtual Reality, Japanese Language, Education.
Recent advances in virtual reality technology have extended the boundaries of how the technology can be applied to teach foreign language. This paper describes a prototype application that uses virtual reality (VR) to solve practical problems faced by Japanese instructors and students. These include:
* The challenge faced by Japanese language students in building linguistic fluency.
* The anxiety associated with learning a foreign language.
* The problem for teachers in presenting a rich, stimulating classroom environment.
* The difficulty for students and teachers in giving and receiving adequate practice time, correction and attention, especially in distance learning programs.
A more general concern is the need for greater research into the efficacy of VR and computer technologies in education. While there has been some anecdotal research, there have been few empirical studies to guide the development of educational VR applications, and none in the area of language education. Zengo Sayu hopes to address this deficiency.
In this paper we first establish the need for effective Japanese language education and then discuss the limitations of existing instructional methods. We show how VR can overcome these limitations while maintaining the advantages of current pedagogical techniques. Finally, we describe the Zengo Sayu application and the proposed study for evaluating its effectiveness in an educational setting.
A NEED FOR EFFECTIVE JAPANESE EDUCATION
Over the last decade there has been an explosion of interest in learning Japanese language in the United States. Japanese is now being offered not only at the post-secondary level, but also in many high schools, junior high schools and even in elementary schools across the country. The Modern Language Association reports a 94.9% increase (from 23,454 to 45,717) in college students studying Japanese between 1986 and 1990[1]. Other innovative programs establish full language immersion classrooms for elementary school students.
Japanese language is also being taught by distance education. In 1986 the U.S. Congress established the STAR program to send satellite Japanese courses into public schools unable to afford to hire their own instructors. The program has grown to over 10,000 students nationwide. In 1993 the U.S. Department of Defense declared the lack of Japanese speaking Americans in high tech industries as a potential threat to U.S. national security and has responded by naming key universities around the country as centers for Japanese study. A number of these centers are currently implementing a closed broadcast distance learning program in Japanese language targeted at training engineers and others in the high technology field.
As can be seen from these figures there is a clear need for increased Japanese language education to meet the growing demand. But there is also a need for higher quality education. Learning Japanese and other non-Indo-European languages can be very difficult for Americans. While American students require an average of 720 hours of instruction to reach level 3 proficiency in oral skills in French or Spanish, from 2,400 to 2,760 hours are required to achieve the same fluency in Japanese or Chinese[2]. Students of Japanese face additional challenges such as high levels of anxiety, psychological pressures, and limited opportunity to converse with native speakers[1]. It is thus apparent that the study of Japanese requires great commitment and persistence, and that there are significant impediments preventing success for many American students. Considering these rather formidable obstacles, it is not surprising that Mills, Samuels & Sherwood[3] report attrition rates as high as 80% occurring in Japanese language programs on a regular basis. This brings into question both the current methods for teaching the language and the instructional tools employed.
The audio-lingual and grammar-translation methods, which enjoyed widespread popularity for many years in the post-war era, have come under increasing scrutiny. Since then, various new approaches to language instruction based on communicative competency have been pioneered. One well known example is Asher's Total Physical Response (TPR) strategy[4,5,6]. TPR studies report improved understanding, attitude and retention when students take a physically active role in learning language. TPR is a direct assimilation method in which the meaning of the target language is conveyed through physical demonstration, and it does not use any form of translation into the first language. TPR enjoyed a brief window of attention in the 1960s and '70s, but interest has since waned due to the limitations outlined below.
Total Physical Response couples physical activity with commands in the target language to facilitate the direct assimilation of language[4]. The basic instructional strategy of TPR is that students respond to commands and are not expected to merely mimic the teacher's utterances. For example, students respond to commands such as "Stand up. Say your name. Go to the window." The teacher demonstrates the action along with the command, and the students are expected to listen and imitate. This direct approach to language assimilation facilitates long term retention, increases the ratio of high achieving students, and improves confidence and attitude toward foreign language learning[6].
Asher's initial findings have been supported by a small group of studies using subjects of different age ranges and modifying various testing parameters[7,8,9]. However, Omaggio[10] criticizes TPR's extensive use of imperatives as being limited to what is easily reproduced and she warns that unless this mode of instruction is supplemented with other types of practice, students will have little opportunity to internalize natural language used for authentic purposes. Baltra adds her own criticism of TPR, saying that many adults may find the activity too degrading or improper to participate[11].
Subsequent theorists have modified Asher's methods to try to capitalize on the obvious strengths of TPR. Terrell's Natural Approach (NA)[12] is an attempt to build a more generalizable teaching method on the foundation of TPR and communicative competency. Terrell adopted Asher's TPR techniques because he found them effective, particularly in the early stages of language learning[11]. Like TPR, the NA recognizes the need for a "silent period" where students absorb language without the stress of audio-lingual-type listen and repeat drills. The proposed merit of a silent period is supported by other researchers [13,14,15]. Another key element of NA is the emphasis on aiding the student in binding meaning to form using three fundamental approaches [12]:
* acquisition in incremental stages,
* developing concrete associations,
* using speech techniques to draw attention to critical, target language.
Natural Approach pedagogy is also concerned with the psychological tension experienced by beginners. Terrell emphasizes the need to make language learning enjoyable in order to diminish stress typically associated with it[11]. A number of studies have also indicated a strong relationship between attitude toward learning a foreign language and student performance[16,17,18]. Factors of attitude, motivation and anxiety have a significant impact on educational outcomes. Thus instructional approaches which reduce stress, particularly for beginners, are especially desirable.
To summarize, the TPR and NA approaches have the following characteristics:
* interactivity
* physical demonstration
* no translation into first language
* silent periods where students absorb language
* coupling of physical activity with spoken commands
These methods seem to benefit greatly from their highly interactive, full-body participation and achieve substantial language learning gains. Students are also more engaged and show lower levels of anxiety because the learning tasks and expectations are clear.
However, one significant limitation is the reliance on physical activity and demonstration. While this is the feature that makes these methods so successful, there are severe restrictions on what can be demonstrated and experienced within a typical language classroom. It is also often difficult to separate layers of abstraction from the real world. For example, if the instructor is trying to teach color and holds up a red block while repeating the word for "red", will the students associate this word with the color or the shape of the block? In the next section we show how virtual environments may provide the vehicle to surpass these physical limitations.
Preliminary research at the Human Interface Technology Laboratory at the University of Washington[19] and elsewhere[20,21] shows that VR is potentially highly useful as an educational medium. Virtual learning environments are self-paced, immersive and interactive, allowing students to receive audio and visual instruction while interacting with virtual objects in real time.
The interactive nature of virtual environments lends itself well to constructivist approaches to instruction and suggests the potential for an entirely new form of experiential learning. Constructivists put the student in charge of the learning process. In the constructivist model, Winograd and Flores describe the teacher's role as supporting constructive learning activities so that students' efforts at constructing understanding, using their cognitive tools, become transparent or ready-at-hand[22]. One of our long-term research goals is to test VR as a medium for making the teaching process "transparent", so students can focus on content rather than falter with the mechanics of instruction.
Another strong argument for the use of VR in education is its experiential nature. Duffy and Jonassen state that "The experience in which an idea is embedded is critical to the individual's understanding of and ability to use that idea."[23] In other words, experience is a vehicle for knowledge creation and recall. Virtual Reality can give students the experience needed to build their understanding from the ground up. Winn[24] suggests that VR can give students a physical and intuitive understanding of abstract concepts prior to tackling symbolic representations of the domain. The key to developing intuitive understanding lies in the interactive nature of VR.
Currently there are only a handful of empirical studies which evaluate the educational efficacy of VR in any knowledge domain, and none in foreign language learning. To the extent that virtual reality has actually been employed in teaching, it has almost exclusively been used for training limited skill sets under very specific conditions, such as flight simulators, military training, robotics and industrial applications.
However, as the technology becomes much more accessible, there is increasing opportunity for using virtual reality in more general educational applications. An example of what such applications might look like comes from Winn and Bricken[25], who proposed a virtual environment to teach elementary algebra. They believe that VR is an optimal instructional medium because the axioms and behaviors of algebra can be built right into the virtual world. Their approach is consistent with the constructivist paradigm that the most effective learning takes place when the learner is in control of the process.
In their work, Winn and Bricken list six promising aspects of VR in education:
* Virtual objects behave in concrete ways.
* Virtual worlds are totally engaging, entirely immersing the student cognitively in the environment.
* Interaction is intuitive because students manipulate objects in natural ways, by grasping, pointing, etc.
* The virtual world can be programmed to provide various types of guidance to students.
* The system can automate some procedures, allowing students to concentrate on others.
* Students can explore and return to the same place repeatedly, building an increasingly sophisticated understanding of concepts and procedures.
These characteristics fit well with the TPR and NA paradigms, suggesting that virtual environments can provide the same language learning benefits. Virtual Reality also overcomes the limitations of using TPR and NA in the real world. A virtual language learning environment can almost limitlessly extend the boundaries of what can be portrayed and experienced. Student interactions can be made far more complex and intellectually appealing than being repeatedly told to "Sit down" or "Stand up." Finally, virtual environments can be created to mimic the real world, or imbued with abstractions and behaviors which would be unattainable in the physical world, considerably simplifying the task of teaching abstract language.
Zengo Sayu maximizes these advantages to create a highly interactive and engaging learning environment. We predict that students using this application will experience many of the benefits reported for communicative competency approaches, such as improved understanding and recall, and a better attitude toward language education. Being able to naturally manipulate objects using voice and gesture, and hearing digitized human speech feedback, will enhance the quality of the students' experience in the virtual environment, and hopefully increase their learning.
This approach is radically different from the current array of instructional aids used in the language classroom environment, such as audio tape, pictures and video. Unfortunately, each of these aids suffers from practical limitations. For example, none is interactive or open ended, and none allows students to create their own learning experience, as VR is capable of doing. Perhaps most important, VR could become the testbed to settle some of the controversies around communicative competency methods by virtue of the fact that the learning environment is highly controllable and consistent.
To be able to support the TPR and NA methods effectively a virtual environment must have certain characteristics. At the interface level it must be highly interactive, allow multimodal gestural and voice commands and provide natural speech feedback. The application itself must support language acquisition in incremental stages and facilitate the developing of concrete associations between language and meaning.
To meet these requirements we have coupled speech and gesture recognition with a virtual environment and digitized speech output. The setting is a Japanese style tatami room in a Japanese house. The room contains a table, chairs and a number of boxes and orbs, as shown in Figure 1. These contents change as the user adds and deletes objects during the various stages of language acquisition.

Figure 1 The Zengo Sayu virtual environment
Graphics rendering, virtual environment interface devices, and interactions are controlled using Division Ltd's dVS and dVISE software with additional custom functions for speech and gesture recognition. The speech recognition software is a research prototype that uses statistical methods for speaker dependent continuous phrase recognition[26]. It supports a one hundred phrase vocabulary with a recognition accuracy of over 90%. Audio feedback is accomplished using Japanese speech samples, digitized and played on a Silicon Graphics Indy computer. The virtual interface runs on a Silicon Graphics Onyx computer, while the voice recognition software is on a DEC Alpha; all three machines are networked together using UNIX sockets, allowing the user to send spoken commands to the virtual environment and receive audio feedback. Despite being distributed, the interface responds almost instantaneously to vocal and gestural commands. The student wears a fully immersive head mounted display with connected microphone and uses a magnetically tracked six degree of freedom input wand for pointing and picking gestures.
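The text link between the speech recognizer and the virtual environment can be illustrated with a minimal Python sketch. The paper does not describe the wire format, so the newline-delimited UTF-8 messages and the names send_phrase, receive_phrases and demo are assumptions made for illustration only. A local socket pair stands in for the networked machines.

```python
import socket


def send_phrase(sock, phrase):
    """Send one recognized phrase as a newline-terminated UTF-8 line (assumed format)."""
    sock.sendall((phrase + "\n").encode("utf-8"))


def receive_phrases(sock):
    """Yield newline-delimited phrases until the sender closes its side."""
    buffer = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode("utf-8")


def demo():
    """Pass two phrases from a stand-in recognizer to a stand-in virtual world."""
    recognizer_end, world_end = socket.socketpair()
    with recognizer_end, world_end:
        send_phrase(recognizer_end, "aka")
        send_phrase(recognizer_end, "akai hako")
        # Signal end of input so the receiving loop terminates.
        recognizer_end.shutdown(socket.SHUT_WR)
        return list(receive_phrases(world_end))
```

Calling demo() returns the phrases in the order they were sent. In a distributed setup the socket pair would be replaced by a connected TCP socket between two hosts.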
One of the unique aspects of this interface is the use of combined voice and gesture recognition in an educational setting. This is a very powerful way of interacting with the virtual environment because the two modalities complement each other. Cohen [27] has shown how natural language is ideally suited for descriptive tasks, while gestural interaction is ideal for direct manipulation of objects. Hauptman and McAvinny have also shown that users prefer using combined voice and gesture interaction with computer graphics over either modality alone[28].
There are a number of different ways in which this interface could be used to teach language, and it is simple to develop alternative applications. All of the speech and gesture recognition functionality, socket communication and object representations are accessible from a dVISE MAZ file, a text file defining world attributes and object behaviors. By modifying this file, educators can easily customize the application without resorting to low level programming or recompiling code. This permits rapid development of applications which meet specific instructor and student needs.
The application we have designed, Zengo Sayu, is a whole language approach for teaching Japanese prepositions to students with no prior exposure to the language. The target vocabulary includes five colors (red, blue, white, black, yellow), one noun (box), one pronoun (that), five prepositions (on, under, next to, in front of, behind) and two verbs ('is', a verb of existence, and 'put'). Because this is a whole language approach it does not teach discrete language elements such as grammar, pronunciation or syntax, but students are exposed to these elements within the context of their interactions and experiences in the environment. The environment is designed to be experienced totally in Japanese without the need for English translation, ensuring a total immersion language experience for the student.
The Teaching Method
We use a three-step method based on the Natural Approach principles of gradual knowledge acquisition and the development of concrete associations between language and meaning. Students can move through these steps at their own pace and each step relies on knowledge gained during previous stages.
First we teach colors, nouns and prepositions by allowing the student to freely explore and experience the world. When objects are touched the digitized teacher's voice responds with the object color or name. Prepositions are discovered through the use of the virtual Query wand shown in Figure 2. When this wand touches one object and then another, the teacher's voice responds with the relationship between the objects. For example, if a red box is next to a blue box and the student touches them both, they will hear "Akai hako wa aoi hako no tonari ni arimasu" ("The red box is next to the blue box"). This also introduces the student to discrete sentence elements. These methods of exploration and explanation allow the student to learn at their own pace in a stress free environment.

Figure 2 The Query wand
The second stage uses animation sequences and vocal commands to teach verbs. Initially the student watches animation sequences of block structures being built. Before each animation the digitized speech gives the appropriate vocal command. For example the student may hear "Kiiroi hako o kuroi hako no shita ni oite kudasai." ("Put the yellow box under the black box") and then see a sequence of a yellow box moving under a black box. After each sequence, commands are issued directly to the student and object collisions are monitored to see whether the student responds correctly. If the system says "Shiroi hako o kuroi hako no shita ni oite kudasai." ("Put the white box under the black box"), it will wait until a collision is detected between the top face of the white box and the bottom face of the black box before issuing the next command. Corrective and congratulatory feedback is also given through the digitized speech samples. By the end of this step students should be familiar with verbs and how to issue commands using them.
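The face-to-face collision test described above can be sketched in Python as a check on the relative positions of two boxes. This is not the dVS collision mechanism, which the paper does not detail. It assumes axis-aligned boxes of equal size, a y axis pointing up, a z axis pointing toward the student, and a small tolerance; the names Box, relation and placed_correctly are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    color: str
    x: float
    y: float   # assumed: y is the vertical axis
    z: float   # assumed: +z points toward the student
    size: float = 1.0


def relation(a, b, tol=0.05):
    """Return how box a sits relative to box b, or None if they are not adjacent."""
    dx, dy, dz = a.x - b.x, a.y - b.y, a.z - b.z
    reach = (a.size + b.size) / 2  # centre-to-centre distance when faces touch

    def near(value, target):
        return abs(value - target) <= tol

    if near(dx, 0) and near(dz, 0):
        if near(dy, reach):
            return "on"
        if near(dy, -reach):
            return "under"
    if near(dx, 0) and near(dy, 0):
        if near(dz, reach):
            return "in front of"
        if near(dz, -reach):
            return "behind"
    if near(dy, 0) and near(dz, 0) and near(abs(dx), reach):
        return "next to"
    return None


def placed_correctly(moved, wanted, reference):
    """True once the moved box has reached the commanded relation."""
    return relation(moved, reference) == wanted
```

A lesson loop could poll placed_correctly(white, "under", black) after each movement and issue the next command only when it returns True.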
Lastly, the student plays a game in the virtual environment where they try to assemble stacks of blocks that match sample configurations shown to them. Object collisions with the user's virtual hand are disabled so the user is forced to give complete vocal and gestural commands to build the block structures. They can either give complete vocal commands such as "Aoi hako wo akai hako no ue ni oite kudasai" ("Put the blue box on the red box"), or combined voice and gesture commands like "Sore o shiroi hako no mae ni oite kudasai." ("Put that in front of the white box.") while pointing at an object. Again collisions are monitored to check when the target configuration is reached.
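The structure of these commands can be made explicit with a small Python sketch that splits a romanized utterance into the moved object, the preposition and the reference object. The prototype's recognizer matches whole trained phrases, so this decomposition is an illustrative assumption and not the system's method. The word "ushiro" for "behind" does not appear in the paper and is supplied here as the standard Japanese term; parse_command and the "POINTED" marker are hypothetical names.

```python
import re

# Adjective forms of the five colors used with "hako" (box).
COLORS = {"akai": "red", "aoi": "blue", "shiroi": "white",
          "kuroi": "black", "kiiroi": "yellow"}

# Location words; "ushiro" is an assumption, not taken from the paper.
PREPOSITIONS = {"ue": "on", "shita": "under", "tonari": "next to",
                "mae": "in front of", "ushiro": "behind"}

# "<object> o <reference> no <place> ni oite kudasai", accepting "o" or "wo".
PATTERN = re.compile(
    r"^(?P<obj>sore|\w+ hako) w?o (?P<ref>\w+ hako) "
    r"no (?P<prep>\w+) ni oite kudasai\.?$"
)


def parse_command(utterance):
    """Return a dict describing the command, or None if it is not understood."""
    match = PATTERN.match(utterance.strip().lower())
    if match is None:
        return None
    obj = match.group("obj")
    # "sore" (that) must be resolved later from the pointing gesture.
    moved = "POINTED" if obj == "sore" else COLORS.get(obj.split()[0])
    reference = COLORS.get(match.group("ref").split()[0])
    relation = PREPOSITIONS.get(match.group("prep"))
    if moved is None or reference is None or relation is None:
        return None
    return {"move": moved, "relation": relation, "reference": reference}
```

A fully vocal command names both boxes, while a mixed command leaves the moved object as "POINTED" until the gesture supplies it.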
If the student forgets vocabulary they can use feedback from the environment to refresh their memory. By pointing at an object and saying "Are wa nan desuka?" ("What is that?") they will be told the object name and color. Alternatively they can use the Query wand to check prepositions by touching two objects with the desired relationship. These two methods are designed to minimize any stress and anxiety felt while using the application.
To show how each of these steps works in practice we now describe a typical lesson using Zengo Sayu. Actual interaction will vary from student to student, but by the time they have completed the lesson, all students will have covered essentially the same content. Due to its interactive nature there is a lot of flexibility in the sequencing, pace and methods by which individual students may choose to cover that material. It may be completed in one long sitting or spread over weeks of instruction interspersed with more traditional language instruction.
Colors, Nouns and Prepositions
When the student first enters the environment she is given the opportunity to explore. She will find five glowing orbs hovering in one corner of the room which, when touched by her virtual hand, speak the names of their colors in Japanese. The digitized voice of the teacher will say "Aka" ("red"), "Ao" ("blue"), or "Shiro" ("white") in response to touching the corresponding orbs. She may repeat this as often as she wishes.
Nouns are introduced in a similar way. Next to the colored orbs is a translucent box which says "Hako" ("box") when she touches it. The box is intentionally made translucent in order to clarify that what is being spoken is the name of the object itself, not its color. The color orbs and translucent box are examples of how virtual environments can be designed to express what cannot be easily represented in the physical world and to isolate layers of abstraction from objects. Consistent use of these design elements helps the student develop a language of nouns and adjectives which can later be built upon.
Once the student has learned to recognize the five colors and the noun, she can leave the virtual world to train the voice recognition system with the words she has learned. The voice recognition system requires the student to record between five and ten examples of each speech sample. This serves as an opportunity for the student to practice under the supervision of an instructor who checks for accurate pronunciation. Voice training is an iterative process where the student simultaneously builds her own vocabulary and the capacity for vocal interaction with the virtual world.
Returning to the virtual environment, the student is now able to use speech to interact with the environment. Text representations of spoken input are bound to specific events in the virtual world MAZ file. For example, when she says "Aka" correctly, the voice recognition system sends a text string to the virtual world that causes the red orb to bounce, telling the student that she has been understood and did indeed say "red". This visual feedback is a motivation for the student to speak with the same correct pronunciation used during training. The system will also only recognize words that it has been trained for, so incomprehensible or unknown phrases will not be passed to the virtual environment.
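The binding of recognized text to world events can be pictured as a lookup table. The Python sketch below is a hedged illustration of that idea and does not reproduce MAZ file syntax, which the paper does not show; the class VirtualWorld and the event descriptions are hypothetical.

```python
class VirtualWorld:
    """Hypothetical stand-in for the event bindings held in the world file."""

    def __init__(self):
        self.bindings = {}  # recognized text -> callable that fires an event
        self.events = []    # record of the events that have fired

    def bind(self, phrase, event_name):
        """Associate a recognized phrase with a named event."""
        self.bindings[phrase.lower()] = lambda: self.events.append(event_name)

    def handle(self, text):
        """Fire the event bound to the text; ignore anything unbound."""
        action = self.bindings.get(text.strip().lower())
        if action is None:
            return False  # unknown phrase: the world does not react
        action()
        return True


world = VirtualWorld()
world.bind("aka", "red orb bounces")
world.bind("ao", "blue orb bounces")
```

With these bindings, world.handle("Aka") records the bounce of the red orb, while an unbound word leaves the world unchanged.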
The student may then try combining colors with the noun box. By picking a color orb and placing it in the translucent box, she creates an opaque box of that color. For example, placing the red orb in the translucent box creates a red box. When she touches this new box it says, "Akai hako" ("red box"). The process of creating and manipulating boxes continues as long as she wishes, followed by more training of the voice system for these noun color combinations.
Once she feels comfortable with colors, nouns and noun color combinations, she is ready to learn prepositions. She can now create colored boxes by simply naming them instead of using the orbs. When she creates two boxes and moves them next to each other, they snap into place and the digitized teacher says the resulting relationship. For example, in Figure 3 the white box has just been placed beside the blue box so the student hears, "Shiroi hako wa aoi hako no tonari ni arimasu." ("The white box is beside the blue box."). By continually repeating the process with different block orientations the student learns prepositions and complete sentence elements. In this way the student will begin to discern sentence elements based on their meaningful context.
She can also query relationships of objects using the Query wand. So she may construct a complicated structure by creating and placing blocks, and then spend time examining the relationships between them using the query wand.
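The spoken descriptions follow a regular pattern that a short Python sketch can make visible. The prototype plays back digitized speech samples, so building sentences from a template is only an illustration of the sentence structure and not the system's method. The word "ushiro" for "behind" is not in the paper and is an assumption, as are the names ADJECTIVES, LOCATIONS and describe.

```python
# Adjective forms of the five colors in the target vocabulary.
ADJECTIVES = {"red": "akai", "blue": "aoi", "white": "shiroi",
              "black": "kuroi", "yellow": "kiiroi"}

# Location words; "ushiro" is an assumption, not taken from the paper.
LOCATIONS = {"on": "ue", "under": "shita", "next to": "tonari",
             "in front of": "mae", "behind": "ushiro"}


def describe(color_a, relation, color_b):
    """Build '<A> hako wa <B> hako no <place> ni arimasu.' in romanized Japanese."""
    subject = f"{ADJECTIVES[color_a]} hako"
    reference = f"{ADJECTIVES[color_b]} hako"
    sentence = f"{subject} wa {reference} no {LOCATIONS[relation]} ni arimasu."
    return sentence[0].upper() + sentence[1:]
```

For a white box beside a blue box, describe("white", "next to", "blue") yields the sentence quoted in the text for that arrangement.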
Verbs and Commands
After colors, prepositions and the noun, the student is ready to learn verbs and complete commands. This relies on the use of animation sequences and digitized spoken commands. First she watches while the interface creates red, blue and yellow boxes by saying their names. The boxes are then automatically stacked on top of or next to each other and the equivalent vocal commands are heard. For example, "Aoi hako wo akai hako no ue ni oite kudasai." ("Put the blue box on the red box") is spoken just before the blue box moves onto the red box.

Figure 3 The white box beside the blue box.
After seeing the animation it is her turn. The stack of three colored boxes remains in the room, while three other boxes of the same colors appear scattered on the floor. The command, "Aoi hako wo akai hako no ue ni oite kudasai." is issued and the system waits for her response. Recognizing the words for blue and red, she may pick up the blue box and place it on the red box. If so, the teacher says "Yokatta." approvingly to reinforce that she has done it correctly. At this stage the student cannot use vocal commands herself because she has not yet trained the voice recognition software with the command phrases. Once the student has mastered the animation sequences, commands are issued without a preceding animation, forcing her to rely only on her verbal understanding. Following verb and command mastery the student will train the voice recognition system with the complete command phrases she has been listening to.
Multimodal Interaction
Finally, the student is able to use complete voice and gestural commands to interact with the environment. When she re-enters the environment she is presented with a stack of boxes that she is to build using only voice and gestural commands. She can say "Kiiroi hako o kuroi hako no shita ni oite kudasai." ("Put the yellow box under the black box") and the yellow box promptly flies under the black box, or point to the red box and move it in front of the white box with the command, "Sore o shiroi hako no mae ni oite kudasai." ("Put that in front of the white box."). In this case the interface uses the position and orientation of her hand to find the object she is pointing at and resolve the mixed modal command. This continues until she has constructed a duplicate of the target stack and then over a range of target stacks of varying complexity to reinforce her language ability. If at any time she forgets vocabulary she can use the querying methods described above to refresh her memory.
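Resolving a pointing gesture from the hand's position and orientation can be sketched as choosing the object closest in angle to the pointing direction. The paper does not state how the prototype performs this step, so the cone test, the 15 degree threshold and the name pointed_object below are assumptions for illustration.

```python
import math


def pointed_object(hand_pos, hand_dir, objects, max_angle_deg=15.0):
    """Return the name of the object nearest the pointing ray, or None.

    hand_pos and hand_dir are (x, y, z) tuples; objects maps names to
    (x, y, z) centre positions. The angular threshold is an assumed value.
    """
    norm = math.sqrt(sum(c * c for c in hand_dir))
    if norm == 0:
        return None
    direction = tuple(c / norm for c in hand_dir)

    best_name, best_angle = None, max_angle_deg
    for name, pos in objects.items():
        to_obj = tuple(p - h for p, h in zip(pos, hand_pos))
        dist = math.sqrt(sum(c * c for c in to_obj))
        if dist == 0:
            continue  # object coincides with the hand; direction undefined
        cos_angle = sum(a * b for a, b in zip(direction, to_obj)) / dist
        # Clamp to guard against rounding slightly outside [-1, 1].
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name
```

The returned name can then stand in for "sore" in a mixed voice and gesture command.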
Although the Zengo Sayu prototype is complete, subject trials have yet to be conducted to test its educational worth. Determining the educational efficacy of VR requires a two phase investigation. The first phase compares VR with real-world and computer-based instruction to measure its effectiveness. It is our hypothesis that VR will yield significantly better achievement on listening comprehension and oral production tests than standard computer based instruction, but will not differ significantly from real-world instruction. This is due to the increased interactivity that VR offers over standard computer instruction, the support for multimodal input, and the immersive quality of the experience. If these results hold true they will show that VR experiences are as useful as real world education and in some cases could be used instead of real world experience. A secondary hypothesis is that students who receive VR instruction will show significantly more positive levels of attitude than those who receive either computer based or real world treatments, due to the immersive nature and novelty of VR.
If VR does prove to be as effective as real world education, a second phase will be necessary to examine the use of VR within the context of a complete language curriculum. Of particular importance is identifying the difficulties that must be overcome to develop stand-alone systems for language instruction. These include the challenge of accurate voice recognition, the difficulty of integrating diverse input and output devices, the technical skill required to program virtual environments and the high cost of VR systems. Virtual Reality is particularly attractive for networked distance education, but these difficulties are compounded by distance.
In this paper we have presented an immersive virtual environment for Japanese language instruction. This application was designed to show that Virtual Reality could be an ideal medium for TPR and NA based language instruction, based on the similarities between virtual environment attributes and communicative competency requirements.
Zengo Sayu fulfills most of the needs for TPR and NA based learning. Its strongest features are the physically active role the students take in language learning, the amount of motivation it provides, the stress free environment and the total language immersion. Language learning through physical demonstration occurs at all stages in the application. Although at the beginning the student merely listens or responds to commands, by the later stages they are issuing commands to the environment and watching it respond. This empowerment is a strong motivating factor. As the student continues to learn they are able to manipulate the world in new and interesting ways, keeping them motivated, until at last they can use an intuitive combination of voice and gesture commands. The language learning is gradual to ensure the student always feels at ease, and there are also non threatening ways for them to reclaim lost knowledge. Finally, there is total language immersion with all the vocal interactions and feedback in Japanese.
This initial application is simple, but sufficient to show the benefits of using VR for TPR and NA learning. These include the ability to provide language abstraction, to give objects behaviors impossible in the real world, and to enable the student to experience TPR and NA at their own pace and comfort level. The use of VR becomes particularly compelling when trying to use TPR for teaching more abstract verbs and adjectives that are not physically demonstrable. For example, students cannot actually fly in the real world, so it may be difficult for them to learn words associated with flying through TPR, such as "swoop", "roll", "dive", "stall", or "glide". However, in a virtual environment they can be given a sensation of flight and experience these motions firsthand while being taught the language equivalents.
In the future we plan to test the Zengo Sayu interface in an educational setting to gain further empirical understanding of VR's strengths and weaknesses for teaching complex domains of knowledge. We also plan to increase the complexity of the environment and so improve its usefulness as an educational tool. An essential part of this will be using feedback from subject studies to isolate the advantages that VR offers over real-world language education and incorporate them into our interface. Finally, we hope to use this work as the basis for establishing VR as an effective medium for applying constructivist theory in the real world.
Thanks to Paul Schwartz for the significant amounts of code contributed to this project and to Jesus Savage for assistance with the speech recognition.
[1]Aida, Y. (1994). Examination of Horwitz, Horwitz and Cope's construct of foreign language anxiety: The case of students of Japanese. Modern Language Journal, 78 (2), 155-167.
[2]Liskin-Gasparro, 1982; cited in Samimy, K.K., & Tabuse, M. (1992). Affective variables and a less commonly taught language: A study in beginning Japanese classes. Language Learning, 42 (3), 377-98, p. 390.
[3]Mills, Samuels & Sherwood, 1987; cited in Samimy, K.K., & Tabuse, M. (1992). Affective variables and a less commonly taught language: A study in beginning Japanese classes. Language Learning, 42 (3), 377-98, p. 390.
[4]Asher, J., Kusudo, J., & de la Torre, R. (1974). Learning in a second language through commands: The second field test. The Modern Language Journal, 58, 24-32.
[5]Asher, J.J. (1966). The learning strategy of the total physical response: A review. Modern Language Journal, 50 (2), 79-84.
[6]Kunihira, S., & Asher, J.J. (1965). The strategy of the total physical response: An application to learning Japanese. IRAL, 3 (4), 277-89.
[7]Wolfe, D.E., & Jones, G. (1982). Integrating total physical response strategy in a level 1 Spanish class. Foreign Language Annals, 14 (4), 273-80.
[8]Hammond, R. M. (1988). Accuracy versus communicative competency: The acquisition of grammar in the second language classroom. Hispania, 71, 408-417.
[9]Gary, J. (1975). Delayed oral practice in initial stages of second language learning. In M. Burt & H. Dulay (Eds.), Papers from the Annual TESOL Convention (9th, Los Angeles, CA, March 4-9, 1975). On TESOL '75: New Directions in Second Language Learning, Teaching and Bilingual Education (pp. 89-95).
[10]Omaggio, 1986; cited in Baltra, A. (1992). On breaking with tradition: The significance of Terrell's Natural Approach. Canadian Modern Language Review, 48 (3), 564-591.
[11]Baltra, A. (1992). On breaking with tradition: The significance of Terrell's Natural Approach. Canadian Modern Language Review, 48 (3), 564-591.
[12]Terrell, T.D. (1986). Acquisition in the Natural Approach: The binding/access framework. Modern Language Journal, 70 (3), 213-227.
[13]Mangubhai, F. (1991). The processing behaviors of adult second language learners and their relationship to second language proficiency. Applied Linguistics, 12 (3), 268-297.
[14]Atherton, M. (1993). The Visions of Hildegard and the "Silent Period": A case study of language acquisition through aural and visual memory. System, 21(4), 503-508.
[15]Winitz, H., & Reeds, J. (1973). Rapid acquisition of a foreign language by the avoidance of speaking. International Review of Applied Linguistics, 11, 295-317.
[16]Lai, C. (1994). Communication failure in the language classroom: An exploration of causes. RELC Journal, 25 (1), 98-126.
[17]Horwitz, E. K., Horwitz, M. B. & Cope, J. (1986). Foreign language classroom anxiety. The Modern Language Journal, 70, 125-32.
[18]Ganschow, L., Sparks, R., Anderson, M., Javorshky, O., Skinner, D. and Patton, H. (1994). Differences in language performance among high-, average-, and low-anxious college foreign language learners. Modern Language Journal, 78 (1), 41-55.
[19]Bricken, M., and Byrne, C. (1993). Students in Virtual Reality: A Pilot Study. In Alan Wexelblat (Ed.), Virtual reality: Applications and explorations (pp. 246-273). San Diego, CA: Academic Press.
[20]Loftin, R.B., Engelberg, M. and Benedetti. (1993). Applying virtual reality in education: A prototypical virtual physics laboratory. In Proceedings of the IEEE Symposium on Research Frontiers in Virtual Reality. (pp. 67-74) Los Alamitos, CA: IEEE Society Press.
[21]Regian, J. and Shebilske, W. (1992). Virtual reality: An instructional medium for visual-spatial tasks. Journal of Communication, 42 (4), Autumn, 136-149.
[22]Winograd, T., Flores, F. Understanding Computers and Cognition: A New Foundation for Design. Norwood, N.J. Ablex Pub. Corp. 1986.
[23]Duffy, T.M., Jonassen, D.H. Constructivism and the Technology of Instruction: A Conversation. Hillsdale, N.J: Lawrence Erlbaum Associates Publishers, 1992.
[24]Winn, W. D. (1993). A Conceptual Basis for Educational Applications of Virtual Reality. (Human Interface Technology Laboratory Technical Report #R-93-9). Seattle, WA: Human Interface Technology Laboratory.
[25]Winn, W. D. and Bricken, W. (1992). Designing virtual worlds for use in education: The example of experimental algebra. Educational Technology, December, 12-19.
[26]Savage, J., Holden, A., Billinghurst, M. A Hybrid System with Symbolic AI and Statistical Methods for Speech Recognition. HITL Technical Report R-94-2, HIT Lab, University of Washington, 1994.
[27]Hauptmann, A.G. and McAvinney, P. Gestures with Speech for Graphics Manipulation. Intl. J. Man-Machine Studies, vol. 38, pp. 231-249, 1993.
[28]Cohen, P.R. The Role of Natural Language in a Multimodal Interface. 1992 UIST Proceedings, pp. 143-149.