Reaction and Learning Evaluation of a Non-immersive Virtual Reality Application for Children with Autism Spectrum Disorder [version 1; peer review: awaiting peer review]

Background: Autism Spectrum Disorder (ASD) is a complex developmental condition that involves persistent challenges in social interaction, speech, and nonverbal communication, in addition to repetitive or restrictive behaviours. For decades, information and communication technologies (ICT) have been used in the training and diagnosis of children with ASD. One ICT area, non-immersive virtual reality (NIVR), has become a notable tool for helping children with ASD in their social training. It provides extensive virtual interaction and a safe environment, and it is affordable. An NIVR application was developed to assist interventions for children with ASD. However, the overall training experience needs to be validated before its effectiveness can be concluded. Methods: A case study was employed as the research method. The evaluation of the NIVR application used multiple sources of evidence and was guided by the Kirkpatrick Model of Evaluation (KME), which was executed via a questionnaire, a pre-test, and a post-test. The main objectives of this research were to evaluate Levels 1 and 2 of KME. The target for Level 1 is to assess the reactions to the NIVR application. Level 2 is to gauge the knowledge,


Introduction
ASD is a complex developmental condition that involves persistent challenges in social interaction, speech, and nonverbal communication, and is typified by repetitive or restrictive behaviours.3 Autism is classified as a spectrum because of the diversity of symptoms and the variable levels of severity experienced by autistic people, regardless of their ethnicity, social lifestyle, intelligence level, and economic status. There is no cure for this disorder, nor can it be outgrown. Therefore, individuals with ASD, especially children, require ample support to develop their social intelligence. One strategy is to provide an early diagnosis and preventive treatment, which aims to improve symptoms. The National Research Council8 reported that early diagnosis and intervention for autism have significant long-term positive effects on symptoms and on the later skills of children. The common signs of autism in children are usually noticeable between the ages of 2 and 3 years. Therefore, early interventions are necessary to provide better treatment and improved quality of life for children with ASD. This gives them the best start available and the best chance of developing to their full potential.
For decades, children with ASD have been familiarising themselves with ICT in their training and diagnosis. One of the information and communication technologies that has been used is Virtual Reality (VR). VR can be used as an intervention strategy and can encourage user interaction in simulated situations, which in turn helps to evaluate real-life responses to environmental stress.5 VR is a three-dimensional (3D), computer-generated simulated environment that allows users to explore and interact. There are two common types of VR: immersive and non-immersive. Immersive VR (IVR) is accomplished with a head-mounted display (HMD) and input readers, whereas non-immersive VR (NIVR) is accomplished through a desktop display and multiple devices (mouse, keyboard, or microphone), which help users to interact with its contents. NIVR offers a lower level of immersion, which makes it more comfortable for users who are less tolerant of IVR. Considering these factors, we believe NIVR is a suitable medium to support children with ASD in their social training. Because it provides extensive virtual interaction, a safe environment, and affordability, their training can become a more entertaining experience.
The aim of this paper is to evaluate the effectiveness of an NIVR application using the Kirkpatrick Model of Evaluation (KME). The NIVR application has been developed to assist in interventions for children with ASD.7 The application has been developed based on literature reviews and feedback from experts. It consists of three main levels, which depict real-life scenarios, and two additional mini levels, which focus on building emotion and concentration skills. The training experience needs to be validated before its effectiveness for children with ASD can be concluded.

Methods
We developed the NIVR application using Unity3D version 2019. A case study was employed as the research method because it allows researchers to gather previously unknown data and does not require control of the setting. The current research evaluates the NIVR application using multiple sources of evidence, guided by KME. KME has been widely accepted in business and industry, where it has been extensively used to evaluate training programs, and it has served as the basis for the evaluation of various educational programs.2,4 According to Kirkpatrick,6 there are four levels of evaluation: Level 1 (reaction), Level 2 (learning), Level 3 (behaviour) and Level 4 (results). The advantages of using this model include its simplicity, practicality, flexibility, and completeness.11 For this research, the first two levels were used to evaluate the perceptions and learning of children with ASD; the Level 2 evaluation was carried out before and after the training session. The three participants were children with ASD aged 12 to 15 years.
Level 1 evaluates the reaction of participants. This evaluation helps us to understand how well the overall training is received by the participants, including how engaged they were, how actively they contributed, and how they reacted to the training. Their reactions can be measured using smile sheets, online assessments, interviews, or questionnaires, and the assessment is made directly after the training ends. In this research, data were collected from a questionnaire delivered via Google Forms, which is part of the Google Docs infrastructure (Google Docs, RRID:SCR_005886). The questionnaire consists of three questions, which recorded responses and feedback on the training application.
Level 2 evaluates the participants' comprehension of the training. Evaluation at this level aims to explore participants' knowledge development before and after a training session. The suggested evaluation methods are interviews, observation, or printed or electronic examinations. In this research, a pre-test and a post-test were given to participants via Google Forms. Each test contains the same 12 questions, which were designed to determine participants' knowledge development.
The session begins with the participant opening the pre-test in a web browser and taking the test. Next, the participant runs the NIVR application and completes the game. Afterwards, the participant closes the game and proceeds with the Level 1 evaluation, which is again done in the web browser. The post-test is taken directly after the participant has finished the Level 1 evaluation, which ends the research session. The tests and questionnaire are opened by a researcher at the beginning of the session. During the session, the webpage for each evaluation is kept loaded in the web browser for easy access.

Results
In this section we present the results for the two levels of evaluation. The results indicate that the NIVR application made a good first impression on participants. Data were drawn from three evaluations, namely the pre-test, the questionnaire, and the post-test.
We deployed a questionnaire for the first level of evaluation directly after each participant had finished the NIVR session. There were three questions, which asked participants whether they liked the application, whether they had fun during the session, and whether they would like to have another session. All participants agreed with all three questions. These results suggest that participants were highly engaged with the NIVR application and that the training was well received. Table 1 shows the answers given by the participants in the Level 1 evaluation.
The second level of evaluation takes a different approach, being deployed before and after the session. The pre-test is given to participants before their session begins and the post-test is given after the Level 1 questionnaire has been answered. Both tests consist of the same 12 questions, in which all answers are accompanied by a visual aid and the questions are tailored to the content of the application. Additionally, each question has only two answer choices. These characteristics are intended to help participants decide on their answers more effectively. The first three questions are about identifying key assets used in the NIVR application, and they also show participants indirectly what the assets look like. All assets listed in the test are available in the NIVR application. The other nine questions are about general knowledge, with examples of real-life scenarios that are related to the NIVR application's activities. In the pre-test, the participants gave the expected answer to 50% of the 12 questions. All participants answered the first three questions correctly, giving a 100% result for the identification of application assets. This shows that participants are aware of the objects' names and shapes. The next three questions focused on common actions and knowledge about real-life scenarios. For instance, participants were asked to define what a supermarket is for; 66.7% defined a supermarket as a place to eat instead of a place to buy. When asked where a person should go after taking an item, only 33.3% answered that the person should go to the counter, while the rest answered that the person should exit the place. The last five questions focused more on their understanding of emotion in certain situations. Here, two participants gave the expected answers. Table 2 shows the questions and results for each participant.
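The percentages reported above follow directly from the sample of three participants. As a minimal sketch (the helper name `percent` is our own and is not part of the study materials), the figures can be reproduced as follows:

```python
# Number of participants reported in the Methods section.
N_PARTICIPANTS = 3


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Two of three participants defined a supermarket as a place to eat.
print(percent(2, N_PARTICIPANTS))  # 66.7
# One of three participants answered "go to the counter".
print(percent(1, N_PARTICIPANTS))  # 33.3
# All three participants identified the application assets correctly.
print(percent(3, N_PARTICIPANTS))  # 100.0
```

With only three participants, each participant accounts for roughly 33.3 percentage points, so the percentages should be read as counts out of three rather than as population estimates.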
The second part of the Level 2 evaluation is the post-test. The post-test is deployed after the participants have finished their session and directly after they have answered the Level 1 questionnaire. The post-test has the same 12 questions, and its purpose is to evaluate the improvement made by participants in terms of their knowledge level. The results were good, as all participants gave the expected answers to all the questions. Compared with the pre-test results, these improvements provide evidence that the participants gained knowledge and are motivated to make changes after a session with the NIVR application.
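The pre-test and post-test comparison described above can be sketched as a simple scoring routine. The answer key and the participant's answers below are hypothetical placeholders, not the study data (the actual responses are reported in Table 2), and the names `ANSWER_KEY`, `score`, and `gain` are our own:

```python
# Hypothetical answer key: 12 questions, each with two choices ("A" or "B").
ANSWER_KEY = ["A", "B", "A", "A", "B", "B", "A", "B", "A", "A", "B", "A"]


def score(answers, key=ANSWER_KEY):
    """Count how many answers match the expected answers."""
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    return sum(given == expected for given, expected in zip(answers, key))


def gain(pre_answers, post_answers, key=ANSWER_KEY):
    """Return the change in score from pre-test to post-test."""
    return score(post_answers, key) - score(pre_answers, key)


# Hypothetical participant: first six answers expected, last six not.
pre = ANSWER_KEY[:6] + ["B" if k == "A" else "A" for k in ANSWER_KEY[6:]]
# Hypothetical post-test in which every answer is the expected one.
post = list(ANSWER_KEY)

print(score(pre), score(post), gain(pre, post))  # 6 12 6
```

Because the two tests use identical questions, the same key scores both, and the gain is simply the difference between the two scores.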

Discussion
NIVR was selected because there are limitations to IVR implementation. Parsons10 reported that many researchers focused on the level of immersion to produce better research outcomes. IVR is often implemented without consideration of how comfortable children with ASD might feel with it.9 Side effects such as motion sickness and cybersickness should not be overlooked, as they can substantially affect the overall research. There is a large gap in the research on NIVR for children with ASD because more attention has been given to IVR, and NIVR is still rarely used to help children with ASD with their social training. Thus, a valid evaluation is needed to establish the robustness of the NIVR application, which should in time narrow the research gap.
The first two levels of KME demonstrate that, on average, the children with ASD had a good experience and were able to improve their social skills with the NIVR application. Based on the Level 1 evaluation, we found that the participants had a favourable reaction towards the NIVR application. This indicates that they are invested in learning more at the next level. Based on the Level 2 evaluation, the improvements made by the participants in the post-test suggest that they learned from the training and improved their social skills. These findings provide a good indication that NIVR training has potential for helping children with ASD.
The key components of the NIVR application are an educational game, analytics, and a specific VR type. We believe that this combination provides better data assessment, facilitates a comfortable environment, and can be an effective intervention for children with ASD.

Conclusion
The main conclusion that can be drawn is that the NIVR application received positive feedback based on Levels 1 and 2 of KME. Given the positive trend on the first two levels, future investigations could usefully proceed to Level 3, which evaluates how participants apply their training in their behaviour after a period of time. However, the results could be strengthened with a higher number of participants. For this research, only a limited number of participants could be recruited because of the current pandemic situation. Future studies should aim to replicate the results with a larger number of participants. This should, we hypothesise, demonstrate a similarly favourable evaluation result. We also conclude that the NIVR application has good potential to be used in ASD training.

Data availability
All data underlying the results are available as part of the article. However, data for Level 1 and Level 2 are also available in the following repository:
This project contains the following underlying data:
• Level 1 (Reaction) - Questionnaire
• Level 2 (Learning) - Pre-test
• Level 2 (Learning) - Post-test
Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).

Consent
Participants and their parents or guardians were first provided with an informed consent form detailing how data privacy would be handled, and all of them agreed to the research activities and policies.