Abstract
Background: Surgical simulators provide a safe environment to learn and practise psychomotor skills. A goal for these simulators is to achieve high levels of fidelity. The purpose of this study was to develop a reliable surgical simulator fidelity questionnaire and to assess whether a newly developed virtual haptic simulator for fixation of an ulna has levels of fidelity comparable to those of Sawbones.
Methods: Simulator fidelity questionnaires were developed. We performed a stratified randomized study with surgical trainees. They performed fixation of the ulna using a virtual simulator and Sawbones. They completed the fidelity questionnaires after each procedure.
Results: Twenty-two trainees participated in the study. The reliability of the fidelity questionnaire (Cronbach α) was greater than 0.70 for each separate domain (environment, equipment, psychological), except for the virtual simulator environment domain. The Sawbones had significantly higher levels of fidelity than the virtual simulator (p < 0.001), with a large effect size difference (Cohen d > 1.3).
Conclusion: The newly developed fidelity questionnaire is a reliable tool that can potentially be used to determine the fidelity of other surgical simulators. Increasing the fidelity of this virtual simulator is required before its use as a training tool for surgical fixation. The virtual simulator brings with it the added benefits of repeated, independent safe use with immediate, objective feedback and the potential to alter the complexity of the skill.
Surgical simulators provide a safe environment where a surgical trainee can learn and practise psychomotor skills. A goal for these simulators is to achieve high levels of fidelity so that they can be used in surgical curricula. The fidelity of a simulator is important, as it helps determine the extent to which a trainee is able to learn from the simulated experience and transfer the learning to the real environment. Fidelity can be defined as the extent to which the appearance and the behaviour of the simulation match that of the real environment.1 Rehmann and colleagues2 proposed a useful typology of fidelity based on an instructional trainer’s perspective, including environment, equipment and psychological domains. Physical fidelity can be divided into the environmental domain (i.e., the degree to which the simulator replicates sensory, motion and visual information from the true environment) and the equipment domain (i.e., the extent to which the simulator replicates the look and feel of the real system).2 Haptics is included in the equipment and environment domains, and is used to provide the sense of resistance that would normally be felt in the real situation as objects come into contact with each other. It has been suggested that haptics will increase the fidelity of a simulator.3,4 Psychological fidelity is the degree to which simulation imitates psychological factors, such as stress and fear, which can be experienced in the real environment.3 Psychological fidelity is considered to be of greater importance. Higher levels of psychological fidelity in a simulator may be associated with higher degrees of skill or knowledge transfer.3,5
After the development of a virtual simulator, it is important to assess how well its users believe it recreates the tools, environment and feel of the real procedure. A self-report questionnaire is one of the most valuable tools to determine how well a user feels the simulator has been made and helps determine how valuable the simulator may be in the future. Most studies examining virtual simulators include short questionnaires for the users to complete after they have used the simulator; the questionnaires use either Likert scales or yes/no responses to determine the face validity and effectiveness of a simulator.6–14 Longer self-report questionnaires have the ability to be more specific and can help determine user-friendliness, training capacity for the simulator, first impressions of the design and users’ experience with the simulator.15,16 These questionnaires can also inform the further development of the simulator.17 Possibly of most importance, these questionnaires may also help with assessing the fidelity of the simulators. The components that favour learning in simulation are not completely known, but unquestionably involve interactions of environment, equipment and psychological fidelity.5 Although we found no specific surgical fidelity questionnaires in the recent literature, we did find presence questionnaires. Presence is similar to psychological fidelity, and for the purpose of surgical simulators, presence may be defined as moments during scenarios where the trainees actually feel as though they are in the operating room (OR).3,6 Presence questionnaires have been used to help determine a user’s presence in virtual environments, but not yet in the surgical field.
Research has demonstrated that surgical simulators are useful tools for learning surgical skills.18–20 The increasing cost of training a resident in terms of operating time, the development of minimally invasive procedures that require different technical skills, the shortening of a resident’s work week and the increasing pressure for safe practice mean that methods must be developed to train surgical residents outside the OR. Many authors comment that virtual reality may decrease overall training costs in the long term if the technology is used in conjunction with current teaching methods. There is not enough evidence in the literature to determine whether a virtual reality simulator is less expensive than the lower-fidelity simulators currently in use, or to determine what degree of fidelity is necessary for transfer of skills. A cost analysis can therefore help determine a simulator’s feasibility.
There has been no development of virtual simulations with haptics that allow residents to practise the surgical fixation of common orthopedic fractures. The purpose of this study was, first, to develop a reliable surgical simulator fidelity questionnaire and, second, to assess whether a newly developed virtual reality simulator for surgical fixation of an ulna with haptics (force feedback) has comparable levels of fidelity for surgical trainees as performing the same subtasks on Sawbones. A secondary purpose was to assess the feasibility of this new simulator by performing a cost analysis.
Methods
Participants
We recruited residents from a single North American orthopedic surgery program to participate in this study. We obtained informed consent from all participants before beginning the procedures. Ethical approval was obtained from the Conjoint Health Research Ethics Board at the institution before commencing this study.
Materials
A virtual simulator for fracture fixation of the ulna was developed in collaboration with the department of electrical and computer engineering at our institution. This simulator consists of an ulna and all the tools required for surgical fixation of the ulna. Tabs displayed around the top and sides of the display screen provided the optional tools to use for this surgical fixation. A haptics device (PHANTOM 1.5/6DOF; SensAble Technologies Inc.) provided realistic force feedback during the procedure, allowing the user to move the tools around the screen and feel resistance when the bone was touched.
We compared the new virtual simulator with the current standard for simulation, the Sawbones model. Sawbones have previously demonstrated similar external bending properties and pullout strength for screws as cadavers21,22 and have been used to test surgical fixation of fractures.23 We used a Sawbones ulna as a comparative method of internal fixation (model 1017, Sawbones; Pacific Research Laboratories). The Sawbones and required equipment for surgical fixation of the ulna were supplied by Synthes Ltd.
Fidelity questionnaire
It has been proposed that the 3 key components of simulation that establish fidelity are environment, equipment and psychological domains.2 A literature review of medically related questionnaires for virtual simulators provided many types of questions.6–16,24–26 The questions most closely associated with this type of procedure and these domains of fidelity were modified, and we created new questions relating to our specific objectives. The questions were designed to assess the environmental, equipment and psychological fidelity domains, as defined by Rehmann and colleagues,2 of both the Sawbones and virtual simulator in relation to ulna fixation. Face validity was assessed during the pilot study and was based on reviews by a medical educator. This resulted in 2 simulation-specific questionnaires. Each questionnaire contained the same questions, but the questions were specific to the simulation used and compared it to a real-life procedure. The participants responded to the questionnaires after completing surgical fixation of the ulna with that specific simulator.
Procedure
Participants were stratified by postgraduate year (PGY) and sex, and then randomly assigned, using a computer-generated randomized number, to begin with either the Sawbones model (group 1) or the virtual simulator (group 2) procedure. Participants were allowed 10 minutes to learn the basic control features and tools for both the virtual simulator and Sawbones, and they were then asked to perform internal fixation of an ulna using a neutralization plate. Group 1 performed the fracture fixation of the ulna with the Sawbones simulator first, using the same tools normally found in the OR and at procedural skills training courses. The same participants were then asked to perform the procedure using the virtual simulator with a haptic device (providing force feedback). This consisted of a computer screen and a hand-held device that provides the user with simulated haptics that would be expected from completing surgical fixation of an ulna on a real patient. Group 2 performed the procedure using the virtual simulator with haptics device first, followed by the Sawbones. All participants completed a postprocedure simulation-specific fidelity questionnaire.
Statistical analyses
We assessed the reliability coefficient (Cronbach α) for each fidelity domain of the questionnaires to ensure that the same construct was measured for each domain. We measured each domain separately and then as a total fidelity score (all 3 domains). Effect size differences were evaluated using Cohen d, with d = 0.2–0.49 being a small effect, d = 0.5–0.79 being a medium effect, and d ≥ 0.80 being a large effect size difference.27
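As an illustration of the two statistics named above, the following minimal sketch computes Cronbach α from item-level ratings and labels a Cohen d value using the thresholds quoted in this section. The function names, the data layout (one list of item ratings per respondent) and the "below small" label for d < 0.2 are assumptions made for this example; they are not taken from the study's analysis files.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for one domain.

    responses: list of respondents, each a list of k item ratings
    (hypothetical layout chosen for this sketch).
    """
    k = len(responses[0])
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of each respondent's summed domain score
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def effect_size_label(d):
    """Label |d| using the cut-offs quoted in the text (0.2, 0.5, 0.8)."""
    d = abs(d)
    if d >= 0.80:
        return "large"
    if d >= 0.50:
        return "medium"
    if d >= 0.20:
        return "small"
    return "below small"  # assumption: the text gives no label under 0.2


# Made-up ratings in which every respondent answers all items identically,
# so the items are perfectly consistent and alpha equals 1.
perfectly_consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(perfectly_consistent)
```

With real questionnaire data, α would be computed once per domain and once for all items combined, mirroring the per-domain and overall values reported in the Results.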
The fidelities of each of the 3 domains (environment, equipment, psychological) were compared between groups using independent samples t tests, with simulator (Sawbones [group 1], virtual simulator [group 2]) and postgraduate year (PGY; junior [PGY-1 and -2], senior [PGY-3, -4 and -5]) as between-subjects factors. We compared the fidelities of the 2 simulations within groups using paired sample t tests. Criterion-related validity was investigated with correlation coefficients on the fidelity questionnaire domains within and between simulators.
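The within-participant comparison described above can be sketched as a paired samples t statistic computed on the per-participant differences between the two simulators. The scores below are invented purely for illustration, and the function name is hypothetical. The sketch stops at the t statistic and degrees of freedom; obtaining a p value would additionally need a t-distribution routine from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical domain scores for 4 participants (not study data)
sawbones_scores = [30, 28, 35, 32]
virtual_scores = [25, 26, 30, 27]
t_stat, df = paired_t(sawbones_scores, virtual_scores)
```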
We performed a cost analysis of both simulators to assess the feasibility of developing a virtual surgical fixation of an ulna. To do this, we obtained prices of all the required equipment and development costs. Start-up costs and average annual costs were calculated.
Results
Of the 26 residents available from a single North American orthopedic surgery program, only 4 residents did not participate in this study. Two had participated in an initial pilot project to review the materials and questionnaires used, 1 was not in the country at the time of the study, and 1 is an author of this study (J.L.). The 22 participants were randomly assigned to the Sawbones or virtual simulator groups, with 11 participants in each group.
The reliability coefficients for the 3 fidelity domains for the Sawbones (group 1) were environment α = 0.74, equipment α = 0.76, psychological α = 0.78, and overall α = 0.89. The reliability coefficients for the 3 fidelity domains on the self-report questionnaire for the virtual simulator (group 2) were environment α = 0.56, equipment α = 0.78, psychological α = 0.83, and overall α = 0.88. An independent samples t test that compared the scores for each domain of fidelity for the first simulation used showed no significant difference in the scores between groups.
To compare the scores of fidelity domains among the junior and senior resident groups, independent samples t tests were conducted. The junior residents reported significantly higher scores overall with the virtual simulator than the senior residents (mean 73.3, standard deviation [SD] 6.46 v. mean 63.3, SD 11.03; t1,24 = 2.71, p = 0.014), with a large effect size difference (Cohen d = 1.11). No other fidelity domains showed any significant differences between resident level group scores.
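The reported effect size can be checked from the summary statistics above. The sketch below assumes Cohen d was calculated with an equally weighted pooled standard deviation (the root mean square of the 2 SDs); the article does not state which pooling formula was used or the exact group sizes, so this is an assumption rather than a description of the authors' calculation.

```python
from math import sqrt


def cohen_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen d from summary statistics, assuming equal group weighting."""
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Junior v. senior overall virtual simulator scores, as reported in the text
d = cohen_d(73.3, 6.46, 63.3, 11.03)
print(round(d, 2))  # 1.11
```

Under this assumption the result agrees with the reported Cohen d of 1.11.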
The residents’ ratings of the Sawbones versus the virtual simulators across the 3 fidelity domains (environment, equipment, psychological) were compared between the 2 simulations using paired samples t tests. In all 3 domains, the mean scores of the Sawbones model were significantly higher than those of the virtual simulator (Table 1).
Paired samples t tests were conducted within each simulator to assess whether the scores for each individual domain differed significantly from those of the other domains. The virtual simulator demonstrated significantly different mean scores among all fidelity domains (all p < 0.001; means and SDs as reported in Table 1). For the Sawbones simulator, the mean scores for the environment and equipment scales did not differ significantly from each other, but both differed significantly from the psychological domain scale (p < 0.001; means and SDs as reported in Table 1).
Table 2 provides correlation coefficients between each fidelity domain subscale (environment, equipment, psychological) on the questionnaire, as well as the overall fidelity scores. All bivariate correlations except that between the virtual simulator environment and the Sawbones equipment subscale scores were significant. Each domain correlated significantly with all other domains of the same simulator and with its specific fidelity domain between the 2 simulators. The overall total fidelity score of the virtual simulator correlated with the Sawbones simulator (r = 0.71).
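For readers unfamiliar with the statistic behind these correlations, a minimal sketch of the Pearson coefficient between 2 sets of scores follows. The values are made up for illustration and do not come from Table 2; the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical subscale scores for 3 participants (not study data)
r = pearson_r([1, 2, 3], [1, 3, 2])
```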
The costs of creating a virtual simulator and of using Sawbones were compared for a residency program of 25 trainees. The start-up cost for the ulna virtual haptics surgical simulator is $85 000, which includes 1 computer hard drive and display, 1 haptics device, software and personnel time. The initial cost of the ulna Sawbones model for fracture fixation is $4225, which includes the Sawbones (1 for each resident), plates, cortex screws, drill bits and guides, tap, T-handle, drill, base/vice, depth gauge, reduction forceps and screwdriver.
We estimated annual costs for each simulator. The developers of the virtual simulator provided approximate costs over 5 years, and the annual costs were derived from these. The virtual simulator would cost about $4000 per year for personnel and upgrades. Similarly, we approximated the annual costs for the Sawbones model, taking into account the wear and tear of equipment and the replacement of ulna Sawbones for each procedure. The annual costs were about $4000.
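The arithmetic behind this comparison can be sketched as follows. The sketch assumes that the start-up cost is paid once, that the approximate annual cost of $4000 stays constant, and that no discounting or inflation is applied; these simplifications and all names in the code are assumptions made for the example, not part of the cost analysis itself.

```python
def cumulative_cost(startup, annual, years):
    """Total spend after a number of years: one-off start-up plus annual costs."""
    return startup + annual * years


# Figures quoted in the text (dollars)
VIRTUAL = {"startup": 85_000, "annual": 4_000}
SAWBONES = {"startup": 4_225, "annual": 4_000}

for years in (1, 5):
    virtual_total = cumulative_cost(VIRTUAL["startup"], VIRTUAL["annual"], years)
    sawbones_total = cumulative_cost(SAWBONES["startup"], SAWBONES["annual"], years)
    # With equal annual costs, the gap stays at the start-up difference
    print(years, virtual_total, sawbones_total, virtual_total - sawbones_total)
```

Because the estimated annual costs are the same, the difference in cumulative cost under these assumptions remains equal to the difference in start-up costs, regardless of the number of years considered.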
Discussion
The fidelities of the Sawbones and virtual simulators were assessed with the postprocedure simulator-specific questionnaires. We found no previous literature evaluating the fidelity of Sawbones. The assessment of the fidelity of Sawbones was required to determine if the fidelity of the newly created virtual surgical fixation simulator was comparable to the current standard for fracture fixation of the ulna. As we found no previous surgical fidelity questionnaires in the literature, a new one was created for surgical simulation. To determine if the questionnaire was a reliable tool, we calculated the internal consistencies of the fidelity domains. In all domains except virtual simulator environment, the Cronbach α was > 0.70, indicating the domains are measuring the same construct. The virtual environment had lower internal consistency, likely because this simulation was run in a laboratory, which lacked the many visual and auditory cues provided by a real OR. To improve on the environmental domain of both simulators, we suggest that these procedures be done in a mock surgical suite with the sounds of the anesthesia machine and OR nurses to better simulate the environment. We also assessed the criterion-related validity of the questionnaires using the Pearson coefficient, which demonstrated significant correlations both within and between simulators. These add evidence that this fidelity questionnaire is both a reliable and valid tool, which will enable other groups to assess surgical simulators in the future.
The questionnaire findings were analyzed by residents’ experience levels. Junior residents scored the virtual environment significantly higher than the senior residents, with a large effect size difference (Cohen d = 1.11). This is likely because the senior residents had had more experience than the junior residents with actual surgical environments. The effect size was measured for all parametric measurements, all demonstrating large effect size differences. Effect size is reported, because it helps define the meaningfulness of statistically significant results28 and because it has been recommended by the American Psychological Association task force on statistical inference to be reported for all primary outcomes.29
The participants assigned the Sawbones simulator significantly higher levels of fidelity in all domains than the virtual simulator. Both simulations demonstrated that the psychological domain was the most difficult fidelity domain to recreate, while the environment domain was the easiest. It is frequently assumed that higher levels of technology are synonymous with increased fidelity. Our findings were similar to those of another study reporting that virtual technology was actually of lower fidelity than the hands-on simulator.30 It has been argued that high levels of fidelity are not important for all simulators, especially at junior training levels.30,31 Virtual reality simulators, at this time, appear to be better for junior residents who are acquiring knowledge and learning or practising basic surgical skills. Higher fidelity simulators are ideal for more advanced surgical skills that require multiple tasks, and experienced surgical trainees will likely benefit more from these simulators. Neither of the simulators used in our study received a high score for psychological fidelity, which may be acceptable at the novice training level. When participants were asked which surgical simulator they would use to practise surgical fixation of the ulna, most stated they would use the Sawbones instead of the virtual simulator, as seen in another virtual simulator study.7
When a new simulator is comparable to the current standard, and if surgical trainees consider it useful and are willing to use it for learning and practising procedures, an important factor in choosing one simulator over another is cost. We performed a cost analysis to provide further comparisons between these simulators based on a program of 25 residents. The largest costs for a virtual surgical simulator are often the start-up costs.32 This simulator, using a sophisticated 6 degrees of freedom haptics device, cost about $85 000 to develop. This included costs of personnel time for the development and maintenance of the simulator. In comparison, purchasing Sawbones for surgical fixation of the ulna with all its required tools would initially cost about $4225. This low cost is owing to the prior development of the Sawbones model ulna and its equipment. The annual costs for each simulator were about $4000 for 25 residents. The largest difference is that this cost is set for each resident to practise 1 procedure per year for the Sawbones procedure, whereas the resident can have unlimited attempts with the virtual ulna each year. The advantages of repeated practice time without added cost and with the possibility for the simulator to provide immediate objective feedback and incorporate surgical difficulties or more complicated fracture patterns may provide the participant with more unique reasons to use virtual simulators in the future.
Limitations
The major limitations of this study were funding for more advanced equipment and the use of senior simulator developers. These have an impact on achieving higher fidelity for the simulator. The participating engineering supervisor and laboratory have been involved in previous surgical simulations in thoracic surgery.33,34 They already had all the necessary basic equipment and software to develop this simulator in their laboratory. Additional and newer haptics devices that could have increased the fidelity of the model were not available for this study. Another limitation was the time available to develop the simulation. The goal of this study was to assess whether a high-fidelity surgical simulator with haptics could be developed in 1 year. Had more time been available, a higher-quality surgical simulator might have been developed for use in this study. Using a simulator for the first time is also a limiting factor, as many simulators have associated learning curves. Owing to costs and time, it was only feasible for the participants to practise this procedure on the virtual simulator for 10 minutes.
Conclusion
To our knowledge, this is the first study to assess the fidelity of Sawbones using reliable questionnaires that were specifically created to assess 3 separate fidelity domains of simulators. The level of fidelity in the new virtual simulator does not meet the same standard as the Sawbones simulator, and the start-up costs for a virtual simulator are much higher than the cost of buying all the materials required for a Sawbones simulator. However, once established, the annual costs of running the simulators are comparable. A high-fidelity, validated simulator that can accurately evaluate a surgical trainee and demonstrate transfer of skills to a real-life operation is a goal for surgical education. A multitude of virtual surgical simulators are available; however, many of them need to be properly evaluated to determine what upgrades are required to attain this goal. The newly developed fidelity questionnaire can help determine which simulators are closest to real life.
At this time, the cohort of orthopedic surgery residents who participated in this study preferred to use the current standard Sawbones simulator for surgical fixation of the ulna, mostly owing to the realism of the tools being used, the ability to use both hands simultaneously and the benefit of hearing the sounds produced as the fixation of the ulna is being completed. This is similar to general surgery residents, who favour video box trainers over virtual laparoscopic simulators.35 With more experience in developing these virtual simulators, increased funding and the advent of new software and hardware, virtual simulators may one day match our current standards. The ultimate goal is not to replace the current simulators or methods of teaching surgical skills, but to use virtual simulators as an additional resource for training.
Acknowledgements
The authors would like to thank the developers of the virtual ulna simulator from the Schulich School of Engineering (Calgary), all participants from the University of Calgary orthopaedic surgery program, and Synthes (Canada) Ltd. for their contribution.
Footnotes
Presented as a poster at the 28th Annual Surgeon’s Day Research Symposium, Calgary, Alta., June 25, 2010, and at the 2010 Canadian Orthopaedic Residents Association (CORA) Meeting, Edmonton, Alta., June 18, 2010.
Conflicts of interest: None declared.
Contributors: All authors contributed to the study design, analyzed and interpreted the data, reviewed the article and approved its publication. J. LeBlanc acquired the data and wrote the first draft of the article, and C. Hutchison and T. Donnon revised it for important intellectual content.
Funding: The COREF grant from the University of Calgary Division of Orthopaedic Surgery funded this study by allowing for the design and development of the virtual simulator.
Accepted August 28, 2012.