We agree in part with the criticism of Schlachta and colleagues that our conclusion that the simulator we used “is a valuable tool for training and evaluation of basic tasks in laparoscopic surgery” is not substantiated by the methods used. However, our article is only one of several on this simulator that have been presented at national and international peer-reviewed meetings and published or accepted for publication in this and other peer-reviewed journals.1–3
Validity is a matter of degree and does not exist on an all-or-none basis. Finding a significant correlation between performance scores and level of training, from junior to senior residents, suggests a degree of construct validity. Further, in this pilot study, in which residents were followed through their training, scores on 2 of the 3 tasks, as well as the total score, increased as residents underwent more training. Practice effects could have confounded such results; however, these residents were evaluated at only 2 points in time and had no practice on the simulator in the interim. The original 7 inanimate tasks developed1 were modelled after fundamental laparoscopic techniques rather than isolated psychomotor skills, thus adding face validity. Face validity was further ensured by the consensus of more than 20 well-known advanced laparoscopic surgeons that these tasks were meaningful representations of components of laparoscopic surgery. In another study2 we found that residents who practised in this inanimate model performed better in a live animal model and acquired skill more quickly than a peer group at the same PGY3 level of training who had not practised in the inanimate model. The scores in the animate model for the group that practised were also superior to those of the group without practice,3 and the scores in the inanimate model correlated significantly with analogous skills measured in the live animal in the operating room.2 All of these data support the validity of the inanimate system for measuring laparoscopic skills.
We agree that this model will require further validation by ultimately correlating performance in the model with level of surgical skill in the operating room. At this point there is no measure of skill in the operating room that can act as the “gold standard.” We are in the process of conducting a large multicentre study to test the reliability and validity of such a scoring system.