Methodology of an e-Delphi study to explore future trends in orthopedics education and the role of technology in orthopedic surgeon learning
How to cite this article: Rüetschi U, Salazar CM. Methodology of an e-Delphi study to explore future trends in orthopedics education and the role of technology in orthopedic surgeon learning. J Musculoskelet Surg Res 2022;6:121-37.
Objectives: To present the methodology of an e-Delphi study conducted to learn about trends and technology in the present and future continuing medical education for trauma surgeons at different stages of their careers.
Methods: Apply proven tools for gathering expert opinions on complex questions and obtain consensus by controlled, quasi-anonymous feedback collection using electronic means.
Results: A three-round e-Delphi achieved consensus.
Conclusions: The e-Delphi methodology applied can successfully be used to predict future trends in surgeon education and the impact of learning technology.
Keywords: Surgeon continuing medical education
The field of education has a history of adopting digital technology to enhance learning, with varied results. Providers of continuing medical education (CME) are keen to incorporate technology into learning options, but it can be challenging to gauge what would best serve surgeon needs. CME providers are faced with the constant introduction of new surgical techniques and skills, and diverse educational needs that vary based on a surgeon’s career stage, geographical location, access to and familiarity with technology, and other variables. Gaining a deeper understanding of how surgeons currently use technology for learning and their opinions on how technological trends will impact learning can help CME providers better tailor educational offerings to optimize learning outcomes.
The AO Foundation was established in 1958 as a non-profit organization that conducts research and development and provides education for surgeons working with musculoskeletal injuries. To proactively assess the evolving educational needs of the group’s global network of trauma surgeons, it was decided to gather and analyze expert opinions on trends in surgeon CME and the role of technology in surgeon learning. In addition, identifying regional differences was also of importance. The study results were published in a medical education research journal in 2019.
Selection of study methodology
There are several accepted protocols for collecting and evaluating qualitative data that focus on building consensus within a group, for example, the nominal group technique, Glaser’s approach, the National Institutes of Health (NIH) consensus development process, and the Delphi process.[5-7] Each has its advantages and disadvantages, and there are certain situations where one may be indicated over the others.
Fink et al. stated that the nominal group process requires an expert panel to meet in person and be facilitated through “highly structured” or “brainstorming” discussions. It has been pointed out that consensus reached through in-person group interaction “is an emergent property of the group interaction, not a reflection of individual participants’ opinions.” Individual personalities may dominate the discussion and motivate conformity, producing unrepresentative results. The quality of results from the nominal process or a focus group is highly reflective of the skill of the facilitator and the kind of interactions participants engage in. The NIH consensus development process also involves bringing together a panel of experts to identify “current levels of the agreement.”
As described by Fink et al., Glaser’s approach requires the researcher to assemble a small core of experts who then invite additional members to join the panel at different stages and provide input on a draft position paper the core group has authored. Input is then reviewed by the core panel and the paper successively redrafted. Each level of the discussion under the moderator’s guidance is informed by professionals in the field.
Kadam et al. found that the nominal and Delphi methods produced similar results, recommending either method for health service research. However, for this study, the Delphi method was selected as it is widely accepted to be an effective method for gathering opinions from experts without geographical or time constraints.[13,14] It has been called a “flexible, effective, and efficient research method.” Linstone and Turoff characterize Delphi “as a method for structuring a group communication process so that the process is effective in allowing a group of individuals, as a whole, to deal with a complex problem.” Indeed, it has been suggested that the Delphi model lends itself well to modification based on a researcher’s needs,[17,18] thus providing an element of flexibility.
Delphi methodology offers the opportunity to build consensus between individuals on specific topics without the need for face-to-face meetings. The pool of expert surgeons targeted by our study question was international in nature. The potential for face-to-face meetings was severely restricted by cost and schedules.
Another benefit of the Delphi methodology, one that bypasses the undue influence of group dynamics, is the fact it can be conducted in a completely anonymous manner. Distribution and collection through electronic means provide this anonymity, reduce costs, and allow participants to complete surveys at their convenience. As new technology is incorporated into daily use, the toolbox for administering Delphi surveys has evolved: From pen and paper surveys delivered through the postal service, to facsimile, and more recently, email or online surveys. Using digital channels to conduct Delphi surveys introduced the term “e-Delphi.”[8,18,21]
Central to Delphi methodology is the goal of reaching consensus in a structured way. Structured, repeated, and anonymous survey rounds provide experts the opportunity to reflect on multiple iterations of a questionnaire. After each round, the questionnaire is revised to reflect participant input and controlled feedback is delivered to participants. The process is repeated for a predetermined number of rounds, until consensus is reached as defined by a consensus rule, or until experts no longer alter their opinions.
Delphi method: Application
The Delphi method is a popular consensus method that has been used in various fields, including business, health, and medicine, for some time.[6,7,18,23-25] Since its development, it has been used particularly by fields that grapple with complex problems. The Delphi study methodology was notably developed by Olaf Helmer and Norman Dalkey of the RAND Corporation in the 1950s during the Cold War as a method for predicting future events as they related to national defense.[7,26,27]
MATERIALS AND METHODS
The Delphi technique allows for collecting and aggregating informed opinions from a group of experts over multiple iterations. While this is the basis of the technique, over time, there have been numerous modifications to what is labeled “Delphi” as there are no universally agreed-upon guidelines for a Delphi study’s structure. However, Green credits Stewart and Shamdasani for delineating generic steps in a Delphi study [Table 1].[29,30]
| Step | Action |
|------|--------|
| 1 | Develop the initial Delphi probe or question |
| 2 | Select the expert panel |
| 3 | Distribute the first-round questionnaire |
| 4 | Collect and analyze Round 1 responses |
| 5 | Provide feedback from Round 1 responses, formulate the second questionnaire based on Round 1 responses, and distribute |
| 6 | Repeat Steps 4 and 5 to form the questionnaire for Round 3 |
| 7 | Analyze final results |
| 8 | Distribute results to panelists |
Despite the large number of published Delphi studies, there is “very little scientific evidence” to support decisions on the optimal number of rounds for a study. Too many rounds can make reaching consensus challenging: rounds take place over weeks or months and may require large blocks of panelists’ time to complete. There is a risk of participant fatigue and dropout if the study continues for too long.
A three-round Delphi study was designed, as shown in [Figure 1]. We categorized our study as an “e-Delphi” as Keeney et al. stated that this sub-category follows “similar processes to a classical Delphi but [is] administered by email or online survey.” It should be noted that the study administrators were open to four rounds should it be indicated. If over 80% of statements had consensus after Round 3, the study would be considered complete, and another round would not take place.
Expert panel selection
Gordon states that the “key to a successful Delphi study lies in the selection of participants.” However, there is insufficient research to support a claim of optimal panel size and selection procedures for Delphi studies. Some panel size recommendations include 10–15 participants and 15–35 participants. Sekayi et al. called more than 30 participants an “unwieldy” number of panelists.
The literature suggests a panel be comprised of experts who are international and heterogeneous or homogeneous, depending on the aim of the study. Therefore, four experts from each of the five regional divisions in the organization (North America; Europe and Southern Africa; Asia Pacific; Latin America; and the Middle East and North Africa) were nominated for participation by the respective AO Trauma Education Commission representative, aiming for a heterogeneous panel of opinions. Nominated experts needed to meet all of the following criteria: (1) Demonstrated profound insight and interest in medical education both of residents and practicing orthopedic trauma surgeons; (2) possessed an interest in new developments in surgical simulation, emerging technologies, and the latest research in medical education; (3) recognized as clinical leaders in medical education; and (4) open-minded and forward-thinking.
The nominated surgeons were invited to participate through an individualized introductory email. Interested individuals expressed their willingness to participate by return email. Out of 20 nominated panelists, five declined participation – representing all five regions equally.
To avoid dominance or affiliation biases from influencing the study, the identities of all participants were anonymized. All efforts were made to maintain anonymity throughout the study process. No identifying information was supplied to participants at any time that would enable them to identify fellow panelists. Panelists were assigned an alpha-numeric identifier (e.g., A1, A2, B1, etc.), which was used to distinguish a specific individual’s input within the various rounds. Letters A, B, C, D, and E were used to identify each of the five geographical regions involved in the study. The numbers 1, 2, 3, and 4 each corresponded to an invited expert panelist.
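The identifier scheme described above can be sketched in a few lines. This is an illustrative reconstruction, not the study's actual tooling; the variable names are ours.

```python
# Hypothetical sketch of the anonymization scheme: letters A-E map to the
# five geographical regions, numbers 1-4 to the nominated experts per region.
REGIONS = "ABCDE"          # one letter per geographical region
EXPERTS_PER_REGION = 4     # four nominees per region

identifiers = [f"{region}{n}"
               for region in REGIONS
               for n in range(1, EXPERTS_PER_REGION + 1)]

print(identifiers[:5])   # ['A1', 'A2', 'A3', 'A4', 'B1']
print(len(identifiers))  # 20 nominated panelists
```

Because the code contains no panelist names, only the mapping held by the study administrator links an identifier back to an individual, which is what makes the design quasi-anonymous rather than fully anonymous.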
However, it should be noted that panelists, if known to each other, may have communicated about their participation. It was not possible to control for this variable, although the international spread of the experts may have minimized its likelihood. Furthermore, the study administrator was aware of the identities of participants, which made the study quasi-anonymous. This is an unavoidable trait of most, if not all, Delphi studies.
Controlled feedback is an important characteristic of Delphi studies. This generally consists of organized summaries of all responses being distributed to all participants. They are then able to see where their input falls within the spectrum of responses, clarify/revise their position, and/or provide additional insight.
This study solicited input as either written statements from participants (Round 1) or a numerical agreement ranking (Round 2 and Round 3). The study administrator provided the participants with summaries that captured the input range and asked each panelist to rank their agreement with the statements. To maintain anonymity, anonymized identifiers were attached to each summarized statement, so respondents could see the number of panel members who expressed each idea.
Surveys of the literature have revealed that varied methods are used to determine consensus and there is not a single agreed-upon or employed definition of consensus. It is, therefore, up to the individual researcher to set and abide by consensus rules of their selection. See [Table 2] for the consensus thresholds used for this study. These guided decision-making about which statements panelists agreed or did not agree on (Round 2 and Round 3), as well as which statements were to be excluded (Round 3).
| Consensus reached | Consensus not reached | Statement excluded |
|-------------------|-----------------------|--------------------|
| The mean/average score is >4 on the 5-point Likert scale | The mean/average score is <4 on the 5-point Likert scale | After three rounds, if the mean/average score is <4 on the 5-point Likert scale, the statement is excluded |
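The thresholds above reduce to a simple decision rule per statement. The sketch below is an illustrative rendering under our own naming, not the study's actual analysis code; note that Table 2 does not define the case of a mean of exactly 4, so the sketch treats it as no consensus.

```python
# Sketch of the Table 2 consensus rules applied to one statement's
# 5-point Likert scores (illustrative names, not the study's code).
from statistics import mean

def classify_statement(scores, final_round=False):
    """Classify a statement as consensus / no consensus / excluded."""
    avg = mean(scores)
    if avg > 4:
        return "consensus"
    # Mean not above 4: no consensus; after the final (third) round
    # such a statement is excluded.
    return "excluded" if final_round else "no consensus"

print(classify_statement([5, 5, 4, 5]))                    # consensus (mean 4.75)
print(classify_statement([3, 4, 2, 5]))                    # no consensus (mean 3.5)
print(classify_statement([3, 4, 2, 5], final_round=True))  # excluded
```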
Round 1 – Open-ended questionnaire
The study administrator developed a set of nine questions [Appendix I] targeting the study’s aim that solicited panelists’ opinions on surgeon education in the present and future, with consideration of both global and regional differences. In addition, the questions asked for predictions for the year 2022 and how surgeons at different stages of their careers would use technology to access CME, where and when they would access it, and what entity would fund their CME.
[Figure 2] is an illustration of the matrix of information requested for each level of the surgeon (trainee/resident, practicing, and expert). An open-ended questionnaire was distributed by email as a Microsoft Word document to panelists who agreed to participate. They were asked to provide their opinions as long-format responses typed directly into the document.
The Round 1 questionnaires were anonymized on receipt with each panelist’s alpha-numeric identifier, for example, A1, A2, etc., attached to their input. Keywords and main ideas were extracted from the panelists’ written responses and these were tabulated. Twenty-six summary statements were synthesized by the study administrator, which aimed to capture the range of panelists’ Round 1 input (Example: “Conflict of interest and compliance/regulation (A3, C3, and E4) issues currently exist and may need addressing in the future.”).
Round 2 – Ranking evaluation
Twenty-six summary statements based on Round 1 input were used to formulate the Round 2 questionnaire [Appendix II]. This questionnaire used the anonymous identifiers to indicate to participants how many panelists echoed each statement, as well as to let them see where their own opinions fell within the range of responses.
The Round 2 questionnaire included a 5-point Likert scale for respondents to indicate their level of agreement with each summary. Each summary was also accompanied by a comment field to allow for the provision of feedback, revision to their position, and/or justification for their ranking. It was distributed to, and returned by, the expert panel by email.
Upon receipt of the Round 2 questionnaires, the ranking scores of each question were tabulated by the study administrator. Comments, if provided, had their keywords and ideas extracted to determine if and what summary statement revisions were indicated. The decision to make revisions was based on the consensus thresholds [Table 2], that is, if consensus was indicated (mean >4), then no change was made to a statement.
Median, mean, standard deviation, and 95% confidence intervals were calculated for each question. The standard deviation is commonly used to “assess consensus.” These values allowed the study administrator to determine if consensus had been reached based on the predetermined consensus rules [Table 2]. In addition, these analyses indicated where the expert group’s opinions converged, diverged, and were potentially influenced by extreme outliers.
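The per-statement summary statistics can be computed as follows. This is a minimal sketch under our own assumptions (function and variable names are ours; the 95% confidence interval uses a normal approximation with z = 1.96), not the study's actual analysis code.

```python
# Illustrative computation of per-statement summary statistics:
# mean, median, sample standard deviation, and a normal-approximation 95% CI.
from math import sqrt
from statistics import mean, median, stdev

def summarize(scores, z=1.96):
    """Summarize one statement's Likert scores from the expert panel."""
    n = len(scores)
    m = mean(scores)
    sd = stdev(scores)               # sample standard deviation
    half_width = z * sd / sqrt(n)    # 95% CI half-width (normal approx.)
    return {"mean": m, "median": median(scores), "sd": sd,
            "ci95": (m - half_width, m + half_width)}

# Hypothetical ratings from a 15-member panel for one statement.
stats = summarize([5, 4, 4, 5, 3, 4, 5, 4, 4, 5, 4, 3, 5, 4, 4])
print(round(stats["mean"], 2))  # 4.2
```

A low standard deviation and a narrow confidence interval indicate convergence of opinion, while a wide interval flags divergence or the influence of outliers, matching the interpretation described above.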
Round 3 – Ranking evaluation
If a lack of consensus was indicated (i.e., mean <4), the corresponding Round 2 summary statement was revised. Comments that gave insight into the source of disagreement were referenced to inform the revisions. It should be noted that not all respondents provided comments to support their scoring decisions. The anonymized identifiers were modified with the addition of the number one at the front of the code to indicate that a statement had been added or revised based on feedback from Round 2, for example, 1A1, 1E2, etc. This allowed the panel to identify the aspects of each summary statement that were different from the previous round.
These revised statements, as well as the questions that had established consensus, were then issued as the Round 3 questionnaire [Appendix III]. It consisted of 25 summary statements and included a 5-point Likert scale for each question but no comment field. The number of statements in Round 3 was reduced by one due to an oversight and accidental deletion of a question (note a). The Round 3 questionnaire was distributed by email to all panelists.
Upon return of the Round 3 questionnaires, mean, median, standard deviation, and 95% confidence intervals were calculated for each summary statement to determine if consensus had been reached based on the predetermined consensus rules [Table 2].
Final summary distribution
Seven key findings were distilled from the statements that remained after decisions regarding which to discard were made. These decisions were aligned with consensus rules set during the study design phase. The key findings were synthesized into a final summary document that was circulated to participants by email [Appendix IV].
The Delphi technique (e-Delphi) was used to gather the opinions of a geographically scattered expert panel around the role of technology in surgeon CME. Future trend predictions, regional variation, as well as differences in surgeon educational habits at different career stages, formed a matrix of information that was of interest [Figure 2].
The Delphi method is an acknowledged technique for examining complex questions and offers flexibility to researchers. This study’s questions [Appendix I-III] were not factual – they required predictions and opinions, which were used to build consensus – therefore, it was a suitable study type to use for this research.
However, the Delphi technique also has limitations. One drawback is the time it takes to complete, which can lead to participant fatigue and dropout. Our study did not experience this issue. Of the 20 invited panelists, 15 agreed to participate and all 15 completed all three rounds of the survey. This is a 75% initial response rate and, within this cohort, a 100% completion rate; this is extremely unusual for a Delphi study.
A diminishing response rate over each iteration of the survey is an acknowledged risk of the Delphi process and one that can compromise the quality of the information. A high response rate may indicate the panel felt that the study was worthwhile. The ease of communication and of distribution/submission by email may have also contributed to the unusually high response rate, as panelists were free to complete the questionnaires when most convenient. Email made the distribution and submission process instantaneous for all panelists regardless of location, eliminating the time lag that the postal service would introduce.
Some researchers have pointed out that in-depth conversations are beneficial when seeking consensus as this offers the opportunity for deeper-level thinking to reveal a “conceptual basis” for opinion, that is, a rationale for an opinion. If a Delphi study includes a face-to-face meeting of panelists at some point in the process, it is labeled as “modified.” Due to cost, time, and scheduling constraints, an in-person meeting of our panelists was not possible. It should be noted that this decision is aligned with the design of a traditional or “classic” Delphi study and does not invalidate any results.
Face-to-face meetings can have drawbacks. Groups have the inherent risk of being dominated by stronger personalities, possibly preventing some members from freely expressing their opinions. Social conformity is well-documented in human populations and humans are “highly susceptible to social influence.” Aside from erasing the anonymous aspect of the study, a face-to-face meeting introduces an element of social influence that may result in normative conformity. During face-to-face discussions, panelists may experience social pressure to conform with a particular opinion, even if it is not an opinion they share, and may be socially rewarded for doing so. Our study did not include in-person interaction to minimize this probability and anonymized input to further remove participants from normative conformity pressures.
The three-round e-Delphi method succeeded in achieving a consensus among the experts.
This e-Delphi study was conducted to learn about predicted future trends in surgeon CME as well as surgeon use of technology in learning. The Delphi method is a technique that has been shown to be successful in building consensus on a given topic. We have obtained a series of statements that the expert panel members agree represent predicted trends in surgeon CME that incorporate regional differences and the needs of surgeons at different stages of their careers. The organization will use this information to integrate appropriate technology into the modification and development of existing and future learning resources and courses.
UR and CMS designed and directed the project. UR performed the study and wrote the article.
The authors confirm that this study was prepared in accordance with COPE rules and regulations. Given the nature of the study and the fact that it did not include any patient-related data, IRB review was not required.
DECLARATION OF PARTICIPANTS’ CONSENT
The authors certify that they have obtained all appropriate participants’ consent forms. In the form, the participants have given their consent for their images and other clinical information to be reported in the journal. The participants understand that their names and initials will not be published and due efforts will be made to conceal their identity, but anonymity cannot be guaranteed.
FINANCIAL SUPPORT AND SPONSORSHIP
This study did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
CONFLICTS OF INTEREST
The authors report no conflicts of interest. Urs Rüetschi is an employee of the AO Foundation, Switzerland.
- The Impact of Digital Technology on Learning: A Summary for the Education Endowment Foundation. School of Education, Durham University. Durham: Education Endowment Foundation; 2012.
- About the AO Foundation. 2018. Available from: https://www.aofoundation.org/Structure/the-ao-foundation/pages/about.aspx
- A generic toolkit for the successful management of Delphi studies. Electron J Bus Res Methodol 2005;3:103-16.
- Qualitative Delphi method: A four round process with a worked example. Qual Rep 2017;22:2755-63.
- The Delphi Method: Techniques and Applications. Boston: Addison-Wesley Educational Publishers Inc.; 1975.
- Using the Delphi Method. 2011. Available from: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6017716 [Last accessed on 2018 Oct 18].
- Gazing Into the Oracle: The Delphi Technique and its Application to Social Policy and Public Health. London: Jessica Kingsley Publishers; 1996.
- The Delphi Technique in Nursing and Health Research. West Sussex: Wiley-Blackwell; 2011.
- Focus groups: Theory and practice. In: Applied Social Research Methods Series. Vol 20. Newbury Park: Sage; 1980.
- The Delphi technique: Making sense of consensus. In: Practical Assessment, Research and Evaluation. Vol 12. 2007.
- The Delphi Method. In: Futures Research Methodology, AC/UNU Project. 1994.
- A comprehensive study of the ethical, legal and social implications of advances in biochemical and behavioral research technology. In: Adler M, Ziglio E, eds. Gazing into the Oracle: The Delphi Technique and its Application to Social Policy and Public Health. London: Jessica Kingsley Publishers; 1996. p. 89-132.
- Theoretical, methodological and practical issues arising out of the Delphi method. In: Adler M, Ziglio E, eds. Gazing into the Oracle: The Delphi Method and its Application to Social Policy and Public Health. Vol 2. London, Philadelphia, PA: Jessica Kingsley Publishers; 1996. p. 34-55.