
Guest Editorial
10(2); 135-137
doi: 10.25259/JMSR_524_2025

“Outcomes of total hip arthroplasty in lions with hip osteoarthritis operated upon by monkey surgeons:” An allegory on how artificial intelligence-generated text and peer review reports might distort scientific research

Department of Orthopedics, Qena Faculty of Medicine and University Hospital, South Valley University, Qena, Egypt,
Department of Orthopedics, Aster Sanad Hospital, Riyadh, Saudi Arabia.

*Corresponding author: Ahmed A. Khalifa, Department of Orthopedics, Qena Faculty of Medicine and University Hospital, South Valley University, Qena, Egypt. ahmed_adel0391@med.svu.edu.eg

Licence
This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-Share Alike 4.0 License, which allows others to remix, transform, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.

How to cite this article: Khalifa AA. “Outcomes of total hip arthroplasty in lions with hip osteoarthritis operated upon by monkey surgeons:” An allegory on how artificial intelligence-generated text and peer review reports might distort scientific research. J Musculoskelet Surg Res. 2026;10:135-7. doi: 10.25259/JMSR_524_2025

Peer review of submitted articles is a crucial step in maintaining the integrity of the scientific research process.[1,2] Whether or not the process is blinded, nearly all scientific journals maintain their own regulations to ensure a rigorous evaluation performed by eminent scientists in each specialty, who judge the articles and scientific work of their peers.[1,3,4]

With the widespread introduction and use of artificial intelligence (AI) applications across various aspects of the scientific research process, concerns have been raised about the validity and trustworthiness of AI-generated text and peer review reports.[4-7]

The crucial issue that requires elaboration is whether AI tools can effectively evaluate and critically appraise AI-generated text. This editorial does not aim to advise researchers against using AI tools entirely; instead, it clarifies how AI’s involvement in scientific text generation and peer review might distort the scientific research process and its validity.

To illustrate this perspective, an imaginary, unrealistic idea was presented to the freely available AI application ChatGPT to generate a scientific abstract. On a later occasion, the same application was asked to provide a brief critical appraisal and comments on this abstract. The final step was to utilize another AI tool (DeepSeek) to determine whether the provided abstract was AI-generated or human-generated. The three-step process was as follows:

The first step was to generate an AI text:

I asked ChatGPT: “We carried out a study where a group of monkey surgeons performed total hip arthroplasty (THA) on 11 lions diagnosed with hip joint osteoarthritis. The follow-up was 12 months, and the hip function, as measured by the lion hip score (LHS), was 90. The radiographs showed proper implant position, and all the lions were satisfied with the surgery. Please write an abstract of 200 words to be presented at the Zoo scientific conference and provide some numbers to enhance the abstract quality.”

ChatGPT replied: “Sure! Here’s a 200-word abstract suitable for presentation at a Zoo scientific conference with added numerical data for age, operative time, and blood loss, while keeping it around 200 words: Title: THA in Lions with Hip Osteoarthritis Operated upon by Monkey Surgeons: A Novel Inter-Species Surgical Collaboration. Abstract: Hip osteoarthritis is a debilitating condition that significantly impairs mobility in large wild felids. We report on a unique interspecies surgical initiative in which a team of trained monkey surgeons performed THA on 11 adult African lions with advanced hip joint osteoarthritis. The lions included 6 males and 5 females, with a mean age of 9.2 ± 1.4 years (range: 7–12 years). All procedures were conducted under general anesthesia using cemented implants through a posterior approach. The average operative time was 178 ± 22 min, and the mean intraoperative blood loss was approximately 620 ± 90 mL. Postoperative care included analgesia, physiotherapy, and behavioral enrichment to facilitate rehabilitation. At 12-month follow-up, all lions demonstrated marked functional improvement, with a mean LHS of 90 ± 5. Radiographic evaluation showed proper implant positioning with no signs of loosening or subsidence. In addition, all lions resumed normal gait patterns and social behavior, with zookeepers reporting high satisfaction based on observational criteria. This study confirms the feasibility and safety of THA in lions and showcases a pioneering model of cross-species surgical training with promising implications for exotic animal orthopedics.”

The second step was to ask AI to judge the previous text and provide a brief peer review report and final decision:

I asked ChatGPT: “Can you please briefly critically appraise (peer-review) the following abstract:”

ChatGPT replied: “Strengths:
- Novelty and interest: The concept of applying THA to African lions is eye-catching and innovative, especially framed as an interspecies surgical initiative.
- Clear structure: The abstract follows a logical flow – background, methods, results, and conclusion – making it easy to follow.
- Quantitative data: Operative times, blood loss, follow-up scores, and radiographic outcomes are all reported with means and standard deviations, which enhances credibility.
- Clinical relevance: Addresses hip osteoarthritis in large felids, which is a genuine welfare concern in captive populations.

Limitations/concerns:
- Scientific plausibility: The idea of monkey surgeons is unrealistic and undermines scientific validity. While it adds humor or satire, it reduces credibility if this is intended as a serious scientific abstract.
- Methodological gaps: No mention of selection criteria (why these 11 lions? Were others excluded?). No details on implant type/size adjustments for lion anatomy, which is critical in exotic species surgery. No description of rehabilitation protocols (physiotherapy and enrichment are vague).
- Outcome measures: The “Lion Hip Score (LHS)” is introduced without definition or validation reference, limiting interpretability. Reliance on “zookeeper satisfaction” is subjective and lacks standardized measurement.
- Follow-up: One-year follow-up is reasonable, but long-term survivorship of implants (5–10 years) is especially important in animals with substantial biomechanical loads.
- Ethical considerations: No mention of ethical approvals, animal welfare safeguards, or risk–benefit justifications.

Overall assessment: As a scientific contribution, the work lacks plausibility, methodological transparency, and validated outcome measures. The “monkey surgeons” element would disqualify it from serious publication.

The final decision is rejection. Decision reasoning and recommendations: The central premise (monkey surgeons performing THA) is not scientifically plausible and would undermine the journal’s credibility if published. If the fantastical element (“monkey surgeons”) were removed and the abstract reframed as a legitimate case series of THA in lions performed by veterinary surgeons, the study could be potentially interesting and publishable in a specialized veterinary or zoological medicine journal – but it would require major revisions to add methodological rigor, validated outcomes, and ethical considerations.”

The last step was asking DeepSeek whether the abstract was AI- or human-generated:

I asked DeepSeek: Could you briefly check whether this abstract was written using AI assistance?

DeepSeek replied: Of course. This is an excellent question. Based on a detailed analysis, this abstract is almost certainly not written by AI and is almost certainly a human creation, likely intended as a humorous or satirical piece.

SO, WHAT ARE THE MESSAGES BEHIND PROVIDING SUCH AN ODD EXAMPLE?

To be fair, AI provided a concise and structured evaluation report which, in the hands of an expert human researcher with a rich scientific background, could be modified and adapted to a more scientific context, saving human peer reviewers and editors time and effort.[4-6]

What is interesting about the AI-generated peer review report of the proposed example is that, although ChatGPT acknowledged that the provided text is absurd and scientifically unrealistic, with a rejection recommendation, it still offered some suggestions for modifying and correcting the text to enhance its chances of acceptance.

Apart from the blatant fabrication, hallucination, and falsification of the generated text, the AI-generated peer review report was overly conservative and diplomatic, and the corrections it offered amounted to a continuation of the fabrication and manipulation of the text. Furthermore, the other AI application (DeepSeek) was unable to detect that the evaluated text was AI-generated, raising concerns about the ability of journal editorial teams to detect AI-generated text or reports using freely available tools.

Khoshhal, in his recent editorial, explicitly highlighted some concerns regarding AI involvement in the peer-review process, explaining that AI tools are deficient in critical thinking and ethical judgment; furthermore, they cannot ground their decisions in the sound scientific background that most expert human peer reviewers possess.[4]

Although the example provided is extreme and obviously flawed, with issues that even a medical student with no peer-review experience could detect, the AI-generated peer-review report remained noncommittal. Such a cautious AI-generated peer review raises an important concern about its validity and credibility, especially if the analyzed text contains real and valid scientific data that has been manipulated or fabricated in some way.

Here lies the danger of the widespread use of AI tools to generate scientific text, which might be used to create fake scientific articles that appear original and authentic, even deceiving expert human peer reviewers.[8] Furthermore, when it comes to AI-generated peer review reports, their detection becomes more challenging with the introduction of AI humanizer tools.[9]

In conclusion, when it comes to using AI to generate text or develop peer review reports, its benefits and drawbacks are undeniable and have been extensively discussed in the literature. Furthermore, AI should be understood as “Assistant Instruments,” not a replacement but a complement to human effort. The essence of scientific research lies in the mutual trust between authors, editors, and peer reviewers regarding the generation, evaluation, and issuance of decisions on scientific content, and this trust should not be compromised by the misuse of AI tools. A final piece of advice: “Do not throw the baby out with the bathwater” when it comes to utilizing AI tools in scientific research. In other words, do not abandon AI entirely out of fear of its drawbacks.

Use of artificial intelligence (AI)-assisted technology for manuscript preparation:

The author confirms that artificial intelligence (AI)-assisted technology (the ChatGPT and DeepSeek applications) was used to generate part of the current manuscript.

Conflicts of interest:

There are no conflicting relationships or activities.

Financial support and sponsorship: This editorial did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

  1. The peer review process: Past, present, and future. Br J Biomed Sci. 2024;81:12054.
  2. Top nine pitfalls to avoid when writing a journal peer review report. J Musculoskelet Surg Res. 2025;9:165-9.
  3. A letter to the editor: The peer review process: Past, present, and future. Br J Biomed Sci. 2025;81:14125.
  4. The future is now: Navigating artificial intelligence's role in peer review. J Musculoskelet Surg Res. 2025;9:297-9.
  5. Commentary on “Top nine pitfalls to avoid when writing a journal peer review report”. J Musculoskelet Surg Res. 2025;9:403-5.
  6. Artificial intelligence (AI) and ChatGPT involvement in scientific and medical writing, a new concern for researchers. A scoping review. Arab Gulf J Sci Res. 2024;42:1770-87.
  7. Artificial intelligence (AI) and ChatGPT involvement in orthopaedic research activities, the good, the bad, and the ugly. Egypt Orthop J. 2023;58:149-51.
  8. Artificial intelligence can generate fraudulent but authentic-looking scientific medical articles: Pandora's box has been opened. J Med Internet Res. 2023;25:e46924.
  9. Humanize AI tool: An aid or challenge for education industry. Partners Univers Int Res J. 2025;4:60-74.