Evaluating ChatGPT’s efficacy in addressing common patient questions in plastic surgery consultations
Abstract
Aim: This study aims to investigate the ability of ChatGPT to answer common patient questions related to the five most common aesthetic procedures.
Methods: We asked three questions, two of which were the same across procedures, for each of the five most common plastic surgery procedures as determined by the 2022 American Society of Plastic Surgeons (ASPS) Procedural Statistics Release. The questions were posed to ChatGPT on the same day, using the same account. The responses were then compared to the corresponding information on the ASPS website.
Results: We found that ChatGPT provides accurate, organized, and grammatically correct responses to common patient questions. It was comparable to the ASPS website in terms of comprehensiveness of the complications listed across procedures. Its responses regarding recovery time were less detailed than the corresponding ASPS articles, although it included accurate information on recovery time that was unavailable on the ASPS site. For procedure-specific questions, ChatGPT was more detailed 2/5 times, less detailed 1/5 times, and provided a completely different answer than the ASPS website 2/5 times.
Conclusion: This study provides support for ChatGPT’s utilization as a tool to improve the efficiency of consultations for aesthetic procedures. However, it is important to recognize ChatGPT’s limitations in answering questions in a patient/procedure-specific way. Therefore, it is not a substitute for a consultation with an experienced surgeon. Further research is needed to assess the reliability of ChatGPT before it can be fully recommended as a patient learning tool.
Keywords
INTRODUCTION
ChatGPT is a generative artificial intelligence (AI) platform developed by OpenAI. It was trained on extensive text datasets in multiple languages to be able to generate human-like responses to text-based input[1]. Since its release, numerous studies have been published on how generative AI could revolutionize our lives and improve efficiency in fields such as computer programming, environmental studies, and medicine[2,3]. Specifically in healthcare, future iterations of generative AI have the potential to analyze speech patterns and imaging for early diagnosis of psychiatric illness and cancer, along with the potential to create accurate models of biological processes to streamline drug development and testing[4]. The medical field, which is constantly seeking ways to improve patient outcomes, increase productivity, and enhance patient satisfaction, was quick to adopt and research this technology. Current research indicates that AI can perform comparably to humans on board exams and diagnosing patients[3,5,6]. Additionally, studies have been published on how ChatGPT can be used to enhance research productivity, aid in patient education, and help with clinical decision making[5,6]. However, ChatGPT and other AI models may provide inaccurate or outdated information because they cannot distinguish between reliable and unreliable resources, and they may also fabricate information entirely. They are additionally sensitive to the phrasing of questions and struggle to clarify ambiguous prompts[5]. These findings prompt further exploration into whether ChatGPT could be a valuable tool to support or even replace certain functions of physicians within the healthcare system.
One area where ChatGPT’s utility can be further evaluated is in medical consulting and patient education. Existing studies present mixed results on ChatGPT’s ability to perform this function. While one study showed that ChatGPT is comparable to physicians in terms of responding based on evidence-based guidelines, another study showed it to be unreliable in its completeness and accuracy[7,8]. A third study went so far as to claim that ChatGPT exceeded physicians in accuracy, completeness, and overall quality[9]. These conflicting results underscore the need for additional investigations.
Inconsistent findings could be attributed to variations in ChatGPT itself or in the type of prompts presented. For example, one study presented ChatGPT with 123 prompts. While the authors found most of its answers to be above average, some responses were described as “hazardously incorrect and incomplete”. This study also showed that the style of the prompt - whether it demonstrated health literacy, used negation, or asked a question - significantly influenced the response given by ChatGPT[10]. Another review, which analyzed fewer prompts, emphasized ChatGPT’s consistency as one of its strengths over human physicians[11]. Understanding the contexts in which ChatGPT is reliable is crucial, as variations in its accuracy and content could impact its recommendation as a patient education tool.
Due to the elective nature of aesthetic procedures, patient education is essential in deciding if they are suitable. Past research has consistently shown ChatGPT to be easy to understand and correct when answering questions in a mock plastic surgery consultation. Discrepancies still exist in its ability to provide individualized advice. Three articles explore the ability of ChatGPT to answer broad questions related to a specific plastic surgery procedure in a mock consultation setting for rhinoplasty, abdominoplasty, and breast augmentation. All concluded that ChatGPT was consistently correct, comprehensive, and well-organized. However, it lacked the ability to provide personalized advice[12-14]. Another study using vignette-style questions, giving a description of a patient before asking a question, found ChatGPT to outperform physicians in accuracy, completeness, and overall quality based on ratings from physicians[9]. This suggests that ChatGPT can provide more individualized advice when given general descriptors of a patient, but it cannot provide this information without a prompt. Understanding the level of information required to elicit specific responses from ChatGPT is important when advising patients on using it for education.
Depending on the results of further research, ChatGPT could either supplement patient education and improve efficiency in initial consultations with surgeons, or could replace the majority of patient education in these consultations. While previous studies have focused on specific consultations for procedures such as rhinoplasties or breast augmentations, limited literature exists comparing ChatGPT’s responses across various plastic surgery procedures. To address this gap, this study aimed to investigate the ability of ChatGPT to answer common patient questions related to the five most common aesthetic plastic surgeries. The main objectives of this study were to determine the comprehensiveness, accuracy, and understandability of ChatGPT’s responses to common patient questions to evaluate its ability to be a source of patient education. Based on previous literature, we believed ChatGPT’s responses would be organized, understandable, and generally accurate, with only minor potential inaccuracies. However, they might not be entirely comprehensive or specific. We employed ChatGPT-3.5 to evaluate its capacity, accuracy, comprehensiveness, and efficacy in providing perioperative responses to patients.
METHODS
We asked ChatGPT three questions related to the five most common aesthetic procedures as determined by the 2022 American Society of Plastic Surgeons (ASPS) Procedural Statistics Release[15]. The first two questions were the same for all five procedures and were based on suggested questions from the same report. These two questions were chosen because they were applicable to all five procedures, were not specific to patient or physician, and provided a relatively complete picture of the surgery and its risk profile. The last question was procedure-specific and was chosen based on common complications and procedure specificity. These questions were asked at the same time on the same account in the order they are presented below. The questions were pasted directly from a Word document, where they were assessed for grammatical and syntactical errors. ChatGPT’s responses were then compared to corresponding blogs or articles on the ASPS website and assessed for accuracy and comprehensiveness. For comprehensiveness, if ChatGPT’s response covered the key points of the corresponding ASPS article, it was considered comprehensive. If it omitted information, it was considered less comprehensive, and if it included information not listed in the ASPS article, it was considered more comprehensive. Comprehensiveness was not assessed when ChatGPT provided a completely different answer from the ASPS article. The information in ChatGPT’s response was considered accurate if it was comparable to the information in the ASPS article.
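The comprehensiveness rule described above can be illustrated with a minimal Python sketch. This is not the procedure used in the study, where the comparison was a qualitative judgment by the authors. The sketch assumes that each source can be reduced to a set of key points, that a "completely different answer" means no key points are shared, and that a response which both omits and adds points is treated as comparable. The names `Rating` and `rate_comprehensiveness` are hypothetical.

```python
from enum import Enum


class Rating(Enum):
    LESS = "less comprehensive"
    COMPARABLE = "comprehensive"
    MORE = "more comprehensive"
    NOT_ASSESSED = "different answer (not assessed)"


def rate_comprehensiveness(chatgpt_points, asps_points):
    """Classify a ChatGPT response against the ASPS article by key points.

    Assumption: both inputs are collections of comparable key-point labels.
    """
    chatgpt_points = set(chatgpt_points)
    asps_points = set(asps_points)

    # Assumption: no overlap at all stands in for "completely different answer".
    if not (chatgpt_points & asps_points):
        return Rating.NOT_ASSESSED

    missing = asps_points - chatgpt_points  # in ASPS, omitted by ChatGPT
    extra = chatgpt_points - asps_points    # in ChatGPT, not listed by ASPS

    if missing and not extra:
        return Rating.LESS
    if extra and not missing:
        return Rating.MORE
    # Either identical coverage, or each source lists points the other lacks.
    return Rating.COMPARABLE


# Illustrative key points paraphrased from the "Can fat return after
# liposuction?" comparison; the labels themselves are hypothetical.
asps = {"fat cells permanently removed", "fat can return with weight gain",
        "specific lifestyle tips"}
chatgpt = {"fat cells permanently removed", "fat can return with weight gain"}
result = rate_comprehensiveness(chatgpt, asps)
print(result.value)
```

Treating the mixed case (each source lists items the other lacks) as comparable mirrors how the complication lists are described in the Results, but it is a design choice of this sketch rather than a rule stated in the study.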
RESULTS
Liposuction
What are the risks and complications associated with liposuction?
This response begins with an explanation of what liposuction is and a description of the procedure’s general safety. It then lists 12 possible complications [Figure 1] with a sentence or two of explanation. The response ends with a statement about the importance of following up with a qualified surgeon to discuss the mitigation of these risks.
Figure 1. ChatGPT’s response to the question “What are the risks and complications associated with liposuction?”
ChatGPT provides a suitable explanation of liposuction and some of the associated complications. The ASPS lists the complications associated with liposuction:
• Anesthesia risks
• Bruising
• Cannula breakage
• Change in skin sensation that may persist
• Damage to deeper structures such as nerves, blood vessels, muscles, lungs, and abdominal organs
• Deep vein thrombosis, cardiac and pulmonary complications
• Fluid accumulation
• Infection
• Irregular contours or asymmetries
• Irregular pigmentation
• Need for revision surgery
• Poor wound healing
• Rippling or loose skin, worsening of cellulite
• Swelling
• Thermal burn or heat injury from ultrasound with the ultrasound-assisted lipoplasty technique
In comparison to this list, the ChatGPT response is similarly comprehensive, but both lists include complications not specifically listed by the other source. The ASPS article does not include explanations as ChatGPT’s response does. Both ChatGPT and the ASPS article address the importance of discussing risks with your surgeon, though only ChatGPT discusses how individual differences may play a role.
How long of a recovery period can I expect, and what kind of help will I need during my recovery?
This response begins with a statement about how recovery time can vary depending on the individual patient. It then describes the general recovery timeline, which is divided into 6 periods [Figure 2]. Each period is accompanied by a couple of sentences on what the patient may be experiencing in terms of recovery. This section is followed by a few general types of assistance the patient may need during their overall recovery. This response ends with a statement on the importance of following the surgeon’s advice and understanding that everyone’s recovery varies.
Figure 2. ChatGPT’s response to the question “How long of a recovery period can I expect, and what kind of help will I need during my recovery?”
In comparison with an ASPS blog on liposuction recovery, the ChatGPT information on the recovery period is similarly organized (by time period) but less detailed. It covers the main points but is missing some specifics on aspects such as drains, when compression garment use can be discontinued, and when work can be resumed. The ASPS was much less detailed about help a patient may need following liposuction, and only mentioned information contained in ChatGPT’s “help at home” point.
Can fat return after liposuction?
This response begins by explaining that liposuction is not a substitute for a healthy lifestyle and that fat can return after the procedure [Figure 3]. It elaborates on this by further explaining liposuction and the importance of maintaining a healthy lifestyle following the procedure. The last paragraph is a statement on the importance of the surgeon’s skills, the patient’s postoperative instructions, and the importance of discussing goals with the surgeon.
Compared with a blog by the ASPS, the response by ChatGPT is less detailed. It covers the main points of the ASPS article: liposuction permanently removes fat cells, but fat can return if the patient gains weight. Both responses recommend maintaining a healthy lifestyle as a way of preserving postoperative results, but the ASPS includes more specific tips on how this can be achieved.
Breast augmentation
What are the risks and complications associated with breast augmentation?
This response begins with an explanation of what breast augmentation is and a few statements about the importance of considering risk factors before undergoing this procedure. It then lists the following potential risks and complications: infection, bleeding, anesthesia risks, scarring, capsular contracture, implant rupture or leakage, changes in sensation, changes in breast and nipple position, seroma and hematoma, visible rippling or wrinkling, unsatisfactory results, and breastfeeding challenges. Each of these potential risks is accompanied by a sentence or two of explanation. The response ends with a statement about the importance of choosing an experienced surgeon, having a discussion with this surgeon about your medical history, following pre- and postoperative care instructions, and attending follow-up appointments.
ChatGPT provides a suitable explanation of breast augmentation. The ASPS lists the complications associated with breast augmentation:
• Anesthesia risks
• Breast implant-associated anaplastic large cell lymphoma (BIA-ALCL) or other very rare cancers in the capsule around the breast, such as breast implant-associated squamous cell carcinoma (BIA-SCC)
• Bleeding
• Changes in nipple or breast sensation
• Fluid accumulation (seroma)
• Formation of tight scar tissue around the implant (capsular contracture)
• Hematoma
• Implant leakage or rupture
• Infection
• Persistent pain
• Poor scarring
• Possibility of revision surgery
• Wrinkling of the skin over the implant
• Wrong or faulty position of the implant
The list of complications provided by ChatGPT is similarly comprehensive to the one provided by the ASPS, and it explains each possible complication in a concise and easily comprehended manner. Both lists include complications not listed by the other source, such as breastfeeding challenges (listed by ChatGPT) and BIA-ALCL (listed by ASPS). The ASPS article discusses screening recommendations and how pregnancy, weight loss, and menopause may influence breast augmentation, which ChatGPT does not mention. ChatGPT puts more emphasis on following up with your surgeon.
How long of a recovery period can I expect, and what kind of help will I need during my recovery?
This response begins with a statement about how recovery time can vary from person to person based on a variety of factors. It then divides the postoperative time frame into 6 sections: immediate postoperative period (first few days), first week, first two weeks, two to four weeks, four to six weeks, and six weeks onward. The details of these sections are very similar to the corresponding section in the liposuction sample and include the time frame for swelling and bruising reduction, pain medication and surgical bra usage, and resuming increasingly strenuous activities. It also discusses how full recovery can take several months, and how scarring can be managed. The next section, “assistance during recovery”, is nearly identical to the sample liposuction section.
In comparison to a blog on breast augmentation recovery by the ASPS, the ChatGPT response has a less detailed summary of the recovery process. It includes the main points addressed by the ASPS. It also uses a similar time breakdown, but includes a time period of 6 weeks onward, providing additional information on full recovery and exercising. The ASPS blog had no information on assistance that may be required during the recovery period.
How can I expect my breast implants to look over time?
The response begins with a statement about how breast implants change over time due to various factors and how it is important to understand that they may require maintenance or replacement. Then it lists general considerations that affect their appearance over time with a bullet point or two of explanation. The general considerations are aging and changes in breast tissue, pregnancy and breastfeeding, weight fluctuations, capsular contracture, implant rupture or leakage, natural settling and drop, changes in nipple and breast position, and implant revision or removal. The conclusion is a couple of paragraphs emphasizing that breast implants do not last indefinitely and that maintenance and follow-up with a plastic surgeon are imperative for the longevity of the implants.
This response is much more detailed than the ASPS article on breast augmentation. The main points are similar: they are not lifetime devices and require maintenance or replacement. The ASPS article does list aging, weight fluctuations, hormonal factors, and gravity as things that may affect the appearance of breast augmentation over time. It does not include descriptions of these factors.
Tummy tuck
What are the risks and complications associated with tummy tucks?
This response begins with an explanation of tummy tucks/abdominoplasties and a few statements regarding the importance of considering risk factors before undergoing this procedure. It then lists the following potential risks and complications: infection, bleeding, delayed healing, seroma and hematoma, changes in sensation, skin necrosis, scarring, asymmetry, deep vein thrombosis (DVT) and pulmonary embolism (PE), unsatisfactory results, and anesthesia risks. Each of these potential risks is accompanied by an explanatory sentence or two. The response ends with a statement about the importance of choosing an experienced surgeon, having a discussion with this surgeon about risks and complications, undergoing a thorough preoperative evaluation, following postoperative care instructions, and attending follow-up appointments.
ChatGPT provides a suitable explanation of tummy tucks but omits the fact that the procedure often restores weakened or separated muscles. The ASPS lists the complications associated with tummy tucks:
• Anesthesia risks
• Asymmetry
• Bleeding
• Deep vein thrombosis, cardiac and pulmonary complications
• Fatty tissue found deep in the skin might die (fat necrosis)
• Fluid accumulation (seroma)
• Infection
• Numbness or other changes in skin sensation
• Persistent pain
• Poor wound healing
• Possibility of revisional surgery
• Recurrent looseness of skin
• Skin discoloration and/or prolonged swelling
• Skin loss
• Suboptimal aesthetic result
• Unfavorable scarring
ChatGPT’s response is similarly comprehensive to the list provided by the ASPS. All its listed complications are included along with a concise and easily understandable description. The ASPS article does include a couple of complications not listed by ChatGPT, such as skin loss. However, the ASPS article does not place as much emphasis on consulting a qualified plastic surgeon, following pre- and postoperative care instructions, and maintaining realistic expectations.
How long of a recovery period can I expect, and what kind of help will I need during my recovery?
Here, ChatGPT starts with a statement about how recovery time can vary from person to person based on a variety of factors. It then breaks up the postoperative time frame into 6 sections: immediate postoperative period (first few days), first week, first two weeks, two to four weeks, four to six weeks, and six weeks onward. The details of these sections are very similar to the corresponding section in the liposuction sample. It includes the time frame for swelling and bruising reduction, pain medication and compression garment usage, and resuming increasingly strenuous activities. It also discusses some assistance the patient may need, how full recovery can take several months, and how scarring can be managed. The next section on assistance the patient may need during recovery is nearly identical to the sample liposuction section.
In comparison to a blog on tummy tuck recovery by the ASPS, the ChatGPT response has a less detailed summary of the recovery process. It does include the main points addressed by the ASPS. The ASPS blog breaks the time into immediately after surgery, when you are at home, daily maintenance, resuming normal life, and long-term effects, which is different from the timeline used by ChatGPT. This blog also includes information on taking care of surgical dressings as well as the incision, drains, resting at an angle, and smoking and drinking alcohol during recovery, none of which ChatGPT covers. The blog does not have a specific section on assistance the patient may need, but it notes that patients may require assistance with childcare and household chores and that most people opt to take a month off work.
How will pregnancy affect my tummy tuck?
The opening paragraph explains that pregnancy can impact tummy tuck results and describes the importance of considering this before undergoing the procedure. It then lists the following considerations with 1-3 explanatory sentences: stretching of abdominal muscles, changes in skin elasticity, development of stretch marks, weight gain, potential for revision surgery, and C-section considerations. The concluding paragraphs discuss the importance of communicating with your surgeon, exploring strategies to minimize the effects (like healthy lifestyle and procedure timing), and considering delaying until the patient has completed their family before a tummy tuck procedure. It also explains that pregnancy affects everyone differently and the impact on results will vary.
A press release by the ASPS discusses the widening of abdominal muscles following pregnancy in tummy tuck patients. The site generally encourages patients to wait until after completing their family. The ASPS does not focus as much on the aesthetic effects of pregnancy after this procedure. Instead, it employs medical jargon and study results to explain the effects on the abdominal muscles. ChatGPT mentions this issue but does not highlight the possible complications as much, and it does not strongly encourage patients to consider family planning in their tummy tuck decisions.
Breast lift
What are the risks and complications associated with breast lifts?
The response commences with a definition of breast lift/mastopexy and a few statements about the importance of considering risk factors. It then lists the following potential risks and complications: scarring, changes in sensation, infection, bleeding, delayed healing, asymmetry, changes in breast shape or position, implant risks (if combined with augmentation), loss of breastfeeding ability, unsatisfactory results, and anesthesia risks. Each of these potential risks is accompanied by explanatory sentences. The response concludes with a statement about the importance of choosing an experienced surgeon, following pre- and postoperative care instructions, and attending follow-up appointments.
ChatGPT provides a suitable explanation of breast lifts. The ASPS lists the complications associated with breast lifts:
• Anesthesia risks
• Bleeding or hematoma formation
• Breast asymmetry
• Breast contour and shape irregularities
• Changes in nipple or breast sensation, which may be temporary or permanent
• Deep vein thrombosis, cardiac and pulmonary complications
• Fatty tissue found deep in the skin might die (fat necrosis)
• Fluid accumulation
• Infection
• Poor healing of incisions
• Possibility of revisional surgery
• Potential partial or total loss of nipple and areola
The list provided by the ASPS is similarly comprehensive compared to the ChatGPT list. Both include complications not listed by the other platform. For example, ChatGPT includes loss of breastfeeding ability, unlike the ASPS. The ASPS includes loss of nipple and areola, but ChatGPT does not. These are, however, all potential complications of breast lifts. ChatGPT additionally includes descriptions of these risks and emphasizes following pre- and postoperative instructions and maintaining realistic expectations, while the ASPS does not.
How long of a recovery period can I expect, and what kind of help will I need during my recovery?
This answer begins with a statement about how recovery time can vary from person to person based on a variety of factors. It then divides the postoperative time frame into 6 sections: immediate postoperative period (first few days), first week, first two weeks, two to four weeks, four to six weeks, and six weeks onward. The details of these sections are very similar to the corresponding section in the liposuction sample. These include the time frame for swelling and bruising reduction, resuming increasingly strenuous activities, and supportive surgical bra usage. It also discusses how full recovery can take several months and scarring management. The next section on assistance the patient may need during recovery is basically identical to the sample liposuction section.
Unlike with the other procedures, there is no ASPS blog on the timeline of recovery for breast lifts. There is an article on what to expect during recovery. In comparison to this, ChatGPT’s response is much more detailed and provides a timeline of what to expect over the course of the recovery. ChatGPT again leaves out information on changing dressings, caring for the wound, and drains, which this ASPS article does discuss. There is no ASPS blog on assistance the patient may need during recovery and no assistance is mentioned in this article.
What will my breast lift look like over time?
The response begins by explaining that breast lifts can be affected by factors including aging, lifestyle, genetics, and surgical techniques and that it is important to have a realistic expectation of the results. It then lists general considerations with 2-3 sentences of explanation: immediate results, changes in swelling, scarring, natural aging and gravitational effects, weight fluctuations, lifestyle factors, breast support, and follow-up care. It ends with two paragraphs elucidating the importance of communicating with your surgeon and following maintenance guidelines, as well as explaining how outcomes vary between individuals and that some people desire adjustments.
An ASPS article briefly discusses that the results may change over time due to aging and gravity. It does not discuss these factors in as much depth as ChatGPT does, nor does it touch on the majority of the other considerations discussed in ChatGPT’s response. Additionally, the ASPS discusses how weight maintenance and a healthy lifestyle can help the patient maintain their look longer, and the ChatGPT response does not discuss this. Finally, the ASPS article emphasizes the importance of waiting until childbearing is completed to undergo breast lifts, as pregnancy can minimize or reverse the results. ChatGPT does not discuss pregnancy’s impact on breast lift results.
Eyelid surgery
What are the risks and complications associated with eyelid surgery?
ChatGPT’s response opens with an explanation of blepharoplasty and a few statements about the importance of considering risk factors before getting this procedure. It then lists the following potential risks and complications: bleeding, infection, scarring, changes in vision, dry eyes, difficulty closing eyes completely, ectropion or entropion, unsatisfactory aesthetic result, anesthesia risks, hematoma, numbness or changes in sensation, and recovery issues. Each of these potential risks is accompanied by a sentence or two of explanation. The response ends with a statement about the importance of choosing an experienced surgeon, having a discussion with this surgeon about risks and complications, undergoing a thorough preoperative evaluation, following postoperative care instructions, and having realistic expectations.
ChatGPT provides a suitable explanation of blepharoplasties. The ASPS lists the complications associated with blepharoplasties:
• Anesthesia risks
• Bleeding from the incision lines
• Changes in skin sensation or numbness of the eyelashes
• Difficulty closing your eyes
• Dryness to the eyes
• Ectropion, an outward rolling of the lower eyelid
• Infection
• Lid lag, a pulling down of the lower eyelid, may occur and is often temporary
• Pain, which may persist
• Possible need for revision surgery
• Sensitivity to sun or other bright light
• Swelling and bruising
• Temporary or even permanent change in vision, and very rare chance of blindness
• Unfavorable scarring
Compared to the ASPS list, the ChatGPT list of complications is similarly comprehensive, though it includes explanations of each complication, and the ASPS article does not. The ASPS article includes a few complications not listed by ChatGPT, such as lid lag and light sensitivity. The ChatGPT response places more emphasis on discussing complications with a surgeon, following care instructions, and having realistic expectations.
How long of a recovery period can I expect, and what kind of help will I need during my recovery?
This response begins with a statement about how recovery time can vary from person to person based on a variety of factors. It then breaks up the postoperative time frame into 6 sections: immediate postoperative period (first few days), first week, first two weeks, two to four weeks, four to six weeks, and six weeks onward. The details of these sections are similar to the corresponding section in the liposuction sample. They include the time frame for swelling and bruising reduction, cold compress and pain medication usage, head elevation, makeup and contact lens usage, and resuming increasingly strenuous activities. It also discusses how full recovery can take several months, when sutures can be removed, and how scarring can be managed. The next section on assistance the patient may need during recovery is essentially identical to the sample liposuction section.
Similarly to the corresponding breast lift section, there is no blog on the timeline for blepharoplasty recovery. There is an article that explains what to expect during recovery. ChatGPT’s points on following postoperative instructions, including the use of cold compresses, are reflected here. The ASPS article mentions that most patients can go out in public 10-14 days after surgery, a specific time frame that is not included in the ChatGPT response. However, the ASPS article mentions no other recovery timeline (such as when to resume strenuous activity). Nothing is mentioned by the ASPS regarding the assistance the patient may need during this recovery.
How long will my eyelid surgery last?
This response begins with a short paragraph discussing how results vary and blepharoplasty does not stop the natural aging process. It then lists the following considerations with 2-3 sentences of explanation: aging process, genetics, sun protection, lifestyle choices, skincare, weight fluctuations, and follow-up care. It ends by discussing how results can vary, the potential need for secondary procedures, the importance of realistic expectations, and the necessity of choosing a qualified and experienced surgeon.
An article on eyelid surgery results by the ASPS is much less detailed compared to the ChatGPT response. It discusses how results can vary, will be long-lasting, can be affected by aging, and how sun protection can help maintain the results. Follow-up care is the only other point discussed by ChatGPT that is addressed in the ASPS article. None of the points are addressed in as much detail as in the ChatGPT response.
DISCUSSION
Overall, these results demonstrate that ChatGPT is capable of providing accurate, organized, and grammatically correct responses to common patient questions. Across the five procedures, ChatGPT provided a list of complications that was similarly comprehensive to the corresponding ASPS article, and it included additional details on the complications. Some complications, such as anesthesia risks, were listed in every section. Conversely, ChatGPT’s response on recovery periods was less detailed than the corresponding ASPS articles except in regard to the assistance needed during the recovery period. However, there were no articles on recovery periods for breast lift and eyelid surgery. The formats of the answers to the repeated questions were the same, indicating reliability in answer structure. Some of the sections were also identical or near identical across procedures, again indicating reliability, but also showing a lack of specificity.
For the procedure-specific questions, ChatGPT was more detailed than the corresponding ASPS article 2/5 times and less detailed 1/5 times. In the remaining two instances, ChatGPT and the ASPS articles provided dissimilar responses. ChatGPT’s response to the question on tummy tucks and pregnancy did not emphasize the importance of waiting until after childbearing to undergo a tummy tuck. This was the only instance where ChatGPT’s answer missed the main point of the corresponding ASPS article. The comprehensiveness of the overall data is represented in Figure 4. Though there does not seem to be any literature directly comparing ChatGPT’s responses with those found on reputable websites, our results do seem consistent with papers that found it to be a reliable and organized source of basic preoperative information that does not provide patient-specific advice[12-14]. Our results offered no evidence that ChatGPT may be able to outperform physicians in this area, as some articles claim, although this was not directly tested.
Figure 4. The comprehensiveness of ChatGPT’s responses, which were categorized as less comprehensive, more comprehensive, equally comprehensive, or different.
This study demonstrates that ChatGPT is a reliable source of preoperative information most of the time, although it may not be specific or detailed enough for patients to get all the information they need prior to a procedure. ChatGPT is able to generate responses to questions that may not have reputable answers on sites like the ASPS, making it a good source of information in those cases. This supports past literature indicating that ChatGPT can be a tool for patient education. However, in our opinion, it should not be used as a substitute for the initial consultation, a point that ChatGPT makes in nearly all its responses. The consistency of ChatGPT’s responses across procedures makes it easier to recommend as a tool for patient education, as the content of the responses is likely to be similar for corresponding questions. The questions in this study were all similar in style, which, based on what is known about ChatGPT from previous studies, could explain the similarities in responses[10]. Future investigations could ask the same question with different phrasing and levels of health literacy, on different days, and with different OpenAI accounts to see if the consistency remains.
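One way such a follow-up investigation might quantify consistency is sketched below. This is only an illustration under stated assumptions, not a method used in this study: it assumes each response has already been reduced by a human reader to a set of topic labels, and it uses the mean pairwise Jaccard similarity of those sets as a consistency score. The function names and the example labels are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two label sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(responses: list[set[str]]) -> float:
    """Average Jaccard similarity over every pair of responses."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        raise ValueError("need at least two responses to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical label sets for the same question asked on three occasions.
runs = [
    {"infection", "bleeding", "scarring"},
    {"infection", "bleeding", "anesthesia risks"},
    {"infection", "bleeding", "scarring"},
]
consistency = mean_pairwise_jaccard(runs)
print(f"mean pairwise Jaccard: {consistency:.2f}")
```

A score near 1 would indicate that the same topics recur across phrasings, days, or accounts, while a lower score would indicate drift. The choice of Jaccard similarity is an assumption; other overlap measures could serve equally well.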
While the current iteration of AI holds promise, it cannot be fully recommended due to the risk of disseminating inaccurate information. Although this issue was not observed in our study, it has been seen in previous literature and remains an important consideration when using ChatGPT for patient education[5]. Inaccurate information could cause unnecessary stress in patients or lead to an incomplete understanding of their procedures. It is important to exercise caution when using ChatGPT for any advice, especially medical advice, and clarification and validation of this information with a medical professional is always necessary. Additionally, the use of ChatGPT in patient consultation raises an ethical consideration regarding patient privacy. By asking patients to discuss their medical conditions and procedures with ChatGPT, we would be encouraging them to reveal their data to an unprotected and relatively unregulated service. Some studies have suggested providing ChatGPT with vignettes to facilitate more specific responses. However, asking patients to do this with their own medical information would further compromise their privacy. It is important that patients are made aware of these potential risks if physicians advise its use.
Limitations
Since this study was conducted on a single day under a single account, no conclusions can be drawn about ChatGPT’s responses across accounts or on different days. Additionally, this study was conducted using ChatGPT-3.5, which has been shown to perform worse on board examinations than ChatGPT-4; however, since ChatGPT-3.5 is the free version, it is more likely to be used by patients to ask their preoperative questions[16]. Since ChatGPT-4 has been shown to be more accurate but not necessarily more complete, its responses could be more comprehensive but are not guaranteed to be. This presents an important direction for future research. This study compared ChatGPT’s responses to articles on the ASPS website, which may not accurately represent how patients obtain their information. Additionally, our method of assessing accuracy and completeness was limited. Although website articles are the kind of information patients are likely to use, they are not peer-reviewed and may not contain entirely accurate or precise information, even when they come from reputable sources such as the ASPS. We also did not utilize any numeric system of comparison, which could be beneficial in the objective evaluation of these responses. Future studies could create a numeric system of comparison based on current literature or reputable sources such as UpToDate, or they could compare ChatGPT’s answers to the responses of plastic surgeons. Additional investigation should focus on different types of surgeries and procedures to see if these results are generalizable. Finally, our methodology also makes the assessment of comprehensibility and practicality difficult, so future research could assess these features by asking patients to rate them.
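As a rough illustration of what a numeric system of comparison might look like, the sketch below scores each source by the fraction of a reference checklist it covers and then assigns one of the comprehensiveness categories used in this study. It was not part of our methodology. The checklist, the function names, and the rule for the "different" category are all assumptions, and the topic labels are only loosely adapted from the eyelid surgery longevity comparison described above.

```python
from fractions import Fraction


def coverage(mentioned: set[str], checklist: set[str]) -> Fraction:
    """Fraction of checklist items that a source mentions."""
    if not checklist:
        raise ValueError("checklist must not be empty")
    return Fraction(len(mentioned & checklist), len(checklist))


def compare(chatgpt: set[str], asps: set[str], checklist: set[str]) -> str:
    """Categorize the ChatGPT response relative to the ASPS article."""
    c = chatgpt & checklist
    a = asps & checklist
    # Assumed rule: both sources cover checklist items but share none.
    if c and a and not (c & a):
        return "different"
    if len(c) > len(a):
        return "more comprehensive"
    if len(c) < len(a):
        return "less comprehensive"
    return "equally comprehensive"


# Hypothetical topic labels; in practice the checklist would come from
# current literature or another reputable reference source.
chatgpt_topics = {"results vary", "aging", "genetics", "sun protection",
                  "lifestyle", "skincare", "weight fluctuations",
                  "follow-up care"}
asps_topics = {"results vary", "long-lasting", "aging", "sun protection",
               "follow-up care"}
checklist = chatgpt_topics | asps_topics  # stand-in reference list

print(coverage(chatgpt_topics, checklist), coverage(asps_topics, checklist))
print(compare(chatgpt_topics, asps_topics, checklist))
```

Counting checklist items measures breadth only. A fuller scoring system would also need to rate the accuracy and depth of each item, which this sketch does not attempt.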
Recommendation
While ChatGPT cannot replace the initial consultation with a surgeon, it could be recommended to patients prior to these consultations to make them more efficient. Physicians could provide a pre-visit pamphlet as part of virtual check-in that includes instructions on the use of ChatGPT and potential questions patients could ask prior to their visit. This would give patients at least a basic understanding of their procedure and possible complications, so the majority of the visit could be spent on clarifying questions or patient-specific concerns. Additionally, patients would have time prior to the visit to digest some of the procedure-specific information, reducing the possibility of their becoming overwhelmed and allowing them to have a more informed discussion. Guidelines could be established on which questions are best for patients to address before the visit. Potentially, the physician could ask standardized, understanding-based questions at the visit to ensure the pre-visit education has been successful. This approach could be further improved if the patient were also referred to a reputable site like the ASPS, which would make their pre-consultation education even more comprehensive.
In conclusion, this study provides support for ChatGPT’s usage as a tool to improve the efficiency of consultations for aesthetic plastic surgery procedures and provides insight into the consistency of responses to the same question across procedures. It is, however, important to recognize ChatGPT’s limitations in answering questions in a patient/procedure-specific way. Therefore, it cannot replace an in-person consultation with an experienced surgeon. Further research is needed to fully assess the reliability of ChatGPT across days and accounts before it can be recommended as a patient learning tool.
DECLARATIONS
Authors’ contributions
Performed data acquisition and manuscript transcription and contributed to the design of the study: Brochu BM
Made substantial contributions to the conception and design of the study, as well as assisting in manuscript revision and technical support: Mirsky NA
Made substantial contributions to the conception and design of the study, as well as providing administrative support and assisting in manuscript revisions: Thaller SR
Availability of data and materials
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Financial support and sponsorship
None.
Conflicts of interest
All authors declared that there are no conflicts of interest.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Copyright
© The Author(s) 2024.
REFERENCES
1. Sallam M. ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare 2023;11:887.
3. Fraser H, Crossland D, Bacher I, Ranney M, Madsen T, Hilliard R. Comparison of diagnostic and triage accuracy of Ada health and WebMD symptom checkers, ChatGPT, and physicians for patients in an emergency department: clinical data analysis study. JMIR Mhealth Uhealth 2023;11:e49995.
4. Al Kuwaiti A, Nazer K, Al-Reedy A, et al. A review of the role of artificial intelligence in healthcare. J Pers Med 2023;13:951.
5. Dave T, Athaluri SA, Singh S. ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations. Front Artif Intell 2023;6:1169595.
6. Garg RK, Urs VL, Agarwal AA, Chaudhary SK, Paliwal V, Kar SK. Exploring the role of ChatGPT in patient care (diagnosis and treatment) and medical research: a systematic review. Health Promot Perspect 2023;13:183-91.
7. Ferreira AL, Chu B, Grant-Kels JM, Ogunleye T, Lipoff JB. Evaluation of ChatGPT dermatology responses to common patient queries. JMIR Dermatol 2023;6:e49280.
8. Liu J, Zheng J, Cai X, Wu D, Yin C. A descriptive study based on the comparison of ChatGPT and evidence-based neurosurgeons. iScience 2023;26:107590.
9. Durairaj KK, Baker O, Bertossi D, et al. Artificial intelligence versus expert plastic surgeon: comparative study shows ChatGPT “Wins” rhinoplasty consultations: should we be worried? Facial Plast Surg Aesthet Med 2024;26:270-5.
10. Lautrup AD, Hyrup T, Schneider-Kamp A, Dahl M, Lindholt JS, Schneider-Kamp P. Heart-to-heart with ChatGPT: the impact of patients consulting AI for cardiovascular health advice. Open Heart 2023;10:e002455.
11. Rizwan A, Sadiq T. The use of AI in diagnosing diseases and providing management plans: a consultation on cardiovascular disorders with ChatGPT. Cureus 2023;15:e43106.
12. Li W, Chen J, Chen F, Liang J, Yu H. Exploring the potential of ChatGPT-4 in responding to common questions about abdominoplasty: an AI-based case study of a plastic surgery consultation. Aesthetic Plast Surg 2024;48:1571-83.
13. Seth I, Cox A, Xie Y, et al. Evaluating Chatbot efficacy for answering frequently asked questions in plastic surgery: a ChatGPT case study focused on breast augmentation. Aesthet Surg J 2023;43:1126-35.
14. Xie Y, Seth I, Hunter-Smith DJ, Rozen WM, Ross R, Lee M. Aesthetic surgery advice and counseling from artificial intelligence: a rhinoplasty consultation with ChatGPT. Aesthetic Plast Surg 2023;47:1985-93.
How to Cite
Brochu, B. M.; Mirsky, N. A.; Thaller, S. R. Evaluating ChatGPT's efficacy in addressing common patient questions in plastic surgery consultations. Art. Int. Surg. 2024, 4, 411-26. http://dx.doi.org/10.20517/ais.2024.61