The paper introduced here addresses a clinical challenge in in vitro fertilization and embryo transfer (IVF-ET): the choice of ovarian stimulation protocol, which depends heavily on clinician experience. To address it, the authors developed an AI-driven clinical decision support system (CDSS). The study uses an adaptive combination algorithm to select key features, establishes a pregnancy grading system based on four core indicators on the hCG day, and implements individualized protocol recommendations using iterative random forests. This research extends the application of AI from embryo screening to upstream decision-making in IVF-ET, builds a three-dimensional optimization framework for efficacy, economy, and time, and develops a clinical tool that can be integrated with electronic medical records. However, the single-center design limits its external validity, only a few algorithms are integrated, and not all individualized factors are included.
Wen L, Wu D, Ruan J, Wang R, Long R, Chen R, Hu C, Tian C, Zhang Y, Pan W, Jin L, Liao S. Artificial intelligence-driven precision treatment of reproductive medicine-related diseases: the optimal protocol choice for IVF-ET. J Adv Res. 2025 Oct 23:S2090-1232(25)00835-5. doi: 10.1016/j.jare.2025.10.040. Epub ahead of print. PMID: 41139020.

I. Research Background
Although in vitro fertilization and embryo transfer (IVF-ET) has led to the birth of over 12 million infants worldwide, current treatments still face low success rates and substantial variability in outcomes among patients. Ovarian stimulation (OS), the earliest and most adjustable step in the IVF process, directly determines the number of retrieved oocytes, embryo quality, and endometrial status, and therefore has a crucial impact on the final pregnancy outcome. Two main medication protocols are in common clinical use: the long protocol and the antagonist protocol. The latter is recommended by European reproductive medicine authorities for patients with polycystic ovary syndrome or with normal ovarian response. However, studies have indicated that the ongoing pregnancy rate of the antagonist protocol may be slightly lower, and there is still no clear clinical consensus on which protocol suits which patient group. Existing artificial intelligence (AI) research in this field has mostly focused on later stages such as embryo selection, and no unified, personalized decision aid exists for the critical choice of OS protocol. Clinicians rely primarily on experience, which creates a need for a data-driven personalized recommendation system to improve success rates and advance precision medicine.

II. Experimental Methods
This study adopted a combined retrospective and prospective cohort design, enrolling a total of 17,791 patients undergoing IVF treatment. The research team collected comprehensive clinical data for each patient, from initial consultation to treatment completion: personal characteristics such as age, height, weight, and ovarian reserve markers (e.g., AMH, antral follicle count); the OS protocol actually used (seven protocols in total); hormone levels on the hCG administration day; and final pregnancy outcomes.
The model construction consisted of three progressive steps. First, the system automatically combined three machine learning algorithms (Random Forest, XGBoost, and SOIL) to evaluate the importance of 27 clinical indicators, identifying five key factors: progesterone (P) on hCG day, number of oocytes retrieved (NOR), estradiol (E2), endometrial thickness (EMT), and patient age. Second, based on the four modifiable hCG-day indicators among them (P, NOR, E2, and EMT, with age left out), a four-level pregnancy grading system was established that orders pregnancy probability from low to high (Grade I: 7% to Grade IV: 55%). Third, the core prediction model used each patient’s pre-treatment baseline data to simulate her hormonal response and oocyte retrieval outcome under each stimulation protocol. Through iterative optimization and comparison, the protocol that maximized the likelihood of the four indicators falling within the preset optimal range, and that achieved the highest pregnancy grade, was selected as the final recommendation. To ensure reliability, all models underwent 100 rounds of 10-fold cross-validation, and their predictive performance was tested on data from new patients.
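The third step can be pictured as a loop over candidate protocols. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy predictors, the baseline values, and every numeric range are hypothetical placeholders on an arbitrary 0-1 scale. In the paper the per-protocol predictors are iterative random forests trained on baseline data, and the selection also considers the resulting pregnancy grade, which this sketch leaves out.

```python
from typing import Callable, Dict, Tuple

Indicators = Dict[str, float]  # predicted P, NOR, E2, EMT
Predictor = Callable[[Dict[str, float]], Indicators]
Ranges = Dict[str, Tuple[float, float]]

# Placeholder "optimal" ranges on an arbitrary 0-1 scale (NOT clinical thresholds).
OPTIMAL: Ranges = {
    "P": (0.3, 0.7),
    "NOR": (0.3, 0.7),
    "E2": (0.3, 0.7),
    "EMT": (0.3, 0.7),
}


def count_in_range(predicted: Indicators, optimal: Ranges) -> int:
    """Count how many predicted indicators fall inside their optimal range."""
    return sum(1 for name, (lo, hi) in optimal.items() if lo <= predicted[name] <= hi)


def recommend_protocol(
    baseline: Dict[str, float],
    predictors: Dict[str, Predictor],
    optimal: Ranges,
) -> Tuple[str, Dict[str, int]]:
    """Score every candidate protocol and return the best one plus all scores."""
    scores = {
        protocol: count_in_range(predict(baseline), optimal)
        for protocol, predict in predictors.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first protocol listed
    return best, scores


# Hypothetical stand-ins for trained per-protocol models; they ignore the baseline.
def toy_protocol_a(baseline: Dict[str, float]) -> Indicators:
    return {"P": 0.5, "NOR": 0.5, "E2": 0.9, "EMT": 0.5}  # E2 out of range


def toy_protocol_b(baseline: Dict[str, float]) -> Indicators:
    return {"P": 0.5, "NOR": 0.5, "E2": 0.5, "EMT": 0.5}  # all in range


TOY_PREDICTORS: Dict[str, Predictor] = {
    "protocol_A": toy_protocol_a,
    "protocol_B": toy_protocol_b,
}

best, scores = recommend_protocol({"age": 30.0, "AMH": 2.5}, TOY_PREDICTORS, OPTIMAL)
print(best, scores)
```

Keeping the scoring function separate from the predictors means the selection rule can be changed (for example, to rank by predicted pregnancy grade) without touching the models.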

III. Experimental Results
Foundation of Key Feature Selection and System Construction:
By analyzing 27 clinical features from 17,791 patients with the ACA-FI adaptive combination algorithm, the study identified five decisive factors influencing IVF-ET pregnancy outcomes. All five, namely progesterone (P) on hCG day, number of oocytes retrieved (NOR), estradiol (E2), endometrial thickness (EMT), and patient age, had predictive importance scores greater than 0.5, markedly higher than those of the other variables. Notably, the combined importance score of the indicators measured on the hCG administration day reached 0.658, indicating that hormonal and ultrasound metrics at this time point form the core window for assessing cycle quality. In contrast, factors such as infertility type, etiological classification, and number of treatment cycles had minimal impact on outcomes, which supports simplifying clinical decision-making.
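To illustrate the general idea of pooling feature importance across several algorithms, here is a minimal sketch. It is an assumption-laden simplification, not ACA-FI itself: this summary does not describe how ACA-FI weights the algorithms, so the sketch rescales each model's scores by that model's maximum and takes a plain unweighted mean. The feature names and numbers are toy values.

```python
from typing import Dict, List


def scale_by_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one model's importances so its top feature scores 1.0."""
    top = max(scores.values())
    return {name: value / top for name, value in scores.items()}


def combine_importance(per_model: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the rescaled importances across models (equal weights assumed)."""
    scaled = [scale_by_max(scores) for scores in per_model]
    features = scaled[0].keys()
    return {f: sum(s[f] for s in scaled) / len(scaled) for f in features}


def top_k(combined: Dict[str, float], k: int) -> List[str]:
    """Return the k features with the highest combined importance."""
    return sorted(combined, key=combined.get, reverse=True)[:k]


# Toy importances from two hypothetical models over three hypothetical features.
model_1 = {"feat_a": 2.0, "feat_b": 1.0, "feat_c": 1.0}
model_2 = {"feat_a": 1.0, "feat_b": 3.0, "feat_c": 0.0}

combined = combine_importance([model_1, model_2])
print(top_k(combined, 2))
```

Rescaling before averaging keeps a model with large raw importance values from dominating the pooled ranking; an adaptive scheme would replace the equal weights with weights learned from each model's performance.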
Construction and Validation of the Pregnancy Grading System:
Based on the aforementioned four modifiable indicators on hCG day (P, NOR, E2, EMT), the study developed an innovative four-level pregnancy grading system. This system categorizes patients’ pregnancy probability as follows: Grade I (total score 4-10, pregnancy rate 7%), Grade II (11-12, pregnancy rate 24%), Grade III (13-14, pregnancy rate 44%), and Grade IV (15-16, pregnancy rate 55%), with clinically significant differences between grades. To validate its utility, the research team analyzed a subgroup of 1,438 patients with live birth outcome records. Results showed a significant positive correlation between pregnancy grade and live birth rate (p<0.001), confirming that the system not only predicts clinical pregnancy but also effectively reflects final live birth potential, providing clinicians with a reliable standardized assessment tool.
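The score bands above translate directly into a lookup. The sketch below encodes only the total-score cutoffs and pregnancy rates reported in this summary; the function and constant names are hypothetical, and the rule that turns each of the four indicators into a sub-score is not given here, so the function starts from the total. The 4 to 16 range would be consistent with four indicators scored 1 to 4 each, but that is an inference, not something the summary states.

```python
# Pregnancy rates per grade as reported in the summary above.
REPORTED_PREGNANCY_RATE = {"I": 0.07, "II": 0.24, "III": 0.44, "IV": 0.55}


def pregnancy_grade(total_score: int) -> str:
    """Map a total hCG-day score (4 to 16) to Grade I-IV using the reported bands."""
    if not 4 <= total_score <= 16:
        raise ValueError("total score must be between 4 and 16")
    if total_score <= 10:
        return "I"
    if total_score <= 12:
        return "II"
    if total_score <= 14:
        return "III"
    return "IV"


for score in (4, 11, 13, 16):
    grade = pregnancy_grade(score)
    print(score, grade, REPORTED_PREGNANCY_RATE[grade])
```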
Rigorous Validation of Model Predictive Performance:
In terms of accuracy, the Iterative Random Forest (IRF) algorithm performed best. Compared with traditional methods such as Gradient Boosting Machine (GBM), IRF achieved higher predictive precision for all four core indicators. After 100 rounds of 10-fold cross-validation, its prediction error remained stable, confirming the model’s robustness against data fluctuations. In the prospective validation cohort (4,251 cases), the clinical pregnancy rate rose significantly from 0.452 with clinicians’ original protocols to 0.512 with the CDSS-recommended protocols (p<0.001), with 80% of patients reaching the optimal pregnancy grade, Grade IV. More importantly, even after adjusting for confounding factors such as age, BMI, and AMH, OS protocol selection remained an independent predictor of pregnancy outcomes, with an adjusted odds ratio (OR) of 0.775 for the antagonist protocol (p<0.001) and 1.563 for the ultra-long protocol (p<0.001). These results support the robustness and clinical value of the CDSS recommendations.
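The effect sizes above are easier to compare after a little arithmetic. The sketch below uses only the reported pregnancy rates and odds ratios; the helper names are hypothetical. Two caveats apply: the summary does not state the reference category for the adjusted ORs, and applying an adjusted OR to a single baseline probability is purely illustrative, since the adjusted estimate comes from a multivariable model.

```python
def absolute_gain(p_before: float, p_after: float) -> float:
    """Difference in pregnancy rate, in probability units."""
    return p_after - p_before


def relative_gain(p_before: float, p_after: float) -> float:
    """Improvement expressed as a fraction of the starting rate."""
    return (p_after - p_before) / p_before


def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)


def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    new_odds = odds(p_baseline) * odds_ratio
    return new_odds / (1.0 + new_odds)


p_original, p_cdss = 0.452, 0.512  # rates reported for the validation cohort

print("absolute gain:", round(absolute_gain(p_original, p_cdss), 3))
print("relative gain:", round(relative_gain(p_original, p_cdss), 3))
print("unadjusted odds ratio:", round(odds(p_cdss) / odds(p_original), 3))

# Illustration only: what the reported adjusted ORs would do to a 0.452 baseline.
for label, reported_or in (("antagonist", 0.775), ("ultra-long", 1.563)):
    print(label, round(apply_odds_ratio(p_original, reported_or), 3))
```

An odds ratio is not a ratio of probabilities, which is why the conversion goes through odds and back rather than multiplying the rate directly.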

IV. Research Innovations and Limitations
The innovations of this study are reflected in three key aspects. First, it moves AI beyond its traditional confinement to embryo selection, extending intelligent decision-making upstream to OS protocol selection and establishing a complete pipeline from data analysis to protocol recommendation. Combining multiple AI algorithms with iterative optimization supports the accuracy and reliability of the recommendations. Second, the study moves beyond a sole focus on pregnancy success rates, integrating treatment safety and time efficiency to establish a more comprehensive evaluation standard. Third, and most importantly, the system can interface directly with hospital electronic medical records, automatically extracting patient information and generating recommendations while leaving final decision-making authority with clinicians.
The study also has several limitations: First, the single-center design, despite the large sample size, may be influenced by the specific clinical practices and patient characteristics of the participating center, requiring multi-center validation to confirm external validity. Second, only three AI algorithms were integrated, and the limited variety may restrict the model’s performance. Third, the system does not yet incorporate all individualized factors, such as patients’ previous treatment responses and psychological stress. Additionally, it does not address optimization of other critical IVF links (e.g., embryo transfer timing), leaving room for improvement in the comprehensiveness of recommendations.

V. Research Value and Prospects
This study provides clinicians in reproductive medicine with a standardized decision-making tool for OS protocols. It can help less experienced clinicians build expertise quickly and enable high-volume centers to deliver personalized recommendations, with the aim of improving pregnancy rates and patient satisfaction. The established pregnancy grading system offers a unified benchmark for efficacy evaluation, while the modular design of the ACA-FI algorithm framework shows potential for translation to other diseases and departments, providing a methodological reference for the development of other clinical decision support systems. In the future, the system is expected to be iterated through multi-center collaboration, incorporating more algorithms and clinical variables to move toward full-process optimization. In the long term, it may help standardize updates to reproductive medicine guidelines and provide evidence-based support for healthcare policy, advancing the consistency and quality of assisted reproductive technology.

