A Bibliometric Review of Genomic Prediction Algorithms in Plants: Trends, Collaboration Networks, and Emerging Research Topics

Authors

  • Rizki Darmawan, Dian Nuswantoro University; Universitas Indonesia Maju, Indonesia
  • Adi Wijaya, Universitas Indonesia Maju, Indonesia
  • Prajoko, Universitas Muhammadiyah Sukabumi, Indonesia

Keywords:

Bibliometrics, Genomic prediction, Plants

Abstract

Genomic prediction (GP) has emerged as a transformative methodology in the biological, health, and agricultural sciences, using genome-wide molecular markers to predict complex traits with increasing accuracy. This bibliometric study analyses the development, collaboration patterns, and thematic trends of GP research in plants between 2015 and 2025. A total of 205 open-access research articles were retrieved from the Scopus database following the PRISMA approach and analysed with the R-based Biblioshiny software. The results show a marked increase in publication output since 2018, peaking in 2024, with an annual growth rate of 14.9%. International collaboration is substantial: 41.46% of publications involve authors from different countries. The most prolific authors, including Crossa J, Montesinos-López A, and Montesinos-López OA, frequently publish together and show strong collaborative ties. Leading journals such as Frontiers in Plant Science and The Plant Genome indicate that the field remains strongly rooted in plant genomics and breeding applications. Keyword co-occurrence analysis identified three major thematic clusters: plant breeding and machine learning, AI-based genomics, and statistical prediction systems. Overall, the findings suggest that GP research has evolved into a mature and highly collaborative interdisciplinary field, with a clear shift from conventional statistical approaches toward methodologies driven by machine learning and deep learning. This study provides a systematic map of the intellectual landscape and highlights promising directions for future research in data-driven plant breeding.
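The two headline indicators in the abstract can be illustrated with a short sketch. It assumes the annual growth rate is a compound rate between the first and last year of the period, which is a common bibliometric convention but is not confirmed by the abstract. The function names and the yearly counts are hypothetical and are not taken from the article. The figure of 85 internationally co-authored articles is only back-calculated from the reported 41.46% of 205.

```python
def annual_growth_rate(first_count: int, last_count: int, n_intervals: int) -> float:
    """Compound annual growth rate in percent.

    Assumption: growth is compounded over n_intervals yearly steps between
    the first and last publication counts (hypothetical helper).
    """
    if first_count <= 0 or n_intervals <= 0:
        raise ValueError("first_count and n_intervals must be positive")
    return ((last_count / first_count) ** (1 / n_intervals) - 1) * 100


def international_share(n_international: int, n_total: int) -> float:
    """Percentage of articles with authors from more than one country."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100 * n_international / n_total


# Hypothetical counts, NOT the article's data: a fourfold rise in yearly
# output over ten yearly intervals.
growth = annual_growth_rate(first_count=10, last_count=40, n_intervals=10)

# 85 is back-calculated from the abstract (41.46% of 205 articles).
share = international_share(n_international=85, n_total=205)

print(f"growth: {growth:.2f}%  international share: {share:.2f}%")
```

Under this assumption, a growth rate near 14.9% corresponds to output roughly quadrupling over ten yearly intervals, and 41.46% of 205 articles corresponds to about 85 internationally co-authored papers.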

Published

2025-12-02

How to Cite

Darmawan, R., Wijaya, A., & Prajoko, P. (2025). A Bibliometric Review of Genomic Prediction Algorithms in Plants: Trends, Collaboration Networks, and Emerging Research Topics. Urecol Journal. Part D: Applied Sciences, 5(2), 52–66. Retrieved from https://e-journal.urecol.org/index.php/ujas/article/view/288

Section

Articles