Biology of Sport

Abstract

4/2025 vol. 42
Original paper

A professional assessment of training plans for muscle hypertrophy and maximal strength developed by generative artificial intelligence

  1. Department of Fitness and Health, IST University of Applied Sciences, Düsseldorf, Germany
  2. Department of Sport and Health Sciences, Technical University of Munich, Munich, Germany
  3. Department of Sports Science and Movement Pedagogy, Technische Universität Braunschweig, Braunschweig, Germany
  4. Integrative and Experimental Exercise Science and Training, Institute of Sport Science, University of Würzburg, Germany
Biol Sport. 2025; 42(4):353–380
Online publish date: 2025/08/26
The aim of this study was to evaluate the quality of resistance training plans for muscle hypertrophy and maximal strength generated by three large language models (LLMs): GPT-3.5 (via ChatGPT and Microsoft Copilot) and Google Gemini (GG). A total of 10 experienced coaches, each with at least a bachelor’s degree in exercise science and at least 2 years of coaching experience, rated these plans on a 1–5 Likert scale based on 27 criteria essential for effective training plan design. The LLMs were accessed on April 30, 2024, with a prompt structure that included key training objectives and the training history of a fictional advanced trainee. Results showed that the overall quality of the LLM-generated training plans was moderate. GG outperformed GPT-3.5 (via ChatGPT and Microsoft Copilot) for hypertrophy-related plans on 2 out of 27 criteria (advanced exercise methods, recovery strategies; p < 0.05), while GPT-3.5 (via Microsoft Copilot) outperformed GG for strength-related plans on 1 out of 27 criteria (testing procedure; p < 0.05). Across all criteria, GG received ratings > 3 more frequently than GPT-3.5 (via ChatGPT and Microsoft Copilot), particularly for general aspects, training principles, and training methods. Differences between hypertrophy- and strength-oriented plans within each LLM were minimal, although GPT-3.5 (via ChatGPT) showed the most inconsistency in ratings. Although LLM-generated plans can serve as an initial framework for hypertrophy and strength development, expert supervision remains crucial to refine these plans, as LLMs cannot account for individual responses to training, safety considerations, and the complex physiological adaptation processes observed by experienced coaches.