Biology of Sport
eISSN: 2083-1862
ISSN: 0860-021X
4/2025
vol. 42
 
abstract:
Original paper

A professional assessment of training plans for muscle hypertrophy and maximal strength developed by generative artificial intelligence

Tim Havers (1, 2), Caroline Jelonnek (1), Lukas Masur (3), Eduard Isenmann (1), Billy Sperlich (4), Stephan Geisler (1), Peter Düking (3)

  1. Department of Fitness and Health, IST University of Applied Sciences, Düsseldorf, Germany
  2. Department of Sport and Health Sciences, Technical University of Munich, Munich, Germany
  3. Department of Sports Science and Movement Pedagogy, Technische Universität Braunschweig, Braunschweig, Germany
  4. Integrative and Experimental Exercise Science and Training, Institute of Sport Science, University of Würzburg, Germany
Biol Sport. 2025; 42(4):353–380
Online publish date: 2025/08/26
The aim of this study was to evaluate the quality of resistance training plans for muscle hypertrophy and maximal strength generated by three large language models (LLMs): GPT-3.5 (via ChatGPT and Microsoft Copilot) and Google Gemini (GG). A total of 10 experienced coaches, each with at least a bachelor’s degree in exercise science and at least 2 years of coaching experience, rated these plans on a 1–5 Likert scale based on 27 criteria essential for effective training plan design. The LLMs were accessed on April 30, 2024, with a prompt structure that included key training objectives and the training history of a fictional advanced trainee. Results showed that the overall quality of the LLM-generated training plans was moderate. GG outperformed GPT-3.5 (via ChatGPT and Microsoft Copilot) for hypertrophy-related plans on 2 out of 27 criteria (advanced exercise methods, recovery strategies; p < 0.05), while GPT-3.5 (via Microsoft Copilot) outperformed GG for strength-related plans on 1 out of 27 criteria (testing procedure; p < 0.05). Across all criteria, GG received ratings > 3 more frequently than GPT-3.5 (via ChatGPT and Microsoft Copilot), particularly for general aspects, training principles, and training methods. Differences between hypertrophy- and strength-oriented plans within each LLM were minimal, although GPT-3.5 (via ChatGPT) showed the most inconsistency in ratings. Although LLM-generated plans can serve as an initial framework for hypertrophy and strength development, expert supervision remains crucial to refine these plans, as LLMs cannot account for individual responses to training, safety considerations, and the complex physiological adaptation processes observed by experienced coaches.
keywords:

Resistance training, Artificial intelligence, Muscle hypertrophy, Maximal strength, Training plan design, Coaching evaluation, Chat bots, Large language models, Generative AI

© 2025 Termedia Sp. z o.o.