Introduction
Skin type is determined using various criteria, e.g. the moisture content of the stratum corneum, reaction to sun exposure, pigmentation, depigmentation, sensitivity, skin colour, the presence of signs of skin aging (wrinkles), loss of elasticity and uneven texture. Many methods of skin type classification exist, each evaluating different skin parameters: visual and tactile methods; non-invasive bioengineering tools (such as the corneometer, tewameter, sebumeter and cutometer); imaging techniques (such as dermatoscopy, ultrasound and confocal microscopy); artificial intelligence-based analysis; self-reported instruments, e.g. the Baumann Skin Type System, the Fitzpatrick Skin Phototype Classification and the Roberts Skin Type Classification System; and visual rating scales, e.g. the Glogau Scale, the Griffiths Photonumeric Scale and the Score of Intrinsic and Extrinsic Aging [1, 2].
Skin type diagnosis using a self-reported instrument such as the Baumann Skin Type Questionnaire does not require expensive, calibrated equipment and can be used in remote consultations. The questionnaire assesses skin type along four axes: oily/dry, sensitive/resistant, pigmented/non-pigmented and wrinkled/tight. This part discusses two dichotomous parameters characterizing the skin: pigmented or non-pigmented (P vs. N) and wrinkled or tight (W vs. T). Together with the dichotomous parameters oily or dry (O vs. D) and sensitive or resistant (S vs. R), they form an individual’s four-letter BSTI (Baumann Skin Type Indicator) code, which describes one of 16 possible Baumann Skin Types [3, 4].
Skin type assessment is important because it guides specialists and consumers in selecting and recommending the most appropriate skin care products and protocols. It is also important in clinical research [2].
Aim
The primary objective of this study was to validate the effectiveness of the Polish language version of the Baumann Skin Type Questionnaire (BSTQ). This research aimed to ascertain the performance of the questionnaire in determining skin types according to the hyperpigmentation/no hyperpigmentation and wrinkled/tight dimensions among a predominantly young adult Polish population.
Material and methods
The Polish language version of the BSTQ has been translated and validated. The author of the questionnaire (Leslie Baumann) consented to the use of the questionnaire for scientific research. The consent of the Research Ethics Committee of the University of Economics and Human Sciences in Warsaw was obtained (number 02/07/2024). BSTQ in English was published with the key in 2006 in the book “The Skin Type Solution”. The book was recognized as a national bestseller in the United States [3]. The questionnaire has attracted considerable interest and its slightly different versions can be found on the Internet.
BSTQ is used to assess skin type. The questionnaire assesses four main skin parameters: 1. oily or dry (O/D), 2. sensitive or resistant (S/R), 3. pigmented or non-pigmented (P/N), and 4. wrinkled or tight (W/T). This study discusses part three and part four of the questionnaire.
The third part of the questionnaire assesses the tendency of the skin to produce melanin. It consists of 11 questions. The questions are single-choice and are scored as follows: answer a, 1 point; answer b, 2 points; answer c, 3 points; answer d, 4 points; answer e, 2.5 points. Question 11 requires an affirmative or negative answer. For an affirmative answer, i.e. having dark spots on the skin in places exposed to the sun, 5 points are added to the result. A score of 31–45 points indicates pigmented skin (P Skin Type), while a score of 10–30 points indicates non-pigmented skin (N Skin Type). The fourth part of the questionnaire assesses the tendency to form wrinkles, as well as how wrinkled the skin is now. It consists of 21 questions and follows the same scoring principle. Question 21 requires an affirmative or negative answer. For an affirmative answer, i.e. age 65 or older, 5 points are added to the result. A score of 41–85 indicates wrinkled skin (W Skin Type), while a score of 20–40 indicates tight skin (T Skin Type). Each part defines one skin parameter (O/D; S/R; P/N; W/T). Combined, the four parameters yield a skin phenotype called the Baumann Skin Type. There are 16 Baumann Skin Types based on the 4 parameters [3, 4].
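The scoring rules described above can be sketched in a few lines of Python. This is only an illustration, not the official scoring key: the function and constant names are hypothetical, and the handling of half-point totals that fall between the two published bands (e.g. 30.5) is an assumption, since the text does not say how such totals are classified.

```python
# Points per answer option as described in the text (a=1, b=2, c=3, d=4, e=2.5).
ANSWER_POINTS = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 2.5}
YES_BONUS = 5.0  # added when the final yes/no question is answered "yes"


def score_part(answers, final_yes):
    """Sum the points of the single-choice answers and add the yes/no bonus."""
    total = sum(ANSWER_POINTS[answer] for answer in answers)
    return total + (YES_BONUS if final_yes else 0.0)


def classify_pigmentation(score):
    """Part III: 10-30 -> 'N', 31-45 -> 'P'.

    Assumption: totals strictly above 30 (including 30.5) are treated as 'P'.
    """
    return "P" if score > 30 else "N"


def classify_wrinkles(score):
    """Part IV: 20-40 -> 'T', 41-85 -> 'W'.

    Assumption: totals strictly above 40 (including 40.5) are treated as 'W'.
    """
    return "W" if score > 40 else "T"


# Part III: ten single-choice answers, all "b", and "no" to question 11.
part3 = score_part(["b"] * 10, final_yes=False)
print(part3, classify_pigmentation(part3))  # 20.0 N

# Part IV: twenty single-choice answers, all "c", and "no" to question 21.
part4 = score_part(["c"] * 20, final_yes=False)
print(part4, classify_wrinkles(part4))  # 60.0 W
```

The extreme cases agree with the published ranges: ten "a" answers give the minimum of 10 points in part III, and ten "d" answers plus an affirmative question 11 give the maximum of 45.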
Translation and validation process
The preparation of the Polish language version began with two versions of translations from English into Polish. Two independent translators translated the English version of the BSTQ into Polish. Then, the translated versions of the questionnaire were compared by a cosmetologist and a uniform version of the questionnaire was created with a minor modification. The minor modification consisted of supplementing the questionnaire with a short instruction for completion and clarifying some questions and answers for better understanding. In the next stage, another independent translator, who was not familiar with the original version of the BSTQ, performed a back translation from Polish into English. In the last stage, the original version of the BSTQ questionnaire was compared with the back translation of the Polish version (consultation with a native speaker). It was found that both versions of the questionnaire are consistent with each other in terms of content. It was assumed that the version translated from the original into Polish was created correctly.
After the translation process, the validation was performed. The questionnaire was tested on a group of 103 individuals. The participants of the study were students of cosmetology. Study participants gave written, voluntary, informed consent to participate in the study. The respondents completed the questionnaires in paper form. The questionnaire was completed by 101 women and 2 men aged 18–35 years. To determine test-retest reliability, the respondents were asked to complete the questionnaire twice, on consecutive days. Given the relatively large number of questions in the questionnaire, the learning effect (memorizing previous answers) is likely to be negligible.
The study respondents participated in practical classes of facial care cosmetology, which allowed for conducting interviews, skin examinations – visual and palpation skin assessment by a cosmetology expert. During 10 meetings, at weekly intervals, various facial skin care treatments were performed, which allowed for assessing the skin’s reaction to the cosmetic preparations used.
Statistical analysis
Statistical significance was assessed at an alpha level of α = 0.05. Numerical data were summarized using the mean (M) and standard deviation (SD), while categorical data were reported as counts (n) and percentages (%). To evaluate the consistency of numerical results across test and retest measures, the intraclass correlation coefficient using a fixed effects model, ICC3, as described by Shrout and Fleiss in 1979 [5], was employed. For categorical data, agreement was assessed using Gwet’s AC1 agreement metric, developed by Gwet in 2008 [6].
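As an illustration of the test-retest statistic, ICC3 in the single-measurement form ICC(3,1) of Shrout and Fleiss can be computed from a two-way ANOVA decomposition as (MSR − MSE) / (MSR + (k − 1)·MSE), where MSR is the between-subject mean square, MSE the residual mean square and k the number of measurements per subject. The sketch below is a plain-Python illustration with a hypothetical function name, not the code used in the study, and it assumes that the single-measurement form is the one reported.

```python
def icc3(scores):
    """ICC(3,1) for a table with one row per subject and one column per
    measurement occasion (here: test and retest)."""
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of measurements per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse)


# Hypothetical scores: the retest equals the test plus a constant shift, so the
# ranking of subjects is fully preserved and the coefficient is (numerically) 1.
print(icc3([[10, 11], [20, 21], [30, 31]]))
```

Because ICC3 measures consistency rather than absolute agreement, a uniform shift between test and retest does not lower the coefficient.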
Internal consistency reliability was assessed using several statistical methods. Cronbach’s α, introduced by Cronbach in 1951, was computed both from covariances (αraw) and from correlations (αstd) [7]. Additionally, Guttman’s Lambda 6 reliability index, G6, as outlined by Kline in 1986, and the average interitem correlation (rM) were calculated [8]. The study also incorporated the index of the quality of the test, S/N, following the methodologies proposed by Cronbach and Gleser in 1964 and further refined by Revelle and Condon in 2019 [9, 10]. The standard error of Cronbach’s alpha, αSE, was calculated using the approach established by Duhachek and Iacobucci in 2004 [11]. Overall performance was estimated with the accuracy, sensitivity and specificity metrics, using equations (1)–(3). The global accuracy rate was calculated together with its 95% confidence interval (95% CI), obtained with the binomial test.
where TPi – true positives for class i; TNi – true negatives for class i; FPi – false positives for class i; FNi – false negatives for class i.
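Equations (1)–(3) are not reproduced in this text. With the symbols defined above, the standard definitions of accuracy, sensitivity and specificity for class i read as follows. This is a reconstruction from the usual confusion-matrix definitions, and the authors' exact notation is an assumption.

```latex
\begin{align}
\mathrm{Accuracy}_i    &= \frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i} \tag{1}\\
\mathrm{Sensitivity}_i &= \frac{TP_i}{TP_i + FN_i} \tag{2}\\
\mathrm{Specificity}_i &= \frac{TN_i}{TN_i + FP_i} \tag{3}
\end{align}
```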
Characteristics of the applied statistical tool
Analyses were conducted using the R statistical language (version 4.3.1; R Core Team, 2023) on Windows 10 Pro 64 bit (build 19045) [12], using the packages irrCAC (version 1.0; Gwet, 2019) [13], caret (version 6.0.94; Kuhn, 2008) [14], report (version 0.5.7; Makowski et al., 2023) [15], gtsummary (version 1.7.2; Sjoberg et al., 2021) [16], readxl (version 1.4.3; Wickham, Bryan, 2023) [17], dplyr (version 1.1.3; Wickham et al., 2023) [18], tidyr (version 1.3.0; Wickham et al., 2023) [19] and psych (version 2.3.9; Revelle, 2023) [20].
Characteristics of the sample
Table 1 offers a detailed summary of the demographic data, medical history, dermatological conditions, and cosmetic experiences of the study cohort, comprising 103 participants.
Table 1
Characteristics of the studied sample, N = 103
The demographic profile of the study sample reveals a predominantly young adult population, with all participants aged between 18 and 35 years. The gender distribution is notably skewed, with females representing 98.06% of the cohort, suggesting that the findings may be more reflective of female skin health and cosmetic experiences. The majority of participants are of Polish nationality, accounting for 78.64%, followed by Ukrainians at 17.48%, and a small fraction categorized as other nationalities, which constitutes 3.88%.
In terms of medical history, the presence of systemic diseases is minimal: only two participants reported such conditions (diabetes and depressive-anxiety states), each affecting 0.97% of the sample. This indicates a generally healthy cohort in terms of chronic systemic conditions. Ten participants (9.71%) have a family history of melanoma, which could be significant in studies focusing on genetic predispositions to skin conditions.
The dermatological profile shows that 24 participants (23.30%) suffer from acne, making it the most common skin condition within the group, followed by eczema (n = 10; 9.71%) and contact dermatitis (6.80%). This suggests a moderate prevalence of common dermatological issues among the participants.
Nearly half of the study participants (44.66%) have reported experiencing skin tightening or tension after makeup removal, indicating that a significant portion of the cohort may have skin sensitivity issues related to cosmetic use. This finding is particularly relevant for cosmetic and dermatological research focusing on product sensitivity and skin barrier function.
Results
Analysis of numerical outcomes and consistency evaluation for test-retest measurements of the Baumann Skin Type Questionnaire
Table 2 provides the results of numerical outcomes for each dimension assessed by the Baumann Skin Type Questionnaire, along with a detailed analysis of the consistency in results across test-retest measurements.
Table 2
Distribution of overall numerical results in each dimension with estimates of agreement for test-retest measurements, df1 = 102, df2 = 22, F = 22
This evaluation captures the distribution of mean scores (M) and their standard deviations (SD), offering a precise depiction of participant responses across two distinct skin type dimensions: presence or absence of hyperpigmentation, and wrinkled or tight skin. The table further assesses the reliability of these measurements through ICC3 values, complemented by 95% confidence intervals (95% CI) and p-values (p), thus underscoring the robustness of the questionnaire in producing consistent and reproducible results. The accompanying statistical parameters, including degrees of freedom (df1 and df2) and F-values, provide additional validation of the statistical processes employed in this consistency check.
For the dimension evaluating skin with or without hyperpigmentation (part III of the BSTQ), the ICC3 value stands at 0.91, reinforcing the consistency of the questionnaire (I – Oily or dry skin: ICC3 = 0.91; II – Sensitive or resistant skin: ICC3 = 0.96 – data from the article: Baumann Skin Type Questionnaire (BSTQ): creation and validation of the Polish language version – Part One). The mean scores observed were M = 19.25 (SD = 4.56) in the test and M = 18.83 (SD = 4.42) in the retest, suggesting a stable measurement across sessions despite the slight variation in scoring.
The highest reliability is noted in the wrinkled or tight skin dimension (part IV of the BSTQ), with an ICC3 of 0.97. The mean scores are nearly identical between the test (M = 45.06, SD = 4.15) and the retest (M = 44.93, SD = 4.04), underscoring the exceptional repeatability of this particular assessment.
In summary, the results in Table 2 demonstrate high consistency in the scores between the initial test and the retest, as evidenced by ICC3 values exceeding 0.90 in both dimensions. These are supported by narrow 95% confidence intervals, indicating a high level of precision in the estimates of agreement.
For detailed descriptive statistics and the agreement rates pertaining to individual items within a specific dimension of the Baumann Skin Type Questionnaire, please refer to Supplementary Table S1.
Analysis of the internal reliability for test-retest measurements of the Baumann Skin Type Questionnaire
The skin with or without hyperpigmentation dimension shows lower internal consistency, with αraw scores of 0.55 at the initial test and 0.58 at retest. These values are notably lower than those of the other dimensions (I – oily or dry skin: αraw = 0.75, retest αraw = 0.77; II – sensitive or resistant skin: αraw = 0.72, retest αraw = 0.78 – data from the article: Baumann Skin Type Questionnaire (BSTQ): creation and validation of the Polish language version – Part One), and are accompanied by a modest G6 score of 0.63.
The weakest internal consistency is observed in Wrinkled or Tight Skin dimension, with very low αraw scores of 0.33 initially, slightly rising to 0.36 at retest. The G6 scores also reflect low reliability, ranging from 0.47 to 0.54.
Overall, while the Baumann Skin Type Questionnaire is reliable for certain skin assessments (oily or dry skin, sensitive or resistant skin), its comprehensive efficacy across all dimensions warrants further examination and possible adjustments to ensure uniform high reliability across all skin types.
For dimensions with lower reliability (e.g., wrinkled or tight skin), clinicians should be cautious in basing treatment decisions solely on these results. They might use supplementary diagnostic tools or clinical judgments to confirm these assessments before proceeding with anti-aging treatments or interventions.
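To make the internal-consistency figures above easier to interpret, the following minimal Python sketch computes the covariance-based coefficient (αraw) with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The function name and the toy responses are hypothetical. The sketch illustrates only the formula, not the procedure used in the study, and it does not cover αstd, G6 or the standard error.

```python
from statistics import variance


def cronbach_alpha_raw(responses):
    """Covariance-based Cronbach's alpha.

    `responses` holds one row per respondent and one column per item.
    """
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of three respondents to two items.
print(round(cronbach_alpha_raw([[1, 1], [2, 3], [3, 2]]), 3))  # 0.667
```

Alpha falls when items covary weakly, which is one plausible reading of the low values in the wrinkled or tight dimension: its questions cover both the tendency to wrinkle and the current state of the skin.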
Comprehensive evaluation of skin type results and agreement for test-retest measurements of the Baumann Skin Type Questionnaire
The results presented in Table 3 delineate a systematic evaluation of the consistency and accuracy in classifying skin types using the Baumann Skin Type Questionnaire across two measurement points. This table elucidates both the distribution of skin type classifications and the agreement rates between initial testing and retesting sessions. Each segment of the questionnaire – focusing on characteristics such as hyperpigmentation and the presence of wrinkles – undergoes scrutiny to determine both the stability of the classification over time and the precision of each category’s assessment.
Table 3
The results of skin types and consistency assessment for test-retest measurements
The agreement between measures, quantified through the AC1 statistic, is complemented by standard error and confidence interval metrics, offering a robust statistical framework to understand the reliability of the questionnaire.
The data presented in Table 3 reveal significant consistency across all dimensions of skin type assessment, with AC1 values indicating strong agreement between initial testing and retesting sessions, which underscores the reliability of this questionnaire in clinical and research settings.
Notably, the evaluation of skin with or without hyperpigmentation exhibits perfect agreement, with an AC1 of 1.00 and a confidence interval that collapses to the single value 1.00. This indicates absolute consistency in the classification of this trait between test and retest, suggesting that such conditions are distinctly observable and less susceptible to subjective interpretation or temporal change.
For the dimension of wrinkled or tight skin, the AC1 value is high at 0.93, with a confidence interval from 0.87 to 0.99. This suggests an excellent level of agreement and points to the reliable assessment of aging-related skin changes, which are likely to be more stable and consistently identifiable characteristics over short periods.
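For readers unfamiliar with the AC1 statistic, a minimal two-rating sketch follows, with test and retest treated as the two "raters". It uses Gwet's definition AC1 = (pa − pe) / (1 − pe), where pa is the observed proportion of agreement and pe = Σ πk(1 − πk) / (q − 1), with πk the average share of ratings in category k and q the number of categories. The function name and the example labels are hypothetical, and the sketch does not reproduce the standard errors or confidence intervals reported in Table 3.

```python
def gwet_ac1(first, second, categories):
    """Gwet's AC1 for two ratings of the same subjects."""
    if len(first) != len(second) or not first:
        raise ValueError("both rating lists must be non-empty and equally long")
    n = len(first)
    q = len(categories)
    # Observed agreement: share of subjects classified identically twice.
    pa = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement based on the average prevalence of each category.
    pe_sum = 0.0
    for category in categories:
        pi = (first.count(category) + second.count(category)) / (2 * n)
        pe_sum += pi * (1 - pi)
    pe = pe_sum / (q - 1)
    return (pa - pe) / (1 - pe)


# Hypothetical classifications of four respondents at test and retest.
test = ["N", "N", "N", "P"]
retest = ["N", "N", "P", "P"]
print(round(gwet_ac1(test, retest, ["N", "P"]), 3))  # 0.529

# Identical classifications on both occasions give perfect agreement.
print(gwet_ac1(test, test, ["N", "P"]))  # 1.0
```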
Evaluation of skin type results via visual inspection
The analysis of the skin type characteristics based on visual inspection conducted by a specialist, as detailed in Table 4, offers significant insights into the distribution and prevalence of various skin conditions within a sample size of 103 individuals. This assessment mirrors the structure of the Baumann Skin Type Questionnaire and encompasses two distinct dimensions. The observations recorded are treated as actual data for subsequent analyses, ensuring a rigorous and structured approach to understanding skin type distributions within the observed cohort.
Table 4
Characteristics of skin types based on visual inspection across two dimensions, N = 103
Characteristic | n (%) |
---|---|
Skin with or without hyperpigmentation | |
Hyperpigmentation | 17 (16.50) |
No hyperpigmentation | 86 (83.50) |
Wrinkled or tight skin | |
Tight | 103 (100.00) |
The evaluation of hyperpigmentation shows a significant skew towards individuals without this condition, where 83.50% of the cohort do not exhibit hyperpigmentation. Remarkably, the entire sample has tight skin, a 100% prevalence that could be reflective of the age demographic or other factors that are not explicitly detailed in the dataset. This uniformity may suggest a lack of variability in age.
Evaluation of the Baumann Skin Type Questionnaire: performance metrics across different skin type classifications
Table 5 presents the performance metrics for skin type classification using the Baumann Skin Type Questionnaire, with the specialist’s visual inspection (the observed data) serving as the reference.
Table 5
Performance metrics for skin type classification (estimated vs. observed), N = 103
This assessment encompasses dimensions of skin types, including skin with or without hyperpigmentation and wrinkled versus tight skin. The metrics provided in the table, such as accuracy, confidence intervals (95% CI), sensitivity, and specificity, are pivotal for understanding the effectiveness of the questionnaire in accurately classifying different skin conditions. Each dimension is tested and retested to evaluate the consistency and reliability of the questionnaire over repeated measures.
This rigorous analysis is crucial for dermatological research and practice as it helps in verifying the utility of the questionnaire as a diagnostic tool in clinical settings.
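The 95% confidence interval attached to an accuracy rate can be illustrated with the exact (Clopper–Pearson) binomial interval, which is the interval reported by R's binomial test. The sketch below finds the bounds by bisection on the binomial tail probabilities. It is a plain-Python illustration with hypothetical function names, not the code used in the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for x successes in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    if x == 0:
        lower = 0.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if 1 - binom_cdf(x - 1, n, mid) < alpha / 2:
                lo = mid  # upper tail still too small, so p must increase
            else:
                hi = mid
        lower = (lo + hi) / 2
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    if x == n:
        upper = 1.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(x, n, mid) > alpha / 2:
                lo = mid  # lower tail still too large, so p must increase
            else:
                hi = mid
        upper = (lo + hi) / 2
    return lower, upper


# Hypothetical example: 0 correct classifications out of 10.
low, high = clopper_pearson(0, 10)
print(round(low, 3), round(high, 3))  # 0.0 0.309
```

For zero successes the upper limit has the closed form 1 − (α/2)^(1/n), which gives about 0.309 for n = 10 and serves as a quick check of the bisection.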
The questionnaire’s efficacy in identifying skin with or without hyperpigmentation and wrinkled versus tight skin demonstrates significant disparities. While the accuracy remains consistent at 85% for hyperpigmentation in both test and retest, the sensitivity is markedly low at 12%, though specificity is perfect at 100%. This suggests that while the questionnaire is highly effective at identifying individuals with no hyperpigmentation, it struggles to accurately detect those with hyperpigmentation.
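The combination of high accuracy and very low sensitivity follows from the class imbalance in Table 4 (17 participants with hyperpigmentation, 86 without). The worked example below uses one confusion matrix that is consistent with the reported percentages: 2 of the 17 hyperpigmented participants detected and no false positives. These counts are inferred from the rounded figures and are not reported by the authors. The function name is hypothetical.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Inferred counts (an assumption): positive class = hyperpigmentation,
# 17 observed positives and 86 observed negatives as in Table 4.
acc, sens, spec = binary_metrics(tp=2, tn=86, fp=0, fn=15)
print(round(acc * 100), round(sens * 100), round(spec * 100))  # 85 12 100
```

Because 86 of the 103 participants have no hyperpigmentation, a questionnaire that labels almost everyone as non-pigmented still reaches about 85% accuracy, which is why sensitivity is the more informative figure here.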
The most considerable limitation is observed in the classification of wrinkled versus tight skin, where the accuracy is exceptionally low (17% and 18% for test and retest, respectively). Sensitivity is equally low (17% and 18%), indicating severe underperformance in classifying tight skin, whereas specificity is reported as 100%.
In conclusion, while the Baumann Skin Type Questionnaire exhibits strong performance in classifying oily versus dry and sensitive versus resistant skin types (data from the article: Baumann Skin Type Questionnaire (BSTQ): creation and validation of the Polish language version – Part One), it shows significant weaknesses in accurately classifying skin with or without hyperpigmentation and particularly in distinguishing between wrinkled and tight skin.
The Polish version of the BSTQ is shown in Appendix C.
Discussion
The Baumann Skin Type Questionnaire is an easy-to-administer questionnaire that allows for the assessment of skin type for the purpose of recommending skin care products and procedures. In addition, it can be useful in screening and recruiting individuals for research trials [21]. Questions in the questionnaire regarding behavioural habits (e.g. smoking, sunbathing, consumption of vegetables and fruits) contribute to the patients’ consideration that these habits may affect skin health. This helps to open up communication between the patient and the professional. BSTQ is not only a diagnostic instrument but also an educational strategy to initiate discussions about skin care [4, 22, 23].
In my opinion, it is important that before a cosmetic or aesthetic procedure the patient completes both a questionnaire assessing skin type and the Cosmetic Procedure Screening Questionnaire, which screens for suspected body dysmorphic disorder (BDD); its Polish language version was validated in 2021 [24].
With the development of the cosmetics industry and dermatology, and with the rising demand for personalized skin care, there is a growing need for an intuitive skin type classification system [25, 26]. The BSTQ was used in the study by Ahn et al. [25], which identified the most characteristic skin features of Korean women. According to the authors, the results obtained allow for more individualized skin care for various skin conditions [25]. In another article, the skin type of the Korean male population was characterised [26]. However, Cho et al. [27] claim that validation for the Asian population is insufficient. These researchers investigated the relationship between the results of the modified BSTQ and measurements from a digital photography skin analyzer [27]. Choi et al. [28] proposed a modified version of the BSTQ, whereas Cho et al. [27] established an optimized BSTQ based on dermatological assessment of the Asian population.
This study describes in detail the development and validation of the Polish language version of the BSTQ (part three: pigmented vs. non-pigmented skin and part four: wrinkled vs. tight skin). The Polish version of the BSTQ (pigmented vs. non-pigmented and wrinkled vs. tight) showed very good test-retest reliability (reproducibility) and an excellent level of agreement between the initial test and the retest (ICC3 = 0.91, AC1 = 1.00, 95% CI: 1.00 and ICC3 = 0.97, AC1 = 0.93, 95% CI: 0.87–0.99, respectively). However, the BSTQ demonstrated low internal consistency, corroborated by low Guttman’s Lambda 6 reliability (pigmented vs. non-pigmented: αraw = 0.55, αstd = 0.54, G6 = 0.63; retest: αraw = 0.58, αstd = 0.54, G6 = 0.63; wrinkled vs. tight: αraw = 0.33, αstd = 0.25, G6 = 0.47; retest: αraw = 0.36, αstd = 0.33, G6 = 0.54). In addition, comparison of the questionnaire results with the examination of the respondents’ skin revealed discrepancies. The questionnaire identifies skin without hyperpigmentation well but struggles to detect individuals with hyperpigmentation (Accuracy: 85%, Sensitivity: 12%, Specificity: 100% in test and retest). Presumably, some respondents with post-inflammatory hyperpigmentation reported having no hyperpigmentation because they associated the term only with sun-related hyperpigmentation. This may indicate the need to improve and clarify the questionnaire in this dimension. The most considerable limitation is observed in the classification of wrinkled versus tight skin (Accuracy: 17%, Sensitivity: 17%, Specificity: 100% in test and Accuracy: 18%, Sensitivity: 18%, Specificity: 100% in retest). In the wrinkled vs. tight section, the original questionnaire measures the tendency to wrinkle as well as whether the skin is currently wrinkled.
Due to the fact that the study group consisted of young people, visual and palpation assessment of the facial skin showed that the subjects had tight skin. Measuring the tendency to wrinkle is difficult. The results show that it seems reasonable to introduce changes in the structure of the questionnaire in this dimension.
The main limitation of the study was that the BSTQ was validated in a young cohort comprising mainly women.
Overall, while the Baumann Skin Type Questionnaire is reliable for assessment of oily versus dry skin and sensitive versus resistant skin (data from the article: Baumann Skin Type Questionnaire (BSTQ): creation and validation of the Polish language version – Part One), its comprehensive efficacy across all four dimensions (with pigmented or non-pigmented skin and wrinkled or tight skin) warrants further examination and possible adjustment to ensure uniform high reliability across all skin types.
In the dimensions oily or dry skin and sensitive or resistant skin, the questionnaire can be used as a stand-alone diagnostic tool for assessing skin type. In the dimensions skin with or without hyperpigmentation and wrinkled or tight skin, it can instead serve as a useful supplement to the clinical interview. In addition, it can indicate the need for skin care and thereby motivate the patient to start a conversation with the clinician.
Conclusions
These findings suggest that while the questionnaire can be a useful tool for certain dermatological assessments (oily versus dry and sensitive versus resistant skin types – data from the article: Baumann Skin Type Questionnaire (BSTQ): creation and validation of the Polish language version – Part One), its application should be carefully considered and potentially supplemented with other diagnostic approaches, particularly when assessing skin for hyperpigmentation and age-related changes.