eISSN: 2353-561X
ISSN: 2353-4192
Current Issues in Personality Psychology

vol. 4
Original paper

A folk-psychological ranking of personality facets

Eka Roivainen

Verve Rehabilitation, Oulu, Finland
Current Issues in Personality Psychology, 4(4), 187–195
Online publish date: 2016/10/10


Which scientist is more influential: Sigmund Freud or Leopold Szondi? One simple way to study this matter is to consult Google, which gives 9,950,000 search hits for Freud and 10,000 for Szondi. The number of mentions in journals, books, and online texts is one indication of the importance, influence, and significance of persons and things. In some cases, the same method can be used to compare psychological phenomena. For example, the phrase “he suffers from schizophrenia” yields 11,000 Google hits while “he suffers from pyromania” is mentioned only 100 times, which suggests that the prevalence of schizophrenia is higher than that of pyromania.
Many prevailing trait theories of personality are based on the lexical hypothesis. According to this hypothesis, the individual differences most salient and socially relevant in people’s lives become encoded into their language (Allport & Odbert, 1936; Klages, 1932). Thousands of terms describing personality can be found in dictionaries. To classify these words, panels of judges are first used to select the most popular terms for further analysis (Saucier & Goldberg, 1996). When individuals rate their own and their peers’ personality using the few hundred selected terms, clusters of correlating terms are identified. Models with one (Rushton & Irwing, 2011) to 16 (Cattell, 1946) factors have been proposed to describe the structure of personality (Ashton, Lee, & Boies, 2015). The Big Five model, proposed by Goldberg (1990), may be considered the prevailing contemporary lexically based personality model, while the HEXACO model, developed by Ashton and Lee (2007), is an important six-factor alternative.
While it is widely agreed that a lower-level structure of narrower traits – personality facets – needs to be added to factor models to obtain a more detailed description of personality, there is no consensus on the nature of facets or on the correct method for their selection (Ziegler & Bäckström, 2016). For example, in McCrae and Costa’s (2003) Five-Factor Model, the broad factors are divided into six facets each, based on the following principle: “Our solution was to review the psychological literature and choose traits that seemed to cover the most important ways in which people differed” (McCrae & Costa, 2003). In Ashton and Lee’s (2007) HEXACO model, six factors are divided into four facet scales each, based on correlational analysis. However, hierarchical models where the broad domains are divided into facets poorly accommodate interstitial traits that have substantial shared variance with two or more of the Big Five or Six traits. Loehlin and Goldberg (2014) suggested that traits might actually better conform to lists than to hierarchies. An alternative to the hierarchical models is the Big Five Circumplex Model (Hofstee, De Raad, & Goldberg, 1992), where facets are seen as blends of two broader traits. For example, friendliness is a blend of agreeableness and extraversion. Nonetheless, the major dimensions of personality do not interact mechanically, and some combinations of factors are represented by many terms in the lexicon and other combinations by very few. Specifically, Factors I (Extraversion), II (Agreeableness), and IV (Emotional Stability) form many blends, while terms that have their highest loading on Factor V (Openness/Intellect) have much lower secondary loadings.
To circumvent the theoretical contradictions and practical problems involved in the division/combination of the Big Five factors into facets, Wood, Nye, and Saucier (2010) developed the Inventory of Individual Differences in the Lexicon (IIDL). The IIDL has 61 scales that are based on 61 clusters of three or more synonymous or closely synonymous personality terms. Wood et al. identified these terms based on a pool of 504 familiar personality adjectives from Saucier (1997). Wood et al. summarized the rationale of this approach and the shortcomings of the prevailing models as follows: “When… distinct elements are aggregated to form a single ‘broad’ measure, or when only the ‘core’ of the superfactor is measured, it is often unclear which distinguishable aspect of the measure is most related to the variable of interest.” Arguably, it is wiser to buy a used car from a person who scores high on test items specifically developed to test honesty than from one who scores high on the broad domain scales of Agreeableness and Conscientiousness (Factor III) on a Big Five test. Paunonen and Ashton (2001) reported that a few narrow facet scales in personality tests may sometimes be better predictors of behavior than the Big Five primary factors together.
However, the IIDL may omit important aspects of personality represented by only one or two words in the lexicon. The 61 IIDL clusters include only 302 of Saucier’s pool of 504 adjectives. According to the lexical hypothesis, the importance of a trait is reflected both in the number of synonyms and in the popularity of those terms (Saucier & Goldberg, 1996). Recent studies (Roivainen, 2013, 2015) have shown that, among the 300-400 personality adjectives used in Saucier and Goldberg (1996) and Ashton and Lee (2007), usage frequency of the most popular terms (e.g., honest, friendly, or kind) may be a hundred- or a thousand-fold larger compared to the least popular adjectives (e.g., negativistic or nonconforming). Arguably, a facet represented by a single high usage word is likely as significant as another facet represented by two or three moderate usage words.
The large number of potential facets is one shortcoming of a facet-centered approach where broad traits are considered of secondary importance. For example, Wood’s IIDL has 61 scales, and if facets represented by fewer than three terms in the lexicon were included in a personality test, the number of scales could become impractically large. Obviously, some criteria are needed to select the most important facets. Ideally, the selected facets should be maximally predictive of behavior and represent all diverse aspects of personality.
In the present study, a “folk-psychological” ranking of personality facets was performed by studying the frequency of use of (English language) personality adjectives in causal clauses on the Internet. Causal clauses, such as “John avoided the meeting because he is a shy person”, express cause-effect relationships. In theory, the prevalence of a trait term in causal clauses reflects the explanatory significance of the respective trait as perceived by lay persons. In Study 1, the semantic correlation of the most frequently used adjectives was analyzed using a dictionary. Synonyms and antonyms were eliminated from the adjective ranking list to generate a facet ranking list. In Study 2, clusters of synonymous or closely synonymous terms were studied. Clusters were ranked by calculating the usage frequency of terms in each cluster. The lists of facets based on these analyses were compared to the facet models of the prevailing factor theories of personality.



The frequencies of use of 295 popular personality adjectives that Saucier and Goldberg (1996) and Ashton and Lee (2007) used in previous factor-analytic studies were analyzed on the Internet in phrases of the type “because he is a kind person”, using a simple Google search. The number of “most relevant” search results was recorded. Google defines “most relevant” search results in the following way:
When a user enters a query, our machines search the index for matching pages and return results we believe are the most relevant to the user. Relevancy is determined by over 200 factors, one of which is the PageRank for a given page. PageRank is the measure of the importance of a page based on the incoming links from other pages. (Google, 2015)
The usage frequency of the trait words in causal clauses was compared to their overall usage frequency in the Google Books database as attributes of the word “person” when preceded by “very” (e.g., “very kind person”). This type of trigram was used to eliminate expressions such as “rational person” that appear often in legal texts and do not actually involve personality description. The Google Books database is based on millions of books. It is the largest existing text corpus (Michel et al., 2011). Usage frequency of the trigrams was searched for the year 2000 with three years of smoothing.
Using the Merriam-Webster (2016) dictionary, the 40 most frequently used adjectives were further analyzed and terms with a higher-ranking synonym or antonym were identified.
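The elimination step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `facet_ranking`, the structure of the `related` lookup, and all counts other than the 69 hits reported for kind are hypothetical placeholders, not data or procedures taken from the study.

```python
def facet_ranking(frequencies, related):
    """Rank adjectives by usage frequency and drop every term that has a
    higher-ranking synonym or antonym.

    frequencies: dict mapping adjective -> hit count
    related: dict mapping adjective -> set of dictionary synonyms/antonyms
             (treated as symmetric, so each pair needs to be listed once)
    """
    # Highest frequency first; ties are broken alphabetically.
    ranked = sorted(frequencies, key=lambda w: (-frequencies[w], w))
    higher_ranked = set()
    facets = []
    for word in ranked:
        # Collect partners of this word in either direction of the lookup.
        partners = set(related.get(word, set()))
        partners |= {w for w, rel in related.items() if word in rel}
        # Keep the word only if none of its partners ranks above it.
        if not (partners & higher_ranked):
            facets.append(word)
        higher_ranked.add(word)
    return facets


# Placeholder counts (only the value for "kind" appears in the text).
counts = {"kind": 69, "honest": 50, "nice": 40, "shy": 30, "dishonest": 10}
# Hypothetical dictionary relations: one synonym pair, one antonym pair.
pairs = {"kind": {"nice"}, "honest": {"dishonest"}}

print(facet_ranking(counts, pairs))  # ['kind', 'honest', 'shy']
```

In this sketch a term is dropped whenever any higher-ranking term is its synonym or antonym, which is one reading of the procedure; a stricter variant would compare only against terms that were themselves retained.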
The frequency of the use of personality terms in causal clauses was compared to the primary factor loadings of these terms reported in the study by Saucier and Goldberg (1996). In addition, social desirability and category breadth estimates of the top 40 adjectives were retrieved from Hampson, Goldberg, and John (1987).


In all, 1,214 causal clauses of the type described above were found on the Internet. The expression “because he is a kind person” appeared 69 times on the Internet, while 89 of the 295 adjectives had zero frequency in causal clauses. The most popular trigram in Google Books texts was “very religious person”, with a usage frequency of 12 × 10⁻⁷%, while 127 trigrams with the less popular adjectives had usage frequencies below 0.1 × 10⁻⁷%. The correlation between usage frequency in causal clauses on the Internet and in trigrams in books was .66.
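A correlation of this kind between two frequency lists can be computed as in the following sketch. The text does not state which coefficient was used, so the choice of Pearson's r is an assumption, and the function name `pearson` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold the number of causal-clause hits per adjective and `ys` the corresponding trigram frequencies in books, in the same adjective order. Because usage frequencies are strongly skewed, a rank-based coefficient could give a different value.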
The 40 most popular adjectives, shown in Table 1, accounted for 765 of the 1,214 Google hits. The cumulative frequency of use of the top 40 adjectives in book texts was 111 × 10⁻⁷%, as compared to 121 × 10⁻⁷% for the other 255 adjectives. There were nine pairs of dictionary synonyms and eight pairs of antonyms among the 40 most popular terms.
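Expressed as proportions, these figures can be checked with a short calculation. The variable names are illustrative; the numbers are those reported above, and the common factor of 10⁻⁷% cancels in the book-text ratio.

```python
# Causal clauses on the Internet: top 40 adjectives vs. all 295 adjectives.
top40_web_hits = 765
all_web_hits = 1214
web_share = top40_web_hits / all_web_hits

# Google Books trigrams: top 40 adjectives vs. the other 255 adjectives
# (both in units of 10^-7 %, so the unit cancels in the ratio).
top40_books = 111
other255_books = 121
books_share = top40_books / (top40_books + other255_books)

print(f"Share of causal-clause hits: {web_share:.0%}")   # 63%
print(f"Share of book-text usage:    {books_share:.0%}")  # 48%
```

The top 40 adjectives thus account for roughly 63% of the causal-clause hits and roughly 48% of the trigram usage in book texts.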
The correlation between word popularity in causal clauses and the highest primary loading in the five-factor model reported by Saucier and Goldberg (1996) was .19. The correlation between primary factor loading and popularity in book texts was .06. Twenty of the top 40 adjectives had the highest loading on the Agreeableness factor, six on Openness, six on Extraversion, four on Conscientiousness, and four on the Emotional Stability factor.
The 40 most popular adjectives had a mean social desirability z-value of 0.63 and a mean category breadth value of 0.73, meaning that the concepts are on average broader and more socially desirable than 73-76% of the 573 terms (which include the 295 terms analyzed in the present study) used in the study by Hampson et al. (1987).
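The step from z-values to percentages can be illustrated with the standard normal cumulative distribution function. This sketch assumes that both values are z-scores and that the ratings are approximately normally distributed across the 573 terms; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math


def normal_cdf(z):
    """Proportion of a standard normal distribution lying below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


desirability_share = normal_cdf(0.63)  # mean social desirability z-value
breadth_share = normal_cdf(0.73)       # mean category breadth value

print(f"{desirability_share:.3f}")  # 0.736
print(f"{breadth_share:.3f}")       # 0.767
```

Under these assumptions the two values correspond to about 74% and 77% of terms, close to the 73-76% range given above; the small difference may reflect rounding or the empirical distribution of the ratings.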



Previous studies show that some traits are represented by many synonymous or closely synonymous terms that have moderate usage frequencies, while other facets may be represented by only one or two terms that have a very high usage frequency (Roivainen, 2013). For example, intelligence seems to be represented by a fairly small number of highly popular terms (Roivainen, 2014). Therefore, the ranking of facets presented in Table 1 may be biased against facets that are represented by many closely synonymous terms. In Study 2, the sums of the usage frequencies of terms grouped in synonym clusters were analyzed.
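The cluster-level ranking can be sketched as follows. The cluster labels echo facets named in this article, but the cluster memberships and all counts other than the 69 hits for kind are hypothetical placeholders rather than Goldberg's clusters or the study's data; `rank_clusters` is an illustrative name.

```python
def rank_clusters(clusters, frequencies):
    """Sum the usage frequencies of the terms in each synonym cluster and
    return (cluster, total) pairs sorted from most to least frequent."""
    totals = {
        name: sum(frequencies.get(term, 0) for term in terms)
        for name, terms in clusters.items()
    }
    # Highest total first; ties are broken alphabetically by cluster name.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical cluster memberships and placeholder counts.
example_clusters = {
    "empathy": ["kind", "sympathetic", "warm"],
    "intelligence": ["intelligent", "smart"],
    "shyness": ["shy", "timid"],
}
example_counts = {
    "kind": 69, "sympathetic": 5, "warm": 8,
    "intelligent": 40, "smart": 30,
    "shy": 35, "timid": 4,
}

print(rank_clusters(example_clusters, example_counts))
# [('empathy', 82), ('intelligence', 70), ('shyness', 39)]
```

Summing over a cluster lets a facet carried by several moderately frequent synonyms compete with a facet carried by a single very frequent word, which is the bias that Study 2 is intended to address.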


The 339 terms employed in the study come from the study by Goldberg (1990). Using Norman’s (1967) pool of 2,797 stable-trait terms, Goldberg first eliminated nouns and adjectives that referred to nonhuman or evil behaviors or were considered difficult or ambiguous. The resulting pool of 1,710 terms was further reduced to a set of 479 terms by eliminating terms on the basis of their ambiguity, difficulty, or over-evaluation, as judged by panels of college students. Among the 479 terms, Goldberg identified 100 synonym clusters involving 339 terms. The mean item intercorrelation in the clusters was .40 in Goldberg’s study.
Goldberg’s clusters were chosen instead of Wood’s IIDL clusters because the terms used are largely the same as those analyzed in Study 1, and because the IIDL clusters include a number of non-personality related terms such as those involving looks (beautiful, handsome), wealth (rich, poor) or age (young, old).


Table 2 shows the sum of the usage frequencies of the adjectives in the top-ranking 30 clusters. In causal clauses, the total frequency of the use of the personality terms in the top 30 clusters was 855 as compared to 240 for the 70 lower-ranking clusters. In Google Books texts, the cumulative frequency of the top 30 cluster adjectives was 126 × 10⁻⁷% as compared to 54 × 10⁻⁷% for the 70 lower-ranking clusters. The correlation between usage frequency in causal phrases and in book texts was .76.
As expected, the top-ranking clusters shown in Table 2 include the top-ranking adjectives shown in Table 1, with a few exceptions, such as religious, which was not included in Goldberg’s 1990 study. Based on the adjective list, “the Big Five facets” are kindness, honesty, intelligence, shyness, and humility, while the cluster ranking indicates morality, empathy, intelligence, modesty, and shyness as the five most significant facets.


The results of the study support previous studies that have analyzed usage frequencies of personality descriptors. A few personality terms are used very frequently, whereas many personality terms found in dictionaries are rarely used (Leising, Scharloth, Lohse, & Wood, 2014; Roivainen, 2013; Wood, 2015). Social desirability and category breadth are correlated positively with usage frequency (Wood, 2015). High usage terms on the Internet are also frequently used in other corpora, such as the Google Books corpus, Corpus of Contemporary American English (Davies, 2008; Wood, 2015), Twitter (Roivainen, 2015), and in open-ended self-descriptions (Ames & Bianchi, 2008). Usage frequency has only a weak positive correlation with factor loading (Roivainen, 2013; Wood, 2015), which means that many popular personality terms represent traits that are in the interstitial space between the major domains of the Big Five model.
The present study shows that it is possible to select a smaller sample of words from among hundreds of personality terms that quantitatively account for the majority of personality descriptions in books and Internet texts. However, one should ask whether these lists of 30-40 trait terms cover the semantic field as well as facets of prevailing factor models. A comparison between the 30 facets of the Five Factor Model (Costa & McCrae, 1992) in Table 3, the 30 synonym clusters in Table 2, and the 40 adjectives shown in Table 1 indicates that the Five Factor Model is more detailed in the description of the Conscientiousness factor and less nuanced in the description of the Agreeableness factor than the two lists of facets. Fifteen of the 30 synonym clusters are composed of terms related to Agreeableness, and half of the adjectives in Table 1 have their highest loading on the Agreeableness factor. Only one of the 30 clusters, Dependability, includes terms related to Conscientiousness. Of the 23 nonsynonymous adjectives shown in Table 1, only the terms reliable and lazy had their highest loading on the Conscientiousness factor.
The Emotional stability factor is also better covered by the NEO-PI facets. Only five of the 30 synonym clusters refer to emotional stability, and insecure is the only term in Table 1 that specifically refers to anxiety or depression. It may be that these psychiatric concepts are perceived to represent states more than traits; therefore, they act as personality attributes less often. The openness aspect of the Openness/Intellect factor is also poorly represented by the adjectives in Tables 1 and 2. This result is in keeping with Roivainen (2014), who found that terms such as open-minded and close-minded have very low usage frequencies compared to terms that represent the intellect aspect of this factor, such as intelligent and smart.
The adjectives in Tables 1 and 2 probably cover the interstitial spaces between the five factors better than the NEO-PI facets do. Seven of the 23 nonsynonymous terms in Table 1 had a secondary loading of > .30 in Saucier and Goldberg’s (1996) study. For example, honest, humble, and moral fit poorly into the five-factor framework. For this reason, Ashton and Lee (2007) added a sixth Honesty-Humility factor in their HEXACO model. The results of the present study support this solution based on the high usage frequencies of the words related to honesty and humility. Another example of the problems of the hierarchical models involves the term shy, which is a (negative) factor marker of Extraversion in Saucier and Goldberg’s (1996) model; however, Self-consciousness is classified as an Emotional Stability facet in McCrae and Costa’s (2003) taxonomy.
The facet rankings shown in Tables 1 and 2 are, as the title of this study suggests, “folk-psychological” rankings that may be biased in many ways. Furthermore, studying texts and words should not be naively equated with studying the types of phenomena represented (Uher, 2013, 2015). For example, the phrase “because he is an Aquarius” appears 12 times on the Internet, but it is doubtful whether zodiac signs actually have causal effects. The lexical hypothesis itself appeals to common sense; however, only a few empirical studies have attempted to verify its validity (Uher, 2013). Leising et al. (2014) found that the rated importance of a term was associated with the frequency of word use in a set of German personality descriptors. However, Wood (2015) found “little evidence that trait terms rated as having greater relational impact were more frequently used, or had a greater number of synonyms”. Wood observed that words tended to be used more frequently if they had positive, rather than consequential, effects on relational decisions. Everyone tries to be nice, but it is more important for others to know whether one is murderous. For this reason, the frequency of terms that reflect socially valued traits may be exaggerated compared to their actual relational effects. Another explanation for Wood’s observations may come from the fairly narrow aspects of behavior analyzed by Wood’s panels of judges. Entering new social or working relationships with other people represents a small proportion of all social interactions. Therefore, it may be that frequently used personality terms represent traits with significant effects on behavior in situations other than those studied by Wood. For example, McAdams (1992) criticized the five-factor model for being a “psychology of the stranger”. The five factors describe what one might want to know if one knew nothing else about a person. Arguably, the Internet and book texts offer a more representative sample of behavior descriptions.
In a recent study, Mottus (2016) argues that “when outcome associations are specific to facets, they should not be generalized to traits and when the associations are specific to particular items they should not even be generalized to facets.” Layman concepts such as honesty represent narrower traits than the Big Five factors do, but it may be that some narrow aspects or nuances of the trait of honesty predict some behavioral outcomes even better than the trait of honesty as a whole. Thus, personality research need not stop at the facet level but should proceed to the nuance level of the personality hierarchy. However, factor- and facet-level analyses will remain important for ensuring that all significant aspects of personality are included in the personality models.


If the lexical hypothesis is valid, personality models should include facets represented by the most popular personality terms. These facets are important independent of their standing in factor models, and they may be blends of two, three, or more broad traits. Their relevance has been tested in real life and in the process of language evolution. To cover all important aspects of personality, correlational analysis is needed to eliminate synonymous, closely synonymous, or somewhat related concepts, depending on the number of facets included in the model. Further studies are needed to analyze, for example, the optimal choice of 10, 30, or 50 scales for a general personality test. A wider variety of spoken and written language corpora and n-grams with different defined nouns should be studied to maximize validity in the selection of the most important terms. A larger pool of personality descriptors, such as Norman’s (1967) list of 2,797 personality terms, might be analyzed instead of the smaller samples used in the present study.


Ames, D. R., & Bianchi, E. C. (2008). The agreeableness asymmetry in first impressions: perceiver’s impulse to (Mis)judge agreeableness and how it is moderated by power. Personality and Social Psychology Bulletin, 34, 1719–1736.
Allport, G. W., & Odbert, H. G. (1936). Trait names: a psycho-lexical study. Psychological Monographs, 47, 1.
Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review, 11, 150–166.
Ashton, M. C., Lee, K., & Boies, K. (2015). One-through six-component solutions from ratings on familiar English personality-descriptive adjectives. Journal of Individual Differences, 36, 183–189.
Cattell, R. B. (1946). The description and measurement of personality. New York: World Book.
Costa, P. T., & McCrae, R. R. (1992). NEO PI-R Professional Manual. Odessa, FL: Psychological Assessment Resources.
Davies, M. (2008). The Corpus of Contemporary American English: 450 million words, 1990‐present. http://corpus.byu.edu/coca/
Goldberg, L. R. (1990). An alternative “description of personality”: The Big‐Five factor structure. Journal of Personality and Social Psychology, 59, 1216–1229.
Google. (2015). How Google search works. https://support.google.com/webmasters
Hampson, S. E., Goldberg, L. R., & John, O. P. (1987). Category breadth and social desirability values for 573 personality terms. European Journal of Personality, 1, 241–258.
Hofstee, W. K. B., De Raad, B., & Goldberg, L. R. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology, 63, 146–163.
John, O., & Srivastava, S. (1999). Big Five Trait Taxonomy. In L. A. Pervin & O. P. John (eds.), Handbook of personality: Theory and research (2nd ed., pp. 102–138). New York: Guilford.
Klages, L. (1932). The science of character. London: Allen & Unwin.
Leising, D., Scharloth, J., Lohse, O., & Wood, D. (2014). What types of terms do people use when describing an individual’s personality? Psychological Science, 25, 1787–1794.
Loehlin, J. C., & Goldberg, L. R. (2014). Do personality traits conform to lists or hierarchies? Personality and Individual Differences, 70, 51–56.
McAdams, D. P. (1992). The five-factor model in personality: A critical appraisal. Journal of Personality, 60, 329–361.
McCrae, R. R., & Costa, P. (2003). Personality in adulthood. A Five-factor theory perspective. New York: Guilford Press.
Merriam-Webster. (2016). Merriam-Webster online: Dictionary and Thesaurus. http://www.merriam-webster.com
Michel, J. B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Google Books Team, Pickett, J. P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M. A., & Aiden, E. L. (2011). Quantitative analysis of culture using millions of digitized books. Science, 331, 176–182.
Mottus, R. (2016). Towards more rigorous personality trait-outcome research. European Journal of Personality, 30, 292–303.
Paunonen, S. V., & Ashton, M. C. (2001). Big-five factors and facets and the prediction of behavior. Journal of Personality and Social Psychology, 81, 524–539.
Roivainen, E. (2013). Frequency of the use of English personality adjectives: Implications for personality theory. Journal of Research in Personality, 47, 417–420.
Roivainen, E. (2015). Personality adjectives in Twitter tweets and in the Google Books corpus: An analysis of the facet structure of the openness factor of personality. Current Psychology, 34, 621–625.
Roivainen, E. (2015). The Big Five factor markers are not especially popular words. Are they superior descriptors? Integrative Psychological and Behavioral Science, 49, 590–599.
Rushton, J. P., & Irwing, P. (2011). The general factor of personality: Normal and abnormal. In T. Chamorro-Premuzic, S. von Stumm, & A. Furnham (eds.), The Wiley-Blackwell Handbook of Individual Differences (pp. 132–161). West Sussex, UK: Wiley-Blackwell.
Saucier, G., & Goldberg, L. (1996). Evidence for the Big Five in analyses of familiar English personality adjectives. European Journal of Personality, 10, 61–77.
Saucier, G. (1997). Effects of variable selection on the factor structure of person descriptors. Journal of Personality and Social Psychology, 73, 1296–1312.
Uher, J. (2013). Personality psychology: Lexical approaches, assessment methods, and trait concepts reveal only half of the story. Why it is time for a paradigm shift. Integrative Psychological and Behavioral Science, 47, 1–55.
Uher, J. (2015). Conceiving “personality”: Psychologists’ challenges and basic fundamentals of the Transdisciplinary Philosophy-of-Science Paradigm for Research on Individuals. Integrative Psychological and Behavioral Science, 49, 398–458.
Wood, D. (2015). Testing the lexical hypothesis: Are socially important traits more densely reflected in the English lexicon? Journal of Personality and Social Psychology, 108, 317–335.
Wood, D., Nye, C. D., & Saucier, G. (2010). Identification and measurement of a more comprehensive set of person-descriptive trait markers from the English lexicon. Journal of Research in Personality, 44, 258–272.
Ziegler, M., & Bäckström, M. (2016). 50 Facets of a Trait 50-Ways to Mess Up? European Journal of Psychological Assessment, 32, 105–110.
Copyright: © 2016 Institute of Psychology, University of Gdansk This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License (http://creativecommons.org/licenses/by-nc-sa/4.0/), allowing third parties to copy and redistribute the material in any medium or format and to remix, transform, and build upon the material, provided the original work is properly cited and states its license.