The Leadership Quarterly

The Authentic Leadership Inventory (ALI): Development and empirical tests
Linda L. Neider ⁎, Chester A. Schriesheim

Available online 15 October 2011

Abstract
This paper presents the development and preliminary validation of a new measure of authentic leadership, the Authentic Leadership Inventory (ALI). It also assesses the recently developed Authentic Leadership Questionnaire (ALQ). Results indicate some concerns with the ALQ but support the content validity, reliability, factor structure, convergent and discriminant validity, concurrent validity, and freedom from impression management response bias of the ALI. Confirmatory factor analyses also do not support treating authentic or transformational leadership as universally global constructs. Instead, it is argued that future research would better be served by using separate authentic and transformational dimensions (rather than aggregate or global measures) to understand the unique aspects of both leadership constructs.
© 2011 Published by Elsevier Inc.

Keywords:
Authentic leadership
Authentic Leadership Inventory
Scale development
Open any newspaper and it will be replete with examples of corruption and greed at the very top ranks of U.S. corporations. Is it any wonder, then, that the general public, as well as scholars, have become enamored with finding authenticity in leadership? A critical dilemma, of course, is for researchers to operationally define the key behaviors and dimensions of such leadership (Cooper, Scandura, & Schriesheim, 2005). As Yukl (2010) aptly notes, “Until differences in the definition of authentic leadership are resolved, and differences between authentic leadership theory and other theories of leadership…are resolved, it will be difficult even to determine what should be included in the research” (p. 425).
There have been, however, numerous attempts to explicate the concept of authentic leadership within the last decade (for a current and more in-depth review, see Gardner et al., in press). One of the first perspectives was put forth by Bass and Steidlmeier (1999) and suggests that authenticity is an extension of transformational leadership.

Specifically:
“Leaders are authentically transformational when they increase awareness of what is right, good, important, and beautiful, when they help to elevate followers’ needs for achievement and self-actualization, when they foster in followers higher moral maturity, and when they move followers to go beyond their self-interests for the good of their group, organization, or society” (Bass & Steidlmeier, 1999).
In Bass and Steidlmeier’s (1999) view, authentically transformational leaders display the four major transformational leadership dimensions of idealized influence, inspirational motivation, intellectual stimulation, and individualized consideration. An authentic transformational leader is essentially a “moral agent” who empowers followers to take actions that are noble, fair, and legitimate (Bass & Steidlmeier, 1999).
The Leadership Quarterly 22 (2011) 1146–1164
⁎ Corresponding author at: Management Department, School of Business Administration, University of Miami, 5250 University Drive, Coral Gables, FL 33146, USA. Tel.: +1 305 284 6123.
E-mail address: [email protected] (L.L. Neider).
1048-9843/$ – see front matter © 2011 Published by Elsevier Inc.
doi:10.1016/j.leaqua.2011.09.008

Although Bass and Steidlmeier (1999) view authenticity as simply an extension of transformational leadership (Yukl, 2010), current elaborations consider authentic leadership as a “root concept” that underlies the positive aspects of charismatic, transformational, spiritual, and ethical leadership theories (Ilies, Morgeson, & Nahrgang, 2005). Luthans and Avolio (2003), for example, state that the “authentic leader is confident, hopeful, optimistic, resilient, moral/ethical, future-oriented, and gives priority to developing associates to be leaders. The authentic leader is true to him/herself” (p. 243). This latter description is also incorporated into Kernis’s (2003) work on self-esteem, which stresses that authenticity entails the “unobstructed operation of one’s true, or core, self” (p. 13) in everyday living. Similarly, Shamir and Eilam (2005) contend that an authentic leader has a “high level of self-resolution or self-concept clarity” (p. 399), in addition to self-concordant goals, self-expressive behavior, and the held belief that the leader role is central to their self-concept.
Utilizing findings from positive psychology and related fields, as well as previous operationalizations (Avolio & Gardner, 2005; George, 2000; Kernis, 2003; Luthans & Avolio, 2003), Ilies, Morgeson and Nahrgang (2005) developed a four-dimensional model of authentic leadership. This multi-factor conceptualization includes self-awareness (“one’s awareness of, and trust in, one’s own personal characteristics, values, motives, feelings, and cognitions;” p. 377); unbiased processing (“not denying, distorting, exaggerating or ignoring private knowledge, internal experiences, and externally based evaluative information;” p. 378); authentic behavior/acting (“whether people act in accord with their true self as opposed to acting merely to please others or to attain rewards or avoid punishments through acting ‘falsely;’” p. 380); and authentic relational orientation (“involves an active process of self-disclosure and the development of mutual intimacy and trust so that intimates will see one’s true self-aspects, both good and bad;” p. 381). While the Ilies, Morgeson and Nahrgang (2005) four-dimensional model successfully built upon prior theory to describe the potential behaviors, antecedents, and outcomes associated with authentic leadership, research cannot advance in any area without appropriate and psychometrically sound measures (Cooper, Scandura & Schriesheim, 2005).
Reflecting this, construct development and measurement validation for authentic leadership was recently addressed at some length by Walumbwa, Avolio, Gardner, Wernsing, and Peterson (2008). Assimilating research from social psychology, moral and ethical philosophy, and the contributions noted above, the authors proposed a four-factor Authentic Leadership Questionnaire (ALQ) and presented preliminary psychometric evidence for its future usage. Essentially, their higher order, multi-dimensional authentic leadership construct consists of the following four factors:
Self-Awareness (S): demonstrating an understanding of how one derives and makes meaning of the world and how that meaning-making process impacts the way one views himself or herself over time. It also refers to showing an understanding of one’s strengths and weaknesses and the multifaceted nature of the self, which includes gaining insight into the self through exposure to others, and being cognizant of one’s impact on other people (Walumbwa et al., 2008).
Relational Transparency (R): presenting one’s authentic self (as opposed to fake or distorted self) to others. Such behavior promotes trust through disclosures that involve openly sharing information and expressions of one’s true thoughts and feelings while trying to minimize displays of inappropriate emotions (Walumbwa, Avolio, Gardner, Wernsing & Peterson, 2008).
Balanced Processing (B): showing that they objectively analyze all relevant data before coming to a decision. Such people also solicit views that challenge their deeply held positions (Walumbwa, Avolio, Gardner, Wernsing & Peterson, 2008, pp. 95–96).
Internalized Moral Perspective (M): refers to an internalized and integrated form of self-regulation. This sort of self-regulation is guided by internal moral standards and values versus group, organizational, and societal pressures, and it results in expressed decision making and behavior that is consistent with these internalized values (Walumbwa, Avolio, Gardner, Wernsing & Peterson, 2008, p. 95).
After a deductive and inductive content analysis process, Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) ultimately
generated sixteen items for incorporation into the ALQ, followed by preliminary assessments of their instrument’s construct
validity.
A major contribution of Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) is the fact that the operationalization of authentic leadership employed in developing the ALQ is based on a thorough review of theoretical contributions encompassing multiple disciplines. This is a first step in construct development and validation and it is absolutely necessary to establish the psychometric soundness of any new measurement instrument (Hinkin, 1995; Schriesheim, Powers, Scandura, Gardiner, & Lankau,
1993). However, as Cronbach (1984) states, “Construct validation is a fluid, creative process….no interpretation can be considered
the final word, established for all time” (p. 149). Thus, a closer look at the ALQ may be warranted despite the encouraging evidence that currently exists concerning this instrument (see Gardner et al., in press for additional discussion of evidence on the ALQ).
One concern about the ALQ is that although eight sample items (from the sixteen used in the instrument) are presented in the Appendix to Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008), the full instrument is commercially copyrighted. While it is currently made available at no cost to researchers (see Gardner et al., in press), access to this instrument may become problematic in the future. Additionally, although Walumbwa et al. based their initial item generation on an extensive analysis of the literature, the content validation process that was employed relied heavily on the subjective judgments of a small number of doctoral students and other “subject matter experts.” In recent years, a quantitative approach to content validation has been developed to help reduce or eliminate subjectivity from scale development and item assessments (Hinkin & Tracey, 1999; Schriesheim, Cogliser, Scandura, Lankau, & Powers, 1999; Schriesheim, Powers, Scandura, Gardiner & Lankau, 1993; for recent illustrations of the application of these methods, see Hinkin & Schriesheim, 2008; Schriesheim, Alonso, & Neider, 2008). This quantitative process involves using one-way analysis of variance and factor or component analysis to further refine and strengthen item assignment. Given the importance of measurement in the field of leadership research (Schriesheim & Cogliser, 2010), it is essential that rigorous procedures be used to assess instrument content validity and to refine and/or replace problematic questionnaire items before considerable time, effort, and resources are invested in subsequent research that employs such measures. These issues together led to the development of a new instrument to assess authentic leadership, one that will be available in perpetuity to researchers for further psychometric and substantive investigations, and one with clear content validity (as well as construct validity).
In addition to the need for a valid measurement device assessing authentic leadership, there is considerable conceptual ambiguity concerning the difference between authentic leadership and related constructs, particularly (as suggested above) with respect to current conceptualizations of transformational leadership (Cooper, Scandura & Schriesheim, 2005; Yukl, 2010).
Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) began the process of assessing the ALQ’s discriminant validity, and found support for the empirical distinctiveness of perceived authentic versus perceived transformational leadership. However, the Multifactor Leadership Questionnaire (MLQ; Bass, 1990; Bass & Avolio, 1990, 1993) was used in this research, an instrument that has been shown to have serious construct validity issues (for a brief review, see Schriesheim, Alonso & Neider, 2008). Among the concerns about the MLQ are (a) its problematic factor structure, in particular the collapsing of different dimensions into an aggregate or global measure, (b) the high intercorrelations among its subscales, and (c) the possibility of serious response bias distortions affecting at least one of its subscales (Schriesheim, Alonso & Neider, 2008). In fact, preliminary evidence suggests that the Transformational Leadership Inventory (TLI), initially developed by Podsakoff, MacKenzie, Moorman, and Fetter (1990), and subsequently further tested (e.g., Podsakoff, MacKenzie & Bommer, 1996), “appears to be psychometrically superior” to the more widely used MLQ (Schriesheim, Alonso & Neider, 2008, p. 16).
Another potential issue with the Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) development of the ALQ concerns their confirmatory factor analyses (CFA’s), which indicated that a second-order factor model fit their data better than did a simple four-factor correlated first-order factor model. Unfortunately, in specifying and testing their first- and second-order factor models, two correlated error or “garbage parameters” were included (Walumbwa, personal communication, March 16, 2009). This would
tend to inflate model fit and weaken the conclusion that a second-order factor model is a significantly better portrayal of authentic
leadership than is a simple first-order model (MacCallum, 1986).
Thus, based on the discussion presented above, there are three key purposes to this study. One is to begin the validation process for an alternative measure of authentic leadership which will always be available to researchers and which uses the thorough theoretical foundation established by Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008). Second, we wanted to employ a more rigorous quantitative content validity assessment procedure and to also assess the viability of several different CFA models (without the benefit of garbage parameters). Finally, the third goal of the present paper is to investigate the discriminant validity of our new authentic leadership measure vis-à-vis a better transformational leadership measure (the TLI rather than the MLQ) and to also investigate the convergent and discriminant validity (Campbell & Fiske, 1959) of the new measure using relatively rigorous CFA procedures (cf. Schmitt & Stults, 1986; Widaman, 1985). As part of this third objective, relationships between the new measure and social desirability or impression management response bias (Zerbe & Paulhus, 1987) will be examined, along with relationships between the new scales and three commonly used leadership dependent variables (general job satisfaction, supervision satisfaction, and organizational commitment) (Dumdum, Lowe, & Avolio, 2002; Podsakoff, Bommer, Podsakoff, & MacKenzie, 2006).

  1. Study 1: scale development and content validation
    As noted above, the development of items for measuring authentic leadership was facilitated by adopting the theoretical
    framework and dimension definitions provided by Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008, pp. 95–96).
    Using the Walumbwa et al. four dimension definitions as guides, the authors wrote four items for each dimension, two of
    which were paraphrased from the two sample items that are provided by Walumbwa, Avolio, Gardner, Wernsing and Peterson
    (2008, Appendix, p. 121) for each dimension. This was done to provide maximum theoretical convergence and fidelity with
    the Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) definitions and conceptualization in the initial pool of authentic
    leadership items. Having produced a preliminary set of 16 authentic items for our Authentic Leadership Inventory (ALI), we then
    undertook the process of assessing their content validity, along with the content validity of the 8 authentic leadership items that
    Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) used to illustrate the Authentic Leadership Questionnaire (ALQ). The
    method employed was the quantitative content validity assessment procedure developed by Schriesheim, Cogliser, Scandura,
    Lankau and Powers (1999, 1993) and further refined by Hinkin and Tracey (1999).
    1.1. Method
    1.1.1. Sample
    The sample consisted of 40 undergraduates (juniors and seniors) and 32 executive M.B.A. students who were taking classes in
    leadership at a medium-sized southern university and who had not yet covered either authentic or transformational leadership.
    The sample was comprised of 56% males and the average age was 23.6 years old. A total of 44% reported that they were currently
    employed, 32% reported being employed in the last six months, and only 8% reported having no work experience. As noted in
    Schriesheim, Cogliser, Scandura, Lankau and Powers (1999, 1993), the requirements to complete the task assigned the subjects
    are sufficient intellectual ability to read and understand the dimension definitions and rate the items, and the lack of any theoretical
    biases. As such, the use of these particular students was seen as reasonable.
    1.1.2. Procedure
    Rating forms were administered during class time, along with verbal and written instructions. Participation was voluntary but
    given extra course credit. Each respondent was administered a form that contained the 16 ALI items in the same order as they
    appear in Table 1. The 8 ALQ items appeared after the ALI items and these were presented in alternating order. ALQ Self-Awareness
    sample items 1 and 2 appeared as items 17 and 21; Relational Transparency items 3 and 4 were shown as items 18 and 22; Internalized
    Moral Perspective items 5 and 6 were listed as items 19 and 23; and Balanced Processing items 7 and 8 appeared as items 20 and 24.
    A short demographic questions section was included, as were the definitions (presented above) of the four authentic leadership
    constructs that were to be assessed.
    The definitions of the four leader behaviors were presented on a cover page to the rating form that was torn off and used as a reference
    while the items were being evaluated. These definitions were exactly the same as those presented earlier (when we first discussed
    the ALQ). The respondents then rated each of the 24 items on the extent to which they believed that the items measured each
    of the four leader behaviors, using the theoretical definitions of the dimensions that were provided (i.e., each item was rated four
    times, once for each dimension). The rating scale employed was: 0=None, 1=Hardly any, 2=Some, 3=Much, 4=Very much,
    and 5=Almost completely or completely. For an example of the rating form employed, see Schriesheim, Powers, Scandura, Gardiner
    and Lankau (1993).
    1.1.3. Methods of analysis
    We first used one-way analysis of variance (ANOVA) and planned directional t-tests to determine to which dimension(s) each item
    should be assigned (Hinkin & Tracey, 1999). This technique eliminates the use of subjective judgment for assigning items to dimensions
    since the procedure employs well-established direct empirical tests for determining item dimensionality. Additionally, although
    not needed for the current study, this technique can be utilized with relatively small sample sizes and it is quite straightforward in its
    application. For recent illustrations of the use of this method see Hinkin and Schriesheim (2008) and Schriesheim, Alonso and Neider
    (2008).
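The item-assignment step described above can be illustrated with a minimal sketch. It assumes a plain one-way layout in which one item's ratings on the four dimensions are treated as four groups; the ratings shown are hypothetical, the function name `one_way_f` is illustrative, and the planned directional t-tests are not reproduced here.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of rating groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)            # number of dimensions (groups)
    n_total = len(all_values)  # total number of ratings
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical 0-5 ratings of ONE item by six raters on each dimension
# (not data from the study).
ratings = {
    "S": [4, 5, 3, 4, 5, 4],
    "R": [1, 0, 2, 1, 1, 0],
    "M": [0, 1, 0, 1, 0, 1],
    "B": [1, 2, 0, 1, 1, 2],
}

f_value = one_way_f(list(ratings.values()))
best_dimension = max(ratings, key=lambda d: mean(ratings[d]))
print(f"F = {f_value:.2f}, highest-rated dimension = {best_dimension}")
```

An item would be a candidate for the dimension with the highest mean rating only if the follow-up directional comparisons against the other three dimensions also supported it.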
    Once we had completed the ANOVA and t-tests, we then employed extended data matrix component analysis to determine if
    the subjects had confused the four dimensions (i.e., whether they were theoretically distinct) and to further determine to which
    dimension(s) each item should be assigned. The data input for this analysis was an “extended data matrix” (Schriesheim, Powers,
    Scandura, Gardiner & Lankau, 1993) that had 24 columns (representing the 16 ALI and 8 ALQ items) and 288 rows, representing 4
    judgments (one on each authentic leadership dimension) for the 72 subjects.
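The shape of the extended data matrix can be checked with a short sketch. The nested-list layout `ratings[s][d][i]` and the function name are assumptions made for illustration, and the ratings are placeholder zeros rather than study data.

```python
N_SUBJECTS, N_DIMENSIONS, N_ITEMS = 72, 4, 24

def build_extended_matrix(ratings):
    """ratings[s][d][i] is subject s's rating of item i on dimension d.

    Returns one row per (subject, dimension) judgment and one column per item.
    """
    return [list(per_dimension)
            for per_subject in ratings
            for per_dimension in per_subject]

# Placeholder ratings (all zeros), used only to show the resulting shape.
ratings = [[[0] * N_ITEMS for _ in range(N_DIMENSIONS)]
           for _ in range(N_SUBJECTS)]

matrix = build_extended_matrix(ratings)
print(len(matrix), len(matrix[0]))  # 288 24
```

Stacking the four judgments of each subject as separate rows is what yields 72 × 4 = 288 rows over the 24 item columns.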
    These data were subjected to a principal components analysis to determine the appropriate number of underlying dimensions in the
    data. The eigenvalues of the first eight components were 5.14, 4.29, 2.69, 2.13, 1.03, 0.99, 0.79, 0.74, indicating by a scree test (Cattell,
    1966) the appropriateness of extracting four dimensions (the four components accounted for a total of 59.4% of the item variance).
    The four components were then subjected to a varimax orthogonal rotation for interpretation (because the dimensions are theoretically
    independent).
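The variance figures reported above can be reproduced from the eigenvalues. The sketch below assumes the components were extracted from the 24-item correlation matrix, so that total variance equals 24; that assumption is consistent with the reported 59.4% but is not stated explicitly in the text.

```python
# First eight eigenvalues, as reported in the text.
eigenvalues = [5.14, 4.29, 2.69, 2.13, 1.03, 0.99, 0.79, 0.74]
n_items = 24  # assumed total variance of a 24-item correlation matrix

retained = eigenvalues[:4]
percent = [round(100 * ev / n_items, 1) for ev in retained]
total_percent = round(100 * sum(retained) / n_items, 1)

# Successive drops between adjacent eigenvalues, as inspected in a scree plot.
drops = [round(a - b, 2) for a, b in zip(eigenvalues, eigenvalues[1:])]

print(percent, total_percent)  # [21.4, 17.9, 11.2, 8.9] 59.4
print(drops)
```

The drop from the fourth to the fifth eigenvalue is 1.10, after which successive drops are 0.20 or smaller, which is the kind of flattening a scree test looks for.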
    1.2. Results
    1.2.1. Results for the ALI items
    As shown in Table 2, the ANOVAs and t-tests support the assignment of all sixteen ALI items–except for item 6–to the theoretical dimension for which they were written. Other than for item 6, the items’ mean ratings on their theoretically appropriate dimensions are generally much larger than their mean ratings on even the next most highly rated dimension. This is also reflected in generally large one-way ANOVA F-test values that can be seen in Table 2.
    Using the .40 loading criterion suggested by Ford, MacCallum, and Tait (1986), the component analysis reported in Table 3 supports the theoretical distinctiveness of the four authentic leadership dimensions (i.e., that they are not confused by the subjects) and that 15 of the ALI items are properly assigned to the four authentic leadership dimensions. Item 6, again, appears improperly assigned due to its .48 loading on the Self-Awareness component instead of the Relational Transparency dimension. Item 1 also appears to be possibly problematic due to a more modest .31 loading on the Balanced Processing component (in addition to its appropriate .47 loading on the Self-Awareness dimension).

Table 1
ALI items.
1. My leader solicits feedback for improving his/her dealings with others. (S)
2. My leader clearly states what he/she means. (R)
3. My leader shows consistency between his/her beliefs and actions. (M)
4. My leader asks for ideas that challenge his/her core beliefs. (B)
5. My leader describes accurately the way that others view his/her abilities. (S)
6. My leader admits mistakes when they occur. (R)
7. My leader uses his/her core beliefs to make decisions. (M)
8. My leader carefully listens to alternative perspectives before reaching a conclusion. (B)
9. My leader shows that he/she understands his/her strengths and weaknesses. (S)
10. My leader openly shares information with others. (R)
11. My leader resists pressures on him/her to do things contrary to his/her beliefs. (M)
12. My leader objectively analyzes relevant data before making a decision. (B)
13. My leader is clearly aware of the impact he/she has on others. (S)
14. My leader expresses his/her ideas and thoughts clearly to others. (R)
15. My leader is guided in his/her actions by internal moral standards. (M)
16. My leader encourages others to voice opposing points of view. (B)
Note. Items 1 and 6 were subsequently deleted from the final scales and are shown in italics. Response choices are: (1) Disagree strongly; (2) Disagree; (3) Neither Agree nor Disagree; (4) Agree; and (5) Agree strongly. Abbreviations used are: (S) = Self-Awareness, (R) = Relational Transparency, (M) = Internalized Moral Perspective, and (B) = Balanced Processing. Instructions given respondents in organizations usually include the definitional statement, “Please note that the term ‘leader’ means your immediate or direct supervisor.”

Table 2
ALI content validity rating results.
| Scale and item no. | S mean rating (SD) | R mean rating (SD) | M mean rating (SD) | B mean rating (SD) | One-way F-test (Sig.) | Planned directional t-test comparisons |
| --- | --- | --- | --- | --- | --- | --- |
| ALI 1 (S) | 2.39 (1.97) | 1.14 (1.75) | 0.50 (1.24) | 1.93 (1.95) | 22.18 (.001) | S>R, S>M, S>B (* * *) |
| ALI 2 (R) | 1.28 (1.84) | 2.88 (1.99) | 0.68 (1.54) | 0.79 (1.49) | 15.33 (.001) | R>S, R>M, R>B (* * *) |
| ALI 3 (M) | 1.15 (1.80) | 1.69 (1.98) | 3.18 (2.06) | 0.53 (1.30) | 22.04 (.001) | M>S, M>R, M>B (* * *) |
| ALI 4 (B) | 1.18 (1.76) | 0.54 (1.36) | 0.79 (1.55) | 2.99 (2.15) | 16.21 (.001) | B>S, B>R, B>M (* * *) |
| ALI 5 (S) | 3.22 (1.89) | 1.24 (1.83) | 0.32 (0.98) | 0.44 (1.17) | 63.81 (.001) | S>R, S>M, S>B (* * *) |
| ALI 6 (R) | 2.15 (2.09) | 1.76 (2.06) | 1.43 (1.92) | 0.69 (1.38) | 9.93 (.001) | R>S, R>M, R>B (* *) |
| ALI 7 (M) | 0.89 (1.61) | 0.64 (1.42) | 3.54 (1.93) | 0.92 (1.47) | 36.07 (.001) | M>S, M>R, M>B (* * *) |
| ALI 8 (B) | 0.74 (1.33) | 0.42 (1.18) | 0.25 (0.83) | 3.94 (1.61) | 83.25 (.001) | B>S, B>R, B>M (* * *) |
| ALI 9 (S) | 4.39 (1.13) | 0.65 (1.46) | 0.64 (1.46) | 0.40 (1.16) | 122.02 (.001) | S>R, S>M, S>B (* * *) |
| ALI 10 (R) | 0.36 (1.01) | 3.68 (1.59) | 0.10 (0.45) | 0.50 (1.19) | 123.53 (.001) | R>S, R>M, R>B (* * *) |
| ALI 11 (M) | 1.14 (1.77) | 0.68 (1.54) | 3.86 (1.59) | 0.57 (1.42) | 58.78 (.001) | M>S, M>R, M>B (* * *) |
| ALI 12 (B) | 0.39 (0.99) | 0.43 (1.21) | 0.32 (0.95) | 4.22 (1.51) | 118.00 (.001) | B>S, B>R, B>M (* * *) |
| ALI 13 (S) | 3.69 (1.64) | 0.76 (1.48) | 0.46 (1.13) | 0.19 (0.74) | 84.95 (.001) | S>R, S>M, S>B (* * *) |
| ALI 14 (R) | 1.14 (1.71) | 3.10 (1.78) | 0.22 (0.79) | 0.69 (1.52) | 50.79 (.001) | R>S, R>M, R>B (* * *) |
| ALI 15 (M) | 0.71 (1.32) | 0.29 (0.94) | 4.25 (1.43) | 0.36 (1.12) | 119.10 (.001) | M>S, M>R, M>B (* * *) |
| ALI 16 (B) | 0.85 (1.58) | 1.17 (1.76) | 0.29 (0.91) | 3.10 (1.91) | 45.12 (.001) | B>S, B>R, B>M (* * *) |
| ALQ 1 (S) | 2.18 (1.99) | 1.32 (1.85) | 0.61 (1.41) | 1.49 (1.86) | 9.17 (.001) | S>R, S>M, S>B (* * *) |
| ALQ 3 (R) | 1.49 (1.94) | 3.08 (2.17) | 0.81 (1.69) | 0.35 (1.09) | 29.18 (.001) | R>S, R>M, R>B (* * *) |
| ALQ 5 (M) | 1.40 (1.87) | 1.64 (2.04) | 2.38 (2.14) | 0.43 (1.21) | 22.58 (.001) | M>S, M>R, M>B (* * *) |
| ALQ 7 (B) | 0.82 (1.54) | 0.64 (1.48) | 0.83 (1.53) | 3.31 (2.06) | 21.15 (.001) | B>S, B>R, B>M (* * *) |
| ALQ 2 (S) | 3.38 (1.72) | 1.42 (1.78) | 0.14 (0.70) | 0.44 (1.29) | 130.67 (.001) | S>R, S>M, S>B (* * *) |
| ALQ 4 (R) | 2.65 (1.95) | 1.86 (2.02) | 0.97 (1.74) | 0.71 (1.53) | 17.72 (.001) | R>S, R>M, R>B (* *) |
| ALQ 6 (M) | 0.97 (1.71) | 0.53 (1.42) | 4.01 (1.50) | 0.38 (0.98) | 83.50 (.001) | M>S, M>R, M>B (* * *) |
| ALQ 8 (B) | 0.60 (1.23) | 0.38 (1.11) | 0.50 (1.27) | 4.08 (1.47) | 103.04 (.001) | B>S, B>R, B>M (* * *) |
Note: Table 1 presents the ALI items. For the ALQ items, see Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008, p. 121). Dimension abbreviations used are: S = Self-Awareness, R = Relational Transparency, B = Balanced Processing, and M = Internalized Moral Perspective.
1.2.2. Results for the ALQ items
As shown in Table 2, the ANOVAs and t-tests support the assignment of 7 of the 8 Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) ALQ items to the theoretical dimensions for which they were written. Item 4 is shown to be clearly assigned to the wrong dimension, however, as its highest mean rating is on Self-Awareness and not Relational Transparency. The component analysis likewise indicates that 7 of the 8 ALQ items are properly assigned, again with item 4 being misassigned due to having a strong loading on the Self-Awareness component (.63) and only a modest loading (.31) on Relational Transparency.

Table 3
ALI content validity rotated component results.
| Scale and item no. | Component 1 (B) | Component 2 (M) | Component 3 (S) | Component 4 (R) | Item communality (h²) |
| --- | --- | --- | --- | --- | --- |
| ALI 1 (S) | .31 | −.17 | .47 | −.04 | .45 |
| ALI 2 (R) | −.11 | .01 | .08 | .72 | .59 |
| ALI 3 (M) | −.14 | .68 | −.04 | .19 | .54 |
| ALI 4 (B) | .67 | −.06 | .01 | −.23 | .51 |
| ALI 5 (S) | −.20 | −.15 | .72 | .01 | .67 |
| ALI 6 (R) | −.05 | .24 | .48 | .29 | .63 |
| ALI 7 (M) | −.02 | .79 | −.10 | −.10 | .65 |
| ALI 8 (B) | .85 | −.17 | −.06 | −.11 | .80 |
| ALI 9 (S) | −.17 | .02 | .81 | −.13 | .75 |
| ALI 10 (R) | −.12 | −.25 | −.15 | .77 | .75 |
| ALI 11 (M) | −.09 | .78 | −.02 | −.15 | .64 |
| ALI 12 (B) | .81 | −.15 | −.17 | −.11 | .73 |
| ALI 13 (S) | −.19 | −.09 | .78 | −.07 | .67 |
| ALI 14 (R) | −.09 | −.14 | .10 | .76 | .64 |
| ALI 15 (M) | −.21 | .78 | −.17 | −.24 | .74 |
| ALI 16 (B) | .78 | −.15 | .00 | .06 | .64 |
| ALQ 1 (S) | .21 | −.10 | .52 | .11 | .34 |
| ALQ 3 (R) | −.14 | .14 | .19 | .73 | .63 |
| ALQ 5 (M) | −.09 | .63 | .05 | .26 | .58 |
| ALQ 7 (B) | .76 | −.01 | −.03 | −.07 | .59 |
| ALQ 2 (S) | −.13 | −.16 | .74 | .11 | .63 |
| ALQ 4 (R) | .04 | .14 | .63 | .31 | .66 |
| ALQ 6 (M) | −.15 | .80 | −.08 | −.17 | .71 |
| ALQ 8 (B) | .84 | −.13 | −.10 | −.10 | .75 |
| Eigenvalue (% of variance explained) | 5.14 (21.4%) | 4.29 (17.9%) | 2.69 (11.2%) | 2.13 (8.9%) | 14.25 (59.4%) |
Note: Table 1 presents the ALI items and definitions. For the ALQ items, see Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008, p. 121). Scale abbreviations used are: S = Self-Awareness, R = Relational Transparency, M = Internalized Moral Perspective, and B = Balanced Processing. Loadings ≥.40 are italicized.
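As an illustration of how the .40 loading criterion is applied, the sketch below checks four rows copied from Table 3. The function name `salient_dimensions` and the "ok"/"check" labels are illustrative choices, not part of the published procedure.

```python
CUTOFF = 0.40
COMPONENTS = ("B", "M", "S", "R")  # column order of Table 3

# (item, intended dimension, rotated loadings) copied from Table 3.
rows = [
    ("ALI 1", "S", (0.31, -0.17, 0.47, -0.04)),
    ("ALI 6", "R", (-0.05, 0.24, 0.48, 0.29)),
    ("ALI 9", "S", (-0.17, 0.02, 0.81, -0.13)),
    ("ALQ 4", "R", (0.04, 0.14, 0.63, 0.31)),
]

def salient_dimensions(loadings, cutoff=CUTOFF):
    """Return the components on which an item loads at or above the cutoff."""
    return [c for c, v in zip(COMPONENTS, loadings) if abs(v) >= cutoff]

for item, intended, loadings in rows:
    salient = salient_dimensions(loadings)
    status = "ok" if salient == [intended] else "check"
    print(item, "intended:", intended, "salient:", salient, status)
```

Under this rule ALI 6 and ALQ 4 are flagged because their only salient loading is on Self-Awareness rather than Relational Transparency. ALI 1 passes the strict cutoff, so its .31 secondary loading on Balanced Processing is a judgment call that the rule alone does not capture.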

1.3. Discussion
Both sets of results lend support to the theoretical content validity of the new ALI scales, if item ALI 6 and, possibly, item ALI 1 are removed. The results also support the content validity of seven of the eight Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) ALQ items but, unfortunately, their final measure includes problematic item ALQ 4. Since our ALI item 6 was created by paraphrasing ALQ item 4, it seems clear that the theoretical content of this item is problematic as a measure of Relational Transparency and it probably should not be employed in an authentic leadership scale. In view of this result and the fact that eight additional ALQ items have not been empirically evaluated for content validity, the ALQ warrants further content validity testing and possible refinement. This supports our earlier discussion concerning the desirability of subjecting the ALQ to psychometric analyses beyond those presented in Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008).

2. Study 2: testing the ALI factor structure and the differentiation of authentic from transformational leadership
Given the positive content validity assessment results reported above, the new ALI scales were tested for internal consistency reliability and empirical factor structure. Although the content validity assessment process yields scales that are theoretically distinct from one another, this does not guarantee that they will be perceived as being separate and distinct by respondents in field settings. There are a number of reasons why theoretically distinct scales may not be seen as distinct by respondents when they describe a leader’s behavior, including the very real possibility that leaders who display one type of behavior measured by an instrument may also tend to display the instrument’s other behaviors (Schriesheim, Cogliser, Scandura, Lankau & Powers, 1999, 1993). Thus, field-testing the dimensionality of any new instrument is a critical next step even after theoretical content validity has been supported (Hinkin, 1998). In our testing we sought to establish that the ALI measures four different perceived leader behaviors and that the best representation of its perceptual structure is a model that has four different and distinct–but correlated–latent variables (factors).
Data from the 2008 U.S. presidential election provided us with an opportunity to not only study the ALI’s factor structure but to also see how it might vary according to the leader being described. Furthermore, because candidate authenticity was one of the issues that surrounded the election (Collins, 2008; Medved, 2008), it also afforded us an excellent opportunity to test whether authentic leadership is seen as consisting of dimensions that are different from those that are associated with transformational leadership. Finally, because extraneous sources of variance and covariance are always a concern with survey questionnaire measures, we also undertook the examination of the degree to which the ALI measures are likely to be contaminated by social desirability or impression management (Nunnally & Bernstein, 1994; Zerbe & Paulhus, 1987).
Briefly, Zerbe and Paulhus (1987) present a reconceptualization of social desirability response bias as involving two components: self-deception and impression management. They argue that the measurement and control of socially-desirable responding is better accomplished by the impression management component than by the self-deception component because the latter may often have substantive relationships with variables studied in organizational behavior. Impression management, on the other hand, is more likely to represent irrelevant or confounding variance and covariance and, hence, is more theoretically appropriate for use as a control variable in organizational research. For these reasons, we employed a measure of impression management (described below) and examined its relationships with the four ALI dimensions.
2.1. Method
2.1.1. Sample
The initial sample consisted of 536 undergraduates (freshmen, juniors, and seniors) taking two management courses—one an introductory
general survey course and the second a first course in organizational behavior. Failures to remember the identification code
(for matching questionnaires, discussed below), mismatched codes, and respondents with missing data totaled 37, thereby reducing
the final sample size to 499.
The average age of the sample was 19.7 years and 58.6% were male. All but 5 had work experience. A total of 174 had full-time
work experience that averaged 2.9 years in duration, while 320 had part-time experience of 2.7 years on average. Of the 175 who were
currently employed, the number of hours worked per week averaged 17.3. Finally, the composition of the 498 who reported
their ethnicity was 7.3% Black Non-Hispanic, 61.9% White Non-Hispanic, 20.4% Hispanic, 0.2% Native American or Alaskan, 4.6% Asian or
Pacific Islander, and 5.6% Other.
2.1.2. Procedure
Surveys were administered during class time; participation was voluntary but earned extra course credit. The first survey was
administered approximately 1 week before the 2008 U.S. presidential election; the second survey was administered one or two
days after the election, depending on the class meeting schedule. Matching of the first and second surveys was accomplished
by having the students write a 4-digit identification code of their own choosing on the first survey and then recording the
same code on the second survey.
2.1.3. Measures
The first survey measures that were administered included the 16 items of the ALI that are shown in Table 1, Podsakoff, MacKenzie,
Moorman and Fetter's (1990) 23-item Transformational Leadership Inventory (TLI), additional social–psychological measures, a set of
background demographic questions (age, gender, political affiliation, etc.), a measure of impression management, and one question
asking how the respondent intended to vote in the election. Each measure (except the demographic, impression management, and
voting intent questions) was asked twice: once soliciting descriptions of John McCain and once asking for descriptions of Barack
Obama. The questions were grouped by scale and by candidate (e.g., ALI for McCain) and were randomly distributed on the questionnaires.
The second survey asked the respondents how they had voted and, if they had not voted, the reason(s) for not doing so. Based
upon the Zerbe and Paulhus (1987) reconceptualization of socially desirable responding, we employed the Paulhus (1998) Balanced
Inventory of Desirable Responding (Version 7), a twenty-item scale in which only extreme responses are scored (answers of 1 and 2
are scored for the reverse-coded items and 6 and 7 are scored for the regular items; all answers are made on a 7-point scale that ranges
from “Not True” to “Very True”). Scores can range from 0 to 20, with higher scores indicating higher impression management. Psychometric
evidence supporting this scale is generally quite positive (e.g., Lanyon & Carle, 2007; Paulhus, 1991, 1998). Two sample items
are, “I sometimes tell lies if I have to” (reverse-scored), and “I never cover up my mistakes.”
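The extreme-response scoring rule described above can be sketched in a few lines. This is a minimal illustration rather than the published scoring key: the function name, the argument names, and the item positions treated as reverse-coded are assumptions made for the example. Only the rule itself (answers of 1 or 2 count on reverse-coded items, answers of 6 or 7 count on regular items, all on a 7-point scale) comes from the text.

```python
def score_impression_management(responses, reverse_coded):
    """Score a dichotomously keyed impression management scale.

    responses     -- list of integer answers on the 1-7 scale, one per item
    reverse_coded -- set of zero-based item positions that are reverse-coded
                     (hypothetical positions; the actual key is not given here)
    """
    score = 0
    for position, answer in enumerate(responses):
        if not 1 <= answer <= 7:
            raise ValueError(f"answer {answer} is outside the 1-7 scale")
        if position in reverse_coded:
            # Only extreme disagreement counts on reverse-coded items.
            score += int(answer in (1, 2))
        else:
            # Only extreme agreement counts on regular items.
            score += int(answer in (6, 7))
    return score


# Hypothetical respondent: 20 items, the first 10 treated as reverse-coded.
answers = [1, 2, 3, 4, 7, 1, 5, 2, 6, 4, 7, 6, 5, 4, 3, 7, 1, 6, 2, 7]
print(score_impression_management(answers, reverse_coded=set(range(10))))
```

Because mid-scale answers never add to the total, a respondent who answers 4 on every item scores 0, and the maximum of 20 requires an extreme answer in the keyed direction on every item.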
2.1.4. Methods of analysis
All scales were first analyzed for internal consistency reliability, using Cronbach’s coefficient alpha. These analyses were undertaken
for the McCain and Obama descriptions separately. For the ALI, these analyses were conducted both with and without items 1 and 6
being included.
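For readers who want to see the reliability computation, coefficient alpha can be sketched as follows. The ratings below are hypothetical and the helper names (`cronbach_alpha`, `drop_item`) are illustrative; the sketch uses the standard alpha formula and is not a reproduction of the software used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items table of scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def drop_item(rows, index):
    """Return the same table with one item (column) removed."""
    return [row[:index] + row[index + 1:] for row in rows]


# Hypothetical 7-point ratings from five respondents on a 4-item scale.
ratings = [
    [5, 6, 5, 6],
    [3, 3, 4, 3],
    [6, 7, 6, 6],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(ratings), 2))                # all four items
print(round(cronbach_alpha(drop_item(ratings, 0)), 2))  # without the first item
```

Running the function with and without a given item mirrors the comparison of the initial and final Self-Awareness and Relational Transparency scales.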
The factor structure of the ALI alone was next tested for each candidate separately using LISREL confirmatory factor analysis
(CFA) (Jöreskog & Sörbom, 2006). Combined analyses were not conducted because our data on the respondents’ perceptions of
the two candidates are not independent. Using the full set of 16 ALI items, six alternative hierarchically-nested rival models
(cf. Widaman, 1985) were first tested: (a) a 1-factor model, with all 16 items loading on the same factor, (b) a second-order factor
model, with 4 first order factors (representing the 4 ALI scales) loading on the one second order factor, (c) a correlated 4-factor
model (representing the 4 ALI scales), (d) a correlated 4-factor model with item 1 being allowed to cross-load on the Balanced
Processing factor, (e) a correlated 4-factor model with item 6 being allowed to cross-load on the Self-Awareness factor, and (f)
a correlated 4-factor model with items 1 and 6 each being allowed to cross-load (as on models d and e). Models d, e, and f
were tested to verify the content validity assessment findings that ALI items 1 and 6 are poor representations of their constructs
and therefore should be eliminated from the instrument. A seventh model (g) was also fit that was identical to the third model (c)
except that it was based on the 14 ALI items that had clearly shown good content validity in the content validity assessment.
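The nesting of these rival models can be made concrete by counting free parameters. The sketch below assumes one common way of counting (one loading and one uniqueness per item, factor scales fixed for identification, either freely estimated factor correlations or one second-order loading per first-order factor); the function names are illustrative. Under these assumptions the counts reproduce the degrees of freedom reported for models a through g in Table 5.

```python
def unique_moments(n_items):
    """Number of distinct variances and covariances among the items."""
    return n_items * (n_items + 1) // 2


def cfa_degrees_of_freedom(n_items, n_factors, second_order=False, cross_loadings=0):
    """Degrees of freedom for a simple-structure CFA (assumed parameter counting)."""
    loadings = n_items + cross_loadings   # one per item, plus any cross-loadings
    uniquenesses = n_items                # one error variance per item
    if n_factors == 1:
        structural = 0
    elif second_order:
        structural = n_factors            # first-order factors load on one higher-order factor
    else:
        structural = n_factors * (n_factors - 1) // 2  # free factor correlations
    return unique_moments(n_items) - (loadings + uniquenesses + structural)


models = {
    "a": cfa_degrees_of_freedom(16, 1),
    "b": cfa_degrees_of_freedom(16, 4, second_order=True),
    "c": cfa_degrees_of_freedom(16, 4),
    "d": cfa_degrees_of_freedom(16, 4, cross_loadings=1),
    "e": cfa_degrees_of_freedom(16, 4, cross_loadings=1),
    "f": cfa_degrees_of_freedom(16, 4, cross_loadings=2),
    "g": cfa_degrees_of_freedom(14, 4),
}
print(models)
```

Each additional free parameter costs one degree of freedom, which is why the difference in degrees of freedom between two nested models equals the number of constraints being tested.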
Next, the discriminant validity of the ALI (and the differentiation of authentic from transformational leadership) was tested by
subjecting it to multiple CFA’s with the TLI, again employing hierarchically-nested rival models to determine the best portrayal of
the data (cf. Widaman, 1985). Although authentic and transformational leadership are related constructs, current theory emphasizes
that they are not identical in their component dimensions (Walumbwa, Avolio, Gardner, Wernsing & Peterson, 2008). Consequently,
ALI items 1 and 6 were deleted from the scales (due to the ALI CFA results, reported below) and five combined ALI-TLI CFA’s
were conducted: (a) a 1-factor model, with all 37 items loading on the same factor, (b) a correlated 2-factor model (with the ALI items loading
on one factor and the TLI items loading on another), (c) a second order factor model with 10 first order factors (4 factors for the ALI
and 6 factors for the TLI) that loaded on one second order factor, (d) another second order factor model with the same 10 first
order factors and two second order factors (one for authentic leadership and one for transformational leadership), and (e) a correlated
10-factor model (one for each ALI and TLI dimension) that had the ALI and TLI items loading only on their appropriate scale factors.
A mixed first and second order model that had only the ALI first order factors loading on a single second order factor was not estimated
because the McCain ALI results did not support such a model (and because we wanted to estimate intercorrelations among all ten ALI
and TLI factors for both the McCain and Obama data).
Finally, we correlated the four ALI scales with impression management, for both McCain and Obama separately. We hoped that
all eight correlations would be nonsignificant or, at least, low enough so as to not suggest problematic confounding of the ALI with
an extraneous response bias.
All of the CFA models employed were estimated using maximum likelihood procedures and without the benefit of “garbage
parameters” (MacCallum, 1986) to improve model fits. Additionally, unless described otherwise, all of the first order factors
were specified to be comprised of only the items that are part of each scale (e.g., all the Relational Transparency items loaded
on only the Relational Transparency factor) and no items were left unassigned.
2.2. Results
2.2.1. Reliability results
Both the ALI and the TLI had acceptable internal consistency reliabilities (≥.70; Nunnally & Bernstein, 1994). The lowest ALI
coefficient alpha was .74, while the highest was .85. Three of the 4 ALI scales had reliabilities of ≥.80 for both McCain and
Obama. Similar results were obtained for the 6 TLI scales, with the lowest reliability being .72. Table 4 presents these results
for each dimension for McCain and Obama separately. Additionally, coefficient alpha for the impression management scale was
computed to be .81.
2.2.2. McCain ALI CFA tests
As shown in the top half of Table 5, the McCain description CFA’s show that the theoretically appropriate 4-factor model
(model c) is a significantly better fit to the data than are the one factor model (model a) (Δχ² = 424.75, df = 6, p < .001) or the
model with four first order and one second order factors (model b) (Δχ² = 13.61, df = 2, p < .01). The comparative fit (CFI) and
non-normed fit (NNFI) indices also support this conclusion, as do the standardized root mean square residuals (Std. RMR) and
root mean square errors of approximation (RMSEA) (Browne & Cudeck, 1993; Hu & Bentler, 1999; Medsker, Williams, & Holahan,
1994).
Testing ALI items 1 and 6, both separately (models d and e) and together (model f) against the 4-factor model (model c) in which
they load on only the dimension for which they were originally written shows that neither item measures one dimension alone.
The chi-square difference tests for model d against model c (Δχ² = 15.16, df = 1, p < .001), model e against model c (Δχ² = 10.11,
df = 1, p < .01), and model f against model c (Δχ² = 19.55, df = 2, p < .001) show all three to be better fits to the data. Testing model
f, with both items cross-loaded, shows it to be a significantly better fit than model d (Δχ² = 4.39, df = 1, p < .05) and model e
(Δχ² = 9.44, df = 1, p < .01). Thus, these results confirm the content validity assessments and further support eliminating items 1
and 6 from the ALI. When this is done, model g is the resultant final model. Table 5 shows that this 14-item 4-correlated factors
model (g) has an excellent fit to the McCain description data, with highly acceptable CFI, NNFI, Std. RMR, and RMSEA indices.
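The chi-square difference tests used to compare nested models can be sketched as follows. The chi-square values and degrees of freedom in the example are taken from Table 5; the function names are illustrative, and the tail probability uses the standard closed-form series for integer degrees of freedom so that the sketch needs only the standard library. It is an approximation of what SEM software reports, not the authors' own procedure.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over r < df/2 of (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x).
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2.0 * density * total


def chi_square_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Return (delta chi-square, delta df, p) for two nested models."""
    delta = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    if delta_df <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    return delta, delta_df, chi2_sf(delta, delta_df)


# McCain data, Table 5: 1-factor model (a) against correlated 4-factor model (c).
delta, delta_df, p = chi_square_difference(760.48, 104, 335.73, 98)
print(round(delta, 2), delta_df, p < 0.001)
```

A significant difference favors the less restricted model; a nonsignificant difference means the more parsimonious (more restricted) model fits about as well.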

Table 4
Scale internal consistency reliabilities.
Coefficient alphas are shown for each sample.
Scale                                   # Items (item numbers)  Study 2, McCain   Study 2, Obama    Study 3, EMBAs
Authentic Leadership Inventory
  Self-Awareness (S), initial           4 (1, 5, 9, 13)         .77               .83               .78
  Self-Awareness (S), final             3 (5, 9, 13)            .74               .79               .70
  Relational Transparency (R), initial  4 (2, 6, 10, 14)        .84               .85               .80
  Relational Transparency (R), final    3 (2, 10, 14)           .81               .80               .77
  Internalized Moral Perspective (M)    4 (3, 7, 11, 15)        .83               .85               .74
  Balanced Processing (B)               4 (4, 8, 12, 16)        .85               .85               .82
Transformational Leadership Inventory (Podsakoff, MacKenzie, Moorman & Fetter, 1990)
  Vision                                5                       .88               .85               .83
  Modeling                              3                       .88               .88               .88
  Group goals                           4                       .89               .87               .85
  High performance expectations         3                       .78               .72               .76
  Individualized support                4                       .77               .82               .81
  Intellectual stimulation              4                       .90               .89               .90

Table 5
McCain and Obama ALI measurement model confirmatory factor analysis results.
Model key: (a) 16-item 1-factor model; (b) 16-item second order model; (c) 16-item 4-factor model; (d) 16-item 4-factor, item 1 cross loading; (e) 16-item 4-factor, item 6 cross loading; (f) 16-item 4-factor, items 1 and 6 cross loading; (g) 14-item 4-factor model.
                    (a)           (b)           (c)           (d)           (e)           (f)           (g)
Degrees of freedom  104           100           98            97            97            96            71
McCain
  Chi-squared       760.48        349.34        335.73        320.57        325.62        316.18        238.13
  CFI               .84           .94           .94           .95           .95           .95           .95
  NNFI              .82           .93           .93           .93           .93           .93           .94
  Std. RMR          .067          .049          .047          .046          .046          .046          .045
  RMSEA             .130          .072          .071          .069          .071          .069          .069
  RMSEA CI          (.120, .130)  (.064, .080)  (.063, .080)  (.061, .077)  (.063, .079)  (.061, .077)  (.059, .078)
Obama
  Chi-squared       590.74        343.91        343.94        343.93        341.32        341.32        262.47
  CFI               .90           .95           .95           .95           .95           .95           .95
  NNFI              .88           .94           .94           .94           .94           .94           .94
  Std. RMR          .049          .039          .038          .038          .038          .038          .038
  RMSEA             .100          .070          .071          .072          .071          .072          .074
  RMSEA CI          (.094, .110)  (.062, .078)  (.063, .079)  (.064, .080)  (.063, .079)  (.063, .080)  (.065, .084)
Note. See the text for a description of each model. Abbreviations used are: CFI= comparative fit index, NNFI = Non-Normed Fit Index, Std. RMR = standardized
root mean square residual, RMSEA = root mean square error of approximation, and CI = confidence interval.

2.2.3. Obama ALI CFA tests
As shown in the bottom half of Table 5, the Obama description CFA’s show a very different picture than do the McCain data. Here,
it can be seen that the one first-order factor model (model a) is clearly inferior to all of the other 16-item models (the weakest results
are for model c, with Δχ² = 246.80, df = 6, p < .001). Additionally, the three correlated 4-factor models that have items 1 and/or 6 loading
on a second factor (models d, e, and f) are not significantly better fits to the data than is the model without cross loadings (model
c). Finally, for the Obama data it can be seen that the model with four correlated first order factors (model c) is not a significantly better
fit than is the model with four first order and one second order factors (model b). These results clearly show that the appropriateness
of considering perceived authentic leadership to be a higher order and more general unitary construct depends upon who is
being described. In the case of McCain, such a conceptualization is not supported by the data. With respect to Obama, seeing perceived
authentic leadership as a global construct appears warranted.
Given the results obtained for the McCain descriptions and the earlier content validity assessment results, we decided that the
most conservative treatment of the new ALI would be to eliminate items 1 and 6 from the final instrument. Doing this for the
Obama descriptions results in a good fitting 14-item model that has 4 first order and 1 second order factors (χ² = 266.19, df = 73).
However, a 14-item model that has only 4 first order factors (and no second order factor) is about as good a fit to the data
(χ² = 262.47, df = 71), and the difference between the two models (Δχ² = 3.72, df = 2, p > .15) is nonsignificant. Employing the criterion
of parsimonious fit (e.g., Browne & Cudeck, 1993; Medsker, Williams & Holahan, 1994), the 14-item model with 4 first order
factors is considered the better portrayal of the data.
The bottom half of Table 5 shows the 14-item 4-factor Obama results in the last (right-hand) column. As shown there, the result
is a very good fitting model, with CFI, NNFI, Std. RMR, and RMSEA indices that are highly acceptable. Consequently, we conclude
that the final version of the ALI (at least for now) should have 14 items—three measuring Self-Awareness and three
measuring Relational Transparency, with four (each) measuring Internalized Moral Perspective and Balanced Processing. As
shown in Table 4, deleting items 1 and 6 from the ALI produces a slight (.03) decrement in internal consistency reliabilities for
the two affected scales but we believe that this is more than offset by the improvement of these scales’ theoretical content validity
and empirical differentiation.
Tables 6 and 7 present the obtained standardized parameter estimates for the 14-item correlated 4-factor (model g) portrayals
of the McCain and Obama data, respectively. As can be seen by comparing the two tables, all of the item loadings are relatively
high and reasonably similar for the two candidates. However, the patterns of factor intercorrelations are notably different. Although
all of the McCain and Obama factor intercorrelations are significantly less than 1.0 (p < .001), the McCain intercorrelations
show better evidence of respondent perceptual discrimination among the four dimensions. While all of the Obama intercorrelations
are of magnitude .80 or greater, the McCain data show only two intercorrelations in excess of .80, while there are three in
the .73 to .77 range and one has a value of .59. This demonstrates why a second order factor was supported in the Obama data and
not in the McCain data and reinforces the potential importance of not assuming that perceived authentic leadership is universally
a unitary or higher-order (global) construct.
2.2.4. Combined ALI and TLI results
The CFA’s of the 37 ALI and TLI items combined clearly support the model with 10 first order factors over all four rival models—
for both the McCain and Obama descriptions. These results are presented in Table 8.

Table 6
McCain standardized ALI CFA results.
Factor loadings
Item 1 2 3 4 Theta–delta
1 0.71 0.49
2 0.78 0.40
3 0.61 0.62
4 0.78 0.39
5 0.74 0.46
6 0.79 0.38
7 0.70 0.51
8 0.75 0.44
9 0.77 0.40
10 0.76 0.42
11 0.72 0.48
12 0.86 0.26
13 0.75 0.43
14 0.76 0.43
Factor intercorrelations
Item 1 2 3 4
1 1.00
2 .88 1.00
3 .73 .74 1.00
4 .82 .77 .59 1.00
Note. All parameter estimates shown are statistically significant (p < .001).

For the McCain descriptions, as shown in the top half of Table 8, testing the 10 correlated factors model (model e) against the
(a) 1-factor, (b) 2-factor, (c) one second order factor, and (d) two second order factors models yields chi-square differences of
1785.53 (df = 45), 1302.06 (df = 44), 419.44 (df = 35), and 293.42 (df = 34), respectively, all statistically significant at p < .001.
The CFI, NNFI, Std. RMR, and RMSEA fit indices all also clearly show the superiority of the 10-factor model over the others.
The Obama descriptions are shown in the bottom half of Table 8, and they yield highly similar findings. Again, testing the 10 correlated
factors model (model e) against the (a) 1-factor, (b) 2-factor, (c) one second order factor, and (d) two second order factors
models yields chi-square differences of 1167.60 (df = 45), 673.90 (df = 44), 301.13 (df = 35), and 120.76 (df = 34), respectively, all
statistically significant at p < .001. The CFI, NNFI, Std. RMR, and RMSEA fit indices all also show the superiority of the 10-factor
model over the others, although model d, with two second order factors, has fit indices that are close to those of the 10-factor model.

Table 7
Obama standardized ALI CFA results.
Factor loadings
Item 1 2 3 4 Theta–delta
1 0.79 0.38
2 0.83 0.32
3 0.64 0.59
4 0.81 0.34
5 0.75 0.43
6 0.72 0.48
7 0.81 0.34
8 0.73 0.47
9 0.72 0.48
10 0.77 0.41
11 0.71 0.50
12 0.83 0.31
13 0.83 0.32
14 0.71 0.50
Factor intercorrelations
Item 1 2 3 4
1 1.00
2 .89 1.00
3 .84 .85 1.00
4 .88 .83 .80 1.00
Note. All parameter estimates shown are statistically significant (p < .001).

Table 8
McCain and Obama ALI and TLI discriminant validity tests.
Model key: (a) 1-factor; (b) 2 correlated factors; (c) one 2nd order factor; (d) two 2nd order factors; (e) 10 correlated factors.
                    (a)           (b)           (c)           (d)           (e)
Degrees of freedom  629           628           619           618           584
McCain
  Chi-squared       3433.95       2950.18       2067.56       1941.42       1648.12
  CFI               .79           .82           .89           .90           .92
  NNFI              .78           .81           .88           .89           .91
  Std. RMR          .062          .056          .054          .050          .042
  RMSEA             .110          .100          .072          .068          .061
  RMSEA CI          (.110, .120)  (.097, .100)  (.068, .075)  (.065, .072)  (.058, .065)
Obama
  Chi-squared       2931.59       2437.89       2065.12       1884.75       1763.99
  CFI               .83           .87           .89           .91           .91
  NNFI              .82           .86           .88           .90           .90
  Std. RMR          .054          .048          .049          .044          .042
  RMSEA             .100          .084          .076          .070          .069
  RMSEA CI          (.098, .100)  (.081, .087)  (.073, .079)  (.067, .073)  (.065, .072)
Note. See the text for a description of each model. Abbreviations used are: CFI= comparative fit index, NNFI = Non-Normed Fit Index, Std. RMR = standardized
root mean square residual, RMSEA = root mean square error of approximation, and CI = confidence interval.

Examining the disattenuated latent variable (factor) correlations (i.e., corrected for measurement error; Jöreskog & Sörbom,
2006) between the 4 authentic leadership dimensions and the 6 transformational leadership dimensions shows many to be relatively
high, but all are significantly less than 1.0 (p < .001). Perhaps reflecting the ALI results, the ALI dimensions have lower correlations
with the TLI dimensions in the McCain data (the highest correlations for Self-Awareness, Relational Transparency,
Internalized Moral Perspective, and Balanced Processing are .72, .81, .76, and .78, respectively, while the corresponding highest
correlations for the Obama descriptions are .86, .83, .81, and .82). These findings as a set further support the discriminant validity
of the ALI and the TLI and also that perceived authentic leadership is different from perceived transformational leadership. They
also support the earlier ALI findings that indicate that relationships among the first order leadership constructs may vary depending
on the person(s) being described and this finding, along with others, is further considered in the discussion section that
follows.
2.2.5. ALI correlations with impression management
Seven of the eight ALI correlations with the Paulhus (1998) impression management measure are nonsignificant. For McCain,
the correlations for Self-Awareness, Relational Transparency, Internalized Moral Perspective, and Balanced Processing are .09
(p < .05), .07, .06, and .06 (respectively). For Obama, the correlations are .04, .03, −.03, and .03. Since a large sample is involved
(N=499), even the one significant correlation (.09) does not appear problematic, as it indicates only 0.8% shared variance — a
trivial amount at best.
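The arithmetic behind this judgment can be sketched briefly. The shared variance is simply the squared correlation; the significance check below converts the correlation to a t statistic and, as a simplifying assumption that is reasonable for N = 499, approximates its two-tailed p value with the standard normal distribution. The function names are illustrative, and the exact test used in the study is not stated in the text.

```python
import math


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation."""
    return r * r


def correlation_p_value(r, n):
    """Approximate two-tailed p value for a correlation (large-sample normal approximation)."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


for r in (0.09, 0.07):
    print(r, f"{shared_variance(r):.2%}", correlation_p_value(r, 499) < 0.05)
```

With a sample this large, a correlation can reach conventional significance while accounting for less than one percent of the variance, which is the point made in the text.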
2.3. Discussion
Our results support the ALI’s lack of confounding with impression management (social desirability), as well as its internal reliability
and factor structure, in addition to its discriminant validity vis-à-vis the TLI. Overall, the analyses did not support treating authentic
leadership or transformational leadership as universally global or aggregate constructs. Thus, future research might be best served by
using the dimensions or scales separately rather than combining them all into a global measure. Coupled with the strong content validity
results obtained in Study 1, these findings suggest that the ALI appears psychometrically sound and worthy of further testing.
We now turn to such further testing in a third study that is designed to provide additional psychometric evidence concerning the ALI.

3. Study 3: further testing of the ALI
Study 3 was undertaken to replicate all of the analyses of Study 2 using a more conventional sample. It was also designed to formally
examine the ALI’s convergent and discriminant validity and to allow the testing of the ALI’s concurrent validity with three of the most
commonly used dependent variables in leadership research: general job satisfaction, satisfaction with supervision, and organizational
commitment (Dumdum, Lowe & Avolio, 2002; Podsakoff, Bommer, Podsakoff & MacKenzie, 2006).
Some of the more informative evidence concerning construct validity can be obtained by examining the extent to which scales
correlate with other measures designed to assess similar constructs (convergent validity) and the extent to which they do not
correlate with measures of dissimilar constructs (discriminant validity) (cf. Hinkin, 1995; Nunnally & Bernstein, 1994). Although
convergent and discriminant validity can be tested using a number of different techniques, probably the most rigorous involves
conducting a CFA of a multitrait–multimethod matrix (MTMM) (Campbell & Fiske, 1959; Kenny & Kashy, 1992; Marsh, 1989).

3.1. ALI–ALQ theoretical correspondence
Since the ALI contains four dimensions that are conceptually identical to those on the ALQ (Walumbwa, Avolio, Gardner,
Wernsing & Peterson, 2008), conducting an MTMM analysis using both instruments appears reasonable. However, because our
intent was not to validate the ALQ, we decided to employ only the two sample ALQ items for each subscale that are presented
in Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008). Unfortunately, however, since our earlier analyses (Study 1)
showed one of the eight Walumbwa et al. items to have questionable content validity, we created a substitute item for administration
to our Study 3 sample and for our MTMM analyses. This item was, “My leader is open about all information, even if it
shows that he/she made a mistake,” and it replaced ALQ item 4.
3.2. Sample and procedure
The sample was collected using a snowballing technique (e.g., Gosserand & Diefendorff, 2005; Martins, Eddleston, & Veiga, 2002;
Tepper, 1995) in which 38 full-time employed executive M.B.A. (EMBA) students each obtained completed survey questionnaires
from five employed individuals with whom they were very familiar. The criteria used for selecting the five were that they were a mixture
of different genders, ages, and ethnic backgrounds and that each worked full-time in a mid-level position in a different organization.
Of the total of 229 respondents (including the students), complete data were obtained from 228, who comprised the final
sample. The average age of the sample was 32.31 years old (SD=9.52), 44.5% were males, and they reported working an average
of 43.46 h/week (SD=8.19). Ethnically, the sample was 6.2% Black Non-Hispanic, 58.1% White Non-Hispanic, 26.9% Hispanic, 5.3%
Asian, and 3.5% Other.
3.3. Measures
The survey questionnaire that was administered asked the respondents to describe their current supervisors using the final
version of the ALI from Study 2 (with two 3- and two 4-item scales), eight ALQ items (two items for each dimension, with the
one item substitution noted above), and the 23-item TLI. Additionally, the survey contained a demographics section (which
was used to obtain the sample description presented above) and also measures of impression management, satisfaction, and commitment.
The cover letter to the survey stressed the voluntary nature of the study and guaranteed the respondents complete
anonymity.
As in Study 2, the Paulhus (1998) Version 7 of the Balanced Inventory of Desirable Responding was employed to measure impression
management or the tendency to provide socially desirable survey responses.
Additionally, the two 5-item (each) Satisfaction with Supervisor Human Relations and Satisfaction with Supervisor Technical Ability
subscales of the long form (100-item) MSQ (Weiss, Dawis, Lofquist & England, 1967) were employed and used together to measure
overall Satisfaction with Supervision. The twenty-item MSQ short form was used to measure overall Job Satisfaction. The MSQ
was employed because it has substantial reliability and validity data, and it is one of the most commonly used job satisfaction scales
(e.g., Dunham, Smith, & Blackburn, 1977; Gillet & Schwab, 1975; Schriesheim, Tetrault, Kinicki, & Carson, 1989; Wanous, 1974). Additionally,
the MSQ appears to have a higher proportion of true score (non-error and non-method) variance than a number of other
established satisfaction measures (Schriesheim, Tetrault, Kinicki & Carson, 1989). The MSQ employs a five-point Likert scale (from
very dissatisfied to very satisfied) and two sample Satisfaction with Supervision items are, “The way my supervisor and I understand
each other,” and “The technical ‘know-how’ of my supervisor.” (Two MSQ short form items are, “Being able to keep busy all the time,”
and “My pay and the amount of work I do.”)
The Porter, Steers, Mowday, and Boulian (1974) organizational commitment questionnaire (OCQ) was used to measure employee
commitment to the company in Study 3. This 15-item instrument uses a 7-point Likert response scale (from “Strongly disagree” to
“Strongly agree”). The OCQ has been widely used and subjected to considerable psychometric examination; data pertaining to its reliability
and validity are extensive and generally quite positive (e.g., Mowday, Porter, & Steers, 1982; Mowday, Steers, & Porter, 1979).
Two sample items are, “I feel very little loyalty to this company” (reverse-scored), and “I really care about the fate of this company.”
3.4. Methods of analysis
3.4.1. Replication of Study 2
All of the analyses of Study 2 were repeated using the same methods and the data from Study 3.
3.4.2. MTMM CFA models
One of the more recent advances in the application of CFA to the examination of data structures is the advent of the correlated
uniqueness model for examining MTMM data. Historically, MTMM data have generally been examined using models which specified
separate trait, method, and error (uniqueness) factors (cf. Schmitt & Stults, 1986; Widaman, 1985). In these models, the uniqueness
factors are completely uncorrelated within themselves or with any other factors; however, the trait factors are correlated among
themselves, as are the method factors (the traits are uncorrelated with the methods).
While allowing the examination of MTMMs from a perspective similar to that of Campbell and Fiske (1959), these method factor
models have consistently encountered a number of problems. Sometimes they have had identification problems that have led to a
failure of parameter estimates to converge. Often, they have had parameter estimates outside of the range that is theoretically

admissible (e.g., negative error variances, factor correlations in excess of 1.0, etc.). Finally, although there is no way of being sure,
method factor models are suspected of overestimating method effects, by including true trait variance as part of the method variance
estimates (see Kenny & Kashy, 1992, and Marsh, 1989, for further details).
Correlated uniqueness models, in contrast to method factor models, have been shown to be more likely to be identified and to
have parameter estimates which are theoretically admissible (Kenny & Kashy, 1992; Marsh, 1989). These models specify that the
latent variables (“trait factors”) are intercorrelated among themselves, and that the uniqueness or error terms for each separate
method are correlated among themselves. However, the uniquenesses are not correlated across methods, nor are they correlated
with the traits. As Kenny and Kashy (1992, p. 169) note, “restricting the method–method correlations to zero seems to be a very
strong assumption” but is generally needed for model identification. However, not estimating method–method covariances
causes the amount of trait–trait covariance to be overestimated, yielding conservative estimates of discriminant validity
(Kenny & Kashy, 1992).
To simplify the modeling process and to maximize the likelihood of model convergence, the Study 3 MTMM analyses
employed the correlated uniqueness model with input data being an ALI–ALQ scale variance–covariance matrix. Consequently,
a single indicator approach was used (cf. Williams & Hazer, 1986), setting the scale error variances equal to the product of the
scale’s variance times one minus its internal consistency reliability.
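The single-indicator rule stated above amounts to one line of arithmetic. In the sketch below the function name is illustrative and the scale variance is a hypothetical value; the reliability of .70 is the Study 3 alpha reported in Table 4 for the final Self-Awareness scale.

```python
def single_indicator_error_variance(scale_variance, reliability):
    """Error variance to fix for a single-indicator latent variable.

    Follows the rule in the text: scale variance times (1 - reliability).
    """
    if scale_variance < 0:
        raise ValueError("variance cannot be negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return scale_variance * (1.0 - reliability)


# Hypothetical scale variance of 0.64 with a coefficient alpha of .70.
print(round(single_indicator_error_variance(0.64, 0.70), 3))
```

Fixing the error variance in this way lets the model treat the remaining variance as true-score variance, so less reliable scales are corrected more heavily.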
As in Study 2, we used the maximum likelihood estimation procedures of LISREL (Jöreskog & Sörbom, 2006). Following the
recommendations of Marsh (1989) and Kenny and Kashy (1992), the modeling process followed the basic sequence outlined
in Widaman (1985) and Schmitt and Stults (1986). An initial or “full model” was first fit, specifying four trait factors, with
each of the ALI and ALQ measures loading on only the one factor which it theoretically represents. The correlations among
these trait factors were freely estimated, as were correlations among the uniqueness terms for the ALI and for the ALQ. However,
correlations between the ALI and ALQ uniquenesses were not permitted, nor were correlations permitted between the trait and
uniqueness factors.
The second model which was fit allowed testing for discriminant validity by constraining the trait factor intercorrelations to
equal 1.0. The third model allowed testing for convergent validity by eliminating the trait factors altogether. Finally, the fourth
and last model assessed method bias effects by eliminating all uniqueness correlations (cf. Schmitt & Stults, 1986; Widaman,
1985).
Our comparisons of model fit are based upon the chi-square difference test, which allows the examination of statistical significance.
Additionally, Marsh (1989) specifically recommends the use of the Tucker and Lewis Index (Tucker & Lewis, 1973), also
known as the Non-Normed Fit Index (NNFI), to assess MTMM model fit with the correlated uniqueness model. Finally, as discussed
by Medsker, Williams and Holahan (1994), Hu and Bentler (1999), and Browne and Cudeck (1993), we also use the standardized
root mean square residual (Std. RMR) and the root mean square error of approximation (RMSEA) (and its confidence interval) as additional
indicators of model fit.
3.5. Results
3.5.1. Reliability results
The extreme right-hand column of Table 4 presents the ALI and TLI internal consistency reliabilities. Additionally, the obtained reliabilities
for the 2-item ALQ scales are .64, .65, .71, and .72 for Self-Awareness, Relational Transparency, Internalized Moral Perspective,
and Balanced Processing, respectively. The reliabilities for the Impression Management, General Job Satisfaction, Supervision Satisfaction,
and Organizational Commitment measures are .71, .91, .94, and .88 (respectively). Thus, except for the two short ALQ measures, all
are at or above .70 and therefore acceptable (Nunnally & Bernstein, 1994). (While the two ALQ reliabilities are somewhat disappointing,
these scales are used below only for the MTMM analyses; it should be noted that these analyses correct estimates of convergent and
discriminant validity and method variance for measurement error, thus minimizing the effect of low scale reliability on obtained
results.)
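To make the idea of correcting for measurement error concrete, the sketch below applies the classical (Spearman) correction for attenuation. This is offered only as a simpler analogue: the analyses reported here correct for measurement error within the latent-variable CFA itself, not with this formula. The two reliabilities are the reported 2-item ALQ values for Self-Awareness and Relational Transparency; the observed correlation of .50 and all function and variable names are hypothetical and chosen purely for illustration.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation: r_observed / sqrt(rel_x * rel_y)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Reported 2-item ALQ reliabilities (Self-Awareness, Relational Transparency).
REL_SELF_AWARENESS = 0.64
REL_RELATIONAL_TRANSPARENCY = 0.65

# Hypothetical observed correlation, used only for illustration (not a study result).
R_OBSERVED_HYPOTHETICAL = 0.50

corrected = disattenuate(
    R_OBSERVED_HYPOTHETICAL, REL_SELF_AWARENESS, REL_RELATIONAL_TRANSPARENCY
)
print(f"observed r = {R_OBSERVED_HYPOTHETICAL:.2f}, corrected r = {corrected:.2f}")
```

With reliabilities this low, a hypothetical observed correlation of .50 corresponds to a corrected value of roughly .78, which illustrates why uncorrected correlations among short scales can understate the underlying relationships.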
3.5.2. ALI CFA tests
As shown in Table 9, the EMBA data clearly support the superiority of the four-factor model (model c) and the second-order
factor model (model b) over the one-factor model, as indicated by both chi-square difference tests (Δχ²=121.06, df=6, and
Δχ²=120.47, df=4, respectively, both p<.001) and differences in the other fit indicators. However, there is no significant chi-square
difference or meaningful differences in fit indicators between the four-factor model and the second-order factor model.
Thus, the rule of parsimony suggests that the second-order factor model should be considered the better portrayal of the data.
Table 10 presents the completely standardized second-order factor results. As can be seen from Table 10, the ALI factor loadings
are slightly lower but generally comparable to the earlier McCain and Obama results. One noteworthy exception is ALI item 9,
which has a mediocre loading on its assigned factor (.46) and a relatively high level of measurement error (.79) (suggesting that
researchers need to monitor the performance of this item in the future). The loadings of the first-order factors on the second-order
factor are relatively high (.83 to .89), as might be expected given that overall support was found for a higher-order factor.
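The nested-model comparisons described above can be reproduced from the chi-square values and degrees of freedom reported in Table 9. The following is a minimal sketch using only the Python standard library; the function names are our own, and the upper-tail probability uses the standard closed-form series for integer degrees of freedom rather than any particular statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x).
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * phi * total


def chi2_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Chi-square difference test for a restricted model nested in a general one."""
    delta_chi2 = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# (chi-square, df) as reported in Table 9.
TABLE_9 = {
    "a: one-factor": (276.03, 77),
    "b: second-order": (155.56, 73),
    "c: four-factor": (154.97, 71),
}

comparisons = [
    ("a: one-factor", "c: four-factor"),
    ("a: one-factor", "b: second-order"),
    ("b: second-order", "c: four-factor"),
]
for restricted, general in comparisons:
    d_chi2, d_df, p = chi2_difference(*TABLE_9[restricted], *TABLE_9[general])
    print(f"{restricted} vs {general}: delta chi2 = {d_chi2:.2f}, df = {d_df}, p = {p:.4g}")
```

The first two comparisons reproduce the reported differences of 121.06 (df=6) and 120.47 (df=4). The third yields a difference of 0.59 on 2 degrees of freedom (p of roughly .74), which is the nonsignificant result that motivates preferring the more parsimonious second-order model.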
3.5.3. ALI and TLI CFA results
As shown in Table 11, the CFAs of the 37 ALI and TLI items together clearly support the model with 10 first-order factors over all
four rival models. Testing the 10 correlated factors model (model e) against the (a) 1-factor, (b) 2-factor, (c) one second-order factor,
and (d) two second-order factors models yields chi-square differences of 871.84 (df=45), 670.09 (df=44), 187.33 (df=35), and

Table 9
EMBA ALI measurement model confirmatory factor analysis results.
Fit statistic        (a) 14-item        (b) 14-item second-order   (c) 14-item
                     1-factor model     factor model               4-factor model
Degrees of freedom   77                 73                         71
Chi-squared          276.03             155.56                     154.97
CFI                  .85                .94                        .94
NNFI                 .82                .92                        .92
Std. RMR             .064               .053                       .054
RMSEA                .110               .073                       .074
RMSEA CI             (.196, .120)       (.058, .088)               (.059, .089)
Note. See the text for a description of each model. Abbreviations used are: CFI = comparative fit index, NNFI = Non-Normed Fit Index, Std. RMR = standardized
root mean square residual, RMSEA = root mean square error of approximation, and CI = confidence interval.

135.38 (df=34), respectively, all statistically significant at p<.001. The CFI, NNFI, Std. RMR, and RMSEA fit indices all also show the
superiority of the 10-factor model over the others.
As in Study 2, examining the disattenuated latent variable (factor) correlations (i.e., corrected for measurement error; Jöreskog &
Sörbom, 2006) between the 4 authentic leadership dimensions and the 6 transformational leadership dimensions shows many to be
relatively high, but all are significantly less than 1.0 (p<.001) and the pattern is similar to the McCain data of Study 2, where these
correlations were consistently lower than those for Obama (the highest correlations for Self-Awareness, Relational Transparency, Internalized
Moral Perspective, and Balanced Processing are .75, .74, .81, and .76 respectively; as mentioned above, the corresponding
McCain correlations were .72, .81, .76, and .78, while those for Obama were .86, .83, .81, and .82). These findings as a set further support
the discriminant validity of the ALI and the TLI and also that perceived authentic leadership is different from perceived transformational
leadership. They also support the Study 2 ALI findings indicating that relationships among the first order leadership
constructs may vary depending on the person(s) being described.
3.5.4. ALI correlations with impression management
Three of the four ALI correlations with the Paulhus (1998) impression management measure are nonsignificant. Only the correlation
for Balanced Processing is statistically significant (r=.13, p<.05), while the correlations for Self-Awareness, Relational
Transparency, and Internalized Moral Perspective are not (they are −.04, .06, and .03, respectively). However, as for Study 2,
the one significant correlation (.13) does not appear problematic, as it indicates only 1.7% shared variance between the ALI
scale and impression management.
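The shared-variance figure quoted above is simply the squared correlation expressed as a percentage. As a brief illustration (the function name is our own), the same arithmetic can be applied to all four reported ALI correlations with impression management:

```python
def shared_variance_percent(r):
    """Percentage of variance two variables share when correlated at r (100 * r squared)."""
    return 100.0 * r * r


# Reported ALI correlations with the impression management measure.
ALI_IM_CORRELATIONS = {
    "Self-Awareness": -0.04,
    "Relational Transparency": 0.06,
    "Internalized Moral Perspective": 0.03,
    "Balanced Processing": 0.13,
}

for scale, r in ALI_IM_CORRELATIONS.items():
    print(f"{scale}: r = {r:+.2f}, shared variance = {shared_variance_percent(r):.1f}%")
```

Even for Balanced Processing, the only significant correlation, the overlap with impression management remains under 2% of the variance.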

Table 10
EMBA standardized ALI CFA results.
Factor loadings
Item 1 2 3 4 Theta–epsilon
1 0.69 0.52
2 0.78 0.39
3 0.56 0.68
4 0.72 0.49
5 0.63 0.61
6 0.90 0.19
7 0.80 0.36
8 0.70 0.51
9 0.46 0.79
10 0.64 0.59
11 0.69 0.53
12 0.79 0.38
13 0.69 0.53
14 0.76 0.42
Factor intercorrelations
Item 1 2 3 4 2nd order
1 1.00
2 .70 1.00
3 .74 .75 1.00
4 .73 .74 .78 1.00
2nd order .83 .84 .89 .87 1.00
Note. Second-order factor correlations shown are the same as the loadings of the first-order factors on the second-order factor. All parameter estimates shown are
statistically significant (p<.001).

Table 11
EMBA ALI and TLI discriminant validity tests.
Fit statistic        (a) 1-factor   (b) 2 correlated   (c) One 2nd-order   (d) Two 2nd-order   (e) 10 correlated
                                    factors            factor              factors             factors
Degrees of freedom   629            628                619                 618                 584
Chi-squared          2010.08        1808.33            1325.57             1273.62             1138.24
CFI                  .73            .77                .86                 .87                 .89
NNFI                 .72            .76                .85                 .86                 .88
Std. RMR             .079           .075               .068                .065                .042
RMSEA                .120           .100               .072                .068                .062
RMSEA CI             (.110, .120)   (.097, .110)       (.067, .077)        (.063, .074)        (.056, .068)
Note. See the text for a description of each model. Abbreviations used are: CFI = comparative fit index, NNFI = Non-Normed Fit Index, Std. RMR = standardized
root mean square residual, RMSEA = root mean square error of approximation, and CI = confidence interval.

3.5.5. MTMM CFA results
Table 12 presents the summary of model fit statistics for the four rival models involved in performing a correlated error MTMM
CFA. As shown in Table 12, the full model (model a), supporting the presence of convergent and discriminant validity and method bias
effects, has both clearly superior fit indices and a significantly lower chi-square value than do the models indicating a lack of convergent
validity (model b), lack of discriminant validity (model c), or lack of method bias (model d). The chi-square difference values
show this, with differences of 556.72 (df=14), 272.73 (df=6), and 69.12 (df=12) (respectively), all of which are statistically significant
at p<.001. Additionally, the full model has a nonsignificant chi-square value (p>.05), indicating that it is a very good fit to the
data.
Table 13 presents the completely standardized full model. As can be seen there, all of the scales have very high factor
loadings, indicative of high levels of convergent validity. Not surprisingly, the ALI loadings are slightly higher than the
ALQ loadings, possibly reflecting the fact that 3- and 4-item measures are better construct indicators than are 2-item measures.
The correlations among the four authentic leadership dimensions are very much in line with our previous findings in
that, while they are relatively high, they are all significantly different from 1.0 (p<.001).
Examining the error intercorrelations shows that, while there is significant overall error covariance, seven of the twelve error
intercorrelations are both trivial and nonsignificant (and the highest significant term is only .14). Thus, while the analysis shows
evidence of significant method effects, these effects are not strong or powerful, certainly when contrasted with the convergent
validities of the scales.
3.5.6. Concurrent validity results
Table 14 shows the intercorrelations among the ALI scales and the three dependent variables. As might be expected, all twelve correlations
are statistically significant. However, the pattern of correlations seems noteworthy since it also fits what might be expected
from a nomological network of variables involving authentic leadership. Here, the highest dependent variable correlations would be
expected for Satisfaction with Supervision, and this occurs without exception (average r=.60). The next highest relationships would
be expected for MSQ General Satisfaction, since it contains 2 supervision items as well as additional items that are possibly affected by
supervisory actions (such as recognition and how company policies are administered). Again, the data support this without exception
(average r=.42). Finally, organizational commitment would be expected to show the weakest correlations with the four ALI dimensions,
since it is the product of many more factors than just the supervisor and his or her behavior (cf. Mowday, Porter & Steers, 1982,
1979). This expectation is also met (average r=.30).
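The averages and the "without exception" ordering described above can be checked directly from the correlations reported in Table 14. The short sketch below (variable names are our own) computes the mean correlation for each dependent variable and verifies that every ALI scale follows the expected ordering.

```python
from statistics import mean

ALI_SCALES = [
    "Self-Awareness",
    "Relational Transparency",
    "Moral Perspective",
    "Balanced Processing",
]

# Correlations of the four ALI scales (in the order above) with each dependent
# variable, as reported in Table 14.
TABLE_14 = {
    "Supervision Satisfaction": [0.58, 0.60, 0.59, 0.62],
    "General Satisfaction": [0.40, 0.39, 0.40, 0.48],
    "Organizational Commitment": [0.28, 0.29, 0.33, 0.29],
}

averages = {dv: mean(rs) for dv, rs in TABLE_14.items()}
for dv, avg in averages.items():
    print(f"{dv}: average r = {avg:.2f}")

# Expected ordering for every ALI scale: supervision > general > commitment.
ordering_holds = all(
    sup > gen > com
    for sup, gen, com in zip(
        TABLE_14["Supervision Satisfaction"],
        TABLE_14["General Satisfaction"],
        TABLE_14["Organizational Commitment"],
    )
)
print("Expected ordering holds for all four ALI scales:", ordering_holds)
```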
Table 15 presents the results of three linear multiple regression analyses, regressing each dependent variable against the four
ALI measures simultaneously. Here, we would argue that the discriminant validity, as well as the individual usefulness, of each
scale should be indicated by obtaining different patterns of statistical significance in “predicting” the three dependent variables.
The results bear this out.
For the General Satisfaction variable, only Balanced Processing is a statistically significant independent variable, while for Organizational
Commitment, only Internalized Moral Perspective is significant. On the other hand, all four measures are statistically significant

Table 12
EMBA ALI–ALQ multitrait–multimethod matrix analyses.
Fit statistic        (a) Full model   (b) Lack of convergent   (c) Lack of discriminant   (d) Lack of
                                      validity                 validity                   method bias
Degrees of freedom   10               24                       16                         22
Chi-squared          16.92⁎           573.64                   289.65                     86.04
CFI                  .99              .57                      .78                        .95
NNFI                 .98              .49                      .62                        .94
Std. RMR             .023             .360                     .100                       .039
RMSEA                .047             .210                     .340                       .220
RMSEA CI             (.000, .093)     (.180, .230)             (.320, .370)               (.082, .130)
Note. See the text for a description of each model. Abbreviations used are: CFI = comparative fit index, NNFI = Non-Normed Fit Index, Std. RMR = standardized
root mean square residual, RMSEA = root mean square error of approximation, and CI = confidence interval.
⁎ p>.05.

Table 13
EMBA ALI–ALQ multitrait–multimethod full model factor results.
Factor loadings
Scale    1      2      3      4
ALI-S    .84
ALI-R           .88
ALI-M                  .86
ALI-B                         .90
ALQ-S    .80
ALQ-R           .82
ALQ-M                  .84
ALQ-B                         .85
Factor intercorrelations
         1      2      3      4
1        1.00
2        .78    1.00
3        .62    .81    1.00
4        .82    .73    .76    1.00
Error intercorrelations
         ALI-S   ALI-R   ALI-M   ALI-B   ALQ-S   ALQ-R   ALQ-M   ALQ-B
ALI-S    .30
ALI-R    .02*    .23
ALI-M    .10     −.04*   .26
ALI-B    −.04*   .04*    .02*    .18
ALQ-S                                    .37
ALQ-R                                    .08     .32
ALQ-M                                    .04*    .09     .29
ALQ-B                                    .14     .09     .00*    .29
Note. Scale abbreviations used are: S = Self-Awareness, R = Relational Transparency, M = Internalized Moral Perspective, and B = Balanced Processing. All parameter
estimates shown without an asterisk (*) are statistically significant (p<.01).

in the analysis with Supervision Satisfaction. Again, this not only supports the concurrent validity of the new ALI scales but also their distinctiveness
and probable usefulness for future research on the antecedents and effects of authentic leadership.
Repeating both the correlation and regression analyses presented above while statistically controlling for the impression management
variable produced virtually identical results. This suggests that bias from impression management may not be problematic for
studies employing the ALI and satisfaction and commitment variables (tabular results are available from the authors upon request).
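Two standard identities connect the quantities reported for the regressions in Table 15: the overall F statistic follows from R² and the degrees of freedom, and each standardized coefficient (Beta) follows from the unstandardized coefficient (B) and the standard deviations in Table 14. The sketch below is illustrative only; the function names are our own, the inputs are the rounded values reported in the tables, and the results can therefore be expected to agree with the published figures only approximately.

```python
def f_from_r_squared(r_squared, df_model, df_residual):
    """Overall regression F statistic implied by R-squared and its degrees of freedom."""
    return (r_squared / df_model) / ((1.0 - r_squared) / df_residual)


def standardized_beta(b, sd_predictor, sd_outcome):
    """Standardized coefficient implied by an unstandardized B and the two SDs."""
    return b * sd_predictor / sd_outcome


DF_MODEL, DF_RESIDUAL = 4, 223  # from F[4,223] in Table 15

# (reported R-squared, reported F) for two of the Table 15 regressions.
REGRESSIONS = {
    "Supervision Satisfaction": (0.517, 59.64),
    "Organizational Commitment": (0.133, 8.58),
}
for dv, (r2, f_reported) in REGRESSIONS.items():
    f_implied = f_from_r_squared(r2, DF_MODEL, DF_RESIDUAL)
    print(f"{dv}: implied F = {f_implied:.2f}, reported F = {f_reported:.2f}")

# (B, SD of predictor, SD of outcome, reported Beta); SDs are taken from Table 14.
COEFFICIENTS = {
    "Self-Awareness -> Supervision Satisfaction": (0.216, 0.91, 0.97, 0.203),
    "Moral Perspective -> Organizational Commitment": (0.289, 0.78, 1.14, 0.197),
}
for label, (b, sd_x, sd_y, beta_reported) in COEFFICIENTS.items():
    beta_implied = standardized_beta(b, sd_x, sd_y)
    print(f"{label}: implied Beta = {beta_implied:.3f}, reported Beta = {beta_reported:.3f}")
```

Checks of this kind are a quick way for readers to confirm that tabled regression statistics are internally consistent before using them in secondary analyses.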
3.6. Discussion
The results from Study 3, using a different sample, show support for and replicate the findings from Study 2. As in Study 2, the
ALI CFA indicated support for a second order factor model as well as support for the discriminant validity of the ALI vis a vis the
TLI. Additional confirmation was also found for the earlier finding indicating that relationships among first order leadership constructs
may vary depending on the specific leader observed. Additionally, utilizing a multitrait–multimethod (MTMM) matrix
analysis, convergent and discriminant validity were established for the ALI. Also, as in Study 2, the overall results show a lack
of confounding with impression management. Finally, the concurrent validity results for the ALI show promise for its future
use in developing elaborated nomological networks of the antecedents and consequences of authentic leadership.
Table 14
EMBA ALI scale correlations with dependent variables.
                                                  Correlations
Scale                           Mean   Std. Dev.   1     2     3     4     5     6
1. Self-Awareness               3.34   0.91        –
2. Relational Transparency      3.65   0.97        .59   –
3. Moral Perspective            3.67   0.78        .55   .58   –
4. Balanced Processing          3.43   0.90        .58   .61   .62   –
5. General Satisfaction         3.76   0.69        .40   .39   .40   .48   –
6. Supervision Satisfaction     3.57   0.97        .58   .60   .59   .62   .74   –
7. Organizational Commitment    5.02   1.14        .28   .29   .33   .29   .70   .56
Note. All correlations shown are statistically significant at p<.001.

Table 15
EMBA ALI scale multiple regression results.
Independent variable        B      Std. Err.   Beta   t-value
Dependent variable: general satisfaction (regression R²=.124; F[4,223]=20.05, p<.001)
Self-Awareness              .095   .059        .124   1.61
Relational Transparency     .057   .057        .080   1.00
Moral Perspective           .089   .070        .100   1.27
Balanced Processing         .229   .063        .299   3.64⁎⁎⁎
Dependent variable: sup. satisfaction (regression R²=.517; F[4,223]=59.64, p<.001)
Self-Awareness              .216   .067        .203   3.25⁎⁎⁎
Relational Transparency     .205   .065        .206   3.17⁎⁎
Moral Perspective           .247   .079        .198   3.11⁎⁎
Balanced Processing         .275   .071        .257   3.85⁎⁎⁎
Dependent variable: org. commitment (regression R²=.133; F[4,223]=8.58, p<.001)
Self-Awareness              .093   .105        .074   0.89
Relational Transparency     .101   .102        .086   0.99
Moral Perspective           .289   .125        .197   2.31⁎
Balanced Processing         .095   .113        .075   0.84
⁎ p<.05.
⁎⁎ p<.01.
⁎⁎⁎ p<.001.

4. General discussion and conclusion
One of the more significant and recurring problems in leadership research is the lack of psychometrically sound measures to
assess various constructs. This is particularly disheartening because leadership attributes are clearly perceptual and, to use an old
adage, very much in the “eye of the beholder.” Thus, accurate assessment becomes even more crucial in order to develop a generalized
understanding about leadership processes. Unfortunately, the lack of attention to measurement in this area has led to
what Schriesheim, Alonso and Neider (2008) call a “boom and bust” cycle, in which scholars enthusiastically embrace a new leadership
approach, followed by wide usage of a particular leadership questionnaire, only to eventually learn that the accumulated
results are spurious byproducts of what is essentially a poor measurement process (see also Schwab, 1980).
To avoid this type of situation in the newly emerging field of authentic leadership, Walumbwa, Avolio, Gardner, Wernsing and
Peterson (2008) developed and conducted psychometric analyses for a new measure (the ALQ) based on an extensive review of
cross-disciplinary literature. The present investigation found support for seven out of the eight published items in the Walumbwa,
Avolio, Gardner, Wernsing and Peterson (2008) ALQ instrument using a more rigorous quantitative content validity assessment
process. However, there are some concerns as to the conclusions Walumbwa et al. reached regarding their confirmatory factor
analyses (which indicated a higher-order factor model), as well as their use of the MLQ for discriminant validity assessment
purposes.
The present study primarily sought to develop a new measure based on the theoretical framework and available dimension definitions
provided by Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008) in their comprehensive review of the literature
and scale development article. This led to the creation of a preliminary pool of items for a new instrument, the Authentic Leadership
Inventory (ALI), which was then subjected to content validity assessments along with 8 items extracted from the Appendix of
Walumbwa, Avolio, Gardner, Wernsing and Peterson (2008). Using a more rigorous content validity assessment method developed
by Schriesheim, Cogliser, Scandura, Lankau and Powers (1999, 1993) and Hinkin and Tracey (1999), support was found for the ALI
(as well as seven out of the eight published Walumbwa et al. items).
Having developed this new measure, the next set of analyses with a larger sample (Study 2) demonstrated acceptable internal
consistency reliability for the four ALI scales and freedom from association with impression management (social desirability). In
addition, confirmatory factor analyses, without the benefit of “garbage parameters,” and discriminant validity assessment vis a vis
the TLI (rather than the more controversial MLQ) showed strong support for the new measure. Notably, as will be discussed
below, the results do not, however, support treating authentic leadership or transformational leadership as universally global
constructs.
Using a third sample, Study 3 replicated the Study 2 findings, again demonstrating the internal consistency, factor structure,
lack of social desirability (impression management) bias, and discriminant validity of the ALI. Additionally, to advance the construct
validity of the ALI, MTMM analyses were undertaken to assess both convergent and discriminant validity (Campbell &
Fiske, 1959). The results were highly supportive, and the Study 2 ALI and TLI factor-analytic results were replicated, showing
that relationships among the first-order leadership constructs may vary depending on the specific individual being described.
This, again, raises concerns about treating authentic leadership or transformational leadership as global constructs.
Finally, after conducting the above analyses (all supportive of the construct validity of the ALI), we examined part of the nomological
network of authentic leadership by assessing its concurrent validity with three widely used dependent variables: supervisory satisfaction,
overall job satisfaction, and organizational commitment. Briefly put, the pattern of differential results appeared reasonable
and supported the concurrent validity of the ALI.

Taken as a whole, the results from these three initial investigations of the ALI’s psychometric properties are encouraging. However,
as noted earlier, construct validation is very much a continuous learning process (Nunnally & Bernstein, 1994), and “no interpretation
can be considered the final word, established for all time” (Cronbach, 1984, p. 149). Thus, given these encouraging results, it appears
that a considerably broader psychometric assessment needs to be undertaken of the ALI, hopefully leading to increased confidence in
the construct validity of this measure (for a more detailed discussion of the types of studies that would be useful, see Schriesheim &
Cogliser, 2010). There are also several substantive areas that might benefit from further investigation.
One of these areas concerns the fact that the studies reported here tested several different CFA (measurement) models, and
the findings showed differential results depending on who (McCain, Obama, and the respondents’ current supervisors) was
being described. Specifically, the results indicate that the appropriateness of considering authentic leadership to be a higher
order and more general unitary construct depends upon who is being described. For McCain, a first-order four-factor model
was an excellent fit to the data, while for Obama and the current supervisors assessed in Study 3, viewing authentic leadership
as a more global construct appeared warranted. Although we suspect that the latter finding may hold more generally, collecting
data from multiple samples is clearly necessary to draw firmer conclusions about this phenomenon.
In addition, although the comparisons of the two very different presidential candidates yielded relatively high and reasonably similar
factor loadings, the pattern of factor intercorrelations appeared to be different. This, again, seems to indicate that one cannot assume
that perceptions of authentic leadership are universally unitary or represent a higher-order, global construct. Furthermore, the
results of the confirmatory factor analyses of all 37 ALI and TLI items combined seem to support a model with 10 first-order factors as
opposed to several second-order factor models for the Obama, McCain, and current supervisors’ data. Also note that the ALI results
have lower correlations with the TLI dimensions for the McCain and current supervisors’ data as opposed to the Obama data. This
lends further support for the discriminant validity of the ALI vis a vis the TLI but also supports (again) the fact that relationships
among the first order leadership constructs may vary depending on the person being described.
Although these results warrant further investigation, it may not be surprising that a more generalized view of authentic leadership
is attributable to early perceptions of Obama. Numerous pundits and commentators made note of Obama’s high level of charisma and
apparent authenticity during the campaign period and immediately following the election. Management guru Warren Bennis (Bennis
& Zelleke, 2008), in a news report describing Obama, stated that, “After meeting him, even the most jaded political reporters have
been known to report that he is something rare and special” (p. G07). Similarly, Gentry (2009), in a BusinessWeek editorial, commented
that, “Obama’s inaugural address was honest, motivational…when addressing followers for the first time, a leader needs to be authentic,
visionary, inspirational, and charismatic.” Thus, Obama, who was then a politician with a short history compared to McCain,
was stereotypically seen as authentic, charismatic, and perhaps almost magical as a leader. Given this, it is no wonder that our study
participants apparently saw him in a more aggregate and less molecular manner.
In conclusion, there are a number of other areas that need further investigation, particularly with respect to strengthening the construct
validity of authentic leadership measures. As several scholars in the area have noted (Avolio, Gardner, Walumbwa, Luthans, &
May, 2004; Cooper, Scandura & Schriesheim, 2005; Gardner et al., in press), it is imperative to begin building the nomological network
associated with authentic leadership, particularly identifying not only the relevant constructs related to authentic leadership but also
the respective levels of analysis that are theoretically appropriate. Of particular interest would be studies demonstrating additional
discriminant as well as convergent validity (Campbell & Fiske, 1959). As Hinkin (1995) has stressed, “The use of multiple methods
and samples might be a necessary requirement in the development and use of new measures” (p. 983). We certainly agree with
this assessment and hope that our preliminary work developing the ALI will induce others in the field to further refine and strengthen
this measure for future research on authentic leadership.
References
Avolio, B. J., & Gardner, W. L. (2005). Authentic leadership development: Getting to the root of positive forms of leadership. The Leadership Quarterly, 16, 315–338.
Avolio, B. J., Gardner, W. L., Walumbwa, F. O., Luthans, F., & May, D. R. (2004). Unlocking the mask: A look at the process by which authentic leaders impact follower
attitudes and behaviors. The Leadership Quarterly, 15, 801–823.
Bass, B. M. (1990). From transactional to transformational leadership: Learning to share the vision. Organizational Dynamics, 18, 19–31.
Bass, B. M., & Avolio, B. J. (1990). Multifactor Leadership Questionnaire. Palo Alto, CA: Consulting Psychologists Press.
Bass, B. M., & Avolio, B. J. (1993). Multifactor Leadership Questionnaire. Palo Alto, CA: Consulting Psychologists Press.
Bass, B. M., & Steidlmeier, P. (1999). Ethics, character, and authentic transformational leadership behavior. The Leadership Quarterly, 10, 181–218.
Bennis, W., & Zelleke, A. (2008). Obama’s charisma is important attribute. Deseret News, Salt Lake City, UT, March 2, 2008 (p. G07).
http://www.deseretnews.com/opinion (Accessed March 6, 2009).
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 136–162).
Thousand Oaks, CA: Sage.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait–multimethod matrix. Psychological Bulletin, 56, 81–105.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276.
Collins, S. (2008). Desperately seeking an authentic president: Authenticity rules in this year’s US presidential elections.
http://www.spiked-online.com/index.php/site/article/4505/ (February 14).
Cooper, C., Scandura, T. A., & Schriesheim, C. A. (2005). Looking forward but learning from our past: Potential challenges to developing authentic leadership theory
and authentic leaders. The Leadership Quarterly, 16, 475–493.
Cronbach, L. J. (1984). Essentials of psychological testing (4th ed.). Cambridge, MA: Harper & Row.
Dumdum, U. R., Lowe, K. B., & Avolio, B. J. (2002). A meta-analysis of transformational and transactional leadership. In B. J. Avolio, & F. J. Yammarino (Eds.), Transformational
and charismatic leadership: The road ahead (pp. 35–66). Oxford, UK: Elsevier Science.
Dunham, R. B., Smith, F. J., & Blackburn, R. S. (1977). Validation of the index of organizational reactions with the JDI, the MSQ, and Faces scales. Academy of Management
Journal, 20, 420–432.
Ford, J. K., MacCallum, R. C., & Tait, M. (1986). The application of exploratory factor analysis in applied psychology: A critical review and analysis. Personnel Psychology,
39, 291–314.

Gardner, W. L., Cogliser, C. C., Davis, K. M., & Dickens, M. P. (in press). Authentic leadership: A review of the literature and research agenda. The Leadership
Quarterly.
Gentry, W. A. (2009). Nonverbal Obama: Aside from his words. Business Week online, January 20, 2009. http://www.businessweek.com (Accessed March 6, 2009).
George, J. M. (2000). Emotions and leadership: The role of emotional intelligence. Human Relations, 53, 1027–1055.
Gillet, B., & Schwab, D. P. (1975). Convergent and discriminant validities of corresponding Job Descriptive Index and Minnesota Satisfaction Questionnaire scales.
Journal of Applied Psychology, 60, 313–317.
Gosserand, R. H., & Diefendorff, J. M. (2005). Emotional display rules and emotional labor: The moderating role of commitment. Journal of Applied Psychology, 90,
1256–1264.
Hinkin, T. R. (1995). A review of scale development practices in the study of organizations. Journal of Management, 21, 967–988.
Hinkin, T. R. (1998). A brief tutorial on the development of measures for use in survey questionnaires. Organizational Research Methods, 1, 104–121.
Hinkin, T. R., & Schriesheim, C. A. (2008). An examination of “non-leadership”: From laissez-faire leadership to leader reward omission and punishment omission.
Journal of Applied Psychology, 93, 1234–1248.
Hinkin, T. R., & Tracey, J. B. (1999). An analysis of variance approach to content validation. Organizational Research Methods, 2, 175–186.
Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indices in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation
Modeling, 6, 1–55.
Ilies, R., Morgeson, F. P., & Nahrgang, J. D. (2005). Authentic leadership and eudaemonic well-being: Understanding leader–follower outcomes. The Leadership
Quarterly, 16, 373–394.
Jöreskog, K. G., & Sörbom, D. (2006). LISREL, version 8.8. Lincolnwood, IL: Scientific Software International.
Kenny, D. A., & Kashy, D. A. (1992). Analysis of the multitrait–multimethod matrix by confirmatory factor analysis. Psychological Bulletin, 112, 165–172.
Kernis, M. H. (2003). Towards a conceptualization of optimal self-esteem. Psychological Inquiry, 14, 1–26.
Lanyon, R. I., & Carle, A. C. (2007). Internal and external validity of scores on the Balanced Inventory of Desirable Responding and the Paulhus Deception Scales.
Educational and Psychological Measurement, 67, 859–876.
Luthans, F., & Avolio, B. (2003). Authentic leadership: A positive development approach. In K. S. Cameron, J. E. Dutton, & R. E. Quinn (Eds.), Positive organizational
scholarship: Foundations of a new discipline (pp. 241–261). San Francisco, CA: Berrett-Koehler.
MacCallum, R. C. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100, 107–120.
Marsh, H. W. (1989). Confirmatory factor analyses of multitrait–multimethod data: Many problems and a few solutions. Applied Psychological Measurement, 13,
335–361.
Martins, L. L., Eddleston, K. A., & Veiga, J. F. (2002). Moderators of the relationship between work-family conflict and career satisfaction. Academy of Management
Journal, 45, 399–409.
Medsker, G. J., Williams, L. J., & Holahan, P. J. (1994). A review of current practices for evaluating causal models in organizational behavior and human resources management
research. Journal of Management, 20, 439–464.
Medved, M. (2008). The authenticity election. Townhall.com, February 6.
Mowday, R. T., Porter, L., & Steers, R. M. (1982). Organizational linkages: The psychology of commitment, absenteeism, and turnover. New York: Academic Press.
Mowday, R. T., Steers, R. M., & Porter, L. W. (1979). The measurement of organizational commitment. Journal of Vocational Behavior, 14, 224–247.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Paulhus, D. L. (1991). Balanced Inventory of Desirable Responding (BIDR). In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and
social psychological attitudes (pp. 37–41). San Diego, CA: Academic Press.
Paulhus, D. (1998). Paulhus Deception Scales (PDS): The Balanced Inventory of Desirable Responding—7: User’s manual. North Tonawanda, NY: Multi-Health
Systems.
Podsakoff, P. M., Bommer, W. H., Podsakoff, N. P., & MacKenzie, S. B. (2006). Relationships between leader reward and punishment behavior and subordinate attitudes,
perceptions, and behaviors: A meta-analytic review of existing and new research. Organizational Behavior and Human Decision Processes, 99, 113–142.
Podsakoff, P. M., MacKenzie, S. B., Moorman, R. H., & Fetter, R. (1990). Transformational leader behaviors and their effects on followers’ trust in leader, satisfaction,
and organizational citizenship behaviors. The Leadership Quarterly, 1, 107–142.
Podsakoff, P. M., MacKenzie, S. B., & Bommer, W. H. (1996). Transformational leader behaviors and substitutes for leadership as determinants of employee satisfaction,
commitment, trust, and organizational citizenship behaviors. Journal of Management, 22, 259–298.
Porter, L. W., Steers, R. M., Mowday, R. T., & Boulian, P. V. (1974). Organizational commitment, job satisfaction, and turnover among psychiatric technicians. Journal of
Applied Psychology, 59, 603–609.
Schmitt, N., & Stults, D. M. (1986). Methodology review: Analysis of multitrait–multimethod matrices. Applied Psychological Measurement, 10, 1–22.
Schriesheim, C. A., Alonso, S., & Neider, L. L. (2008). A quantitative examination of the content validity and theoretical dimensionality of the Transformational
Leadership Inventory (TLI). Paper presented in the Research Methods Track, Southern Management Association Annual Meeting, St. Petersburg Beach, FL, October
29–November 1.
Schriesheim, C. A., & Cogliser, C. C. (2010). Construct validation in leadership research: Explication and illustration. The Leadership Quarterly, 20, 725–736.
Schriesheim, C. A., Cogliser, C. C., Scandura, T. A., Lankau, M. J., & Powers, K. J. (1999). An empirical comparison of approaches for quantitatively assessing the content
adequacy of paper-and-pencil measurement instruments. Organizational Research Methods, 2, 140–156.
Schriesheim, C. A., Powers, K. J., Scandura, T. A., Gardiner, C. C., & Lankau, M. J. (1993). Improving construct measurement in management research: Comments and
a quantitative approach for assessing the theoretical adequacy of paper-and-pencil survey-type instruments. Journal of Management, 19, 385–417.
Schriesheim, C. A., Tetrault, L. A., Kinicki, A. J., & Carson, K. (1989). A confirmatory analysis of JDI, MSQ, and IOR construct validity. Paper presented at the
Fourth Annual Conference of the Society for Industrial and Organizational Psychology, Boston, MA, April, 1989.
Schwab, D. P. (1980). Construct validity in organizational behavior. Research in Organizational Behavior, 2, 3–43.
Shamir, B., & Eilam, G. (2005). “What’s your story?”: A life-stories approach to authentic leadership development. The Leadership Quarterly, 16, 395–417.
Tepper, B. J. (1995). Upward maintenance tactics in supervisory mentoring and nonmentoring relationships. Academy of Management Journal, 38, 1191–1205.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.
Walumbwa, F. O., Avolio, B. J., Gardner, W. L., Wernsing, T. S., & Peterson, S. J. (2008). Authentic leadership: Development and validation of a theory-based measure.
Journal of Management, 34, 89–126.
Wanous, J. P. (1974). A causal–correlational analysis of the job satisfaction and performance relationship. Journal of Applied Psychology, 59, 139–144.
Weiss, D. J., Dawis, R. V., Lofquist, L. H., & England, G. W. (1967). Manual for the Minnesota Satisfaction Questionnaire (Minnesota Studies in Vocational Rehabilitation
22). Minneapolis: Industrial Relations Center, University of Minnesota.
Widaman, K. F. (1985). Hierarchically-nested covariance structure models for multitrait–multimethod data. Applied Psychological Measurement, 9, 1–26.
Williams, L. J., & Hazer, J. T. (1986). Antecedents and consequences of satisfaction and commitment in turnover models: A reanalysis using latent variable structural
equation methods. Journal of Applied Psychology, 71, 219–231.
Yukl, G. (2010). Leadership in organizations (7th ed.). Englewood Cliffs, NJ: Prentice Hall.
Zerbe, W. J., & Paulhus, D. L. (1987). Socially desirable responding in organizational behavior: A reconceptualization. Academy of Management Review, 12, 250–264.