A correlational study of the efficiency of certain aptitude tests in predicting the success of supervisors in an aircraft factory


A CORRELATIONAL STUDY OF THE EFFICIENCY OF CERTAIN APTITUDE TESTS IN PREDICTING THE SUCCESS OF SUPERVISORS IN AN AIRCRAFT FACTORY

A Thesis Presented to the Faculty of the Department of Psychology, The University of Southern California

In Partial Fulfillment of the Requirements for the Degree Master of Arts

by Paul O. Thompson

January, 1950

UMI Number: EP64001

All rights reserved. INFORMATION TO ALL USERS: The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion.

UMI EP64001. Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code.

ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346

This thesis, written by Paul O. Thompson under the guidance of his Faculty Committee, and approved by all its members, has been presented to and accepted by the Council on Graduate Study and Research in partial fulfillment of the requirements for the degree of

Master of Arts

Faculty Committee

Chairman

TABLE OF CONTENTS

CHAPTER                                                            PAGE

I. INTRODUCTION .................................................. 1
  The problem .................................................... 1
  Importance of the study ........................................ 1
  Organization of the thesis ..................................... 4

II. REVIEW OF THE LITERATURE ..................................... 5
  The general literature on psychological testing in industry .... 5
  Literature on relationships between supervisory-administrative
    success and psychological tests .............................. 7
  Summary of the reports in the literature ....................... 9

III. THE TESTS EMPLOYED .......................................... 10
  The Guilford-Zimmerman Aptitude Survey ......................... 10
  The California Short-Form Test of Mental Maturity
    (Advanced S-Form) ............................................ 12

IV. THE EXAMINEES AND THE PROCEDURE .............................. 14
  The population sample .......................................... 14
  The administration of tests .................................... 14
  Selection of a criterion ....................................... 15
  Analysis methods ............................................... 17

V. RESULTS AND DISCUSSION ........................................ 21
  The results .................................................... 21
    Correlations with the criterion and intercorrelations of
      component tests ............................................ 21
    Significance of the obtained correlations .................... 26
  Discussion ..................................................... 27
    Predictive efficiency of the Guilford-Zimmerman Aptitude
      Survey ..................................................... 27
    The predictive efficiency of the California Test of Mental
      Maturity ................................................... 30
    The significance of the difference between the multiple
      correlation coefficients ................................... 32
    The intercorrelations ........................................ 34
    The correlations between similar tests of the two batteries .. 36
    Interpretation of results .................................... 37

VI. SUMMARY AND CONCLUSION ....................................... 39
  Summary ........................................................ 39
  The findings of the study ...................................... 39
  Conclusion ..................................................... 41

BIBLIOGRAPHY ..................................................... 43
APPENDIX I ....................................................... 48
APPENDIX II ...................................................... 52
APPENDIX III ..................................................... 54
APPENDIX IV ...................................................... 56
APPENDIX V ....................................................... 58
APPENDIX VI ...................................................... 60

LIST OF TABLES

TABLE                                                              PAGE

I. Comparison of Aptitude Survey Means and Standard Deviations
   for the Three Levels of the Position Level Criterion .......... 18
II. Correlation of Test Scores with the "D-Score" Criterion and
   Associated Statistics ......................................... 22
III. Guilford-Zimmerman Aptitude Survey Intercorrelations ........ 23
IV. California Test of Mental Maturity Intercorrelations ......... 24
V. Factors in the Solution of the Aptitude Survey Multiple
   Correlation and Regression Coefficients ....................... 28
VI. Factors in the Solution of the Test of Mental Maturity
   Multiple Correlation and Regression Coefficients .............. 29

CHAPTER I

INTRODUCTION

I. THE PROBLEM

The principal purpose of this study was to determine how well five tests of the Guilford-Zimmerman Aptitude Survey, individually and as a group, differentiate success in the levels of management of the Lockheed Aircraft Corporation, Burbank, California. A secondary purpose was to compare the prediction precision of these tests with that of the California Short-Form Test of Mental Maturity, which at present plays a part in the company's employee selection and placement program.

II. IMPORTANCE OF THE STUDY

The ultimate aim of scientific research is prediction. When the scientist has determined the relationships existing between observed phenomena, he is in a position to predict the value of one as a function of others. This is the ultimate form of knowledge which makes possible the practical advances in our culture.

The importance of the investigation herein reported is in its attempt to further this kind of knowledge in one important segment of our culture: industrial management. More specifically, this study sought to determine the functional relationship between success in supervising and administrating, and scores on certain aptitude tests.

The study was more than routinely important because one group of tests used is part of a new series of aptitude tests, which heretofore has not been applied to the management field. This series is known as the Guilford-Zimmerman Aptitude Survey. It represents a new approach to aptitude testing, an approach which was first acclaimed by Thurstone as "primary mental abilities".1 The principle is to isolate unique human capabilities by factorial analysis methods and to construct new tests of items that are heavily loaded with the unique factor at the expense of unwanted elements.2 This technique was used extensively by the Army Air Forces during World War II for the selection and placement of personnel and proved to be both effective and economical.3 The Guilford-Zimmerman tests represent an extension of this war research to civilian applications.

1 L. L. Thurstone, Primary Mental Abilities (Chicago: University of Chicago Press, 1938), pp. 1-11[illegible].

2 J. P. Guilford, "The Discovery of Aptitude and Achievement Variables," Science, 106:279, September 26, 1947. J. P. Guilford, "Factor Analysis in a Test Development Program," Psychological Review, March, 1948.

3 J. P. Guilford and Wayne S. Zimmerman, "Some AAF Findings Concerning Aptitude Factors," Occupations, 26:154-159, December, 1947.

Much research has been devoted to finding testing devices which predict the differential success of supervisors and administrators, but the quest has not been highly rewarding. The basic aptitudes that make an industrial supervisor successful are undoubtedly many compared with, for example, those desirable for a successful welder. Certainly, personality factors are of considerable consequence, but they lie beyond the scope of this paper.

It is hoped that the work here reported may be a stepping stone toward further research and development in the basic-trait test technique and its application to industrial problems. The ultimate goal of this approach is to provide a test of each factorially isolated factor, and to find combinations of these which are closely related to each vocational skill, no matter how complex. The resulting benefits to vocational counseling and to better solutions of personnel problems are obvious.

The present study is admittedly narrow in scope, even in the field of predicting the success of management personnel. However, if it were to increase the predictive accuracy in this particular application by 5 per cent, or to lead someone else to do so, this alone would make the effort worthwhile.

III. THE ORGANIZATION OF THE THESIS

The remainder of this report is organized as follows:

Chapter II is a review of the literature related to psychological research in industry, particularly as it concerns selection and placement of supervisory and administrative personnel. The results of several studies similar to the present one are cited and the trend of thought on the subject is indicated.

Chapter III contains descriptive comments concerning the Guilford-Zimmerman tests and the California Short-Form Test of Mental Maturity.

Chapter IV defines the population sample tested and gives the procedure pertaining to administration of tests, selection of the criterion, and analysis of the test data.

Chapter V presents the results of the analysis, a short discussion of the results and their significance, and a comparison of the obtained intercorrelations with those expected.

Finally, Chapter VI contains a summary of the report, with conclusion and recommendations.

CHAPTER II

REVIEW OF THE LITERATURE

Only recently has scientific research gained a foothold in the industrial personnel field. Traditionally, industrial employment and classification have been handled by the subjective estimates of employment managers. The chief guides for judgment have been questionnaires, interviews, and even photographs of applicants. However, as a result of the impetus given objective psychological testing by the nation at war, industry is gradually adopting the more objective, scientific methods.

The general literature on psychological testing in industry. The adoption of psychological testing methods has been slow, not only because of basic resistance to something new, but also because some personnel testing programs have not paid their way. As a result, some industrial executives have little confidence in any purported results and the word "psychological" has become a battle cry to them.1 On the other hand, some large concerns have personnel research organizations which have unequivocally proved their worth over a number of years, and which can attribute a great share of their success to psychological testing methods.2

What then are the causes of the failure of some of them? Perhaps it is faulty method, or it could easily be non-utilization of the best tests for the particular situation. Moreover, it could be improper use of tests which potentially could do the job. These are some of the answers reported again and again in the literature.

Hay reports that the factors generally considered for employment and promotion are age, health, social development, education, interests, and previous performance, while general mental ability and special physical and mental aptitudes are not recognized as vital for success.3 Guy W. Wadsworth, one of the most successful executives in the personnel field, emphasizes intelligence as being of primary importance and relegates aptitude tests to secondary importance.4 At any rate, test research in industry has been limited mainly to intelligence tests, work-sample tests, personality inventories, rating scales, and biographical data sheets. Little use has been made of tests designed to tap "basic" human factors.

1 Edward N. Hay, "The Use of Psychological Tests in Selection and Promotion," Personnel, 15:123, February, 1940.

2 Hay, loc. cit.

3 Ibid., p. 117.

4 Guy W. Wadsworth, Jr., "Tests Prove Their Worth to a Utility," Personnel Journal, 14:183, November, 1935.

Literature on relationships between supervisory-administrative success and psychological tests. Published reports of studies exposing relationships between personnel evaluation devices and supervisory success are meager.

Uhrbrock and Richardson5 administered a battery of nine tests, including intellectual, personality, and interest inventories, to one hundred sixty-three factory supervisors and obtained a multiple correlation of .48 ± .0[illegible] with ratings of each man by four superintendents. Using only those items that proved significant (85 out of the 820 items) increased the coefficient to .71 ± .03.

Harrell6 tested a group of supervisors in five Georgia textile plants and, using the ratings of four superiors as the criterion, found correlations of .37 with the Otis S-A Intelligence Test, .18 with a test of social intelligence, and .15 with the "self-confidence" factor of the Bernreuter Personality Inventory.

Shuman,7 using a sample of two hundred ninety-seven, was able to select from 15 to 20 per cent more "excellent" aircraft factory supervisors with the aid of the Otis Q-S Test of Mental Ability (Beta A), the Minnesota Paper Form Board, and the Bennett Test of Mechanical Comprehension. This result, obtained by the use of cut-off scores, corresponds to a correlation coefficient of approximately .40.

Sartain8 administered a varied group of paper-and-pencil tests to forty aircraft plant supervisors and, correlating the results with composite scores from two rating forms, found no significant coefficients in most cases. However, he did find low negative coefficients for a test of mechanical comprehension and one of supervisory talent (contradicting in part the results of Shuman and Harrell). Sartain qualified his results to the extent of questioning the validity of his criterion, but he concluded that the tests used simply did not differentiate the quality of supervisor at his plant.

Stockford,9 Industrial Relations Advisor of Lockheed Aircraft Corporation, studied a sample of one hundred thirty-nine supervisors and found that the production records of the supervisors could be predicted with some success by utilizing cut-off scores on certain factors. In terms of percentage above chance, the predictive values of the factors were: Persuasiveness (35 per cent), Administrative Interest ([illegible] per cent), Native Intelligence (17 per cent), Supervisory-Staff Seniority (11 per cent), Community Stability (9 per cent), Supervisory Ratings (5 per cent), and Company-wide Seniority (2 per cent). These correspond to correlation coefficients ranging from approximately .75 to .20. Stockford adds that the greatest value of a psychological testing program in industry is its effectiveness in eliminating the poorer workers, whether or not it is precise in differentiating the higher levels of success.

Summary of the reports in the literature. The reports reviewed cite the relationships between supervisory success and certain psychological tests as generally low, but indicate that some of them can do a good job through proper selection, proper combination, suitable tailoring to meet specific situations, and discreet interpretation. The consensus is that a psychological testing program is a valuable asset to an industry, if the results are applied with full awareness of their limitations.

5 Richard S. Uhrbrock and M. W. Richardson, "The Basis for Constructing a Test for Forecasting Supervisory Ability," Personnel Journal, 12:141-154, October, 1934.

6 Willard Harrell, "Testing Cotton Mill Supervisors," Journal of Applied Psychology, 24:31-35, January, 1940.

7 John T. Shuman, "The Value of Aptitude Tests for Factory Workers in the Aircraft Engine and Propeller Industries," Journal of Applied Psychology, 29:185-190, June, 1945.

8 A. Q. Sartain, "Relation between Scores on Certain Standard Tests and Supervisory Success in an Aircraft Factory," Journal of Applied Psychology, 30:328-332, August, 1946.

9 Lee Stockford, "Selection of Supervisory Personnel," Personnel, 24:186-199, November, 1947.

CHAPTER III

THE TESTS EMPLOYED

The Guilford-Zimmerman Aptitude Survey.1 This survey is a battery of seven tests. Other tests are being developed and will be added to the battery from time to time. Originally, the first four tests of the battery were administered for the purposes of this study, and the next addition to the battery, Part VII, was administered at a later date to a smaller number of the same group. Part I (Form AX) is a test of Verbal Comprehension; Part II (Form AX), General Reasoning; Part III (Form A), Numerical Operations; Part IV (Form A), Perceptual Speed; and Part VII (Form A), Mechanical Knowledge. Unfortunately, Parts V and VI (Spatial Orientation and Spatial Visualization), which would have been interesting and valuable additions to this study, were not ready for use at the time.

As already mentioned, the Guilford-Zimmerman tests are an outgrowth of the Army Air Forces wartime research on classification tests. Through factor analysis of a great number of tests, the unique factors of the Guilford-Zimmerman tests were isolated. The present tests are an attempt to maximize the variance contribution of the unique factors and to minimize the contamination of other factors.

The Verbal Comprehension test is a power test of the general vocabulary type, which is found to be the best measure of the verbal factor. The General Reasoning test is made up of arithmetic-reasoning items, which are the best measure of the general reasoning factor yet to be found. The Numerical Operations test is a speed test made up of simple addition, subtraction, multiplication, and division exercises, in that order. The Perceptual Speed test measures rapidity and accuracy in matching pairs of objects with small detail. And finally, the Mechanical Knowledge test measures the practical "fix-it" knowledge such as is acquired around the home, automobile, and workshop, with a substantial share of emphasis on tool uses. This is the only factor found to be unique to general mechanical tests.

The working time allowed on the several tests named was 25 minutes, 35 minutes, 8 minutes, 5 minutes, and 30 minutes, respectively, making a total test battery time of 2 hours and 8 minutes, including the preliminary instruction time allowed (25 minutes).

1 J. P. Guilford and Wayne S. Zimmerman, "The Guilford-Zimmerman Aptitude Survey," Journal of Applied Psychology, 32:24-34, February, 1948. Guilford and Zimmerman, The Guilford-Zimmerman Aptitude Survey: A Manual of Instructions and Interpretations (Beverly Hills: Sheridan Supply Company, [year illegible]), 8 pp.

The California Short-Form Test of Mental Maturity (Advanced S-Form).2 This battery is composed of six parts. The first three, classified as "non-language" tests, are (1) Sensing Right and Left, (2) Manipulation of Areas, and (3) Similarities. The other three, which are listed as "language" tests, are called (4) Numerical Quantity, (5) Inference, and (6) Vocabulary.

The first test is made up of items presenting a member of a pair. In most cases it is a part of the human body, and in others it is an article of clothing. The task is to decide whether it is the right or left member.

The Manipulation of Areas test is composed largely of three-dimensional spatial visualization items, though several items of two-dimensional detail matching are mixed in.

The task in the items of the Similarities test is to abstract a common property in three objects and to choose a fourth object having the same property from a given group.

The fourth test is a simple arithmetic reasoning test. The fifth part presents items which state two premises, and the examinee selects the best of a list of conclusions. The last test is a short multiple-choice vocabulary test. The reasoning and vocabulary tests of the two batteries are quite comparable in content.

The Guilford-Zimmerman tests are printed separately, while the California tests are printed as one pamphlet. The latter tests are considerably shorter than the former, and require less working time.3

The raw score totals of the California "non-language" tests, as well as the totals of the "language" tests, are convertible to I.Q. values, with the aid of mental age equivalents which are provided. However, the I.Q. scores are of no value in this analysis, so only the raw scores were considered.

2 Elizabeth T. Sullivan, Willis W. Clark, and Ernest W. Tiegs, Manual of Directions, California Short-Form Test of Mental Maturity, Advanced S-Form (Los Angeles: California Test Bureau, 1943), 8 pp.

3 Sullivan, Clark, and Tiegs, loc. cit.

CHAPTER IV

THE EXAMINEES AND THE PROCEDURE

The population sample. The group tested numbered eighty-four. The composition of the group was somewhat heterogeneous. All had positions of a supervisory nature, though a sprinkling of them were classified as staff specialists, with job titles such as "production analyst". Approximately half were from the office force. The group represents seven levels in the salaried hierarchy: from group-supervisor to division manager (in the office force) and superintendent (in the factory force).

Since a limited fraction of this group was available at the time Part VII of the Guilford-Zimmerman battery was administered, data for only thirty-one cases are presented for this variable.

The administration of tests. The Guilford-Zimmerman tests were administered over a period of several months in 1947 and 1948. The California battery was administered shortly previous to this. Although this battery is now used as a selection and placement device, the scores did not affect the criterion standing of the group under study.

The testing conditions of the two batteries differed in this respect: the California battery was given to the supervisors totally on company time, while half of the group took the Aptitude Survey tests on their own time. Otherwise, as far as testing room and test administrator are concerned, the conditions were the same.

Selection of a criterion. Three possible criteria were available as measures of success: (1) a subjective merit rating, (2) a composite score of the supervisor's value to the company, and (3) his level-of-position in the management organization.

The merit rating was a typical rating by a superior and was judged to be too subjective for a meaningful criterion. The literature is full of reports that testify to the low validity of merit ratings, largely due to partiality and errors in judgment by the rater. In fact, Stockford and Bissell found that scores of salaried employees on a standard rating scale, designed to measure performance, more validly reflected personal-social relationships between superior and subordinate.1

The composite score, known as a "D-score", was a value more or less objectively indicating the employee's over-all standing with the company under the Management Selection and Placement Plan. Superiors' ratings have a role in this criterion also, to the extent of 50 per cent. However, these ratings are heavily weighted with objective standards of quantity control, quality control, budget control, and record of subordinates' grievances. The remaining 50 per cent of the D-score is composed of the following factors: training (formal education), experience in the type of work, seniority in the job, seniority with the company, and psychological test scores.

In addition to the likely subjective bias in the rating components, another disadvantage in this criterion is the fact that the objective standards for the rating element do not apply equally to the office force, the factory force, and staff specialists.

The position-level criterion was divided into three levels: (1) group and section level, (2) department level, and (3) division level. This criterion is apparently the most realistic of the alternatives, since a man in the top level must be looked upon as more successful at the moment than one in another level. However, it has a fallacy. It does not take into account the fact that men in the lower levels may in time advance beyond any of those in the top group, proving themselves potentially more successful, after having passed the age, seniority, and experience hurdles.

In order to check the possibilities of this criterion, a tri-serial eta coefficient was calculated to establish the relationship between it and the Guilford-Zimmerman Verbal Comprehension test, which had already yielded substantial correlation with the D-score criterion. The obtained eta, .09, was so low as to indicate the unsuitability of this criterion. (Table I, page 18, shows the means and standard deviations of the D-scores and the Aptitude Survey variables for the three position-levels.)

After considering the desirable and undesirable aspects of the three alternative criteria, the D-score variable was selected as the best available.

1 Lee Stockford and H. W. Bissell, "Factors Involved in Establishing a Merit Rating Scale," Personnel, 26:98, September, 1949.
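As an illustration of an eta-type coefficient, the following is a minimal sketch of the ordinary correlation ratio between a three-level grouping and a continuous score. It is only one common reading of "eta"; the thesis does not give its tri-serial formula, so the function name `correlation_ratio` and the scores below are hypothetical and are not the study's data.

```python
import math

def correlation_ratio(groups):
    """Correlation ratio (eta) of a score on a categorical grouping.

    eta = sqrt(SS_between / SS_total), where each inner list holds the
    scores of one group (here, one position level).
    """
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return math.sqrt(ss_between / ss_total)

# Hypothetical scores for three position levels (not the thesis data).
level_1 = [1, 2, 3]
level_2 = [2, 3, 4]
level_3 = [3, 4, 5]
eta = correlation_ratio([level_1, level_2, level_3])
print(round(eta, 3))
```

An eta near zero means the group means barely differ relative to the spread of scores, which is the sense in which a low value argues against using the grouping as a criterion.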

Analysis methods. For establishing the degree of correlation of the several tests with the D-scores, Pearsonian coefficients were computed from the raw scores by the formula:

$$ r_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} $$

Preliminary analysis with a tentative sample population showed that the scatter of the cases was essentially linear and most efficiently handled with the Pearson coefficient formula.

Means and standard deviations were easily obtained as adjuncts to the correlational computations, the standard deviations by the formula:

$$ \sigma = \frac{1}{N}\sqrt{N\sum X^2 - (\sum X)^2} $$
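The two raw-score formulas above can be sketched in a few lines of code. This is a minimal illustration, not the author's computing procedure; the function names and the paired scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Raw-score Pearson r: numerator N*SumXY - SumX*SumY, divided by
    the square root of [N*SumX2 - (SumX)^2][N*SumY2 - (SumY)^2]."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

def std_dev(x):
    """Raw-score standard deviation: (1/N) * sqrt(N*SumX2 - (SumX)^2)."""
    n = len(x)
    return math.sqrt(n * sum(a * a for a in x) - sum(x) ** 2) / n

# Hypothetical test scores and criterion scores (not the thesis data).
test_scores = [52, 61, 47, 70, 58]
d_scores = [66, 72, 60, 75, 69]
print(pearson_r(test_scores, d_scores), std_dev(test_scores))
```

Both functions reuse the same sums of scores and squared scores, which is why the means and standard deviations come out as by-products of the correlational work.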

TABLE I

COMPARISON OF APTITUDE SURVEY MEANS AND STANDARD DEVIATIONS FOR THE THREE LEVELS OF THE POSITION LEVEL CRITERION

Variable   Measure   Level 3   Level 2   Level 1   All Levels

D-Score    M          75.8      70.5      64.9
           σ           8.2       6.9      10.4        9.5
           N            11        37        36

I          M          71.2      56.6      56.2
           σ          13.5      18.7       9.3       19.0
           N            11        37        36

II         M          18.4      12.9      14.5
           σ           5.3       5.9       6.2        6.1
           N            11        37        36

III        M          86.6      66.3      65.8
           σ          15.9      21.3      24.3       23.0
           N            11        37        36

IV         M          45.8      43.3      38.6
           σ           7.3       8.1       8.7        8.7
           N            11        36        36

VII        M          47.5      45.7      42.5
           σ           0.3       4.8       5.0        5.0
           N             2        15        14

The significance of the obtained correlations was estimated by the use of the Wallace and Snedecor table,2 based upon the "null hypothesis" test. The significance was further estimated by applying the formula for standard error:

$$ \sigma_r = \frac{1 - r^2}{\sqrt{N - 1}} $$
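The standard-error formula can be checked against the values the thesis reports for the verbal test in Table II (r = .522, N = 84). The sketch below is illustrative only; the function names are assumptions introduced here.

```python
import math

def standard_error_r(r, n):
    """Standard error of a correlation: (1 - r^2) / sqrt(N - 1)."""
    return (1 - r ** 2) / math.sqrt(n - 1)

def critical_ratio(r, n):
    """Ratio of the coefficient to its own standard error."""
    return r / standard_error_r(r, n)

# Values reported for the Verbal Comprehension test in Table II.
r_verbal, n_cases = 0.522, 84
print(round(standard_error_r(r_verbal, n_cases), 3))  # close to the tabled .080
print(round(critical_ratio(r_verbal, n_cases), 1))    # close to the tabled 6.5
```

A ratio of several standard errors, as here, is what the text later treats as far beyond chance.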

As a preparation for calculating multiple correlation coefficients for the two batteries, intercorrelation coefficients were obtained by use of an intercorrelation computing form developed by the Educational Testing Service and reported by Tucker.3 However, since the N was smaller for the Aptitude Survey Mechanical Knowledge test (N = 31), its intercorrelations were computed separately.

As a means of comparing the Guilford-Zimmerman battery with the California Test of Mental Maturity for predictive precision, multiple correlation coefficients were calculated by the Doolittle method and, through the use of beta factors, the variance contributions of each test were ascertained.

In addition, coefficients closely related to the correlation coefficient were computed to show more clearly the predictive expectancy warranted by the size of the correlations. Finally, regression equations of best fit were computed for estimating a supervisor's criterion standing from knowledge of his test scores.4

2 J. P. Guilford, Fundamental Statistics in Psychology and Education (New York: McGraw-Hill Book Company, Inc., 1942), pp. 323-324.

3 Ledyard R. Tucker, "A Note on the Computation of a Table of Intercorrelations," Psychometrika, 13:245-250, December, 1948.

4 Guilford, Fundamental Statistics in Psychology and Education, op. cit., pp. 262 ff.
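The multiple correlation step can be illustrated with the intercorrelations the thesis reports in Table III for Parts I, II, III, and VII. The sketch below is not the Doolittle worksheet itself: it swaps in plain Gaussian elimination to solve the same normal equations for the beta weights, then takes R as the square root of the sum of beta times validity. The function names are assumptions introduced here.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def multiple_r(intercorrelations, validities):
    """Return (R, betas): beta weights solve R_xx * beta = r_xy,
    and R^2 is the sum of beta_i * r_iy."""
    betas = solve(intercorrelations, validities)
    r_squared = sum(b * v for b, v in zip(betas, validities))
    return math.sqrt(r_squared), betas

# Intercorrelations of Parts I, II, III, VII as reported in Table III.
r_xx = [
    [1.000, 0.496, 0.428, 0.126],
    [0.496, 1.000, 0.433, -0.176],
    [0.428, 0.433, 1.000, -0.257],
    [0.126, -0.176, -0.257, 1.000],
]
# Correlations of the same four tests with the D-score criterion.
r_xy = [0.522, 0.286, 0.205, 0.278]

big_r, betas = multiple_r(r_xx, r_xy)
print(round(big_r, 3))  # should land close to the .573 reported in Chapter V
```

Each product of a beta weight and its validity coefficient is that test's share of the predicted variance, which is the sense of "variance contributions" in the text.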

CHAPTER V

RESULTS AND DISCUSSION

I. THE RESULTS

Correlations with the criterion and intercorrelations of component tests.

The results of the analysis are shown

in Tables II, III, and IV, (pages 22, 23, and 2if.),

Table II

indicates that in the Guilford-Zimmerman Aptitude Survey the verbal test was the only test found to have substantial correlation with the criterion D-scores.

However, all except

the Perceptual Speed test showed definite relationship with the criterion. Table III shows that, for the group tested, the verbal (I), reasoning (II), and numerical (III) performances were moderately intercorrelated, contrary to what would be expected from data on these tests from the general population* Perceptual Speed (IV) showed no positive intercorrelation significantly above zero, while it was found to have low but real negative correlation with the Mechanical Knowledge (VII) performance.

The latter was not related to the verbal test

performance, but had fair negative correlation with the others. The correlations of the California Test of Mental Maturity variables with the criterion are also shown in Table II.

Those of the Inference and Vocabulary tests were

22 TABLE I I

CORRELATION OP TEST SCORES WITH THE "D-SCORE" CRITERION AND ASSOCIATED STATISTICS

Test

Mean Raw Score

6

58.33

19.01

.5 2 2

34.32

6 .0 8

68.73

v/6 p

r2

.0 8 0

6.5

.271

.2 8 6

.1 0 1

2 .8

.0 8 2

2 3 .0 2

.205

.105

2 .0

.0 4 2

41.58

8 .6 7

.0 7 2

.1 1 0

1.5

.005

44*35

5.02

.2 7 8

.1 6 8

1 .6

.077

.1 1 0

0 .4

.0 0 2

r

Aptitude Survey I Verbal II Gen. Reas. Ill Numerical IV Percep. Sp. VII Mechanical

California Test of M. M. 2 .1 4

2. Manipulations

6 .7 0

2 .2 2

.172

.1 0 6

1 .6

.0 3 0

3. Similarities

9.65

2.31

.198

.1 0 5

1.9

.039

Ij.. Numerical Q.

945

3.19

.5 0 6

.0 8 2

6 .2

.2$6

5. Inference

1 2 .1 3

1.74

.557

.0 7 6

7.3

.310

6 . Vocabulary

2 8 .6 7

8.43

.5 6 8

.074

7.7

.323

Total, Non-language 33.81

4*45

.138

.1 0 8

1.3

.019

O • 1

3.740

1. R-L Sensing

Total, Language

50.37

11.42

.645

.064

1 0 .1

4 l6

Sum Total

8 4 .1 3

1 3 .0 0

.6 3 1

.0 6 6

9.6

.398

(Note:

N s 8lj. except for Aptitude Survey, Part IV (N s 83) and Aptitude Survey, Part VII (N = 31)*)

23

TABLE III GUILFQRD-ZIMMEBMAN APTITUDE SURVEY IUTERC ORRELATIORS

Variable (Verbal Comp.)

I

11 496

I

III

IV

D-Score

.1^28

-.1 2 2

.1 2 6

.5 2 2

•ij-33

.0 6 9

-.176

.2 8 6

.077

-.257

.2 0 5

-.0 7 6

.0 7 2

(Gen. Reas.)

II

.4 9 6

(Numer. Op.)

Ill

.1^8

433

IV - . 1 2 2

.0 6 9

.077

(Meehan. Kn.) VII

.1 2 6

-.1 7 6

-.257

-.0 7 6

D-Score

.5 2 2

.2 8 6

.2 0 5

.0 7 2

(Percep. Sp.)

VII

.2 7 8 .2 7 8

2lj.

TABLE IV CALIFORNIA TEST OF MENTAL MATURITY INTERCORRELATIONS

Variable

(l)

(3)

(4)

(5)

.1 9 6

.0 4 2

.0 2 8

.0 4 6 - . 1 3 0 - . 0 4 2

.0 2 6

.1 6 0

.2 7 8

.0 0 4

.1 7 2

.3 5 2

.3 7 0

.2 9 1

.1 9 8

•494

.596

.5 0 6

•259

.557

(R-L Sens.)

(1 )

(Manip.)

(2 )

.1 9 6

(Simil.)

(3)

.0 4 2

.0 2 6

(Numer. Q . ) (4)

.0 2 8

•l60

.352

(Infer.)

(5)

.0 I4.6

.2 7 8

.370

•494

(Vocab.)

(6 )

-.130

.0 0 4

.2 9 1

.5 9 6

.2 5 9

-.0 4 2

.1 7 2

.1 9 8

.3 0 6

.557

D-Score

(6 )

DScore

(2 )

.5 6 8 .5 6 8

higher than the highest of the Guilford-Zimmerman group, though not significantly higher, since in each case the difference was less than one standard error. The Sensing Right and Left test showed no significant relationship, while the Manipulation of Areas test and the Similarities test had low correlations.

Table IV, page 24, indicates that the Sensing Right and Left variable maintained itself independent of all, except for a slight affinity for the Manipulation of Areas test (2) and a slight negative relation with the Vocabulary test (6). The highest correlation with the manipulation variable was that of the logic (Inference) test (5). The Similarities test and the three language tests were all well correlated with each other, the largest correlation being .596, between the Numerical Quantity and Vocabulary tests. Interestingly enough, Parts I and II of the Aptitude Survey, which duplicate the last mentioned California battery tests, were also the most highly related of their battery.

The multiple correlation coefficient of the Guilford-Zimmerman battery with the criterion, calculated for the combination of Parts I, II, III, and VII, turned out to be .573, an increase of .051 above the greatest individual test correlation. The multiple correlation coefficient of the California Test of Mental Maturity, combining all component tests except Sensing Right and Left, was .719, an increase of .151 above the highest individual test correlation, and .074 above that of the total Language scores.

Significance of the obtained correlations.

The Wallace and Snedecor Table¹ gives a correlation value of .215 as being "significant" (at the 5 per cent level) and .280 as being "very significant" (at the 1 per cent level) for two-variable correlations with 84 cases. In the Survey battery, the verbal test was far above the "very significant" level, with the reasoning test just above this critical level, the numerical test just missing the "significant" level, and the perceptual speed test unquestionably "not significant". Since the Mechanical Knowledge test had an N of 31, it would have to show a coefficient of .355 to be "significant" at the 5 per cent level of confidence. Its value, .278, falls slightly short.

With .215 and .280 also the "significant" and "very significant" values for the California Tests, one finds the Sensing Right and Left test definitely not qualifying, the manipulation and similarities tests falling slightly short of significance, and the three language tests undoubtedly "very significant".
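The critical values quoted from the table can be approximated from Student's t. The sketch below assumes the usual relation r = t/√(t² + N - 2); the two-tailed critical t values are taken from standard t tables and entered by hand, and they are assumptions of this example rather than figures given in the thesis.

```python
import math


def critical_r(t_critical: float, n: int) -> float:
    """Smallest r that reaches the given critical t with n cases (df = n - 2)."""
    df = n - 2
    return t_critical / math.sqrt(t_critical ** 2 + df)


# Assumed two-tailed critical t values from standard tables.
T_05_DF82 = 1.989  # 5 per cent level, df = 82 (N = 84)
T_01_DF82 = 2.637  # 1 per cent level, df = 82 (N = 84)
T_05_DF29 = 2.045  # 5 per cent level, df = 29 (N = 31)

print(round(critical_r(T_05_DF82, 84), 3))
print(round(critical_r(T_01_DF82, 84), 3))
print(round(critical_r(T_05_DF29, 31), 3))
```

Under these assumptions the three results agree with the .215, .280, and .355 cited in the text.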

Comparisons may also be made between the r/Sr indices of the several tests in order to judge relative significance (see Table II, page 22). The multiple correlations (Table V, page 28 and Table VI, page 29) were also obviously well beyond the "very significant" level. The standard error of the Guilford-Zimmerman battery multiple R was only .074 and that of the California battery multiple R was only .053, again emphasizing the strength of the relationships.

1 J. P. Guilford, Fundamental Statistics in Psychology and Education (New York: McGraw-Hill Book Company, Inc., ), pp. 323-324.

II. DISCUSSION

Predictive efficiency of the Guilford-Zimmerman Aptitude Survey. Several things stand out in the results. One of the more important was the size of the correlation coefficient for the combined scores of the California "language" tests. Compared to the general trend of the results of similar studies which have been reported in the literature, this correlation stands out as being extraordinary.

Another important result is the strong relationship with the criterion shown by the multiple correlations of both batteries. What this means in terms of prediction efficiency will be discussed in the following paragraphs.

Referring again to Table V, page 28, one sees that of the 32.8 per cent of the total variance (Coefficient of Multiple Determination) accounted for by the Guilford-

TABLE V

FACTORS IN THE SOLUTION OF THE APTITUDE SURVEY MULTIPLE CORRELATION AND REGRESSION COEFFICIENTS

Part      β1k      r1k      β1k·r1k    σ1/σk     b1k

I        .4195    .522      .2190      .499     .209
II       .1026    .286      .0293     1.559     .160
III      .0455    .205      .0093      .412     .019
VII      .2549    .278      .0709     1.888     .481

                        Σ = .3285 = R²
                            .573  = R

(Note: Subscript "1" indicates criterion.)

TABLE VI

FACTORS IN THE SOLUTION OF THE TEST OF MENTAL MATURITY MULTIPLE CORRELATION AND REGRESSION COEFFICIENTS

Part      β1k      r1k      β1k·r1k    σ1/σk     b1k

2        .0414    .172      .0071     4.270     .018
3       -.1193    .198     -.0236     4.104    -.490
4        .0467    .506      .0236     2.994     .140
5        .4476    .557      .2493     5.448    2.439
6        .4586    .568      .2605     1.124     .515

                        Σ = .5169 = R²
                            .719  = R

(Note: Subscript "1" indicates criterion.)
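The columns of Tables V and VI are tied together by two relations: the squared multiple correlation is the sum of the products of each beta weight and its validity coefficient, and each raw-score regression weight is the beta weight multiplied by the ratio of standard deviations. The sketch below checks both against the Table V entries. It is an illustration only; the Part I beta (.4195) is a reading of a damaged table entry, inferred from its printed product .2190 and r = .522.

```python
import math

# Table V entries for Parts I, II, III, and VII of the Aptitude Survey.
# The first beta is an inferred reading of a damaged entry (see lead-in).
BETAS = [0.4195, 0.1026, 0.0455, 0.2549]
VALIDITIES = [0.522, 0.286, 0.205, 0.278]
SD_RATIOS = [0.499, 1.559, 0.412, 1.888]


def multiple_r_squared(betas, validities):
    """Coefficient of multiple determination: sum of beta times validity."""
    return sum(beta * r for beta, r in zip(betas, validities))


def raw_score_weights(betas, sd_ratios):
    """Raw-score regression weights: beta times (criterion SD / test SD)."""
    return [beta * ratio for beta, ratio in zip(betas, sd_ratios)]


r_squared = multiple_r_squared(BETAS, VALIDITIES)
multiple_r = math.sqrt(r_squared)
weights = [round(w, 3) for w in raw_score_weights(BETAS, SD_RATIOS)]
print(round(r_squared, 4), round(multiple_r, 3), weights)
```

With these inputs the sketch recovers R² = .3285 and R = .573, and the four weights match the coefficients of the Survey regression equation.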

Zimmerman battery correlation, the greatest share was contributed by the Verbal Comprehension test (I), and a small amount by the Mechanical Knowledge test (VII) and the General Reasoning test (II). The contribution of the Numerical Operations test (III) was so small as to make its inclusion in the predictive equation of doubtful value.

The regression equation determined for this battery is:

$$X_c = 31.7 + .209(X_1) + .160(X_2) + .019(X_3) + .481(X_7)$$

where X_c is the predicted criterion score and X_1 is the score on Part I of the battery for one individual supervisor.

The "standard error of estimate" was found to be 7.8, which means that two thirds of the criterion D-scores lie within 7.8 points of the criterion scores predicted from scores on the Survey tests. In other words, the test score of a supervisor has a 67 per cent chance of predicting his D-score within 7.8 points. Without such knowledge, the best prediction of his criterion standing would be the mean D-score, with a two-to-one chance that the obtained criterion score would not be in error by more than 9.5 points (the standard deviation).
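The standard error of estimate follows from the criterion standard deviation and the multiple R. A minimal sketch, assuming the usual formula σ√(1 - R²) and the criterion standard deviation of 9.5 given above; the function names are illustrative, and the California value of R (.719) is the one reported earlier in this chapter.

```python
import math

CRITERION_SD = 9.5  # standard deviation of the criterion D-scores


def standard_error_of_estimate(sd: float, multiple_r: float) -> float:
    """Spread of criterion scores about their predicted values."""
    return sd * math.sqrt(1.0 - multiple_r ** 2)


def percent_gain(sd: float, multiple_r: float) -> float:
    """Per cent reduction in error relative to predicting the mean."""
    see = standard_error_of_estimate(sd, multiple_r)
    return 100.0 * (sd - see) / sd


for battery, multiple_r in [("Aptitude Survey", 0.573), ("California Test", 0.719)]:
    see = standard_error_of_estimate(CRITERION_SD, multiple_r)
    print(battery, round(see, 1), round(percent_gain(CRITERION_SD, multiple_r), 1))
```

For the Survey battery this gives an error of about 7.8 points against 9.5 without the tests.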

Thus, a gain of 1.7, or 18 per cent, is evidenced by use of these tests.

The predictive efficiency of the California Test of Mental Maturity. A review of Table VI, page 29, shows that the multiple R of this battery (.719) accounted for 51.7 per cent of the total D-score variance. The vocabulary and inference tests, variables 6 and 5, were the main contributors to the predictive variance, accounting for more than 90 per cent of it. The Similarities test (3) and the Numerical Quantity test (4) contributed very slightly, while the Manipulation of Areas test (2) contributed a negligible amount (and could be withdrawn from use in this particular prediction situation).

The regression equation based on the beta weights became:

$$X_c = 27.6 + .018(X_2) - .490(X_3) + .140(X_4) + 2.439(X_5) + .515(X_6)$$

where the subscripts refer to the criterion and to the variables of the battery, as before.

The "standard error of estimate" for the California battery turned out to be 6.6, which is an apparent advance in predictive efficiency over the Guilford-Zimmerman tests with their corresponding error of 7.8 points. Compared to the same standard deviation of criterion scores (9.5), the standard error of estimate gives a reduction in error of 30.5 per cent, with knowledge of scores on the subject tests.

The relative contributions of the particular tests, in both this and the Survey battery, must be viewed relative to the combination with others in the multiple correlation. Because of the frequent overlap of variance, eliminating one or more tests from the battery would change the contributions of the others.

The California Test of Mental Maturity results are a good example. The r²'s in Table II, page 22, represent the variance accounted for by each test when considered alone. If the correlations between tests were zero, the three language tests of the California battery, when combined, would account for over 75 per cent of the total D-score variance. However, due to intercorrelation, or overlap, the Numerical Quantity test (variable 4) contribution, for example, was decreased from 25.6 per cent to 2.4 per cent.

The significance of the difference between the multiple correlation coefficients.

The question arises as to whether or not the California Test of Mental Maturity multiple R of .719 is significantly greater than the Aptitude Survey multiple R of .573. Or, phrased another way, "what are the chances of obtaining a multiple R for the Aptitude Survey as great as the one obtained for the California Test of Mental Maturity, if we were to sample our population again?"

If one is willing to make the assumptions that (1) one multiple R is the true coefficient, or mean, about which the coefficients of other samplings would distribute themselves, and (2) that such a group of values is normally distributed,² then an approximate answer can be obtained.

2 Guilford, Fundamental Statistics in Psychology and Education, p. 209.

In the present situation the Survey battery multiple R was chosen as the assumed true mean. The California battery correlation coefficient deviates from this mean by .146 units. In terms of the standard error of the mean coefficient (.074) the normal deviate is 1.97.³ In a normal distribution a deviation as large as this would be expected 4.9 times out of one hundred like population samples and indicates a significant difference (at the 5 per cent level of confidence) between the two coefficients.

As another test of significance, Rider's logarithmic transformation method was applied.⁴ His normal deviate was obtained by the formula:

$$x = \frac{X_1 - X_2}{\sqrt{\dfrac{1}{N_1 - 3} + \dfrac{1}{N_2 - 3}}}$$

where X_1 and X_2 are the transformations for the correlation coefficients and N_1 and N_2 the respective number of cases. The normal deviate obtained by this method was x = 1.161. A normal deviate this large would be expected approximately 5.4 times in one hundred samplings, and, as such, is just short of the level that would make the difference between the correlation coefficients significantly different from zero.

3 Ibid., p. 161.
4 Paul R. Rider, An Introduction to Modern Statistical Methods (New York: John Wiley and Sons, Inc., 1939), pp. b4“%.
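The two tests of significance can be sketched as follows. The first divides the difference between the two multiple R's by the standard error of the one taken as the true mean. The second assumes that the logarithmic transformation is Fisher's z, that is, z = ½ ln[(1 + r)/(1 - r)]. The function names are illustrative, and the case counts passed to the second function (84 for each battery) are an assumption of this example; the passage does not state which counts were entered.

```python
import math


def normal_deviate(r_observed: float, r_assumed_mean: float, standard_error: float) -> float:
    """Deviation of one coefficient from the assumed true mean, in standard-error units."""
    return (r_observed - r_assumed_mean) / standard_error


def two_tailed_probability(deviate: float) -> float:
    """Chance of a normal deviate at least this large in either direction."""
    return math.erfc(abs(deviate) / math.sqrt(2.0))


def log_transform_deviate(r1: float, n1: int, r2: float, n2: int) -> float:
    """Normal deviate for the difference of two correlations after Fisher's z transformation."""
    x1 = math.atanh(r1)  # equals 0.5 * ln((1 + r1) / (1 - r1))
    x2 = math.atanh(r2)
    return (x1 - x2) / math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))


first = normal_deviate(0.719, 0.573, 0.074)
print(round(first, 2), two_tailed_probability(first))

# Case counts here are assumed for illustration only.
print(log_transform_deviate(0.719, 84, 0.573, 84))
```

The second result depends directly on the case counts supplied, so it should be read as a demonstration of the method rather than a recomputation of the figure reported above.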

As evidenced by the two approximations of significance, the difference between the multiple correlations of the two batteries is not conclusive, but, nevertheless, the probability of real difference is rather strong.

The intercorrelations. An unexpected result was the size of the intercorrelations among Parts I, II, and III of the Aptitude Survey (Table III, p. 23). Such values are not uncommon among aptitude tests; nor are the California battery intercorrelations lower, compared test for test with the Survey tests. However, for tests designed for factorial purity they are higher than is desirable.

Guilford and Zimmerman expect that the intercorrelations of their tests will generally be very low, with few exceptions. Preliminary analyses of standardization results showed correlations of .19 between Parts I and II, as against the .50 from the supervisor results; .19 between Parts II and III, compared to the present .43; and .16, contrasted to -.18, between Parts II and VII.⁵ These results point to striking differences between the standardization population and the present supervisory group. Some of the other standardization intercorrelations reported, those between Parts I and VII, II and IV, and III and IV, were reasonably close to the corresponding coefficients computed for this study.

5 Guilford and Zimmerman, "The Guilford-Zimmerman Aptitude Survey," Journal of Applied Psychology, 32:32-33, February, 1948.

Perhaps one of the reasons for the sizeable intercorrelations for the supervisors of this study is that in groups such as this sample, the practice in, and knack of, taking paper-and-pencil tests are greater factors than in standardization groups where tests of this type have given much lower intercorrelations. It is conceivable that factors such as number of aptitude tests taken in school, number of aptitude tests taken since then, and length of time since school attendance could easily affect the facility of taking "pressure" tests such as the well intercorrelated Numerical Operations and Reasoning tests. Positive standing on these factors could influence positively verbal facility. This may explain the higher-than-expected intercorrelations of the Verbal Comprehension test in these results. Such influences should have less biasing effect in college standardization groups, since the range on these factors would be more restricted.

The low correlations of Perceptual Speed with every other variable in its battery attest to its purity. There was enough dispersion in the scores of this test so that if there were common variance with this test it should have shown up.

That the only intercorrelations of consequence involving the Mechanical Knowledge test are negative is worth noting. The slight negative relation with the reasoning test shows that one has little success in reasoning out items that should have answers based on past experience. In fact, reasoning would seem a handicap. The even stronger negative relation with Numerical Operations cannot be rationalized. However, conclusions concerning the results of this test have to be made in the light of a small N and a relatively large standard error.

Since the intercorrelations of the California Test of Mental Maturity are not as crucially important to the approach to aptitude testing upon which the battery is based, they will not be discussed in this paper.

The correlations between similar tests of the two batteries.

As a matter of interest in the comparability of tests in opposing batteries, the coefficient of correlation between the two verbal tests was calculated. The correlations of the General Reasoning test with the Numerical Quantity and Inference tests, and of the Numerical Operations test with the Numerical Quantity test, were determined also.

The great apparent similarity between the vocabulary tests was supported by a coefficient of .827. The same was found for the Numerical Quantity and General Reasoning tests, which yielded a coefficient only slightly lower, .613. The obtained correlation between the Numerical Operations and Numerical Quantity tests was .408, while that between the General Reasoning test and the Inference test was a relatively low .233.

The high correlation of the vocabulary tests is not surprising, since these variables are so similar in content. The same is true for the General Reasoning test and the Numerical Quantity test. That the relationship was not even greater may be a result of an effort on the part of the authors of the former to cut down on the number variance by simplification of the quantities involved.⁶ However, the Numerical Quantity test correlated with the Numerical Operations test no more than the General Reasoning test did, thus leaving the reduced number variance content of the General Reasoning test in doubt.

Again the results should be weighed relative to the particular situation, and the comments made concerning the high degree of intercorrelation within batteries apply here also.

Interpretation of results.

The precision of this study might have been improved materially if the sample had been more homogeneous in character. If the supervisors were either all from the office force, or all from the factory force, the relationships might have been even higher, since the criterion could then be more meaningful. Furthermore, the results for such a group could be handled more definitively in applications to other groups.

6 Ibid., p. 27.

The results of the study would have been improved further by considering a wider range of talent. Ideally, the most meaningful experimental situation would be one in which the unsatisfactory supervisors were so incapable in supervisory performance as to be released from their positions and the satisfactory supervisors were of excellent ability. Some of the supervisors of this study no doubt were of excellent quality, but the fact that all of the cases lasted the several months between the administration of the two batteries makes it doubtful that the former is represented sufficiently. Consequently, the range of variation in ability is definitely restricted in character.

However, the relationships found between the tests and the criterion of this study are superior to the results of the majority of comparable studies reported in the literature. It is probable that one reason for the many low correlations obtained by other investigators is the use of criteria having low job-validity and low reliability. Whether the advantage of the correlations of this study lies in the superiority of the tests or in the criterion, the results indicate that aptitude tests can do a good job. In addition, they promise even better results once they are improved in independence of factor content and new variables are provided to account for the hidden variance in the true criteria.

CHAPTER V I

SUMMARY AND CONCLUSION

I. SUMMARY

Five variables of the Guilford-Zimmerman Aptitude Survey and the six variables of the California Short-Form Test of Mental Maturity were administered to eighty-four aircraft company supervisors. Using a composite score of the supervisor's over-all standing with the company as the criterion, a correlational analysis of the results was made to determine the predictive relations of the tests, and to compare the two batteries as to effectiveness in the prediction of supervisory success.

The findings of the study. The more important determinations of the analysis are as follows:

1. The highest correlating variable of each battery was its vocabulary test.

2. The California Test of Mental Maturity yielded a multiple correlation of .719 (when 5 tests were included and optimally weighted), representing a 30 per cent increase in accuracy of prediction beyond that which would prevail without its use.

3. The Guilford-Zimmerman Aptitude Survey yielded a multiple correlation of .573 (with 4 tests included and optimally weighted), representing an 18 per cent increase in predictive efficiency.

4. The apparent superiority of the California Test of Mental Maturity in this prediction situation was shown to be significant by one test of significance, and just short of the common standard of significance by the application of another criterion.

5. The significance of the multiple correlations was beyond much doubt.

6. The tests of each battery were found to be approximately equally intercorrelated. However, the Perceptual Speed and Mechanical Knowledge tests of the Guilford-Zimmerman Aptitude Survey indicated noteworthy independence of variance.

7. No good explanation for the spurious intercorrelations of the other Aptitude Survey parts could be found, though practice and knack of paper-and-pencil test writing was considered a possibility.

8. The variance relationships between several tests of one battery and similar tests of the other battery were determined and the findings rationalized in the light of the expectations for the pure-factor tests.

II. CONCLUSION

This thesis has shown that paper-and-pencil aptitude tests can substantially increase the accuracy of predicting the success of aircraft company supervisors. Further gains are to be sought by use of techniques designed to maximize variance of crucial factors.¹ One such technique, which most probably would have increased the efficiency of the Guilford-Zimmerman battery in this study, is the use of a suppression variable. For example, a weighted score for a combination of "age" and "number of aptitude tests taken previously" might have served to suppress the unwanted common variance of the Verbal Comprehension, Reasoning, and Numerical Operations tests, while maximizing their unique contributions.

The results should be applicable to other types of large engineering, assembling, and fabricating companies as well. The findings apply to a varied assortment of supervisors including personnel from factory, office, and staff specialist groups. The relations found may be strengthened or weakened by considering a more exclusive group, but in all probability they would be strengthened in a well designed study. An investigation similar to this, but using a more exclusive group, would be a valuable contribution in this field of research.

1 J. P. Guilford and William B. Michael, "Approaches to Univocal Factor Scores," Psychometrika, 13:1-22, March, 1948.

Further research should be pursued in this field with factorially pure tests. Though the Verbal Comprehension test was the only unique-factor test to show strong relationship with the criterion in this study, other basic factor tests, including the Spatial Orientation and Spatial Visualization parts of the Guilford-Zimmerman Aptitude Survey, should be included in future studies. Until this is done, the value of the pure-test technique for selection and classification purposes cannot be fairly ascertained.

BIBLIOGRAPHY

A. BOOKS

Bingham, Walter Van Dyke, Aptitudes and Aptitude Testing. New York: Harper, 1937. 396 pp.

Burtt, Harold E., Principles of Employment Testing. Revised edition; New York: Harper, 1942. 568 pp.

Fisher, R. A., Statistical Methods for Research Workers. Revised edition; Edinburgh: Oliver and Boyd, 1945. 354 pp.

Guilford, J. P., Fundamental Statistics in Psychology and Education. Fourth edition; New York: McGraw-Hill Book Company, Inc., 1942. 333 pp.

_______, editor, Printed Classification Tests. Report No. 5, AAF Aviation Psychology Program Research Reports. Washington: Government Printing Office, 1947. 919 pp.

_______, Psychometric Methods. New York: McGraw-Hill Book Company, Inc., 1936. 566 pp.

Hull, Clark L., Aptitude Testing. New York: World Book Company, 1928. 535 pp.

Rider, Paul R., An Introduction to Modern Statistical Methods. New York: John Wiley and Sons, Inc., 1939. 220 pp.

Thurstone, L. L., Primary Mental Abilities. Psychometric Monographs, No. 1. Chicago: University of Chicago Press, 1938. 121 pp.

_______, The Reliability and Validity of Tests. Ann Arbor: Edwards Brothers, 1931. 113 pp.

B. PERIODICAL ARTICLES

Baldwin, E. F., and L. F. Smith, "The Performance of Adult Female Applicants for Factory Work on the Likert-Quasha Revision of the Minnesota Paper Form Board Test," Journal of Applied Psychology, 28:468-470, December, 1944.

Bills, Marian A., "Trends in Selection for Employment," Personnel, 15:184-193, May, 1939.

Burt, Cyril, "Validating Tests for Personnel Selection," British Journal of Psychology, 34:1-19, September, 1943.

Driver, R. S., "The Value of Psychological Tests in Industry," Personnel, 19:656-664, March, 1943.

Frandsen, Arden N., and John M. Hadley, "The Prediction of Achievement in a Radio Training School," Journal of Applied Psychology, 27:303-310, August, 1943.

Guilford, J. P., "New Standards for Test Evaluation," Educational and Psychological Measurement, 6:425-436, October, 1946.

_______, "The Discovery of Aptitude and Achievement Variables," Science, 106:279-282, September 26, 1947.

_______, and William B. Michael, "Approaches to Univocal Factor Scores," Psychometrika, 13:1-22, March, 1948.

_______, and Wayne S. Zimmerman, "The Guilford-Zimmerman Aptitude Survey," Journal of Applied Psychology, 32:24-34, February, 1948.

_______, and Wayne S. Zimmerman, "Some AAF Findings Concerning Aptitude Factors," Occupations, 26:154-156, December, 1947.

Harrell, Willard, "Testing Cotton Mill Supervisors," Journal of Applied Psychology, 24:31-35, January, 1940.

_______, and Richard Faubion, "Selection Tests for Aviation Mechanics," Journal of Consulting Psychology, 4:104-105, No. 2, March, 1940.

Hay, Edward N., "Tests in Industry," Personnel Journal, 20:3-15, May, 1941.

_______, "The Use of Psychological Tests in Selection and Promotion," Personnel, 16:114-123, February, 1940.

Holliday, F. A., "A Survey of an Investigation into the Selection of Apprentices for the Engineering Industry," Occupational Psychology, 16:1-19, January, 1942.

Irwin, R. Randell, "Lockheed's Full Testing Program," Personnel Journal, 21:103-106, September, 1942.

Kirkpatrick, Forrest H., "Common Sense about Tests," Personnel Journal, 21:277-281, February, 1943.

Mandell, Milton M., "Testing for Administrative and Supervisory Positions," Public Personnel Review, 9:190-193, October, 1948.

McDaniel, J. W., and William A. Reynolds, "A Study of the Use of Mechanical Aptitude Tests in the Selection of Trainees for Mechanical Occupations," Educational and Psychological Measurement, 4:191-197, No. 3, Autumn, 1944.

Richardson, Marion W., "The Interpretation of a Test Validity Coefficient in Terms of Increased Efficiency of a Selected Group of Personnel," Psychometrika, 9:245-248, December, 1944.

Sartain, A. Q., "Relation Between Scores on Certain Standard Tests and Supervisory Success in an Aircraft Factory," Journal of Applied Psychology, 30:328-332, August, 1946.

Schultz, Richard S., "Personnel Selection in Aviation Industry," Personnel Journal, 19:99-105, September, 1940.

Shuman, J. T., "The Value of Aptitude Tests for Factory Workers in the Aircraft Engine and Propeller Industries," Journal of Applied Psychology, 29:156-160, 185-190, April and June, 1945.

Smith, Robert E., "Foreman Selection Through Merit Rating," Personnel, 20:270-277, March, 1944.

Solomon, Richard S., "Do Your Tests Pick Good Workers?" Personnel Journal, 20:177-183, November, 1941.

Stockford, Lee, "Selection of Supervisory Personnel," Personnel, 24:186-199, November, 1947.

_______, and H. W. Bissell, "Factors Involved in Establishing a Merit Rating Scale," Personnel, 26:94-116, September, 1949.

Stuit, Dewey B., "The Effect of the Nature of the Criterion Upon the Validity of Aptitude Tests," Educational and Psychological Measurement, 7:671-676, No. 4, Winter, 1947.

Taylor, H. C., and J. T. Russell, "The Relationship of Validity Coefficients to the Practical Effectiveness of Tests in Selection: Discussion and Tables," Journal of Applied Psychology, 23:565-578, October, 1939.

Thompson, Claud Edward, "Selecting Executives by Psychological Tests," Educational and Psychological Measurement, 7:773-778, No. 4, Winter, 1947.

Tucker, Ledyard R., "A Note on the Computation of a Table of Intercorrelations," Psychometrika, 13:245-250.

Wadsworth, Guy W., Jr., "Tests Prove Their Worth to a Utility," Personnel Journal, 14:183-187, November, 1935.

_______, "The Use of Tests in Selection," Personnel Administration, 2:1-8, No. 6, February, 1940.

C. MANUALS

Guilford, J. P., and Wayne S. Zimmerman, The Guilford-Zimmerman Aptitude Survey--A Manual of Instructions and Interpretations. Beverly Hills: Sheridan Supply Company, 1947. 8 pp.

Sullivan, Elizabeth T., Willis W. Clark, and Ernest W. Tiegs, Manual of Directions, California Short Form Test of Mental Maturity--Advanced S-Form. Los Angeles: California Test Bureau, l^jTT 8 pp.

APPENDIX I