MEASURES OF ASSOCIATION
Long before there were statisticians, folk knowledge was commonly based on statistical associations. When an association was recognized between stomach distress and eating a certain type of berry, that berry was labeled as poisonous and avoided. For millennia, farmers the world over have observed an association between drought and a diminished crop yield. The association between pregnancy and sexual intercourse apparently was not immediately obvious, not simply because of the lag between the two events, but also because the association is far from perfect—that is, pregnancy does not always follow intercourse. Folk knowledge has also been laced with superstitions, commonly based on erroneously believed statistical associations. For example, people have believed that there is an association between breaking a mirror and a long stretch of bad luck, and in many cultures people have believed that there is an association between certain ritual incantations and benevolent intervention by the gods.
Scholarly discussions sometimes focus on whether a given association is actually true or erroneously believed to be true. Is there an association between gender and mathematical ability? Between harsh punishment and a low incidence of crime? Between the size of an organization and the tendency of its employees to experience alienation? In contemporary discussions, questions and conclusions may be expressed in terms of "risk factors." For example, one might seek to find the risk factors associated with dropping out of school, with teen suicide, or with lung cancer. "Risk factors" are features that are associated with these outcomes but that can be discerned prior to the outcome itself. Although a reference to "risk" suggests that the outcome is undesirable, researchers may, of course, explore factors that are associated with positive as well as negative outcomes. For example, one could examine factors associated with appearance in Who's Who, with the success of a treatment regimen, or with a positive balance of international trade.
Referring to a "risk factor" entails no claim that the associated factor has an effect on the outcome, whereas a statistical association is sometimes erroneously interpreted as indicating that one variable has an effect on the other. To be sure, an interest in the association between two variables may derive from some hypothesis about an effect. Thus, if it is assumed that retirement has the effect of reducing the likelihood of voting, the implication is that retirement and non-voting should be statistically associated. But the reverse does not hold; that is, the fact that two variables are statistically associated does not, by itself, imply that one of those variables has an effect on the other. For example, if it is true that low attachment to parents encourages involvement in delinquency, it should be true that low attachment and delinquency are statistically associated. But a statistical association between low attachment and delinquency involvement might arise for other reasons as well. If both variables are influenced by a common cause, or if both are manifestations of the same underlying tendency, those variables will be statistically associated with each other, even if there is no effect of one on the other. Finding a statistical association between two variables, even a strong association, does not, in itself, tell the reason for that association. It may result from an effect of one variable on another, or from the influence of a common cause on both variables, or because both variables reflect the same underlying tendency. Furthermore, if the association is transitory, it may appear simply because of an accident or coincidence. Discovering the reason for a statistical association always entails inquiry beyond simply demonstrating that the association is there.
WHY MEASURE ASSOCIATION?
The focus here is on measures of association for categorical variables, with brief attention to measures appropriate for ordered categories. Quantitative variables, exemplified by age and income, describe each case by an amount; that is, they locate each case along a scale that varies from low to high. In contrast, categorical variables, exemplified by gender and religious denomination, entail describing each case by a category; that is, they indicate which of a set of categories best describes the case in question. Such categories need not be ordered. For example, there is no inherent ordering for the categories that represent region. But if the categories (e.g., low, medium, and high income) are ordered, it may be desirable to incorporate order into the analysis. Some measures of association have been designed specifically for ordered categories.
The degree of association between categorical variables may be of interest for a variety of reasons. First, if a weak association is found, we may suspect that it is just a sampling fluke—a peculiarity of the sample in hand that may not be found when other samples are examined. The strength of association provides only a crude indication of whether an association is likely to be found in other samples, and techniques of statistical inference developed specifically for that purpose are preferred, provided the relevant assumptions are met.
Second, a measure of the degree of association may be a useful descriptive device. For example, if the statistical association between region and college attendance is strong, that suggests differential access to higher education by region. Furthermore, historical changes in the degree of association may suggest trends of sociological significance, and a difference in the degree of association across populations may suggest socially important differences between communities or societies. For example, if the occupations of fathers and sons are more closely associated in Italy than in the United States, that suggests greater intergenerational social mobility in the latter than in the former. Considerable caution should be exercised in comparing measures of association for different times or different populations, because such measures may be influenced by a change in the marginal frequencies as well as by a change in the linkage between the variables in question (see Reynolds 1977).
Third, if a statistical association between two variables arises because of the effect of one variable on the other, the degree of association indicates the relative strength of this one influence as compared to the many other variables that also have such an effect. Unsophisticated observers may assume that a strong association indicates a causal linkage, while a weak association suggests some kind of noncausal linkage. But that would be naïve. The strength of association does not indicate the reason for the association. But if an association appears because of a causal link between two variables, the strength of that association provides a rough but useful clue to the relative importance of that particular cause relative to the totality of other causes. For example, income probably influences the tendency to vote Democratic or Republican in the United States, but income is not the only variable affecting the political party favored with a vote. Among other things, voting for one party rather than another is undoubtedly influenced by general political philosophy, by recent legislative actions attributed to the parties, by specific local representatives of the parties, and by the party preferences of friends and neighbors. Such multiple influences on an outcome are typical, and the degree of association between the outcome and just one of the factors that influence it will reflect the relative "weight" of that one factor in comparison to the total effect of many.
Fourth, if a statistical association between two variables arises because both are influenced by a common cause, or because both are manifestations of the same underlying tendency, the degree of association will indicate the relative strength of the common cause or the common tendency, in comparison to the many other factors that influence each of the two variables. Assume, for example, that participation in a rebellious youth subculture influences adolescents to use both alcohol and marijuana. If the resulting association between the two types of substance use is high, this suggests that the common influence of the rebellious youth subculture (and perhaps other common causes) is a relatively strong factor in both. On the other hand, if this association is weak, it suggests that while the rebellious youth subculture may be a common cause, each type of substance use is also heavily influenced by other factors that are not common to both types.
Fifth, the degree of association indicates the utility of associated factors ("risk factors") as predictors, and hence the utility of such factors in focusing social action. Assume, for example, that living in a one-parent home is statistically associated with dropping out of high school. If the association between these two variables is weak, knowing which students live in a one-parent home would not be very useful in locating cases on which prevention efforts should be concentrated for maximum effectiveness. On the other hand, if this association is strong, that predictor would be especially helpful in locating cases for special attention and assistance.
In summary, we may be interested in the degree of statistical association between variables because a weak association suggests the possibility that the association is a fluke that will not be replicated, because changes in the degree of association may help discern and describe important social trends or differences between populations, because the degree of association may help determine the relative importance of one variable that influences an outcome in comparison to all other influences, because the degree of association may reflect the degree to which two variables have a common cause, and because the degree of association will indicate the utility of associated factors in predicting an outcome of interest.
MEASURING THE DEGREE OF ASSOCIATION
The degree of statistical association between two variables is most readily assessed if, for a suitable set of cases, the relevant information is tallied in a cross-classification table. Table 1 displays such a
Table 1. College Attendance by Race in a Sample of 20-Year-Olds in Centerville, 1998 (Hypothetical Data)

Attending College? | White | Black | Asian-American | Total
Yes | 400 | 60 | 80 | 540
No | 300 | 140 | 20 | 460
Total | 700 | 200 | 100 | 1,000
table. For a contrived sample of young adults, this table shows the number who are and who are not attending college in each of three racial groupings. Hence the two variables represented in this table are (1) race (white, black, Asian-American) and (2) attending college (or not) at a given time. The frequencies in the cells indicate how the cases are jointly distributed over the categories of these two variables. The totals for each row and column (the "marginal frequencies" or simply the "marginals") indicate how the cases are distributed over these two variables separately.
If young adults in all racial groupings were equally likely to be attending college, then there would be no association between these two variables. Indeed, the simplest of all measures of association is just a percentage difference. For example, blacks in this set of cases were unlikely to be attending college (i.e., 30 percent were enrolled), while Asian-Americans were very likely to be attending (80 percent). Hence we may say that the percentage attending college in the three racial groupings represented in this table ranges from 30 percent to 80 percent, a difference of 50 percentage points. In this table, with three columns, more than one percentage difference could be cited, and the one alluded to above is simply the most extreme of the three comparisons that could be made between racial groupings. Generally speaking, a percentage difference provides a good description of the degree of association only in a table with exactly two rows and two columns. Even so, citing a percentage difference is a common way of describing the degree of statistical association.
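As an illustration of this simplest measure, the brief Python sketch below (an editorial aid, not part of the original discussion) computes the column percentages for Table 1 and the most extreme percentage difference among them.

```python
# Column percentages for Table 1 and the most extreme percentage difference.
table1 = {                       # racial grouping: (attending, not attending)
    "white": (400, 300),
    "black": (60, 140),
    "asian_american": (80, 20),
}

pct_attending = {group: 100 * yes / (yes + no)
                 for group, (yes, no) in table1.items()}
print(pct_attending)             # white 57.1, black 30.0, asian_american 80.0

spread = max(pct_attending.values()) - min(pct_attending.values())
print(f"extreme percentage difference: {spread:.0f} points")   # 50 points
```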
Leaving aside the difference between percentages, most measures of association follow one of two master formulas, and a third way of assessing association provides the basis for analyzing several variables simultaneously. The oldest of these master formulas is based on the amount of departure from statistical independence, normed so that the measure will range from 0 (when the two variables are statistically independent and hence not associated at all) to 1.0 or something approaching 1.0 (when the cross-classification table exhibits the maximum possible departure from statistical independence). The several measures based on this master formula differ from each other primarily in the way the departure from statistical independence is normed to yield a range from 0 to 1.
The second master formula is based on the improvement in predictive accuracy that can be achieved by a "prediction rule" that uses one variable to predict the other, as compared to the predictive accuracy achieved from knowledge of the marginal distribution alone. The several measures based on this master formula differ from each other in the nature of the "prediction rule" and also in what is predicted (e.g., the category of each case, or which of a pair of cases will be higher). When such a measure is 0 there is no improvement in predictive accuracy when one variable is predicted from another. As the improvement in predictive accuracy increases, these measures of association will increase in absolute value up to a maximum of 1, which indicates prediction with no errors at all.
A third important way of assessing association, used primarily when multiple variables are analyzed, is based on the difference in odds. In Table 1, the odds that an Asian-American is attending college are "4 to 1"; that is, 80 are in college and 20 are not. If such odds were identical for each column, there would be no association, and the ratio of the odds in one column to the odds in another would be 1.0. If such ratios differ from 1.0 (in either direction), there is some association. An analysis of association based on odds ratios (and more specifically on the logarithm of odds ratios) is now commonly referred to as a loglinear analysis. This mode of analysis is not discussed in detail here.
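The following sketch, offered only as an illustration of the odds-based approach, computes the within-column odds of attending college in Table 1 and one of the resulting odds ratios; the choice of which two columns to compare is arbitrary here.

```python
# Within-column odds of attending college in Table 1, and an odds ratio.
table1 = {
    "white": (400, 300),
    "black": (60, 140),
    "asian_american": (80, 20),
}

odds = {group: attending / not_attending
        for group, (attending, not_attending) in table1.items()}
print(odds)                      # asian_american 4.0 ("4 to 1"), white 1.33, black 0.43

# If the odds were identical in every column, each such ratio would be 1.0.
print(odds["asian_american"] / odds["black"])   # about 9.3
```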
Departure from Statistical Independence. The traditional definition of statistical independence is expressed in terms of the probability of events; that is, events A and B are statistically independent if, and only if:

P(A|B) = P(A)     (1)

"P(A)" may be read as the probability that event A occurs. This probability is usually estimated empirically by looking at the proportion of all relevant events (A plus not-A) that are A. Referring to Table 1, if event A means attending college, then the probability of event A is estimated by the proportion of all relevant cases that are attending college. In Table 1, this is .54 (i.e., 540 were attending out of the table total of 1,000).
"P(A|B)" may be read as the conditional probability that event A occurs, given that event B occurs, or, more briefly "the probability of A given B." This conditional probability is usually estimated in a manner parallel to the estimation of P(A) described above, except that the relevant cases are limited to those in which event B occurs. Referring to Table 1 again, if event A means attending college and event B refers to being classified as Asian-American, then P(A|B) = .80 (i.e., 80 are attending college among the 100 who are Asian-American).
As indicated above, the traditional language of probability refers to "events," whereas the traditional language of association refers to "variables." But it should be evident that if "events" vary in being A or not-A, then we have a "variable." The difference between the language used in referring to the probability of "events" and the language used in referring to a statistical association between "variables" need not be a source of confusion, since one language can be translated into the other.
If we take as given the "marginal frequencies" in a cross-classification table (i.e., the totals for each row and each column), then we can readily determine the probability of any event represented by a category in the table; that is, we can determine P(A). Since P(A|B) = P(A) if statistical independence holds, we can say what P(A|B) would be for any A and B in the table if statistical independence held. Otherwise stated, if the marginal frequencies remain fixed, we can say what frequency would be expected in each cell if statistical independence held. Referring again to Table 1 and assuming the marginal frequencies remain fixed as shown, if statistical independence held in the table, then 54 percent of those in each racial grouping would be attending college. This is because, if statistical independence holds, then P(A) (i.e., .54) must equal P(A|B) for all B (i.e., the proportion enrolled in each of the columns). Hence, if statistical independence held, we would have 378 whites in college (i.e., 54 percent of the 700 whites in the table), 108 blacks in college (i.e., 54 percent of the 200 blacks in the table), and 54 Asian-Americans in college (i.e., 54 percent of the 100 Asian-Americans in the table.) Evidently, the number not enrolled in each racial grouping could be obtained in a similar way (i.e., 46 percent of each column total), or by subtraction (e.g., 700 whites minus 378 whites enrolled leaves 322 whites not attending college). Hence, the frequencies that would be "expected" for each cell if statistical independence held can be calculated, not just for Table 1 but for any cross-classification table.
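A minimal sketch of this calculation follows; it recovers the expected frequencies for Table 1 from the marginals alone, using the equivalent shortcut that each expected cell frequency equals the product of its row and column totals divided by the table total.

```python
# Expected frequencies for Table 1 under statistical independence,
# computed from the fixed marginals: E = (row total * column total) / N.
observed = [
    [400, 60, 80],    # attending college: white, black, asian-american
    [300, 140, 20],   # not attending college
]

row_totals = [sum(row) for row in observed]          # [540, 460]
col_totals = [sum(col) for col in zip(*observed)]    # [700, 200, 100]
n = sum(row_totals)                                  # 1000

expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)   # [[378.0, 108.0, 54.0], [322.0, 92.0, 46.0]]
```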
If the "expected" frequencies for each cell are very similar to the "observed" frequencies, then the departure from statistical independence is slight. But if the "expected" frequencies differ greatly from the corresponding "observed" frequencies, then the table displays a large departure from statistical independence. When the departure from statistical independence reaches its maximum, an ideally normed measure of association should then indicate an association of 1.0.
A quantity called chi square is conventionally used to reflect the degree of departure from statistical independence in a cross-classification table. Chi square was originally devised as a statistic to be used in tests against the null hypothesis; it was not designed to serve as a measure of association for a cross-classification table and hence it does not range between 0 and 1.0. Furthermore, it is not well suited to serve as a measure of association because it is heavily influenced by the total number of cases and by the number of rows and columns in the cross-classification table. Even so, calculating chi square constitutes the first step in calculating measures of association based on departure from statistical independence. For a cross-classification table, this statistic will be zero when the observed frequencies are identical to the frequencies that would be expected if statistical independence held, and chi square will be progressively larger as the discrepancy between observed and expected frequencies increases. As indicated below, in the calculation of chi square, the differences between the frequencies observed and the frequencies expected if statistical independence held are squared and weighted by the reciprocal of the expected frequency. This means, for example, that a discrepancy of 3 will be more heavily weighted when the expected frequency is 5 than when the expected frequency is 50. These operations are succinctly represented in the following formula for chi square:

χ² = Σ [(Oi − Ei)² / Ei]     (2)

where χ² = chi square
Σ is the instruction to sum the quantity that follows over all cells
Oi = the frequency observed in the ith cell
Ei = the frequency expected in the ith cell if statistical independence holds
To illustrate equation (2), consider Table 2, which shows, for each cell, (1) the observed frequency (O) as previously shown in Table 1; (2) the frequency expected (E) if statistical independence held; and (3) the squared difference between observed and expected frequencies, divided by the expected frequency. When the quantities in the third of these rows are summed over the six cells, in accord with the instruction in equation (2), we obtain a chi square of 76.3. There are various ways to norm chi square to create a measure of association, although no way of norming chi square is ideal, since the maximum possible value will not be 1.0 under some commonly occurring conditions.
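The same figure can be reproduced with a few lines of Python; the small sketch below applies equation (2) directly to Table 1 (summing unrounded cell contributions, which is why it returns 76.4 rather than the 76.3 obtained by summing the rounded values shown in Table 2).

```python
# Chi square for Table 1: the sum over all six cells of (O - E)^2 / E.
observed = [
    [400, 60, 80],
    [300, 140, 20],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, col_totals)
)
print(round(chi_square, 1))   # 76.4 from unrounded cell values
```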
The first measure of association based on chi square was Pearson's Coefficient of Contingency (C), which is defined as follows:

C = √[χ² / (χ² + N)]

where N is the total number of cases in the table.
This measure can never reach 1.0, although its maximum possible value approaches 1.0 as the number of rows and columns in the table increases. For Table 1, C = .27. An alternative measure of association based on chi square is Cramer's V, which was developed in an attempt to achieve a more appropriately normed measure of association. V is defined as follows:

V = √[χ² / (N × Min(r − 1, c − 1))]
The instruction in the denominator is to multiply the table total (N) by whichever is smaller: the number of rows minus 1 or the number of columns minus 1 (i.e., the "Min," or minimum, of the two quantities in parentheses). In Table 1, the minimum of the two quantities is r − 1 = 1, and the denominator thus becomes N. Hence, for Table 1, V = .28.
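Both coefficients follow directly from the chi square already computed, as the sketch below illustrates; the value of chi square is taken from the calculation above rather than recomputed.

```python
# Pearson's coefficient of contingency and Cramer's V for Table 1.
from math import sqrt

chi_square = 76.37   # from the calculation above, before rounding
n = 1000             # table total
n_rows, n_cols = 2, 3

C = sqrt(chi_square / (chi_square + n))
V = sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))
print(round(C, 2), round(V, 2))   # 0.27 0.28
```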
While this measure can reach an upper limit of 1.0 under certain conditions, it cannot be 1.0 in some tables. In Table 1, for example, if all of the 540 persons attending college were white, and all blacks and Asian-Americans were not in college, chi square would reach a value of approximately 503 (and no other distribution that preserves the marginals would yield a larger chi square). This maximum possible departure from statistical independence (given the marginal frequencies) yields a V of .71.
In the special case of a cross-classification table with two rows and two columns (a "2 × 2 table"), V becomes phi (Φ), where

Φ = √(χ² / N)

The maximum possible value of phi in a given table is 1.0 if and only if the distribution over the two row categories is identical to the distribution over the two column categories. For example, if the cases in the two row categories are divided, with 70 percent in one and 30 percent in the other, the maximum value of phi will be 1.0 if and only if the cases in the two column categories are also divided, with 70 percent in one and 30 percent in the other.
Table 2. College Attendance by Race: Observed Frequencies (from Table 1), Expected Frequencies Assuming Statistical Independence, and (O − E)²/E for Each Cell

Attending College? | White | Black | Asian-American | Total
Yes: observed | 400 | 60 | 80 | 540
Yes: expected | 378 | 108 | 54 |
Yes: (O − E)²/E | 1.3 | 21.3 | 12.5 |
No: observed | 300 | 140 | 20 | 460
No: expected | 322 | 92 | 46 |
No: (O − E)²/E | 1.5 | 25.0 | 14.7 |
Total | 700 | 200 | 100 | 1,000

χ² = 1.3 + 21.3 + 12.5 + 1.5 + 25.0 + 14.7 = 76.3
C = .27
V = .28
Some consider measures of association based on chi square to be flawed because commonly encountered marginals may imply that the association cannot possibly reach 1.0, even if the observed frequencies display the maximum possible departure from statistical independence, given the marginal frequencies. But one may also consider this feature appropriate because, if the degree of statistical association in a cross-classification table were perfect, the marginal distributions would not be disparate in a way that would limit the maximum value of the measure.
A more nagging concern about measures of association based on the departure from statistical independence is the ambiguity of their meaning. One can, of course, use such measures to say that one association is very weak (i.e., close to zero) and that another is relatively strong (i.e., far from zero and perhaps close to the maximum possible value, given the marginals), but "weak" and "strong" are relatively crude descriptors. The measures of association based on chi square may also be used in making comparisons. Thus, if a researcher wished to compare the degree of association in two populations, C or V could be compared for the two populations to determine whether the association was approximately the same in both and, if not, in which population the association was stronger. But there is no clear interpretation that can be attached to a coefficient of contingency of precisely .32, or a Cramer's V of exactly .47.
Relative Reduction in Prediction Error. We shift now to measures of association that reflect the relative reduction in prediction error. Since such measures indicate the proportion by which prediction errors are reduced by shifting from one prediction rule to another, we follow common practice and refer to them as proportional reduction in error (PRE) measures. Every PRE measure has a precise interpretation, and sometimes it is not only precise but also clear and straightforward. On the other hand, the PRE interpretation of some measures may seem strained and rather far removed from the common sense way of thinking about the prediction of one variable from another.
The basic elements of a PRE measure of association are:
1. A specification of what is to be predicted, and a corresponding definition of prediction error. For example, we might say that what is to be predicted is the row category into which each case falls, and a corresponding definition of prediction error would be that a case falls in a row category other than that predicted. Referring again to Table 1, if we are predicting whether a given case is attending college or not, a corresponding definition of prediction error would be that our predicted category (attending or not) is not the same as the observed category for that case.
2. A rule for predicting either the row variable or the column variable in a cross-classification table from knowledge of the marginal distribution of that variable alone. We will refer to the prediction error when applying this rule as E1. For example, if what is to be predicted is as specified above (i.e., the row category into which each case falls), the rule for predicting the row variable from knowledge of its marginal distribution might be to predict the modal category for every case. This is not the only possible prediction rule, but it is a reasonable one (and there is no rule based only on the marginals that would have higher predictive accuracy). Applying this rule to Table 1, we would predict "attending college" (the modal category) for every case. We would then be wrong in 460 of the 1,000 cases in the table; that is, each of the 460 cases not attending college would be a prediction error, since we predicted attending college for every case. Hence, in this illustration E1 = 460.
3. A rule for predicting the same variable as in step (2) from knowledge of the joint distribution of both variables. We will refer to the prediction error when applying this rule as E2. For example, continuing with the specifications in steps (1) and (2) above, we specify that we will predict the row category for each case by taking the modal category for each column. Thus, in Table 1, we would predict "attending college" for all whites (the modal category for whites), "not attending college" for all blacks (the modal category for blacks), and "attending college" for Asian-Americans. The prediction errors are then 300 for whites (i.e., the 300 not attending college, since we predicted attending for all whites), 60 for blacks (i.e., the 60 attending college, since we predicted not attending for all blacks), and 20 for Asian-Americans, for a total of 380 prediction errors. Thus, in this illustration E2 = 380.
4. The calculation of the proportion by which prediction errors are reduced by shifting from the rule in step (2) to the rule in step (3), that is, the calculation of the proportional reduction in error. This is calculated by:

PRE = (E1 − E2) / E1
The numerator in this calculation is the amount by which error is reduced. Dividing this amount by the starting error indicates what proportion of the possible reduction in prediction error has actually been achieved. Utilizing the error calculations above, the proportional reduction in error is (460 − 380)/460 = 80/460 = .174. This calculation indicates that we achieve a 17.4 percent reduction in prediction error by shifting from predicting the row category that is the marginal mode for all cases to predicting the column-specific modal category for all cases in a given column.
The PRE measure of association with prediction rules based on modal categories (illustrated above) is undoubtedly the simplest of the many PRE measures that have been devised. This measure is called lambda (λ), and for a given cross-classification table there are two lambdas. One of these focuses on predicting the row variable (λr), and the other focuses on predicting the variable that is represented in columns (λc). In the illustration above, we computed λr; that is, we were predicting college attendance, which is represented in rows. Shifting to λc (i.e., making the column variable the predicted variable), we find that the proportional reduction in error is 0. This outcome is evident from the fact that the modal column for the table ("white") is also the modal column for each row. Thus, the prediction errors based on the marginals sum to 300 (i.e., the total who are not "white") and the prediction errors based on row-specific modal categories also sum to 300, indicating no reduction in prediction error. Thus, the proportional reduction in prediction error (as measured by lambda) is not necessarily the same for predicting the row variable as for predicting the column variable.
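The sketch below reproduces both lambdas for Table 1 using the modal-category prediction rules described above; the helper name lam is simply an editorial convenience (lambda is a reserved word in Python).

```python
# Lambda for Table 1, using the modal-category prediction rules above.
observed = [
    [400, 60, 80],    # attending college: white, black, asian-american
    [300, 140, 20],   # not attending college
]

def lam(predicted_margins, conditional_distributions):
    """Proportional reduction in error from modal-category predictions."""
    n = sum(predicted_margins)
    e1 = n - max(predicted_margins)      # errors when predicting the marginal mode
    e2 = sum(sum(dist) - max(dist) for dist in conditional_distributions)
    return (e1 - e2) / e1

rows = observed
cols = list(zip(*observed))
row_totals = [sum(r) for r in rows]      # [540, 460]
col_totals = [sum(c) for c in cols]      # [700, 200, 100]

lambda_r = lam(row_totals, cols)         # predict attendance from race: (460 - 380) / 460
lambda_c = lam(col_totals, rows)         # predict race from attendance: (300 - 300) / 300
print(round(lambda_r, 3), lambda_c)      # 0.174 0.0
```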
An alternative PRE measure for a cross-classification table is provided by Goodman and Kruskal's (1954) tau measures (τr and τc). These measures are based on prediction rules that distribute predictions so as to recreate the observed distributions instead of concentrating all predictions in the modal category. In doing so, there is an expected number of misclassified cases (prediction errors), and these expected numbers are used in calculating the proportional reduction in error. In Table 1, τr = .08 and τc = .03. Although λc was found to be zero for Table 1 because the modal column is the same in all rows, τc is not zero because the percentage distributions within rows are not identical.
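A sketch of the tau calculation for Table 1 follows, using the proportional prediction rule described above; the helper function is an editorial convenience, not the authors' notation.

```python
# Goodman and Kruskal's tau for Table 1, using proportional prediction.
observed = [
    [400, 60, 80],
    [300, 140, 20],
]

def tau(predicted_margins, conditional_distributions):
    """Proportional reduction in expected misclassifications."""
    n = sum(predicted_margins)
    e1 = n - sum(m * m for m in predicted_margins) / n
    e2 = sum(sum(dist) - sum(f * f for f in dist) / sum(dist)
             for dist in conditional_distributions)
    return (e1 - e2) / e1

rows = observed
cols = list(zip(*observed))
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in cols]

tau_r = tau(row_totals, cols)   # predicting attendance from race
tau_c = tau(col_totals, rows)   # predicting race from attendance
print(round(tau_r, 2), round(tau_c, 2))   # 0.08 0.03
```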
Other PRE measures of association have been developed, and some have been designed specifically for a cross classification of ordered categories (ordinal variables). For example, if people were classified by the highest level of education completed (e.g., into the categories pre–high school, high school graduation, bachelor's degree, higher degree) we would have cases classified into a set of ordered categories and hence an ordinal variable. If the same cases were also classified into three levels of income (high, medium, and low) the result would be a cross classification of two ordinal variables. Although several measures of association for ordinal variables have been devised, the one now most commonly used is probably Goodman and Kruskal's gamma (γ). Gamma is a PRE measure, with a focus on the prediction of order within pairs of cases, with order on one variable being predicted with and without knowledge of the order on the other variable, disregarding pairs in which there is a tie on either variable.
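The sketch below illustrates the pair-based logic of gamma on a small, invented 3 × 3 cross classification of two ordered variables (education level by income level); the data are hypothetical and are not drawn from the text.

```python
# Gamma for a cross classification of two ordered variables: the difference
# between concordant and discordant pairs, divided by their sum (ties ignored).
# The 3 x 3 table below is invented for illustration (education by income).
table = [
    [30, 15, 5],     # low education:    low, medium, high income
    [10, 25, 15],    # medium education
    [5, 10, 35],     # high education
]

def gamma(t):
    concordant = discordant = 0
    n_rows, n_cols = len(t), len(t[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):     # each untied pair counted once
                for m in range(n_cols):
                    if m > j:                  # ordered the same way on both variables
                        concordant += t[i][j] * t[k][m]
                    elif m < j:                # ordered in opposite ways
                        discordant += t[i][j] * t[k][m]
    return (concordant - discordant) / (concordant + discordant)

print(round(gamma(table), 2))   # about 0.70 for this invented table
```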
These and other PRE measures of association for cross-classification tables are described and discussed in several statistics texts (See, for example, Blalock 1979; Knoke and Bohrnstedt 1991; Loether and McTavish 1980; Mueller et al. 1977). In some instances, the prediction rules specified for a given PRE measure may closely match the specific application for which a measure of association is sought. More commonly, however, the application will not dictate a specific kind of prediction rule. The preferred measure should then be the one that seems likely to be most sensitive to the issues at stake in the research problem. For example, in seeking to identify "risk factors" associated with a relatively rare outcome, or with a very common outcome, one of the tau measures would be more appropriate than one of the lambda measures, because the modal category may be so dominant that lambda is zero in spite of distributional differences that may be of interest.
MEASURES OF ASSOCIATION IN APPLICATION
When the initial measures of association were devised at the beginning of the twentieth century, some regarded them as part of a new mode of inquiry that would replace speculative reasoning and improve research into the linkages between events. At the end of the twentieth century, we now recognize that finding a statistical association between two variables raises more questions than it answers. We now want to know more than the degree to which two variables are statistically associated; we want also to know why they are associated: that is, what processes, what conditions, and what additional variables are entailed in generating the association?
To a limited degree, the measures of association discussed above can be adapted to incorporate more than two variables. For example, the association between two variables can be explored separately for cases that fall within each category of a third variable, a procedure commonly referred to as "elaboration." Alternatively, a new variable consisting of all possible combinations of two predictors can be cross-classified with an outcome variable. But the traditional measures of association are not ideally suited for the task of exploring the reasons for association. Additional variables (e.g., potential sources of spuriousness, variables that mediate the effect of one variable on another, variables that represent the conditions under which an association is weak or strong) need to be incorporated into the analysis, not just one at a time but several simultaneously, to yield an improved understanding of the meaning of an observed association. Additional modes of analysis (e.g., loglinear analysis; see Goodman 1970; Knoke and Burke 1980) have been developed to allow an investigator to explore the "interactions" among multiple categorical variables in a way that is roughly analogous to multiple regression analysis for quantitative variables. Computer technology has made such modes of analysis feasible.
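As a simple illustration of the "elaboration" strategy mentioned above, the following sketch (with invented frequencies and a hypothetical control variable) computes phi separately within each category of a third variable; a marked difference between the two conditional coefficients would suggest that the third variable conditions the association.

```python
# "Elaboration": the association between two dichotomies, examined separately
# within each category of a hypothetical control variable (invented data).
from math import sqrt

def phi(table):
    """Phi for a 2 x 2 table: |ad - bc| / sqrt of the product of the marginals."""
    (a, b), (c, d) = table
    return abs(a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

subtables = {
    "control present": [[40, 10], [20, 30]],
    "control absent":  [[25, 25], [24, 26]],
}
for category, table in subtables.items():
    print(category, round(phi(table), 2))   # about 0.41 versus 0.02
```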
The same technology has generated a new use for relatively simple measures of association in exploratory data analysis. It is now possible to describe the association between hundreds or thousands of pairs of variables at very little cost, whereas at an earlier time such exhaustive coverage of possible associations would have been prohibitively expensive. Measures of association provide a quick clue to which of the many associations explored may identify useful "risk factors" or which associations suggest unsuspected linkages worthy of further exploration.
REFERENCES
Blalock, Hubert M., Jr. 1979 Social Statistics, 2nd ed. New York: McGraw-Hill.
Bohrnstedt, George W., and David Knoke 1988 Statistics for Social Data Analysis, 2nd ed. Itasca, Ill.: F. E. Peacock Publishers.
Costner, Herbert L. 1965 "Criteria for Measures of Association." American Sociological Review 30:341–353.
Fienberg, S. E. 1980 The Analysis of Cross-Classified Categorical Data. Cambridge, Mass.: MIT Press.
Goodman, Leo A. 1970 "The Multivariate Analysis of Qualitative Data: Interactions among Multiple Classifications." Journal of the American Statistical Association 65:226–257.
——1984 The Analysis of Cross-Classified Data Having Ordered Categories. Cambridge, Mass.: Harvard University Press.
Goodman, Leo A., and William H. Kruskal 1954 "Measures of Association for Cross Classifications." Journal of the American Statistical Association 49:732–764.
——1959 "Measures of Association for Cross Classifications: II. Further Discussion and References." Journal of the American Statistical Association 54:123–163.
——1963 "Measures of Association for Cross Classifications: III. Approximate Sampling Theory." Journal of the American Statistical Association 58:310–364.
——1972 "Measures of Association for Cross Classifications: IV. Simplification of Asymptotic Variances." Journal of the American Statistical Association 67:415–421.
Kim, Jae-on 1984 "PRU Measures of Association for Contingency Table Analysis." Sociological Methods and Research 13:3–44.
Knoke, David, and George W. Bohrnstedt 1991 Basic Social Statistics. Itasca, Ill.: F. E. Peacock Publishers.
Knoke, David, and Peter Burke 1980 Loglinear Models. Beverly Hills, Calif.: Sage.
Loether, Herman J., and Donald G. McTavish 1980 Descriptive Statistics for Sociologists: An Introduction. Boston: Allyn and Bacon.
Mueller, John H., Karl F. Schuessler, and Herbert L. Costner 1977 Statistical Reasoning in Sociology. Boston: Houghton Mifflin.
Reynolds, H. T. 1977 Analysis of Nominal Data. Beverly Hills, Calif.: Sage.
Herbert L. Costner