Size Distributions in Economics
The size distributions of certain economic and socioeconomic variables (incomes, wealth, firms, plants, cities, etc.) display remarkably regular patterns. These patterns, or distribution laws, are usually skew, the most important being the Pareto law [see PARETO, article on CONTRIBUTIONS TO ECONOMICS] and the log-normal, or Gibrat, law (below). Some disagreement about the patterns actually observed still exists. The empirical distributions often approximate the Gibrat law in the middle ranges of the variables and the Pareto law in the upper ranges. The study of size distributions is concerned with explaining why the observed patterns exist and persist. The answer may be found in the conception of the distribution laws as the steady state equilibria of stochastic processes that describe the underlying economic or demographic forces. A steady state equilibrium is a macroscopic condition that results from the balance of a great number of random microscopic movements proceeding in opposite directions. Thus, in a stationary population a constant age structure is maintained by the annual occurrence of approximately constant numbers of births and deaths, the random events par excellence of human life.
The steady state explanation is evidently inspired by the example of statistical mechanics, in which the macroscopic conditions are heat and pressure and the microscopic random movements are performed by the molecules. Characteristically, the steady state is independent of initial conditions, i.e., the initial size distribution. In economic applications this is important because it means that the pattern determined by certain structural constants tends to be re-established after a disturbance is imposed on the process. This will only be the case, however, if the process leading to the steady state is really ergodic, that is, if the influence of initial conditions on the state of the system becomes negligible after a certain time; and it will be relevant in practice only if this time interval is sufficiently short.
The idea that the stable pattern of a distribution might be explained by the interplay of a multitude of small random events was first demonstrated in the case of the normal distribution. The central limit theorem shows that the addition of a great number of small independent random variables yields a variable that is normally distributed, when properly centered and scaled. A stochastic process that leads to a normal distribution is the random walk on a straight line with, for example, a 50 percent probability each of a step in one direction and a step in the opposite direction. It is only natural that attempts to explain other distribution patterns should have started from this idea. The first extension was to allow the random walk to proceed on a logarithmic scale. The resulting distribution is log-normal on the natural scale and is known as the log-normal or Gibrat distribution. The basic assumption, in economic terms, is that the chance of a certain proportionate growth or shrinkage is independent of the size already reached: the law of proportionate effect. This law was proposed by J. C. Kapteyn, by Francis Galton, and, later, by Gibrat (1931).
Let size (of towns, firms, incomes) at time t be denoted by Y(t), and let ε(t) represent a random variable with a certain distribution. We have
$$Y(t) = Y(0)\,[1+\varepsilon(1)]\,[1+\varepsilon(2)] \cdots [1+\varepsilon(t)],$$
where Y(0) is size at time 0, the initial period. For small time intervals, and hence small ε(t), log[1 + ε(t)] is approximately ε(t), so the logarithm of size can be represented as the sum of independent random variables and an initial size which will become negligible as t grows:
$$\log Y(t) = \log Y(0) + \varepsilon(1) + \varepsilon(2) + \cdots + \varepsilon(t).$$
If the random variables ε are identically distributed with mean m and variance σ², the distribution of log Y(t) will, for large t, be approximately normal with mean mt and variance σ²t.
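The linear growth of the mean and variance of log size can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes Gaussian shocks (the text only requires identically distributed shocks with mean m and variance σ²), and the function name `simulate_log_sizes` and all parameter values are illustrative.

```python
import math
import random
import statistics


def simulate_log_sizes(n_units, t_steps, m=0.0, sigma=0.1, y0=1.0, seed=0):
    """Return log Y(t) for n_units independent units after t_steps shocks.

    Assumption for illustration: each shock is Gaussian with mean m and
    standard deviation sigma, added on the logarithmic scale.
    """
    rng = random.Random(seed)
    logs = []
    for _ in range(n_units):
        log_y = math.log(y0)
        for _ in range(t_steps):
            log_y += rng.gauss(m, sigma)
        logs.append(log_y)
    return logs


# Mean and variance of log size should both grow roughly in proportion to t.
for t in (10, 40, 160):
    logs = simulate_log_sizes(n_units=2000, t_steps=t, m=0.01, sigma=0.1)
    print(t, round(statistics.mean(logs), 3), round(statistics.variance(logs), 3))
```

With these illustrative settings the sample variance should be close to σ²t, so it keeps increasing with t; this is the diffusion that a stabilizing mechanism has to offset.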
This random walk corresponds to the process of diffusion in physics, which is illustrated by the so-called Brownian movement of particles of dust put into a drop of liquid. Since it implies an ever growing variance, the idea of Gibrat is not itself enough to provide an explanation for a stable distribution. There must be a stabilizing influence to offset the tendency of the variance to increase; indeed, a distinguishing feature of the various theories presently to be reviewed lies in the kind of stabilizer they introduce to offset the diffusion. Two interesting cases may be noted here. One possibility is to modify the law of proportionate effect and assume that the chances of growth decline as size increases. This approach has been taken by Kalecki (1945), who assumes a negative correlation between the size and the jump and obtains a Gibrat law with constant variance. Another possibility is to combine the diffusion process of the random walk with a steady inflow of new, small units (firms, cities, incomes). Some units may continue indefinitely to increase in size, but their weight will be offset by that of a continuous stream of many new, small entrants, so that both the mean and the variance of the distribution will remain constant. This approach, which leads to the Pareto law, has been taken by Simon (1955).
Review of various models. Descriptions of various models will illustrate the methods employed. Models differ with regard to the distribution law explained, the field of application (towns, incomes, etc.), and the type of stochastic process used.
Champernowne’s model. Champernowne (1953) presents a model that explains the Pareto law for the size distribution of incomes. The stochastic process employed is the so-called Markov chain [see MARKOV CHAINS]. The model is based on a matrix of probabilities of transition from one income class to another in a certain interval of time, say a year. The rows are the income classes of one year, the columns the income classes of the next year. The income classes are chosen in such a way that they are equal on the logarithmic scale (for example, incomes from 1 to 10, from 10 to 100, etc.). The probability of a jump from one income class to the next income class in the course of a year is assumed to be independent of the income from which the jump is made (the law of proportionate effect). The number of income earners is constant.
The number of income earners in income class s is then determined as follows. The number of incomes in class s at time t + 1 is
$$f(s, t+1) = \sum_{u=-n}^{1} p(u)\, f(s-u, t),$$
where s, u, and t take on integer values, p(u) is the probability of a jump over u intervals (i.e., the transition probability), and the size of the jump is constrained to the range +1, −n. In the steady state equilibrium reached after a sufficiently long time has passed, the action of the transition matrix leaves the distribution unchanged. We then have
$$f(s) = \sum_{u=-n}^{1} p(u)\, f(s-u)$$
as t → ∞. This difference equation is solved by putting $f(s) = z^s$. The characteristic equation
$$g(z) \equiv \sum_{u=-n}^{1} p(u)\, z^{1-u} - z = 0$$
has two positive real roots, one of which is unity. To assure that the other root will be between 0 and 1, Champernowne introduces the following stability condition:
$$g'(1) = -\sum_{u=-n}^{1} u\, p(u) > 0.$$
The relevant solution is $f(s) = b^s$, where b is the root between 0 and 1; this gives the number of incomes in income class s. If the lower bound of this class is the log of the income $Y_s$, then the probability of an income exceeding $Y_s$ is given by
$$\log P(Y_s) = s \log b.$$
Since s is determined by
$$\log Y_s = sh + \log Y_{\min},$$
where h is the class interval and $Y_{\min}$ is the lower boundary of the lowest income class, it follows that
$$\log P(Y_s) = \gamma - \alpha \log Y_s,$$
where the parameters γ and α are determined by b, h, and $Y_{\min}$. This is the Pareto law with Pareto coefficient α.
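As a numerical illustration of this derivation, the root b between 0 and 1 and the resulting Pareto coefficient can be computed for a given set of jump probabilities. The sketch below is illustrative only: it writes the characteristic equation as the sum over u of p(u) z^(1−u), minus z, set to zero, which follows from substituting f(s) = z^s into the steady state condition; it uses α = −log b / h, which follows from the two logarithmic relations above; and the jump probabilities, the class interval, and the function names are all assumptions chosen for the example.

```python
import math


def char_poly(z, p):
    """g(z) = sum_u p(u) * z**(1 - u) - z, for jump probabilities p[u]."""
    return sum(prob * z ** (1 - u) for u, prob in p.items()) - z


def steady_state_root(p, tol=1e-12):
    """Find the root b of g in (0, 1) by bisection.

    Requires p(1) > 0 (so that g(0) > 0) and a negative expected jump
    (the stability condition, which makes g negative just below z = 1).
    """
    if abs(sum(p.values()) - 1.0) > 1e-9:
        raise ValueError("jump probabilities must sum to 1")
    if p.get(1, 0.0) <= 0.0:
        raise ValueError("an upward jump of one class must be possible")
    if sum(u * prob for u, prob in p.items()) >= 0.0:
        raise ValueError("stability condition violated: mean jump must be negative")
    lo, hi = 0.0, 1.0 - 1e-6
    if char_poly(hi, p) >= 0.0:
        raise ValueError("root is too close to 1 for this simple bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if char_poly(mid, p) > 0.0:
            lo = mid  # still on the positive side, root lies above
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pareto_coefficient(b, h):
    """Pareto coefficient alpha = -log(b) / h for class interval h (natural log)."""
    return -math.log(b) / h


# Hypothetical jump probabilities: up one class, stay, down one class.
p = {1: 0.1, 0: 0.6, -1: 0.3}
b = steady_state_root(p)
alpha = pareto_coefficient(b, h=math.log(2))  # classes that double in income
print(b, alpha)
```

For these hypothetical probabilities the characteristic polynomial factors as 0.3(z − 1)(z − 1/3), so the bisection should return a value close to 1/3.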
Champernowne’s stability condition implies that the mathematical expectation of a change in income is negative. This counteracts the diffusion. How can the stability condition be justified on economic grounds? It may be connected with the fact that in this model every income earner who drops out is replaced by a new income earner. Since, in practice, young people have on the average lower and more uniform incomes than old people, the replacement of old income earners by young ones usually means a drop in income. Thus, Champernowne’s stability condition, as far as its economic basis is concerned, is very similar to the entry of new, small units that act as a stabilizer in Simon’s model.
Rutherford’s model. Rutherford’s model (1955) leads, in his opinion, to the Gibrat law for the size distribution of incomes. Newly entering income earners, assumed to be log-normally distributed at the start, are subject to a random walk and thus to increasing variance during their lifetimes. The process of birth and death of income earners, which is explicitly introduced into the model, acts as the stabilizer.
The distribution of total income is obtained by summing the distributions for all age cohorts that contribute survivors. Rutherford’s method is to derive the moments of the distribution by integration over time of the moments for the entrance groups. The distribution is built up “synthetically” from the moments, as it were. In the absence of an analytical solution with a definite distribution law, some disagreement remains about the result.
Simon’s model. In Simon’s model (1955), which leads to what he calls the Yule distribution, the aggregate growth of firms, cities, or incomes is given a priori. The stochastic process apportions this given increment to various units according to certain rules, which are weakened forms of the law of proportionate effect and rules of new entry. As a consequence of this procedure, there is no possibility of shrinkage of individual units. The given aggregate emphasizes the interdependence of fortunes of different firms (the gain of one is the loss of another), a point that is neglected in other models, such as that of Steindl (1965). On the other hand, the aggregate is, in reality, not given; it is not independent of the action of the firms, which may increase their total market by advertising, product innovation, and so on.
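The process of apportionment may be described as follows. We may conveniently think of populations of cities, so that f(n, N) is the frequency of cities with n inhabitants in a total urban population of N; to be realistic, we shall assume that a city exceeds a certain minimum number of inhabitants; n will measure the excess over this minimum, and N will correspondingly be the sum of these excess populations. An additional urban inhabitant is allocated to a new city with a probability α and to an existing city, of any size class, with a probability proportionate to the number of (excess) inhabitants in that size class. Then,
$$f(1, N+1) - f(1, N) = \alpha - \frac{1-\alpha}{N}\, f(1, N),$$
$$f(n, N+1) - f(n, N) = \frac{1-\alpha}{N}\,\bigl[(n-1)\, f(n-1, N) - n\, f(n, N)\bigr], \qquad n = 2, 3, \ldots$$
We assume that there is a steady state solution in which the frequencies of all classes of cities change in the same proportion, that is, in which
$$\frac{f(n, N+1)}{f(n, N)} = \frac{N+1}{N} \qquad \text{for all } n.$$
Using this relation and defining a relative frequency of cities as $f^*(n) = f(n, N)/(\alpha N)$, we obtain from the above equations
$$f^*(1) = \frac{1}{2-\alpha}, \qquad f^*(n) = \frac{(1-\alpha)(n-1)}{1 + (1-\alpha)\,n}\, f^*(n-1),$$
or, setting 1/(1 − α) = ρ,
$$f^*(n) = \frac{(n-1)(n-2) \cdots 1}{(n+\rho)(n-1+\rho) \cdots (2+\rho)}\, f^*(1) = \frac{\Gamma(n)\,\Gamma(\rho+2)}{\Gamma(n+\rho+1)}\, f^*(1).$$
This expression is the Yule distribution. Using a property of the Γ-function, it can be shown that the Yule distribution asymptotically approaches the Pareto law for large values of n, that is,
$$f^*(n) \to \Gamma(\rho+2)\, f^*(1)\, n^{-(\rho+1)}.$$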
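The steady state frequencies can be tabulated directly. The sketch below is a hedged illustration: it assumes that the steady state of the allocation rule gives f*(1) = 1/(2 − α) and f*(n) = f*(n − 1)(n − 1)/(n + ρ) with ρ = 1/(1 − α), which is equivalent to f*(n) = ρ Γ(n) Γ(ρ + 1)/Γ(n + ρ + 1). The function names and the value α = 0.1 are illustrative choices, not taken from Simon.

```python
import math


def yule_recursive(n_max, alpha):
    """Relative frequencies f*(1..n_max) from the steady state recursion.

    Index 0 is a placeholder so that f[n] is the frequency of size n.
    """
    rho = 1.0 / (1.0 - alpha)
    f = [0.0, rho / (1.0 + rho)]  # f*(1) = 1 / (2 - alpha)
    for n in range(2, n_max + 1):
        f.append(f[-1] * (n - 1) / (n + rho))
    return f


def yule_closed_form(n, alpha):
    """Closed form rho * Gamma(n) * Gamma(rho + 1) / Gamma(n + rho + 1)."""
    rho = 1.0 / (1.0 - alpha)
    log_value = math.lgamma(n) + math.lgamma(rho + 1.0) - math.lgamma(n + rho + 1.0)
    return rho * math.exp(log_value)


def pareto_tail(n, alpha):
    """Large-n approximation rho * Gamma(rho + 1) * n**-(rho + 1)."""
    rho = 1.0 / (1.0 - alpha)
    return rho * math.gamma(rho + 1.0) * n ** (-(rho + 1.0))


alpha = 0.1  # hypothetical probability that a newcomer founds a new city
f = yule_recursive(1000, alpha)
for n in (1, 10, 100, 1000):
    print(n, f[n], yule_closed_form(n, alpha), pareto_tail(n, alpha))
```

The recursion and the closed form should agree to rounding error, while the Pareto approximation is poor for small n and improves as n grows, which is the sense in which the Yule distribution approaches the Pareto law only in the upper range.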
This model is applicable to cases in which size is measured by a stock, for example, number of employees of a firm. Simon provides an alternative interpretation of it that applies to flows, such as income and turnover of firms. For example, the total flow of income is given, and each dollar is apportioned to existing and new income earners according to the rules given above.
Using simulation techniques, Ijiri and Simon (1964) show that the pattern of the Yule distribution persists if serial correlation of the growth of individual firms in different periods is assumed. This finding is important because, in reality, growth is often affected by "constitutional" factors, such as financial resources and research done in the past.
The model of Wold and Whittle. Wold and Whittle (1957) present a model of the size distribution of wealth in which stability is provided by the turnover of generations, as in Rutherford’s model. On the death of a wealth owner, his fortune is divided among his heirs (in equal parts, as a simplification). The diffusion effect is provided by the growth of wealth of living proprietors, which proceeds deterministically at compound interest. The model is shown to lead to a Pareto distribution, the Pareto coefficient depending on the number of heirs to an estate and the ratio of the growth rate of capital to the mortality rate of the wealth owners.
Steindl’s models. Steindl’s models (1965, chapters 2, 3) are designed to explain the size distribution of firms, but they can equally well be applied to the size distribution of cities. The distribution laws obtained are, for large sizes, identical with the Pareto law. Like Rutherford’s model, Steindl’s models rest on a combination of two stochastic processes. One is a birth-and-death process of the population of cities or firms; the other is a stochastic process of the growth of the city or firm itself.
The way in which the interplay of these two processes brings about the Pareto law can be explained in elementary terms. We start with the size distribution of cities. The number of cities can be explained by a birth process, if we assume that cities do not die. Let us assume that new cities are appearing at a constant rate, ε, the birth rate of cities. The number of cities increases exponentially, and the age distribution of cities at a given moment of time is
$$R(t) = R(0)\, e^{-\varepsilon t}, \qquad (1)$$
where t is age and R(t) is the number of cities with age in excess of t; in other words, R(t) is the rank of the town aged t + dt, and R(0) is the total number of towns existing at the moment of time considered. The size of the city, that is, its number of inhabitants, increases, on the average, with age. If the rate of births plus immigration, λ, and of deaths plus emigration, μ, are constant, we obtain an exponential growth function for the size of the city:
$$n(t) = n(0)\, e^{(\lambda - \mu) t}, \qquad (2)$$
where n(0) is the size of a newly founded city. Eliminating t between eqs. (1) and (2), we get
$$\ln R = \ln R(0) - \frac{\varepsilon}{\lambda - \mu}\,\bigl[\ln n - \ln n(0)\bigr]. \qquad (3)$$
This is the Pareto law, and the Pareto coefficient is seen to be the ratio of the growth rate of the number of cities to the growth rate of a city.
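The elimination of age between the two exponential relations can be illustrated with a small simulation. This is a sketch under stated assumptions, not Steindl's own procedure: city ages are drawn from an exponential density whose rate is the birth rate of cities, every city grows deterministically at the same net rate from an initial size of 1, and the Pareto coefficient is recovered with a maximum-likelihood estimate rather than with the regression line of the text. The names `eps`, `growth`, `simulate_city_sizes`, and `pareto_mle` are hypothetical.

```python
import math
import random


def simulate_city_sizes(n_cities, eps, growth, seed=0):
    """Sizes of cities whose ages have density proportional to exp(-eps * t).

    Each city starts at size 1 and grows at the constant net rate `growth`
    (births plus immigration minus deaths plus emigration).
    """
    rng = random.Random(seed)
    ages = [rng.expovariate(eps) for _ in range(n_cities)]
    return [math.exp(growth * t) for t in ages]


def pareto_mle(sizes):
    """Maximum-likelihood Pareto coefficient for sizes with minimum size 1."""
    return len(sizes) / sum(math.log(n) for n in sizes)


eps, growth = 0.03, 0.02  # illustrative rates per year
sizes = simulate_city_sizes(20000, eps, growth)
print("estimated coefficient:", pareto_mle(sizes))
print("ratio of growth rates:", eps / growth)
```

The estimated coefficient should lie close to the ratio of the two growth rates, as the deterministic argument predicts; adding random scatter to the growth of individual cities would turn this into the stochastic transformation from age to size described next.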
This demonstration, which on the face of it is deterministic in character, can be supplemented by a graphical illustration in which the stochastic features are included. In Figure 1 the distribution of cities according to age is plotted in the vertical (ln R, t) plane. The abscissa shows the age of the city, and the ordinate shows the log of the rank of the city. Each city is thus represented by a dot, and the regression line fitted to these points represents relation (1). In the horizontal (t, ln n) plane, we show the exponential growth of cities with age, as in relation (2). Again each city may be represented by a dot showing age and size. The scatter diagram in the horizontal plane may be regarded as a stochastic transformation of the time variable into the size variable. If the size of each city has been found on the scatter diagram, the cities can be reordered according to size; we then obtain, in the third (ln R, ln n) plane, the transformed relation (3) between the number of cities (rank) and the size of a city.
If firms are studied, we must take into account the death of firms. We might assume that a firm dies when it ceases to have customers. We can imagine that the age distribution in plane 1 of Figure 1 includes the dead firms; they are automatically eliminated in the transformation to size, being transferred to the size class below one. In the exponential relation (1), ε must now represent the net rate of growth of the number of firms if the birth of firms is assumed to be a constant ratio of the population.
Figure 1 illustrates how the evolution in time of the number of firms (cities) is mapped onto the cross section of sizes. This process may be compared to sedimentation in geology, where a historical development is revealed in a cross section of the layers. We can also see how irregularities in the evolution over time will affect the size distribution. If an exceptional spurt of births of new firms occurs at one point of time (after a war, for example), the regression line in plane 1 will be broken and its upper part shifted upward in a parallel fashion. The same thing will happen to the transformed distribution in plane 3.
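The complete model for firms may be described as follows. The size of a firm is measured by the number of customers attached to it. This is governed by a birth-and-death process. Let us denote by o(Δt) a magnitude that is small in comparison with Δt. There is a chance λΔt + o(Δt) of a customer’s being acquired and a chance μΔt + o(Δt) of a customer’s being lost in a short period of time, Δt; multiple births and deaths have a chance of o(Δt). The probability that a firm has n or more than n customers is given by
$$P(n) = \int_0^\infty P(n, t)\, r(t)\, dt,$$
where P(n, t) is the probability that a firm of age t has n or more than n customers. The term r(t) is the density of the age distribution of firms, including dead firms; for large t it is the steady state of a renewal process and is given by $r(t) = c\, e^{-\varepsilon t}$, where ε is the net rate of growth of the firm population and c is a constant. The number of firms with less than one customer, P(0, t) − P(1, t), equals the dead firms. The value of P(n, t) is obtained as the solution of a birth-and-death process for the customers of a firm:
$$P(n, t) = \Bigl(1 - \frac{\mu}{\lambda}\Bigr)\, \bigl[1 - e^{-(\lambda-\mu)t}\bigr]^{n-1}\, \Bigl[1 - \frac{\mu}{\lambda}\, e^{-(\lambda-\mu)t}\Bigr]^{-n}, \qquad n \ge 1,$$
where λ > μ and the firm is taken to start with a single customer. This expression can be expanded in series and inserted in the above integral; this yields, integrating term by term,
$$P(n) = \frac{c}{\lambda} \sum_{\nu=0}^{\infty} \binom{n+\nu-1}{\nu} \Bigl(\frac{\mu}{\lambda}\Bigr)^{\nu} B(n, w+\nu), \qquad w = \frac{\varepsilon}{\lambda-\mu},$$
where $B(n, w+\nu) = \int_0^1 y^{w+\nu-1} (1-y)^{n-1}\, dy$ is the Beta integral. Hence,
$$P(n) = \frac{c}{\lambda} \sum_{\nu=0}^{\infty} \Bigl(\frac{\mu}{\lambda}\Bigr)^{\nu} \frac{\Gamma(w+\nu)\,\Gamma(n+\nu)}{\nu!\;\Gamma(n+w+\nu)}.$$
If μ/λ is small, we can neglect the terms with ν above a certain value. Thus, if n → ∞ and ν has a moderate value, we can use the approximation
$$\frac{\Gamma(n+\nu)}{\Gamma(n+w+\nu)} \approx n^{-w};$$
therefore, as n → ∞,
$$P(n) \to C\, n^{-w}.$$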
The following features of the solution may be remarked: Since the approximation depends on the value of μ/λ, which is the mortality of firms of high age, the smaller the mortality of firms, the greater will be the proportion of the distribution that conforms to Pareto’s law. The mean of the distribution will be finite if w > 1. This is important in connection with disequilibria, which can arise through changes in λ, μ, and ε. It can be shown that the Pareto solution applies to the growing firm (λ > μ, the above case) and, in a modified form, to the shrinking firm (λ < μ); but it does not obtain for the stationary firm (λ = μ).
The above solution for the distribution according to customers can be shown to be valid also for the distribution according to sales, if firms grow mainly by acquiring more customers and not by getting bigger customers. This is often true in retail trade but not in manufacturing. An alternative model assumes the other extreme: that firms grow only by getting bigger orders. This model is based on the theory of collective risk. The capital of the firm, a continuous variable, is subject to sudden jumps at the instant when orders are executed and to a continuing drain of costs, which is represented deterministically by an exponential decline. The steady state solution obtained from this process is, for large values of capital, identical with the Pareto law; for moderate values, the distribution has a mode and represents, albeit with some complications, a modification of the “first law of Laplace,” which was proposed by Fréchet (1939) for income distributions.
Size as a vector. It would be natural to measure the size of a firm by a vector, including employment, output, capital, etc., and apply the steady state concept to the joint distribution of several variables. Regression and correlation coefficients obtained in a cross section could then be regarded, like the Pareto coefficient, as characteristics of the steady state. It may be guessed that the growth of the number of firms will have an influence on these parameters as well.
Practically no work has been done in this direction, but it is the only way to clear up the meaning of cross-section data and their relation to time series data and to the theoretical parameters of the underlying stochastic process. The situation in economics is totally unlike that in physics, where the processes are stationary and the ergodic law establishes the identity of time and phase averages. (Only the cosmogony of F. Hoyle, in which the continuous creation of matter offsets the expansion of the universe to establish a steady state of the cosmos, offers a parallel to the growth processes considered above.) The surprise expressed at one time at the difference in estimates of income elasticities from cross-section data and from time series data appears naive in this light because we could only expect them to be equal if the processes generating households, incomes, and consumption were stationary.
But the population of households or the population of firms is not stationary. A cross section of firms shows the growth path of the firm through its different stages of evolution; but the number of firms of a given age depends on the past growth of the total number of firms, and this may influence the regression coefficient. Moreover, the growth path is not unique, because there are several processes superimposed upon one another (growth paths depending on age of firm, age of equipment, age of the management, etc.). For example, the short-run and long-run cost curves are inevitably mixed up in a cross section of firms. [See CROSS-SECTION ANALYSIS.]
How much “stability” and why. The starting point of the theories here reviewed is the stability of distributions, but stability must not be taken literally. The distributions do change in time, but the change is usually slow. The tail of the distribution of firms or, to a lesser extent, of wealth is composed of very old units, and time must pass before it can be affected by, for example, a change in new entry rates or in growth rates of firms. Thus, the reason for the quasi stability of distributions is that the stock of firms, etc., revolves only slowly. Indirectly this also accounts for the quasi stability of the distribution of incomes, because income is largely determined by wealth or its equivalent in the form of education. An even more enduring influence on the income distribution is the differentiation of skills and professions, which evolves slowly, as a secular process.
The explanations advanced in this article do not exclude the possibility that distribution patterns may change abruptly: for example, as a consequence of taxation, in the case of net incomes; or as a consequence of a big merger movement, in the case of firms.
Josef Steindl
[Directly related are the entries Income Distribution, article on Size; Rank-Size Relations.]
BIBLIOGRAPHY
Champernowne, D. G. 1953 A Model of Income Distribution. Economic Journal 63:318-351.
Fréchet, Maurice 1939 Sur les formules de répartition des revenus. International Statistical Institute, Revue 7:32-38.
Gibrat, Robert 1931 Les inégalités économiques. Paris: Sirey.
Ijiri, Yuji; and Simon, Herbert A. 1964 Business Firm Growth and Size. American Economic Review 54:77-89.
Kalecki, Michael 1945 On the Gibrat Distribution. Econometrica 13:161-170.
Mansfield, Edwin 1962 Entry, Gibrat’s Law, Innovation, and the Growth of Firms. American Economic Review 52:1023-1051.
Rutherford, R. S. G. 1955 Income Distributions: A New Model. Econometrica 23:277-294.
Simon, Herbert A. (1955) 1957 On a Class of Skew Distribution Functions. Pages 145-164 in Herbert A. Simon, Models of Man: Social and Rational. New York: Wiley. -> First published in Volume 42 of Biometrika.
Simon, Herbert A.; and Bonini, Charles P. 1958 The Size Distribution of Business Firms. American Economic Review 48:607-617.
Steindl, Josef 1965 Random Processes and the Growth of Firms: A Study of the Pareto Law. London: Griffin; New York: Hafner.
Wold, H. O. A.; and Whittle, P. 1957 A Model Exploring the Pareto Distribution of Wealth. Econometrica 25:591-595.
Zipf, George K. 1949 Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Reading, Mass.: Addison-Wesley.