Saturday, April 23, 2016

Should we reconsider how we collect income distribution data?

Should we change the way we collect inequality data and the way we measure inequality? For indeed we do not collect the data and use methodologies independently of what we are interested in and what our view of the world is.
            To make this understandable let me explain the three ways that have historically characterized collection of income distribution data. They emphasized alternatively (1) horizontal inequality, (2) the middle class and the distribution of all incomes but the top, and (3) the top of income distribution.
            The first approach (horizontal inequality) consists of being interested in the mean incomes of various groups, checking how they differ and how they evolve in time. Distributions around these means are considered of secondary importance: they are collected but not much worried about.  Historically the first approach originated in the Soviet Union where household surveys, conducted by the state at regular intervals, began in the early 1920s. The Soviet concern was not with distributions but with how a “typical”, “average” worker was doing, compared to an “average” peasant, and later to an “average” collective farmer. So, the focus of surveys was on the “averages”: worker of an average skill, with a partner (wife) of equally average (“normal”, “usual”) skill, with kids of “typical” ages etc.
            Obviously, the data were not collected only for such average households but also for those that were a bit less common. Yet the surveys were uninterested in the extremes: neither the poor, nor the atypical, nor the rich were included. By the 1960s this had led, in all socialist countries, to a plethora of (published) data contrasting mean incomes of workers’ vs. farmers’ households or of employees vs. pensioners. But such an approach left both ends of the distribution (the top and the bottom) under-represented and the distribution itself truncated. It was not very useful for either inequality or poverty statistics.
            Before you think how limited this approach was, think twice. Because it is exactly the same approach that is today urged by many who care only (or almost only) about gender or racial or other “type-against-type” inequality. They focus on whether women on average are paid less than men, and while this concern is, like the old-fashioned Soviet concern of how workers are doing compared to farmers, legitimate, it leaves the entire distribution out.
I have criticized this approach in my recent book by pointing out that even if we were to equalize the means, that would still leave unfinished the job of dealing with the inequality that could remain among women or among men. The distributions can be vastly unequal while their means are the same. Thus getting the means equal is only the beginning.
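To make the point about equal means concrete, here is a minimal sketch. The two income lists are hypothetical numbers invented for illustration, and the Gini coefficient is used only as one convenient summary of inequality within each group.

```python
def gini(incomes):
    """Gini coefficient via the sorted-rank formula (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(rank_i * x_i) / (n * total) - (n + 1) / n, ranks 1..n on sorted data
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical incomes (assumption, not data from any survey).
women = [10, 20, 30, 40, 100]
men = [38, 39, 40, 41, 42]

mean_women = sum(women) / len(women)
mean_men = sum(men) / len(men)
gini_women = gini(women)
gini_men = gini(men)

print(f"means: women={mean_women}, men={mean_men}")
print(f"Gini:  women={gini_women:.2f}, men={gini_men:.2f}")
```

Both groups have the same mean, so a comparison of averages would report no gap at all, yet inequality within the first group is far higher than within the second.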
            Enter the second approach that characterized collection of income distribution statistics since many countries started collecting standardized data in the 1950s and 1960s. We are concerned here with entire distributions, with the poor, the less poor, the middle class, the upper middle class etc. But we are mostly concerned with the “dominant”, large groups, the middle classes and not much with the top of income distribution. This for two reasons.  

The first is confidentiality. Since the time when many Western countries started allowing access to micro data, their statistical offices were concerned that very rich people, who are few in number, could be identified if researchers had access to their age, education, number of children and place of residence. Thus anonymity, in principle guaranteed by the surveys, and on which surveys depend if they wish to ensure people’s participation, would be severely compromised. Accordingly, the rich were undersurveyed.
The second reason is that extremely rich people are very rare, and if they (one or two of them) happened to be included in this year’s survey (remember, surveys are samples), it could push inequality statistics unusually high, and make the results look out of line with historical data. When a researcher then studies inequality, or newspapers publish the results, it would appear that inequality went up for some fundamental reason while it happened simply because a few rich people were included in the sample. Many statistical offices thus decided to censor the top of the income distribution by using the so-called “top coding” which sets the maximum incomes that could be reported either by category or in total. So, if you for example told the enumerator that your capital income was $5 million, and the top code for that category was $1 million, the survey would register your income as $1 million.  
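The effect of top coding can be sketched in a few lines. The sample below is hypothetical, and for simplicity the cap is applied to total income rather than to one income category; the $5 million and $1 million figures are taken from the example above.

```python
def top_code(incomes, cap):
    """Replace every reported income above `cap` with `cap` itself."""
    return [min(x, cap) for x in incomes]


def top_share(incomes, k):
    """Share of total income received by the k highest incomes."""
    xs = sorted(incomes, reverse=True)
    return sum(xs[:k]) / sum(xs)


# Hypothetical sample (assumption): nine households at $50,000, one at $5 million.
sample = [50_000] * 9 + [5_000_000]
cap = 1_000_000  # top code from the example in the text

coded = top_code(sample, cap)
true_share = top_share(sample, 1)    # share of the richest household, uncensored
coded_share = top_share(coded, 1)    # same share after top coding

print(f"richest household's share, as reported: {true_share:.1%}")
print(f"richest household's share, top-coded:   {coded_share:.1%}")
```

Top coding leaves everyone below the cap untouched but mechanically lowers the measured share of the top, which is exactly why it stabilizes the published statistics and also why it hides what happens among the rich.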
            Here we come to the third approach. As the rich economies have become more unequal, and the gap between the top of the income distribution and everybody else has grown, popular as well as research interest has shifted toward the top. Notice the progression, which responds to the progression in societal interest: from how a typical worker is doing compared to a typical farmer, to how unequal a society is and how large the middle class is, to how rich the top 1% are.
The use of fiscal data, popularized by Piketty first on the French data, and later by others, responded precisely to that interest (or perhaps contributed to creating it). Even the statistical measures used changed: instead of an overall distributional statistic like the Gini coefficient, the focus was on top income shares. The fiscal data indeed give a better picture of top incomes than household surveys. The IRS, in the sample it gives to researchers, still does some intentional “blurring” at the top, but we surely have much better data on the household pre-tax income of the top 1% than we get from household surveys. (For a recent comparison of US survey and tax data, see here.)
Still, there are at least two problems. First, the rich especially, but everybody else as well, have a clear interest in minimizing their reported incomes to reduce the taxes they pay. Second, the rich engage, as we have seen in the Panama Papers, in massive schemes to hide their assets and income. Thus, despite our best efforts to uncover the full extent of top incomes, we are only at the beginning of a long road.
So it is perhaps the right time to think how fiscal data should be improved, how fiscal and household survey data should be made more compatible, and most ambitiously, whether better administrative data (like the world register of wealth proposed by Piketty and Zucman) should be created, both to tax wealth and to combat fiscal evasion. We are already moving to the next stage of methodological development where the concern with incomes of the rich, partly because they have become so much richer than the others, partly because they wield huge political power, and partly because they are hiding their assets, may take center stage.

Saturday, April 2, 2016

The Schumpeter hotel: income inequality and social mobility

In one  of his rare discussions of inequality, Joseph Schumpeter illustrated in a metaphor the difference between the inequality we observe at a moment in time and social (or inter-generational) mobility. Suppose, Schumpeter writes, that there is a multiple-story-high hotel with higher floors containing fewer people and having much nicer rooms. At any given moment, there would be lots of people on the ground floor living in cramped small rooms, and just a few people in the nice and comfortable top-floor rooms with a view. But then let the guests move around and change the rooms every night. This  is what, Schumpeter said, social mobility will do: at every given moment of time there are rich and poor but as we extend the time period, today’s rich are yesterday’s poor and vice versa. The guests from the ground floors (or at least their children) have made it to the top, those from the top might have tumbled down to the bottom.
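The hotel metaphor can be turned into a small sketch. The room qualities below are hypothetical, and mobility is modeled in the most extreme way possible (an assumption of this sketch, not something Schumpeter specifies): every guest moves one room along each night, so over a full cycle everybody sleeps in every room once.

```python
def gini(values):
    """Gini coefficient via the sorted-rank formula (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical room qualities: many cramped rooms, one top-floor room with a view.
rooms = [1, 1, 1, 1, 2, 2, 3, 10]
n = len(rooms)

# Full mobility (assumption): guest g sleeps in room (g + night) % n.
stays = [[rooms[(g + night) % n] for night in range(n)] for g in range(n)]

# Inequality observed on each single night (a snapshot of the hotel).
nightly_gini = [gini([stays[g][night] for g in range(n)]) for night in range(n)]

# Inequality of each guest's average room quality over the whole cycle.
average_quality = [sum(guest_stays) / n for guest_stays in stays]
long_run_gini = gini(average_quality)

print(f"Gini on any single night:   {nightly_gini[0]:.3f}")
print(f"Gini of long-run averages:  {long_run_gini:.3f}")
```

Every snapshot shows the same high inequality, while inequality over the whole cycle disappears. With no mobility at all (each guest staying put), the long-run figure would simply equal the nightly one.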
Now, Schumpeter’s metaphor was for a long time a metaphor for US inequality too. It was granted that in the 20th, and even in the 19th, century US income inequality might have been greater than inequality in Europe, but it was also held that US society was much more fluid, less class-bound and that there was greater social mobility. (That view of course conveniently overlooked the huge racial divide in the US.) In other words, inequality was the price that America paid for high social mobility.
This was a reassuring picture consonant with the idea of the American dream. But was it true? We actually never knew it, beyond anecdotal evidence of migrants’ lives, since no consistent empirical studies of inter-generational mobility existed until very recently. But before I go to their findings, I would like to focus in very simple terms on the relationship between inequality and social mobility.
Consider a diagram classifying societies according to social mobility and inequality. From the US example mentioned above, we would place the US in the <high inequality, high mobility> quadrant. It is easy to imagine <high inequality, low mobility> societies: feudal societies are one extreme example, but all societies with high income differences and entrenched power of the elites would fall into that category too. So, let us write in Latin America or Pakistan in the SE quadrant. It is also relatively easy to imagine what countries we would place in the <low inequality, high mobility> quadrant: probably Nordic countries with strong public education that allows high inter-generational mobility while redistribution of current income ensures low inequality.
It is much more difficult to find examples of <low inequality, low mobility> societies. It seems somewhat natural to think that if a society exhibits low inequality, it will be hard to keep sons and daughters of people who have just a slightly lower income (than another group) permanently below incomes of sons and daughters of that latter group. One can even wonder what mobility in these cases really means: if incomes between people and classes differ by an infinitesimal amount and your children remain richer by that infinitesimal amount above mine, I am not sure that that kind of lack of social mobility really matters much. Perhaps some guild-like societies where occupations cannot be freely chosen but income differences between the occupations are small could be placed in that category. Communist societies had some aspects that could make them (weak) candidates to be placed in the North-East quadrant.

                          High social mobility     Low social mobility
Low income inequality     Nordic countries         Guild-like societies; Communist ?
High income inequality    US (American dream)      Latin America

So now that we have organized our thinking, let us consider the empirical evidence. Most famously, it comes from the recent work of Miles Corak, building on previous studies by Gary Solon, Blanden, Gregg and Macmillan, Björklund and Jäntti and others. What these authors find is that there is a strong correlation between current and inter-generational inequality, or in other words, between inequality and low social mobility: the more unequal the society the less likely is the next generation to move upwards (or conversely, the less likely is the decline of the rich). So in terms of our simple diagram, Corak finds that societies are aligned along the diagonal: there are no outliers, whether the societies exhibiting the American dream or the guild-like ones.
The implication of that finding, which Alan Krueger dubbed the Great Gatsby curve, is that there is no American exceptionalism. The comforting picture of high inequality which does not impede mobility between generations turns out to be false. The US does not behave any differently than other societies with high inequality. High income inequality today reinforces income differences between the generations and makes social mobility more difficult to achieve. This is also the point of my recent paper with Roy van der Weide. We use US micro data from 1960 to 2010 to show that poor people in US states with higher initial inequality experienced lower income growth in subsequent periods.
This important finding that the actually existing societies (as opposed to the dreamed up ones) are aligned along the diagonal of our table has two important implications:  (1) American exceptionalism in the matters of income distribution does not have a basis in reality, and (2) we can use, with a good degree of confidence, the easily available data on current inequality as predictors of social mobility. Thus one cannot argue that societies with high inequality in incomes are societies with high equality of opportunity. On the contrary, observed high income inequality today implies low equality of opportunity.