Why we should be paying more attention to the SEPH employment numbers

The first Friday of every month is Employment Data Day in Canada and the US. The US Bureau of Labor Statistics releases employment data from its Current Population Survey (CPS – the household survey) and its Current Employment Statistics survey (CES – the payroll or establishment survey). For reasons I'll get into later, the establishment survey is generally considered to be the more reliable of the two. Here at home, Statistics Canada releases employment data from its Labour Force Survey. (The headline numbers are in Cansim Table 282-0087.)

The CPS and the LFS are based on samples of households – some 60,000 in the US, about 55,000 in Canada. But Statistics Canada also publishes employment data based on its own establishment survey – the Survey of Employment, Payrolls and Hours (SEPH – headline numbers in Cansim 281-0025). We don't hear nearly as much about the SEPH as we do the LFS – mainly because instead of being released on the first Friday of the month following the month in question, SEPH data are released two months later. The demands for media space being what they are, the SEPH is covered with all the enthusiasm it can muster for two-month-old news.

But maybe we should be paying closer attention to the SEPH.


Here is a representative explanation of why US analysts believe that the establishment employment data are a more reliable indicator of changes in employment than the household survey:

Both the trader and analyst communities, as well as the Bureau of Labor Statistics itself regard the establishment survey as being more reliable than the household survey. This is mostly because of the larger sample size of the establishment survey; while the household survey questions about 60,000 households across the United States, the establishment survey has a sample size of 400,000, about seven times larger.

(See also this pdf from the BLS)

As far as statistical precision goes, StatsCan's SEPH does even better than the CES, because it is a census compiled from administrative files:

Administrative information for total gross monthly payrolls and the total number of employees for the last pay period in the month are obtained from payroll deduction (PD) accounts maintained by Canada Revenue Agency. Public Institutions Division of Statistics Canada provides information for general government services at the provincial and federal levels.

Recall that a census gathers data from the entire population: there is no sampling, so there is no sampling error. In contrast, the stated standard error for the LFS is 27,000 – which means that, aside from December 2009, none of the changes in employment reported by the LFS during the recession and recovery has been significantly different from zero at the 5% level.
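As a rough check on that arithmetic, here is a minimal Python sketch. It assumes the 27,000 figure is the standard error of the month-to-month change and that a normal approximation is adequate; the function name and the example changes are hypothetical illustrations, not LFS data.

```python
from statistics import NormalDist

# Standard error quoted above; ASSUMED here to apply to the monthly change.
SE_CHANGE = 27_000
ALPHA = 0.05

# Two-sided critical value of the standard normal (about 1.96 for 5%).
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)

# Smallest absolute monthly change that would be significant at the 5% level.
threshold = Z_CRIT * SE_CHANGE


def is_significant(change, se=SE_CHANGE, alpha=ALPHA):
    """Return True if a reported change differs from zero at level alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(change) / se > z_crit


# Hypothetical monthly changes, for illustration only.
for change in (40_000, -60_000):
    print(change, is_significant(change))
```

Under these assumptions the threshold works out to roughly 53,000, so a reported gain or loss smaller than that cannot be distinguished from no change at all.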

But that doesn't mean that the SEPH is a fortiori a better indicator of labour market conditions, because it and the LFS measure different things:

Estimates of employment, wages and hours derived from these two surveys differ for a number of reasons.

First, the reference periods are different. LFS data are collected during a "reference week," usually the week following the 15th of the month. For SEPH, the reference period is an entire month.

The LFS includes people who are self-employed, as well as workers who take unpaid leave. SEPH does not cover these groups. Industry coverage for the LFS is comprehensive; SEPH excludes agriculture, fishing and trapping, and religious organizations.

The two count multiple job holders differently. In the LFS, people with more than one job are counted only once as "employed." SEPH is a count of filled positions on payroll, so each job is counted separately.

Finally, national estimates produced by the LFS do not include people living in the three territories or on reserves while SEPH does. LFS estimates are based on where people usually reside. SEPH counts employees in the province or territory where they work, although this has little effect on the comparability at the national level.

(These differences are similar to those between the CPS and the CES.)

Here is what the two series look like over time. The obvious problem with the SEPH is that it started in March 1994:

[Figure: LFS and SEPH employment series over time]

So we have two competing data sources for Canadian employment – to what extent do they agree? In the short term, the differences are surprisingly large. Here is a scatter plot for the monthly growth rates for the two series:

[Figure: scatter plot of monthly LFS and SEPH employment growth rates]

If the two series moved in proportion, the scatter plot would be clustered along the blue 45-degree line. But the coefficient of correlation between the two monthly employment growth rate series is only 0.26, and the two series move in opposite directions in 29% of the sample. In the US, the two series go back to 1948, and they are more closely linked than in Canada: a correlation coefficient of 0.43.
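For readers who want to reproduce these two statistics, here is a minimal sketch in Python. The employment levels below are made up purely for illustration (they are not LFS or SEPH data), and the helper names are hypothetical; with the real series you would pass in the monthly levels from the two CANSIM tables.

```python
from math import sqrt


def growth_rates(levels):
    """Month-over-month growth rates from a list of employment levels."""
    return [(b - a) / a for a, b in zip(levels, levels[1:])]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def share_opposite(x, y):
    """Share of months in which the two growth rates have opposite signs."""
    return sum(1 for a, b in zip(x, y) if a * b < 0) / len(x)


# MADE-UP levels in thousands, for illustration only.
lfs_levels = [17000, 17020, 17010, 17040, 17035, 17060]
seph_levels = [14500, 14510, 14525, 14530, 14545, 14560]

g_lfs = growth_rates(lfs_levels)
g_seph = growth_rates(seph_levels)

print("correlation:", pearson(g_lfs, g_seph))
print("share moving in opposite directions:", share_opposite(g_lfs, g_seph))
```

In this toy example the two series disagree on direction in two of the five months; the 29% figure above is the same calculation applied to the actual sample.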

Update: Nick wonders what we'd see with the axes reversed:

[Figure: scatter plot of monthly LFS and SEPH employment growth rates, axes reversed]

Here is how the two series have moved during the recession and the recovery. Since the SEPH doesn't cover the self-employed, I've added another series for LFS employment for employees:

[Figure: LFS employment, LFS employees and SEPH employment, January 2008 to January 2012]

The story we've been telling from the LFS data is that employment growth has been flat since mid-2011. Perhaps we should be revising that narrative towards one in which employment growth has continued, but at a slower rate.

Like almost everyone else, I've been treating the SEPH as almost an afterthought. I started taking a closer look at the SEPH because I couldn't – and still can't – make the link between the bad LFS numbers we saw out of Quebec in the last few months of 2011 and, well, anything else.

But now that I've taken a closer look, I'm going to make a bigger deal of the SEPH numbers from now on.

19 comments

  1. Determinant

    Speaking of employment numbers, do you wish to comment on Miles Corak’s Globe & Mail article on employment data: http://www.theglobeandmail.com/report-on-business/economy/economy-lab/the-economists/unemployment-is-actually-worse-than-numbers-show/article2325252/ ?

  2. Unknown

    The problem is that those numbers start in 1997. Not much use for business cycle analysis, since they weren’t being collected in the last two recessions.

  3. Unknown

    From your description of the differences between LFS and SEPH, I was actually a little surprised that the two series tracked as closely as they did over the recession.
    Since LFS is a sample, and SEPH is a population... my gut tells me to regress delta LFS on delta SEPH, rather than vice versa. (I’m not sure if that makes sense.)

  4. Rob Pettapiece

    “The demands for media space being what they are, the SEPH is covered with all the enthusiasm it can muster for two-month-old news”
    This trickles down, of course. If the media will ask for comment about the LFS more than the SEPH, then those civil servants, academics, researchers, etc. will spend more time analyzing the LFS (in preparation/briefing), and so I’d guess there is likely more institutional knowledge out there about the LFS. So it gets harder for anyone to cover the SEPH when everyone’s attention has swung away from it.
    (On top of being two months old, the SEPH doesn’t produce an unemployment rate, which is all some people seem to care about.)
    Last year I found there was little predictive value in a year’s worth of LFS numbers for any CMA, even two years’ worth. Maybe the SEPH is better at that?

  5. Unknown

    Stephen, interesting post. Coming from a more micro/labour perspective, a couple of reasons why us types look at LFS much more than SEPH.
    – It’s difficult to get access to micro data files for any kind of enterprise-level or administrative data set – so the LAD (Longitudinal Administrative Databank), for example, is very underutilized.
    – Administrative data collected from enterprises doesn’t tend to have a lot of demographic characteristics, so it can’t be used to analyze many of the things that micro and labour economists are interested in, e.g. what happens to the probability of marriage or divorce or childbirth when a person gets laid off.
    But you’re right, it’s odd that the SEPH numbers get so little attention.

  6. Unknown

    Nick – Ah, the inverse regression problem. I should have thought of that. I’m adding that graph now.

  7. Unknown

    Frances – yes, the LFS is indeed the data source for most of the interesting behavioural questions. But for the purposes of business cycle analysis, all we really want to know is whether or not employment levels are increasing, decreasing or staying the same.

  8. Ajax

    I think that the problems of the past that have plagued SEPH have contributed to its 2nd tier status among Canadian employment series. I can recall two issues, there are probably others. First, there have been breaks in the series, making time series analysis problematic. Second, in the late 80s or early 90s, SEPH ran into repeated errors and Stats Can had to issue retractions … eventually the agency pulled the survey from its regular release schedule for a couple of months. This severely damaged the survey’s credibility and economists ignored it thereafter.

  9. Unknown

    The series I have in front of me starts in 1994. I don’t see how they could get a census based on administrative data wrong.

  10. Unknown

    Each quarter, you post a forecast of GDP based on two monthly GDP numbers and the employment number for the last month of the quarter. Do you use the LFS or SEPH number for the latter? Have you ever investigated which works better?

  11. Simon van Norden

    “Recall that a census gathers data from the entire population: there is no sampling, so there is no sampling error.”
    While true in spirit, I suspect StatCan analysts might have a different story. For example, I recall some good statisticians worrying that changes to the Census Act would adversely affect the reliability of results from Canada’s Census (not to be confused with the SEPH.)
    Looking through the links that Stephen kindly provided, I found lots of detail on methodology, but nothing to tell me how precise or accurate the SEPH is thought to be, and certainly nothing that I can usefully compare to the standard error of 27,000 that Stephen mentions for the LFS. The best I could find was a mention that the SEPH numbers are revised regularly after their initial release. That doesn’t tell us much, but I think it means that we should think of the SEPH numbers as estimates.

  12. Unknown

    Stephen: “The demands for media space being what they are, the SEPH is covered with all the enthusiasm it can muster for two-month-old news.”
    It shouldn’t be. When SEPH is released, it’s supposed to be news. A competent pro would use the data. The way we do.
    From my experience in the media (economics journalist in my youth, politics later), most economic and business writers are ignorant and biased. And some are outright dishonest to a level that would not be tolerated in music critics or restaurant reviewers.
    Incompetent: most economic writers have no training in the subject. The only one I know who has trained is Alain Dubuc from La Presse, who is a professional economist. (He has other problems.)
    Without training, their “knowledge” is mostly Chamber of Commerce platitudes overheard at business cocktails. And what they write must please the newspaper owner, rarely a fount of wisdom.
    A week ago, I wrote to Dubuc about the whole mess. I didn’t get an answer. Not that I expected a personal one, it would be a hassle. But he should have talked about it in his blog.
    From what I hear from my media friends, high unemployment in QC fits the narrative necessary in some political-media circles. You could feel the glee in Dubuc’s column “Le party est fini”.
    AFAIK, the only one in the media to have covered the controversy adequately is Jay Brian in The (Montreal) Gazette. He always works his pieces thoroughly. He is one of the few just men in the media Sodom and Gomorrah but, instead of appearing in the main ed and op-ed pages, he is cast off, like a leper, in the business page.

  13. Neil

    The SEPH as a census makes me curious. Many small employers only file remittances quarterly, and the numbers included when they file are # of employees during the 3-month remittance period. Since everyone’s on the same schedule, this would mean every three months, there’s a bump in the amount of available data, and the other 2 months would lack a lot of small employers. The qualifications for quarterly remittances are to have less than $3,000 per month owed to CRA, and to have not missed a payment deadline in the previous year.
    Maybe this is too small of a group to matter. Maybe there’s something else I’m missing.

  14. Roland Jodoin

    The last time I had a thorough look at SEPH data (and it was thorough, trust me), I found major breaks in detailed series at the 3- or 4-digit level. But I’m an old guy… we’re talking about work conducted in the early 1990s about late 1980s data… In any case, it looks like the ‘bad’ data from the 1980s was of poor quality and was not kept as they transitioned to NAICS.
    I think what scared many analysts away from SEPH back then is that lower-quality detailed series stood out because of the presence of those breaks (breaks never look good) while the variance in LFS series, which should be randomly distributed, was more pleasant to the eye.
    When I worked at StatCan (outside of the labour survey area) the saying among analysts was “For level, trust LFS. For industry detail, trust SEPH.” This is because the industry coding in LFS data is done by respondents, who often have no clue what their establishment’s industry is, as opposed to their corporation’s. The good old example was that of the Coca-Cola truck driver, who works in the transportation industry, not beverages. Unless I’m mistaken, the productivity stats include a combo of the two.

  15. Roland Jodoin

    And… how about doing that scatter plot again, this time with quarterly data? Maybe there’s a tiny timing issue because of the difference in reference period.
    By the way, what did you find about Quebec, Stephen?

  16. philip cross

    It is no surprise that the monthly correlation of SEPH and LFS is so low. Besides differences in coverage, there are conceptual differences. The most important is LFS measures employment; the P in SEPH stands for payrolls. You could have been hired by an organization (and thus employed in the LFS), but you are not in SEPH until you receive your first pay stub. I just got an email from someone asking how long the delay can be, no more than a couple of weeks, right? Maybe in the private sector, which contracts out this activity to four big firms, who presumably are quite good at this. But then there is government, where the delay can literally be months. Moreover, the gap between LFS and SEPH has changed over time; as the private sector has become more efficient, the federal govt has gone in the other direction. Why can’t govt benefit from the technology available in the private sector? In a word, the Financial Accountability Act, which added layers of complexity to an already arcane process in the federal govt. I will let Prof Gordon and others comment on the efficiency of issuing payrolls in the education and other parts of the public sector. Fortunately, these differences in timing are a problem only in the short-run. If you compare LFS and SEPH on a year-over-year basis, for example, the two move together quite well, even before adjusting for different coverages.

  17. Unknown

    Thanks for stopping by and taking the time to comment!
    As per Roland Jodoin’s suggestion, I was going to take a closer look at the longer-run coherence in the growth rates. A follow-up post is forthcoming.

  18. Wendy

    Thanks for this. Maybe it will help explain discrepancies we’ve found between strong commercial real estate demand in, yet mediocre LFS numbers for, Calgary. Only problem is, as far as I can tell from reading CANSIM SEPH table list, SEPH isn’t available at the city level.

  19. Roland Jodoin

    Great to have you join the forum, Philip. Although maybe you had already commented under some exotic pseudonym.
