Identifying Similarly Situated Employees in Employment Discrimination Cases

The intent of most employment equity analyses is to determine what the treatment of a protected group of employees would have been in the absence of discrimination. To be valid, those analyses have to take into account any legally relevant differences between the protected employees and a comparison group of employees (e.g., all of the other employees, white employees, males, or white males). The analyst can take into account differences in qualifications, productivity, and other circumstances either by using methods that statistically control for such variables (such as logistic regression analysis) or by identifying pools of similarly situated employees with minimal average differences between the protected and nonprotected employees within each pool.

In order of increasing accuracy and validity, pools can be based on: (a) the current composition of the work force; (b) the composition of the work force annually during the relevant period; (c) even more frequent snapshots of the composition of the work force; or (d) the specific circumstances of each selection event and the particular employees who met the relevant criteria on the day of that selection. Pools that are specifically tailored to each selection event are derived from an employee history file by using a computer program that passes through the file and identifies everyone who met the relevant criteria on the date of the selection event. For a few dozen selection events, the computer code defining each pool can be written by hand, but for hundreds or thousands of selection events, one needs a separate code-generating program that produces the pool-defining code that passes through the employee history file.
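The pass through the employee history file can be illustrated with a minimal Python sketch. It is not the program described here: instead of generating separate code for each selection event, it stores each event's criteria as data and applies one generic filter, which is one possible way to scale to thousands of events. The record layout (Spell, SelectionEvent), the field names, and the criteria (feeder job, feeder grade, minimum days in grade) are all hypothetical and would have to be replaced by the legally relevant criteria in an actual case.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Spell:
    """One row of a hypothetical employee history file: a period in one job and grade."""
    employee_id: str
    job: str
    grade: int
    start: date
    end: Optional[date]  # None means the spell is still open


@dataclass(frozen=True)
class SelectionEvent:
    """Hypothetical criteria for one selection event (e.g., one promotion)."""
    event_id: str
    event_date: date
    feeder_job: str
    feeder_grade: int
    min_days_in_grade: int


def build_pool(event: SelectionEvent, history: List[Spell]) -> List[str]:
    """Return the IDs of employees who met the event's criteria on the event date."""
    pool = set()
    for s in history:
        # The spell must cover the date of the selection event.
        active = s.start <= event.event_date and (s.end is None or event.event_date <= s.end)
        if not active:
            continue
        if s.job != event.feeder_job or s.grade != event.feeder_grade:
            continue
        if (event.event_date - s.start).days < event.min_days_in_grade:
            continue
        pool.add(s.employee_id)
    return sorted(pool)


def build_all_pools(events: List[SelectionEvent], history: List[Spell]) -> Dict[str, List[str]]:
    """One pass per event; the event criteria are data, so no per-event code is needed."""
    return {e.event_id: build_pool(e, history) for e in events}


# Invented sample data, for illustration only.
history = [
    Spell("A", "clerk", 3, date(2019, 1, 1), date(2020, 6, 30)),
    Spell("A", "clerk", 4, date(2020, 7, 1), None),
    Spell("B", "clerk", 3, date(2020, 1, 1), None),
    Spell("C", "clerk", 3, date(2020, 5, 1), None),
    Spell("D", "analyst", 3, date(2018, 1, 1), None),
]
events = [
    SelectionEvent("E1", date(2020, 6, 15), "clerk", 3, 90),
    SelectionEvent("E2", date(2021, 1, 10), "clerk", 3, 90),
]
pools = build_all_pools(events, history)
print(pools)  # {'E1': ['A', 'B'], 'E2': ['B', 'C']}
```

In this invented example, employee C is excluded from the first pool for lacking 90 days in grade, and employee A drops out of the second pool after moving to grade 4, so each pool reflects who was actually eligible on the day of that selection.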

The advantage of identifying pools of similarly situated persons is that the analyses to detect the effects of employment discrimination can then be simple to perform and simple to explain to the court (although multivariate analysis methods may be more powerful when the qualifications are continuous variables). As with any analysis that is performed many times for subsets of the work force, the results from separate pools need to be aggregated properly to determine whether there is a pattern and practice of employment discrimination that may not be detectable in any single pool's analysis.
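The text does not say which aggregation method is intended. One common choice for combining selection results across pools is a Mantel-Haenszel type statistic, sketched below in Python as an assumption rather than as the method used here. Each pool contributes its observed number of protected employees selected, the number expected if selections were independent of group membership, and the corresponding hypergeometric variance. The names PoolResult and aggregate_pools are hypothetical, and the sketch applies no continuity correction.

```python
import math
from typing import Iterable, NamedTuple, Tuple


class PoolResult(NamedTuple):
    """Counts for one pool of similarly situated employees (hypothetical layout)."""
    protected_in_pool: int
    others_in_pool: int
    protected_selected: int
    others_selected: int


def aggregate_pools(pools: Iterable[PoolResult]) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for a Mantel-Haenszel type test across pools."""
    observed = expected = variance = 0.0
    for p in pools:
        n = p.protected_in_pool + p.others_in_pool      # pool size
        m = p.protected_selected + p.others_selected    # selections made from the pool
        if n < 2:
            continue  # a one-person pool carries no comparative information
        observed += p.protected_selected
        expected += p.protected_in_pool * m / n
        # Hypergeometric variance of the number of protected employees selected.
        variance += (p.protected_in_pool * p.others_in_pool * m * (n - m)) / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("no pool contributes any variance; the statistic is undefined")
    z = (observed - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return z, p_value


# Invented data: ten pools of 5 protected and 5 other employees, one selection
# per pool, and the selection always goes to a nonprotected employee.
one_pool = PoolResult(5, 5, 0, 1)
z_single, p_single = aggregate_pools([one_pool])
z_all, p_all = aggregate_pools([one_pool] * 10)
```

With these invented counts, a single pool gives z = -1, which is unremarkable on its own, while the ten pools together give z of about -3.16. That is the situation the paragraph describes: a pattern that no individual pool reveals can emerge once the pools are combined.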