In an otherwise excellent review and analysis of the effect of new case law in California on the use of the fifth edition of the Guides to the Evaluation of Permanent Impairment, the authors take a pot shot at Functional Capacity Evaluation.  They report, “Functional capacity evaluations (FCE) are of no value in rating permanent impairment or permanent disability within the context of workers’ compensation litigation; the results of these assessments are often unreliable.  FCE protocols vary in quality and in the context of determination of benefits and litigation it is common for examinees to under-demonstrate their capabilities.  Studies have conclusively shown that FCE performance does not predict sustained return to work in claimants with chronic back pain.  Scientific scrutiny has additionally demonstrated that work restrictions which are based on FCEs are harmful to the health of examinees.  FCEs are not a comparable method of evaluation of impairment and therefore do not rebut the Guides.”

The authors are arguing that, in the formula the state of California uses to develop a disability rating, impairment ratings based on the Guides should not be open to rebuttal by better information.

I would like to open up a civil discussion that throws light on these important issues, beginning with a few principles on which we can likely agree.

The central principle is that every disability rating system must fairly lead to compensation that reflects the actual work disability.

Another principle is that better decisions are made when the information provided is pertinent to the question.

Another principle is that there is always a trade-off in test-based decision-making between safety, reliability, validity, and practicality and, when the trade-off is handled poorly, utility suffers.


The argument put forth by Brigham and Uehlein goes to the distinction between reliability and validity, with the former putting a ceiling on the latter but not substituting for it.  With regard to the present issue, the Guides is unquestionably more reliable than ad hoc impairment ratings of a physician, even one who is very well trained.  That is the strength of the Guides.  Good reliability, however, only sets the stage for good validity.  Measuring systems can be devised that are very accurate but produce information that contributes little to the final decision, which is an issue of validity.  Said another way, to the degree that information is related to the question, the decision will be correct.  In the present case, everyone agrees that the Guides provides precursor information rather than pertinent information.  Impairment is not directly related to disability, but is pertinently related to functional limitation, which is directly related to disability.  To the degree that functional limitation information is reliable, its validity will always be superior to that of impairment information.
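The ceiling that reliability places on validity has a standard statement in classical test theory.  As a sketch, using symbols introduced here for illustration (they do not appear in the Guides or in the paper under discussion): let r_xx be the reliability of the measure, r_yy the reliability of the criterion (here, actual work disability), and r_xy the observed validity coefficient.

```latex
% Classical test theory: the observed validity coefficient cannot exceed
% the square root of the product of the two reliabilities.
\[
  r_{xy} \;\le\; \sqrt{r_{xx}\, r_{yy}}
\]
% Illustrative values only: with r_{xx} = 0.81 and a perfectly reliable
% criterion (r_{yy} = 1), the ceiling on validity is \sqrt{0.81} = 0.90.
```

The inequality supplies only a ceiling, not a floor.  A perfectly reliable measure can still have a validity near zero if it is not pertinent to the question, which is the point made above.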

The Test Factors Hierarchy is a helpful thought tool.  In any testing situation, four factors presented in hierarchical order must be adequately addressed in order for utility to occur.  These factors in order are Safety, Reliability, Validity, and Practicality.  There is always a tension among these factors, with satisfaction of the senior factors more important, but all requiring consideration.  When working with a person who has a medical impairment, Safety is always an important consideration.  The interface of a particular person to a particular test may lead to a safety risk that is inappropriate to undertake; the test is thus not administered.  Reliability is the next factor in the hierarchy and is necessary to establish so that Validity can occur.  However, many strategies to improve reliability vitiate validity, causing dependable data to be meaningless to the issue at hand.  If the information is not pertinent to the question posed, there is no validity, even if the information is reliable.  Practicality is the fourth factor and comprises all the costs associated with the test.  In recent years, practicality has had inordinate importance in workers’ compensation decision-making, threatening the utility of information-collecting processes such as functional capacity evaluation.

The authors of the current paper provide four peer-reviewed scientific references to support their critique of functional capacity evaluation.  Of course, many other studies that reach different conclusions have been published, including several in which I participated.  I would like to attempt to throw more light on this subject by describing my first peer-reviewed study in this area. 

In 1986, a study entitled “Evaluation of Lifting and Lowering Capacity” was published in the Vocational Evaluation and Work Adjustment Bulletin.  We presented data from 132 consecutive functional capacity evaluations, 84% of which were within the workers’ compensation context in California.  Although each of these injured workers was permanent and stationary, only four subjects had received recommendations concerning the amount and type of work that he or she should be able to perform, as distinguished from performance restrictions.  Even when work restrictions were provided they were not useful in planning a return to work, such as, “restricted from heavy lifting and repeated bending and stooping”.  A subset (24) of the evaluations compared the injured worker’s FCE performance to job demands, with 14 of these resulting in a finding that he or she would be able to perform the work indicated.  Based on average case costs at that time, we calculated a net of $2.15 saved for every dollar spent on functional capacity evaluation. 

Perhaps most importantly, the physician’s work restrictions were found not to predict function.  In the most conservative approach possible, we considered only the 34 subjects who had injuries to the lumbosacral spine.  Comparing their work restrictions with the actual maximum safe and dependable lift each injured worker demonstrated yielded a Spearman rank-order correlation of r = -.159.
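For readers unfamiliar with the statistic, the following is a minimal sketch of how a Spearman rank-order correlation can be computed.  It is not the analysis code from the 1986 study.  The data, the variable names, and the coding of work restrictions as an ordinal level are all hypothetical, chosen only to show the mechanics.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data, NOT taken from the 1986 study.
# restriction_level: assumed ordinal coding of the physician's restriction.
# max_lift_lb: assumed maximum safe and dependable lift from the evaluation.
restriction_level = [1, 2, 2, 3, 3, 4]
max_lift_lb = [40, 55, 20, 60, 35, 45]
rho = spearman_rho(restriction_level, max_lift_lb)
```

A coefficient near zero, such as the -.159 reported above, means the ordering of subjects by work restriction tells us almost nothing about their ordering by demonstrated lift.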

This paper concludes with, “Use of the results of the evaluation procedures are suggested as an alternative to the physician’s report of work restrictions as a basis of vocational exploration and as a means of job assignment after injury.”  This recommendation is precisely on target 23 years later, and is reflected in the new case law.

As we move into the future, thoughtful people who are concerned about fairness are encouraged to use the Test Factors Hierarchy to evaluate test-based decision-making systems.  One way to look at the most recent case law is that it is an attempt to return to this hierarchy, in which the Practicality of the process is given subordinate importance.  In the paper, a quotation from a letter from Governor Arnold Schwarzenegger to the insurance commissioner begins, “Given our current economic environment…”.  Of course we must consider the economic environment, but we must also respect the tacit contract that is the foundation of all workers’ compensation systems, which allows the injured worker to receive fair treatment if he or she is willing to give up the opportunity to sue the employer for damages.  Obviously, this is a system that can work better than the tort system, but it requires fair and equitable treatment for all.


  1. C. Brigham and W. Uehlein, Impairment Ratings in California: Analysis of the Almaraz and Guzman Decision and Impact on the Use of the AMA Guides, Fifth Edition. The Guides Newsletter, 2009: p. 1-13.
  2. L. Matheson, Evaluation of lifting and lowering capacity. Vocational Evaluation and Work Adjustment Bulletin, 1986. 19(4): p. 107-111.