Evaluating non-formal learning in the context of the Four-Level Model

Donald L. Kirkpatrick first published his ideas on evaluating learning in 1959, in a series of articles in the US Training and Development Journal. The articles were subsequently included in his book Evaluating Training Programs (originally published in 1975; I have the 2006 edition).

In this text he outlined and further developed his theories on evaluation, culminating in the Four-Level Model, arguably the most widely used approach to the evaluation of training and learning. Kirkpatrick's Four-Level Model is now considered an industry standard across the HR and training communities (see Table 1.1).

Table 1.1 Kirkpatrick’s Four-level Model

| Kirkpatrick’s Model | Learning Effect |
| --- | --- |
| Level 1: Reactions | Evaluate participants’ satisfaction with the learning intervention. |
| Level 2: Learning and Level 3: Behaviour | What do participants know that they didn’t know before? How are they using that knowledge in their jobs? What is the learning and performance effect of the intervention? |
| Level 4: Organisation-level benefits | Has the development of higher levels of domain knowledge improved organisational productivity? |

According to the model, evaluation should always begin with Level One and, as time and budget allow, move sequentially through Levels Two, Three and Four. Information from each level serves as a base for the next level’s evaluation. Each successive level represents a more precise measure of the effectiveness of the training program; however, each level also requires a more rigorous and time-consuming analysis. As we shall see, the characteristics of non-formal learning (NFL) mean that not all four levels can be applied to the evaluation of this approach to learning.

In Chapter 5 of Evaluating Training Programs (2006) Kirkpatrick and Kirkpatrick stress the significance of measuring learning, because “no change in behaviour can be expected unless one or more of the learning objectives have been accomplished” (p.42). They set out “helpful” guidelines for the measurement of learning:

Table 1.2 Guidelines for evaluating learning

  1. Use a control group if practical.
  2. Evaluate knowledge, skills and/or attitudes both before and after the program.
  3. Use a paper-and-pencil test to measure knowledge and attitudes.
  4. Use a performance test to measure skills.
  5. Get a 100 per cent response.
  6. Use the results of the evaluation to take appropriate action.

This approach is incompatible with the assessment of non-formal learning (particularly in the context of e-learning) in a number of ways:

  1. The singular nature of each non-formal learning event means that no control group can exist to measure a difference between it and the “experimental group” (p.43).
  2. Similarly, a significant number of the learners who access NFL do so asynchronously, via a number of different learning channels, over a broad time span and in a relatively ad hoc manner. This “just enough, just in time” aspect of non-formal learning is one of its strengths, but it makes it unrealistic to attempt to measure learning using an experimental method.
  3. The lack of summative assessment in non-formal learning precludes both paper-and-pencil and performance testing as measures of learning.
  4. The distributed nature of access to these NFL interventions, over both time and location, makes getting a 100 per cent learner response rate practically impossible.

I’ll be discussing approaches to overcoming these challenges in my next posting.


Kirkpatrick, D. L. & Kirkpatrick, J. D. (2006) Evaluating Training Programs. 3rd ed. San Francisco, CA: Berrett-Koehler Publishers, Inc.