

Can Technology Eliminate Human Error?

    by A G Foord & W G Gulland, 4-sight Consulting


    Abstract

    This paper argues that it would not be possible to design technological systems to eliminate all human errors during operation because people are involved in: specifying, designing, implementing, installing, commissioning and maintaining systems as well as operating them. The paper illustrates this with examples of incidents caused by human error and concludes that, even if systems can operate without human intervention, there is still the possibility of human error at other phases of the lifecycle. Thus to improve process safety it will be necessary to focus on behaviour and methods of working during all phases of the lifecycle so as to remove or reduce opportunities for human error.

    Keywords

    safety, human behaviour, methods of working, specification errors, design errors, installation errors, maintenance errors.


Introduction

Trevor Kletz (2004) chaired a half-day discussion at the Hazards XVIII (2004) conference in Manchester on “Human Factors & Behaviour”. The speakers put forward two opposed views. Some described ways of changing behaviour and others thought that it was more effective, particularly when trying to improve process safety, to change designs or methods of working so as to remove or reduce opportunities for human error. All agreed that there is a need for both approaches - the differences were ones of emphasis - but nevertheless there was a real disagreement on which approach was the more practicable and successful.

The disagreement reported implies that some people suggest focusing primarily on a design that eliminates human errors during operation. This paper gives examples of incidents in a variety of industries to illustrate:

  • the problems of focusing on designs when trying to improve process safety; and
  • the need to consider the behaviour and methods of working of designers.

When no reference is cited, the information came from a private communication or our own experience.

Human error

The authors agree that, when trying to improve process safety, we need to consider all three approaches to reduce the opportunities for human error:

  1. ways of changing behaviour,
  2. designs, and
  3. methods of working.

Any discussion of the value of focusing on designs needs to consider:

  1. the various types of human error; and
  2. the design process itself.

Types of human error

Figure 1. Types of Human Failures.
Based on Figure 2 from HSG48 (1999).

Figure 1 (based on Figure 2 from HSG48 (1999)) provides one classification of types of human failure. Thus, we might think that a design could aim to eliminate, during operation, all seven of the categories of errors, mistakes and violations shown in Figure 1. A design would then not be accepted unless it had been evaluated against all possible skill-based errors, mistakes and violations during its operation. For example, Hazards XVIII (2004) mentions on page 663 an “Application Guide on Human factors in Engineering Design” (Kariuki & Löwe 2004) - a guideline to assist engineers to design a process facility that addresses the capability of the operator.

Design process

However, operation is only part of the lifecycle of any design, and the guideline (Kariuki & Löwe 2004) does not cover the capability of the designer, the activities involved in design, or any parts of the lifecycle other than operation and maintenance. Human error needs to be considered during:

  1. specifying
  2. designing
  3. implementing
  4. installing & commissioning
  5. operating and
  6. maintaining.

The book Out of Control (2003) summarises the analysis of 34 incidents in the process industry. On pages 44 & 45 it states:

“… a total of 56 causes were identified for the 34 incidents. This data has been grouped in Table 2, which gives the percentage of the primary causes attributable to each lifecycle phase.”

Figure 2. Allocation of Primary Causes of Incidents to Project Lifecycle Phases.
Based on Figure 10 from Out of Control (2003).

Figure 2 (based on Figure 10 from Out of Control 2003) presents these figures in a pie chart. The summary in Out of Control reveals that technology failures resulted in only a small proportion of the incidents. In the 34 incidents analysed, 44% had inadequate specification as their primary cause.

Other primary causes listed are:

  • 20% changes after commissioning;
  • 15% design and implementation;
  • 15% operation and maintenance; and
  • 6% installation and commissioning.
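The arithmetic behind these percentages can be checked with a short sketch. The figures are those quoted above from Out of Control (2003); the variable names are illustrative only and do not come from that book.

```python
# Percentage of primary causes per lifecycle phase, as quoted above
# from Out of Control (2003). Variable names are illustrative.
primary_causes = {
    "specification": 44,
    "changes after commissioning": 20,
    "design and implementation": 15,
    "operation and maintenance": 15,
    "installation and commissioning": 6,
}

# The five phases should account for all primary causes.
total = sum(primary_causes.values())

# Share of primary causes arising outside operation and maintenance.
outside_operation = total - primary_causes["operation and maintenance"]

print(f"total = {total}%")
print(f"outside operation and maintenance = {outside_operation}%")
```

On these figures, most primary causes lie outside operation and maintenance, so a design aimed only at operator error would leave them unaddressed.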

In order to produce a design that eliminated all human errors during operation we would also need to eliminate human error from all the other phases of the lifecycle. Out of Control (2003) illustrates how difficult this is to achieve for control, monitoring and protection systems. The following examples illustrate that similar problems apply to many designs, not just control, monitoring and protection systems.

1. Specification error example

The specification should include both the function required and the environment within which the function is to be performed. Kletz (1999) states on page 158 “For example, a Hazan showed that the probability of a leak of toxic material was acceptably low. At times small fragile packages containing toxic substances had to be moved but they were conveyed in trolleys and kept in them.  However, when the lift was out of order a package was carried downstairs and placed on a table. It slid off and the contents leaked.”

Software tools are available to check that explicit specifications are consistent, but it is much harder to ensure that specifications are complete. It is difficult to imagine a technology that does not depend upon human behaviour to define a specification, although technology could be used to warn that the environment is not consistent with the specification. For example, monitoring systems are now available that can warn of abuse of packages during transportation.
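As a minimal sketch of what a consistency check can and cannot do, assume a specification is written as allowed ranges for named quantities. The requirement names and values below are invented for illustration and do not come from any real specification or tool.

```python
# Hypothetical explicit requirements: each gives an allowed range
# (low, high) for a named quantity. Names and values are invented.
requirements = [
    ("storage_temperature_C", 5, 25),
    ("storage_temperature_C", 20, 40),
    ("transport_tilt_deg", 0, 10),
    ("transport_tilt_deg", 15, 30),
]


def find_inconsistencies(reqs):
    """Return quantities whose stated ranges have no value in common."""
    allowed = {}
    for name, low, high in reqs:
        cur_low, cur_high = allowed.get(name, (low, high))
        # Intersect the new range with what is allowed so far.
        allowed[name] = (max(cur_low, low), min(cur_high, high))
    return sorted(name for name, (low, high) in allowed.items() if low > high)


conflicts = find_inconsistencies(requirements)
print(conflicts)
```

A check of this kind can only compare requirements that were written down. It has no way to report that a case such as "the lift is out of order" was never specified, which is why completeness still depends on the people writing the specification.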

2. Design errors

Good design can make some errors physically difficult. For example, the instructions for one of the first Apple portable computers stated "with the aid of a large hammer it is possible to insert the battery the wrong way round!" Unfortunately the same care was not taken with the design of a light guard on a guillotine, which used an edge-mounted printed circuit board. The guard stopped the blade from operating if anything interrupted the beams of light in front of the guillotine. After maintenance the printed circuit board was reinstalled upside down. Interrupting the light beams then caused the blade to move, injuring an operator. The design was not fail-safe, so this is primarily a design error, not just a maintenance and testing error.

3. Implementation errors

A supplier provided 20 flanges, 19 mild steel and 1 stainless steel (as he did not have enough mild steel flanges). He thought this was helpful as the stainless steel flange was better than the required mild steel specification. He did not consider that a weld made using the mild steel procedure would have failed on the stainless steel flange. Fortunately the customer used 100% material testing and the mistake was identified before welding commenced. Given a good specification, implementation errors should be easier to detect, but design and implementation errors are still a significant cause of incidents.

4. Installation and commissioning errors

The low pressure trip failed at a power station. Examination revealed that the commissioning override had never been removed.  (This might be considered a design error as the override should have included a timer that limited how long the override could remain in place.)
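The timed override suggested in parentheses above might look like the following sketch. It is an illustration only, not the design of the power station's system: the class name, the one-hour limit and the pressure values are all hypothetical.

```python
import time


class TimedOverride:
    """Hypothetical commissioning override that expires automatically."""

    def __init__(self, max_duration_s, clock=time.monotonic):
        self._max_duration_s = max_duration_s
        self._clock = clock      # injectable so the logic can be tested
        self._set_at = None      # None means no override is set

    def set(self):
        """Apply the override and start the timer."""
        self._set_at = self._clock()

    def clear(self):
        """Remove the override explicitly."""
        self._set_at = None

    def is_active(self):
        """Return True only while the override is set and has not expired."""
        if self._set_at is None:
            return False
        return (self._clock() - self._set_at) < self._max_duration_s


def trip_required(pressure, low_limit, override):
    """The low pressure trip acts unless an unexpired override is in place."""
    return pressure < low_limit and not override.is_active()


# Usage with a fake clock so the example runs instantly.
now = [0.0]
override = TimedOverride(max_duration_s=3600, clock=lambda: now[0])
override.set()
suppressed = trip_required(pressure=1.0, low_limit=2.0, override=override)
now[0] = 3601.0  # over an hour later, with the override forgotten in place
restored = trip_required(pressure=1.0, low_limit=2.0, override=override)
print(suppressed, restored)
```

With the timer in place, forgetting to remove the override no longer disables the trip indefinitely. The time limit itself still has to be chosen and specified correctly by a person.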

A distillation column was upset every time it rained. Months were spent trying to analyse what was happening. Eventually a small plastic protective plug was found in the vent port of the air supply regulator of the valve controlling the heat input to the column; normally there was sufficient leakage around the plug for the regulator to work correctly, but when rainwater got into the threads between the regulator and the plug it formed a seal. The plug should have been removed as part of the installation and commissioning process.

5. Operation errors despite design for safety

Some decades ago there was an unusual increase in the incidence of minor shunt road accidents. Initial analysis showed the increase was occurring with more expensive cars, and further analysis showed these cars were fitted with anti-lock braking systems (ABS). After questioning many drivers it eventually emerged that a significant number of drivers were driving closer to the car in front, knowing that ABS would enable them to brake harder and more effectively.

This is now recognised as a possible effect of any car safety device: seat belts; air bags; active suspension; etc. Some drivers may drive nearer the edge of the safe envelope assuming that the technology will protect against their driving errors. “Risk compensation” (sometimes known as “risk homeostasis”) is the term now used to describe this behaviour.

A chemical plant had separate storage tanks for acid and caustic soda. Both were delivered by road tanker. One day the supervisor challenged an operator who was leaving the control room with a large spanner.  The operator explained that he needed to change a coupling on the plant as it did not fit the tanker that had just arrived. The supervisor explained that it was designed that way to prevent liquid being unloaded into the wrong tank. The design plus the intervention of the supervisor prevented an accident arising from mixing acid and caustic soda.

6. Maintenance errors

Repair or replacement

    1. Aircraft windscreen (Aviation Safety Network 1990)

    The windscreen of a British Airways BAC One-Eleven 528FL passenger aircraft was replaced using bolts of which 84, out of a total of 90, were of smaller diameter than specified. The maintenance work looked complete, but on 10 June 1990 at altitude, when the aircraft was pressurised, the windscreen blew out. The commander was sucked halfway out of the windscreen aperture and was restrained by cabin crew whilst the co-pilot flew the aircraft to a safe landing at Southampton Airport.

Cleaning

    1. Shutdown valve

    A solenoid operated pneumatic valve (SOV) was a vital part of a shutdown system. The exhaust port of the SOV was not covered while the surrounding area was cleaned by sand blasting. Sand entered the SOV and caused it to fail to danger. Fortunately this was discovered at the next trip (proof) test, so no injuries resulted.

    2. Aircraft pressure ports (Job 1998)

    The underside of an aircraft was to be cleaned so the static (reference) ports for both the air speed indicator and the altimeter were covered with strong adhesive tape before cleaning began. The maintenance procedure required that after completing the cleaning, the cleaner should remove the tape and the maintenance supervisor should sign that he had checked that the tape had been removed. Despite these precautions, the aircraft took off with both static ports still covered with tape. During the flight the pilot received the bizarre combination of simultaneous warnings for both overspeed and stall (the stick shaker operated as part of the stall warning). The pilot made the plane dive to avoid stalling but the altimeter read 9700 feet when in fact the plane was less than 1000 feet above the sea and the “too low terrain” warning sounded as well as the overspeed and stall warnings. Sadly the plane crashed into the sea with the loss of all crew and passengers before the false alarms were recognised.

Conclusion

The examples in Out of Control (2003) and those above illustrate that human error may occur when specifying, designing, implementing, installing, commissioning, operating, and maintaining systems. Even if technological systems can operate without human intervention, there is still the possibility of human error at other phases of the lifecycle.

Design may reduce the possibilities for human error during operation, but only if human error can also be eliminated during all the other phases of the lifecycle. Thus it will be necessary to focus on human behaviour and methods of working during the whole lifecycle, not just operation. Focusing on “the design for operation” alone will not be enough.

References

  1. Aviation Safety Network (1990) [online]. [accessed January 2005]. Available from World Wide Web:
    http://aviation-safety.net/database/record.php?id=19900610-1
  2. Hazards XVIII (2004) IChemE Symposium Series No 150: Process Safety - Sharing Best Practice, IChemE, ISBN 0852954603, pages 652-725
  3. HSG48 (1999) Reducing error and influencing behaviour, HSE Books, ISBN 0-7176-2452-8, page 12
  4. Job, M (1998) Air Disasters Volume 3, Aerospace Publications Pty, ISBN 1 875671 34 X
  5. Kariuki, G. & Löwe, K. (2004) Incorporation of Human Factors in the Design Process, guide prepared for PRISM Focus Group 4, Institute for Plant and Process Technology, Process Safety and Plant Technology, Technische Universität Berlin, Germany [online]. [accessed January 2005]. Available from World Wide Web: http://www.prism-network.org/
  6. Kletz, T. (1999) HAZOP and HAZAN Identifying and assessing process industry hazards, IChemE, ISBN 0-85295-421-2, page 158
  7. Kletz, T. (2004) Trevor's Corner, Previous Issues, Corner no 1 [online]. [accessed January 2005]. Available from World Wide Web:
    http://psc.tamu.edu/TrevorSays/T%27s%20corner%201%20Rev.pdf
    from Mary Kay O'Connor Process Safety Center - http://psc.tamu.edu
  8. Out of Control (2003), Second edition, HSE Books, ISBN 0-7176-2192-8, pages 44 & 45