Human computer interaction is a relatively new field that has grown rapidly in recent years due to the ever-increasing use of computers and computer based systems. Lee and Paz (1991) commented relatively early in the life of human computer interaction that more research was needed because computer interfaces were no longer solely the domain of their designers, and because those designers were now atypical of users rather than representative of them, as they had been in the past. Johnson, Clegg and Ravden (1989) noted that literature on human computer interaction was increasing due to the "increasing use of computer based systems, the variety of people who are interacting with them, both within and outside work, and the rapid pace of technological advance". Significant progress has been made since those articles, with many research studies and much literature reporting on different aspects of human computer interaction (Baecker, Grudin, Buxton and Greenberg, 1995; Preece, Rogers, and Sharp, 2002; Hall, 1997; Dix, Finlay, Abowd and Beale, 1993; Carroll, 1997; Olson and Olson, 2003).

Human computer interaction aims to design and develop interactive products that "support people in their everyday and working lives" (Preece et al, 2002). The human computer interface (HCI) is the component of a computer product that interacts with humans. The aim of HCI design is to make a product easy, effective and enjoyable for the user to use (Preece et al, 2002). Poor interface design has been reported to result in increased user stress, lower work rates, decreased job satisfaction and misuse or lack of use of the computer system (Lee and Paz, 1991). Henderson, Podd, Smith and Varela-Alvarez (1995) and Johnson (1989) also reported that poor interface design could increase mistakes and user frustration, and result in poor system performance, employee dissatisfaction, high staff turnover, absenteeism and tardiness.

Adhering to the interface design principles recommended in HCI research and literature can help to reduce the severity and incidence of these negative effects. One important step in the recommended design process is evaluation. Evaluating the HCI as part of design allows the designer to see how the product interacts with its users. "Without evaluation, designers cannot be sure that their software is usable and is what users want" (Preece et al, 2002). Many authors discuss evaluation methods and techniques that can be used to evaluate HCIs. The purpose of this report is to review and discuss these evaluation methods, and to investigate the techniques used to evaluate HCIs, so that potential evaluators can understand them and make informed decisions about the most appropriate techniques for the HCI they are evaluating.

Evaluation of the human computer interface

The evaluation of HCIs enables the evaluator to assess how well a design suits users' needs and what users think of the design. Evaluation of the HCI should be performed throughout the entire design and production process, otherwise known as iterative design and evaluation (Baecker et al, 1995; Preece et al, 2002; Christie, Scane & Collyer, 2002). Baecker et al (1995) further defined iterative design and evaluation as repeating the design and evaluation of a HCI until a satisfactory result is achieved. Johnson et al (1989) reported that the benefits of using evaluation as part of the design process were increased user satisfaction, increased sales, decreased development costs, increased productivity, decreased product returns, and decreased training costs.

Approaches to the evaluation of the human computer interface

Before any evaluation of a HCI is performed, the evaluator must consider the approach that will be taken. Approaches to HCI evaluation relate to the general theory, methodology or perspective behind the evaluation and help to guide it, focusing the evaluator on what they are trying to achieve. Approaches are also a way of grouping together and defining the tools and techniques used in the evaluation of HCIs.

Preece et al (2002) and Christie et al (2002) have both defined models of approaches to the evaluation of HCIs that guide the evaluator to the most appropriate tools and techniques for the evaluation. The approaches are labelled paradigms (Preece et al, 2002) and perspectives (Christie et al, 2002). Both are sound and guide the evaluator to the same or similar tools and techniques, although Christie et al (2002), owing to the nature of the publication for which it was written, take a more expert-oriented approach.

Christie et al (2002) discussed theoretical perspectives in the evaluation of HCIs. Theoretical perspectives broadly describe the theoretical aspects of what the evaluation is trying to achieve. They encourage the evaluator to think about the evaluation in a more psychological sense and to determine what type of measures the evaluation should collect. The five perspectives are listed and briefly described below.

1. Cognitive perspective

2. Social psychological perspective

3. Organisational perspective

4. Psychophysiological perspective

5. Communication perspective

Christie et al (2002) reported that the Cognitive perspective relates to the compatibility between the information processing models of the human user and the computer interface. Norman (1988) clearly illustrated this issue in his mental model diagram, as shown in Figure 1, where the designer's model, the user's model and the system image can differ. The cognitive perspective looks at ways of evaluating HCIs with the goal of making the designer's model, the user's model and the system image as similar as possible.

Figure 1: Norman's mental model of HCI

The Social psychological perspective is concerned with how humans interact with each other through the use of computer interfaces. The Organisational perspective relates to how to apply a HCI model to the information processing and work flow requirements of the organisation. The Psychophysiological perspective emphasizes evaluation of human behaviour and the Communication perspective focuses on the effects that "constraints on communication between human or electronic partners in a task can have on the process and outcome of the work done" (Christie et al, 2002).

Preece et al (2002) discussed a different approach to the evaluation of HCIs. Before any tools or techniques for the evaluation are decided the evaluation paradigm must be determined. Evaluation paradigms are similar to the theoretical approaches discussed above but are more general approaches to evaluation. The evaluation paradigms can also be used to group and categorise different tools and techniques for the evaluation of HCIs.

Preece et al (2002) reported that each evaluation paradigm has a different purpose and that its use will vary according to the individual strengths, weaknesses and beliefs of the evaluator and to the requirements of the evaluation. Paradigms represent broad "ways" of performing the evaluation; they structure the evaluation and guide the evaluator to the tools and techniques appropriate to the chosen paradigm. The level of control over the evaluation, its location, the type of data the evaluator wishes to collect, the point in development at which the HCI is evaluated, and the role that users play in the evaluation will all help to decide which paradigm to use (Preece et al, 2002). The four paradigms are:

1. Quick and dirty evaluation

2. Usability testing

3. Field studies

4. Predictive evaluation

Quick and dirty evaluation is an informal way of getting feedback from users or professionals. It is relatively inexpensive and can be done quickly, but will not produce carefully documented findings (Preece et al, 2002). The risk of missing issues with this paradigm is high, so designers should ensure that the consequences of errors are relatively small.

Usability testing is a paradigm that has been the dominant approach to the evaluation of HCIs in the past (Preece et al, 2002) and has been the subject of many studies (Van den Haak et al, 2004; Henderson et al, 1995). Usability testing can also be described as an evaluation technique, but is recognised by Preece et al (2002) as an evaluation paradigm because it encompasses many different aspects.

Field studies can be used to evaluate the HCI in the users' normal setting to enable the evaluator to understand how the user will interact with the HCI. Field studies can allow the evaluator to gain valuable insight into the HCI in its proposed environment.

Predictive evaluation is a paradigm used by experts to predict usability problems in HCIs. Users are not required when conducting predictive evaluations and so predictive evaluation can be a quick and inexpensive way of evaluating a HCI. Heuristic evaluation is a form of predictive evaluation.

Each evaluation paradigm has different issues and requirements that make it suitable for different evaluation situations. Triangulation, or the use of multiple paradigms is recommended in evaluations and different evaluation techniques (that will be discussed later) can be used across different paradigms (Preece et al, 2002).

The evaluation paradigms described by Preece et al (2002) are targeted at less experienced evaluators and are more basic in their description. They may be less useful to expert evaluators than the theoretical perspectives listed by Christie et al (2002), but they are no less valid. Within these theoretical perspectives and evaluation paradigms are many different tools and techniques used to evaluate HCIs.

Techniques for the evaluation of the human computer interface

Evaluation techniques are more specific than evaluation approaches when evaluating HCIs. There are many different techniques that can be used for the evaluation of HCIs and there is no one specific evaluation technique that can be used for all situations, as there are many different evaluation situations. The most appropriate evaluation technique to use will depend on many factors and will be guided by the use of the evaluation paradigms or theoretical perspectives as discussed above.

Techniques can cross over different evaluation paradigms and theoretical perspectives and can be associated with more than one of each (Preece et al, 2002; Christie et al, 2002). The type of evaluation technique used will also depend on who will be involved: experts, users or both, as different techniques are used for each. Not involving users will speed up the evaluation and reduce its cost, but may also reduce its validity and reliability.

Three sources of techniques for the evaluation of HCIs will be investigated; however, those techniques presented by Preece et al (2002) will be discussed in fuller detail to allow this article to focus on evaluation guidelines for the more novice evaluator.

Christie et al (2002) presented a comprehensive list of HCI evaluation tools and techniques, relating them back to the theoretical perspectives, such as the cognitive and social psychological perspectives briefly discussed above. Christie et al (2002) took a more expert approach to describing the tools and techniques, relying on the reader possessing more expert knowledge and experience regarding the evaluation of HCIs. The tools and techniques discussed also require the evaluator to consult other literature to gain a better insight into each technique. The tools and techniques that Christie et al (2002) listed are:

* expert panel analysis

* predictive models

* audits/guidelines

* objective metrics

* dialogue error analysis

* focus groups

* questionnaires

* interviews

* stakeholder analysis

* observation of users

* physiological data

* user walkthrough

* controlled tests

* field trials

* critical events

Lee and Paz (1991) also listed a number of methods for the evaluation of HCIs. These methods do not use formal and familiar terms such as usability testing or heuristic evaluation; the name of each method is more descriptive, yet requires the evaluator to interpret the method in their own way. The evaluation methods reported by Lee and Paz are:

* Concept test or paper and pencil test

* Friendly users

* Hostile users

* Simulation of users

* Simulation trials

* Iterative informal lab experiments

* Formal lab experiments

* Field trials

The techniques described by Preece et al (2002) will be discussed in greater detail. Their techniques for the evaluation of HCIs are grouped into five main categories and are well defined, require less interpretation and are more numerous than those presented by Christie et al (2002) and Lee and Paz (1991). The categories help to distinguish what each technique is trying to achieve and give clarity to the exact methodology that should be used in the evaluation. The five categories are:

1. Observing users

2. Asking users their opinions

3. Asking experts their opinions

4. Testing users' performance

5. Modeling users' task performance to predict the efficacy of a user interface

Observing users is a valuable tool to determine "what they do, the context in which they do it, how well technology supports them and what other support is needed" (Preece et al, 2002). Observing users can be applied to the quick and dirty, usability testing and field studies paradigms. Observations can be performed in laboratory settings or in the users' natural environment. The data collected can be in the form of notes, photos, video footage and/or audio recordings, and may be exhaustive or simply handwritten notes on what the evaluator observed while the user was interacting with the HCI.

Asking users is a way of getting direct and immediate feedback from the user about the HCI. Simply asking users is an effective way of finding out "what they do, what they want to do, what they like and what they don't like" (Preece et al, 2002). Asking users can be used in the quick and dirty, usability testing and field studies paradigms and may take the form of interviews, questionnaires, discussions and focus groups. The data collected may be notes taken by the evaluator, the results of a questionnaire or simply notes taken by the user.

Asking experts can be an easy and inexpensive way of learning more about the HCI that is being evaluated. Asking experts involves inspections and walkthrough evaluations and can be used with quick and dirty and predictive evaluation paradigms. The quality of the data collected can depend on the experience and number of experts that are used to perform the evaluation. A team of experts will find a larger proportion of the problems associated with an HCI than one expert will (Preece et al, 2002).

Heuristic evaluation is a form of asking experts where an expert or team of experts evaluates a HCI following a set of principles called heuristics. Examples of heuristic principles are consistency and standards, error prevention, and visibility of system status and function (Lathan, Sebrechts, Newman and Doarn, 1999). Experts assess a HCI with consideration to heuristic principles and record usability problems to enable the problems to be rectified.
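The recording and aggregation of experts' findings can be sketched in code. The example below is a minimal, hypothetical illustration: the heuristic names echo those cited above from Lathan et al (1999), but the 0-4 severity scale and all the findings are invented for illustration and are not prescribed by the sources reviewed here.

```python
from collections import Counter

# Hypothetical findings from a panel of three experts; each problem is
# tagged with the heuristic it violates and an assumed 0-4 severity rating.
findings = [
    {"expert": 1, "heuristic": "visibility of system status", "severity": 3},
    {"expert": 1, "heuristic": "error prevention", "severity": 4},
    {"expert": 2, "heuristic": "error prevention", "severity": 2},
    {"expert": 3, "heuristic": "consistency and standards", "severity": 1},
]

# Count problems per heuristic so the worst-offending principles
# can be rectified first, and flag the single most severe problem.
problems_per_heuristic = Counter(f["heuristic"] for f in findings)
worst = max(findings, key=lambda f: f["severity"])

print(problems_per_heuristic.most_common(1))  # most-violated heuristic
print(worst["heuristic"], worst["severity"])
```

Pooling findings this way reflects the point above that a team of experts uncovers more problems than any single expert: the tally only becomes informative once several experts' reports are combined.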

User testing is usability testing: it involves measuring users' performance to evaluate the HCI. Measurements can include the time taken to perform a task, the errors made and the number of keystrokes performed. The data collected is then analysed so that it can be compared with a different design. User tests are confined to the usability testing paradigm; they are conducted in controlled settings and involve typical users of the HCI performing typical, well-defined tasks.
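The analysis step for such measurements is simple descriptive statistics. As a minimal sketch with entirely hypothetical session data (the measures are those named above: task time, errors and keystrokes):

```python
from statistics import mean, stdev

# Hypothetical results from five typical users performing the same
# well-defined task; times are in seconds.
sessions = [
    {"time_s": 48.2, "errors": 1, "keystrokes": 92},
    {"time_s": 61.5, "errors": 3, "keystrokes": 110},
    {"time_s": 39.9, "errors": 0, "keystrokes": 85},
    {"time_s": 55.0, "errors": 2, "keystrokes": 101},
    {"time_s": 44.4, "errors": 1, "keystrokes": 90},
]

# Summarise performance so it can be compared with a different design.
times = [s["time_s"] for s in sessions]
summary = {
    "mean_time_s": round(mean(times), 1),
    "sd_time_s": round(stdev(times), 1),
    "total_errors": sum(s["errors"] for s in sessions),
    "mean_keystrokes": mean(s["keystrokes"] for s in sessions),
}
print(summary)
```

Running the same summary over sessions recorded against a competing design gives the like-for-like comparison that the usability testing paradigm calls for.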

Modelling users' task performance with predictive models is performed by experts and omits users from the evaluation, making it generally quicker and less expensive. Predictive models attempt to predict the user performance, efficiency and possible problems associated with HCIs. They are found only in the predictive paradigm and are generally used early in the development process. The GOMS (goals, operators, methods and selection rules) model is an example of a predictive model (Preece et al, 2002).
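A concrete member of the GOMS family is the Keystroke-Level Model, which predicts expert task time by summing standard operator times. The sketch below is illustrative only: the operator values are the commonly cited ones from the KLM literature rather than from the sources reviewed here, and the encoded task is hypothetical.

```python
# Commonly cited Keystroke-Level Model operator times in seconds;
# exact values vary by source, so treat these as illustrative.
OPERATOR_TIME = {
    "K": 0.20,  # keystroke or button press (average skilled typist)
    "P": 1.10,  # point with a mouse to a target on screen
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation for the next action
}

def predict_time(operators: str) -> float:
    """Sum the operator times for a task encoded as a string of KLM symbols."""
    return round(sum(OPERATOR_TIME[op] for op in operators), 2)

# Hypothetical task: home hand to mouse, think, point at a field,
# click, then type a five-character code.
task = "HMPK" + "K" * 5
print(predict_time(task))  # → 4.05 seconds
```

Because such a prediction needs no participants, it illustrates why predictive evaluation is quick and inexpensive: two candidate interface designs can be compared simply by encoding the same task in each and comparing the predicted times.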

A method for evaluation of the human computer interface

Johnson (1989) reported that methods for evaluating the usability of HCIs should be systematic, based on existing criteria, iterative, general, participative, sensitive, simple to use, face valid, related to realistic usage of the system and reasonably exhaustive. To achieve these requirements the evaluator must follow a clear methodology when evaluating a HCI.

Preece et al (2002) presented a checklist entitled the DECIDE framework to assist evaluators to approach, plan and follow through a HCI evaluation. This framework may assist the evaluator to systematically design the evaluation so that the best possible HCI can be achieved. The DECIDE framework consists of the following key points:

1. Determine the overall goals that the evaluation addresses

2. Explore the specific questions to be answered

3. Choose the evaluation paradigm and techniques to answer the questions

4. Identify the practical issues that must be addressed, such as selecting participants

5. Decide how to deal with the ethical issues

6. Evaluate, interpret, and present the data.
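For evaluators who plan their work in code, the six DECIDE steps can be captured as a simple evaluation-plan record. The structure below mirrors the checklist; every field value is hypothetical and invented for illustration.

```python
from dataclasses import dataclass

# A hypothetical evaluation plan mirroring the six DECIDE steps.
@dataclass
class EvaluationPlan:
    goals: list[str]             # Determine the overall goals
    questions: list[str]         # Explore the specific questions
    paradigm: str                # Choose the evaluation paradigm...
    techniques: list[str]        # ...and the techniques
    practical_issues: list[str]  # Identify the practical issues
    ethical_issues: list[str]    # Decide how to deal with ethics
    analysis: str                # Evaluate, interpret and present the data

plan = EvaluationPlan(
    goals=["assess whether the new catalogue search is usable"],
    questions=["can first-time users complete a search in under a minute?"],
    paradigm="usability testing",
    techniques=["user testing", "questionnaire"],
    practical_issues=["recruit eight typical library users"],
    ethical_issues=["obtain informed consent", "anonymise recordings"],
    analysis="compare task times and error counts against the old design",
)
print(plan.paradigm)
```

Writing the plan down in this form forces each DECIDE question to be answered explicitly before any data is collected, which is the systematic quality that Johnson (1989) asks of an evaluation method.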

Using an approach to HCI evaluation such as the DECIDE framework will also guide the evaluator to the most appropriate method and, finally, the most appropriate tool.

Conclusion

The evaluator has many choices about how to evaluate a HCI and can apply different approaches and techniques to many different situations. The choice of approach can depend on factors that the evaluator controls or factors that the evaluator must work around. For example, the evaluator may be required to evaluate and report on a HCI in a short period of time with no budget, requiring an evaluation paradigm such as quick and dirty with an evaluation tool such as heuristics. Alternatively, the evaluator may have a larger budget and more time, allowing a well-structured predictive evaluation paradigm in which heuristics may also be used, but with a panel of experts to assess, discuss and report on the issues identified with the HCI.

Applying multiple evaluation methods to the evaluation so that a fuller understanding of the HCI may be achieved is recommended. This may be achieved by selecting different technique categories such as testing users and asking users their opinion. Using multiple techniques will help to ensure the evaluation will achieve better results and identify more problems.

This paper has presented numerous approaches, methods, models, tools and techniques that can be used to evaluate HCIs, as found in a small number of research articles and literature studies. Many other studies have no doubt interpreted the evaluation of HCIs in their own way, but there are underlying theories and techniques that must always be adhered to, such as the predictive evaluation paradigm and techniques such as usability testing and heuristic evaluation.

Bibliography

Baecker, R., Grudin, J., Buxton, W. and Greenberg, S. (eds) (1995). Readings in Human-Computer Interaction (2nd Ed). (San Francisco, Morgan Kaufmann Publishers).

Carroll, J. (1997) Human Computer Interaction: Psychology as a science of design. Annu. Rev. Psych. 48, 61-83.

Christie, B., Scane, R., and Collyer, J. (2002) Evaluation of human-computer interaction at the user interface to advanced IT systems. In Evaluation of Human Work: A practical ergonomics methodology. 2nd Ed. Taylor and Francis.

Dix, A., Finlay, J., Abowd, G. and Beale, R. (1993) The design process, Chapter 5. Human computer interaction (Hemel Hempstead, Prentice Hall). Pp 147-190.

Hall, R (1997) Proceedings of the 33rd Annual Conference of the Ergonomics Society of Australia, 25-27 November 1997, Gold Coast Australia. (Canberra, Ergonomics Society of Australia) pp 53-62.

Henderson, R., Podd, J., Smith, M. and Varela-Alvarez, H. (1995) An examination of four user-based software evaluation methods. Interacting with computers. (7) 4, 412-432.

Johnson, G., Clegg, C. and Ravden, S. (1989) Towards a practical method of user interface evaluation. Applied Ergonomics. (20) 4, 255-260.

Lathan, C., Sebrechts, M., Newman, D. and Doarn, C. (1999) Heuristic evaluation of a web-based interface for internet telemedicine. Telemedicine Journal. (5) 2, 177-185.

Lee, C., and Paz, N. (1991) Human-computer interfaces: modelling and evaluation. Computers Industrial Engineering. 21, 577-581.

Olson, G. and Olson, J. (2003) Human Computer Interaction: Psychological Aspects of the Human Use of Computing. Annu. Rev. Psych. 54, 491-516.

Preece, J., Rogers, Y., and Sharp, H. (2002). Interaction Design: beyond human-computer interaction. John Wiley & Sons, USA.

Van den Haak, M., de Jong, M. and Schellens, P. (2004) Employing think-aloud protocols and constructive interaction to test the usability of online library catalogues: a methodological comparison. Interacting with Computers. 16, 1153-1170.

Wilson, J. (2001) A framework and a context for ergonomics methodology. In Evaluation of Human Work: A practical ergonomics methodology. 2nd Ed. Taylor and Francis.