The Journal of Credibility Assessment and Witness Psychology

1997, Vol. 1, No. 2, 44-67

Published by the Department of Psychology of Boise State University

The Suggestibility of the Child Witness:

The Role of Individual Differences and Their Assessment

Johann Endres, Psychological Institute, University of Bonn

Copyright 1997 by the Department of Psychology of Boise State University and the Author. Permission for non-profit electronic dissemination of this article is granted. Reproduction in hardcopy/print format for educational purposes or by non-profit organizations such as libraries and schools is permitted. For all other uses of this article, prior written notice is required. Send inquiries by hardcopy to: Charles R. Honts, Ph. D., Editor, The Journal of Credibility Assessment and Witness Psychology, Department of Psychology, Boise State University, 1910 University Drive, Boise, Idaho 83725, USA.

Abstract: The present paper gives an overview of the individual-differences approach to eyewitness suggestibility in the broader context of the forensic evaluation of witnesses and their statements. Suggestibility as an individual trait has been researched for the last century, mostly by European psychologists, and several instruments for its assessment have been developed. The Bonn Test of Statement Suggestibility (BTSS), which can be used with children from about 4 to 10 years, is described extensively, and some preliminary hypotheses regarding its potential use as a diagnostic tool are considered.

The evaluation of individual statements of crime witnesses is a task that German psychologists have performed for the courts over the last 40 years in an estimated tens of thousands of cases. The methodology employed relies extensively on Criteria-Based Content Analysis (CBCA; Raskin & Esplin, 1991; Steller & Koehnken, 1989; Undeutsch, 1967, 1982), which was developed in Germany and Sweden from the 1950s onward. According to the "Undeutsch hypothesis", subjectively truthful experience-based reports will differ in a wide range of aspects, the "reality criteria", from fabricated accounts of events that were not actually experienced. This intuition has recently been validated in a number of field studies and experimental simulations in which marked differences were found between truthful reports and fictitious stories on most of the 19 criteria currently in use (e.g., Horowitz, Lamb, Esplin, Boychuk, Krispin & Reiter-Lavery, 1997; Lamb, Sternberg, & Esplin, 1994).

The rationale underlying CBCA as an evidentiary tool is its ability to discriminate between two rival hypotheses concerning the origin of the statement: a "reality hypothesis" and the alternative hypothesis claiming that the statement is an intentional construction. The expert's task in statement evaluation is to gather information and arrive at a rational decision between these hypotheses, not by assessing the dispositional honesty or truthfulness of the witness but by applying the reality criteria to that specific statement.

For several reasons, a properly conducted statement assessment has to be more than just the scoring of a given statement on the 19 CBCA criteria (Raskin & Esplin, 1991). First, a statement assessment interview has to be conducted which will yield sufficient material to decide upon the criteria. Second, whether the criteria are met cannot be determined without additional knowledge about the witness's cognitive abilities and motivational disposition and about the background of the accusation. Finally, the discrimination between a subjectively truthful and an intentionally fabricated statement is not the whole story. Several alternative hypotheses regarding the genesis of the statement are conceivable. Other than willful deceit, practically relevant sources of inaccuracy in statements are errors due to deficient memory and suggestive influences. These sources of error are not targeted by traditional CBCA methodology, and there is no available evidence that the CBCA criteria are useful to evaluate hypotheses referring to these influences. In the broader framework of Statement Validity Assessment (SVA; Raskin & Esplin, 1991), a number of additional variables have been proposed that should be taken into account in the evaluation of these hypotheses. These include psychological characteristics of the witness (suggestibility, cognitive and emotional characteristics) as well as situational factors and investigative questions.

Table 1: Psychological Constructs Relevant For Witness Testimony Evaluation.
COGNITIVE | witness ability | accuracy of testimony | error, mistake, confusion
MOTIVATIONAL | general credibility | statement validity, specific credibility | lie, deception, confabulation
SOCIAL (COGNITIVE AND MOTIVATIONAL) | suggestibility of the witness | suggestiveness (of the interview) | contamination, distortion, pseudomemory

The taxonomy given in Table 1, which elaborates on the one proposed by Steller, Volbert and Wellershaus (1993) by including social factors, presents an alternative classification scheme for the variables that may be relevant in forensic assessment. The six constructs described in the following result from a cross-tabulation of the type and source of error involved.

The general credibility of a witness, a concept widely favored by the legal profession, does not enjoy great respectability in the eyes of empirical psychologists. Even though the variance due to individual differences in deceitful behavior may have been underestimated in the landmark studies by Hartshorne and May (1928; cf. Robinson, 1996, p. 81) and even though abnormal psychology describes psychopathy as an enduring pattern of lying and manipulative behavior (Hare, Forth, & Hart, 1989), knowledge of a person's previous behavior in different circumstances certainly does not by itself provide sufficient reason to accept his or her testimony as truthful or to reject it as dishonest. Almost everyone would lie when their most vital interests are at stake, and even a person with a bad reputation for frequent lying may become the victim of a crime and give a truthful report about it.

Whether the particular testimony of a witness is a subjectively truthful account of an incident or a fabricated story refers to statement validity (also known as specific credibility). This, and only this, is the issue for which CBCA claims to represent an efficient technique.

Developmental and general cognitive prerequisites of giving testimony are referred to as witness ability. They comprise perceptual functioning, memory and language skills, and the ability to understand the task of giving testimony (McGough, 1994). Lack of these faculties may impede very young children or demented persons from giving testimony, but in practice there is seldom need to assess these competencies apart from performance. More crucial issues that cannot easily be resolved pragmatically concern the ability to appreciate the difference between truth and falsehood or between fancy and reality (reality monitoring; Lindsay & Johnson, 1987).

Accuracy errors are often due to situational factors. The deterioration of memories has been found to depend upon, among many other factors, attention and emotional arousal at encoding, time elapsed since the event, and cues present or absent at retrieval. Accuracy errors are most controversial in testimony involving the identification of a suspected offender (Cutler & Penrod, 1995), and extensive research has been devoted to the question of how recognition accuracy can be enhanced.

One factor known to interfere with performance on a wide range of memory tasks, from physical judgments to person identification, is suggestive questioning (Loftus, 1979). The suggestiveness of a question or of an interview procedure can be defined in terms of their potential to influence a person's reporting of events or objects. The controversy over whether this influence is exerted through cognitive (memory impairment or source confusion) or social/motivational mechanisms (demand characteristics or conformity needs) is still unresolved (Ceci & Bruck, 1993). A typology of verbal suggestive techniques is given below.

Suggestibility has also been conceptualized as an individual trait variable, as a person's susceptibility or vulnerability to suggestive influences (Binet, 1900; Gudjonsson, 1992). As this individual-differences view is not universally accepted, the present paper is concerned with the evidence available on this issue, the existing assessment instruments, and the potential relevance of a person's suggestibility for statement assessment.

In the context of Statement Validity Assessment (SVA) as described by Raskin and Esplin (1991), the psychological expert's task in evaluating a particular statement will always comprise all six of these aspects, but to a varying extent. As Undeutsch (1967) has firmly emphasized, it is the validity of a specific statement that is most central in crime allegations, and not the habitual "truthfulness" of the witness. If the description of an offender or the identification of a subject is the issue, the evaluation of situational factors that might lead to memory distortion often has priority. The suggestiveness of interview procedures becomes relevant if the allegations do not arise from a spontaneous disclosure but result from more or less extended "memory work" with the witness.

It is less clear which role the personal variables - motivational dispositions, cognitive abilities and individual suggestibility - should be assigned in statement assessment. Most of the cited authors agree that they should be considered as background variables that provide additional information for or against some hypotheses and moderate the evaluation of the specific content criteria. Moreover, information about these background variables is useful in planning the interview and deciding which language levels, question formats and interview aids are appropriate for the witness. Research on these background variables is scarce, however, and their discussion in SVA literature is limited to casuistic illustrations and some general remarks (Raskin & Esplin, 1991).

Whereas questions of children's general witness ability and general credibility were extensively discussed in the early 20th century, researchers' attention moved to issues of specific credibility and of accuracy errors in the 1970s and 1980s. Recently, perhaps due to heightened sensitivity to child sexual abuse and increased efforts to detect and convict offenders, the dangers associated with suggestibility have become the focus of interest (Warren & McGough, 1996). Most of this recent research on suggestibility has addressed situational factors: the effects of misleading questioning or postevent information, cues and memory props, and various instructions on memory accuracy or distortion.

Suggestibility: State or Trait?

Suggestibility research has a century-old history. Given this, it is astonishing that the diverse strands in this tradition, one directed at exploring situational determinants and one concerned with individual differences, have for the most part remained indifferent to each other.

Both aspects were already clearly present in the pioneer work of Binet (1900), who treated eyewitness suggestibility as an indicator of hypnotic suggestibility and who discovered empirical evidence for the differential suggestive effects of various types of questions. The explicit formulation that statements are a conjoint product of the interviewer and the interviewee goes back to William Stern, who pioneered psychological eyewitness research in Germany ("The statement as a mental achievement and product of interrogation" was the title of one of his influential papers; Stern, 1904). In a later work, Stern (1926) reviewed several forensic cases and further elaborated his idea that suggestibility depends both on characteristics of the witness and of the interview situation. He thought that younger children and girls were more suggestible and that suggestibility was, moreover, related to "character" and also to the type of questions asked. Another source of suggestive influences, according to Stern (1926), lies in the previous answers of the interviewee who, either thoughtlessly or with the intent to deceive, may have committed him- or herself to some erroneous detail (e.g., concerning a person's appearance) and will later stick to it.

The fact that American experimental psychologists have focused on the situational determinants of suggestibility whereas Continental European researchers have been concerned with suggestibility as an individual trait may reflect the different roles of psychological experts in the Anglo-American adversary justice system and the Continental inquisitory system. What still appears "alien and heretical" (McGough, 1991, p. 166) from the experience of the American legal tradition, namely psychologists giving expert opinions on the credibility of a statement or a witness, has for decades been common legal practice in several European countries.

Despite the development of at least five tests of statement or interrogative suggestibility in the last 30 years for use in forensic assessment practice, disregard rather than criticism has been the attitude most experimental psychologists have taken towards the individual-differences approach. Recently, some critical arguments were brought forward by Baxter (1990), who claimed that the trait hypothesis neglects situational determinants and that the large situational effects make individual variation appear irrelevant. This line of criticism, however, seems to be based upon a misunderstanding of the trait concept. To describe a certain person as high in trait anxiety does not mean that he or she will be fearful of everything all the time, but just that she or he is disposed to show higher levels of fear in a greater variety of situations than most other people. Moreover, to claim that someone is outstanding in mathematical abilities is wholly reconcilable with the observation that she or he will fail in most items of a test comprising very difficult or unsolvable mathematical problems; he or she may still be able to solve a lot more problems than an average person.

Baxter (1990) further asserted that even high transsituational consistency and high correlations between diverse suggestibility measures would not provide convincing evidence for the trait hypothesis, as persons may consistently accept suggestions in different situations for quite different reasons. This is not convincing, either. If the diverse psychological processes leading to suggestive responses were unrelated one would not expect consistent individual differences in the form of substantial correlations across those situations. If, hypothetically, the same persons from a sample scored consistently high (or low) on algebraic and geometric tasks, this has to be considered as strong evidence for the existence of a common mathematical ability factor underlying individual differences for both task types.

In the field of children's suggestibility, recent years have seen a great number of studies which showed that these effects do not only occur in the laboratory, but also in real-world memory tasks, and which clarified the psychological and situational factors which determine or mediate these effects (e.g., Ceci & Bruck, 1993, 1995; Warren & McGough, 1996). But individual differences have been addressed in only a few of these studies, and mostly with adult subjects (Hyman & Billings, in press; Schooler & Loftus, 1993; Tomes & Katz, 1997). Specifically, in the studies of children's memories for real events involving their own bodies the suggestibility effects were often limited to a small part of the sample under study and most children were not influenced by the experimental manipulation. In one study of children's memories about a visit to a doctor (a study that is often cited as evidence of the potentially damaging effects of subtly misleading questions) only three children falsely reported genital touch (Saywitz, Goodman, Nicholas & Moan, 1991). Unfortunately, no information is available on the psychological characteristics of these children.

To some extent, the substantial variation between the subjects is clearly age-related (Ceci & Bruck, 1993). Three- and four-year-old children have consistently shown quite high levels of suggestibility, whereas adolescents do not differ much from adults in most experimentally studied memory tasks. Frontal lobe maturation may be the neurological basis for this developmental trend (Schacter, Kagan & Leichtman, 1995). The few failures to find these age effects are probably due to insufficient statistical power or to floor or ceiling effects. This pronounced age trend clearly demonstrates the importance of individual differences although, of course, it does not prove temporal stability.

Evidence for stable individual differences, however, comes from correlational studies. Gudjonsson (1992) found substantial correlations between subjects' scores on his test of interrogative suggestibility and several personality variables, among them anxiety, intelligence, field dependence, and self-esteem. Habitually anxious and inhibited persons with weak cognitive abilities tend to be especially vulnerable to suggestive interviewing. Eysenck's (Eysenck & Furneaux, 1945) findings on what he called "primary" (hypnotic) suggestibility as unrelated to suggestibility in the witness context were confirmed in subsequent studies which found no correlation between hypnotic susceptibility and interrogative suggestibility (Stukat, 1958; Gudjonsson, 1992).

Evidence for systematic individual differences in young adults' tendency to produce false childhood memories after repeated suggestive questioning comes from a recent study by Hyman and Billings (in press). Subjects who developed pseudo-memories of childhood events scored higher on the Dissociative Experiences Scale (DES) and the Creative Imagination Scale (CIS). Stable individual factors related to vividness of mental imagery and reduced standards of reality monitoring thus apparently play a role in the implantation of false memories (Hyman & Pentland, 1996).

For child subjects, correlations between suggestibility scores and various cognitive and personality variables have been found. Zimmermann (1982b) reported a positive correlation (.47) between test scores and teachers' ratings of a child's suggestibility in a sample of 220 nine- and ten-year-olds. In another sample of 170 subjects between 12 and 16 years he also found moderate positive correlations with extraversion (.58), neuroticism (.40), self-rated dishonesty (.49), and fantasy proneness (.43), whereas scholastic achievement in maths (.10) and German (.19) were only weakly related (Zimmermann, 1982a). Danielsdottir, Sigurgeirsdottir, Einarsdottir, and Haraldsson (1993), using a version of the Gudjonsson Suggestibility Scale (GSS; Gudjonsson, 1987) with four age groups of children (6, 8, 10 and 12 years), found consistent negative correlations with the WISC vocabulary subtest which, however, only reached significance in the oldest subsample.

That suggestibility test scores may predict behavior in realistic legal circumstances is impressively, though anecdotally, illustrated by several British case studies involving false confessions reviewed by Gudjonsson (1992). In the famous case of the so-called "Birmingham Six", six innocent persons were suspected as IRA supporters and wrongfully convicted for a bombing homicide. After the verdicts were overturned, it was discovered that the four accused who had falsely confessed all scored higher on the GSS than the two "resisters", who despite extreme pressures did not confess.

Thus, suggestibility effects appear to be a function of situational and individual factors, and possibly their interaction. Situational factors include a weak memory representation of the event in question, uncertainty, high trust in the interviewer, and unrealistic expectations of the interviewee's performance. Individual factors include both cognitive and temperamental components. Clearly, both situational and individual factors are necessary to produce suggestibility effects. Thus, assessment of individual suggestibility demands the presence of typical suggestive influences.

Types of Suggestive Influences In Questioning

The main purpose of an interview is, obviously, to gather information from the person questioned. However, interviewing inevitably also constitutes a flux of information in the other direction, from the interviewer to the interviewee (Flammer, 1981). Asking a question will give the other person a hint what the interviewer already knows and what information he lacks, and the interviewer's reactions to answers may give away what he expects to hear or what is surprising and new for him. Thus, an interview will to some degree also be a process of learning on the side of the interviewee (Underwager & Wakefield, 1990). In addition to the exchange of information, an interview is further characterized by various aspects of mutual influence that can be described in terms of power, conformity, and compliance.

If every question transports some amount of information, how do suggestive and nonsuggestive questions differ? A question may be defined as leading or suggestive to the extent that it includes information about the desired or expected answer. Drawing from several lists compiled by Stern (1904, p. 339 ff.), Lipmann (1908, p. 15 ff.), Arntzen (1989, p. 23 ff.), Gudjonsson (1992), and Bender and Nack (1995, vol 2, p. 73 ff.), and enlarged by some additional speculation, the following question types shown in Table 2 can be discriminated. As no comparative data on effect sizes are available, they have been arranged according to their presumed intensity (in terms of the amount of information about the expected answer that is implied). The suggestive intensity of the question types is not necessarily closely related to their effectiveness, as more subtle methods might be more effective than more obvious tactics of influence.

Table 2. Types of Questions With Low and High Suggestivity.
Low Suggestivity
Open question | "What happened?" "What did you see?"
Identification question | "What time was it?"
Selection question | "Was it a man or a woman?"
Yes-No question | "Did the man say anything?"
High Suggestivity
Leading question with premises | "Did he put the stolen money in his pocket?"
Implied description and evaluation | "How fast did X run when you saw him flee from the shop?"
Implied expectation | "And then the victim certainly cried for help?"
Incomplete disjunction in alternative question | "Was it a red or a black car?"
Pressure toward conformity (social comparison) | "A and B have stated that … Didn't you see that too?"
Illocutive particles, phrase and intonation | "Indeed you did hear the shot, didn't you?"
Follow-up questions (elaboration of suggested content) | "Now this shot that you finally admitted you heard: Where did it come from?"
Question repetition | "Are you really sure? I'll ask you again: Did X take the money?"
Negative feedback | "It is simply not possible that you don't remember this!"
Threat and promise | "I will keep asking you until you tell me what X did to you. Else, I won't leave you alone. You will feel better after telling me."

In order to obtain a statement, the interviewer has to prompt the witness in some way, by means of requests ("Tell me what happened yesterday evening!") or questions. Open-ended questions (mostly beginning with the letter w: "What", "who", "when", "where", "why", and "how") only give a broad frame for the answer, without conveying any specific content information. Therefore, they usually demand full sentences as answers. This question type is universally recommended for interrogations of witnesses or suspects, particularly for the first phases of questioning. Identification questions ("Which color ...?", "What size ...?") are usually also not suggestive as they just outline the dimension on which some answer is expected.

Selection questions may be considered somewhat suggestive as they convey information on possible states and at the same time imply that the other person would know which is actually the case. They are not leading only if the "don't know" option is explicitly part of the enumeration and if the diverse possible states possess about equal subjective likeliness. An incomplete disjunction in an alternative question, however, may be quite misleading as it limits the number of alternatives and may thus carry the message that options not explicitly presented will be rejected.

Yes-no questions are mostly low to moderate in suggestive content, depending on how much they limit the range of possible answers and how accurately they represent the different options. Yes-no type questions might be regarded as a special case of selection questions as they explicitly or implicitly present a complete disjunction of two possible alternatives, affirmation and negation. However, in most cases those two will not be psychologically equivalent. First, it has long been known in test construction that subjects tend to favor acquiescent answers (Gudjonsson, 1992, p. 140). Second, the verbal description of a fact may evoke a mental image which in itself may exert a suggestive influence and which cannot be counteracted by an equally potent negative image. For example, the question, "Did the man hold a gun in his hand or did he not?", calls forth the ideational image of the weapon in an outstretched hand, but the abstract negation cannot command a similarly strong mental image of an absent pistol. Besides that, "yes" and "no" are not psychologically equivalent answers as the latter means contradicting a high-status adult, and contradicting a person may be considered impolite if no justifying reason can be given. An extremely suggestive form of the Yes-no question is an affirmative sentence with an interrogative intonation ("And he held a pistol?"), which may be understood as a purely rhetorical question, a request for affirmation.

The following question types are always suggestive as they provide explicit information about an expected answer. A leading question with a premise contains items of knowledge that did not yet occur in the interviewee's preceding answers. The suggestiveness is relatively clear when the premise is the focal content of the question (e.g., "Did the suspect threaten the policeman?") and probably much more subtle if it is presented in a syntactically less prominent position, in a subordinate clause, an adjective, or an adverbial phrase (e.g., "Do you remember the brutal-looking foreigner who threatened the policeman?"). The same applies to implied descriptions and evaluations. The well-known observation by Loftus (1979) that speed estimations depend upon the wording used in the question illustrates this. Implied expectations offer inferences and hints (which may be derived from generally accepted scripts or stereotypes) and present them as highly probable or logically conclusive, thus making it difficult to reject or to contradict them.

Pressure toward conformity uses social comparison (peer pressure) or the force of authority by inducing conflicting tendencies and lowering confidence in an interviewee whose memory does not conform with what is presented to him as others' testimony or opinion. The conformity experiments of Asch (1951) illustrate how difficult it is for subjects to stick to their own judgment when they feel they are in an extreme minority position.

In everyday life, pragmatic (illocutive) particles ("yet", "indeed", "even"), phrases ("isn't it?") and intonation (putting stress on certain words) are widely used to convey to participants in a conversation how they should make social sense of the explicit content of a message and respond to it. Use of the definite article ("the") instead of the indefinite article ("a" or "an") signals that a factual statement implied in a declarative sentence or a question is not open to discussion. "Indeed you have heard the shot, haven't you?" is semantically not much different from "Did you hear any shot?" However, the first question makes clear that a negative answer will be disbelieved and may lead to trouble. What syntactically forms an interrogative clause may be converted to a request for affirmation by rather simple linguistic means.

The following techniques are less subtle, but not necessarily less efficient. That repeated questions are particularly suggestive has been established by Poole and White (1991). It may make a difference, however, if the repetition occurs immediately after the answer or in the context of a repeated questioning which may serve to check a first record or explore specific details. Repeated questions on multiple interviews can indeed enhance memory performance (Poole & White, 1995). If, however, an almost literal repetition of the question follows an answer immediately, this will unmistakably express the questioner's discontent with the first answer and his demand that the answer be changed. An exception could occur when the questioner makes it clear that he or she did not understand the answer or that she or he wants the answer to be confirmed. As question repetition violates the politeness norm that governs social interaction, a position of authority is usually presupposed.

In the context of ordinarily held conversation rules, question repetition is just one specific form of negative feedback which informs the person addressed that his performance falls short of standards and has to be improved. Negative feedback may also be given explicitly, by saying that parts of a statement are improbable, incredible or unacceptable, and should therefore be changed. This can further be combined with threats and promises, the announcement of reward or punishment contingent on certain answers. Crude as these latter forms of suggestion may appear, analyses of real investigation interview transcripts show that they do indeed occur even in professional and police interrogations (Ceci & Bruck, 1995, p. 145).

Some caveats are appropriate here. First, the list gives only verbal suggestive techniques and is probably far from complete; techniques like hypnosis, props such as anatomically detailed dolls, and the diverse channels of nonverbal behavior (gestures, facial expression and paralinguistic aspects of speech) have been disregarded here, although they may represent even more powerful suggestive tools. Second, the context of the question or the entire interview has to be considered and may make questions that in isolation seem rather innocuous highly leading. The actual suggestiveness of a question is an empirical issue and can certainly not be determined by definition or by theoretical arguments alone. And third, not all suggestive questions are inherently illicit and detrimental in an interview context. Questioning a young child or an extremely monosyllabic witness is often not feasible without some amount of prompting and cueing. However, the use of suggestive prompts and cues should be considered in detail. In accordance with a rule postulated by Arntzen (1989) only the answer surplus, that portion of information which is not contained in the question, should be utilized as evidence. Using an impossible suggestion (e.g., "And then the man just dissolved into thin air, or what happened?") or the contrary of the expected detail as a prompt may constitute a useful compromise.

Assessment of Witness Suggestibility

Assessment of individual suggestibility can either be done by means of standardized tests or in the course of the statement interview by asking suggestive questions. The latter, unstandardized method is that preferred by Arntzen (1993) and by Raskin and Esplin (1991). These "suggestive probes" should be directed at peripheral aspects of the reported events. By trying to elicit additional unrealistic descriptions a witness's vulnerability to this kind of influence can be determined. However, this procedure is problematic and should be used with caution (Raskin & Esplin, 1991, p. 278). As suggestibility is considered a natural and universal characteristic of human memory (Loftus, 1979), everybody can be tricked if their memory is uncertain and the technique is subtle enough. On the other hand, even very suggestible witnesses may be able to resist some misleading questions because they can detect the implausibility or impossibility of the suggested facts. Therefore, whether a child yields to some suggestive question cannot be considered as valid diagnostic information. Moreover, what first appears as a peripheral and inconsequential detail may later turn out to be a crucial point for conviction, and should therefore not be unnecessarily tainted by suggestive questioning.

For the purpose of a standardized assessment of suggestibility, independent of the statement in question, several psychometric tests have been developed, at least some of which are still used in forensic practice (Table 3).

Table 3. Psychological Tests of Suggestibility and Some of Their Characteristics.

| Test | Authors | Stimuli | Types of Questions | Sub-scales | Target Age | Test Norms | Remarks |
|---|---|---|---|---|---|---|---|
| WST | Bottenberg & Wehner, 1971 | Table #5 of the TAT | Affirmative sentences | none | 12 to 13 | N=169, only girls | group test, written form |
| TAS | Burger, 1971 | 30 slides, scenes of everyday life | 120 yes-no questions | none | 7 to 14 | N=200 | |
| SET-S | Zimmermann, 1979, 1982a, b, 1988 | 4 photos | 18 yes-no (assertive sentences) | sexual content sugg. | 9 to 10 and 12 to 16 | 110 (younger) and 225 (older) | forms for younger and older children |
| GSS | Gudjonsson, 1984, 1987 | Short story | Yes-no and alternative | yield, shift | - | N=195, children and adults | parallel form |
| BTSS | Endres & Scholz, 1995 | Illustrated short story | Yes-no, alternative, repeated | 3 (div. question formats) | 4 to 10 | N=62 | parallel form |

The "Test of Statement Suggestibility" (TAS) by Burger (1971) used a set of 30 slides, each of which was presented to the subject for half a second. Afterwards, three nonsuggestive and three suggestive questions were asked about each picture; the latter either referred to details not actually present or misrepresented or negated details that were present. The test score was the overall number of falsely answered suggestive questions. Despite the test's high reliability (over .90) in an adult sample, it has never been used in research again, and probably not much in forensic practice. No significant correlations were found with an intelligence test or with the Body-Sway Test (Hull, 1933), which has long been used as a measure of "primary" (i.e., hypnotic) suggestibility.

The "Würzburg Suggestibility Test" (WST) by Bottenberg and Wehner (1971) utilized Table 5 of the Thematic Apperception Test by Murray (1943) as a test stimulus, which was presented to the subjects for 20 seconds. Afterwards they read 20 short statements (indicative sentences, not questions) referring to the picture (e.g., "There was a small carpet on the floor") and had to decide whether each statement was true. The authors considered half of these sentences high and the other half low in suggestiveness, depending on whether these incorrect statements were plausible in the context of the depicted scene or not. The number of affirmative answers was used as the test score for suggestibility. Preliminary test norms published by the authors refer to a sample of 12- and 13-year-old high-school girls who had answered the test in written form in class. The split-half reliability coefficient was .86, and positive correlations were found with teachers' ratings and self-ratings of forgetfulness, distractibility, nervousness and affiliation (Bottenberg & Wehner, 1972). As it can be applied easily and quite economically, this test enjoyed some popularity with forensic experts, although it is seriously flawed: Forensic practice demands individual application, whereas the test was standardized as a group test; the suggestive technique (indicative sentences) is not representative of suggestive questioning procedures; and the subjects may easily guess the purpose of the test or misconceive it as a discrimination task.

Zimmermann (1979, 1982a, 1982b, 1988) developed two versions of another Suggestibility Test (SET-S) for 9- and 10-year-olds and for 12- to 16-year-old school children. Test stimuli were four pictures taken from a test of social attitudes depicting play scenes with children, each of which was shown for 30 seconds. Several suggestive questions referring to details of these pictures (e.g., "There were two adults in the background, weren't there?") were asked. The overall test score (ranging from 0 to 40) represents the portion of inaccurately answered suggestive questions. The consistency coefficient was .80 (Cronbach's Alpha), indicating good reliability. A separate test score was computed for eight suggestive questions with sexual content. Again, correlations were found with teachers' ratings and school marks.

The Gudjonsson Suggestibility Scale (GSS; Gudjonsson, 1984), for which a second, parallel version was also published (Gudjonsson, 1987), is superior to these three tests in several respects. A short story and not a static picture is used as stimulus material. The list of 20 questions contains some "true suggestions", which were included in order to conceal the real purpose of the test. The other 15 questions are misleading and are composed of various types (yes-no questions and false alternative questions). The "Yield" score gives the number of suggestive answers to these 15 questions. A second, "Shift" score gives the number of answer changes for these 15 questions when the subjects are told that they have made some mistakes and are given another chance to answer all 20 questions in a second round. Total suggestibility scores can be computed by adding both scores. Same-session parallel test reliability was quite high (rtt=.92).
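The Yield, Shift and Total scores described above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not Gudjonsson's published scoring procedure: the function name `gss_scores`, the coding of answers as plain strings, and the rule that any difference between the first and the second answer counts as a shift are all hypothetical choices made for this example.

```python
from typing import Dict, Sequence, Set


def gss_scores(first: Sequence[str],
               second: Sequence[str],
               suggested: Sequence[Set[str]]) -> Dict[str, int]:
    """Score the misleading questions of a GSS-style test.

    first     -- answers given in the first round
    second    -- answers given after the negative feedback
    suggested -- for each question, the answers that follow the suggestion
                 (assumed coding, e.g. {"yes"} or {"apples", "bread"})
    """
    if not (len(first) == len(second) == len(suggested)):
        raise ValueError("all three sequences must have the same length")

    # Yield: first-round answers that accept the misleading suggestion.
    yield_score = sum(a in s for a, s in zip(first, suggested))
    # Shift: answers that differ between the two rounds (assumed rule).
    shift_score = sum(a != b for a, b in zip(first, second))

    return {"yield": yield_score,
            "shift": shift_score,
            "total": yield_score + shift_score}


# Toy data with three questions only (the real scale uses 15).
scores = gss_scores(
    first=["yes", "no", "apples"],
    second=["yes", "yes", "bread"],
    suggested=[{"yes"}, {"yes"}, {"apples", "bread"}],
)
print(scores)
```

In this sketch the five "true suggestions" are simply left out of the input, since they do not enter either score.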

The methodological weaknesses of the existing German suggestibility tests and the fact that the GSS was originally developed for use with adult subjects were the reasons why we decided to develop a new assessment tool for children.

The Bonn Test of Statement Suggestibility (BTSS)

The BTSS as a standardized instrument for the assessment of individual suggestibility in children is characterized by the following features, some or most of which were lacking in the older tests reviewed above:

- It is specifically adapted for use with preschool and elementary school children, a group for which the question of suggestibility has the highest forensic relevance.

- It is constructed and standardized as an individual test. It is administered verbally, not as a written questionnaire or a tape, and thus shares important aspects of real witness interrogations.

- The stimulus story to which the questions refer comprises verbal as well as visual information and thus avoids restriction to just one sensory channel.

- Three different kinds of misleading suggestive questions are used, for which separate test scores are computed.

- In order to minimize face validity, additional leading questions with correct information are employed as filler items. These were included to make subjects believe that the test is about memory accuracy and not about malleability.

- Two parallel versions of the test are at hand, allowing for repeated administration in research context or in single-case assessment.

The current revised form (Appendix A) comprises two versions, each containing a short story of about 330 words, four colored pictures, and a set of 31 questions. Administration of the test takes about 30 minutes and has four phases: (1) the presentation of the stimulus story and the four illustrations; (2) a phase of free report in which the child is encouraged to retell the contents of the story; (3) a 15-minute interval in which a non-verbal intelligence test is administered; (4) questioning the child about the stimulus story.

The stimulus story is adapted to children in content and form: Its presentation (reading it aloud to the child and simultaneously showing the illustrations) is similar to the ordinary reading of fairy tales to children. The protagonists' sex is the same as the child's; i.e., there is a male and a female form of each test version. In one version ("toy duck") the story is about a child who lends a toy duck to a friend for the weekend, and the friend breaks it and has it repaired. In the second version ("roller skates"), the story tells of an accident involving two children skating on the pavement and a third child who is hurt by them while on the way to do some shopping for a grandmother. Both stimulus stories have been taken, with minor modifications, from the diploma thesis of Bader (1993), where they were used in quite another context, the determination of children's cognitive prerequisites for civil-law responsibility.

The open questioning immediately following the stimulus story serves two functions: First, the child is required to recapitulate the story and thus to encode it. Second, the amount of information supplied by the child in this free-recall phase can be used as a control variable for memory performance. Thus it can be determined whether the child is able to reproduce the essential contents of the story; if the child cannot do so, for example because of a lack of attention or insufficient understanding, the test is not applicable. (The amount of information provided by the child is scored by breaking up the stimulus story into about 60 meaningful items and determining which of these are present in the child's free recall.)

The following interval serves to weaken the memory trace, thus giving room for suggestive influences. The administration of a nonverbal intelligence test (the Culture Fair Test, Scale 1; Cattell, 1966) at the same time diverts the child's attention from the story content, without the danger of interference with other verbal material.

In the questioning phase, the child is asked 31 questions on the contents of the stimulus story, all of which are more or less suggestive and can be assigned to four classes:

  1. Distractor questions are leading questions in which a correct answer is suggested (e.g., "The boy's name was Oliver, wasn't it?"). These questions, most of them located near the beginning or at the end of the list, are filler items which serve to disguise the real purpose of the test. They are not used in further analyses.
  2. Misleading Yes-No questions state an incorrect fact in a way to make clear that an affirmative answer is expected (e.g., "Oliver was on his way to school when it happened, wasn't he?"). The correct answer to these questions would, of course, be "no".
  3. Wrongly disjunctive Alternative questions present two equally non-correct answer options in a way that seems to demand a positive choice from the subject (e.g., "Did he want to buy apples or bread?"). These questions are misleading either because the information in question is not available to the child (a correct answer would be "I don't know") or because the two alternatives are incomplete and do not include the correct answer (a correct and nonsuggestible answer would be to say "Neither of the two, but..." or to reject the question).
  4. Repeated questions are immediate repetitions of Yes-No or Alternative questions, irrespective of the answer just given (e.g., "Are you sure? Did he want to buy apples or bread?"). They convey the message that the answer the subject has just given is not accepted and that he or she is expected to change this answer. Answers are counted as suggestive if such a shift from the preceding answer occurs (not if the shift is from "I don't know" to "no" or vice versa). The nonsuggestible reaction would be to stick to the first answer or to reject the repetition.

Three test subscores for the question types Yes-No, Alternative and Repetition are computed which indicate the number of suggested answers given to the questions in these three classes. The sum of the three subscales is computed to give an Overall score of suggestibility.
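The scoring rules for the four question classes can be summarized in a small Python sketch. It is only an illustration of the logic described above, not the scoring sheet actually used with the BTSS: the names `Item` and `score_btss`, the coding of answers as strings (with "dont_know" for "I don't know"), and the way a repeated question points back to the item it repeats are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Sequence

# A change between these two answers is not counted as a shift.
NON_YIELDING = {"no", "dont_know"}


@dataclass
class Item:
    kind: str                                # "distractor", "yes_no", "alternative", "repetition"
    suggestive: FrozenSet[str] = frozenset()  # answers counted as suggestive
    repeats: Optional[int] = None            # index of the repeated item (repetition only)


def score_btss(items: Sequence[Item], answers: Sequence[str]) -> Dict[str, int]:
    """Return the three subscores and their sum (hypothetical coding)."""
    scores = {"yes_no": 0, "alternative": 0, "repetition": 0}
    for item, answer in zip(items, answers):
        if item.kind == "distractor":
            continue  # filler items are not scored
        if item.kind == "repetition":
            previous = answers[item.repeats]
            changed = answer != previous
            # A move between "no" and "I don't know" is not a suggestive shift.
            harmless = {answer, previous} <= NON_YIELDING
            if changed and not harmless:
                scores["repetition"] += 1
        elif answer in item.suggestive:
            scores[item.kind] += 1
    scores["overall"] = sum(scores.values())
    return scores


# Toy protocol with six questions (the real test has 31).
items = [
    Item("distractor"),
    Item("yes_no", frozenset({"yes"})),
    Item("alternative", frozenset({"apples", "bread"})),
    Item("repetition", repeats=2),
    Item("yes_no", frozenset({"yes"})),
    Item("repetition", repeats=4),
]
answers = ["yes", "yes", "dont_know", "apples", "no", "dont_know"]
result = score_btss(items, answers)
print(result)
```

In the toy protocol the child accepts one misleading Yes-No question, resists the Alternative question at first, and then gives in when that question is repeated; the change from "no" to "I don't know" on the last item is not counted.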

Survey of results obtained with the BTSS

The BTSS has by now been used in several empirical studies (Endres, Scholz, & Poggenpohl, 1996; Endres, Scholz, & Summa, in press). In the following, some data on reliability and validity will be presented.

Two preliminary studies had yielded internal consistency coefficients for the three subscales (between .60 and .78) which were considered quite promising, compared to other tests for younger children. Moreover, age, the vocabulary scale of the WISC and nonverbal intelligence (measured with a German version of the Culture Fair Test by Cattell, 1966) were correlated with the Yes-No subscale, to a lesser degree with the Alternative subscale, but not with the Repetition subscale. On the basis of these data, an item revision was performed which resulted in the revised test version described above.

A sample of 62 children between 4 and 10 years was examined with the two versions of the revised BTSS, with an interval of several weeks between the two administrations. The following results were found (Table 4):

Table 4: Reliability Estimates And Correlations With Other Variables In Children From 4 To 9 Years Of Age.

| Scale | Cronbach's α | α for parallel version | retest r | r with free recall | r with CFT | r with age |
|---|---|---|---|---|---|---|
| Yes-No (9 items) | .74 | .84 | .67 | -.52** | -.71** | -.72** |
| Alternative (7 items) | .77 | .73 | .65 | -.27 | -.36* | -.44** |
| Repetition (8 items) | .70 | .65 | .32 | -.16 | -.16 | -.28 |
| Overall (24 items) | .85 | .85 | .66 | -.41** | -.53** | -.62** |

N=62, ** p < .001 * p < .01

Internal consistency estimates (Cronbach's Alpha) were satisfactory, ranging from .70 to .77 for the subscales and amounting to .85 for the total scale. Retest correlations with the parallel test were slightly lower (.67 and .65) for the first two subscales and only .32 for the Repetition subscale. These coefficients are quite high compared to many personality test subscales for younger subjects.
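For readers unfamiliar with the coefficient, the following Python sketch shows how Cronbach's Alpha is computed from an item-by-subject score matrix, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The function name `cronbach_alpha` and the toy data are assumptions for illustration; they are not the data or the software used in the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's Alpha for a matrix with one row per child and one
    column per item (e.g. 1 = suggestive answer, 0 = nonsuggestive)."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across children.
    item_variances = [
        variance([row[j] for row in item_scores]) for j in range(n_items)
    ]
    # Sample variance of the total (sum) score; must be non-zero.
    total_variance = variance([sum(row) for row in item_scores])

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: two items that always agree, and two items that are unrelated.
consistent = [[1, 1], [0, 0], [1, 1], [0, 0]]
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(cronbach_alpha(consistent), cronbach_alpha(unrelated))
```

The two toy matrices mark the ends of the range one would normally expect: items that always agree give the maximum coefficient, whereas unrelated items give a coefficient of zero.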

The intercorrelations between the three subscales were positive. But whereas Yes-No and Alternative were highly correlated, Repetition correlated only weakly with the two other subscales.

As expected, suggestibility correlated negatively with age, mental ability and the amount of information recovered in the free recall phase. These correlations were quite large for the Yes-No subscale and almost zero for Repetition, with Alternative ranging somewhere in between (cf. Figure 1). A similar pattern (age effects for the YIELD but not the SHIFT scale and higher correlations of the first scale with memory and a vocabulary test) was found by Danielsdottir et al. (1993) in a study with the Gudjonsson Suggestibility Scale in Icelandic children.

These results were taken as evidence that children's suggestibility can be measured reliably with our test and is a relatively stable variable, at least at short intervals. Moreover, the pattern of correlations indicates that suggestibility is not a homogeneous construct but consists of at least two different components. One component might be interpreted as cognitive, as it is highly correlated with age, mental ability and memory performance and is represented most clearly by the Yes-No scale. The meaning of the second component, represented by the Repetition scale, is less clear, but the low correlations with other variables and the low stability coefficients point to situational influences.

A further study with 92 preschool children (Endres, Scholz, & Poggenpohl, 1996) showed that instructional variation (a warning of "tricky" questions and the explicit permission to say "I don't know", combined with a training item) reduced the error rate in the test questions of the BTSS. In addition, children who had heard the stimulus story twice were less suggestible than children who had just heard the story once in the standard procedure. Thus, the BTSS provides a sensitive tool for studying situational effects on suggestibility in young children.

Issues of forensic application

As several of the older suggestibility tests are presumably still in use in forensic contexts in Germany, it might seem a good idea to employ the BTSS from now on due to its advantages. However, several cautions should be heeded.

First, although the test scores have been shown to be reasonably stable over time, convincing evidence for construct validity is still lacking. Up to now it has not been demonstrated that the susceptibility of child witnesses to suggestive influences in real sexual abuse investigations can be predicted (or postdicted) from test scores. Studies on the predictive validity of the test are therefore urgently required.

Second, individual suggestibility is just one, and certainly not the most important psychological construct of interest in statement evaluation. Even if the test's validity can be established, it would be highly questionable to discredit and repudiate a child's statement simply because he or she has obtained high scores on the suggestibility test. On the other hand, moderate or even low test suggestibility does not preclude serious distortions in a statement if extensive coaching and misleading influence have been exerted on the witness. Obviously, it is the interaction between individual susceptibility and the suggestive influences actually coming from a child's social environment and from investigators that can result in faulty statements and, possibly, false allegations. Therefore, as mentioned above, findings of individual suggestibility do not disconfirm a statement but lend more or less support for one of the hypotheses discussed earlier.

Under one constellation the usefulness of the test is apparent. In mass accusation cases, in which a large number of children often produce various accusations differing in content and seriousness, the test can be used to decide in which direction suggestive influences have been working. If the accusations of the highly suggestible children are more serious than those of low-scorers, this points to the conclusion that at least some of the accusations are due to faulty interviewing or coaching of the children. If, on the contrary, the low-scorers produce more serious accusations and more details, this would be indicative of their validity.

In individual assessments, the test scores should be seen in context with the child's behavior during the interview, especially his or her reactions to suggestive questions that relate to peripheral details of the events in question. The "suggestive probing" recommended by Arntzen (1989), Bender and Nack (1995) and Yuille, Hunter, Joffe and Zapurniuk (1993) tests whether the witness will deviate from his or her statement and accept alternative information. If a child is able to resist these attempts, this is taken as evidence in favor of the validity of the statement. The logic of this interview tactic can be strengthened by combining it with the standardized suggestibility test. If the child obtains high scores in the test but shows strong resistance concerning circumstances of the alleged events, this would indicate a strong memory representation. Further information regarding, among others, the context of the disclosure and previous interviews would be needed to decide if this is due to characteristics of the experience or to intense coaching. On the other hand, if the child scores low on the test but turns out to be quite malleable regarding the contents of the statement, this would raise serious doubts as to the validity of the allegation.

Much research remains to be done to clarify these issues. It should be kept in mind that assessing a child's suggestibility can only be the second best solution to the problem. Diminishing or precluding suggestive influences as far as possible in the investigative process could prevent most of the problems for which a standardized suggestibility test will always present an imperfect solution.


References

Arntzen, F. (1989). Vernehmungspsychologie [Psychology of interrogation] (2nd ed.). München: Beck.

Arntzen, F. (1993). Psychologie der Zeugenaussage [Psychology of testimony] (3rd ed.). München: Beck.

Asch, S. (1951). Effects of group pressure on the modification and distortion of judgments. In H. Guetzkow (Ed.), Groups, leadership, and men (pp. 177-190). Pittsburgh, PA: Carnegie Press.

Bader, S. (1993). Die Entwicklung der zivilrechtlichen Deliktsfähigkeit gem. § 828 BGB: Empirisch-psychologische Überprüfung einer normativen Altersgrenze [The development of civil liability: Empirical validation of an age norm]. Unpublished diploma thesis, Psychological Institute, Bonn.

Baxter, J. (1990). The suggestibility of the child witness: A review. Applied Cognitive Psychology, 4, 393-407.

Bender, R., & Nack, A. (1995). Tatsachenfeststellung vor Gericht. Vol. II: Vernehmungslehre [Judicial evidence, vol. II: Interrogation] (2nd ed.). München: Beck.

Binet, A. (1900). La suggestibilité [Suggestibility]. Paris: Schleicher Frères.

Bottenberg, E. H., & Wehner, E. G. (1971). Suggestibilität: I. Konstruktion und empirische Überprüfung des Würzburger Suggestibilitäts-Tests (WST) [Suggestibility: I. Construction and empirical validation of the Wuerzburg Suggestibility Test (WST)]. Praxis der Kinderpsychologie und Kinderpsychiatrie, 20, 161-165.

Bottenberg, E. H., & Wehner, E. G. (1972). Suggestibilität: II. Einige persönlichkeits- und leistungsdiagnostische Korrelate des Würzburger Suggestibilitäts-Tests (WST) [Suggestibility: II. Some personality and achievement correlates of the Wuerzburg Suggestibility Test (WST)]. Praxis der Kinderpsychologie und Kinderpsychiatrie, 21, 282-288.

Burger, H. (1971). Die suggestive Beeinflußbarkeit von Aussagen über Beobachtungen. Entwicklung und erste Überprüfung eines Tests zur Aussagesuggestibilität (TAS) bei der Glaubwürdigkeitsbeurteilung kindlicher und jugendlicher Zeugen [The suggestive malleability of statements about observations. Development and first validation of a Test of Statement Suggestibility (TSS) in the credibility assessment of child and juvenile witnesses]. Universität Freiburg/Br.: Philosophical Dissertation.

Cattell, R. B. (1966). Handbook for the Culture Fair Intelligence Test, Scale 1. Champaign, IL: IPAT.

Ceci, S. J., & Bruck, M. (1993). Suggestibility of the child witness: A historical review and synthesis. Psychological Bulletin, 113, 403-439.

Ceci, S. J., & Bruck, M. (1995). Jeopardy in the courtroom. Washington, DC: American Psychological Association.

Cutler, B. L., & Penrod, S. D. (1995). Mistaken identification. The eyewitness, psychology, and the law. Cambridge: Cambridge University Press.

Danielsdottir, G., Sigurgeirsdottir, S., Einarsdottir, H. R., & Haraldson, E. (1993). Interrogative suggestibility in children and its relationship with memory and vocabulary. Personality and Individual Differences, 14, 499-502.

Endres, J., Poggenpohl, C. & Scholz, O.B. (1996). Pre-school children's statement suggestibility: Effects of memory trace strength and of warnings against misleading questions. Paper presented at the 6th European Conference on Psychology and Law, Siena, August 1996.

Endres, J., Scholz, O.B. & Summa, D. (in press). Aussagesuggestibilität bei Kindern - Vorstellung eines neuen diagnostischen Verfahrens und erste Ergebnisse [Statement suggestibility in children: Presentation of a new assessment method and first results]. In M. Stadler, T. Fabian & L. Greuel (Eds.), Psychologie der Zeugenaussage [Psychology of testimony]. München: PVU.

Eysenck, H. J., & Furneaux, W. D. (1945). Primary and secondary suggestibility: an experimental and statistical study. Journal of Experimental Psychology, 35, 485-503.

Flammer, A. (1981). Towards a theory of question asking. Psychological Research, 43, 407-420.

Gudjonsson, G. H. (1984). A new scale of interrogative suggestibility. Personality and Individual Differences, 5, 303-314.

Gudjonsson, G. H. (1987). A parallel form of the Gudjonsson Suggestibility Scale. British Journal of Clinical Psychology, 26, 215-221.

Gudjonsson, G. H. (1992). The psychology of interrogations, confessions and testimony. Chichester: Wiley.

Hare, R. D., Forth, A. E., & Hart, S. D. (1989). The psychopath as prototype for pathological lying and deception. In J. C. Yuille (Ed.), Credibility assessment (pp. 25-49). Dordrecht: Kluwer.

Hartshorne, H., & May, M. A. (1928). Studies in the nature of character, Vol. I. New York: Macmillan.

Horowitz, S. W., Lamb, M. E., Esplin, P. W., Boychuk, T. D., Krispin, O. & Reiter-Lavery, L. (1997). Reliability of criteria-based content analysis of child witness statements. Legal and Criminological Psychology, 2, 11-21.

Hull, C. L. (1933). Hypnosis and suggestibility. New York: Appleton-Century.

Hyman, I. E., & Billings, F. J. (in press). Individual differences and false memories. Memory.

Hyman, I. E., & Pentland, J. (1996). The role of mental imagery in the creation of false childhood memories. Journal of Memory and Language, 35, 101-117.

Lamb, M. E., Sternberg, K. J., & Esplin, P. W. (1994). Factors influencing the reliability and validity of statements made by young victims of sexual maltreatment. Journal of Applied Developmental Psychology, 15, 255-280.

Lindsay, D. S., & Johnson, M. K. (1987). Reality monitoring and suggestibility: Children's ability to discriminate among memories from different sources. In S. J. Ceci, M. P. Toglia & D. F. Ross (Eds.), Children's eyewitness memory (pp. 92-121). New York: Springer.

Lipmann, O. (1908). Die Wirkung von Suggestivfragen [The effect of suggestive questions]. Leipzig: Barth.

Loftus, E. F. (1979). Eyewitness memory. Cambridge, MA: Harvard University Press.

McGough, L. S. (1991). Commentary: Assessing the credibility of witnesses' statements. In J. Doris (Ed.), The suggestibility of children's recollections (pp. 165-167). Washington, DC: American Psychological Association.

McGough, L. S. (1994). Child witnesses: Fragile voices in the American legal system. New Haven, CT: Yale University Press.

Murray, H. A. (1943). Thematic Apperception Test. Cambridge, MA: Harvard University Press.

Poole, D. A., & White, L. T. (1991). Effects of question repetition on the eyewitness testimony of children and adults. Developmental Psychology, 27, 975-986.

Poole, D. A., & White, L. T. (1995). Tell me again and again: Stability and change in the repeated testimonies of children and adults. In M. S. Zaragoza, J. R. Graham, G. C. N. Hall, R. Hirschman & Y. S. Ben-Porath (Eds.), Memory and testimony in the child witness (pp. 24-43). Thousand Oaks, CA: Sage.

Raskin, D. C., & Esplin, P. W. (1991). Statement validity assessment: Interview procedures and content analysis of children's statements of sexual abuse. Behavioral Assessment, 13, 265-291.

Robinson, W. P. (1996). Deceit, delusion, and detection. London: Sage.

Saywitz, K. J., Goodman, G. S., Nicholas, E. & Moan, S. F. (1991). Children's memories of a physical examination involving genital touch: Implications for reports of child sexual abuse. Journal of Consulting and Clinical Psychology, 59, 682-691.

Schacter, D. L., Kagan, J. & Leichtman, M. D. (1995). True and false memories in children and adults: A cognitive neuroscience perspective. Psychology, Public Policy, and Law, 1, 411-428.

Schooler, J. W. & Loftus, E. F. (1993). Multiple mechanisms mediate individual differences in eyewitness accuracy and suggestibility. In J. M. Puckett & H. W. Reese (Eds.), Mechanisms of everyday cognition (pp. 177-203). Hillsdale, NJ: Erlbaum.

Steller, M. & Köhnken, G. (1989). Criteria-based statement analysis. In D. C. Raskin (Ed.), Psychological methods in criminal investigation and evidence (pp. 217-245). Berlin: Springer.

Steller, M., Volbert, R. & Wellershaus, P. (1993). Zur Beurteilung von Zeugenaussagen: Aussagepsychologische Konstrukte und methodische Strategien [On the assessment of statements: Psychological constructs and methodological strategies]. In L. Montada (Ed.), Bericht über den 38. Kongreß der Deutschen Gesellschaft für Psychologie in Trier 1992, Band 2 (pp. 367-376). Göttingen: Hogrefe.

Stern, W. (1904). Die Aussage als geistige Leistung und als Verhörsprodukt [The statement as a mental achievement and product of interrogation]. Beiträge zur Psychologie der Aussage, 3, 269-415.

Stern, W. (1926). Jugendliche Zeugen in Sittlichkeitsprozessen: ihre Behandlung und psychologische Begutachtung [Juvenile witnesses in sex crime proceedings: their treatment and psychological assessment]. Leipzig: Quelle & Meyer.

Stukat, K.G. (1958). Suggestibility. A factorial and experimental analysis. Stockholm: Almqvist & Wiksell.

Tomes, J. L. & Katz, A. N. (1997). Habitual susceptibility to misinformation and individual differences in eyewitness memory. Applied Cognitive Psychology, 11, 233-251.

Underwager, R., & Wakefield, H. (1990). The real world of child interrogations. Springfield, IL: Charles C. Thomas.

Undeutsch, U. (1967). Beurteilung der Glaubwürdigkeit von Zeugenaussagen [Assessment of the credibility of witnesses' statements]. In U. Undeutsch (Ed.), Handbuch der Psychologie, Band 11: Forensische Psychologie [Handbook of psychology, vol. 11: Forensic psychology] (pp. 26-181). Göttingen: Hogrefe.

Undeutsch, U. (1982). Statement reality analysis. In A. Trankell (Ed.), Reconstructing the past (pp. 27-56). Stockholm: Norstedt & Söners förlag.

Warren, A. R., & McGough, L. S. (1996). Research on children's suggestibility: Implications for the investigative interview. In B. L. Bottoms & G. S. Goodman (Eds.), International perspectives on child abuse and children's testimony (pp. 12-44). Thousand Oaks, CA: Sage.

Yuille, J. C., Hunter, R., Joffe, R. & Zapurniuk, J. (1993). Interviewing children in sexual abuse cases. In G. S. Goodman & B. L. Bottoms (Eds.), Child victims, child witnesses (pp. 95-115). New York: Guilford Press.

Zimmermann, W. (1979). Zu einigen Problemen und Ergebnissen der Suggestibilitätsuntersuchung im Rahmen der Glaubwürdigkeitsdiagnostik (Entwicklung und erste Standardisierung eines Testverfahrens) [On some problems and results of suggestibility assessment in the context of credibility evaluation (development and first standardization of a test instrument)]. Kriminalistik und forensische Wissenschaften, 37, 25-58.

Zimmermann, W. (1982a). Probleme und Ergebnisse der weiteren Standardisierung und Validierung des Suggestibilitätstests für 12- bis 16jährige Schüler (SET-S, 12-16) [Problems and results of the further standardization and validation of a suggestibility test for 12- to 16-year-old school children (SET-S, 12-16)]. Kriminalistik und forensische Wissenschaften, 46, 47-76.

Zimmermann, W. (1982b). Zur Entwicklung eines Verfahrens zur Suggestibilitätsdiagnostik bei jüngeren Schulkindern (SET-S, 9-10 Jahre) [On the development of an instrument for suggestibility assessment in younger school children (SET-S, 9-10)]. Kriminalistik und forensische Wissenschaften, 47, 91-116.

Zimmermann, W. (1988). Probleme und Ergebnisse der Suggestibilitätsdiagnostik im Kindesalter [Problems and results of suggestibility assessment in children]. Probleme und Ergebnisse psychologischer Forschung, 9 (3).

Article Submitted: 3 April 1997

Accepted for Publication: 2 June 1997

Published: 22 August 1997

Appendix: The two versions of the BTSS

(Author's translation from the original German)


Version 1 "TOY DUCK" (female)

On the picture here, you can see Bettina and Michaela. Bettina got a toy duck on her last birthday. And this duck, you can wind it up with a screw and then it can really run. Bettina terribly loves to play with her duck. And her friend, Michaela, also finds it quite terrific.

One day, Bettina said that she would be off hiking with her parents on the weekend. And Michaela was thinking: "Hey, then Bettina won't be able to play with her new duck at all. It will be lying around at her home and nobody will play with it." Therefore she asked Bettina: "Bettina, can you lend me your duck, please, as long as you're away with your parents?" Bettina gave it some thought and then she said: "Okay, you can have it. But only under the condition that you will really take good care of the duck. And one thing is very important: You must not wind up that screw too tight. Because if it's wound too much, it may break and then that duck will be ruined." And so Michaela promised Bettina that she would be very careful with the duck and would wind up that screw very cautiously.

On the weekend, Michaela went to the playground, very proud with the duck, and showed it to the other children. She wound up the screw very cautiously every time, and the duck ran about quite well. But it always ran but a very short time. And so Michaela had an idea: "Won't the duck run longer if I wind up that screw a bit more?"

And then she gave another turn to the screw, though Bettina had told her that you mustn't. Suddenly there was a crack and the screw had broken. Now the duck was ruined. Michaela thought about buying a new one for Bettina with her saved-up money, so that her friend would not notice. But her father repaired the duck and fixed a new screw to it. Then the duck was once more as good as new.


(Answers counted as suggestive are indicated in brackets)

- 1) Was the girl in the story called Bettina? (-)

- 2) And the duck was yellow on that picture, wasn't it? (-)

YN 3) And on the last picture the duck was already repaired? (Y)

Alt 4) Had she got the duck from her father or from her mother? (F / M)

Rep 5) What did the story say: Had she got the duck from her father or from her mother? (Shift)

Alt 6) Was the duck's beak yellow or white? (Y / W)

Alt 7) Did Bettina have a brother or a sister? (B / S)

- 8) And Bettina's friend was called Michaela, wasn't she? (-)

YN 9) Had Michaela told her parents that she also wished to have a duck? (Y)

Rep 10) Think about it once again: Had Michaela told her parents that she also wished to have a duck? (Shift)

YN 11) Didn't Bettina go to her grand-parents for the week-end? (Y)

Rep 12) Consider again: Didn't Bettina go to her grand-parents for the week-end? (Shift)

Alt 13) Did Michaela pinch the duck or was it a present from Bettina? (p / p)

Alt 14) Did that duck run with electrical batteries or did you have to pull it behind you? (B / P)

- 15) And Michaela broke that duck, didn't she? (-)

YN 16) And didn't this happen when she was playing at home? (Y)

Alt 17) On one of the pictures, was there a cat or a dog? (C / D)

Rep 18) Now try to remember: On one of the pictures, was there a cat or a dog? (C / D)

YN 19) And Michaela has always been very careful with the duck, hasn't she? (Y)

Alt 20) Did the duck on that picture have wheels or feet? (W / F)

Rep 21) Are you sure? I ask you once again: Did the duck on that picture have wheels or feet? (Shift)

- 22) Did the screw on the duck break off? (-)

Rep 23) Are you sure: Did the screw on the duck break off? (Shift)

YN 24) And Michaela also let other children play with the duck, didn't she? (Y)

Rep 25) Listen once again to my question: Michaela also let other children play with the duck, didn't she? (Shift)

Alt 26) Were there two or three friends on the playground playing with Michaela? (2 / 3)

YN 27) Did not Bettina's mother forbid her to give away the duck because you should not wind it up too tight? (Y)

Rep 28) Listen closely to this question again: Did not Bettina's mother forbid her to give away the duck because you should not wind it up too tight? (Shift)

YN 29) And when the duck was broken, didn't Michaela's brother repair it? (Y)

YN 30) Wasn't Bettina sad then and wanted to have a new duck from Michaela? (Y)

YN 31) Wasn't the duck as good as new after it had been fixed up? (-)


1) "I am going to tell you a story now, and you should try to remember it. I told this story to quite a small child yesterday, and that child understood it quite well. Afterwards I would like to see how well you can remember that story. I am also going to show you some pictures so you can better understand the story. Look at them closely!"

2) The story is read to the child slowly. The pictures are shown at the appropriate time, with the previous ones remaining on the table.

3) "Did you understand everything? Do you remember what happened? Please tell me everything just as it happened!" (The pictures remain on the table while the child gives a free report.) (Ask when appropriate:) "What else happened?" "How did it continue?" "What have they done then?"

4) Take away the pictures. 15-minute interval. Test (Culture Fair Test)

5) "I am sure you remember the story I told you some minutes ago about that duck. Now I would like to find out how much you still remember of what I told you. I am going to ask you some questions in order to see if you remember that story well." Ask questions 1 to 31.


MEMORY: Number of correctly remembered items in free recall (maximum 51 points)

YES-NO: Suggestive answers to questions 3, 9, 11, 16, 19, 24, 27, 29, 30

ALTERNATIVE: Suggestive answers to questions 4, 6, 7, 13, 14, 17, 20, 26

REPEATED: Answer shifts in questions 5, 10, 12, 18, 21, 23, 25, 28
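
The key above can be tallied mechanically. The following Python sketch is only an illustration of how that might be done. The answer codes, the function name score_btss, and the rule that a "shift" is any change from the answer given to the immediately preceding question are assumptions made for this example and are not part of the published test. MEMORY is scored from the free report and is not covered here.

```python
# Scoring key for Version 1 ("Toy Duck"), copied from the appendix.
YES_NO = [3, 9, 11, 16, 19, 24, 27, 29, 30]
ALTERNATIVE = [4, 6, 7, 13, 14, 17, 20, 26]
REPEATED = [5, 10, 12, 18, 21, 23, 25, 28]

# Hypothetical answer codes (an assumption of this sketch):
#   "yes" / "no"       for yes-no questions
#   "first" / "second" for accepting one of the two offered alternatives
#   "neither"          for rejecting both alternatives or saying "don't know"
ACCEPTED_ALTERNATIVES = {"first", "second"}


def score_btss(answers):
    """Tally the three suggestibility scores from {item number: answer code}."""
    yes_no = sum(answers[i] == "yes" for i in YES_NO)
    alternative = sum(answers[i] in ACCEPTED_ALTERNATIVES for i in ALTERNATIVE)
    # Each repeated question restates the item just before it; a shift is
    # assumed here to be any change between the two answers.
    repeated = sum(answers[i] != answers[i - 1] for i in REPEATED)
    return {"YES-NO": yes_no, "ALTERNATIVE": alternative, "REPEATED": repeated}


# Hypothetical protocol: the child rejects every suggestion ...
answers = {i: "no" for i in range(1, 32)}
for i in ALTERNATIVE:
    answers[i] = "neither"
for i in REPEATED:
    answers[i] = answers[i - 1]  # no shift on any repetition

# ... except for three yielding answers.
answers[3] = "yes"     # accepts the suggestion in yes-no question 3
answers[6] = "first"   # picks an offered alternative in question 6
answers[10] = "yes"    # changes the answer to question 9 when it is repeated

print(score_btss(answers))  # {'YES-NO': 1, 'ALTERNATIVE': 1, 'REPEATED': 1}
```

Passing the key in as plain lists keeps the routine independent of the story version, so the same tally could be run with a different key.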


Version 2 "Roller-skating"

"On this picture you can see Sven and his friend. There was great weather outside and the two friends wanted to go roller-skating together. At first, they were skating in the yard in front of the house for some time. But this became boring quite soon. And so they decided to make a skating contest on the sidewalk around the block. Properly, they should not do that. You know, Sven's mother had told them: "It's quite dangerous to go racing on that sidewalk with roller-skates. A lot of people are walking there, and it happens easily that someone gets knocked over and is hurt badly." But the two friends did it nevertheless. "Nothing is going to happen", Sven thought to himself.

Now on this picture, this is Oliver. Oliver often goes shopping for his granny. Because his granny, she's a bit oldish already and can't do this so well any more. And therefore Oliver does it for her. And every time Oliver goes shopping for his granny, she gives him a small sum of money. And Oliver has figured out that he only needs to go shopping for her three more times. Then he will have enough money to buy a terrific birthday present for his mother.

Now, those two are making their skating contest, and Oliver is just on his way to the shop. And now it's happening. Sven, who is skating at full speed, comes racing round the corner like lightning. And he hits Oliver, who is walking on the sidewalk, with full force. And as Sven has run him over, Oliver gets knocked down.

His leg got broken and so he had to stay in bed for a whole week. Of course, he was quite sad then. Because he was not able to go shopping for his granny and was not able to earn the money for his mother's birthday present."

Questions (Suggestive answers are given in brackets.)

- 1) Was the boy with the skates called Sven? (-)

- 2) Was the other boy called Oliver or Oscar? (-)

YN 3) Did Oliver sprain his leg in that accident? (Y)

YN 4) And Oliver was just on his way to school? (Y)

- 5) Had Sven's mother told them not to skate on the sidewalk? (-)

YN 6) And the roller-skates, they were ruined in that accident? (Y)

Rep 7) Think about it once again: The roller-skates, they were ruined in that accident? (Y)

Alt 8) Did Sven or did Oliver lead a dog on a leash? (S / O)

Alt 9) Did Oliver want to buy bread or fruit? (B / F)

Rep 10) Consider this again: Did Oliver want to buy bread or fruit? (Shift)

YN 11) Had not Oliver on that picture been given a shopping basket by his granny? (Y)

YN 12) And Oliver, after he had fallen down, is holding the shopping bag quite firmly in his hand, isn't he? (Y)

Rep 13) I ask you again: And Oliver, after he had fallen down, is holding the shopping bag quite firmly in his hand, isn't he? (Shift)

Alt 14) That cap Sven was wearing on that picture: Was it black or blue? (B / B)

Alt 15) And the cap Sven's friend was wearing: Was that black or blue? (B / B)

- 16) And Oliver had to lie in bed after that accident? (-)

Rep 17) I ask you once again: Oliver had to lie in bed after that accident? (-)

YN 18) And Sven, he was then scolded by his parents? (Y)

Alt 19) Did that accident happen on a Sunday or on a Wednesday? (S / W)

Rep 20) What did the story say: Did that accident happen on a Sunday or on a Wednesday? (Shift)

- 21) And what has the granny done afterwards in the story? (-)

- 22) Did the granny always give Oliver some money for going shopping? (-)

Rep 23) Consider this well: Did the granny always give Oliver some money for going shopping? (Shift)

YN 24) Did he want to buy some skates, too? (Y)

YN 25) Was it foggy on the day when that happened, and was it for this Sven did not see Oliver? (Y)

Rep 26) Listen well to this question: Was it foggy on the day when that happened, and was it for this Sven did not see Oliver? (Shift)

Alt 27) Did Oliver have to stay in bed for two weeks or for three weeks? (2 / 3)

YN 28) Was Oliver sad because his granny went shopping herself? (Y)

- 29) Was there a car or was there a bus on that picture with the accident? (C /


Rep 30) Are you quite sure? Was there a car or was there a bus on that picture with the accident? (Shift)

- 31) Oliver's granny, she was already a bit oldish, wasn't she? (-)


1) "I am going to tell you a story now, and you should try to remember it. I told this story to quite a small child yesterday, and that child understood it quite well. Afterwards I would like to see how well you can remember that story. I am also going to show you some pictures so you can better understand the story. Look at them closely!"

2) The story is read to the child slowly. The pictures are shown at the appropriate time, with the previous ones remaining on the table.

3) "Did you understand everything? Do you remember what happened? Please tell me everything just as it happened!" (The pictures remain on the table while the child gives a free report.) (Ask when appropriate:) "What else happened?" "How did it continue?" "What have they done then?"

4) Take away the pictures. 15-minute interval. Test (Culture Fair Test)

5) "I am sure you remember the story I told you some minutes ago about that duck. Now I would like to find out how much you still remember of what I told you. I am going to ask you some questions in order to see if you remember that story well." Ask questions 1 to 31.


MEMORY: Number of correctly remembered items in free recall (maximum 47 points)

YES-NO: Suggestive answers to questions 3, 4, 6, 11, 12, 18, 24, 25, 28

ALTERNATIVE: Suggestive answers to questions 8, 9, 14, 15, 19, 27, 29

REPEATED: Answer shifts in questions 7, 10, 13, 17, 20, 23, 26, 30
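
Before a key of this kind is used for tallying, its internal consistency can be checked. The Python sketch below is an illustration only. The function name check_key and the three conditions it tests (no item on two scales, item numbers between 1 and 31, every repeated question directly preceded by a question that is not itself a repetition) are assumptions read off the question list above, not rules stated by the test.

```python
# Scoring key for Version 2 ("Roller-skating"), copied from the appendix.
YES_NO = [3, 4, 6, 11, 12, 18, 24, 25, 28]
ALTERNATIVE = [8, 9, 14, 15, 19, 27, 29]
REPEATED = [7, 10, 13, 17, 20, 23, 26, 30]


def check_key(yes_no, alternative, repeated, n_items=31):
    """Return a list of problems found in a scoring key (empty if none)."""
    problems = []
    scored = yes_no + alternative + repeated
    if len(scored) != len(set(scored)):
        problems.append("an item is assigned to more than one scale")
    if any(not 1 <= i <= n_items for i in scored):
        problems.append("an item number lies outside 1..n_items")
    # A repeated question needs a question directly before it, and that
    # question must not itself be a repetition.
    for i in repeated:
        if i - 1 < 1 or (i - 1) in repeated:
            problems.append(f"item {i} has no original question before it")
    return problems


print(check_key(YES_NO, ALTERNATIVE, REPEATED))  # []
print(len(YES_NO), len(ALTERNATIVE), len(REPEATED))  # 9 7 8
```

If each item contributes one point, the scale lengths give maximum scores of 9, 7, and 8 for this version.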
