Visual Framing Effects: The Effects of Image-text Relations in International Affairs

Tom Powell (University of Amsterdam, The Netherlands), Knut de Swert (University of Amsterdam, The Netherlands), Claes de Vreese (University of Amsterdam, The Netherlands) & Hajo Boomgaarden (University of Vienna, Austria)

Summary of the Project


Visuals in news media help define, or frame, issues, but little is known about how they influence public opinion and behaviour. This matters because images can be a powerful vehicle for framing political messages, and their ability to evoke an emotional response can influence the decisions of public and politicians alike (Damasio, 1996). This project uses an experimental approach and the context of international affairs news – a ready source of powerful images – to study the effects of visual news frames, especially when presented alongside text as they would be in a typical news report.

Research Questions

We aim to clarify the individual and combined effects of images and text in international affairs news media. This would provide an important insight for those in the newsroom and a first step towards integrating visuals into a multimodal framing theory. Specifically, the following questions will be addressed: 1) Do images generate stronger framing effects than text? 2) What are the interactive effects of images and text? 3) What are the mechanisms and boundary conditions for visual and verbal framing effects? 4) How do these effects differ across media types, i.e., text-image versus audio-visual? 5) What are the neural substrates of multimodal framing effects?

Theoretical Framework

News frames are “interpretive packages”, consisting of a “central organising idea”, that are used by journalists to communicate the most salient aspects of an otherwise complex story to a reader (Entman, 1993; Gamson & Modigliani, 1989, p.3). Frames have been shown to influence readers in multiple ways, including their information processing and recall (e.g., Nabi, 2003), attitude and opinion formation (e.g., Nelson, Oxley & Clawson, 1997) and behavioural intentions (e.g., de Vreese & Semetko, 2002). However, with a few exceptions (e.g., Arpan et al., 2009; Gibson & Zillmann, 2000; Iyer, Webster, Hornsey, & Vanman, 2014), this research has focused on framing effects produced by a news story’s text, while the effect that images have on the viewer has received far less scrutiny.

Framing effects of images and texts depend on their unique characteristics (Geise & Baden, 2014). In general, images are attention-grabbing (Garcia & Stark, 1991), serve to index and reproduce reality (Messaris & Abrahams, 2001), and are considered to be superior to text in the psychological effects they produce on the recipient (e.g., Nelson, Reed & Walling, 1976). By evoking a heightened emotional experience compared to text (Iyer & Oldmeadow, 2006), visuals connect with a reader and in turn might also be more persuasive. However, their ability to highlight the salient parts of an issue is tempered by their relative ambiguity (Messaris & Abrahams, 2001). In contrast, text is less salient, but possesses a clear structure for inferring who did what to whom and why (Entman, 1993). Clarification of the individual and combined effects of images and text would be an important insight for those in the newsroom, and a first step towards integrating visuals into a multimodal framing theory.


We employ an experimental approach, which allows us to make causal inferences about how visuals contribute to the framing process. We use modified news material to examine several independent variables over the course of the project. In study 1 we manipulated framing device (image, text, image-text combination), media frame (e.g., risk, opportunity) and image-text congruence (congruent, incongruent) to unpack the individual and combined contributions of images and text to framing effects. In study 2 we will test the boundary conditions of these effects by manipulating processing pathway (heuristic, systematic) and by measuring the duration of visual and textual framing effects. In study 3 we will compare framing effects in static and audio-visual media. Finally, in study 4 we aim to examine how biases delivered through visual and verbal frames relate to brain activity.
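
The study 1 manipulation can be illustrated with a minimal sketch in Python, assuming a particular cell structure that the text does not spell out: each frame shown as an image alone, as text alone, and in every image-text pairing, with congruence derived from whether the two frames match. The frame labels, the `Condition` type, and the functions `build_conditions` and `assign` are hypothetical names introduced for illustration; the actual cells and assignment procedure may have differed.

```python
import itertools
import random
from collections import Counter
from typing import NamedTuple, Optional

# Assumption: "risk" and "opportunity" are given in the text only as example frames.
FRAMES = ["risk", "opportunity"]


class Condition(NamedTuple):
    device: str                 # "image", "text" or "image-text"
    image_frame: Optional[str]  # frame carried by the image, if any
    text_frame: Optional[str]   # frame carried by the text, if any
    congruent: Optional[bool]   # None for single-modality cells


def build_conditions(frames):
    """Enumerate single-modality cells and all image-text pairings."""
    conditions = []
    for frame in frames:
        conditions.append(Condition("image", frame, None, None))
        conditions.append(Condition("text", None, frame, None))
    for image_frame, text_frame in itertools.product(frames, repeat=2):
        conditions.append(
            Condition("image-text", image_frame, text_frame,
                      image_frame == text_frame)
        )
    return conditions


def assign(participant_ids, conditions, seed=0):
    """Balanced random assignment: shuffle, then deal round-robin to cells."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(ids)}


conditions = build_conditions(FRAMES)
# 1082 matches the study 1 sample size reported in this summary.
assignment = assign(range(1082), conditions)
cell_sizes = Counter(assignment.values())
for condition, n in cell_sizes.items():
    print(condition, n)
```

Dealing shuffled participants round-robin keeps cell sizes within one of each other, which is one common way to balance a between-subjects design; simple independent randomisation per participant would be an equally valid alternative.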

Studies 1 to 3 will employ a survey-embedded experiment method and our dependent measures will include self-reported opinions towards news issues (such as a policy to intervene in overseas conflict) and behavioural intentions (such as the intention to donate to a cause). We also aim to take behavioural measures such as actual donating or simulated voting behaviour, and reaction time measures such as the Implicit Association Test (Greenwald, McGhee & Schwartz, 1998). To study the mechanisms of multimodal framing effects we will also use self-reported emotions and cognitive processes. Emotional responses will be measured via discrete emotions such as anger and fear, and using dimensional measures such as emotional arousal and valence. Cognitive processes will be measured by assessing participants’ appraisal of the stimuli (e.g., that an image depicts suffering of victims; Lazarus, 1991) and by using measures of prior knowledge and processing style (visual-verbal) as moderators in process models.

In study 4 we aim to use functional magnetic resonance imaging (fMRI) to examine the contribution of visuals to framing effects. We will relate the blood-oxygen-level-dependent (BOLD) signal, as an index of brain activity, to self-reported opinions and behavioural biases.

Our experimental approach employing the variables above will allow us to investigate the main effects and mechanisms of multimodal framing effects, providing a level of analysis and insight absent in the visual framing literature thus far.


In study 1 we used an experiment (N = 1,082) to present images and text frames from war and conflict news in isolation and in image-text congruent and incongruent pairs. Results showed that, when presented alone, images generate stronger framing effects on opinions and behavioural intentions than text. When images and text are presented together, as in a typical news report, the frame carried by the text influences opinions regardless of the accompanying image, whereas the frame carried by the image drives behavioural intentions irrespective of the linked text. These effects were explained by the salience-enhancing and emotional consequences of images.


Methodological Reflections

Methodological Potentials

The main advantage of experiments is that, with careful control of extraneous variables, they allow one to make causal inferences about the effect of the independent variables. The controlled environment provided by laboratory conditions is also well-suited to investigating the mechanisms of these effects with high internal validity, and it allows other researchers to replicate the procedures.

Another advantage is that one is able to use implicit measures that do not require self-reporting. Although these were not used in study 1, we aim to utilise measures such as the Implicit Association Test and functional neuroimaging in later studies. This will provide extra sensitivity to effects that do not reach conscious awareness and help prevent contamination of results by demand characteristics.

Methodological Challenges

Studying visuals in any context can lead to an interpretational problem because, unlike text, visuals lack an explicit syntax with which to draw definite inferences about their meaning (Messaris & Abrahams, 2001). Furthermore, any two images that purportedly show a similar scene could evoke a different psychological response within subjects due to the presence of certain features that are not ostensibly essential to an image’s meaning but are compositionally salient, such as faces, children, blood and food. When comparing between subjects, the effects of visuals are influenced by people’s prior experience and knowledge, perhaps to a greater extent than the effects of text. These issues can be particularly problematic in an experimental context in which all other factors aside from the independent variables should be controlled. An extra layer of complexity is added by our focus on image-text relations in this project. The basic properties of news images and text differ dramatically, and will therefore require careful consideration when being manipulated in our project.

We attempted to mitigate these issues in study 1 by using two pre-tests to match our stimuli on several characteristics previously shown to influence framing effects, memory and emotional reactions. These included participants’ appraised arousal, valence, salience, ambiguity, complexity and newsworthiness of the stimuli. Furthermore, we maximised congruence between images and text by using an additional pre-test to capture the thoughts participants listed when viewing the images, and then using those key words to construct the text stimuli. In future studies we plan to make small manipulations to the image stimuli so that these kinds of controls become less necessary.

Methodological Limitations

The key limitation of any experiment is a lack of ecological validity. Indeed, one could easily argue against the relevance of a one-shot test of framing effects given today’s busy, varied and competitive media environment. However, in study 1 we made every attempt to maximise external validity by using actual news images and (modified) articles, and by including manipulation checks. We also ensured that our sample was representative of the US population for age and gender. Furthermore, the online survey-embedded experiment set-up should closely approximate the experience of reading the news in the comfort of one’s own home.

