A Music Perception Project | Exploring Textual Setting and Narrative Influence

** Music Dept. **

A Waking Dream

Results Datasheet:  http://bit.ly/ZODQgM


Exploring Textual Setting and Narrative Influence in Music Perception



Musicians, writers, teachers, and the music industry as a whole recognize the importance of higher engagement and participation levels from audiences. A large body of research has explored the emotional impact that music can produce in listeners, and the ability to control the variables that influence listener response could have many useful real-world applications (Eerola, Vuoskoski, 2013).

One such study found that pairing depressive verbal text with an atonal musical setting resulted in “lower liking ratings for majors and non-music majors” than the verbal text alone. The music served to increase the depressive tone of the textual content, thereby confirming that music has the potential to heighten an affective response from listeners (Coffman, Gfeller, Eckert, 1995). Another study concluded that pairing music with visual stimuli can affect listener responses on multiple emotional rating scales (McFarland, 1984). The same research found that when subjects were asked to write narratives concerning specific visual stimuli, the emotional content of their stories varied significantly depending on the piece of music accompanying the visual stimuli (McFarland, 1984). Other research has focused on how subjects respond to visual stimuli by using a specialized interface to produce musical sounds expressing the narratives they think best match the visual stimuli (Wingstedt, 2008).

Although a much larger body of research has focused on visual stimuli, some previous research points towards exploring relationships between textual settings and music perception (Hackworth, Fredrickson, 2010). In this paper, the term ‘textual setting’ is used to represent musician or artist names, track titles, and narratives surrounding a particular artist or piece of music. With this in mind, the goal of this paper and its accompanying experiment is to ascertain whether a definitive link exists between non-verbal textual setting, a listener’s participation and emotional engagement level, and the ability to accurately categorize genres for a given piece of music. This approach differs from the aforementioned body of research, but is most similar to Coffman, Gfeller, and Eckert’s study, which focused on how music affects response to verbal text. This project essentially flips the focus around to ask the question: can non-verbal text affect listener response to music?


A total of 35 subjects volunteered for the survey and were assigned to three groups: 10 took a pilot survey, which was used to eliminate extraneous emotion and genre response fields and to improve the general layout of the surveys, while the remaining 25 were separated into two groups to participate in the revised, final surveys.

The first group of 12 subjects was designated as “Group A”, and the second group of 13 subjects was designated as “Group B”. No payment, incentive, or academic credit was offered to any of the volunteers, and no discrimination or screening process was enacted in the subject selection, except for a confirmation that each subject committed to participate.

The average age of the subjects in Group A was 24, and the average age of the subjects in Group B was 23. The gender makeup of all subjects was 44% female and 56% male. 72% of all subjects identified themselves as current or former musicians, while the remaining 28% had no prior experience as musicians. Group A spent an average of 105 minutes per day listening to music, while the listening average for Group B was 231 minutes per day. The ethnic distribution of both groups is represented in the following two charts:

Fig. 1 - Group A

Fig. 2 - Group B


Seven music clips of various artists and genres were selected and then edited to play for no longer than two minutes. The music tracks were purposefully selected based on the following criteria: accessible factual narratives, popularity and therefore exposure levels, and a difference in genres between all tracks. A chart containing each track’s attributes concerning the selection criteria is featured below:

Fig. 3


Three tracks (1, 2, 4) were selected because they had accessible narratives surrounding either the artist or the track itself. The remaining four tracks (3, 5, 6, 7) were selected with the intention of creating a falsified narrative for each. The purpose of this was to determine whether a discernible difference could be shown in the results between the factual and falsified narratives. Of the four tracks with falsified narratives, three (5, 6, 7) possessed low popularity/exposure levels, whereas only one (3) had a high popularity/exposure level. Only music clip 5 falsified the artist name, track title, and narrative; the remaining three (3, 6, 7) retained their factual artist names and track titles. It is worth noting that none of the genre descriptions in the narratives were altered or falsified; all were sourced directly from the well-established music radio website Last.FM.

Both subject groups listened to the same music clips previously described, which can be found here along with the factual and falsified textual settings: http://bit.ly/11bNm44. The surveys for Group A presented the seven music clips without any textual settings, and recorded responses concerning emotion and genre categorization. The surveys for Group B required subjects to read both factual and falsified textual settings before listening to the same seven music clips as Group A. The responses for Group B were then recorded concerning emotion and genre categorization in the same survey format as Group A. This survey format contained 30 emotion and 33 genre fields.

For each group, the seven music clips were assigned a unique number from 1-7. Fifteen unique ordering combinations were then generated using a ‘Random Integer Set Generator’ (http://www.random.org/integer-sets/) which creates sets based on atmospheric noise input. The purpose for randomizing the music clip ordering was to mitigate any potential bias that may have occurred if the same sequence of music clips was used for all subjects.
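The ordering step described above can be sketched in code. This is only an illustration, not the procedure actually used: it substitutes Python's pseudo-random generator for the atmospheric-noise source at random.org, and the function name `generate_orderings` and its `seed` parameter are hypothetical.

```python
import math
import random


def generate_orderings(clip_ids, n_orderings, seed=None):
    """Return n_orderings distinct random orderings of clip_ids.

    Assumption: Python's pseudo-random generator stands in for the
    atmospheric-noise generator used in the original experiment.
    """
    if n_orderings > math.factorial(len(clip_ids)):
        raise ValueError("more orderings requested than permutations exist")

    rng = random.Random(seed)
    seen = set()
    orderings = []
    while len(orderings) < n_orderings:
        # rng.sample returns a new shuffled copy and leaves clip_ids untouched
        candidate = tuple(rng.sample(clip_ids, len(clip_ids)))
        if candidate not in seen:  # keep only orderings not yet generated
            seen.add(candidate)
            orderings.append(candidate)
    return orderings


# Seven clips numbered 1-7 and fifteen unique orderings, as in the survey design
orderings = generate_orderings(list(range(1, 8)), 15, seed=2013)
for ordering in orderings:
    print(ordering)
```

Each subject would then be handed one of the generated orderings, so that no single clip sequence dominates the responses.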

View Group A Sample Survey Here (Excludes Textual Settings): http://bit.ly/XWCVAN

View Group B Sample Survey Here (Includes Textual Settings): http://bit.ly/109Vuzf

Every rating scale was constructed with the lowest rating as 1, and the highest rating as 7. For the Dislike-Like scale, 1 represented “Extremely dislike”, while 7 represented “Extremely like”. For the emotion rating scales, 1 represented “Not at all”, and 7 represented “Extremely”. The genre categorizations consisted of check-box selections and did not contain rating scales.


The first finding from the results is that Group A, which received no textual settings, rated every music clip except clip 6 lower on average on the “Dislike-Like” scale than Group B did:

Fig. 4

Given this, it is possible that the textual narratives served to foster stronger attachments between the listeners and the music. One factor that could have swayed these results is that 86% of Group B subjects identified themselves as having been musicians at some point in their lives, compared with only 58% of Group A subjects. Previous research has found that musicians tend to have a higher affective response to music, both alone and with spoken text (Gfeller, Coffman, 1991). This could therefore have contributed to the higher “Like” rating scores for Group B. It is also worth noting that Group B subjects reported listening to music for an average of 231 minutes per day, which is over twice the 105 minutes per day reported by Group A.

Upon examining the highest-rated emotion response for each clip, numbers 1-6 received parallel or similar emotion responses from both groups (see Fig. 5). Music clip number 7, however, deviated from this pattern. For this clip, the highest-rated emotional response average from Group A was “Chill”, whereas the highest for Group B was “Cheerful”. The falsified “rags-to-riches” narrative provided for this clip could have appealed to the emotions or even the age bracket of the subjects. It described the artists investing their student loan money to build a home studio, which they then used to record the music track presented in clip number 7, launching them into a successful career as musicians (narrative available here: http://bit.ly/UbBvAi).

Fig. 5

The second- and third-highest responses of Group A for music clip 4 (bored, indifferent) stand out in comparison to the responses of Group B (calm, sleepy). The truthful narrative for this track describes a study that compared the relaxation levels of subjects during activities such as drinking a cup of tea or receiving a massage, and concluded that those who listened to this music track reached the highest relaxation levels of all (narrative available here: http://bit.ly/11vRkUx). It is reasonable, then, to suggest that this narrative could have played a part in further evoking relaxing and sleep-related sentiment in the listeners. However, follow-up questions would need to be answered by the subjects before any substantive conclusions could be drawn regarding this possibility.

When comparing the standard deviation averages for the top three emotion rating fields between the two groups (Fig. 5), Group A’s average of 2.458 was 18.17% higher than Group B’s average of 2.080. Because a lower standard deviation indicates that responses were clustered more tightly, it can be surmised that, in addition to engaging more highly with the music clips, Group B also responded with higher overall agreement levels. Whether this finding is due to textual settings, or to other factors that may influence a subject’s emotional state before and during participation, would have to be examined further in a more controlled and in-depth study with a higher number of respondents.
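The percentage comparison above can be reproduced with a short calculation. This is a minimal sketch using only the two averages reported in the text; the helper name `percent_higher` is hypothetical.

```python
def percent_higher(value, baseline):
    """Return how much larger value is than baseline, as a percentage."""
    return (value - baseline) / baseline * 100.0


# Average standard deviations of the top three emotion ratings, from the text
sd_group_a = 2.458
sd_group_b = 2.080

difference = percent_higher(sd_group_a, sd_group_b)
print(f"Group A's spread is {difference:.2f}% higher than Group B's")
# prints: Group A's spread is 18.17% higher than Group B's
```

The baseline here is Group B's average, which is why the figure reads as "Group A is 18.17% higher" rather than "Group B is some percentage lower".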

The results for the genre categorization section of the survey are largely problematic when attempting to draw inferences. Below is a chart outlining the two highest genre response percentages for each music clip:

Fig. 6


Music clips 1, 2, and 7 stand out because Group B’s responses consolidate more strongly around the most-selected genre. However, there does not appear to be a significant difference in genre categorization between the factual narratives (1, 2, 4) and the falsified narratives (3, 5, 6, 7). It remains to be shown, consistently and on a larger scale, whether the textual setting, and specifically the genre descriptions in the narratives, produces higher agreement levels as it did among Group B’s subjects.

Another way to view the genre responses is to examine the percentage of subjects who identified the correct genre, as sourced from artist biographies on the music radio and catalogue web service Last.FM (Fig. 7).

Fig. 7

For every music clip, Group A fell short of Group B in identifying the correct genre field, although music clips 3, 4, and 6 show relatively close percentages. Like many of the results previously mentioned, Group B’s higher percentages could be attributed to the higher number of musicians in the subject pool, as it is likely that musicians would be more familiar with genre terms and possess higher identification abilities. However, there is still a case to be made for the effects of textual settings, as all the non-musicians in Group B correctly identified nearly every genre field, excepting music clip 7.

With regard to comparing the factual and falsified narratives, it is difficult to ascertain from the mixed results whether there is any overall difference in subjects’ emotion responses and genre categorizations, and there seems to be no discernible pattern in the Dislike-Like scale responses either. One possible way to obtain clearer results would be to include falsified genre descriptions within the narratives, thereby creating the potential to further test the flexibility of this variable on listeners. In general, however, the use of genre as an accurate variable seems to be highly problematic and prone to uncontrollable subjectivity.

Many of the categorization responses from both subject groups were spread across multiple conflicting genres. For example, the track ‘Tidal Wave’ by Dick Dale and the Del-Tones (music clip 1) is generally considered to epitomize the Surf Rock genre. While the majority of the subjects correctly selected the ‘Surf Rock’ genre, some selected the ‘Soundtrack/Orchestral’ genre instead, raising the possibility that their responses were influenced by other factors. In this instance, it is relevant that this same music track was featured in the closing credits of the popular Quentin Tarantino film ‘Pulp Fiction’. Because of this possibility for interference, future research may do well to consider discarding the problematic and judgment-laden variable of genre, and instead attempt to approach subjects through more controllable variables such as timbre classification, or through sociologically-centered experiments.

Future research in this area could potentially improve response accuracy by implementing a pre-screening stage to determine which subjects have musical backgrounds, and then dividing musicians and non-musicians evenly across test groups. However, the ‘non-musicians’ in both survey groups consistently responded with the correct genres and with Dislike-Like engagement on par with the ‘musicians’, which raises the question of whether musicians do in fact have a “higher affective response” (Coffman, Gfeller, Eckert, 1995) when compared with non-musicians.
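The pre-screening idea could be sketched as a stratified assignment, in which musicians and non-musicians are shuffled separately and then dealt out in turn to each group. The example below is only one possible approach, not a procedure used in this experiment; the function name `assign_balanced`, the subject tuples, and the pool of 25 subjects with 18 musicians (72% of 25, matching the figures reported above) are illustrative assumptions.

```python
import random


def assign_balanced(subjects, n_groups=2, seed=None):
    """Split subjects into groups with musicians spread as evenly as possible.

    subjects: list of (subject_id, is_musician) tuples (hypothetical format).
    """
    rng = random.Random(seed)
    groups = [[] for _ in range(n_groups)]
    position = 0  # running counter so overall group sizes also stay balanced

    for is_musician in (True, False):
        # Collect one stratum at a time and shuffle it independently
        stratum = [s for s in subjects if s[1] == is_musician]
        rng.shuffle(stratum)
        for subject in stratum:
            groups[position % n_groups].append(subject)
            position += 1
    return groups


# Hypothetical pool: 25 subjects, of whom the first 18 are musicians
pool = [(subject_id, subject_id < 18) for subject_id in range(25)]
group_a, group_b = assign_balanced(pool, n_groups=2, seed=7)

for name, group in (("Group A", group_a), ("Group B", group_b)):
    musicians = sum(1 for _, is_musician in group if is_musician)
    print(name, "size:", len(group), "musicians:", musicians)
```

With this kind of assignment, the number of musicians in any two groups differs by at most one, which would remove the musician imbalance as a competing explanation for group differences.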

It may also be helpful to collect physical and verbal data concerning the subject’s emotional state directly before participation, and to consider monitoring data points such as heart rate and verbal responses in real time. Further research may also benefit from structuring the emotion-response section of an experiment with fewer response options, as this may greatly improve the concentration of responses into more clearly defined categories. Even after using a pilot survey to eliminate fields, this experiment contained 30 emotions and 33 genres from which the subjects could choose, and it could easily be argued that many of the emotion and genre fields overlap in meaning, thereby diluting the quality of responses.

Overall, the results of this experiment affirm the hypothesis that textual setting and especially the inclusion of narratives are relevant factors in determining how listeners respond to a piece of music, both in terms of engagement and agreement levels. The question of how to utilize and precisely influence that response is largely unexplored, and its answer could very well have many practical real-world applications concerning web-based music delivery services, music player software, music marketing models, and more effective artist promotion.




1. Eerola, Tuomas, and Jonna K. Vuoskoski. “A Review of Music and Emotion Studies: Approaches, Emotion Models, and Stimuli.” Music Perception: An Interdisciplinary Journal 30.3 (2013): 307-40. University of California Press. Web. 08 Mar. 2013.

2. Coffman, Don D., Kate Gfeller, and Michael Eckert. “Effect of Textual Setting, Training, and Gender on Emotional Response to Verbal and Musical Information.” Psychomusicology 14 (1995): 117-36. Print.

3. Hackworth, Rhonda S., and William E. Fredrickson. “The Effect of Text Translation on Perceived Musical Tension in Debussy’s Noël Des Enfants Qui N’ont plus De Maisons.” Journal of Research in Music Education 58 (2010): 184-95. Print.

4. McFarland, Richard A. “Effects of Music Upon Emotional Content of TAT Stories.” The Journal of Psychology 116 (1984): 227-34. Print.

5. Wingstedt, Johnny, Sture Brändström, and Jan Berg. “Young Adolescents’ Usage of Narrative Functions of Media Music by Manipulation of Musical Expression.” Psychology of Music 36 (2008): 193-214. http://pom.sagepub.com/. Web. 08 Mar. 2013.

6. Gfeller, Kate, and Don D. Coffman. “An Investigation of Emotional Response of Trained Musicians to Verbal and Music Information.” Psychomusicology 10 (1991): 31-48. Print.

Datasheet containing survey results from both groups available for viewing here: http://bit.ly/ZODQgM

