Sarah Ortega

The Distorted Reality of a Haunted Mind

March 6, 2026

Corpus and Stylistics: Shirley Jackson’s The Haunting of Hill House

1. Research context and significance

Fowler (1977) introduced “mind style” as the systematic linguistic and textual patterns used to represent a world view and an individual’s mental self. The linguistic phenomena that can contribute to these patterns include choices of syntax, semantics, and transitivity (Leech and Short, 2015). The identification of linguistic consistencies in an analysis of mind style requires two considerations: how the individual conceptualizes their experiences and the individual’s way of perceiving reality. This report will utilize different approaches to investigate whether The Haunting of Hill House by Shirley Jackson has features of mind style occurring across the novel as a whole, and if not, what features contribute to its disturbing atmosphere. This will include a manual approach and corpus methods, which may support my findings.

The Haunting of Hill House was published in 1959 and has captivated the literary community as a psychologically thrilling haunted-house story. It has further enticed modern viewers through its screen adaptations: “The Haunting” in both 1963 and 1999, and Netflix’s “The Haunting of Hill House” (2018). In the novel, Eleanor is invited to take part in the paranormal investigation of a haunted mansion known as Hill House. Having taken care of her invalid mother for all of her young adult life, Eleanor is left desperate to establish her independence and to form connections with others. The trauma of losing her mother and her emotional vulnerability, which become apparent in her paranoid thinking, seem to make her more susceptible to the supernatural events at Hill House. Her anxiety and paranoia ultimately lead her to decide to join Hill House forever by committing suicide.

According to the NHS (2023), psychosis is a collection of symptoms that cause a person to lose contact with reality. These can encompass hallucinations, depression, and disordered thinking, which aligns with the experiences of most protagonists in Shirley Jackson’s works. The Haunting of Hill House was written and published during a Contemporary/Postmodern literary period in which fragmentation, unreliable narration, and political criticism were common, especially in regard to a postwar American society that advocated for the institution of marriage and the nuclear family. As in most 19th- and 20th-century gothic fiction, a genre to which The Haunting of Hill House belongs, the setting acts as a conduit that allows the characters’ psychological fantasies to manifest (Leech and Short, 2015). This element of gothic fiction is reflected in Eleanor’s perception of Hill House and her relationship with it.

Mind style has been applied to a range of unique fictional minds. To name a few: Semino and Swindlehurst (1996) investigated conceptual metaphors in One Flew Over the Cuckoo’s Nest to illuminate the mechanistic world view of Bromden, a paranoid schizophrenic; Leech and Short (2015) studied Benjy’s mind style in The Sound and the Fury through syntax and transitivity patterns; and McIntyre and Archer (2010) carried out a corpus-assisted mind style analysis of Miss Shepherd from the play The Lady in the Van, introducing a quantitative supplement to the qualitative nature of mind style studies. With the USAS and Wmatrix semantic tagging tool, they searched for consistent semantic patterns in Miss Shepherd’s pessimistic and guilt-ridden mind style. Shirley Jackson’s works contain many mentally deviant characters, yet none have been subject to a mind style analysis. This report will also use semantic analysis to determine whether it supports Eleanor’s odd mind style and the textually unique patterns of her psychosis, which often blur the distinction between a paranormal event and the manifestation of her mental decline.

2. Data and methodology

Features of mind style will be investigated through the lens of a third-person narrator, observing how Eleanor’s worldview, including her psychosis and, by extension, the paranormal, is construed. To start, I compared The Haunting of Hill House corpus to a fictional baseline represented by the fiction subcorpora of the Brown corpus, following McIntyre and Archer’s (2010) approach to semantic categorization. The keywords were categorized with the Wmatrix semantic tagger to get an overall representation of the novel’s atmosphere and the general attitude of the narrator.
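The keyness comparison described above can be illustrated with a minimal sketch. It assumes the standard two-corpus log-likelihood formulation that is widely used for keyness in corpus linguistics; the function name, parameter names, and the counts in the usage example are hypothetical and are not taken from the Wmatrix output reported in this essay.

```python
import math


def log_likelihood(freq_study, freq_ref, size_study, size_ref):
    """Two-corpus log-likelihood keyness for one word or semantic domain.

    freq_study / freq_ref: occurrences in the study and reference corpus.
    size_study / size_ref: total token counts of the two corpora.
    All names here are illustrative assumptions, not Wmatrix's API.
    """
    total_freq = freq_study + freq_ref
    if total_freq == 0:
        return 0.0
    total_size = size_study + size_ref
    # Expected frequencies if the item were spread evenly across both corpora
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical counts: a domain seen 50 times in a 1,000-token study corpus
# and 10 times in a 1,000-token reference corpus.
print(round(log_likelihood(50, 10, 1000, 1000), 2))
```

A score above the critical value means the item is unusually frequent (or infrequent) in the study corpus relative to the reference corpus; the score alone does not indicate the direction of the difference, which has to be read from the relative frequencies.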

I wanted to consider the key semantic domains of the abnormal events to see how they are construed in comparison to the novel as a whole. This would hopefully suppress the narrator’s language patterns and give Eleanor’s mind style a chance to emerge, since these events are more deeply involved in Eleanor’s mind and perception. To semantically analyse the abnormal occurrences, I separated the paranormal events and Eleanor’s episodes, some of which overlap, into a separate file and used The Haunting of Hill House corpus as a comparison to see what themes stand out during these shifts in reality. Separating these events was a subjective process, and someone else might have separated them differently. The chosen occurrences include (1) the storm when everyone hears the banging of the doors, (2) Eleanor realizing she was not holding Theodora’s hand, (3) the family picnic, (4) Eleanor going to the library, and (5) Eleanor deciding to drive her car into a tree. I excluded the “Help Eleanor Come Home” scenes because these are aftermath events; the reader is not taken through a transformative moment in reality.

This is accompanied by a brief qualitative analysis of mind style. In most literary analyses, Eleanor is characterized as someone who is desperate for independence, distorted[1], fanciful, and in some arguments, sexually repressed. The novel’s overall tone is unsettling and invites the notion that sanity can transform reality, challenging what is paranormal and what is a manifestation of one’s own mental state. More on this can be found in Brittany Roberts’ “Helping Eleanor Come Home: A Reassessment of Shirley Jackson’s The Haunting of Hill House” (2017) and Michael Wilson’s “‘Absolute Reality’ and the Role of the Ineffable in Shirley Jackson’s The Haunting of Hill House” (2015). However, these characterisations do not follow Fowler’s (1977) definition of mind style, which requires a consistent defining linguistic feature that represents the world-view of a character or narrator. From my own assessment, the novel’s possible features of mind style include (1) agency and animacy that assign autonomy to Hill House and (2) paradoxical semantics. Both of these may represent a distorted worldview arising from Eleanor’s whimsical and fantasizing nature. My analysis of the key semantic domains will be used to see whether it supports these conclusions.

3. Analysis

Before comparing the keywords and semantic domains that appear throughout the novel and its abnormal events, I’ll first explore possible features of mind style, which may or may not be supported by the quantitative evidence.

3.1. The distorted mind style of Eleanor Vance

From the very beginning of the novel, Hill House is diagnosed as not sane, which is a mental characteristic ascribed to a person and not typical for a house. These kinds of personifications continue throughout the novel, demonstrating Eleanor’s perception of the house as a living entity with its own motives. After terrifying the other characters by nearly killing herself, Eleanor is asked to leave the house, but she believes the house can decide on who stays and who doesn’t:

She could see the windows looking down, and to one side, the tower waited confidently. She might have cried if she could have thought of any way of telling them why; instead, she smiled brokenly up at the house, looking at her own window, at the amused, certain face of the house, watching her quietly. The house was waiting now, she thought, and it was waiting for her; no one else could satisfy it. ‘The house wants me to stay,’ she told the doctor, and he stared at her.

[Shirley Jackson, The Haunting of Hill House: 82]

Eleanor’s odd perception of the house not only assigns living behaviour to the house but its parts, such as its windows and its tower. The example includes windows looking down, the tower waited, the house was waiting, and wants her to stay. She even sees the house having a certain face that is watching her quietly. Eleanor also ascribes mental and physiological aspects to the house by calling it a lunatic and diseased, expanding on its initial impression as not sane. Eleanor will very early on perceive a predator-prey relationship with it:

I am like a small creature swallowed whole by a monster, she thought, and the monster feels my tiny little movements inside.

[Shirley Jackson, The Haunting of Hill House: 17]

Eleanor not only sees Hill House as autonomous; she also perceives malicious intent in it, likening it to a monster that swallows and consumes its prey. Contrary to Eleanor’s view, the other characters do not ascribe living aspects to the house. For example, Theodora calls it filthy and rotten, which is more appropriate than diseased when describing a house. This literary choice contributes to Eleanor’s possibly deviant mind style, giving life to what is not living. Paradoxical semantics also appear for various actions and nouns throughout the novel and occur often when Eleanor is in her head. In this scene, which was included in the abnormal events file, Eleanor is daydreaming as Luke and Theo are walking far behind her:

Luke was wrong about the softness everywhere, because the trees are hard like wooden trees. They are still talking about me, talking about how I came to Hill House and found Theodora, and now I will not let her go. Behind her, she could hear the murmur of their voices, edged sometimes with malice, sometimes rising in mockery, sometimes touched with a laughter almost of kinship, and she walked on dreamily, hearing them come behind. She could tell when they entered the tall grass a minute after she did, because the grass moved hissingly beneath their feet and a startled grasshopper leaped wildly away.

[Shirley Jackson, The Haunting of Hill House: 72]

One interesting phrase of Eleanor’s describes the trees as hard like wooden trees, when in fact trees are made of wood. This unnecessary simile suggests that there are trees in Eleanor’s reality that are not made of wood, that by some whimsical chance they could be made of something unnatural. Then there is the grass moved hissingly, which can make a reader pause. To move hissingly is often attributed to a snake’s movements; otherwise, hissingly is usually ascribed to sounds rather than the movement of objects. The particular word choice gives the grass snakelike attributes, transforming the blades into slithering creatures. If Eleanor were perhaps of more sound mind, she might have thought “the grass moved like snakes” or “the grass made a hissing sound”. This may be a weaker argument for Eleanor’s paradoxical style, but it does follow her pattern of distorting reality. Other examples of paradoxical modifiers and irrational metaphors throughout the novel include phrases such as danced gravely, slept watchfully, growing blackly, and wild sadness, which invite the reader to try to interpret what these would mean. This challenge to logic is an aspect of semantic deviation (Short, 1996).

3.2 Key concepts of the novel and the abnormal

Unsurprisingly, the semantic domains of the keywords shown in Table 1 include speech and communication, character names, and parts of buildings, the last of which refers to the house where this story takes place. The uncategorized keyword list[2] also features Eleanor (619 occurrences) as the most common word, which is also expected since the story is told from her point of view. Eleanor’s nicknames, Nell (55 occurrences; p<.01) and Nellie (10 occurrences; not significant), are not as frequent in comparison: the narrator and the voices from the abnormal events do not adopt the nicknames the other characters use and continue to address her as Eleanor. [3]

Table 1. Key semantic domains of The Haunting of Hill House (reference Brown fiction corpus)
*LL (critical value = 15.13)*	Semantic domains	Examples
*90.55*	Speech: Communicative	Said, told, voice, say, saying, talking, speak, spoke, talk, voices, whispered, story
*49.94*	Personal names	Eleanor, Luke, Mrs., Arthur, Dr., Nell, Eleanor’s, John, Hugh, Montague, Arthur’s, Luke’s (Theodora, Theo)
*39.83*	Medicines and medical treatment	Doctor, doctor’s
*39.70*	Parts of buildings	Door, room, hall, verandah, windows, floor, doors, wall, rooms, doorway, window, front, gate, parlour, stairway, upstairs, walls, staircase, hallway, roof, passage
*39.04*	Linear order	Then, first, before, at, finally, last, turn
*38.84*	Thought, belief	Thought, think, feel, thinking, believe, wonder, wondering, felt
*22.07*	Fear/shock	Afraid, frightened, fear, shock, frighten, scared, startled, terror, shocked
*22.00*	Pronouns	I, she, her, you, it, he, they, that, me, we, my, them
*20.26*	Architecture, houses and buildings	House, tower, houses, built, apartment
*17.90*	Moving, coming, and going	Go, come, going, went, came, coming, left, leave, get, ran, turned, followed, nodded, gone, sat, step, walking, journeys, steps, walked, rose, run
*16.01*	Location and direction	Here, this, where, away, there, back, around, out, stood, end
*15.53*	Darkness	Dark, darkness, darkly

There are a few considerations to address before analysing the semantic domains. Medicines and medical treatment contains doctor and Dr., which exclusively refer to Doctor Montague, so the correct domain for these examples would have been Personal names. Wmatrix also did not recognize Theodora and Theo as character names, so I have added them to the examples. The Trash can domain was removed because it only contained punctuation.

The Thought, belief domain has a unique dominance in the novel, illustrating that Eleanor is in her head unusually often, whether during her stream-of-consciousness daydreaming, when she’s formulating judgements of others or herself, or when she’s experiencing something that could be paranormal. But this does not say much about her mind style, as it is not a pattern reflecting her perception of the world. It is unsurprising that Fear/shock and Darkness are also significant semantic domains. These categories are likely what gives The Haunting of Hill House such an unsettling atmosphere. Again, this does not directly contribute to the mind style conclusions I made in the previous section.

The Parts of buildings domain might be expected to better support Eleanor’s mind style, since my previous examples include the human-like descriptions given not only to the house but to parts of the house as well. However, the concordance lines in Figure 1 show a colourless representation of the setting and do not contain any of the personifications mentioned in my manual assessment. Any descriptions in this sample relate to loud sounds, and objects, like floor and rooms, are often used as part of referential statements.

Figure 1. Concordance for parts of buildings from Wmatrix

Eleanor’s mind style may still be consistent throughout the novel even though it is not supported by the semantic analysis of the novel as a whole. Focusing now on her daydreams, psychosis, and paranormal occurrences, Table 2 shows the key semantic domains that may better support my conclusions.

Table 2. Key concepts of the abnormal events (reference The Haunting of Hill House)
*LL (critical value = 15.13)*	Semantic domains	Examples
*52.46*	Plants	Grass, trees, flowers, garden, daisy, bushes
*40.53*	Sensory: Sound	Heard, footsteps, listening, hear, hearing, sound
*39.98*	Moving, coming and going	Go, come, ran, came, going, walking
*35.29*	Colour and colour patterns	White, black, green, whiteness, blackness
*24.92*	Anatomy and physiology	Feet, hand, breath, wake, eyes, hands, shivering, sleep, arm, fingers, mouth, bones
*20.68*	Grammatical bin	The, and, to, of, a, in, on, at

Previously I mentioned that Eleanor is in her head a lot throughout the novel. She thinks possibly more often than she speaks. During these abnormal events, Eleanor is not more uniquely in her head than she normally is; this could be because the increased frequency of the sensory domain counterbalances it. The first key domain is Plants, which scores 52.46 (p<.0001). Plants does not even appear among the top 220 key domains for the novel as a whole. The second key domain is Sensory: Sound, which scores 40.53 (p<.0001) and also achieves significance in the novel as a whole, scoring 9.10 (p<.01). There is also a sudden splash of Colour and colour patterns, with a score of 35.29 (p<.0001). In the novel as a whole, Colour and colour patterns never reaches significance, with a log-likelihood of only 0.22.
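The significance labels attached to these scores can be reproduced with a small lookup. This is a sketch that assumes the conventional chi-square critical values for one degree of freedom (3.84, 6.63, 10.83, and 15.13, the last being the critical value given in the table headers); the function name and label strings are illustrative.

```python
# Conventional log-likelihood cut-offs at one degree of freedom,
# ordered from the strictest to the most lenient threshold.
CRITICAL_VALUES = [
    (15.13, "p < 0.0001"),
    (10.83, "p < 0.001"),
    (6.63, "p < 0.01"),
    (3.84, "p < 0.05"),
]


def significance(ll_score):
    """Return the strictest significance label that the score reaches."""
    for cutoff, label in CRITICAL_VALUES:
        if ll_score >= cutoff:
            return label
    return "not significant"


# Scores quoted in the paragraph above
scores = {
    "Plants (abnormal events)": 52.46,
    "Sensory: Sound (abnormal events)": 40.53,
    "Colour and colour patterns (abnormal events)": 35.29,
    "Sensory: Sound (whole novel)": 9.10,
    "Colour and colour patterns (whole novel)": 0.22,
}
for domain, score in scores.items():
    print(f"{domain}: {score} -> {significance(score)}")
```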

The abnormal events in the novel contain the most fanciful, fairyland daydream qualities of Eleanor’s worldview; this is supported by the presence of her paradoxical semantics[4]. These events, which are questionably paranormal, are more vividly portrayed than the moments leading up to them, in contrast to the colourless Parts of buildings concordance lines. The concordance of the Plants domain shown in Figure 2 demonstrates the visual nature of these abnormal events. Many of the items listed in this sample are connected to visual and textural modifiers, actions, and prepositions.

Figure 2. Concordance lines for plants from Wmatrix

However, this cannot completely support my mind style analysis. Parts of buildings and Architecture are not significantly frequent domains in this comparison, so this assessment also would not support my conclusions on the personifications of the house and its parts.

4. Conclusion

My report and McIntyre and Archer’s (2010) analysis of Miss Shepherd open up the question of whether corpus methods can be used to identify and support mind style. As shown, they can at least lend some support. Manual approaches are still more reliable than automation for finding linguistic deviations, and dealing with a third-person narrative can also pose challenges. When looking for features of mind style, I had to consider whether I was looking at Eleanor’s mind style or the narrator’s. Focusing on Eleanor’s mind style meant finding indicators such as what Eleanor thought, heard, or saw. Searching for pronouns like “I” and “she” was not an efficient way to find Eleanor’s perception, because “I” could refer to any character speaking, and “she” could refer to what Theodora, Eleanor’s sister, or Mrs. Dudley sees or hears.

The comparison of key semantic domains in the novel and the abnormal events provided some quantitative support for my assumptions of Eleanor’s mind style and also pointed out other linguistic features that contribute to the novel’s atmosphere. The key to finding this support is similar to McIntyre and Archer’s (2010) semantic analysis, where they had to separate Miss Shepherd’s speech from the other characters to pinpoint distinctions. If I could do a follow-up, I would consider part-of-speech tagging to see if it could be more supportive for mind style in identifying metaphors.

Bibliography

Fowler, R. (1977). Linguistics and the novel. Methuen.

Leech, G. N., & Short, M. (2015). Style in fiction: A linguistic introduction to English fictional prose (Second ed.). Routledge. https://doi.org/10.4324/9781315835525

McIntyre, D., & Archer, D. (2010). A corpus-based approach to mind style. Journal of Literary Semantics, 39(2), 167-182. https://doi.org/10.1515/jlse.2010.009

National Health Service. (2023, September 5). Psychosis. Nhs.uk; NHS. https://www.nhs.uk/mental-health/conditions/psychosis/overview/

Roberts, B. (2017). Helping Eleanor come home: A reassessment of Shirley Jackson’s The Haunting of Hill House. The Irish Journal of Gothic and Horror Studies, (16), 67-233.

Semino, E., & Swindlehurst, K. (1996). Metaphor and Mind Style in Ken Kesey’s “One Flew Over the Cuckoo’s Nest”. Style (University Park, PA), 30(1), 143-166.

Short, M. H. (1996). Exploring the language of poems, plays and prose. Longman. https://doi.org/10.4324/9781315842080

Wilson, M. T. (2015). “Absolute Reality” and the Role of the Ineffable in Shirley Jackson’s The Haunting of Hill House. Journal of Popular Culture, 48(1), 114-123. https://doi.org/10.1111/jpcu.12237

[1] In Jackson’s notes on the novel, Eleanor is described as “ALL DISTORTED LIKE HOUSE” (Roberts, 2017).

[2] A key word list was originally generated when I was considering using keywords to drive my analysis with collocation patterns and clusters.

[3] This is a reason, among others, why I believe Eleanor is the one who wrote “Help Eleanor Come Home”. I will die on this hill (house).

[4] Two of the paradoxical semantics discussed in the mind style section appear in these concordance lines.

An Unknown Catalyst in Claudio Valente’s Confession

January 8, 2026
Edited: 1/10/26

On December 15th, 2025, Brown University and its surrounding community suffered a violent act at the hands of a troubled alumnus. The gunman entered a classroom in the Barus and Holley building and took two innocent lives and wounded nine others. He then allegedly traveled to Brookline, Massachusetts, targeting MIT professor Nuno Loureiro, and shot him at his home.

Investigators claim the gun that killed Loureiro was different from the one used in the Brown shooting; both were 9mm and were found on Valente. They have also found security footage confirming that Valente had entered Loureiro’s apartment building before eventually ending up at an Extra Space Storage facility in Salem, New Hampshire, where he took his own life. Multiple sources claim that Valente’s motivation stems from a “lengthy grudge.”

What makes Valente’s confession tapes so interesting is that they were recorded after the shootings occurred. It seems more common for mass shooters to create their confessions or manifestos before the act because more often than not, the shooter intends not to live on after their act of violence.

Other recent instances of high-profile shooters outliving their act of violence are Nikolas Cruz and Ethan Crumbley. Nikolas Cruz, who shot and killed 17 people in 2018 at a Parkland high school, was caught by police hours after the shooting, and Ethan Crumbley, who shot and killed 4 people at an Oxford Township high school in 2021, surrendered to law enforcement just minutes after the act. Crumbley left many written and visual recordings evidencing his mental decline and motivations before the shooting. Both shooters certainly had different underlying causes that led them to commit such heinous acts, though they do share common themes in their confessions. They shared pre-incident detachment and lack of remorse, and post-incident remorse and guilt. They also shared a specific admission of their motive, which was, to oversimplify it, to make others suffer, and an acknowledgment of their declining mental health.

Unlike the previous examples, Valente, to my knowledge, had not shared any pre-incident indications. He was also not caught or interviewed by law enforcement. His confession was voluntary, which should eliminate the coercive influence that can impact a police interrogation. The transcripts resemble a defensive narrative or a loose confession more than a manifesto, which is regularly tied to mass shooters (Ethan Crumbley created a manifesto before his act).

Manifestos typically outline the purpose behind the act, which is often ideologically motivated, and employ persuasive and unifying language. They are more prominent in shooters with extreme ideological motivations, such as Dylann Roof, John Earnest, Patrick Crusius, and Brenton Tarrant. In 2007, I was attending high school on the East Coast of Virginia when Virginia Tech was subject to a mass shooting, which claimed 32 lives. The shooter, Seung-Hui Cho, left behind a wordy manifesto which also contained photographs. Seung-Hui Cho’s manifesto rambled through themes of hatred, desire for vengeance, victimhood, and a desperate need for control, which all align with defining aspects of a manifesto. There was, however, a lack of reasoning for the choice of victims and location.

The transcripts of Valente’s tapes are mostly translations since Valente spoke Portuguese for most of his confession. Anyone who has learned a foreign language or who has taught foreign languages can understand the risk of translation. There is an area of interpretation that needs to be taken into account when translating material, as there are expressions and nuanced verbiage that cannot be directly translated. So, any analyses of the transcripts from Valente’s confessions must consider the possibility of misinterpreted translation.

This post presents a forensic linguistic analysis of the translated transcript of Claudio Valente’s confession tapes and applies principles of statement analysis to find patterns of narrative construction and strategy. The transcripts are allegedly verbatim from these unseen videos. The analysis aims to clarify what exactly Valente is confessing and to identify any indicators of his motivations or of who he intended his audience to be.

Overall assessment: Highly likely cognitive and emotional overload, agentive positioning, moral disengagement, strategic omission, and adversarial audience construction.

Forensic linguistic markers legend
Pronoun Positioning

The entire narrative has an extremely high frequency of “I,” which demonstrates ownership and narrative control. This shows the agency of the speaker, whereas in assessments of courtroom testimonies, there tends to be a lack of “I” when the speaker is feeling guilt and wants to create distance between themselves and the scene.

“You” is treated non-specifically, which will become an ongoing theme in this assessment. There are also out-group exclamations of verbal aggression, such as “go fuck yourselves.” This constructs a polarized moral reality where “I” represents a lucid actor and “you” represents the hypocritical and undeserving.

“They” also comes into the mix when Valente refers to the victims as “these people” and further dehumanizes them by merging them into an inconvenient mess with phrases like “all of this shit”. This erases the victims, reducing them to part of the circumstances rather than human beings.

“We” is used only twice: once when discussing the people he had spoken to in private, and again after thanking his viewers for “the opportunity”, when Valente says, “We are finished”. There are a couple of odd things about this statement. If the translation and interpretation are solid, “we are finished” in an American English context often signifies the completion of a task, or else a dramatic end or defeat. Yet Valente has shown a strong sense of ownership over his actions; he does not involve an “us” or “we” elsewhere in his statements. It seems unnatural for him to use “we” when referring to a task he previously took responsibility for. And if he is dramatizing his sense of dread, then why does he include the viewers, since they were the last “you” he was speaking of? Over and over, he states that he wants to go out on his own terms; it would have been more natural to say “I’m finished”.

Omission

Valente owns the actions with phrases like “it happened” and “mistakes were made”. However, there is a lack of detail about what exactly happened. For example: “It was hard as hell to do it to all of these people”. To “do it” suppresses the content.

Threat to Identity

Valente demonstrates several linguistic features suggesting that he senses the pressure of judgment. His narrative is dense with sensitivity markers, which typically indicate internal rehearsal of counterarguments and a need for explanation. Sensitivity markers include “because”, “since”, “so”, “that’s why”, and so on.

His repeated insistence, “I am sane,” is contradicted by the disorganized sequencing and temporal confusion of his narrative, which could be due to his cognitive and emotional overload, as well as what could be a debilitating eye injury. His repetition of sanity is another attempt to secure his identity and the perception that others have of him.

Stance Inconsistency

There are also noticeable contradictions in Valente’s transcripts. He claims multiple times that he doesn’t care, yet he continues to state his justifications and to voice concern about how they are interpreted.

Hedging

Valente’s use of hedging is less significant here, as hedging is often more noteworthy in courtroom testimonies when facts become suppressed into opinions. Valente’s use of these indicators, such as “basically”, “probably”, and “kind of”, is most likely an honest formulation of opinion and not intentionally transforming a factual event into an impression.

Stress Markers

Examples:
- “It was, it was…”
- “And—and it’s—it’s…”
- “There isn’t—there isn’t—there isn’t…”

Such repetition, which also dominates the narrative, can indicate cognitive overload and emotional arousal. These repetitions often occur around Valente’s moral justifications, the confrontation, and his eye injury.

This cognitive overload and perhaps even executive fatigue are further shown in his system breakdowns, where he becomes present-moment focused and goes into sensory narration such as “the lights…” and “my eye…”.
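As a rough illustration of how the marker categories in this legend could be tallied, here is a minimal sketch. The category names, the marker lists (limited to the examples named above), and the function are my own assumptions for illustration, not an established statement-analysis tool, and the sample is a single sentence quoted from the second video rather than the full transcript.

```python
import re

# Marker lists limited to the examples named in the legend above.
# Category names are illustrative assumptions.
MARKERS = {
    "first_person": ["i"],
    "second_person": ["you"],
    "sensitivity": ["because", "since", "so", "that's why"],
    "hedging": ["basically", "probably", "kind of"],
}


def count_markers(text, markers=MARKERS):
    """Count whole-word (or whole-phrase) matches per marker category."""
    # Lowercase and normalise curly apostrophes so "that’s why" is matched
    normalized = text.lower().replace("\u2019", "'")
    counts = {}
    for category, phrases in markers.items():
        counts[category] = sum(
            len(re.findall(r"\b" + re.escape(phrase) + r"\b", normalized))
            for phrase in phrases
        )
    return counts


sample = (
    "I needed a catalyst–for both of them. But for the first one, it was "
    "the fact that I was confronted, and in the second, I also had one, "
    "you could say, a little bit."
)
print(count_markers(sample))
```

Raw counts like these only flag candidates: “so”, for instance, is counted whether it introduces an explanation or merely intensifies an adjective, so each hit still needs to be read in context, and translated transcripts add a further layer of uncertainty.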

Discussion

What makes Valente’s narrative so unsettling is the disturbing lack of remorse and specificity. There is no denial of committing an act of violence, but there’s a deliberate omission of what those acts specifically were and why they were done. It’s especially odd considering he specifically targeted Loureiro at his home, over an hour’s drive away from Brown University.

The question currently remains unanswered about what emails Valente sent and what they contained. Also, why did he target Brown University? Why Nuno Loureiro? Linguistic evidence points to feelings of betrayal and fatigue more than envy. Valente employs “envy” in the context of those who can carry out violence and end their own lives. He does not mention envy toward Nuno Loureiro, which has been suggested by some sources as his motivation. His sentence “That is what I really envy” may be a nod confirming some feelings of jealousy toward something else, or it could function as another response to the rumors, as it is evident he was keeping tabs on the news.

Some questions are answered in the first video, where Valente begins a vague explanation of his actions: “This was an issue of… of opportunity. I would really like to thank you for the only opportunity that you gave me here, which was this one, and… and look, that’s it. I don’t have anything else to say. We are finished.” As mentioned earlier, the “you” is ambiguous and could mean anything from a single viewer to the whole world. The opportunity he is thanking the audience for is most likely that of watching his video: the opportunity of being listened to.

More information unfolds in the second video. Valente adds: “I needed a catalyst–for both of them. But for the first one, it was the fact that I was confronted, and in the second, I also had one, you could say, a little bit.” He admits to relying on a catalyst. The omission of detail for the second catalyst suggests that it may risk the narrative he is constructing. His word choice of “catalyst” is also interesting; it is defined as a person or thing that precipitates an event. Valente, crediting the confrontation with that other man on campus for prompting his attack on Brown University, opens the possibility that something or someone else prompted Valente’s targeting of Nuno Loureiro. His first catalyst is specified by his first-hand experience (“I was confronted”), while his second catalyst is minimized; Valente reduces ownership and does not contribute a first-hand reasoning or experience.

His resentment seems directed toward people in general, not just Americans, and not even those he specifically targeted. This resentment may have been catalyzed by an experience he described earlier, of feeling humiliated or betrayed: “I later had access, uhm, to the people privately, the conversations we had privately showed it was all fake. Uhm [pause] so they are not going to get anything from me. I did not like any one of you. I saw all of this shit from the beginning.” He had previously received apologies, and it was later revealed that these apologies were not genuine, perhaps because they were formalities of the kind often used in formal disputes, or for other reasons. The function of “you” seems to shift from the viewers to those who wronged him.

His videos function as a suicide note, and their content, especially when we consider their chronology, amounts to neither a confession nor a manifesto. It reads as a narrative constructed for the audience, correcting rumors and clarifying his lack of feeling. The chronology of his videos shows how much he prioritized how his audience would perceive him. His first video presents a defensive narrative establishing his lack of remorse, his intention of “leaving on his own terms”, and a vague explanation of why he was led to commit such acts. In the second video he begins to expand on what he said in the first, adding more resentment toward others. His third video insists that he is a reliable narrator, repeating that he is sane, and, as if to prove that he is not an ego-driven maniac, he declares that he does not want to leave a legacy or manifesto and “does not care” what others think (which is ironic). His final video is prompted by his need to correct rumors claiming that he said “Allah Akbar” during the attacks. What mattered most to Valente before his death was control of his narrative.
The Heptapod Disadvantage

April 5, 2025
1.0 Introduction

Heptapod A is a fictional communication system spoken by an alien species in Ted Chiang’s “Story of Your Life”. Heptapod B is the alien writing system, which is more thoroughly investigated and understood in Chiang’s world. It is revealed that Heptapod A and Heptapod B are unrelated and that the components of Heptapod B’s ideograms are written simultaneously, an idea that builds on Fermat’s Principle of Least Time, the concept that light takes the fastest path to its destination (Chiang, 2015). In “Story of Your Life” this principle helped uncover the method behind Heptapod B: the fictional linguist Dr. Banks theorized that if light can choose the fastest path to its destination, then it must know its final destination beforehand. This implies that time is not linear for light, and therefore that the Heptapods may also experience time differently.

As an alien language, Heptapod A can be assumed not to follow Universal Grammar, the set of innate principles that all human languages share (Coon, 2020). Fortunately for us, Dr. Banks managed to uncover Heptapod B’s free word order and the aliens’ ability to produce multiple levels of centre-embedded clauses, which human language limits (Chiang, 2015). Given that the constraints on multiple centre-embedding may be a by-product of our short-term memory limitations (Karlsson, 2007), it makes sense that an alien with the ability to see into the future could handle the cognitive demand of producing these clauses.

Smith and Wheeldon (1999) investigated how much planning is completed (in humans) prior to articulation. Their picture-description task varied first-clause complexity and sentence length, and recorded speech onset latencies to determine which forms and lengths of sentences took longer to plan. The authors predicted that a complex first clause would take longer to plan than a simple one, which would mean that the full first clause is in the planning scope at the point of articulation. They also predicted that if two-clause sentences took longer to plan than one-clause sentences, then a portion of the second clause is also in the planning scope. They found not only main effects of complexity and length, but also an interaction, with the effect of complexity reduced in long sentences compared to short ones. Their findings indicate that more planning is given to the first clause than to the second.

This report will examine the grammatical and conceptual encoding of Heptapod A to determine whether their spoken language produces results similar to those of the English speakers in Smith and Wheeldon (1999). The Heptapod design in the film adaptation Arrival has no apparent eyes, but the Heptapods have proven able to see and even to communicate through writing, so visual stimuli will be used in a picture-description task. Because Heptapod A and B are unrelated, there must be a clear distinction in how they produce spoken communication, as the simultaneous, non-linear approach of their writing cannot be carried over to the sequential nature of auditory speech (Chiang, 2015). Due to their cognitive abilities, it may be possible that Heptapod A’s planning scope accommodates variation in clause complexity and length, so that the task will produce no main effects and will not replicate the delays found in the previous study. Alternatively, if my predictions are not met and the Heptapods take longer to plan simple-complex sentences, then this may reflect their unique language structure and spoken word order, violating Greenberg’s word-order universals, which describe the shared properties of verb-object and object-verb languages.

2.0. Methods

2.1. Participants

The participants were twenty young-adult Heptapods with normal vision. All the Heptapods had similar educational backgrounds, were fluent in Heptapod A, and had signed their respective semasiographic names in agreement not to use their foresight to gain an advantage in the experiment.

2.2. Materials

A set of 48 black and white line drawings of familiar objects was used, drawn from a fictional Heptapod normed list covering a variety of semantic categories. All the pictures had a naming latency of less than 600 ms and a mean word frequency of more than 150 occurrences per million, and their names ranged from one to two syllables in length. Following Smith and Wheeldon’s (1999) Experiment 1 very closely, 24 of the total pictures were used to create 32 sets of three pictures, built as two sets of sixteen, which were matched for latencies and constructed to avoid phonological and semantic similarity within each set. These were then combined in four different ways to produce a total of four sets (for each condition) of 8 triples. Each picture occurred in a screen position only once, and each experimental picture occurred only once in all sets. In the experiment, pictures could move either up or down. Movements were assigned to the subject phrase and the object phrase: for two-clause sentences, either the subject phrase moved up and the object phrase moved down, or the subject phrase moved down and the object phrase moved up. For single-clause sentences, the subject phrase moved up or down, with the object phrase having no movement. The filler picture sets used additional movements: right, left, and no movement. The conditions were assigned as follows:
1. complex-simple sentence: the blorp and the srup move up and the forp moves down
2. simple-complex sentence: the blorp moves up and the srup and the forp move down
3. complex sentence: the blorp and the srup move up
4. simple sentence: the blorp moves up
Participants saw the same number of movements, and the order of these movements was randomized.
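The four conditions above can be written down as sentence frames. The following is only an illustrative sketch: the `CONDITIONS` table and the helper names (`noun_phrase`, `clause`, `target_sentence`) are hypothetical and are not part of the experiment software, and the nonce nouns are the ones used in the list above.

```python
# Illustrative sketch only: the names and data layout are assumptions.
# Each condition maps to (first-clause nouns, direction,
# second-clause nouns or None, direction or None).
CONDITIONS = {
    "complex-simple": (["blorp", "srup"], "up", ["forp"], "down"),
    "simple-complex": (["blorp"], "up", ["srup", "forp"], "down"),
    "complex": (["blorp", "srup"], "up", None, None),
    "simple": (["blorp"], "up", None, None),
}


def noun_phrase(nouns):
    """Join nouns into a simple or conjoined noun phrase."""
    return " and ".join(f"the {noun}" for noun in nouns)


def clause(nouns, direction):
    """Build one clause, with the verb agreeing in number with the noun phrase."""
    verb = "moves" if len(nouns) == 1 else "move"
    return f"{noun_phrase(nouns)} {verb} {direction}"


def target_sentence(condition):
    """Return the expected description for a condition."""
    first_nouns, first_dir, second_nouns, second_dir = CONDITIONS[condition]
    sentence = clause(first_nouns, first_dir)
    if second_nouns is not None:
        sentence += " and " + clause(second_nouns, second_dir)
    return sentence


for name in CONDITIONS:
    print(f"{name}: {target_sentence(name)}")
```

Writing the frames this way makes the manipulation explicit: the first noun phrase is either simple or conjoined, and a second clause is either present or absent.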

The remaining 48 objects were made into filler sets of triples to avoid priming of the next experimental set. Staying close to Smith and Wheeldon’s (1999) study, the fillers were of four types that elicited different responses: (1) all objects move in different directions, (2) two objects move in the same direction, (3) all objects move in the same direction, and (4) no objects appear.

2.3. Design

The independent variables are the first-clause complexity (simple-complex, complex-simple, simple, and complex) and sentence length (one-clause and two-clause). The dependent variable is the speech onset latencies (ms). First-clause variation and sentence length are both within subject and between item as all the images are grouped in certain triples that avoid phonological and semantic similarity.

2.4. Procedure

The Heptapods were tested individually in physically accommodating labs, or aboard their spaceships if they preferred. They were situated approximately three meters in front of a large computer monitor (Heptapods are 10m tall). Their speech onset latencies were recorded with a sensitive voice key to accommodate the distance.

Again, as in Smith and Wheeldon (1999), the experiment began with two practice blocks of 10 trials in which each experimental picture occurred once to activate the lemmas. This was followed by eight experimental blocks, each also containing 10 trials. Altogether, these formed two “pairblocks” of 16 experimental trials and 24 filler trials each, totalling 32 experimental trials and 48 fillers per participant.
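The trial counts can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers given in the Methods (20 participants, four conditions, blocks of 10 trials); the variable names are illustrative, and it assumes the 32 experimental trials are split evenly across the four conditions.

```python
# Counts taken from the Methods section; variable names are illustrative.
TRIALS_PER_BLOCK = 10
PRACTICE_BLOCKS = 2
EXPERIMENTAL_BLOCKS = 8
PAIRBLOCKS = 2
EXPERIMENTAL_PER_PAIRBLOCK = 16
FILLERS_PER_PAIRBLOCK = 24
PARTICIPANTS = 20
N_CONDITIONS = 4

practice_trials = PRACTICE_BLOCKS * TRIALS_PER_BLOCK
experimental_trials = PAIRBLOCKS * EXPERIMENTAL_PER_PAIRBLOCK
filler_trials = PAIRBLOCKS * FILLERS_PER_PAIRBLOCK
main_trials = EXPERIMENTAL_BLOCKS * TRIALS_PER_BLOCK

# The eight main blocks should hold exactly the experimental plus filler trials.
assert main_trials == experimental_trials + filler_trials

# Assumption: experimental trials are divided evenly across conditions.
per_condition_per_participant = experimental_trials // N_CONDITIONS
observations_per_condition = per_condition_per_participant * PARTICIPANTS
total_observations = experimental_trials * PARTICIPANTS

print(practice_trials, experimental_trials, filler_trials)
print(observations_per_condition, total_observations)
```

Under these assumptions, and if no trials were excluded, each condition contributes 160 observations out of 640 in total, which would be consistent with the degrees of freedom reported in the Results (159 and 636).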

The participants were instructed on what types of movements they would see and how they should describe them. Because this design assumes an SVO order for Heptapod A, they were also instructed to describe the pictures from left to right. Each trial started with a central frame indicating the location and boundaries of the set of images (a triplet set). The frame appeared for 2 s, after which the set of images was displayed. Movement of the images began instantly and lasted up to 1000 ms. As soon as a participant began to describe the set of images, triggering the voice key, the set of images disappeared. After an interval of 4 s the next trial began. Breaks were encouraged, but the Heptapods deemed them unnecessary and were eager to participate.

3.0. Results

Analysis was done with a 2×2 factorial ANOVA with sentence length (simple vs. complex) and first-clause complexity (simple-complex vs. complex-simple) as factors, both within subject and between item. By-item analysis considered the effect of set (image used) on the onset latencies. Error rates and mean latencies are shown in Table 1.

Table 1. Naming latency means and error rates

First, the effects of length, which includes one-clause and two-clause sentences, and the effects of first-clause complexity, which included all sentence types (simple, complex, simple-complex, and complex-simple), were analysed with a 2×2 ANOVA with follow-up t-tests. Analysis revealed a main effect of first-clause complexity (simple-complex, complex-simple, simple, and complex) (F (2,636) =5.664, p<.005). There was no main effect of length (F (1,636) =.046, p=.829). Follow-up paired t-tests, Bonferroni-corrected for multiple comparisons, revealed no significant difference between simple and complex sentences (p=.52), showing that the main effect of type was not driven by length. The t-tests also revealed that the main effect of type was driven by slower onset latencies for simple-complex sentences (M = 358.25, SE = 4.57) than for complex-simple sentences (M = 337.3, SE = 4.33) across all trials (t (159) = 3.6, p <.001).
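For readers unfamiliar with the follow-up tests, here is a minimal sketch of a paired t statistic and a Bonferroni adjustment using only the Python standard library. It is not the R code used for the analysis reported here. The function names are hypothetical, and the latencies are invented purely for illustration and are not data from this experiment. Converting t to a p-value requires the t distribution, which is left to a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, df) for a paired-samples t-test on two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return mean(diffs) / standard_error, n - 1


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical onset latencies in ms (invented for illustration only).
simple_complex = [360.0, 352.0, 371.0, 349.0, 365.0]
complex_simple = [338.0, 341.0, 350.0, 330.0, 344.0]

t_value, df = paired_t(simple_complex, complex_simple)
print(f"t({df}) = {t_value:.2f}")
print(bonferroni([0.5, 0.01]))
```

The Bonferroni step simply makes each comparison harder to pass as more comparisons are run, which is why the corrected p-values are never smaller than the raw ones.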

In Figure 1, the comparisons of length and complexity are shown. As stated before, though there was a slower mean onset latency for complex sentences, there were no significant differences in onset latencies for simple and complex sentences, illustrating no effect of sentence length (one-clause and two-clause), which does not successfully replicate previous findings.

Figure 1. Heptapod A onset latencies of sentences

First-clause complexity demonstrates a main effect, more closely shown in Figure 2. Contrary to previous studies on English-speaking humans, simple first clauses elicited slower onset latencies than complex first clauses. Interestingly, the one-clause and two-clause sentences testing for length are also slower than the three-clause complex-simple phrases. Though this difference is not significant, it is notable that there is an apparent strategic change in sentence planning for changes in sentence length.

Figure 2. Heptapod A onset latencies for first-clause complexity

4.0. Discussion and Conclusion

The findings indicate that Heptapod A reserves more time for planning the final clause than the first. Complex-simple sentences produced faster onset times than simple-complex sentences, yet, strangely, onset times were remarkably similar for sentences of one-clause (simple) and two-clause (complex) length. Since my hypothesis of no main effect of complexity or length was not met, we must consider the possibility that Heptapod A has an entirely unpredictable planning scope. For single-clause and double-clause sentences there was a suspicious lack of significant difference at the point of articulation, which gave my main predictions a half-hearted shrug: although there was no main effect of length, the non-significant difference in onset latencies contradicted what I believed and instead followed the direction of Smith and Wheeldon’s (1999) findings, with slightly slower responses to complex phrases than to simple ones.

So how does this fare with the findings on first-clause complexity? It appears that in larger phrases, Heptapods may switch to a strategy that prioritizes complex phrases: because they had to describe the images from left to right, upon seeing one object in the subject phrase move, they may have taken more time to observe the object phrase items before articulation. Sentence production has been found to be flexible and not structurally fixed (Wagner et al., 2010), so the change in strategy is in line with previous studies. This prioritization of complex phrases could be a reflection of Heptapod B’s ability to produce multiple centre-embedded clauses. If it is not a reflection of their inter-clausal structure, then we can also consider the free word order of Heptapod B, which is a blatant rejection of Greenberg’s word-order universals. The apparent slowing of onset latencies for simple-complex phrases may even illuminate a disadvantage of the sequential order of spoken language, considering the Heptapods’ simultaneous approach to writing and their experience with time.

For follow-up studies I would want to test Heptapod A’s own language structure. One of the limitations of this study is its SVO approach. When eye-tracking technology becomes more advanced and can detect the direction of a Heptapod’s gaze, I would want to run an eye-tracking study following Spivey et al. (2002) to gauge Heptapods’ syntactic processing in comparison to the restricted-domain serial model and the multiple-constraint model. Using this kind of experiment on Heptapods will give us more context on how they approach their sentence planning with visual context. Their ability to simultaneously produce clauses could improve their ability to deal with the ambiguity employed in Spivey et al. (2002). If a full semasiographic dictionary of Heptapod B and a large corpus of Heptapod A and B become available, these symbols that look like coffee stains can be treated as images for visual context. Rather than using picture items, the semasiographs themselves will be prompts for “picture” naming. Eye-tracking and a full understanding of their writing system can give us clues into how Heptapods approach and conceptualize their “simultaneous order”.

References

Chiang, T. (2015). Story of your life, in Stories of your life and others. Main Market Ed. (76). Picador

Coon, J. (2020). The linguistics of Arrival: Heptapods, field linguistics, and Universal Grammar, in Punske et al., Language Invention in Linguistics Pedagogy. Oxford Academic, https://doi.org/10.1093/oso/9780198829874.003.0004

Karlsson, F. (2007). Constraints on Multiple Center-Embedding of Clauses. Journal of Linguistics, 43(2), 365–392. http://www.jstor.org/stable/40057996

Smith, M., & Wheeldon, L. (1999). High level processing scope in spoken sentence production. Cognition, 73(3), 205-246. https://doi.org/10.1016/S0010-0277(99)00053-0

Spivey, M. J., Tanenhaus, M. K., Eberhard, K. M., & Sedivy, J. C. (2002). Eye movements and spoken language comprehension: Effects of visual context on syntactic ambiguity resolution. Cognitive Psychology, 45, 447–481.

Wagner, V., Jescheniak, J. D., & Schriefers, H. (2010). On the flexibility of grammatical advance planning during sentence production: Effects of cognitive load on multiple lexical access. Journal of Experimental Psychology. Learning, Memory, and Cognition, 36(2), 423-440. https://doi.org/10.1037/a0018619

R code used for data analysis
Analysis of Radical Language in Classroom Discourse

March 8, 2025
Note: This post does not reflect any political affiliations or beliefs. This post includes a dispute that challenges narratives that have been manipulated by the media, which may appear to take the side of what some may deem a controversial set of beliefs, but that is not the case. Please read fully before making assumptions.

One of the most challenging periods of my academic journey was during my final semester of my undergrad. A professor wrongfully cited me for violating the student code of conduct and later retaliated by flunking my assignments when her report was unsuccessful. I spent the entire spring of that year defending my character and keeping records. By the end of the semester, I also spent a lot of time collaborating with my peers to build my report on my professor’s retaliation.

Now, as a recent graduate of Applied Linguistics, I want to gain experience in forensic linguistics, and I want to start with something meaningful to me. So, this personal project will revisit my academic misadventure with that professor and explore the defense I made for myself in the spring of 2019, along with the details of the moments leading up to the wrongful accusation. It will explore semantic analysis, indicators of radicalization, and an assessment of the discourse between my teacher and me. It will draw on contextual material such as emails, classroom discussion posts, and information from my meeting with Student Affairs addressing my teacher’s retaliation. This will be total amateur work, but it will allow me to practice and to share some form of forensic linguistics project.

Here’s where it started:

Names of participating individuals and institutions are anonymized.

It was January 2019. The Black Hebrew Israelites, the Indigenous Peoples’ March, and the Covington Catholic School’s March for Life all clashed on the steps of the Lincoln Memorial. A viral video spread of what would quickly become known as the “Sandmann smirk”. This incident became a hot topic of interest to my professor because (1) it happened recently at the time of this class and (2) the course was on Native American Literature.

The viral incident of the “Sandmann smirk” was discussed and used as an example of Indigenous oppression during one of our sessions. In fact, it was initially presented by a student who then received positive feedback from the professor.

On that same day, I went home and watched the whole video of that incident, a nearly two-hour-long recording that ultimately undermined the credibility of the viral clip and the agenda behind it. It did not fully exonerate the Covington Catholic students; in fact, it showed that all parties contributed to the tension and friction. I concluded that the viral clip should no longer be considered an example of the oppression of Indigenous people and shared the whole video on our classroom discussion board, stating this in full (unedited for authenticity):

1/29/19 @21:57 on Discussion Forum, “Current Events/ News/ Social Justice Movements/ Organizations/ Events.”

“Shar Yaqataz Banyamyan’s Video Coverage”

Since it hasn’t been posted yet, this is the full video of the confrontation between the groups of protesters at Lincoln Memorial that occurred earlier this January. To find the viral clip, you will need to view up to about an hour. I wanted to share this because I think it’s important to see how media distorts events and consciously creates stigma and hatred.

If this link does not work, I apologize. The video is still available on YouTube and should be easy to find. Let me know what you guys think. How does media affect you? Does this video provide a different perspective?”

This is how it began.

Following my first post, my professor had several responses:
1. 1/30/19 @00:42 https://m.dailykos.com/stories/2019/1/22/1828496/-Covington-students-caught-wearing-blackface-making-rape-jokes-in-addition-to-harassing-Omaha-elder (Page no longer found)
2. 1/30/19 @00:44 https://www.youtube.com/watch?v=OKJLe0L7Ktg&app=desktop
Then she later followed up with separate threads without explanation:
1. 1/30/19 @00:46 https://www.washingtonpost.com/opinions/2019/01/25/time-take-covington-smirk/?utm_term=.0401f7a31730 (Page no longer found)
2. 1/30/19 @00:47 https://www.democracynow.org/2019/1/22/i_was_absolutely_afraid_indigenous_elder (INTERVIEW)
3. 1/30/19 @00:48 https://www.theguardian.com/us-news/2019/jan/23/how-conservative-media-transformed-the-covington-catholic-students-from-pariahs-to-heroes (Page no longer found)
4. 1/30/19 @00:48 https://www.thenation.com/article/black-children-nick-sandmann-savannah-guthrie/
5. 1/30/19 @00:49 https://www.washingtonpost.com/history/2019/01/23/face-off-between-catholic-school-teens-native-american-elder-is-reminder-years-conflict/?noredirect=on&utm_term=.e520e8872ca5 (Page no longer found)
6. 1/30/19 @00:51 https://theintercept.com/2019/01/24/covington-maga-hat-native-american/ (Page no longer found)
7. 1/30/19 @00:52 https://badndns.blogspot.com/2019/01/first-encounters.html?m=1 (BLOG)
8. 1/30/19 @00:59 https://www.huffingtonpost.com/entry/covington-catholic-students-blackface-race-issues_us_5c472a2de4b0a8dbe1752db5
I looked through all the links and responded to her first two responses without directly addressing the new thread. The two links I responded to cover an incident where the high school students made “rape jokes” and an event where the students were seen in “black face”. Because my professor did not provide any context in her response, I have to assume she was arguing that the Covington Catholic students had committed enough indecent and inhuman acts outside of the “Sandmann smirk” that his association with the group was enough to villainize him as an individual.

I did not articulate my response very well:

1/30/19 @9:31

Would it possible to ignore the Hebrew Israeli protesters calling white racial slurs to the “MAGA” students in the middle of the original video? Why is it that one horrible rape joke manages to be the voice of the entire group when the man filming the video said “Indian” means “savage” gets brushed aside because the media didn’t highlight it?

The main point I’m trying to make is that the rape joke (not made by Sandmann) is horrible (and shouldn’t be tolerated), but it’s not grounds to villainize Sandmann as an individual or to use his smirk as a symbol of oppression. If that’s the case, then why aren’t we villainizing the others who contributed to the tension? The man recording the video was part of the Black Hebrew Israelites and said heinous things to both the Indigenous Peoples March AND the Covington Catholic Students, so why is the man recording the video not being targeted the way Sandmann is? Should we blame the media for that?

At this time, I emailed Greg, Head of the English Department, seeking advice on further action. My main concern was that my grades would be subject to unfair judgment due to the nature of my professor’s responses. I wrote the email later that day.

1/30/19 @12:06 PM

Greg,

Good afternoon. This email concerns political bias in a classroom setting. I don’t know who I should contact regarding this. I only wanted to express my concern for my Professor’s response to my discussion thread. I provided a link (unedited and with no underlying agenda) to the thread titled “Current Events/News/Social Justice Movements/Organizations/Events”. Her response was repressive and maybe intimidating. It made me feel that my opinion (non-political) was unappreciated. The discussion board was quickly overwhelmed with biased articles and links, which I took the time to read.

Please see the attachment that organize the links and the event. What should I do further?

1/30/19 @13:49 (From Greg)

Hi Sarah: I’m sorry to hear about your problems in this class, an I certainly hope it doesn’t affect your grades. I think we should meet and we can go through exactly what is happening in the class together. I can meet around 2:30 tomorrow, or anytime next Tuesday afternoon. I’m looking forward to talking to you soon. All best, Greg

1/30/19 @ 15:19 (From me)

Hello, I can meet after that very class which ends at 3:45

1/30/19 @15:50 (From Greg)

Sure; that’s fine. See you tomorrow at 3:45…..

The next morning, I opened Blackboard and found two unread messages under my Native American Literature course.

The first reply by my professor, Helen, was posted on the classroom discussion board.

1/31/19 @1:23

Sarah, Actually, if you read all the articles and blogs I posted, no one is ignoring their comments, but rather contextualize the history of the group and agree that what they are doing in regards to the slurs, etc. is just as offensive as the youth. I certainly don’t agree with them and find their derogatory words towards Indigenous People and others appalling. The members of the Indigenous group willingly placed themselves between these two EXTREMIST groups. The youth have been documented in other videos that have since surfaced acting similarly towards other people in the area –before this entire incident occurred. Did you also gloss over the part of them making offensive “wooping” sounds and “hand chops” meant to mimic and stereotype “Indians”? I suggest reading more of the articles posted –especially those from the perspective of Indigenous People in order to better understand the absolute pain that reverberated through Indian Country (as well as communities of color) after watching these youth mob Nathan Phillips. I would start with Deborah Miranda’s blog and perhaps the opinion article on the “smirk”– as well as the articles about “white privilege” that allowed the youth to hire private PR firms tied to known Republican operatives. Hiring a private PR firm to “speak for you” is a class privilege that most people of color in these situations do not have. In fact, can you imagine if the mob of 100s of youth were black lives matter activists and Nathan Phillips was white? These youth would have been arrested and perhaps killed. I’m astounded that anyone could read the actions of these youth as anything other than mob violence and intimidation meant to instigate Phillips.

Lastly, your comment about rape is unacceptable discourse in this class and will be recorded as a violation of the student code of conduct and [university’s name] core values. One should never make light of or joke about the serious crime of rape and I encourage anyone who has been sexually assaulted to not let your misguided words or the words and actions of these youth to deter them from reporting incidents to the appropriate persons.

The second reply was a private message from Helen sent minutes after this accusation took place.

1/31/19 @1:37

Hi Sarah, can you please set up a time to meet with me to discuss your nonchalant comment about rape. You may need to clarify your comment that I read as violating the code of conduct for students and dismissing the seriousness of sexual assault.
Fig. 1 Images used in Student Grievance Report

With the context and detailed background on what instigated this accusation, I will briefly touch on my feelings about the matter. First, I was extremely distraught to be accused of condoning rape in front of my peers. Second, the degree of accusation is not only an insult to my character but also an insult to my past. Helen was ignorant of my experiences with sexual assault and failed to consider the gravity of her accusation toward someone she did not know beyond our student-teacher dynamic. Lastly, her hypocrisy and inability to be professional became apparent, as her public accusation occurred before she considered writing to me to clear the air.

Also, on page 6 of her syllabus, it was clearly stated: “It [the class] will nurture an atmosphere free from…discrimination upon an individual’s political views or beliefs.” I want to clarify that my position was, in fact, neutral and was not directly challenging any political standpoint. However, because I was not supporting the claims the viral clip was illustrating, it may have appeared that I was openly defying its political message. It is possible that Helen’s reaction could have been instigated by her perceiving that I was challenging her political stance rather than the role of the media.

Analysis

My goal is to explore possible indications of radicalization in my professor’s language. Because the samples are very small, my results cannot be conclusive, which is fine in this case since I am simply exploring tools and frameworks specific to counter-radicalization. My methodology was inspired by research in threat assessment and violence risk. I want to be clear that I am not implying my professor is prone to violence, nor do I believe she is. I am using these frameworks and approaches to determine what extremist indicators are found in her language. To make my findings more systematic, I will include quantitative methods, using NLP tools such as Wmatrix to detect prominent features of her texts and relating them to theoretical frameworks for indicators of radicalization. I will also use collocation patterns in the Corpus of Contemporary American English (COCA) to elaborate my defense.

My defense:

What got me in trouble was the use of “rape joke” in my reply. So, according to Helen, saying “rape” next to “joke” means that I support rape and possibly find it funny (?)

The defense I made with Student Affairs at the time was along the lines of “well, she sent me the article that has ‘rape joke’ in its title; by proxy, does that mean she condones rape too?” More specifically, I clarified that my construction of “rape” and “joke” as a phrase was not a reflection of my opinion but a reflection of the very title of an article my professor sent in one of her replies. I remain concerned that this was the sole consideration that got me out of trouble.

Knowing what I know now, let’s consider an alternate reality where she sent the article about the “rape joke” incident, but it wasn’t stated within the title. Would I be in hot water then? It would have been harder to show that the construction of that phrase was not a reflection of my opinion. Using the frequency of collocation patterns in corpus linguistics, I can further solidify my defense.

COCA is an American English database with over a billion words gathered from various sources. Using the collocation search feature, I input “rape” as the target word with +1 for all words that followed “rape”, meaning I searched for all two-word phrases that started with “rape” across the entire corpus. I also sorted by relevance with a minimum score of 20 to eliminate highly frequent words such as “the”, “and”, or “are”.
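To make the “+1” search concrete, here is a toy sketch of what counting right-hand collocates means. It does not query COCA: the function name `right_collocates` and the short sample text are invented for illustration, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def right_collocates(text, node):
    """Count the words that immediately follow `node` (a +1 window)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        following
        for current, following in zip(tokens, tokens[1:])
        if current == node
    )


# Invented sample text standing in for a corpus (illustration only).
sample = (
    "The article condemned the rape joke. "
    "Commentators said rape jokes are unacceptable. "
    "The rape joke was widely criticized."
)

counts = right_collocates(sample, "rape")
print(counts.most_common())
```

A real corpus interface does the same kind of counting over a vastly larger text collection and then ranks the results by an association score rather than by raw frequency.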

The word “jokes” occurred 16402 times across the corpus, and 141 of those occurrences directly followed “rape”, a proportion of 0.86%. The phrase “rape jokes” also has a mutual information (MI) score of 8.08. This is significant because an MI score measures the strength of association between two words, indicating how much more often they co-occur than would be expected by chance. As a rule of thumb, an MI score of 3 or higher is considered evidence of a noteworthy association, meaning that the two words are collocates. The word “joke” occurred 36085 times across the corpus, with 43 occurrences following “rape”, a proportion of 0.12%. The phrase “rape joke” has an MI score of 5.23.
Fig. 2 Images of COCA collocation results
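The arithmetic behind these figures can be sketched as follows. The percentage is the pair count divided by the collocate's total count. The MI function uses a common formulation, log2 of observed over expected co-occurrence, which I assume is close to what COCA reports. The frequency of “rape” and the exact corpus size are not given here, so `NODE_TOTAL` and `CORPUS_SIZE` are placeholders; they cancel out when the two MI scores are subtracted.

```python
import math

def collocate_share(pair_count, collocate_total):
    """Percentage of a collocate's occurrences that directly follow the node word."""
    return 100 * pair_count / collocate_total

def mutual_information(pair_count, node_total, collocate_total, corpus_size, span=1):
    """MI (log base 2): observed co-occurrences relative to the number expected by chance."""
    expected = node_total * collocate_total * span / corpus_size
    return math.log2(pair_count / expected)

# Counts reported from the COCA search.
jokes_share = collocate_share(141, 16402)
joke_share = collocate_share(43, 36085)

# Placeholder values (assumptions): not reported in the text, and they cancel in the gap.
NODE_TOTAL, CORPUS_SIZE = 1_000, 1_000_000
mi_gap = (mutual_information(141, NODE_TOTAL, 16402, CORPUS_SIZE)
          - mutual_information(43, NODE_TOTAL, 36085, CORPUS_SIZE))

print(f"jokes: {jokes_share:.2f}%  joke: {joke_share:.2f}%  MI gap: {mi_gap:.2f}")
```

As a cross-check, the gap computed from the counts should match the difference between the two reported MI scores (8.08 and 5.23).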

It’s now more apparent that the phrase is not unfamiliar to the media and has appeared often enough to be noted as a collocate. If I could go back in time to that frigid office with Student Affairs, I would show them that my use of the phrase was not a combination inspired by my own moral framework but rather a byproduct of the media.

Let’s entertain the idea that I constructed the phrase “rape joke” of my own accord. Suppose it were an original phrase that had not achieved notable association in an American English corpus. What Helen and the Department of Student Affairs blatantly ignored is the adjective “horrible” I placed in front of it. Even if the phrase were my own construction, I attached a negative description to it. If we want to infer my opinion from my use of the words “horrible rape joke” rather than “rape joke,” then wouldn’t that mean I actually condemn rape?

Helen’s language:

Following Ebner et al.’s (2023) initial approach to their analysis, I will also be using the Institute for Strategic Dialogue’s (ISD) definition of extremism: “…the advocacy of political and social changes in line with a system of belief that claims the superiority and dominance of one identity-based ‘in-group’ over an ‘out-group’. It advances a dehumanising ‘othering’ mind-set incompatible with pluralism and universal human rights, and can be pursued through violent and non-violent means.”

In their recent study on assessing the violence risk of far-right extremist groups, Ebner et al. (2023) associated extremism with identity fusion, in which personal agency becomes integrated with the shared identity markers, beliefs, and practices of a group. They determine that “fused individuals tend to view members of their in-group as kin-like” and employ kinship language such as “brotherhood”. The authors sought linguistic markers of identity fusion, and their framework also considered how fusion combines with other indicators: existential threat perceptions, out-group dehumanization, and violence-condoning norms (Ebner et al., 2023). These indicators, with the exception of violence-condoning norms, serve as the baseline for extremist/radical indicators in my analysis of my professor’s accusation.

Previous research analyzing extremist text has utilized lexicon-based tools such as Linguistic Inquiry and Word Count (LIWC) and corpus-linguistic tools such as Wmatrix, which calculate and match the percentage of words in predefined categories (e.g., semantic domains) (Litvinova & Litvinova, 2020). This analysis uses Wmatrix for its analysis capabilities, particularly in indicating the significance of a text’s frequency of a word or phrase. I compared my professor’s accusatory text of a mere 358 words to the American Corpus. The primary domains (excluding the Trash Bin, which contains punctuation marks) were Time: New and young; Language, speech and grammar; Impolite; Crime; The Media; Colour and colour patterns; Belonging to a group; Comparing; Happy; Probability; Evaluation: Bad; Quantities; and Law and Order.

These primary domains were the only ones with a log-likelihood (LL) value of 3.84 or higher, the threshold for statistical significance at the p<0.05 level. In other words, if my professor’s text and the reference corpus did not actually differ in how often they use a domain, a frequency gap this large would be expected to arise by chance less than 5% of the time.
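For readers who want to see where an LL value comes from, here is a minimal sketch of the two-corpus log-likelihood calculation widely used in corpus comparison. I am assuming, not confirming, that Wmatrix computes its scores this way. The 358-word text size comes from this analysis; the reference corpus size and both domain counts are hypothetical.

```python
import math

# Chi-square critical values (1 degree of freedom) commonly paired with LL scores.
THRESHOLDS = [
    (15.13, "p < 0.0001"),
    (10.83, "p < 0.001"),
    (6.63, "p < 0.01"),
    (3.84, "p < 0.05"),
]

def log_likelihood(count_study, size_study, count_ref, size_ref):
    """LL for one item's frequency in a study text versus a reference corpus."""
    total = count_study + count_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    for observed, expected in ((count_study, expected_study), (count_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

def significance(ll):
    """Return the strictest significance level that the LL value clears."""
    for cutoff, label in THRESHOLDS:
        if ll >= cutoff:
            return label
    return "not significant at p < 0.05"

# Hypothetical counts: a domain tagged 6 times in the 358-word text and
# 2,000 times in an assumed 1,000,000-word reference corpus.
example_ll = log_likelihood(6, 358, 2000, 1_000_000)
print(round(example_ll, 2), significance(example_ll))
```

When an item is equally frequent, proportionally, in both texts, the score is zero. It grows as the relative frequencies diverge, which is why a short text can still produce significant domains.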

Fig. 3 Wmatrix semantic frequency compared to COCA

There is a clear binary and exclusionary framing that is indicative of extremism, which will be detailed further. The top domain with a critical LL value of 21.83 (15.13 LL is considered to be at the p<0.0001 level) is Time: New and Young. The concordances of this domain strictly refer to “youth,” which my professor uses to refer to the Covington Catholic Students.

Fig. 4 Concordance lines for Time: New and Young domain

With this context in mind, it would be more appropriate to categorize these results with the domain Belonging to a group, which included other terms such as “members” and “group” in Figure 5. Although there are no indicators of kinship language, there is a notable us vs them narrative. Nathan Phillips is mentioned as a victim individually, but when considering the primary alleged antagonist, Sandmann, the individual is not mentioned. There is a fixation on his ‘group’ as a whole.

Fig. 5 Concordance lines of domain Belonging to a group

References to the ‘youth’ are consistently associated with another group term, ‘mob’, indicating a strong negative perception of the determined out-group. It is also noteworthy that ‘mob’ is used as both a noun and a verb regarding the students; Wmatrix caught this and did not include the verb in the group domain. This association may not be a clear indicator of the dehumanization of an out-group, mainly due to the lack of derogatory terms directed at them. However, there is a strong tendency to villainize the out-group, or out-groups, if we consider her brief mention of the Black Hebrew Israelite group as another extremist group.

Her in-group terms use “members” and “communities” in association with a pacifistic role under threat from the out-group. The sentence, “the Indigenous group willingly placed themselves between these two EXTREMIST groups”, echoes principles of sacrifice and virtue, which further polarize the framing of these groups. In COCA, to place oneself between two parties, particularly aggressors, is a common cluster that is associated with individuals trying to create peace or to protect.

Fig. 6 Concordances of domain Colour and colour patterns

Returning to indicators of extremism, fixation appears as an indicator across various models (Araque et al., 2022). According to Araque et al. (2022), fixation is any behavior that demonstrates a person’s increasing preoccupation with a person or cause, commonly accompanied by failing relationships or work performance. It seems fair to assume that Helen has displayed the defined fixation, as this has impacted her classroom management and her authority as a grader (see the final notes for her retaliation). The concordances of the domain Colour and colour patterns further display a group polarity. White versus black. White versus color. The white group represents a privileged and villainous class, and the black and colored groups represent the victim class. Her fixation clearly resides within left-wing ideology, but I will not comment on the political or sociological implications, nor will I argue these beliefs; this is purely an assessment of extremist language and its indicators.

One final note on in-group mentality before moving further into existential threats is the possibility of identity fusion. Fusion, as mentioned before, integrates personal agency with group beliefs and practices. An example of this fusion can come from shared trauma or deeply transformative events (Ebner et al, 2023). Helen acts as a representative of in-group suffering, for example: “I suggest reading more of the articles posted– especially those from the perspective of Indigenous People in order to better understand the absolute pain that reverberated through Indian Country (as well as communities of color) after watching these youth mob Nathan Phillips.”

Looking closer into frameworks of extremist language, existential threat perceptions entail beliefs of danger and threat posed by the out-group, whether absolute or not. Helen’s assumptions of violence by the out-group have already been expressed through her use of “mob”, and she then further engages in conspiratorial myth behavior. When Helen said, “…can you imagine if the mob of 100s of youth were black lives matter activists and Nathan Phillips was white? These youth would have been arrested and perhaps killed”, she portrayed injustice as inevitable and violence as possible.

The domain Happy refers to her use of “smirk”, “make”, and “joke”, which mainly encompasses her accusation toward me. The domain Evaluation: Bad refers to her use of “appalling” and “derogatory”, which are her comments on the actions of the out-group. These results are replicable. If readers are interested in the concordances not shown, save my professor’s response as a .txt file, access Wmatrix, and upload the .txt file as a corpus. Under the semantics section, select the American Corpus to compare it to.

Overall, Helen’s language in this sample displays indicators of radicalization and extremism. It should be noted again that this sample is incredibly small, and more data would be needed before drawing conclusions. The goal of this project was to assess linguistic features of extremism, and Helen’s language displays group identity, identity fusion, negative out-group perceptions, and existential threat beliefs.

Final notes:

My professor’s retaliation is an entire ordeal outside the scope of this exploratory study. However, I invite anyone to view my (anonymized) grievance report below.

Official+Statment Download

It will take a lot more behavioral evidence to confidently and publicly accuse someone of condoning rape. Sexual assault should not be taken lightly, nor should accusations that someone supports it. The term “rape joke” alone, especially in the context I used it, does not come close to reflecting my opinion. My opinion of academic administration and classroom management, however, is strong and has driven my interest in behavioral analysis and counter-radicalization in areas of education, law, and policy.

References

Araque, Ó., Sánchez-Rada, J. F., Carrera, Á., Iglesias, C. Á., Tardío, J., García-Grao, G., Musolino, S., & Antonelli, F. (2022). Making Sense of Language Signals for Monitoring Radicalization. Applied Sciences, 12(17), 8413. https://doi.org/10.3390/app12178413

Davies, M. (2008–). Collocates data from the Corpus of Contemporary American English (COCA). Data available online at https://www.collocates.info

Ebner, J., Kavanagh, C., & Whitehouse, H. (2023). Assessing Violence Risk among Far-Right Extremists: A New Role for Natural Language Processing. Terrorism and Political Violence, 36(7), 944–961. https://doi.org/10.1080/09546553.2023.2236222

Litvinova, T., & Litvinova, O. (2020). Analysis and Detection of a Radical Extremist Discourse Using Stylometric Tools. In T. Antipova & Á. Rocha (Eds.), Digital Science 2019. DSIC 2019. Advances in Intelligent Systems and Computing, vol 1114. Springer, Cham. https://doi.org/10.1007/978-3-030-37737-3_3