Digital Witch Project


The general purpose of the Digital Witch Project is to create a methodology for understanding film genres and cycles using digital tools. Our specific goal is to investigate the dialogue and images of one critically understudied genre, witch films, and draw conclusions about the emergence of patterns and trends over time. The project has three stages: (1) identifying a corpus of witch films, (2) collecting data from that corpus on textual and visual similarities between films, and (3) visualizing and analyzing collected data. We employed SameDiff for a textual comparison of dialogue in witch films and ImagePlot for a visual comparison of theatrical posters and film stills. This investigation reveals prototypical films, The Witches of Eastwick and The Craft; cycles, including the Harry Potter franchise and the made-for-television witch film trend; and outliers, including Snow White and the Seven Dwarfs and The Blair Witch Project. In all, we hope this study sparks more conversations on the complex inner workings of genre film production, capitalism, and the tradition of co-opting witchcraft, a symbol of import to various marginalized cultures, within U.S. film and media.

Witches, Genres, and Cycles

Genre films have familiar narratives, characters, and aesthetics. An extraterrestrial invades Earth. A slasher slowly kills off a group of teenagers until one final girl escapes. A lone outlaw rides off into the sunset. These patterns are a way for Hollywood to capitalize on the popularity of one film, turning it into a modular and repeatable type, a generic formula. While there is an overwhelming amount of research on certain film genres, a number of others remain critically understudied. Our research aims to establish a method for the digital analysis of one corpus of understudied genre films, witch films—a genre that, according to critic Sarah Ward, pivots on the tension between oppositions to normality and structures of power (41).

Ward argues that “as a term, a practice and a label commonly denouncing spiritual activities that defy dominant beliefs, witchcraft comes to the screen loaded with meaning” (35). Although witches figure in almost every society across the globe, they occupy a unique space in American culture and history. From “witch trials” during the early colonial period to the present-day politicized “witch-hunts,” America has obsessed over witches for centuries. Culturally, the witch has been bound up in America’s tumultuous relationships with religion, gender, and sexuality. Scholars tie popular witch media to their corresponding historical moments, particularly moments of unrest and paranoia—for example, one cannot extricate Arthur Miller’s play The Crucible (1953) from McCarthyism, nor can one detach Rosemary’s Baby (Roman Polanski, 1968) from the political unrest of the 1960s (Ward 37-40). When witchcraft became aligned with the women’s liberation movement in the 1970s, interest in and media representations of witches boomed (Foltz). After this influx of interest, images of witches appear more domesticated.

There have been few comprehensive studies of the witch film despite the longevity of the genre, which began with the U.S. release of the silent film Häxan (Benjamin Christensen) in 1929. As popular culture tamed the witch, from violent or sexual threats to domesticated, loving, hard-working women of capitalism, witch comedies began to appear (Gibson 184). ABC’s Bewitched (1964–1972) set a precedent for the Disney-TV trend toward light-hearted witch series and movies. Sabrina the Teenage Witch aired on ABC from 1996 to 2000 (before moving to the WB from 2000 to 2003). Halloweentown (Duwayne Dunham, 1998) was the first in a cycle of Disney Channel Original Movies about witches. Meanwhile, the slightly darker Buffy the Vampire Slayer (1997–2003) and Charmed (1998–2006), which were a mix of comedy, horror, and teen drama, aired on the WB. Concurrent horror films such as The Blair Witch Project (Daniel Myrick & Eduardo Sanchez, 1999) and The Conjuring (James Wan, 2013) recall the fear of difference that buoyed early representations of witches. The teen horror film The Craft (Andrew Fleming, 1996) reflects the potential for both liberation and regulation bound up in witchcraft. Sue Short, writing of The Craft, notes the female characters’ willingness to accept their status as outsiders and, in fact, to embrace their exile as an “alternative to existing norms” (105). Of course, in the end only one member of The Craft coven, the most “normal” among the four teenage girls, remains empowered. These film and television trends reflect a period of the mass production of witch films, from the release of The Witches of Eastwick (George Miller) in 1987 to the release of Harry Potter and the Deathly Hallows: Part Two (David Yates) in 2011.

In 1973, Andrew Tudor centered sociological and psychological context in his definition of genre, arguing that genre is “what we collectively believe it to be” (139). That is, genre “is not a way in which a critic classifies films for methodological purposes, but the much looser way in which an audience classifies its films” (145). While scholars and fans widely label witch films as such, what allows us to come to this agreement is much more complex than simply pointing to commonalities. In short, a genre is defined and created by cultural consensus. “Like film genres,” Amanda Klein writes, “film cycles are a series of films associated with each other through shared images, characters, settings or themes” (4). She goes on to establish the difference between genre and cycle:

However, while film genres are primarily defined by the repetition of key images (their semantics) and themes (their syntax), film cycles are primarily defined by how they are used (their pragmatics). In other words, the formation and longevity of film cycles are a direct result of their immediate financial viability as well as the public discourses circulating around them, including film reviews, director interviews, studio-issued press kits, movie posters, and theatrical trailers. (4)

At the center of film cycles are studios’ attempts to cater to audience desires and expectations. If an audience enjoys a film (i.e., if a film earns a studio impressive returns at the box office or garners fan attention), the studio will copy some or all of it (e.g., the semantics or syntax) and graft these elements onto a new film. Cycles are about films’ use-value and profitability; these are commodities, usually produced as quickly as possible. While cycles and genre function differently for our discussion of witch films, both are crucial for understanding how generic formulas reflect both audience and studio demands.

Digital Humanities Projects on Film

We seek to position this project within a network of burgeoning scholarship on the digital analysis of film. With the onset of data mining digital humanities projects, we are now equipped with the tools for studying large-scale textual and visual trends more efficiently. Movies in Color analyzes the usage of color in film using color drop tools; the in-progress project Distant Viewing takes on the movement of images; Digital Formalism examines visual elements of film to consider how formal techniques contribute to overall narrative structures. The tool ScripThreads identifies patterns of character interactions in screenplays. Meanwhile, projects from The Pudding, such as “Film Dialogue from 2,000 Screenplays, Broken Down by Gender and Age” and “Hollywood’s Gender Divide and Its Effect on Films,” rely strictly on text mining and lack scholarly rigor.

Finding throughlines and divergences in thousands of hours of footage is possible in ways that it hasn’t been before, but methodologies for this have not yet been developed or tested. Considering that films are multimodal by design, telling and showing stories through sound, visuals, narrative, and spectacle, the fact that all prior projects have stuck to either the text or the visual component of films elides the reality of the cinematic form. This is where our project hopes to intervene, first by doing a combined digital analysis of both dialogue and visuals, with long-term goals of incorporating sound and moving images. At its core, this project is an experiment that uses digital tools in order to understand film genre and cycles.

Corpus Development and Collection

Bar graph of Box Office Receipts

In this first phase of the Digital Witch Project, we collected an inclusive corpus* of witch films. Despite the witch’s prevalence in the collective imagination, no singular definition exists. For example, The Church and School of Wicca and Louisiana Voodoo each employ their own unique definition and identity. In the context of film genre, though, cultural consensus allows us to readily identify a large number of so-called “witch films” that have co-opted the term. We let public opinion guide us as we created our corpus: If a film was called a witch film by film distributors, marketing materials, fan websites, listicles, or scholarly articles, we included it. This methodology necessarily introduces certain biases, and in our case it immediately illustrated how the figure of the witch is white-washed and subdued in media. Lastly, we supplemented the list with any film that included named “witches,” such as the three Macbeth films.

We then collected dialogue transcripts, original release posters, and most popular stills of each film.* We employed the image comparison tool ImagePlot to examine differences and similarities in hue, brightness, and saturation among the release posters and film stills. Using the text comparison tool SameDiff, we simultaneously compared each witch film’s dialogue to six films* that indicate watershed moments in the genre’s development: Snow White and the Seven Dwarfs (David Hand et al., 1937), The Witches of Eastwick, The Craft, Halloweentown, The Blair Witch Project, and Harry Potter and the Sorcerer’s Stone (Chris Columbus, 2001). Initially, we selected these prototypical films based on box office receipts. The films that stood out in these receipts, after we adjusted for inflation, were Snow White and the first Harry Potter. We also observed how seminal two specific films, The Witches of Eastwick and The Craft, were in fannish and scholarly conversations about witch films. Similarly, when we searched popular culture lists, the made-for-television movie Halloweentown occupied a significant place in conversations; this film highlighted a cycle of made-for-television witch movies that our initial view of box office receipts ignored. The following demonstrates our findings and analysis thereof.

Visual Analysis Using ImagePlot

In addition to informing our definition of witch films as a genre, cultural perception also shaped how we approached visual analysis. As mentioned earlier, we analyzed theatrical posters and film stills to consider how this genre is both marketed to and received by audiences.

For theatrical posters, we used the poster associated with each film on the Internet Movie Database (IMDB). IMDB provided continuity—nearly all U.S. films are included in the database—and also allowed us to study how viewers perceive witch films, because IMDB pages are curated by volunteer contributors rather than by the film’s distributors. ImagePlot, then, allows us to perform a color study of the collection of theatrical posters, indicating what shared visual characteristics define marketing techniques of the generic witch film.

ImagePlot of Brightness vs. Hue

We first rendered the posters as a polar visualization of brightness and hue, which compares the relative lightness of the image to the variations in colors or shades as they relate to the color wheel. The majority of posters clustered around the center of the graph, which means that, on average, those images use darker hues than ones toward the edges. There were relatively few posters that used vibrant blue hues, with the exceptions of Bewitched (Nora Ephron, 2005) and Sleeping Beauty (Clyde Geronimi, 1959), and even fewer that used green hues, excluding Rosemary’s Baby. Yet, if we consider the films at the center of the graph, the generic witch film looks different than expected. Where we might anticipate some combination of warmer yellows and oranges, following the model of Practical Magic (Griffin Dunne, 1998) or Halloweentown, posters most commonly used black in combination with colder blues and grays to suggest that the generic witch film poster is icier, darker than expected. A surprising trend was the lack of reds.
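As a rough illustration of the measurements behind such a polar plot, the following Python sketch averages hue, saturation, and brightness over a poster's pixels and maps hue to an angle and brightness to a radius. It is a minimal sketch under stated assumptions, not ImagePlot's own routine: the function names `mean_hsv` and `polar_position` are hypothetical, the pixels are assumed to be already loaded as 8-bit RGB tuples, and ImagePlot may summarize images differently.

```python
import colorsys
import math


def mean_hsv(pixels):
    """Average hue (degrees), saturation, and brightness over RGB pixels (0-255).

    Hypothetical helper for illustration; not part of ImagePlot.
    """
    sin_sum = cos_sum = sat_sum = val_sum = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # Hue is an angle on the color wheel, so average it as a vector.
        # Weighting by saturation keeps gray pixels from skewing the hue.
        sin_sum += s * math.sin(2 * math.pi * h)
        cos_sum += s * math.cos(2 * math.pi * h)
        sat_sum += s
        val_sum += v
    n = len(pixels)
    hue_degrees = math.degrees(math.atan2(sin_sum, cos_sum)) % 360
    return hue_degrees, sat_sum / n, val_sum / n


def polar_position(hue_degrees, brightness):
    """Place an image on a polar plot: angle = hue, radius = brightness."""
    theta = math.radians(hue_degrees)
    return brightness * math.cos(theta), brightness * math.sin(theta)
```

Under this mapping, a dark poster lands near the center of the plot whatever its hue, which is consistent with the clustering described above.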

ImagePlot of Brightness vs. Saturation (Axis)

We also organized these theatrical posters to compare brightness and saturation along the x- and y-axes respectively. Here, the visualization indicates a trend of higher saturation in theatrical posters, which generally aligns with the darker hues of our first graph. But organizing the posters by brightness and saturation also reveals a cluster of outliers not apparent in our first visualization. There is a cluster of films near the bottom right corner (The Wizard of Oz (Victor Fleming, 1939), Little Witches (Jane Simpson, 1996), The Crucible (Nicholas Hytner, 1996), and Maleficent (Robert Stromberg, 2014), among others) that have the highest possible brightness and lower color saturations. All predominantly white in hue, these posters range across subgenres and historical periods.

ImagePlot of (stills) Brightness vs. Saturation

Our visualization of film stills supports this analysis. For each film, we collected the first image returned from a Google Search for “[film title] + film still.” Although the Google Search algorithm is partially randomized, influenced by factors like the computer’s previous search history, it also ranks pages based on their relevance to search terms and allows for a study of how public discourse contextualizes each film.* As the graph shows, the majority of film stills maintain a lower brightness and higher saturation. Yet there are several exceptions: Suspiria (Dario Argento, 1977), Teen Witch (Dorian Walker, 1989), The Little Mermaid (Ron Clements & John Musker, 1989; nearly off the graph), Halloweentown, and The Wicker Man (Neil LaBute, 2006).

Textual Analysis Using SameDiff

Turning from visual similarity to dialogue, we gathered cosine similarity scores for our six benchmark films, resulting in six distinct graphs rendered using a combination of RawGraphs and Adobe Illustrator. In all graphs, the x-axis represents chronological time and the y-axis denotes the percent similarity between a film and the potential prototypical film. We included ratings because we expected films with similar ratings to show some pattern of similarity. While this bore out in the made-for-TV witch films, beyond this cycle we don’t see any significant ratings-based patterns.
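For readers unfamiliar with cosine similarity, the Python sketch below shows one common way to compute it from word counts. It is an illustration under stated assumptions rather than a description of SameDiff's internals: the helper names `word_counts` and `cosine_similarity` are hypothetical, and the tokenization here (lowercasing, no stop-word removal) is a simplification that SameDiff's own preprocessing may not share.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens in a transcript (simplified tokenizer)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two word-count vectors, from 0.0 to 1.0."""
    # Only words that appear in both transcripts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty transcript shares nothing with any other
    return dot / (norm_a * norm_b)
```

Multiplying the score by 100 gives a percent similarity of the kind plotted on the y-axis. Because the measure depends on the angle between the vectors rather than their length, a long and a short transcript can still score as highly similar if they use words in similar proportions.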

Scatterplot of Halloweentown Scatterplot of Harry Potter

The graphs revealed a few unexpected trends. One of the most prominent cycles is that of Halloweentown. Many of the films closest to it are in fact other made-for-television witch films, such as Sabrina the Teenage Witch. The success of Halloweentown prompted several sequels, including Halloweentown II: Kalabar’s Revenge (Mary Lambert, 2001), Halloweentown High (Mark A. Z. Dippe, 2004), and Return to Halloweentown (David Jackson, 2006). Meanwhile, the Disney Channel also put out Twitches (Stuart Gillard, 2005) and Twitches, Too (Stuart Gillard, 2007). This cycle is nested within the genre; for example, Halloweentown and Witches of Eastwick are 72 percent similar.

The Harry Potter films represent a distinct cycle but remain a generic outlier. This underlines the ways in which Harry Potter was not marketed as a witch film, even though half the characters are witches. Despite the fact that the semantic elements are present—the broomsticks, wands, cats and spells—the film’s dialogue is quite dissimilar from other witch films. For example, Harry Potter and the Sorcerer’s Stone is only 24 percent similar to The Witches of Eastwick.

Scatterplot of Snow White Scatterplot of The Blair Witch Project

While Harry Potter represents an outlying cycle in the genre, The Blair Witch Project and Snow White are outliers altogether. They are entirely dissimilar from the majority of films in the genre and do not seem to initiate the production of dialogue-similar films. Snow White is the highest grossing film in our corpus with multiple theatrical releases. One would think that the second highest grossing film and another heavy-hitter for Disney, Sleeping Beauty, would have used similar language. Yet Sleeping Beauty is only 18 percent similar to its studio predecessor. With regard to our other outlier, the only film that resembles The Blair Witch Project is The Woods at 52 percent, with all other films falling at or below 32 percent. Perhaps we might say that The Blair Witch Project is a horror film first, and a witch film second.

Scatterplot of Witches of Eastwick Scatterplot of The Craft

If two prototypical, theatrical witch films seemed to emerge from this study, they would be The Witches of Eastwick and The Craft. Looking at the timeline of these graphs, we see an uptick in the number of witch films produced after The Witches of Eastwick. This could be explained by an uptick in the sheer number of movies being made. However, it could also be tied to the post-1970s increase in interest in Wicca and the concurrent commodification of witchcraft or to greater recognition by production companies of the attractiveness of the witch film as a genre. It is also interesting that The Witches of Eastwick and The Craft are 80 percent similar, despite the fact that they are presumably geared toward distinct audiences: The first is about three adult female friends having shared sexual, magical, and maternal experiences, while the second is about the perils and promise of ostracization and female friendship in high school. Both films have garnered the majority of fan and scholarly attention in the genre, so it is at the same time unsurprising that they coalesce at the level of language.

After the Witching Hour...

After the peak production period of witch films (1987–2011), the trends seem to change. In the last few years, the number of action-adventure witch films has grown. Here we refer to The Huntsman films (Rupert Sanders, 2012; Cedric Nicolas-Troyan, 2016), Hansel and Gretel: Witch Hunters (Tommy Wirkola, 2013), and The Last Witch Hunter (Breck Eisner, 2015). Alongside these, we see a resurgence of independent witch films, such as The House of the Devil (Ti West, 2009), Antichrist (Lars von Trier, 2009), The Witch (Robert Eggers, 2015), and The Love Witch (Anna Biller, 2016). The last two of these have received critical acclaim and suggest possible new watershed moments, although it is certainly too soon to tell. Nonetheless, they continue to rely on certain semantic elements, alongside the syntactics of the witch in American culture.

In phase two of the Digital Witch Project, we expect to add complexity and detail to the project’s approach. First, we will run the rest of our corpus through SameDiff, accompanied by a further exploration of specific language use in the scripts using the language analysis tool Voyant. We will explore further the affordances of ImagePlot, while also considering other visual analysis tools. With another dimension of film in mind, we will search for sound analysis tools. Next, we will add data about each film's production company to see what patterns might emerge. Lastly, we will develop interactive graphs for our data to make them more readable and explorable.


We would like to thank the following individuals for their feedback and support: Amanda Phillips, Megan Martinsen, Melissa Jones, Matt Pavesich, and Caetlin Benson-Allott.