Stylometry: When Algorithms Lead the Investigation — From Molière to Contemporary Affairs



11/4/2025 · 5 min read

The Art of Measuring Style

Long confined to circles of philologists and linguists, stylometry has in recent years experienced an unexpected resurgence of interest. This discipline, which applies statistical tools to textual analysis, aims to identify an author by their way of writing — the frequency of function words, syntactic patterns, and sentence rhythm. In other words, a stylistic fingerprint as unique as a handwritten signature.

Behind this science of style lies a simple idea: we all write, unknowingly, according to recurrent, measurable, and comparable patterns. What literary critics once described as a “tone” or a “voice” has now become a mathematical profile that computers can recognize with increasing precision.
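To make the idea of a "mathematical profile" concrete, here is a minimal sketch in Python that turns a text into relative frequencies of function words. The word list, the function name `function_word_profile`, and the sample sentence are illustrative assumptions, not taken from any study cited here; real analyses rely on much longer lists in the language of the corpus.

```python
import re
from collections import Counter

# Illustrative, deliberately tiny list of English function words (an assumption;
# actual studies use far longer lists in the language of the corpus).
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it", "but", "not")

def function_word_profile(text, vocabulary=FUNCTION_WORDS):
    """Return the relative frequency of each function word in `text`."""
    # Keep runs of letters only, lowercased; digits and punctuation are dropped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {word: counts[word] / total for word in vocabulary}

sample = "The cat sat on the mat, and the dog did not move."
profile = function_word_profile(sample)
```

On a single sentence the numbers mean little; a profile only becomes informative when it is computed over thousands of words and compared with profiles of other texts.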

The Molière Case: When the Algorithm Challenges the Legend

For over a century, a persistent rumor has hovered over the figure of Molière. Was Jean-Baptiste Poquelin merely a front for Pierre Corneille? Skeptics — from Pierre Louÿs to a few isolated academics — have sustained this doubt, relying on the absence of autograph manuscripts and on the stylistic similarities between the two playwrights.

In 2019, two French researchers, Florian Cafiero and Jean-Baptiste Camps, decided to put an end to the debate by submitting the plays to the rigor of stylometry. Using thirty-seven of Molière’s works and a large corpus of contemporary texts, they applied six analytical methods: measuring function words, grammatical structures, and syntactic sequences. The verdict was clear: Molière’s works form a coherent body, distinct from Corneille’s.

In other words, the verses of Tartuffe and The Misanthrope indeed bear the hand of Molière. The algorithm does not replace aesthetic judgment; it extends it. The machine does not define what style is, but it shows that an author's style exists as something measurable and distinct from any other hand.
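How can a comparison between profiles yield a verdict? One classic measure is Burrows's Delta, which standardizes each word frequency across the corpus and averages the absolute differences between two texts. The sketch below is only an illustration of that general technique, not a reconstruction of the six methods used in the 2019 study; the author labels and frequency values are invented for the example.

```python
from statistics import mean, pstdev

def zscores(profiles):
    """Standardize each feature across all profiles (name -> list of frequencies)."""
    names = list(profiles)
    n_features = len(next(iter(profiles.values())))
    columns = [[profiles[name][i] for name in names] for i in range(n_features)]
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # guard against constant features
    return {
        name: [(profiles[name][i] - mus[i]) / sigmas[i] for i in range(n_features)]
        for name in names
    }

def burrows_delta(z_a, z_b):
    """Mean absolute difference between two standardized profiles."""
    return mean(abs(a - b) for a, b in zip(z_a, z_b))

# Hypothetical frequencies (per 1,000 words) of three function words.
profiles = {
    "author_A": [62.0, 31.0, 28.0],
    "author_B": [48.0, 40.0, 19.0],
    "disputed": [60.0, 32.0, 27.0],
}

z = zscores(profiles)
distances = {
    name: burrows_delta(z["disputed"], z[name]) for name in ("author_A", "author_B")
}
closest = min(distances, key=distances.get)  # candidate with the smallest Delta
```

A small Delta means two texts use function words in similar proportions. It suggests a common hand without proving it, which is why studies typically cross-check several independent measures.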

A Science Between Literature and the Courtroom

If the Molière affair amuses scholars, stylometry has now left the libraries to enter the courtrooms. The analysis of anonymous messages, threatening letters, tweets, or suspicious blogs has become part of the new forensic toolkit.

Building on forensic linguistics, these methods can reveal recurring writing patterns and establish links between a known author and an anonymous text. The stylistic signature, much like a fingerprint, becomes evidence. After the debate on Molière’s authorship, stylometry thus leapt from academic journals to the heart of criminal investigations. This shift is far from anecdotal: it shows how a science of nuance and trace has become an instrument of judicial truth.

In the years that followed, the researchers who had resolved the Molière–Corneille dispute, Florian Cafiero and Jean-Baptiste Camps, saw their work cited far beyond the literary world. Their book Affaires de style (Seuil, 2023) retraces this unexpected migration: how algorithms designed to distinguish two playwrights of the Grand Siècle are now used to unmask a modern “crow” (corbeau), the French term for the anonymous author of poison-pen letters in criminal cases.

The Grégory Affair, one of the most scrutinized French tragedies of the 20th century, provides the most striking example. Since October 16, 1984 — the day little Grégory Villemin’s body was found in the Vologne river — investigators have faced a mystery: who wrote the dozens of threatening and insulting letters sent to the family? Saturated with resentment and bitterness, these texts have long defied graphologists, linguistic experts, and judges alike.

Forty years later, stylometry reopened the case. At the request of magistrates, a Swiss laboratory specializing in statistical language analysis compared the anonymous letters with personal writings of several key figures. In 2025, the results led to the indictment of Jacqueline Jacob, the child’s great-aunt, based on strong stylistic similarities between her writing patterns and those of the “crow.” Experts noted repeated matches in the choice of function words, punctuation, sentence rhythm, and syntactic details deemed impossible to imitate consciously.
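Features such as punctuation and sentence rhythm can also be quantified. The following Python sketch computes a few of them; it is a simplified illustration, not the method used by the laboratory in this case, and the function name `rhythm_and_punctuation` and the sample text are invented for the example.

```python
import re
from statistics import mean

def rhythm_and_punctuation(text):
    """Compute a few crude rhythm and punctuation features for `text`."""
    # Naive sentence split on ., ! and ? (an assumption; abbreviations would break it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]  # words per sentence
    n_words = sum(lengths) or 1  # avoid division by zero on empty input
    return {
        "mean_sentence_length": mean(lengths) if lengths else 0.0,
        "commas_per_100_words": 100 * text.count(",") / n_words,
        "exclamations_per_100_words": 100 * text.count("!") / n_words,
    }

# Neutral, invented sample text.
features = rhythm_and_punctuation(
    "It rained. We stayed inside, read, and waited! Then it stopped."
)
```

Taken alone, each of these numbers is weak evidence; their interest lies in how consistently they line up across many texts by the same writer.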

“This indictment is an important step forward,” said Me Marie-Christine Chastant-Morand, the lawyer for Grégory’s parents. “A step forward for two reasons: first, because it shows that justice has not given up; and second, because it demonstrates that justice is giving itself every possible means to continue.”

The Ramadan Case: Stylometry at the Heart of a Double Deception

In the Swiss and French cases against Tariq Ramadan, stylometry brought an unexpected twist. Commissioned by the defense, American forensic linguist Dr. Carole E. Chaski used her patented SynAID method to analyze several series of messages attributed to two of his accusers, one in Switzerland and the other in France. Her expert report, based on the study of thousands of syntactic structures, concluded that the disputed messages were written by the two complainants themselves, with a reliability rate exceeding 97%.

“Stylometry is a scientific technique based on spacing, punctuation, and syntax to attribute a text to a given author,” explained attorney Pascal-Pierre Garbarini. “This technique was not as advanced seven years ago. Today, it’s different: it’s computer-assisted. We called on world-renowned experts. They analyzed the case files, and their results confirm that the accuser known in France as Christelle had planned in advance to harm Tariq Ramadan. This analysis completely dismantles her version of events.”

The expert report also revealed that both women had discussed the idea of “trapping” Tariq Ramadan before any meeting: one during her first contact with him, the other six days before their only encounter. Both denied writing these messages, though they admitted using the pseudonyms involved. Stylometry, by identifying the invisible author behind the words, thus becomes a central player in a case where proof lies not only in facts but in syntax itself.

On this basis, Ramadan’s Swiss lawyers filed a request for a retrial. The theologian had been sentenced on appeal in Geneva in September 2024 to three years in prison, one of which was to be served. The conviction became final in August 2025, and he has since appealed to the European Court of Human Rights (ECHR). Stylometric evidence thus corroborates the initial conclusions of the criminal investigation, conclusions that had not been considered in France, according to a journalist close to the case.

Truth at the Mercy of Words

From Molière to Grégory, from Grégory to Ramadan, stylometry draws a single thread: a science of language capable of revealing what words conceal. In literary disputes, it restored an author’s voice; in criminal dramas, it gave voice to anonymous writings; in contemporary trials, it questions the sincerity of accusations themselves.

As columnist Alain Bauer wrote in Marianne: “Technological advances are revolutionizing criminal investigation methods. From stylometry to artificial intelligence, via nanometric sensors, the tools of forensic science are becoming ever more precise and sophisticated.”

This measurable fingerprint reshapes our understanding of evidence. It reminds us that, in an age saturated with discourse and digital traces, writing itself becomes a witness, sometimes more reliable than memory or testimony. By crossing the boundaries between literature, investigation, and justice, stylometry no longer merely illuminates the past: it redefines how truth is built. And whether it concerns a playwright of the Grand Siècle or a defendant of the 21st century, it always poses the same question: who is speaking, and how far can we prove it?