Blog Post

Kurt Wallander and the Case of the Text-Encoding Gremlins

Graphics by Michelle Jia. Images: Flickr (I, II)

I recently picked up the first of Henning Mankell’s Wallander novels, Faceless Killers. I loved the Kenneth Branagh TV adaptations of these mysteries but had been saving up the pleasure of the novels themselves. I have the special talent of forgetting the resolution of pretty much any mystery I read, so I have no trouble enjoying the suspense a second time around. What I do not enjoy, however, is reading an excellent novel in a poor text.

Here is a sample passage, from Wallander’s first encounter with the prosecutor he falls for, Anette Brolin:

She shook off the question brusquely. “I don’t really know yet. Stockholmers no doubt have a hard time getting used to the leisurely pace of SkŒne.”

He could see that despite her youth she did have professional experience.

“We have to take a look at Lšvgren’s bank statements,” he said.

Henning Mankell, Faceless Killers, trans. Steven T. Murray (New York: New Press, 1991, rpt. Vintage Crime/Black Lizard/Vintage/Random House, 2003), 89.

I’ve given extra publication information to be clearer about just which text I’m discussing. Needless to say, the “leisurely,” somewhat provincial region in which the novel is set is not called “SkŒne” but Skåne; the victim of the crime, a salt-of-the-earth farmer (with a secret—of course), does not have the impossible name “Lšvgren” but “Lövgren.” These typographical aberrations take on a particular irony in a novel whose central theme is the unease with which Sweden welcomes migrants in the immediate post-Cold-War era. The farmer from Skåne is meant, I think, to seem at first like an ur-Scandinavian victim, and the major plot revolves around the possibility that “foreigners”—perhaps inhabitants of a nearby refugee camp—are responsible. Characters keep remarking on the threatening novelty of this outburst of extreme violence in the countryside.

Of course things turn out to be more complicated than that. Furthermore, Mankell carefully balances the moral ledger by introducing a secondary crime, the cold-blooded murder of a Somali refugee by racist nationalists. The typographical aberration reappears, all too appropriately, at the climax of this secondary investigation. I am about to quote from late in the novel, but it seems silly to say “spoiler alert.” Mankell does not do much to offer you suspects ahead of the fact. Since I am terrible at picking up clues I may have missed something, but I am also fairly sure he is uninterested in the game of giving you material to figure out the mystery before Wallander does. All the real interest comes from Wallander’s experience of the investigation. One’s own pleasure in following the experience is rather interrupted when one sees lines like this on the page:

According to the officer, the flat was occupied by a man named Valfrid Stršm. He wasn’t listed in any police files…

The door was opened by a woman wearing a dressing gown. Wallander recognised her. It was the same woman who had been asleep in the double bed. He hid his revolver behind his back.

“We’re with the police,” he said. “We’re looking for your husband, Valfrid Stršm.” (203)

For just a moment I wondered whether this was the twist: the sinister neo-Nazi was himself a “foreigner,” some kind of south Slav (Croatian? Slovak?) with a vocalic r in his name. But no: on the next page things are as we expect, and the villain is referred to as “Valfrid Ström” (231), another countryside name.

These are text encoding errors. After some detective work (see below for the R code), I am fairly certain that at some point in the preparation of this edition, a text prepared on a Macintosh was then edited or typeset on a Windows PC. In particular, the errors are all consistent with text encoded in “Mac-Roman” being reinterpreted as “Windows-1252.” The former was the default text encoding on Macs before the introduction of Mac OS X, and a natural choice for anyone who works with European languages; I remember the fun of being able to produce lots of letters with diacritical marks on our family’s first Mac in the early 1990s. The latter was and in some cases remains the default text encoding on Windows.

Most computer users are familiar with these errors. Computers store all their data as numbers, which means the computer has to pick a scheme for mapping numbers to letters when it represents text. Glitches arise when the computer picks the wrong “encoding” scheme. If the original scheme represented <ö> using the number 154, but the new scheme uses 154 to represent <š>, then <Ström> will become <Stršm>. Glitches are more common with glyphs beyond the unaccented Latin alphabet: in the Anglocentric world of computation, most encoding schemes agree about encoding the unaccented alphabet, and variation begins where English ends. (This situation has greatly improved with the spread of Unicode, but the underlying problem is fundamental. The popularity of emoji is creating new versions of it.)
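To make the number-to-letter point concrete, here is a minimal sketch in R using the stringi package. It takes the single byte 154 and reads it under each of the two schemes. The encoding names "macintosh" and "windows-1252" are the ICU names that stringi reports in the appendix; the variable names are mine, chosen for illustration.

```r
library(stringi)

# One byte, value 154 (hex 9A), with no encoding attached to it yet.
byte <- as.raw(154)

# Read the byte as Mac-Roman (ICU name "macintosh").
as_mac <- stri_encode(byte, from = "macintosh", to = "UTF-8")

# Read the very same byte as Windows-1252.
as_win <- stri_encode(byte, from = "windows-1252", to = "UTF-8")

# Same number, two different letters.
print(c(mac = as_mac, windows = as_win))
```

The byte itself never changes; only the table used to look it up does.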

In the Vintage Faceless Killers, the errors have a curious distribution: they are not consistent throughout the book (that really would have ruined the whole text), but when they appear they are uniform across the whole page. This suggests to me that the errors arose at the typesetting stage: was the encoding glitch noticed and then fixed page-by-page, but with some pages accidentally skipped? But why wouldn’t a document-wide search and replace do the job? The translator couldn’t possibly let such things go by, and I can’t imagine any copyeditor missing these, regardless of their knowledge of Swedish. (I know about three words of Swedish, but it’s easy to guess which of “Skåne” and “SkŒne” is the correct form.)

Or was there a copyeditor at all? Not that one expects copyediting in a $15 Vintage trade paperback, of course, or for that matter in a reprint of a title first issued by the New Press, founded by André Schiffrin to save publishing from rampant commercialism.[1] Sarcasm aside, the glitched text contrasts notably with the air of authority conveyed by the format and price: this “Vintage Crime/Black Lizard” title is the same size as the many modern classics published under the Vintage imprint, and its cover design parallels that of my Vintage edition of The Sound and the Fury. Some light is thrown on the situation by an interesting 2009 blog post in which Steven T. Murray, the credited translator, discusses his process (which turns out to be a collaboration between Murray and his wife, Tiina Nunnally):

We are proud that our translations at Fjord Press were remarkably error-free, compared to most books today, now that publishers are cutting back on copy editing, or eliminating that step altogether….

While each of the 3 Mankell novels I did was supposed to be due in 3 months, I recall that we cranked out one of them—I don’t recall which—in about 4 weeks because the “advance” was 2 months late! That’s the publishing business for you.

Murray, Nuts and Bolts of Translation (1).

It is not difficult to imagine errors like those in Faceless Killers appearing in a rapid and curtailed production process like the one alluded to here. I underline once more that it is very difficult to believe the text-encoding errors would have escaped the notice of the translators, and easy to envision the text being corrupted after the novel was out of their hands. If I had to guess, I might even say that the pattern of errors hints that the final production of this novel might, like so much of the production chain of contemporary publishing, have been carried out in the global South, perhaps by workers who read neither English nor Swedish.

It is sometimes said about popular genre fictions like Faceless Killers that though they may be translated and circulated widely, they do not qualify for consideration as an authentic world literature. In the last chapter of his Ecology of World Literature, Alex Beecroft, drawing on Vittorio Coletti’s arguments, takes this position:

Successful international crime novels use their cities (whether New York or Stockholm) merely as noir-ish backdrops, rarely engaging with their immediate political or cultural contexts in the ways that national-era detective fiction, from Agatha Christie to Georges Simenon to Dashiell Hammet [sic], manage to do. The contemporary crime novel…is a commodity packaged for export, nearly mass-produced and indistinguishable from its counterparts produced in other nations.

Beecroft, An Ecology of World Literature: From Antiquity to the Present Day (London: Verso, 2015), 282–83.

It strikes me that this highbrow dismissal of mass-produced fiction was also visited on Christie and Simenon, two of the most astonishingly prolific writers of the last century, in their own time. I also find it hard to moralize the distinction between commodities for domestic circulation and commodities packaged for export; surely every book in the global literary system is a commodity exchanged in some market or other, and there is nothing inherently superior about producing for a domestic market. Nor will it really work to deduce the political and cultural “engagement” of any text from its setting and themes; for a maximally prestigious counterexample, think of Beckett, with his contextless, pretranslated texts.

I am taking advantage of the text-encoding glitch to suggest that we can formulate a broader inquiry into globalized communications circuits that is not restricted to prestige texts a priori. (Such an inquiry would, I think, actually fit well with the broader arguments of Beecroft’s book, which shows how much can be gained from taking the comparative view of circulation as well as textual themes and forms.) In any case, one could hardly ask for a more particularized setting than the Skåne of Faceless Killers—my book includes a map of the peninsula, with all the places mentioned in the novel marked—or a more pressing political theme than its question of refugees and migrants. But the text-encoding errors in my Vintage paperback have something to say about the particular way one commercial channel for transnational genre fiction transmits such material. They testify to speedup and globalization, not at the point of authorship, but in the channels of book production. Like all glitches, they alert us to the functioning of a particular technological pipeline as well as the customs that govern its use. They indicate a familiar American indifference to other languages even in the exceptional context (for the Anglosphere) of a commercially successful translated book. But if “SkŒne” is a little obscŒne, that phenomenon is by no means limited to commercial genre fictions, any more than encoding problems created by shifting from Macs to Windows are.

Appendix: Forensic method

The substitutions noted above were <š> for <ö> and <Œ> for <å>. Elsewhere I also noticed <Š> for <ä> (“NŠslund,” 250). So we wish to find a pair of text encodings E1, E2 such that encoding <öåä> as E1 and then decoding the result as E2 yields <šŒŠ>. The stringi package makes a brute-force search of possibilities easy to implement in R.

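knitr::opts_chunk$set(cache=TRUE, autodep=TRUE)
library(stringi)  # stri_encode, stri_enc_list, stri_enc_info
library(dplyr)    # %>%, filter, mutate, distinct
library(purrr)    # map, map_chr, map2_lgl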

First we need a function for the encode-decode. We start with the input, and we examine the output, in R’s native UTF-8 encoding.

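# Write x out in the encoding `from`, then misread the bytes as encoding `to`.
mangle <- function (x, from, to) tryCatch({
        # the bytes a program using `from` would have stored for x
        bytes <- stri_encode(x, to=from, to_raw=TRUE)[[1]]
        # the text a program assuming `to` would show for those bytes
        stri_encode(bytes, from=to)
    },
    # any failed or doubtful conversion counts as "no result"
    error=function (e) NA,
    warning=function (w) NA
)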

The tryCatch is necessary because stri_encode gives a warning if we try to convert a raw value that is not actually part of the target encoding; other glitches are also possible, all of which we’ll ignore.

Now given a pair of encodings, this is the test to see whether they are the culprit in Faceless Killers:

guilty <- function (x, y) mangle("öåä", x, y) == "šŒŠ"
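Before running the full search, the test can be tried on a handful of hand-picked suspects. The following is a scaled-down sketch, not the search itself: the four encoding names are my own choice for illustration, the functions are restated so the block runs on its own, and I have added an explicit to="UTF-8" and an isTRUE() wrapper (neither is in the original) so that the result does not depend on the session's default encoding and failed conversions simply count as innocent.

```r
library(stringi)

# Restated from the text, with explicit UTF-8 output (an added assumption).
mangle <- function(x, from, to) tryCatch({
    bytes <- stri_encode(x, to = from, to_raw = TRUE)[[1]]
    stri_encode(bytes, from = to, to = "UTF-8")
  },
  error = function(e) NA_character_,
  warning = function(w) NA_character_
)

# The \u escapes spell out the original and the garbled letters,
# which keeps the script itself pure ASCII.
guilty <- function(x, y) {
  isTRUE(mangle("\u00f6\u00e5\u00e4", x, y) == "\u0161\u0152\u0160")
}

# Hand-picked suspects for illustration only, not the full list.
suspects <- c("macintosh", "windows-1252", "ISO-8859-1", "UTF-8")

# Every ordered pair of suspects, tested one by one.
pairs <- expand.grid(from = suspects, to = suspects, stringsAsFactors = FALSE)
pairs$guilty <- unname(mapply(guilty, pairs$from, pairs$to))

print(pairs[pairs$guilty, c("from", "to")])
```

Of these sixteen pairs, only macintosh read as windows-1252 reproduces the errors in the book.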

A list of all available encodings known to stringi is found with

encs <- stri_enc_list(simplify=TRUE)

The rest is brute force (a minute or two to try the million or so possibilities):

culprits <- expand.grid(from=encs, to=encs, stringsAsFactors=FALSE) %>%
    filter(map2_lgl(from, to, guilty))

(map2_lgl vectorizes guilty over its two arguments.) Though this yields 120 pairs of encoding names, they all turn out to be synonyms or near-synonyms (i.e. largely overlapping encoding schemes) for "macroman" and "cp1252", the 1990s-era Mac and Windows default sets, respectively, which we can see by using stri_enc_info:

culprits %>%
    mutate(from=map(from, stri_enc_info),
           to=map(to, stri_enc_info)) %>%
    mutate(from=map_chr(from, "Name.friendly"),
           to=map_chr(to, "Name.friendly")) %>%
    distinct() %>%
    knitr::kable(format="html")
from             to
macintosh        windows-1252
x-mac-turkish    windows-1252
macintosh        windows-1254
x-mac-turkish    windows-1254
macintosh        ibm-1252_P100-2000
x-mac-turkish    ibm-1252_P100-2000
macintosh        ibm-1254_P100-1995
x-mac-turkish    ibm-1254_P100-1995
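Once the culprits are known, the damage can in principle be undone by running the mix-up in reverse: write the garbled text out as Windows-1252 bytes, then read those bytes as Mac-Roman. The sketch below is my own illustration rather than anything a typesetter is known to have used; the function name unmangle is hypothetical, and the sample words are the ones quoted from the novel above.

```r
library(stringi)

# Reverse the mix-up: garbled text -> Windows-1252 bytes -> read as Mac-Roman.
unmangle <- function(x) {
  vapply(x, function(s) {
    bytes <- stri_encode(s, to = "windows-1252", to_raw = TRUE)[[1]]
    stri_encode(bytes, from = "macintosh", to = "UTF-8")
  }, character(1), USE.NAMES = FALSE)
}

# Garbled names as printed in the paperback, written with \u escapes:
# Strsm, SkOEne, Lsvgren, NSslund with their stray caron and ligature letters.
garbled <- c("Str\u0161m", "Sk\u0152ne", "L\u0161vgren", "N\u0160slund")

print(unmangle(garbled))
```

A repair like this is only safe on text that really was garbled. Applied to a page whose accented letters are already correct, it would corrupt them in turn, which may be one reason a page-by-page fix could look more attractive than a document-wide one.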

Now that I’ve solved this mystery, I can finally get some rest.

Cross-posted on andrewgoldstone.com.

  • 1. I haven’t consulted the New Press edition to check whether the texts are the same (I suspect they are). Strangely, the text follows British spelling conventions (e.g. “recognised” in the passage on 203), but I can’t guess why, since the translator is American and the first English-language edition appears to be the New Press (New York), judging by WorldCat listings, with the Harvill (London) edition following later.

Andrew Goldstone is an Assistant Professor of English at Rutgers University, New Brunswick. His book, Fictions of Autonomy: Modernism from Wilde to de Man, is published by Oxford University Press. He specializes in twentieth-century literature in English, with interests in modernist and non-modernist writing, literary theory, the sociology of literature, and the digital humanities.