A Searle's Chinese Room of One's Own  

March 11, 2015

As a technical writer at GrammaTech, most of what I do is essentially translation: translation between the developer-language and user-language. On the face of it, this seems like the subtlest translation situation imaginable. GrammaTech developers speak English, the manual is in English, and CodeSonar users are themselves developers. What is there to translate?

Here, translation is not a linguistic transformation but a context switch. Users of CodeSonar will certainly have a different perspective on it than the people who are creating it. Even the scaling is different: a developer-month of effort may translate to a single sentence in the release notes saying that the analysis now runs faster, while an afternoon's worth of tweaks to the CodeSonar GUI may entail changes to dozens of places in the manual. Part of my job is extracting the information users want and need to know while quietly leaving out the rest, because one of the fundamental tragic realities of user documentation is that nobody cares how amazing the underlying algorithms are. Tell it to POPL.

"Actual" Translation

In the last few years, we have introduced translation in a more traditional sense: parts of the CodeSonar manual are now available in Japanese.

Except: I don't know Japanese.

Luckily, our wonderful Japanese distributors do. The result is a division of labor — they provide translations, and I integrate the translated material into the broader documentation ecosystem.

Some of the issues that arise in this integration are domain-independent: How do you sanity-check something you can't understand? (Google Translate is sufficient to confirm that the sentence you have is the sentence you think you have.) How do you make Emacs cooperate with Japanese input? (By placating it with ritual key combinations. I know. I was as shocked as you are.) How do you avoid being distracted by intriguing translation minutiae? (Just go ahead and be intrigued. If you weren't the kind of person who would find it interesting that the translation for "security" is セキュリティ [sekyuriti], which looks distinctly like a loanword, you wouldn't have gotten yourself into this situation in the first place.)

Other issues have been specific to our documentation infrastructure. For instance, the majority of the translation effort so far has been on the documentation for the CodeSonar warning classes; however, significant pieces of the warning class documentation are generated automatically from the same information sources used by the CodeSonar implementation. To allow for Japanese-language versions of these pages, I had to amend the relevant parts of the documentation generation system to parameterize the output language, drawing on a collection of translated texts provided by our distributors.

The process works like this:

[Figure: CodeSonar documentation build system]
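
To make the idea of parameterizing the output language concrete, here is a minimal Python sketch. It is not the actual CodeSonar build system: the names `WARNING_CLASSES`, `TEXTS`, `text`, and `render_warning_page` are hypothetical, as is the fallback to English when a translation is missing. The only real material is the pair of Japanese strings, which are the ones quoted in this post.

```python
# Hypothetical sketch: none of these names come from the real build system.

# Language-neutral facts about each warning class, standing in for the
# information sources shared with the implementation.
WARNING_CLASSES = {
    "double_free": {"binary_support": "full"},
}

# Translated text fragments, keyed by language and then by message id.
# The Japanese strings are the two quoted in this post.
TEXTS = {
    "en": {
        "name.double_free": "Double Free",
        "binary_support.full": "This class is fully supported by the machine code analysis.",
    },
    "ja": {
        "name.double_free": "二重解放",
        "binary_support.full": "このクラスはマシンコード解析で完全にサポートされています。",
    },
}


def text(lang, key):
    """Look up a fragment in the requested language, falling back to English."""
    return TEXTS.get(lang, {}).get(key, TEXTS["en"][key])


def render_warning_page(class_id, lang="en"):
    """Generate the documentation text for one warning class in one language."""
    facts = WARNING_CLASSES[class_id]
    title = text(lang, f"name.{class_id}")
    support = text(lang, f"binary_support.{facts['binary_support']}")
    return f"{title}\n\n{support}\n"


print(render_warning_page("double_free", lang="ja"))
```

The point of the design is that the generator only ever handles message ids and facts; the language is just one more argument, so the code never needs to understand the strings it emits.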

So what I have now is a small system that can be interrogated by speakers of a language I don't speak and provide answering information in that language. Which is to say, I have something a lot like the Chinese Room described by John Searle in his 1980 paper "Minds, brains, and programs". Except that the language in question is not Chinese, and there is no book or room, and the whole thing is not imaginary.

It would be great if I could say that having my own little question-answerer has changed my perspective on Searle's thought experiment, but I really can't. I'm pretty proud that a Japanese speaker who wonders whether CodeSonar's binary analysis engine can issue a 二重解放 (Double Free) warning can determine that このクラスはマシンコード解析で完全にサポートされています。 (this class is fully supported by the machine code analysis), but I definitely don't think it constitutes evidence that any combination of me, my software, and my hardware "knows" Japanese.

The corollary, though, is that the generated English versions of the pages also don't constitute evidence that I know English. Ouch.