The Trojan Horse

Bruce Perens <bruce@hams.com>

There's a problem that could very badly affect the public perception of Linux and Open Source. I want people to think about this, and hopefully "head it off at the pass" before it happens.

Perhaps it's already on your system today: a trojan-horse program. It might be a game, or more likely a system utility. Its author uploaded it to an FTP archive, where it was then picked up by your favorite Linux distribution, which wrote it onto the CD-ROM that you bought. It works just fine, but hidden away in the program is a special feature: a secret back-door past your system's security.

Perhaps the author of this attack is tired of hearing about what great hackers we are, and wants to take us down a notch. He's patient - he will wait until his program is distributed to tens of thousands of Linux systems before he says a word. But speak up he will - he's not really interested in breaking into your system. What he wants is the publicity, bad publicity for us, and lots of it. We've left the gates open for this trojan horse. Let's talk about how to close them, and hope we have enough time to solve this problem before our reputation is hurt.

The Open Source model has a lot of advantages, and one disadvantage we'd better work on. Do you really know where your software came from? Many Linux distributions have taken some precautions, but none of them can say they've solved the problem yet.

We're confident that we can deal with the problem because our source code is all publicized, and so many people read the code. But are we really secure? Are we sure we're reading every line of every utility, every game, every application, and that we're catching every back-door that's been planted in that software? What about those patch files? It's so helpful when someone else fixes a program and submits a patch to the program's author, but can something sinister be hiding in the patch?

Let's use the Debian GNU/Linux distribution as an example of how to start solving this problem. Although Debian developers are volunteers from all over the world, and most of them have never met, every Debian developer has been identified in some way before being given access to Debian's upload area. Either another developer has vouched for them, or they've had someone look at their identification, or a Debian developer has verified that they can be reached at a given address or a listed phone number. Almost all of those developers are set up to provide public-key signatures of their uploads using GNU Privacy Guard, and the upload-processing software will delete their submissions if they are signed with the wrong key. So, generally, the Debian folks can say yes, we do know who packaged your software. Unfortunately, they don't necessarily know who wrote the software, or who has submitted patches and otherwise modified it.

This isn't only true for Debian - you can say the same thing for Red Hat. Although they digitally sign every RPM they produce, Red Hat doesn't necessarily have reliable identification of the person who uploaded the original software. For example, my Electric Fence package is in Red Hat and most other Linux distributions, but they put it on their CD before I met anyone at Red Hat, and I have not digitally signed the upload or checked the version that is on their CDs. If someone had inserted a pernicious change into my upload before Red Hat downloaded it, I wouldn't know. The situation is similar for all other Linux distributions.

Take a look at the submission policy for SunSite.UNC.edu, one of the most popular Linux software archives, where software is often picked up for packaging by Linux distributions. No security checks there. Next, look at FreshMeat, a popular online software catalog scanned by Linux distributions looking for new material. FreshMeat does one security check before you can add to their catalog - they verify that you actually do get their mail at the e-mail address you supply. That's easy to defeat using any of the free web-mail services.

If you're lucky, the person who created the software you are using has also posted a cryptographic checksum file with a digital signature, which can be used to verify that the files you download actually do contain the data that he uploaded. If he has a properly-signed public key, then you have a good idea of who he is. If you're really lucky, he has carefully reviewed all patches submitted to him, he's identified the submitters of the patches, and he keeps a permanent archive of those patches for future reference. These are things that we need everyone to do.

So, here's how you can help:

Expand the Web of Trust. Contact your local Linux Users Group, and have a knowledgeable person instruct members in public-key cryptography and run a public key cross-signing session regularly at their meetings. Every Linux or Open Source convention, where developers are likely to get together, should host some form of public-key certification, and should announce that well in advance so that developers bring the proper materials to the show. Conventions should also run tutorials on how to use GNU Privacy Guard and how to properly handle and sign cryptographic keys. Through key-signings, developers can join the web of trust, which is a method of identifying people whom you haven't met by checking their public keys for cryptographic signatures made by people you have met. Then, we'll have a way to identify our developers.
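To make the idea concrete, here is a minimal sketch in Python of how a chain of signatures can connect you to someone you haven't met. It is only an illustration: the key names, the `signatures` table, and the `max_depth` limit are all made up for this example, and real tools such as GNU Privacy Guard apply more careful trust rules than a simple path search.

```python
from collections import deque


def trust_path(signatures, my_key, target_key, max_depth=3):
    """Return a chain of keys linking my_key to target_key, or None.

    `signatures` maps a signer's key ID to the set of key IDs that
    signer has certified. Each step in the returned chain means
    "this key signed the next one". `max_depth` (an assumption of
    this sketch) limits how many signatures the chain may contain.
    """
    queue = deque([[my_key]])
    seen = {my_key}
    while queue:
        path = queue.popleft()
        if path[-1] == target_key:
            return path
        if len(path) - 1 >= max_depth:
            continue  # chain already uses the maximum number of signatures
        for signed in sorted(signatures.get(path[-1], ())):
            if signed not in seen:
                seen.add(signed)
                queue.append(path + [signed])
    return None


# Hypothetical key IDs, for illustration only.
signatures = {
    "ME": {"ALICE"},          # I met Alice and signed her key
    "ALICE": {"BOB"},         # Alice met Bob
    "BOB": {"CAROL"},         # Bob met Carol
    "MALLORY": {"MALLORY"},   # self-signed only; nobody vouches for Mallory
}

print(trust_path(signatures, "ME", "CAROL"))
print(trust_path(signatures, "ME", "MALLORY"))
```

A key that nobody in your chain has signed, like the hypothetical MALLORY above, yields no path at all. That is exactly the situation with an anonymous uploader today.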

Reliably Identify Uploaded Files. Uploaded files should come with a list of cryptographic checksums for those files. That list should be cryptographically signed. Only then can you be sure that the file you are downloading is the file that the developer uploaded. The original uploaded files and their checksums should be preserved in source packages, so that a user can verify the integrity of the files in their Linux distribution, and trace them all the way back to the original developer.
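As a rough sketch of the checking side, the Python below compares downloaded files against a checksum list. The list format (one "digest, whitespace, file name" entry per line), the choice of SHA-256, and the function names are assumptions made for this example. It does not check the signature on the list itself; that has to be done first with a tool such as GNU Privacy Guard (for example `gpg --verify`), because an unsigned checksum list proves nothing about who wrote it.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_list(text):
    """Parse lines of the assumed form '<hex digest>  <file name>'."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        digest, name = line.split(None, 1)
        # Some checksum tools prefix the name with '*' for binary mode.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_files(checksum_text, directory):
    """Return {file name: True or False} for every entry in the list.

    A missing file counts as a failure, the same as a wrong digest.
    """
    directory = Path(directory)
    results = {}
    for name, digest in parse_checksum_list(checksum_text).items():
        target = directory / name
        results[name] = target.is_file() and sha256_of(target) == digest
    return results
```

A packager could run `verify_files` over the developer's signed list before building, and refuse to package anything that comes back False.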

Carefully Review Patches, and Identify Their Submitters. Don't assume that every patch file is innocent. Review the contents of each one carefully. Ask the people who send you patches to use a cryptographic signature in their email, so that you can verify who they are.

Keep More Than Just a Change Log. Keep copies of the emails containing patches for later reference. Keep your software in RCS or CVS, so that you have automatic identification of the date and circumstances of every change. If a trojan horse does make it into your software, you can identify when the change was inserted, and by whom.
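One simple way to keep those emails is sketched below in Python. The archive layout, the `index.jsonl` file name, and the function name are inventions for this example, not an established convention. Each message is stored unchanged under the name of its own digest, and a one-line record of who sent it and when is appended to an index.

```python
import email
import hashlib
import json
from pathlib import Path


def archive_patch_email(raw_message, archive_dir):
    """Store one patch email unchanged and append a record to index.jsonl.

    `raw_message` is the complete message as bytes, headers included.
    The file layout and index name are assumptions of this sketch.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Name the stored copy after its digest, so later tampering with
    # the archived file is detectable by recomputing the digest.
    digest = hashlib.sha256(raw_message).hexdigest()
    (archive_dir / (digest + ".eml")).write_bytes(raw_message)

    headers = email.message_from_bytes(raw_message)
    record = {
        "sha256": digest,
        "from": str(headers.get("From", "")),
        "date": str(headers.get("Date", "")),
        "subject": str(headers.get("Subject", "")),
    }
    with open(archive_dir / "index.jsonl", "a", encoding="utf-8") as index:
        index.write(json.dumps(record) + "\n")
    return record
```

Keep in mind that the From header is only a claim. It is the cryptographic signature on the message, checked separately, that ties the patch to a person.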

Hope This is Soon Enough. It would be just terrible if a widespread, deliberately-inserted trojan horse in Linux were revealed. Such a thing could be used in the press to discredit the Open Source paradigm that made it possible. So let's get to work today to ensure this won't happen.