Date: Wed, 17 Jan 2001 16:16:59 -0500 From: jose nazario <jose@SPAM.THEGEEKEMPIRE.NET> Subject: Crimelabs Paper: Passive System Fingerprinting using Network To: BUGTRAQ@SECURITYFOCUS.COM in keeping with the past few weeks and months in which some really great papers have appeared on BUGTRAQ, i offer you a low-jack approach to passive network analysis. i mentioned this earlier today on PEN-TEST, and so far feedback has been positive and welcome. better formatted, PS and PDF versions can be found on our website: http://www.crimelabs.net/docs/passive.html in this ASCII version i had to drop a large table due to the difficulty in fitting it in 75 columns. please see the PS or PDF versions for this table. this is a work in progress, so any suggestions would be welcome. this was originally submitted to Summercon '01 in Amsterdam, but was not accepted. perhaps i'll shop it around after some further tweaking. the paper follows my .sig, ____________________ jose nazario jose@thegeekempire.net Passive System Fingerprinting using Network Client Applications Jose Nazario crimelabs research jose@crimelabs.net 27 November, 2000 Abstract Passive target fingerprinting involves the utilization of network traffic between two hosts by a third system to identify the types of systems being used. Because no data is sent to either system by the monitoring party, detection approaches the impossible. Methods which rely solely on the IP options present in normal traffic are limited in the accuracy about the targets. Further inspection is also needed to determine avenues of vulnerability, as well. We describe a method to rapidly identify target operating systems and version, as well as vectors of attack, based on data sent by client applications. While simplistic, it is robust. The accuracy of this method is also quite high in most cases. Four methods of fingerprinting a system are presented, with sample data provided. Introduction Passive OS mapping has become a new area of research in both white hat and black hat arenas. For the white hat, it becomes a new method to map their network and monitor traffic for security. For example, a new and possibly subversive host can be identified quickly, often with great accuracy. For the black hat, this method provides a nearly undetectable method to map a network, finding vulnerable hosts. To be sure, passive mapping can be a time consuming process. Even with automated tools like Siphon<1> a sufficient quantity packets to arrive to build up a statistically significant reading of the subjects' operating systems. Compare this to active OS fingerprinting methods, using tools like nmap<2> and queso<3>, which can operate in under a minute usually, and only more determined attackers, or curious types, will be attracted to this method. Current Methods and Research Two major methods of operating system fingerprinting exist in varying degrees of use, active and passive. Active scanning involves the use of IP packets sent to the host and the scanner then monitoring the replies to guess the operating systems. Passive scanning, in contrast, allows the scanning party to obtain information in the absence of any packets sent from the listening system to the targets. Each method has their advantages, and their limitations. Active Scanning By now nearly everyone is familiar with active scanning methods. The premier port scanning tool, nmap, has been equipped for some time now with accurate active scanning measures. This code is based off of an earlier tool, queso, from the group The Apostols. Nmap's author, Fyodor, has written an excellent paper on this topic in the e-zine Phrack (issue 54 article 9)<4>. Ofir Arkin has been using ICMP bit handling to differentiate between certain types of operating systems<5>. Because ICMP usually slips below the threshold of analysis, and most of the ICMP messages used are legitimate, the detection of this scanning can be more difficult than, say, queso or nmap fingerprinting. The problems with active scanning are mainly twofold: first, we can readily firewall the packets used to fingerprint our system, obfuscating the information; secondly, we can detect it quite easily. Because of this, it is less attractive for a truly stealthy adversary. Passive Scanning In a message dated June 30, 1999, Photon posted to the nmap-hackers list with some ideas of passive operating system fingerprinting<6>. He set up a webpage with some of his thoughts, which has since been taken down. In short, by using default IP packet construction behavior, including default TTL values, the presence of the DF bit, and the like, one can gain a confident level of the system's OS. These ideas were quickly picked up by others and several lines of research have been active since then. Lance Spitzer's paper<7> dated May 24, 2000, on passive fingerprinting included many of the data needed to build such a tool. In fact, two quickly appeared, one from Craig Smith<8>, and another tool called p0f from Michael Zalewski<9>. One very interesting tool that is under active development, extending the earlier work, is Siphon. By utilizing not only IP stack behavior, but also routing information and spanning tree updates, a complete network map can be built over time. Passive port scans also take place, adding to the data. This tool promises to be truly useful for the white hat, and a patient black hat. One limitation of these methods, though, is that they only provide a measure of the operating system. Vulnerabilities may or may not exist, and further investigations must be undertaken to evaluate if this is the case. While suitable for the white hat for most purposes (like accounting), this is not suitable to a would-be attacker. Simply put, more information is needed. An Alternative Approach An alternative method to merely fingerprinting the operating system is to perform an identification by using client applications. Quite a number of network clients send revealing information about their host systems, either directly or indirectly. We use application level information to map back to the operating system, either directly or indirectly. One very large advantage to the method described here is that in some situations, much more accurate information can be gained about the client. Because of stack similarities, most Windows systems, including 95, 98 and NT 4.0, look too similar to differentiate. The client application, however, is willing to reveal this information. This provides not only a measure of the target's likely operating system, but also a likely vector for entrance. Most of these client applications have numerous security holes, to which one can point malicious data. In some cases, this can provide the key information needed to begin infiltrating a network, and one can proceed more rapidly. In most cases it provides a starting point for the analysis of vulnerabilities of a network. One major limitation of this method, however, comes when a system is emulating another to provide access to client software. This includes Solaris and SCO's support for Linux binaries. As such, under these circumstances, the data should be taken with some caution and evaluated in the presence of other information. This limitation, however, is similar to the limitation that IP stack tweaking can place on passive fingerprinting at the IP level, or the effect on active scanning from these adjustments or firewalling. Four different type of network clients are discussed here which provide suitable fingerprinting information. Email clients, which leave telltale information in most cases on their messages; Usenet clients, which, like mail applications, litter their posts with client system information; web browsers, which send client information with each request; and even the ubiquitous telnet client, which sends such information more quietly, but can just as effectively fingerprint an operating system. Knowing this, one now only needs to harvest the network for this information and map it to source addresses. Various tools, including sniffers, both generic and specialized, and even web searches will yield this information. A rapid analysis of systems can be quickly performed. This works quite well for the white hat and the black hat hacker, as well. In this paper is described a low tech approach to fingerprinting systems for both their operating system and a likely route to gaining entry. By using application level data sent from them over the network, we can quickly gather accurate data about a system. In some cases, one doesn't even have to be on the same network as the targets, they can gather the information from afar, compile the information and use it at their discretion at a later date. Mail Clients One of the largest type of traffic the network sees is electronic mail. Nearly everyone who uses the Internet on a regular basis uses email in those transaction sessions. They not only receive mail, but also send a good amount of mail, too. Because it is ubiquitous, it makes an especially attractive avenue for system fingerprinting and ultimately penetration. Within the headers of nearly every mail message is some form of system identification. Either through the use of crafted message identification tags, as used by Eudora and Pine, or by explicit header information, such as headers generated by OutLook clients or CDE mail clients. The scope of this method, both in terms of information gained and the potential impact, should not be underestimated. If anything, viruses that spread by email, including ones that are used to steal passwords from systems, should illustrate the effectiveness of this method. An Example: Pine Pine itself is one of the worst offenders of any application for the system it is on. It gives away a whole host of information useful to an attacker in one fell swoop. To wit<10>: Message-ID: <Pine.LNX.4.10.9907191137080.14866-100000@somehost.example.ca> It is clear it's Pine, we know the version (4.10), and we know the system type. Too much about it, in fact. This is a list of the main ports of Pine as of 4.30: --------------------- a41 IBM RS/6000 running AIX 4.1 or 4.2 a32 IBM RS/6000 running AIX 3.2 or earlier aix IBM S/370 AIX aos AOS for IBM RT (untested) mnt FreeMint aux Macintosh A/UX bsd BSD 4.3 bs3 BSDi BSD/386 Version 3 and Version 4 bs2 BSDi BSD/386 Version 2 bsi BSDi BSD/386 Version 1 dpx Bull DPX/2 B.O.S. cvx Convex d54 Data General DG/UX 5.4 d41 Data General DG/UX 4.11 or earlier d-g Data General DG/UX (even earlier) ult DECstation Ultrix 4.1 or 4.2 gul DECstation Ultrix using gcc compiler vul VAX Ultrix os4 Digital Unix v4.0 osf DEC OSF/1 v2.0 and Digital Unix (OSF/1) 3.n sos DEC OSF/1 v2.0 with SecureWare epx EP/IX System V bsf FreeBSD gen Generic port hpx Hewlett Packard HP-UX 10.x hxd Hewlett Packard HP-UX 10.x with DCE security ghp Hewlett Packard HP-UX 10.x using gcc compiler hpp Hewlett Packard HP-UX 8.x and 9.x shp Hewlett Packard HP-UX 8.x and 9.x with Trusted Computer Base gh9 Hewlett Packard HP-UX 8.x and 9.x using gcc compiler isc Interactive Systems Unix lnx Linux using crypt from the C library lnp Linux using Pluggable Authentication Modules (PAM) slx Linux using -lcrypt to get the crypt function sl4 Linux using -lshadow to get the crypt() function sl5 Linux using shadow passwords, no extra libraries lyn Lynx Real-Time System (Lynxos) mct Tenon MachTen (Mac) osx Macintosh OS X neb NetBSD nxt NeXT 68030's and 68040's Mach 2.0 bso OpenBSD with shared-lib sc5 SCO Open Server 5.x sco SCO Unix pt1 Sequent Dynix/ptx v1.4 ptx Sequent Dynix/ptx dyn Sequent Dynix (not ptx) sgi Silicon Graphics Irix sg6 Silicon Graphics Irix >= 6.5 so5 Sun Solaris >= 2.5 gs5 Sun Solaris >= 2.5 using gcc compiler so4 Sun Solaris <= 2.4 gs4 Sun Solaris <= 2.4 using gcc compiler sun Sun SunOS 4.1 ssn Sun SunOS 4.1 with shadow password security gsu SunOS 4.1 using gcc compiler s40 Sun SunOS 4.0 sv4 System V Release 4 uw2 UnixWare 2.x and 7.x wnt Windows NT 3.51 Pine system types used in Message-ID tags as of Pine 4.30. This table was gathered from the supported systems listed in the Pine source code documentation, in the file pine4.30/doc/pine-ports, and was edited for brevity. --------------------- Hence, with the above message ID, one knows the target's hostname, an account on that machine that reads mail using Pine, and that it's Linux without shadowed passwords (the LNX host type). Hang out on a mailing list, maybe something platform agnostic, and collect targets. In this case, one could use a well known exploit within the mail message, grab the system password file and send it back to ourselves for analysis. This can easily scaled to as many clients as has been fingerprinted; one mass mailing, and sit back and wait for the password files to come in. Other Mail Clients This is not to say that other mail clients are not vulnerable to such information leaks. Most mail clients give out similar information, either directly or indirectly. Direct information would be an entry in the message headers, such as an X-Mailer tag. Indirect information would be similar to that seen for Pine, a distinctive message ID tag. When this information is coupled to the information about the originating host, a fingerprint can occur rapidly. Some examples: User-Agent: Mutt/1.2.4i X-Mailer: Microsoft Outlook Express 5.00.3018.1300 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.3018.1300 X-Mailer: dtmail 1.2.1 CDE Version 1.2.1 SunOS 5.6 sun4u sparc X-Mailer: PMMail 2000 Professional (2.10.2010) For Windows 2000 (5.0.2195) X-Mailer: QUALCOMM Windows Eudora Version 4.3.2 Message-ID: <4.3.2.7.2.20001117142518.043ad100@mailserver3.somewhere.gov> While not all clients give out their host system or processors, such as Mutt or Outlook Express, this information can be used by itself to get a larger vulnerability assessment. For example, if we know what version strings appear only on Windows, as opposed to a MacOS system, we can determine the processor type. The dtmail application is entirely too friendly to someone determining vulnerabilities, giving up the processor and OS revision. Given the problems that have appeared in the CDE suite, and in older versions of Solaris, an attack would be all too easy to construct. Finding such information There are two main avenues for finding this information for lots of clients quickly. First, we can sniff the network for this information. Using a tool like mailsnarf, ngrep or any sniffer with some basic filtering, a modest collection of host to client application data can be gathered. The speed of collection and the ultimate size of this database depends chiefly on the amount of traffic your network segment sees. This is the main drawback to this method, a limited amount of data. A much more efficient method, and one that can make use of this above information, is in offline (for the target with respect to the potential attacker) system fingerprinting, with an exploit path included. How do we do this? We search the web, with it's repleat mailing list archives, and we turn up some boxes. Altavista: 2,033 pages found (for pine.ult) Google results 1-10 of about 141,000 for pine.lnx Altavista: 16,870 pages found (for pine.osf) You get the idea. Tens of thousands of hits, thousands of potentially exploitable boxes ready to be picked. Simply evaluate the source host information and map it to the client data and a large database of vulnerable hosts is rapidly built. The exploits are easy. Every week, new exploits are found in client software, either mail applications like Pine, or methods to deliver exploits using mail software. Examples of this include the various buffer overflows that have appeared (and persist) in Pine and OutLook, the delivery of malicious DLL files using Eudora attachments, and such. We know from viruses like ILOVEYOU and Melissa that more people than not will open almost any mail message, and we know from spammers that it's trivial to bulk send messages with forged headers, making traceback difficult. These two items combine to make for a very readily available exploit. Usenet Clients In a manner similar to electronic mail, Usenet clients leave significant information in the headers of their posts which reveal information about their host operating systems. One great advantage to Usenet, as opposed to email or even web traffic, is that posts are distributed. As such, we can be remote and collect data on hosts without their knowledge or ever having to gain entry into their network. Among the various newsreaders commonly used, copious host info is included in the headers. The popular UNIX newsreader 'tin' is among the worst offenders of revealing host information. Operating system versions, processors and applications are all listed in the 'User-Agent' field, and when coupled to the NNTP-Posting-Host information, a remote host fingerprint has been performed: User-Agent: tin/1.5.2-20000206 ("Black Planet") (UNIX) (SunOS/5.6(sun4u)) User-Agent: tin/pre-1.4-980226 (UNIX) (FreeBSD/2.2.7-RELEASE (i386)) User-Agent: tin/1.4.2-20000205 ("Possession") (UNIX) (Linux/2.2.13(i686)) NNTP-Posting-Host: host.university.edu The standard web browsers also leave copious information about themselves and their host systems, as they do with HTTP requests and mail. We will elaborate on web clients in the next section, but they are also a problem as Usenet clients: X-Http-User-Agent: Mozilla/4.75 (Windows NT 5.0; U) X-Mailer: Mozilla 4.75 (X11; U; Linux 2.2.16-3smpi686) And several other clients also leave verbose information about their hosts to varying degrees. Again, when combined with the NNTP-Posting-Host or other identifying header, one can begin to amass information about hosts without too much work: Message-ID: <Pine.LNX.4.21.0010261126210.32652-100000@host.example.co.nz> User-Agent: MT-NewsWatcher/3.0 (PPC) X-Operating-System: GNU/Linux 2.2.16 User-Agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (Bryce Canyon) X-Newsreader: Microsoft Outlook Express 5.50.4133.2400 X-Newsreader: Forte Free Agent 1.21/32.243 X-Newsreader: WinVN 0.99.9 (Released Version) (x86 32bit) Either directly or indirectly, we can fingerprint the operating system over the source host. Other programs are not so forthcoming, but still leak information about a host that can be used to determine vulnerability analysis. X-Newsreader: KNode 0.1.13 User-Agent: Pan/0.9.1 (Unix) User-Agent: Xnews/03.02.04 X-Newsreader: trn 4.0-test74 (May 26, 2000) X-Newsreader: knews 1.0b.0 (mrsam/980423) User-Agent: slrn/0.9.5.7 (UNIX) X-Newsreader: InterChange (Hydra) News v3.61.08 None of these header fields are required by the specifications for NNTP, as noted in RFC 2980. They provide only some additional information about the host which was the source of the data. However, given that more transactions that concern the servers are between servers, this data is entirely extraneous. It is, it appears, absent from RFC 977, the original specification for NNTP. On interesting possibility to exploiting a user agent like Mozilla is to examine the accepted languages. In the below example, we see not only English is supported, but that the browser is linked to Acrobat. Given potential holes, and past problems<11>, with malicious PDF files, this could be another avenue to gaining entry to a host. X-Mailer: Mozilla 4.75 (Win98; U) X-Accept-Language: en,pdf While this may seem that we're limited to fingerprinting hosts, or out of luck if they are using a proxy, this is not the case. We can also retrieve proxy info from the headers. Recall recent problems with Squid<12>: X-Http-Proxy: 1.0 x72.deja.com:80 (Squid/1.1.22) for client 10.32.34.18 While in this case the proxy is disconnected from the client's network, if this were a border proxy, we could use this to gain information about a possible entry point to the network and, over time and with enough sample data, information about the network behind the protected border. Using Web Traffic A remarkably simple and highly effective means of fingerprinting a target is to follow the web browsing that gets done from it. Most every system in use is a workstation, and nearly everyone uses their web browsers to spend part of their day. And just about every browser sends too much information in it's 'User-Agent' field. RFC 1945<13> notes that the 'User-Agent' field is not required in an HTTP 1.0 request, but can be used. The authors state, "user agents should include this field with requests." They cite statistics as well as on the fly tailoring of data to meet features or limitations of browsers. The draft standard for HTTP version 1.1 requests, RFC 2616, also notes similar usage of the 'User-Agent' field. We can gather this information in two ways. First, we could run a website and turn on logging of the User-Agent field from the client (if it's not already on). Simply generate a lot of hits and watch the data come in. Get on Slashdot, advertise some pornographic material, or mirror some popular software (like warez) and you're ready to go. Secondly, we can sniff web traffic on our visible segment. While almost any sniffer will work, one of the easiest for this type of work is urlsnarf from the dsniff package from Dug Song<14>. Examples of browsers that send not only their application information, such as the browser and the version, but also the operating system which the host runs include: Netscape (UNIX, MacOS, and Windows) Internet Explorer One shining example of a browser that doesn't send extraneous information is Lynx. On both 2.7 and 2.8 versions, only the browser information is sent, no information about the host. The User-Agent field can be important to the web server for legitimate reasons. Due to implementations, both Netscape and Explorer are not equivalent on many items, including how they handle tables, scripting and style sheets. However, host information is not needed and is sent gratuitously. A typical request from a popular browser looks like this: GET / HTTP/1.0 Connection: Keep-Alive User-Agent: Mozilla/4.08 (X11; I; SunOS 5.7 sun4u) Host: 10.10.32.1 Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */* Accept-Encoding: gzip Accept-Language: en Accept-Charset: iso-8859-1,*,utf-8 The User-Agent field is littered with extra information that we don't need to know: the operating system type, version and even the hardware being used. Instantly we know everything there is to know about compromising this host: the operating system, the host's architecture, and even a route we could use to gain entry. For example a recent problems in Netscape's JPEG handling<15>. Using urlsnarf to log these transactions is the easiest method to sniff this information from the network. A typical line of output is below: 10.10.1.232 - - "GET http://www.latino.com/ HTTP/1.0" - - "http://www.latino.com/" "Mozilla/4.07 (Win95; I ;Nav)" We can also use the tool ngrep<16> to listen to this information on the wire. A simple filter to listen only to packets that contain the information 'User-Agent' can be set up and used to log information about hosts on the network. A simple regular expression filter can do the trick: ngrep -qid ep1 'User-Agent' tcp port 80 This will print out all TCP packets which contain the case insensitive string User-Agent in them. And, within this field, for too many browsers, is too much information about the host. With the above options to ngrep, typical output will look like this: T 10.10.11.43:1860 -> 130.14.22.107:80 GET /entrez/query/query.js HTTP/1.1..Accept: */*..Referer: http://www. ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search DB=PubMed..Accept-Langua ge: en-us..Accept-Encoding: gzip, deflate..If-Modified-Since: Thu, 29 Jun 2000 18:38:45 GMT; length=4558..User-Agent: Mozilla/4.0 (compatibl e; MSIE 5.5; Windows 98)..Host: www.ncbi.nlm.nih.gov..Connection: Keep -Alive..Cookie: WebEnv=FpEB]AfeA>>Hh^`Ba@<]^d]bCJfdADh@(j)@ =^a=T=EjIE=b<F bg<.... Even more information is contained within the request than urlsnarf showed us, information including cookies. Web Server Fingerprinting In much the same way as one can use the strings sent during requests by the clients to determine what system type is in use, one can follow the replies sent back by the server to determine what type it is. Again we will use ngrep, this time matching the expression 'server:' to gather the web server type: T 192.168.0.5:80 -> 192.168.0.1:1033 HTTP/1.0 200 OK..Server: Netscape-FastTrack/2.01..Date: Mon, 30 Oct 20 00 00:15:31 GMT..Content-type: text/html.... While specifics about the operating system information are lost, this works to passively gather vulnerability information about the target server. This can be coupled to other information to decide how best to proceed with an attack. This information will not be covered as this paper is limited to client applications and systems being fingerprinted. Telnet Clients While telnet is no longer in widespread use due to the fact that all of its data is sent in plain text, including authentication data, it is still used widely enough to be of use in fingerprinting target systems. What is interesting is that it not only gives us a mechanism to gather operating system data, it gives us the particular application in use, which can be of value in determining a mechanism of entry. This method of system fingerprinting is not unique to this paper. At Hope2k in New York City in the summer of 2000, I saw this demonstrated by a security analyst from Bell Labs. He had a honey-pot system set up that one would telnet to. An application would fingerprint the client and hence the operating system. While I do not recall his name, his research is acknowledged here as my introduction of this method of system fingerprinting. The specification for the telnet protocol describes a negotiation between the client and the host for information such as line speed, terminal type and echoing<17>. What is interesting to note is that each client behaves in a unique way, even different client applications on the same host type. Similarly, the telnet server, running a telnet daemon, can be fingerprinted by following the negotiations with the client. This information can be viewed from the telnet command line application on a UNIX host by issuing the 'toggle options' command at the telnet> prompt. This information can be gather directly, using a wedge application, or a honey-pot as demonstrated on the network at Hope2k, or it can be sniffed off the network in a truly passive fashion. We discuss below gathering data about both the client system and the server being connected to. The same principles apply to both host identification methods. Fingerprinting Telnet Clients The negotiations described above, and in the references listed, can be used to fingerprint the client based upon the options set and the order in which they are negotiated. Table 1 describes the behavior of several telnet clients in these respects. Their differences are immediately obvious, even for different clients on the same operating system, such as Tera Term Pro and Windows Telnet on a Windows 95 host. In this table, all server commands and negotiation options are ignored and only data originating from the client is shown. -------------------- Shown here are the options and the order in which they are sent from three telnet clients.The IRIX 6.5 system was the stock telnet client with no special arguments. The Windows system used was Windows 95B, using either Tera Term Pro 2.3 or the built in telnet client. In all cases an OpenBSD telnet server, with all Kerberos options turned off, was used as the connection target. Data was captured using the tool Ethereal and decoded by the application. [ TABLE OMITTED IN ASCII VERSION, PLEASE SEE PDF OR PS ] -------------------- Some operating systems, such as IRIX, use a specific and particular terminal type. However, this is usually not a good metric of the operating system, as it can be spoofed or ambiguous, with a value such as vt100 or xterm. Instead, the value and order of various commands sent by the client can be used to distinguish hosts and applications. For example, Windows doesn't set the terminal speed, Linemode options or accept new environmental options. To differentiate between the normal Windows telnet client or Tera Term Pro, one can look for the option to negotiate a window size, for example. Space only permits the above three clients to be shown. However, as one can imagine, differences both striking and subtle exist between the various clients. Fingerprinting Telnet Servers Obviously, the most direct method to fingerprint a server would be to connect to it and examine the order of options and their values as a telnet session was negotiated. However, as this study is concerned with passive scanning of clients, we will leave it to the reader to map this information and learn what to do with it. Conclusions In this paper has been illustrated the effectiveness of target system identification by using the information provided by network client applications. This provides a very efficient and precise measure of the client operating system, as well as identifying a vector for attack. This information is sent gratuitously and is not essential to the normal operation of many of these applications. The main limitation of this information is found when a host is performing emulation of another operating system to run the client software. While this is rare, it could lead to a false system identification. This mainly falls in the open software world, however, and only for some operating systems. The scope of this information should not be underestimated. There are some who will note that all one will likely gain on a UNIX system is an unprivilidged account. This may be, however, what we are after, the access that a particular user may have to other valuable data. We may only want their system privilidges, ie for packet generation in a DDoS network. For non-UNIX systems, the impact is well illustrated by the October, 2000, compromise of the Microsoft Corporation network, where access was gained to the source code of Windows and the Office suite. Repeating what is said often, your perimeter is only as strong as its weakest link. Similarly, there are some that will note that some of these attacks, such as using the mail or Usenet client as a vector for entry, require a bit of social engineering. While this is true, it is by no means any less of a threat. Numerous times we have seen that people will read almost any email message that shows up in their inbox. Usenet engineering is even easier: simply reply to a message posted, such as a question, and the person is almost certain to read the reply. As such, for the black hat, this represents a quick method of passively gathering target host information as well as a likely vector of attack. For the white hat, it suffices to map a network with respect to operating system and vulnerable application. Recommendations for Mitigating the Risks Would that the world were perfect, or at least software engineers were not prone to errors, this information would not be usable against a host. However, we exist in a world with operating systems littered with security problems and applications that are poorly programmed, ready to exploit. If we lived in an ideal world, but we do not. For web browsers, which are ubiquitous and used by nearly everyone on the Internet, the host operating system should not be sent. Ideally, information about what protocols are spoken, what standards are met and what language are supported (ie English, German, French) should suffice. Lynx behaves nearly ideally in this regard, and both Netscape and Explorer should follow this lead. With respect to Usenet and electronic mail clients, again only what features are supported should be provided. Pine is an example of how bad it can get, providing too much information about a host too quickly. There is no reason why any legitimate client should know what processor and OS is being run on the sending host. Telnet clients are far more difficult. It is tempting to say that all telnet applications should support the same set of features, but that is simply impossible. Proxy hosts should be used, if possible, to strip off information about the originating system, including the workstation address and operating system information. This will help obscure needed information to map a network from outside the perimeter. Coupled with strong measures to catch viruses and malicious code, such as in a web page script, the risks should be greatly reduced. The best solution is for application authors to not send gratuitous information in their headers or requests. Furthermore, client applications should be scrutinized to the same degree as daemons that run with administrative privilidges. The lessons of RFC 1123 most certainly apply at this level. In the intervening time, those with access to the source code of their network clients may want to consider removing gratuitous host information from their request packets or headers. This, however, doesn't apply to most users, and those that know about this method already practice this routinely. Acknowledgments This work was inspired largely by Photon's post in 1999 to the nmap-hackers list. It seems a great deal of other research, and proof of concept tools, has been initiated by this message. To them I extend my thanks, they've helped to provide me with hours of diversions and thought experiments. Also, I am thankful to the authors of the tools used in this study, especially Dug Song for his dsniff package, and Jordan Ritter for ngrep. As always, I am indebted to the people I work with at crimelabs, as well, especially Kosher Egyptian Rabbit and Jesus, with whom I have had numerous productive conversations on this topic. Rick Wash and Merlin were most helpful in their critical review of this manuscript and their helpful suggestions. ENDNOTES ======== <1>: Available from http://www.subterrain.net/projects/siphon/ . <2>: Available from http://www.insecure.org/nmap/ . <3>: Available from http://www.apostols.org/. <4>: This is definitely a must read to understanding how active, and hence passive, scanning occurs. Obtain this article from http://phrack.infonexus.org/ . <5>: These papers can be found online at http://www.sys-security.com/ . <6>: This note is available from the MARC archives of the nmap-hackers list, at http://marc.theaimsgroup.com/ . <7>: Please see http://www.enteract.com/~lspitz/pubs.html for this paper. <8>: This tool can be found at http://www.enteract.com/~lspitz/passfing.tar.gz . <9>: p0f can be found at http://lcamtuf.hack.pl/p0f.tgz . <10>: I have tried to sanitize all network addresses or hostnames. If I missed a few, it was inadvertant, and I apologize. You should be running more secure software, anyhow. <11>: See BUGTRAQ vulnerabilities with ID's 666 and 1509 at http://www.securityfocus.com/ for more information <12>: See BUGTRAQ ID's 471 and 89 at http://www.securityfocus.com/. <13>: This and all other listed RFC's are available from the IETF website at http://www.ietf.org/ . <14>: This package is available at http://www.monkey.org/~dugsong/dsniff/ <15>: See BUGTRAQ ID 1503 for more information. <16>: ngrep can be obtained from the PacketFactory website, http://www.packetfactory.net/Projects/Ngrep/ . <17>: For descriptive information on these options and their negotiations, please see RFCs 857, 858, 859, 860, 1091, 1073, 1079, 1184, 1372, and 1408. Also, see TCP Illustrated, Volume 1: The Protocols by W. Richard Stevens.