ABSTRACT
This paper discusses how to glean precious information about a host by querying its TCP/IP stack. I first present some of the "classical" methods of determining host OS which do not involve stack fingerprinting. Then I describe the current "state of the art" in stack fingerprinting tools. Next comes a description of many techniques for causing the remote host to leak information about itself. Finally I detail my (nmap) implementation of this, followed by a snapshot gained from nmap which discloses what OS is running on many popular Internet sites.
REASONS
I think the usefulness of determining what OS a system is running is pretty obvious, so I'll make this section short. One of the strongest examples of this usefulness is that many security holes are dependent on OS version. Let's say you are doing a penetration test and you find port 53 open. If this is a vulnerable version of Bind, you only get one chance to exploit it since a failed attempt will crash the daemon. With a good TCP/IP fingerprinter, you will quickly find that this machine is running 'Solaris 2.51' or 'Linux 2.0.35' and you can adjust your shellcode accordingly.
A worse possibility is someone scanning 500,000 hosts in advance to see what OS is running and what ports are open. Then when someone posts (say) a root hole in Sun's comsat daemon, our little cracker could grep his list for 'UDP/512' and 'Solaris 2.6' and he immediately has pages and pages of rootable boxes. It should be noted that this is SCRIPT KIDDIE behavior. You have demonstrated no skill and nobody is even remotely impressed that you were able to find some vulnerable .edu that had not patched the hole in time. Also, people will be even _less_ impressed if you use your newfound access to deface the department's web site with a self-aggrandizing rant about how damn good you are and how stupid the sysadmins must be.
Another possible use is for social engineering. Let's say that you are scanning your target company and nmap reports a 'Datavoice TxPORT PRISM 3000 T1 CSU/DSU 6.22/2.06'. The hacker might now call up as 'Datavoice support' and discuss some issues about their PRISM 3000. "We are going to announce a security hole soon, but first we want all our current customers to install the patch -- I just mailed it to you ..." Some naive administrators might assume that only an authorized engineer from Datavoice would know so much about their CSU/DSU.
Another potential use of this capability is evaluation of companies you may want to do business with. Before you choose a new ISP, scan them and see what equipment is in use. Those "$99/year" deals don't sound nearly so good when you find out they have crappy routers and offer PPP services off a bunch of Windows boxes.
playground~> telnet hpux.u-aizu.ac.jp
Trying 163.143.103.12...
Connected to hpux.u-aizu.ac.jp.
Escape character is '^]'.
HP-UX hpux B.10.01 A 9000/715 (ttyp2)
login:
There is no point going to all this trouble of fingerprinting if the
machine will blatantly announce to the world exactly what it is
running! Sadly, many vendors ship _current_ systems with these kinds
of banners and many admins do not turn them off. Just because there
are other ways to figure out what OS is running (such as
fingerprinting), does not mean we should just announce our OS and
architecture to every schmuck who tries to connect.
The problems with relying on this technique are that an increasing number of people are turning banners off, many systems don't give much information, and it is trivial for someone to "lie" in their banners. Nevertheless, banner reading is all you get for OS and OS Version checking if you spend $thousands on the commercial ISS scanner. Download nmap or queso instead and save your money :).
Even if you turn off the banners, many applications will happily give away this kind of information when asked. For example, let's look at an FTP server:
payfonez> telnet ftp.netscape.com 21
Trying 207.200.74.26...
Connected to ftp.netscape.com.
Escape character is '^]'.
220 ftp29 FTP server (UNIX(r) System V Release 4.0) ready.
SYST
215 UNIX Type: L8 Version: SUNOS
First of all, it gives us system details in its default banner. Then if we
give the 'SYST' command it happily feeds back even more information.
If anon FTP is supported, we can often download /bin/ls or other
binaries and determine what architecture it was built for.
Many other applications are too free with information. Take web servers for example:
playground> echo 'GET / HTTP/1.0\n' | nc hotbot.com 80 | egrep '^Server:'
Server: Microsoft-IIS/4.0
playground>
Hmmm ... I wonder what OS those lamers are running.
Other classic techniques include DNS host info records (rarely effective) and social engineering. If the machine is listening on 161/udp (snmp), you are almost guaranteed a bunch of detailed info using 'snmpwalk' from the CMU SNMP tools distribution and the 'public' community name.
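Reading the Server header, as in the hotbot.com example above, is easy to script. The following is a minimal sketch, not part of nmap or any tool named here; the function names extract_server_header and fetch_server_header are made up for illustration. The parsing is kept separate from the network code so it can be tried on a saved response.

```python
import socket
from typing import Optional


def extract_server_header(response: str) -> Optional[str]:
    """Return the value of the first 'Server:' header, or None if absent."""
    for line in response.splitlines():
        if not line.strip():
            break  # a blank line ends the header section
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None


def fetch_server_header(host: str, port: int = 80, timeout: float = 5.0) -> Optional[str]:
    """Send a bare HTTP/1.0 request and report the Server header (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
        data = b""
        # Read until the headers are complete or a size cap is reached.
        while b"\r\n\r\n" not in data and len(data) < 65536:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return extract_server_header(data.decode("latin-1"))
```

Remember that the value is whatever the server chooses to send, so it carries the same "lying banner" caveat as telnet and FTP banners.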
Nmap is not the first OS recognition program to use TCP/IP fingerprinting. The common IRC spoofer sirc by Johan has included very rudimentary fingerprinting techniques since version 3 (or earlier). It attempts to place a host in the classes "Linux", "4.4BSD", "Win95", or "Unknown" using a few simple TCP flag tests.
Another such program is checkos, released publicly in January of this year by Shok in Confidence Remains High Issue #7. The fingerprinting techniques are exactly the same as SIRC, and even the _code_ is identical in many places. Checkos was privately available for a long time prior to the public release, so I have no idea who swiped code from whom. But neither seems to credit the other. One thing checkos does add is telnet banner checking, which is useful but has the problems described earlier. [ Update: Shok wrote in to say that checkos was never intended to be public and this is why he didn't bother to credit SIRC for some of the code. ]
Su1d also wrote an OS checking program. His is called SS and as of Version 3.11 it can identify 12 different OS types. I am somewhat partial to this one since he credits my nmap program for some of the networking code :).
Then there is queso. This program is the newest and it is a huge leap forward from the other programs. Not only do they introduce a couple new tests, but they were the first (that I have seen) to move the OS fingerprints _out_ of the code. The other scanners included code like:
/* from ss */
if ((flagsfour & TH_RST) && (flagsfour & TH_ACK) && (winfour == 0) && (flagsthree & TH_ACK))
reportos(argv[2],argv[3],"Livingston Portmaster ComOS");
Instead, queso moves this into a configuration file which obviously scales
much better and makes adding an OS as easy as appending a few lines to a
fingerprint file. Queso was written by Savage, one of the fine folks at
Apostols.org .
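To illustrate why moving fingerprints out of the code scales better, here is a small sketch that expresses the hard-coded ss check quoted above as a data record. The Rule layout and the identify function are assumptions made for this example; they are not queso's actual file format or code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# TCP flag bits as laid out in the TCP header.
TH_FIN, TH_SYN, TH_RST, TH_PSH, TH_ACK, TH_URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


@dataclass
class Rule:
    """One fingerprint record (hypothetical layout)."""
    name: str
    required_flags: Dict[str, int]   # probe name -> flag bits that must be set
    required_window: Dict[str, int]  # probe name -> exact window value


# In a real tool these records would be loaded from a text file.
RULES: List[Rule] = [
    Rule(
        name="Livingston Portmaster ComOS",
        required_flags={"three": TH_ACK, "four": TH_RST | TH_ACK},
        required_window={"four": 0},
    ),
]


def identify(flags: Dict[str, int], windows: Dict[str, int]) -> Optional[str]:
    """Return the first rule name whose requirements all hold, else None."""
    for rule in RULES:
        flags_ok = all(
            (flags.get(probe, 0) & bits) == bits
            for probe, bits in rule.required_flags.items()
        )
        windows_ok = all(
            windows.get(probe) == value
            for probe, value in rule.required_window.items()
        )
        if flags_ok and windows_ok:
            return rule.name
    return None
```

Adding another OS then means appending one more record to RULES (or to the file it is loaded from) without touching identify.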
One problem with all the programs described above is that they are very limited in the number of fingerprinting tests, which limits the granularity of answers. I want to know more than just 'this machine is OpenBSD, FreeBSD, or NetBSD'; I wish to know exactly which of those it is as well as some idea of the release version number. In the same way, I would rather see 'Solaris 2.6' than simply 'Solaris'. To achieve this response granularity, I worked on a number of fingerprinting techniques which are described in the next section.
You can also subclass groups such as random incremental by computing variances, greatest common divisors, and other functions on the set of sequence numbers and the differences between the numbers.
It should be noted that ISN generation has important security implications. For more information on this, contact "security expert" Tsutomu "Shimmy" Shimomura at SDSC and ask him how he was owned. Nmap is the first program I have seen to use this for OS identification.
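The variance and greatest-common-divisor computations mentioned above might look like the sketch below. It is not nmap's code. The describe function and its "constant" and "other" labels are assumptions for illustration; only the idea of an increment class such as "i800" (every step a multiple of 800) comes from this paper.

```python
from functools import reduce
from math import gcd
from statistics import pvariance
from typing import Dict, List


def isn_statistics(isns: List[int]) -> Dict[str, float]:
    """Summarise successive 32-bit ISN samples by their differences."""
    if len(isns) < 3:
        raise ValueError("need at least three samples")
    # Differences are taken modulo 2**32 so counter wraparound is harmless.
    diffs = [(b - a) % 2**32 for a, b in zip(isns, isns[1:])]
    return {
        "gcd": reduce(gcd, diffs),
        "variance": pvariance(diffs),
    }


def describe(isns: List[int]) -> str:
    """Very rough labelling; thresholds and labels are illustrative only."""
    stats = isn_statistics(isns)
    if stats["gcd"] == 0:
        return "constant"   # every sample identical
    if stats["gcd"] % 800 == 0:
        return "i800"       # every increment is a multiple of 800
    return "other"
```

A real classifier would also look at the variance to separate fixed increments from random ones; the sketch only computes it.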
Nmap sends these options along with almost every probe packet:
Window Scale=10; NOP; Max Segment Size = 265; Timestamp; End of Ops;
When the response comes back, take a look at which options were returned and thus are supported. Some operating systems such as recent FreeBSD boxes support all of the above, while others, such as Linux 2.0.X, support very few. The latest Linux 2.1.x kernels do support all of the above. On the other hand, they are more vulnerable to TCP sequence prediction. Go figure.
Even if several operating systems support the same set of options, you can sometimes distinguish them by the _values_ of the options. For example, if you send a small MSS value to a Linux box, it will generally echo that MSS back to you. Other hosts will give you different values.
And even if you get the same set of supported options AND the same values, you can still differentiate via the _order_ that the options are given, and where padding is applied. For example Solaris returns 'NNTNWME' which means: <no op><no op><timestamp><no op><window scale><echoed MSS>
While Linux 2.1.122 returns MENNTNW. Same options, same values, but different order!
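The order strings above can be produced by walking the raw option bytes of the response. The sketch below assumes the standard option kinds (0 end of list, 1 NOP, 2 MSS, 3 window scale, 8 timestamp). The function name options_signature, the letter 'L' for end of list, and '?' for unrecognised options are choices made for this example rather than anything stated in the paper.

```python
from typing import Optional

# Option kinds from the TCP specifications (RFC 793, RFC 1323).
EOL, NOP, MSS, WSCALE, TIMESTAMP = 0, 1, 2, 3, 8


def options_signature(raw: bytes, sent_mss: Optional[int] = None) -> str:
    """Encode raw TCP option bytes as a letter string in wire order."""
    letters = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == EOL:
            letters.append("L")  # assumed letter for end-of-list
            break
        if kind == NOP:
            letters.append("N")
            i += 1
            continue
        # Every other option carries a length byte covering kind+length+data.
        if i + 1 >= len(raw) or raw[i + 1] < 2 or i + raw[i + 1] > len(raw):
            break  # truncated or malformed option; stop parsing
        length = raw[i + 1]
        body = raw[i + 2 : i + length]
        if kind == MSS and length == 4:
            letters.append("M")
            if sent_mss is not None and int.from_bytes(body, "big") == sent_mss:
                letters.append("E")  # the peer echoed our MSS back
        elif kind == WSCALE:
            letters.append("W")
        elif kind == TIMESTAMP:
            letters.append("T")
        else:
            letters.append("?")  # placeholder for options not covered here
        i += length
    return "".join(letters)
```

Fed the same options in the two orders described above, with sent_mss=265, it yields 'NNTNWME' for the Solaris layout and 'MENNTNW' for the Linux 2.1.122 layout.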
There are a few other useful options I might probe for at some point, such as those that support T/TCP and selective acknowledgements.
Exploit Chronology -- Even with all the tests above, nmap is unable to distinguish between the TCP stacks of Win95, WinNT, or Win98. This is rather surprising, especially since Win98 came out about 4 years after Win95. You would think they would have bothered to improve the stack in some way (like supporting more TCP options) and so we would be able to detect the change and distinguish the operating systems. Unfortunately, this is not the case. The NT stack is apparently the same crappy stack they put into '95. And they didn't bother to upgrade it for '98.
But do not give up hope, for there is a solution. You can simply start with early Windows DOS attacks (Ping of Death, Winnuke, etc) and move up a little further to attacks such as Teardrop and Land. After each attack, ping them to see whether they have crashed. When you finally crash them, you will likely have narrowed what they are running down to one service pack or hotfix.
I have not added this functionality to nmap, although I must admit it is very tempting :).
SYN Flood Resistance -- Some operating systems will stop accepting new connections if you send too many forged SYN packets at them (forging the packets avoids trouble with your kernel resetting the connections). Many operating systems can only handle 8 packets. Recent Linux kernels (among other operating systems) allow various methods such as SYN cookies to prevent this from being a serious problem. Thus you can learn something about your target OS by sending 8 packets from a forged source to an open port and then testing whether you can establish a connection to that port yourself. This was not implemented in nmap since some people get upset when you SYN flood them. Even explaining that you were simply trying to determine what OS they are running might not help calm them.
The new version of nmap reads a file filled with Fingerprint templates that follow a simple grammar. Here is an example:
FingerPrint IRIX 6.2 - 6.4 # Thanks to Lamont Granquist
TSeq(Class=i800)
T1(DF=N%W=C000|EF2A%ACK=S++%Flags=AS%Ops=MNWNNT)
T2(Resp=Y%DF=N%W=0%ACK=S%Flags=AR%Ops=)
T3(Resp=Y%DF=N%W=C000|EF2A%ACK=O%Flags=A%Ops=NNT)
T4(DF=N%W=0%ACK=O%Flags=R%Ops=)
T5(DF=N%W=0%ACK=S++%Flags=AR%Ops=)
T6(DF=N%W=0%ACK=O%Flags=R%Ops=)
T7(DF=N%W=0%ACK=S%Flags=AR%Ops=)
PU(DF=N%TOS=0%IPLEN=38%RIPTL=148%RID=E%RIPCK=E%UCK=E%ULEN=134%DAT=E)
Let's look at the first line (I'm adding '>' quote markers):
> FingerPrint IRIX 6.2 - 6.4 # Thanks to Lamont Granquist
This simply says that the fingerprint covers IRIX versions 6.2 through 6.4 and the comment states that Lamont Granquist kindly sent me the IP addresses or fingerprints of the IRIX boxes tested.
> TSeq(Class=i800)
This means that ISN sampling put it in the "i800 class". This means that each new sequence number is a multiple of 800 greater than the last one.
> T1(DF=N%W=C000|EF2A%ACK=S++%Flags=AS%Ops=MNWNNT)
The test is named T1 (for test1, clever eh?). In this test we send a SYN packet with a bunch of TCP options to an open port. DF=N means that the "Don't fragment" bit of the response must not be set. W=C000|EF2A means that the window advertisement we received must be 0xC000 or 0xEF2A. ACK=S++ means the acknowledgement we receive must be our initial sequence number plus 1. Flags=AS means the ACK and SYN flags were sent in the response. Ops=MNWNNT means the options in the response must be (in this order):
<MSS (not echoed)><NOP><Window scale><NOP><NOP><Timestamp>
> T2(Resp=Y%DF=N%W=0%ACK=S%Flags=AR%Ops=)
Test 2 involves a NULL with the same options to an open port. Resp=Y means we must get a response. Ops= means that there must not be any options included in the response packet. If we took out '%Ops=' entirely then any options sent would match.
> T3(Resp=Y%DF=N%W=C000|EF2A%ACK=O%Flags=A%Ops=NNT)
Test 3 is a SYN|FIN|URG|PSH w/options to an open port.
> T4(DF=N%W=0%ACK=O%Flags=R%Ops=)
This is an ACK to an open port. Note that we do not have a Resp= here. This means that lack of a response (such as the packet being dropped on the network or an evil firewall) will not disqualify a match as long as all the other tests match. We do this because virtually any OS will send a response, so a lack of response is generally an attribute of the network conditions and not the OS itself. We put the Resp tag in tests 2 and 3 because some operating systems _do_ drop those without responding.
> T5(DF=N%W=0%ACK=S++%Flags=AR%Ops=)
> T6(DF=N%W=0%ACK=O%Flags=R%Ops=)
> T7(DF=N%W=0%ACK=S%Flags=AR%Ops=)
These tests are a SYN, ACK, and FIN|PSH|URG, respectively, to a closed port. The same options as always are set. Of course this is all probably obvious given the descriptive names 'T5', 'T6', and 'T7' :).
> PU(DF=N%TOS=0%IPLEN=38%RIPTL=148%RID=E%RIPCK=E%UCK=E%ULEN=134%DAT=E)
This big sucker is the 'port unreachable' message test. You should recognize the DF=N by now. TOS=0 means that IP type of service field was 0. The next two fields give the (hex) values of the IP total length field of the message IP header and the total length given in the IP header they are echoing back to us. RID=E means the RID value we got back in the copy of our original UDP packet was expected (ie the same as we sent). RIPCK=E means they didn't fuck up the checksum (if they did, it would say RIPCK=F). UCK=E means the UDP checksum is also correct. Next comes the UDP length which was 0x134 and DAT=E means they echoed our UDP data correctly. Since most implementations (including this one) do not send any of our UDP data back, they get DAT=E by default.
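The matching rules described in this walk-through ('%' separates attributes, '|' separates acceptable values, an omitted attribute matches anything, and a missing Resp= tolerates a lost reply) can be captured in a few lines. This is a sketch of the grammar as explained here, not nmap's actual parser; parse_test_line and response_matches are hypothetical names, and observed results are assumed to arrive as a plain dictionary.

```python
import re
from typing import Dict, List, Optional, Tuple


def parse_test_line(line: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split 'T1(DF=N%W=C000|EF2A)' into ('T1', {'DF': ['N'], 'W': ['C000', 'EF2A']})."""
    m = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", line)
    if m is None:
        raise ValueError(f"not a test line: {line!r}")
    name, body = m.groups()
    attrs: Dict[str, List[str]] = {}
    for field in body.split("%"):
        if not field:
            continue
        key, _, value = field.partition("=")
        attrs[key] = value.split("|")  # '|' separates acceptable alternatives
    return name, attrs


def response_matches(template: Dict[str, List[str]],
                     observed: Optional[Dict[str, str]]) -> bool:
    """Compare one observed test result (None = no reply) with a template."""
    resp = template.get("Resp")
    if observed is None:
        # No reply only disqualifies the match when the template insists on Resp=Y.
        return resp is None or "N" in resp
    if resp is not None and "Y" not in resp:
        return False
    for key, allowed in template.items():
        if key == "Resp":
            continue
        # Attributes missing from the template are never checked, so they match anything.
        if observed.get(key, "") not in allowed:
            return False
    return True
```

With the T4 template a missing reply still matches, while the same missing reply fails T2 because of its Resp=Y.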
POPULAR SITE SNAPSHOTS
Here is the fun result of all our effort. We can now take random Internet sites and determine what OS they are using. A lot of these people have eliminated telnet banners, etc. to keep this information private. But this is of no use with our new fingerprinter! Also this is a good way to expose the <your favorite crap OS> users as the lamers that they are :)!
The command used in these examples was: nmap -sS -p 80 -O -v <host>
Also note that most of these scans were done on 10/18/98. Some of these folks may have upgraded/changed servers since then.
Note that I do not like every site on here.
# "Hacker" sites or (in a couple cases) sites that think they are
www.l0pht.com => OpenBSD 2.2 - 2.4
www.insecure.org => Linux 2.0.31-34
www.rhino9.ml.org => Windows 95/NT # No comment :)
www.technotronic.com => Linux 2.0.31-34
www.nmrc.org => FreeBSD 2.2.6 - 3.0
www.cultdeadcow.com => OpenBSD 2.2 - 2.4
www.kevinmitnick.com => Linux 2.0.31-34 # Free Kevin!
www.2600.com => FreeBSD 2.2.6 - 3.0 Beta
www.antionline.com => FreeBSD 2.2.6 - 3.0 Beta
www.rootshell.com => Linux 2.0.35 # Changed to OpenBSD after
# they got owned.
# Security vendors, consultants, etc.
www.repsec.com => Linux 2.0.35
www.iss.net => Linux 2.0.31-34
www.checkpoint.com => Solaris 2.5 - 2.51
www.infowar.com => Win95/NT
# Vendor loyalty to their OS
www.li.org => Linux 2.0.35 # Linux International
www.redhat.com => Linux 2.0.31-34 # I wonder what distribution :)
www.debian.org => Linux 2.0.35
www.linux.org => Linux 2.1.122 - 2.1.126
www.sgi.com => IRIX 6.2 - 6.4
www.netbsd.org => NetBSD 1.3X
www.openbsd.org => Solaris 2.6 # Ahem :)
www.freebsd.org => FreeBSD 2.2.6-3.0 Beta
# Ivy league
www.harvard.edu => Solaris 2.6
www.yale.edu => Solaris 2.5 - 2.51
www.caltech.edu => SunOS 4.1.2-4.1.4 # Hello! This is the 90's :)
www.stanford.edu => Solaris 2.6
www.mit.edu => Solaris 2.5 - 2.51 # Coincidence that so many good
# schools seem to like Sun?
# Perhaps it is the 40%
# .edu discount :)
www.berkeley.edu => UNIX OSF1 V 4.0,4.0B,4.0D
www.oxford.edu => Linux 2.0.33-34 # Rock on!
# Lamer sites
www.aol.com => IRIX 6.2 - 6.4 # No wonder they are so insecure :)
www.happyhacker.org => OpenBSD 2.2-2.4 # Sick of being owned, Carolyn?
# Even the most secure OS is
# useless in the hands of an
# incompetent admin.
# Misc
www.lwn.net => Linux 2.0.31-34 # This Linux news site rocks!
www.slashdot.org => Linux 2.1.122 - 2.1.126
www.whitehouse.gov => IRIX 5.3
sunsite.unc.edu => Solaris 2.6
Notes: In their security white paper, Microsoft said about their lax security: "this assumption has changed over the years as Windows NT gains popularity largely because of its security features.". Hmm, from where I stand it doesn't look like Windows is very popular among the security community :). I only see 2 Windows boxes from the whole group, and Windows is _easy_ for nmap to distinguish since it is so broken (standards wise).
And of course, there is one more site we must check. This is the web site of the ultra-secret Transmeta corporation. Interestingly the company was funded largely by Paul Allen of Microsoft, but it employs Linus Torvalds. So do they stick with Paul and run NT or do they side with the rebels and join the Linux revolution? Let us see:
We use the command: nmap -sS -F -o transmeta.log -v -O www.transmeta.com/24
This says SYN scan for known ports (from /etc/services), log the results to 'transmeta.log', be verbose about it, do an OS scan, and scan the class 'C' where www.transmeta.com resides. Here is the gist of the results:
neon-best.transmeta.com (206.184.214.10) => Linux 2.0.33-34
www.transmeta.com (206.184.214.11) => Linux 2.0.30
neosilicon.transmeta.com (206.184.214.14) => Linux 2.0.33-34
ssl.transmeta.com (206.184.214.15) => Linux unknown version
linux.kernel.org (206.184.214.34) => Linux 2.0.35
www.linuxbase.org (206.184.214.35) => Linux 2.0.35 (possibly the same machine as above)
Well, I think this answers our question pretty clearly :).
ACKNOWLEDGEMENTS
The only reason Nmap is currently able to detect so many different operating systems is that many people on the private beta team went to a lot of effort to search out new and exciting boxes to fingerprint! In particular, Jan Koum, van Hauser, Dmess0r, David O'Brien, James W. Abendschan, Solar Designer, Chris Wilson, Stuart Stock, Mea Culpa, Lamont Granquist, Dr. Who, Jordan Ritter, Brett Eldridge, and Pluvius sent in tons of IP addresses of wacky boxes and/or fingerprints of machines not reachable through the Internet.
Thanks to Richard Stallman for writing GNU Emacs. This article would not be so well word-wrapped if I was using vi or cat and ^D.
Questions and comments can be sent to fyodor@DHP.com (if that doesn't work for some reason, use fyodor@insecure.org). Nmap can be obtained from http://www.insecure.org/nmap .
A window scale of 10 makes the window 2^10 x 65535.
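As a quick check of that arithmetic: the scale value is a left shift applied to the 16-bit window field. A tiny sketch follows; scaled_window is a name invented here, and the limit of 14 on the shift is the maximum that RFC 1323 allows.

```python
def scaled_window(window_field: int, shift: int) -> int:
    """Effective receive window for a 16-bit window field and a scale shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    if not 0 <= shift <= 14:
        raise ValueError("RFC 1323 limits the shift count to 14")
    return window_field << shift


# The largest window with the scale of 10 that nmap sends:
print(scaled_window(65535, 10))  # 67107840 bytes, i.e. 2**10 * 65535
```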