tcpdump tshark
Updated Fri, 05 Aug 2022 04:52:15 GMT

Monitoring HTTP traffic using tcpdump


To monitor HTTP traffic between a server and a web server, I'm currently using tcpdump. This works fine, but I'd like to get rid of some superfluous data in the output (I know about tcpflow and wireshark, but they're not readily available in my environment).

From the tcpdump man page:

To print all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets.

tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
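The arithmetic in that filter can be unpacked as follows. This is only a sketch in plain shell, not part of tcpdump; the function name payload_len and the sample values (a 779-byte IPv4 packet with a 20-byte IP header and a 32-byte TCP header, and a 52-byte ACK-only packet) are made up for illustration.

```shell
# payload_len TOTAL_LEN IP_BYTE0 TCP_BYTE12
# Mirrors the filter: ip[2:2] - ((ip[0]&0xf)<<2) - ((tcp[12]&0xf0)>>2)
payload_len() {
    ip_total_len=$1                        # ip[2:2]: IPv4 total length field
    ip_hdr_len=$(( ($2 & 0xf) << 2 ))      # low nibble of ip[0]: header length in 32-bit words, times 4
    tcp_hdr_len=$(( ($3 & 0xf0) >> 2 ))    # high nibble of tcp[12]: data offset in words, times 4
    echo $(( ip_total_len - ip_hdr_len - tcp_hdr_len ))
}

payload_len 779 0x45 0x80   # data packet: non-zero result, so the filter keeps it
payload_len 52 0x45 0x80    # ACK-only packet: result is 0, so the filter drops it
```

A packet passes the filter only when the result is non-zero, i.e. when something is left after subtracting both headers from the total length.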

This command

sudo tcpdump -A 'src example.com and tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'

provides the following output:

19:44:03.529413 IP 192.0.32.10.http > 10.0.1.6.52369: Flags [P.], seq 918827135:918827862, ack 351213824, win 4316, options [nop,nop,TS val 4093273405 ecr 869959372], length 727

E.....@....... ....P..6.0.........D...... __..e=3...__HTTP/1.1 200 OK
Server: Apache/2.2.3 (Red Hat)
Content-Type: text/html; charset=UTF-8
Date: Sat, 14 Nov 2009 18:35:22 GMT
Age: 7149
Content-Length: 438

<HTML>
<HEAD>
<TITLE>Example Web Page</TITLE>
</HEAD>
<body>
<p>You have reached this web page ...</p>
</BODY>
</HTML>

This is nearly perfect, except for the highlighted part (the bytes printed before "HTTP/1.1 200 OK"). What is this, and -- more importantly -- how do I get rid of it? Maybe it's just a little tweak to the expression at the end of the command?




Solution

tcpdump prints complete packets. The "garbage" you see is actually the IP and TCP headers of each packet, which -A renders as ASCII along with the payload.
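One way to convince yourself of this: the first character printed for an IPv4 packet is the first byte of its IP header, and "E" is 0x45, i.e. version 4 with a 5-word (20-byte) header. A small shell sketch decodes that byte; ip_first_byte is a hypothetical helper name, not a tcpdump feature.

```shell
# ip_first_byte CHAR: decode the first byte of an IPv4 header given as an ASCII character.
ip_first_byte() {
    byte=$(printf '%d' "'$1")          # numeric value of the character, e.g. E -> 69 (0x45)
    version=$(( byte >> 4 ))           # high nibble: IP version
    hdr_len=$(( (byte & 0xf) * 4 ))    # low nibble: header length in 32-bit words, times 4
    echo "version=$version header_bytes=$hdr_len"
}

ip_first_byte E
```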

You can certainly massage the output with, e.g., a Perl script, but why not use tshark, the textual version of Wireshark, instead?

tshark 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'

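It accepts the same capture filter syntax as tcpdump (both use libpcap), but since it's an analyzer it can do deep packet inspection, so you can refine your filters even more with a display filter (-R here; tshark 1.10 and later use -Y for the same purpose), e.g.: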

tshark 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -R'http.request.method == "GET" || http.request.method == "HEAD"'
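If tshark is not available, the "massage the output" route mentioned above can be sketched with sed instead. This is a rough sketch rather than a robust parser: strip_http_prefix is a made-up name, it only handles HTTP response status lines, and it assumes the header bytes and the status line end up on the same output line, as in the question.

```shell
# strip_http_prefix: drop everything printed before an HTTP status line
# (for example the IP/TCP header bytes that tcpdump -A renders as dots).
# Lines without a status line pass through unchanged.
strip_http_prefix() {
    sed 's/^.*\(HTTP\/[0-9.]* [0-9][0-9][0-9]\)/\1/'
}

# Possible usage (needs root and a live interface, so it is commented out here):
# sudo tcpdump -A -l 'src example.com and tcp port 80' | strip_http_prefix
printf '%s\n' 'E.....@...__..e=3...__HTTP/1.1 200 OK' 'Content-Length: 438' | strip_http_prefix
```

The tcpdump summary line (timestamp, addresses, flags) is left untouched by this sketch; removing it would need an additional rule.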





Comments (2)

  • +1 – Thanks -- after trying out all the suggestions, tshark seems like the best tool for the job. I'm currently using "tshark -d tcp.port==8070,http -R 'http.request or http.response'". Now if only I could get tshark to "follow the tcp stream" just like wireshark can (This gets asked a lot, but I still haven't found the answer). "-V" displays info about the TCP and IP packets and so on, which I'm not interested in. But I guess I can remove that using a script. — Nov 18, 2009 at 15:39  
  • +4 – You can also search for "GET" in a capture filter by matching the ASCII values for each character: tcp port 80 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420. I added a page to the Wireshark web site a while back that helps you create string matching capture filters: wireshark.org/tools/string-cf.html — Sep 13, 2011 at 20:03
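The hex constant in that capture filter is just the ASCII bytes of "GET " (including the trailing space). A sketch of how such a constant could be derived in shell; ascii_to_hex and http_method_filter are hypothetical helper names, and the generated filter is only meaningful for strings of exactly 4 bytes because it compares a 4-byte field.

```shell
# ascii_to_hex STRING: print the bytes of STRING as lowercase hex digits.
ascii_to_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# http_method_filter FOUR_CHARS: build a capture filter that compares the first
# 4 payload bytes, using the TCP data offset to skip past the TCP header.
http_method_filter() {
    echo "tcp port 80 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x$(ascii_to_hex "$1")"
}

http_method_filter 'GET '
```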