Friday, November 3, 2017

WHAAAA0101 0000 0011T !!! (aka Extracting files out of pcaps with Foremost & discussing TShark)



My SANS SEC503 NetWars coin 


So I thought I would use this as a chance to try out something Dr. Ullrich had mentioned in class: a tool I often use for forensic recovery, Foremost, can also extract files from the TCP/IP streams in pcap files. I had simply never noticed this feature in the Foremost documentation.

IN SUMMARY

This did work, but my idea of extracting an audio stream was not as simple as I had thought. I'll discuss only the simplest part of the capture below. Basically, I have an Apple TV (I may write quite a bit more about that in future posts, it is a lot of fun to monitor!) captive behind a Linux bridge, on which I captured about 30 seconds of packets. The nature of the device, or some already-finished handshake with the radio server, probably complicated the exercise. I'd habitually have used tcpdump for this if I hadn't wanted to explore TShark a bit.

The strangely altered JPEG, and the inability of Foremost to extract an audio file, could of course be caused by a number of things, for example encryption or simply incomplete streams. With TShark it is possible to go to much greater lengths to make sure you get all the traffic matching specific streams or TCP sessions; I've ignored that for this post, but it is something I am currently researching. The HTTP 404 pages I captured in the same pcap came out perfectly, so my results may also come down to which file types Foremost supports.

Since running Foremost was so simple, I'm also discussing various commands I ran with TShark on the same capture, to confirm and further analyze the same packets, and really just to explore the tshark man page more fully.

THE PROCESS

I've got an ongoing write-up about different free forensics tools and which standard Linux distributions they are in, such as Foremost and Sleuthkit in Debian, and I'll re-post that here when I am finished. As a huge fan of the command line, I had used TShark a bit before, but had not realized it was actually part of Wireshark and uses the same packet dissection code.

You can use Foremost to extract files from TCP streams for many purposes, of course, but really simple uses might be to decode unencrypted e-mail attachments or to pull some simple files out of a stream if you don't have Bro handy. For example, today I'm going to try to pull out a short clip of music, which I had streamed from Klassik Radio of Germany, on the Schiller Lounge channel. A full direct list of all the streams used to be here. I later tried it again with the stream on SomaFM from "DefCon Radio," another favorite because of the amazing relaxing music played there, not even the infosec name! As you'll see below, neither worked, and instead I got some trippy graphics & HTML files. Still, I did prove that Foremost works for this in general, and I could find a way at a later date to extract music streams if I really wanted to (in this case, it was just the first thing I had thought of to write this post and show off my medal :)).

Many people use Wireshark for extractions, which requires a bit of clicking around, and the most useful of all the tools I know of is probably a file-extraction script in Bro. But the utter speed of Foremost makes it a great choice to have in your toolbox, similar to using TShark for packet analysis. It blazed through the entire pcap nearly instantly, and that was on only a Chromebook.

I captured about 30 seconds with tshark -n -w [MyFullCaptureFile.pcap], during which music was playing on the Apple TV; I stopped the music, did an apt-get upgrade on the box, then started the music playing again.

I first cut my pcap down to just the music traffic, which is normally good practice for speed and resources, using tcpdump out of habit (note the -w: redirecting tcpdump's standard output with > would save the decoded text rather than the packets):

tcpdump -r MyFullCaptureFile.pcap -w radio.pcap 'host 85.239.108.20'

Of course you could use TShark to do the same thing, or even use TShark for the extraction itself with --export-objects or other options.
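After any filtering step like this, it is worth a quick check that the result is still a capture file and not decoded text, since a carving tool will happily run on either. Here is a minimal sketch, assuming a POSIX shell with od available; the function name pcap_kind is hypothetical, and it only looks at the first four bytes of the file:

```shell
# pcap_kind FILE
# Prints a rough guess at the capture format based on the file's magic number.
pcap_kind() {
    # First four bytes as lowercase hex with no spaces, e.g. "d4c3b2a1"
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        d4c3b2a1|a1b2c3d4) echo "pcap" ;;                         # classic libpcap
        4d3cb2a1|a1b23c4d) echo "pcap (nanosecond timestamps)" ;; # libpcap, ns variant
        0a0d0d0a)          echo "pcapng" ;;                       # pcapng section header block
        *)                 echo "unknown" ;;                      # e.g. text output
    esac
}
```

For example, pcap_kind radio.pcap should print "pcap" for a file written with tcpdump -w, and "unknown" for a file that only contains tcpdump's printed packet summaries.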


THE EXTRACTION

So I fired off: 

foremost -i radio.pcap

I also tried specifying file types, like foremost -i radio.pcap -t mp4, but since no file of that type was present, for whatever reason, this did not help here. It is how you'd limit the command, though; again, this applies the general principle of narrowing down your input and using tools quickly.

None of the above worked! There were no files in that reduced stream that Foremost liked.

However, when I scanned my original, entire capture through Foremost,

foremost -i MyFullCaptureFile.pcap

I very quickly found 3 files. By default Foremost creates a directory called "output", which contains a log, audit.txt, and the recovered files in folders sorted by file type. In my case it found HTML pages (404 responses) from the Debian update I had also captured in the same experimental session, and the JPEG below (I am guessing it is some portion of the album art or the screensaver of the Now Playing screen on Apple TV).
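The layout of that output directory (one audit.txt plus a folder per file type) makes it easy to get a quick count of what was carved. A small sketch, assuming a POSIX shell; the function name summarize_foremost is hypothetical and it only counts files, it does not read audit.txt:

```shell
# summarize_foremost [OUTPUT_DIR]
# Prints "type<TAB>count" for each per-type folder Foremost created.
summarize_foremost() {
    outdir=${1:-output}                  # Foremost's default directory name
    for dir in "$outdir"/*/; do
        [ -d "$dir" ] || continue        # skip if the glob matched nothing
        count=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$dir")" "$count"
    done
}
```

Something like foremost -i MyFullCaptureFile.pcap -o carved followed by summarize_foremost carved would then list each type folder with its file count (the directory name "carved" is just an example; -o is Foremost's option for choosing the output directory).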

trippy

THE CAPITAL AND LOWERCASE

I'd like to simply discuss some of the commands I used to explore the same packets, as well as commands I studied & practiced today in the man pages. Reasons why I love TShark include its speed and scriptability, and it is well known, among other things, for accepting Berkeley Packet Filter (BPF) expressions as capture filters, alongside Wireshark's own display-filter syntax for read filters.

The authors mention that as TShark progresses, expect more and more protocol fields to be allowed in read filters, so if you are a frequent user, look out for updated versions.

*The -G option is a special mode that simply makes TShark dump one of several types of internal glossaries and then exit. It actually makes loads of text spew to the console :) Really, what this does is blast out internal settings of TShark; if you don't specify a glossary (report) type, it picks one called "fields" for you, which dumps the contents of the registration database to stdout...

A perfect example: when I wanted to get the names of possible fields for the -T fields output mentioned below, I ran

tshark -G fields

(tshark -G column-formats, by contrast, lists the %-style column format strings rather than field names.)
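The full field glossary from tshark -G fields is enormous, so in practice you want to narrow it to one protocol. A rough sketch, assuming the tab-separated layout described in the man page, where field records start with "F" and the third column is the field abbreviation; the function name fields_for_proto is hypothetical:

```shell
# fields_for_proto PROTO
# Reads `tshark -G fields` output on stdin and prints only the field
# abbreviations that start with "PROTO." (for example tcp.stream).
fields_for_proto() {
    awk -F '\t' -v prefix="$1." \
        '$1 == "F" && index($3, prefix) == 1 { print $3 }'
}
```

Usage would look like tshark -G fields | fields_for_proto tcp | head, which gives you candidate names to pass to -e.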

*-f then lets you give a capture filter in BPF syntax (based on the Berkeley Packet Filter), which is handy to learn for a lot of reasons. It is the same filter language tcpdump uses; the difference is that with tcpdump you usually just type the expression at the end of the command, like "tcpdump [expression]". I hadn't realized before that pre-defined capture filter names, as shown in the Wireshark GUI menu item Capture->Capture Filters, can be used by prefixing the argument with "predef:". Example: -f "predef:MyPredefinedHostOnlyFilter"
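As a sketch of how -f fits into a live capture, the wrapper below limits the capture to one host and stops after a fixed number of seconds using -a duration:N. The function name capture_host is hypothetical, and the interface name in the usage note is an assumption about the bridge box, not something taken from the capture described in this post:

```shell
# capture_host IFACE HOST SECONDS OUTFILE
# Captures only traffic to/from HOST on IFACE, stops after SECONDS,
# and writes the raw packets to OUTFILE.
capture_host() {
    tshark -n -i "$1" -f "host $2" -a "duration:$3" -w "$4"
}
```

For example, capture_host eth0 85.239.108.20 30 radio.pcap. The tcpdump counterpart puts the same expression at the end instead: tcpdump -n -i eth0 -w radio.pcap 'host 85.239.108.20'.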

*I'll talk a bit below about how to use the -T fields [-e {list of fields}] option, because this really blew my mind the first few times I used it in class with Johannes. -T picks the output format (text, fields, json, and others), and with -T fields you then use -e to name each field you want printed. So a nice take-away here is that in TShark the capital-letter -E option is used to format the output of the related lower-case -e option:

  • don't confuse this with the timestamp-format (-t) option
  • the -T option (perhaps think of it as "table output") selects the format of the decoded packet output; -T fields is then followed by
  • the -e option, which defines the fields, &
  • the -E option, which gives specific, advanced formatting to that chosen field output
  • in other words: -T fields [ -e <field> ] [ -E <field print option> ]
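To make the -T / -e / -E relationship concrete, here is a small sketch that turns any list of field names into quoted CSV with a header row. The helper name pcap_to_csv is hypothetical; header=y, separator=, and quote=d are -E settings listed in the tshark man page:

```shell
# pcap_to_csv PCAP FIELD [FIELD...]
# Prints the chosen fields for every packet as double-quoted CSV.
pcap_to_csv() {
    pcap=$1
    shift
    args=""
    for field in "$@"; do
        args="$args -e $field"        # one -e per requested field
    done
    # $args is left unquoted on purpose so it splits back into separate
    # words; field names contain no spaces or glob characters.
    tshark -n -r "$pcap" -T fields $args \
        -E header=y -E separator=, -E quote=d
}
```

For example, pcap_to_csv radio.pcap frame.number ip.src ip.dst tcp.stream > radio.csv gives a file a spreadsheet can open directly.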

The way we used this in 503 was to cut nice, C-Suite worthy data out of large packet files, way faster than you could ever use Wireshark to do in most circumstances.

For example:  tshark -n -r radio.pcap -T fields -e tcp.stream | sort -un

to find the specific stream numbers (sort -un rather than plain uniq, since uniq only collapses adjacent duplicates), optionally adding limitations like -Y 'tcp.srcport == 8000 and tcp.dstport == 80'. In this example it just showed me the stream number of my small capture.

This of course could be scripted and even pumped directly into stuff like gnuplot or the R programming language to make charts, or even "e-mail a random set of 5 data sets from the previous day each morning to your boss" ;-)
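As one small instance of that kind of scripting, the sketch below counts packets per TCP stream from the -T fields output, producing two tab-separated columns that a plotting tool could read. The function name count_per_stream is hypothetical; it assumes one tcp.stream value per line on stdin, with empty lines for packets that have no TCP stream:

```shell
# count_per_stream
# stdin:  one tcp.stream number per line (blank lines are ignored)
# stdout: "stream<TAB>packet_count", sorted by stream number
count_per_stream() {
    grep -v '^$' \
        | sort -n \
        | uniq -c \
        | awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

Usage: tshark -n -r MyFullCaptureFile.pcap -T fields -e tcp.stream | count_per_stream > streams.tsv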

Someone once said that someone who can't script will soon be scripted out of their job! Shoutout to Matt Domko and the #python channel at the BrakeSec Slack, as well as Mendel Cooper aka Grindell with the famous Advanced Bash-Scripting Guide at the LDP.

The example in the man page is to use:

tshark [-r input {or capture}] -T fields -e frame.number -e ip.addr -e udp -e _ws.col.Info

Note that giving a protocol rather than a single field will print multiple items of data about the protocol as a single field. Fields are separated by tab characters by default.

*The -z (statistics) option also gives you a waterfall of options, almost NetFlow-like data (without first having to convert to SiLK format, for example with rwp2yaf2silk). In a way it gives the kind of summaries you see with capinfos or other commands; for example, it showed me a neat summary of packets at the end of my analysis with:

tshark -z conv,tcp -r myfile.pcap | tail

Another example we used in SEC503 was tshark -r myfile.pcap -z http_req,tree to see an ASCII "tree" representation of the http requests, very cool!  -z has a huge list of possible switches.
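Several -z reports can be requested in one pass, and adding -q suppresses the per-packet lines so that only the statistics are printed (which avoids the need for tail). A minimal sketch; the function name pcap_stats is hypothetical, and the choice of these three reports is just an illustration:

```shell
# pcap_stats PCAP
# Prints TCP conversations, the protocol hierarchy, and the HTTP request
# tree for a capture file, without the per-packet output.
pcap_stats() {
    tshark -q -n -r "$1" \
        -z conv,tcp \
        -z io,phs \
        -z http_req,tree
}
```

For example: pcap_stats myfile.pcap | less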





Congratulations to all the other SANS coin winners in Singapore and in Berlin! 
