payload extraction from multiple streams - script

I needed a simple way to extract payloads from a pcap file: select packets with a BPF expression, then write every unique data stream that matches it to its own file in binary form. It had to work with almost any protocol, not only TCP (for that there is tcpflow). The script is very simple: ngrep does the work on the pcap file, and Perl converts the hex values to characters. There are many ways, and many tools, to extract the payload from streams; I just needed something quick and small.

The whole script is just one loop on the output from ngrep:

my $file=shift;
my $bpf=shift;
my $ts=time();

my $conv = "";
my $buff = "";

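open(CMD,"ngrep -qxlI $file \"\" \"$bpf\" |") or die "cannot run ngrep: $!";
while (<CMD>) {
  # packet header line: "<proto> <src>:<port> -> <dst>:<port>"
  if (/^\S+ (\S+?):(\d+) -> (\S+?):(\d+)(\s|$)/){
    # append the previous packet's payload to its conversation file
    if ($conv ne "" and $buff ne ""){
      open(OUT,">> $conv") or die "cannot open $conv: $!";
      print OUT map(chr(hex($_)),split(" ",$buff));
      close(OUT);
    }
    $buff = "";
    $conv = "out_$1_$2_$3_$4_$ts.bin";
    print "\r.";
  }
  # hex dump line: collect the 50-character hex column
  if (/^\s{2}(.{50})/) {
    $buff .= "$1 ";
  }
}
close(CMD);
# write out whatever is still buffered for the last packet
if ($conv ne "" and $buff ne ""){
  open(OUT,">> $conv") or die "cannot open $conv: $!";
  print OUT map(chr(hex($_)),split(" ",$buff));
  close(OUT);
}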

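The second regular expression matches a line of hex data and appends it to the buffer. The first regular expression matches a packet header line. When it matches and the buffer is not empty, the buffered data is appended to the file for the conversation it belongs to. The file name is built from the source and destination addresses and ports plus a timestamp, so each direction of a conversation gets its own file.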
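To make the two steps concrete, here is a minimal sketch that runs them on made-up input instead of real ngrep output. The helper names hex_to_bytes and conv_name are hypothetical and not part of the script, and the sample header and hex lines are invented for illustration. The regular expression and the map/chr/hex/split idiom are the ones used in the loop.

```perl
use strict;
use warnings;

# Turn a string of space-separated hex byte values into raw bytes,
# using the same map/chr/hex/split idiom as the script.
sub hex_to_bytes {
  my ($hex) = @_;
  return join "", map { chr(hex($_)) } split(" ", $hex);
}

# Build the output file name from a packet header line.
# Returns nothing if the line is not a header line.
sub conv_name {
  my ($line, $ts) = @_;
  if ($line =~ /^\S+ (\S+?):(\d+) -> (\S+?):(\d+)(\s|$)/) {
    return "out_${1}_${2}_${3}_${4}_${ts}.bin";
  }
  return;
}

# Hypothetical sample input, made up for illustration
my $header = "U 10.0.0.1:5353 -> 10.0.0.2:53";
my $hex    = "47 45 54 20 2f 20 48 54    54 50 2f 31 2e 31 0d 0a";

print conv_name($header, 0), "\n";   # out_10.0.0.1_5353_10.0.0.2_53_0.bin
print hex_to_bytes($hex);            # GET / HTTP/1.1 followed by CR LF
```

Because split(" ", ...) skips any run of whitespace, the wider gap in the middle of the hex column and the padding on a short last line do not produce empty values.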

It is simple and easy to modify for different protocols; the full code is here.

Or you can use tshark or wireshark.

tshark -T fields -e data -r input.pcap | xxd -r -p > extracted.dat