[Dailydave] Unknown Application Protocol Analysis

Jared DeMott demottja at msu.edu
Thu Sep 7 06:23:16 EST 2006


>
> Q. How do you run a quick one pass analysis of some proprietary
> application protocol?
>
I am certainly thinking about this problem. My goal is to write a tool
that can "automatically" fuzz anything (TCP/UDP network application
protocols for now), whether the protocol is ASCII-based (FTP), binary
(DNS), or includes hashes/encryption (IKE). It should work in both the
client and server directions. I'd like to:
- sniff real traffic
- tokenize that traffic into an internal format (for DNS: sessionID,
  Flags, ProtoID, ProtoID, ProtoID, len, ascii, len, ascii, etc.)
- fuzz with intelligence
    - fuzz binary numbers like ProtoID differently than we fuzz a len or
      ascii field
    - fix up things like the sessionID, len, etc. so they are correct
- monitor the target app to fuzz with even more power
    - perhaps to automatically increase code coverage, target certain
      naughty functions, catch memory access violations, etc.
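
To make the tokenize / fuzz / fix-up steps concrete, here is a rough
Python sketch for a plain DNS query. It is not GPF code. The token kinds
("id", "num", "len", "ascii") and the function names are made up for
illustration, and it assumes a single question with no name compression.

```python
import random
import struct

def tokenize_dns_query(packet):
    """Split a simple DNS query into (kind, value) tokens (hypothetical format)."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    tokens = [("id", ident), ("num", flags), ("num", qd),
              ("num", an), ("num", ns), ("num", ar)]
    pos = 12
    while packet[pos] != 0:               # walk the length-prefixed labels
        length = packet[pos]
        tokens.append(("len", length))
        tokens.append(("ascii", packet[pos + 1:pos + 1 + length]))
        pos += 1 + length
    tokens.append(("len", 0))             # terminating root label
    qtype, qclass = struct.unpack("!2H", packet[pos + 1:pos + 5])
    tokens += [("num", qtype), ("num", qclass)]
    return tokens

def mutate(tokens, rng):
    """Fuzz one token, choosing the mutation by token kind."""
    out = list(tokens)
    candidates = [i for i, (kind, _) in enumerate(out) if kind in ("num", "ascii")]
    i = rng.choice(candidates)
    kind, value = out[i]
    if kind == "num":
        # binary numbers get boundary values
        out[i] = (kind, rng.choice([0, 1, 0x7FFF, 0x8000, 0xFFFF]))
    else:
        # ascii fields get grown
        out[i] = (kind, value * rng.randint(2, 8))
    return out

def serialize(tokens, fix_lengths=True):
    """Rebuild the packet, recomputing each len from the ascii token after it."""
    buf = bytearray()
    for i, (kind, value) in enumerate(tokens):
        if kind in ("id", "num"):
            buf += struct.pack("!H", value)
        elif kind == "len":
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if fix_lengths and nxt is not None and nxt[0] == "ascii":
                value = len(nxt[1])
            buf.append(value & 0xFF)
        else:
            buf += value
    return bytes(buf)
```

Calling serialize(mutate(tokenize_dns_query(pkt), random.Random())) gives
one fuzzed packet. Passing fix_lengths=False leaves the stale len bytes in
place, which is a useful test case in its own right.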

I'm taking a genetic algorithms class this semester. I hope to use GAs
to help me do some of the above better. You can download my tool (GPF)
from http://www.appliedsec.com/developers.html, but I'd wait a couple of
weeks. I've made substantial changes lately as I work toward the above
goals, but I haven't uploaded them yet since they're not quite working.
I'll post to this list when I've got something worth downloading. :)
>
> I know it's fairly easy to look at small subsets of traffic manually,
> looking for the \x00 and slowly guess-timate where fields begin and end,
> what constitutes a record, what are static offsets etc, but I'm imagining
> a tool that would take in a batch of traffic and work out roughly what's
> what, seeing the big picture.
>
> I'd imagine this tool would run a first check, looking for what might
> constitute discrete units of information, (possibly all those bounded by
> \x00).
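
That first pass is easy to sketch. The function name below is invented
for illustration; it just returns every run of non-NUL bytes together
with its offset, so later passes can refer back to positions in the
packet.

```python
import re

def split_on_nul(data):
    """Return (offset, chunk) for each run of non-NUL bytes in data."""
    return [(m.start(), m.group()) for m in re.finditer(rb"[^\x00]+", data)]
```

This will over-split binary fields that happen to contain zero bytes, so
the result is only a starting guess.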
>
> I'd imagine this tool would then look for some of the basic layouts of TLV
> protocols (which seem most common IMHO) by working out lengths of what
> appear to be strings, and look for those ints before or after. Maybe even
> looking for md5 or sha1 hashes that correspond to other data fields. Then
> look for repeating byte patterns etc.
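
A minimal sketch of those two heuristics, under some loud assumptions:
lengths are a single byte placed directly before the string, strings are
printable ASCII, and a hash is a raw MD5 or SHA-1 digest of exactly one
other field. The function names are hypothetical.

```python
import hashlib
import string

# printable ASCII without the control-style whitespace characters
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\t\n\r\x0b\x0c")

def find_length_prefixed_strings(data, min_len=3):
    """Return (offset, length, string) where a byte n is followed by n printable bytes."""
    hits = []
    i = 0
    while i < len(data):
        n = data[i]
        body = data[i + 1:i + 1 + n]
        if n >= min_len and len(body) == n and all(b in PRINTABLE for b in body):
            hits.append((i, n, body))
            i += 1 + n                    # skip past the matched field
        else:
            i += 1
    return hits

def find_digests(data, fields):
    """Return (algorithm, field, offset) when a digest of a field appears in data."""
    hits = []
    for field in fields:
        for name in ("md5", "sha1"):
            digest = hashlib.new(name, field).digest()
            pos = data.find(digest)
            if pos != -1:
                hits.append((name, field, pos))
    return hits
```

Expect false positives: a printable byte such as 'h' (104) can pass for a
length if enough printable bytes follow it, so hits should be confirmed
across many packets before they are trusted.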
>
> Once it understands the structure of a single packet, then compare it over
> time with other packets between similar hosts, looking for which fields are
> constant, which ones change randomly (signifying GUID or Message IDs) and
> those that only change slightly (perhaps timing fields). This would be
> where the real knowledge would lie, as assumptions made about individual
> packets (eg what is really static or dynamic) could be rectified over a
> larger data-set.
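
The cross-packet comparison could start out as simple as the sketch
below. It assumes the packets are already aligned into fixed-width
big-endian fields (4 bytes here) and are given in capture order; the
labels and the small_step threshold are arbitrary choices for
illustration.

```python
def classify_fields(packets, width=4, small_step=1024):
    """Label each fixed-width field as constant, incrementing, or variable."""
    n_fields = min(len(p) for p in packets) // width
    labels = []
    for f in range(n_fields):
        values = [int.from_bytes(p[f * width:(f + 1) * width], "big")
                  for p in packets]
        deltas = [b - a for a, b in zip(values, values[1:])]
        if all(d == 0 for d in deltas):
            labels.append("constant")      # e.g. a static header
        elif all(0 <= d <= small_step for d in deltas):
            labels.append("incrementing")  # e.g. a counter or timestamp
        else:
            labels.append("variable")      # e.g. a GUID or message ID
    return labels
```

It needs at least two packets to say anything useful (with one packet
every field comes out "constant"), and real protocols with
variable-length fields would have to be tokenized first.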
>
> Then print this out in a way like:
>
> <static header><record 1><length><Unicode content><\x88\x88\x88><record
> 2><length><COMPUTER_NAME><record 3><CURRENT_TIME><unknown static crud>
>
> Producing an Ethereal protocol definition file at the end would be
> icing on the cake!
>
> I've had a look at:
> [1]
> http://research.microsoft.com/workshops/sysml/papers/sysml-Gopalratnam.pdf
> [2] http://www.ub.utwente.nl/webdocs/ctit/1/000000ef.pdf
>
> But can't seem to find any public code that has attempted to solve the
> same problem.
> Has anyone else thought about this, or know of code I should look at?
>
> Rhys


