[Dailydave] PAPER: Dynamic Data Flow Analysis via Virtual Code Integration (aka The SpiderPig case)
Piotr Bania
bania.piotr at gmail.com
Mon May 18 14:20:20 EDT 2009
Yo,
>I've got a few questions regarding your approach.
>
>1) In section 4.4 you discuss predicting data propagation and you use
>the term 'symbolic execution'. Does this mean you treat all input as
>symbolic? e.g. everything from a recv() call is marked as 'tainted'
>2) If the answer to the previous question is 'yes'; how do you deal
>with symbolic read/writes using your O_in/O_out register mechanism? I
>can't see this working for memory, as the size of those sets becomes
>potentially unbounded (well, bounded by the amount of usable memory)
>e.g how do you describe the memory written to by **mov dword ptr
>[eax], ebx** if eax is symbolic and dependent on user input? A more
>tangible situation might be the case where a child object is created,
>then written to memory at a symbolic offset and then later read again.
First of all, SpiderPig in its current shape requires the user to pick the
starting point (that was the main idea from the beginning). In other
words, the user must specify the register or memory region which is tainted
(pick a root of the taint). SpiderPig can taint either a memory location or
CPU elements such as registers, flags, etc. Regarding section 4.4
(Predicting Data Propagation), the symbolic execution approach (O_in/O_out
variants) refers only to the elements of the CPU architecture, not the
memory locations they point to.
So if SpiderPig meets an instruction like "mov dword ptr [eax], ebx", the
"ProcessStandardInstruction()" function (see Algorithm 1, page 22) is used.
In this case it does the following:
1) kills the 4-byte memory region pointed to by EAX (saving all the
information about the killer instruction)
2) if the EBX value is tainted, then the 4-byte memory region pointed to by
EAX is also tainted
To save time, the referenced memory address (in this case the one pointed to
by EAX) is computed by the instrumentation code (on the fly) inside the
target process.
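The kill-then-taint handling of a store described above can be sketched
roughly as follows. This is only an illustration in Python, not SpiderPig's
actual code: the names TaintState and process_store are hypothetical, and
the destination address is assumed to have already been computed at run time
by the instrumentation.

```python
from dataclasses import dataclass, field


@dataclass
class TaintState:
    tainted_regs: set = field(default_factory=set)  # tainted CPU elements
    tainted_mem: set = field(default_factory=set)   # tainted byte addresses
    killers: dict = field(default_factory=dict)     # byte address -> last overwriting instruction


def process_store(state, insn, dst_addr, size, src_reg):
    """Model a store such as 'mov dword ptr [eax], ebx' (hypothetical helper).

    dst_addr is the concrete address the destination operand resolves to.
    """
    region = range(dst_addr, dst_addr + size)
    # Step 1: kill the destination region and remember the killer instruction.
    for addr in region:
        state.tainted_mem.discard(addr)
        state.killers[addr] = insn
    # Step 2: if the source register is tainted, the written region becomes tainted.
    if src_reg in state.tainted_regs:
        state.tainted_mem.update(region)


# EBX is the user-selected root of the taint; EAX resolved to 0x1000 at run time.
state = TaintState(tainted_regs={"ebx"})
process_store(state, "mov dword ptr [eax], ebx", dst_addr=0x1000, size=4, src_reg="ebx")
```

Because the address is concrete by the time the store is processed, the
tracked memory set only grows with the bytes actually written, not with every
address a symbolic EAX could take.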
Now if there are any possible data propagations afterwards in the Dataflow
Region between CPU elements, the symbolic execution approach is used. I
think it is important to note that each Dataflow Region is considered
side-effect free (see Definition 4, page 24).
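To illustrate the idea of predicting propagation between CPU elements inside
a side-effect free region, here is a rough Python sketch. It is an
assumption-laden simplification and not the paper's O_in/O_out definitions:
the (destination, sources) instruction encoding and the function names
summarize_region and apply_summary are made up for this example.

```python
def summarize_region(instructions):
    """Walk (dst, srcs) pairs once and return {output register: region inputs}.

    A source that has not been written earlier in the region counts as a
    region input; otherwise its own dependencies are substituted in.
    """
    deps = {}
    for dst, srcs in instructions:
        inputs = set()
        for src in srcs:
            inputs |= deps.get(src, {src})
        deps[dst] = inputs
    return deps


def apply_summary(summary, tainted):
    """Apply a precomputed region summary to the taint set at region entry."""
    result = set(tainted)
    for out, inputs in summary.items():
        if inputs & tainted:
            result.add(out)      # output depends on at least one tainted input
        else:
            result.discard(out)  # output was overwritten with untainted data
    return result


# Hypothetical region: mov eax, ebx / add ecx, eax / mov edx, 0
region = [("eax", ["ebx"]), ("ecx", ["ecx", "eax"]), ("edx", [])]
summary = summarize_region(region)
after = apply_summary(summary, {"ebx", "edx"})
```

The point of the summary is that it is computed once per region, so at run
time only the taint state at region entry has to be consulted instead of
every instruction.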
> 3) What DynamoRIO plugin are you comparing your code to?
If you are referring to "Test application's performance" (page 39), it was a
very simple plugin (written by me) whose task was to gather and save the CPU
context for each executed instruction. As I stated in section 5.2.3
("Analysis (Instrumentation) Performance", page 38), there is nothing really
to compare. VCI is VCI, DBI is DBI, and IMHO they should be treated
separately. In short, with VCI I don't need to waste time on dispatcher
calls for "every"* transfer instruction. Anyway, DynamoRIO is personally my
favorite DBI so far, and I really admire Derek and the rest of the authors
for providing such an excellent tool. I think it is quite possible I will
port SpiderPig to DynamoRIO, especially now that it has become an open
source project[1].
>Cheers, and good work,
I'm glad you liked it and I hope I have answered your questions.
cheers,
pb
* I am aware DynamoRIO has some optimizations for that
[1] - http://code.google.com/p/dynamorio/
On Mon, May 18, 2009 at 1:32 PM, Piotr Bania <bania.piotr at gmail.com> wrote:
> SpiderPig is a project created for performing and visualizing data flow
> analysis of a selected binary program. SpiderPig was created for the
> purpose of providing a tool which would help vulnerability and security
> researchers with tracing and analyzing any necessary data and its further
> propagation. Such tasks are very often crucial in the vulnerability
> discovery/identification process and typically require a lot of
> time-consuming manual work. The following paper discusses methods and
> techniques implemented in SpiderPig in order to perform semi-automatic
> data flow analysis.
>
> Paper is available here:
> http://piotrbania.com/all/spiderpig/pbania-spiderpig2008.pdf
>
>
> Simple video demo and some other things available on project website:
> http://piotrbania.com/all/spiderpig/
>
>
> best regards,
> Piotr Bania
>
> --
> --------------------------------------------------------------------
> Piotr Bania - <bania.piotr at gmail.com> - 0xCD, 0x19
> Fingerprint: 413E 51C7 912E 3D4E A62A BFA4 1FF6 689F BE43 AC33
> http://www.piotrbania.com - Key ID: 0xBE43AC33
> --------------------------------------------------------------------
>
> - "The more I learn about men, the more I love dogs."
>
>
> P.S Did ya know adult pigs can run at speeds of up to 11 miles an hour?
>
> _______________________________________________
> Dailydave mailing list
> Dailydave at lists.immunitysec.com
> http://lists.immunitysec.com/mailman/listinfo/dailydave
>
--
http://www.unprotectedhex.com
http://www.smashthestack.org