[Dailydave] Source Code Analysis

Mateusz Berezecki mateuszb at gmail.com
Thu Sep 7 12:03:49 EST 2006


On 9/7/06, Dave Aitel <dave at immunityinc.com> wrote:

> CoolQ gave a talk on his efforts regarding source code analysis via
> gcc AST translation and state-table analysis at XCon 2006. I thought
> it was well put together for people who are not completely wrapped in
> static analysis to understand the basic concepts. I don't think his
> paper is available publicly yet, but he found some bugs in the Linux
> kernel with his tool relating to lock/unlock issues. His tool is also
> not public, but the concepts don't seem that hard to implement for the
> GCC team or someone familiar with the code-base.
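
To make the lock/unlock idea from the quoted message concrete, here is a
minimal sketch of a state-table check. It is not CoolQ's tool (which is
not public), and every name in it (TRANSITIONS, ERRORS, check_path, the
event names) is made up for illustration. It also assumes the execution
paths have already been extracted, which in a real tool would come from
the compiler's AST or control-flow graph.

```python
# Hypothetical state table: (state, event) -> next state, or an error message.
UNLOCKED, LOCKED = "unlocked", "locked"

TRANSITIONS = {
    (UNLOCKED, "lock"): LOCKED,
    (LOCKED, "unlock"): UNLOCKED,
}

ERRORS = {
    (LOCKED, "lock"): "double lock",
    (UNLOCKED, "unlock"): "unlock without lock",
    (LOCKED, "return"): "return with lock held",
}


def check_path(events):
    """Walk one execution path (a list of event names) through the table.

    Returns a list of (index, message) findings. Events that the table
    does not mention are ignored and leave the state unchanged.
    """
    state = UNLOCKED
    findings = []
    for index, event in enumerate(events):
        key = (state, event)
        if key in ERRORS:
            findings.append((index, ERRORS[key]))
        state = TRANSITIONS.get(key, state)
    return findings


if __name__ == "__main__":
    print(check_path(["lock", "work", "unlock", "return"]))
    print(check_path(["lock", "work", "return"]))
```

The table itself is tiny; the hard part in practice is enumerating the
paths to feed into it.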

In my opinion there needs to be a separate tool focused entirely on
code scanning, rather than a defect scanner implemented within the
compiler suite itself.

As for static analysis itself, there are a lot of good papers online,
for example at http://metacomp.stanford.edu, and there are also a lot
of books which cover the topic in more depth and show that static
analysis is not that hard. It just takes some time, and the right
background knowledge, to write useful tools.

Most scanners are implemented in a super-grep kind of way: instead of
knowing just the syntax, they also know the semantics of the code, so
they "know" what a pointer is, etc., and can search the code for
specific user-specified patterns.
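
A rough sketch of the difference between grep and a scanner that
understands the code. As a stand-in it uses Python's standard ast module
on Python source; a scanner for C would need a C front end instead. The
function name find_risky_calls, the choice of eval/exec as the pattern,
and the SAMPLE text are all assumptions made for the example.

```python
import ast


def find_risky_calls(source, names=("eval", "exec")):
    """Return sorted (line, name) pairs for calls to `names` whose first
    argument is not a literal constant."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in names:
            if node.args and not isinstance(node.args[0], ast.Constant):
                hits.append((node.lineno, func.id))
    return sorted(hits)


# The source is only parsed, never executed, so undefined names are fine.
SAMPLE = '''
# eval(user_input) in a comment is not a call
note = "eval(x) inside a string is not a call either"
safe = eval("1 + 1")
risky = eval(user_input)
'''

if __name__ == "__main__":
    for line, name in find_risky_calls(SAMPLE):
        print(f"line {line}: {name}() called with a non-literal argument")
```

A plain grep for "eval(" would match all four lines of SAMPLE; the
tree-based search reports only the last one, because it can tell a real
call from a comment or string and can look at what kind of argument is
passed.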

There are some interesting experiments showing that even probability
theory can be applied to static analysis, simply by comparing code
patterns which occur often (and are therefore assumed to be valid!)
against patterns which occur only once or twice (and are suspected of
being bugs). This works even if the defect tool has no prior knowledge
of the code it's scanning and none of the user-specified patterns
match, so the tool can discover new patterns on its own. It's all about
collecting the appropriate pieces of knowledge and putting them
together. It's surprising that most of this knowledge is very easy to
grasp.
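
A minimal sketch of the statistical idea, assuming the per-function call
sequences have already been extracted somehow. The helper names
(ordered_pairs, find_deviants), the thresholds and the sample data are
all invented for the example and not taken from any particular paper or
tool.

```python
from collections import Counter


def ordered_pairs(calls):
    """All (a, b) such that some call to a comes before some call to b."""
    return {(a, b)
            for i, a in enumerate(calls)
            for b in calls[i + 1:]
            if a != b}


def find_deviants(functions, threshold=0.75, min_support=4):
    """functions maps a function name to the ordered list of calls it makes.

    A pair (a, b) is treated as a likely rule when at least `min_support`
    functions call a and b follows a in at least `threshold` of them.
    Functions that call a but break the rule are returned as suspects.
    """
    callers = Counter()    # a -> number of functions that call a
    followers = Counter()  # (a, b) -> number of functions where b follows a
    pairs_by_func = {}
    for name, calls in functions.items():
        pairs_by_func[name] = ordered_pairs(calls)
        callers.update(set(calls))
        followers.update(pairs_by_func[name])

    suspects = []
    for (a, b), count in followers.items():
        if callers[a] >= min_support and count / callers[a] >= threshold:
            for name, calls in functions.items():
                if a in calls and (a, b) not in pairs_by_func[name]:
                    suspects.append((name, a, b))
    return sorted(suspects)


# Hand-written call sequences; "f5" takes the lock and never releases it.
SAMPLE = {
    "f1": ["lock", "read", "unlock"],
    "f2": ["lock", "write", "unlock"],
    "f3": ["lock", "read", "write", "unlock"],
    "f4": ["lock", "log", "unlock"],
    "f5": ["lock", "read"],
}

if __name__ == "__main__":
    for name, a, b in find_deviants(SAMPLE):
        print(f"{name}: calls {a}() but never {b}() afterwards")
```

Nobody told the tool that lock must be paired with unlock; it inferred
that from four of the five functions and flagged the odd one out.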

Anyway, these are still just new enhancements to a years-old technique,
not a new technique in themselves. But it's good!!! I know a lot of
people who deal with computer science as _science_; they do really
amazing stuff with static analysis and model checking, and refuse to do
anything "commercial", as they claim this is in general a SAT
problem[1] and theory says most of this stuff is impossible. It's good
to see people actually try to find out what works in practice and what
doesn't.


Mateusz

[1] http://en.wikipedia.org/wiki/Boolean_satisfiability_problem

