Oct 16, 2017

Vulnerability/Software Bugs as language modelling

http://videolectures.net/deeplearning2017_blunsom_language_understanding/

What I learned that can be applied to vulnerability discovery:

a.   Language modelling ==> bug modelling.

b.   Patterns of language ==> patterns of bugs.

c.   Sequence of language tokens ==> the sequence of programming tokens that constitutes a "bug".

d.   A sequence of strings and its probability of occurrence ==> given a sequence of strings, what is the probability that it is a "bug"? (Many similar sequences are not necessarily bugs.)

e.   The longer the sequence of strings, the richer the meaning it can convey ==> the longer the sequence, the more contextual information it carries, and thus the more certainty with which a software vulnerability can be asserted.

f.   "Grammar" of a language ==> grammar for describing software programs.

g.   Generator of language sentences (obeying some grammatical rules) ==> random generator of programming tokens.

h.   N-gram model in language ==> canonical sequence that maps to each algorithmic operation.
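To make points (d) and (h) concrete, here is a minimal sketch of a bigram (2-gram) model over program tokens, written in Python. Everything in it is an assumption for illustration: the toy corpus of API-call sequences, the names train_bigram and log_prob, and the choice of add-one smoothing are mine, not from the lecture. It only shows how a sequence of tokens gets a probability; a low score marks a sequence as unusual, which (as noted in d) does not by itself make it a bug.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count contexts and bigrams over token sequences (hypothetical helper)."""
    contexts, bigrams = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        contexts.update(padded[:-1])             # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))  # adjacent token pairs
    vocab = {tok for seq in sequences for tok in seq} | {"<s>", "</s>"}
    return contexts, bigrams, vocab

def log_prob(seq, model):
    """Log-probability of a token sequence under the bigram model."""
    contexts, bigrams, vocab = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one smoothing so unseen pairs get a small, non-zero probability.
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab)))
    return total

# Toy corpus: made-up token sequences standing in for "normal" code.
corpus = [
    ["malloc", "use", "free"],
    ["malloc", "check", "use", "free"],
    ["open", "read", "close"],
]
model = train_bigram(corpus)

usual = log_prob(["malloc", "use", "free"], model)
unusual = log_prob(["malloc", "free", "use"], model)  # use-after-free ordering
print(usual, unusual)
```

On this toy corpus the use-after-free ordering scores lower than the usual ordering, because the pairs (malloc, free) and (free, use) never occur in the training data. A longer n (point e) would capture more context at the cost of needing far more data.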

Action Items:

1.   If each sequence of strings can be preceded by some "determining factor", or "classifier", we can easily add more contextual information to the existing construction. With this "classifier", it is possible to reorder the sequence without affecting the context or meaning given to the construction.

2.   Generator of language ==> generator of programs.
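Action item 2 can be sketched as a random generator driven by a grammar, in the spirit of grammar-based test generation. The grammar below is a made-up toy (C-like statements built from a handful of tokens), and the names GRAMMAR and generate are hypothetical; a real generator would need the actual grammar of the target language.

```python
import random

# Hypothetical toy grammar: keys are non-terminals, values are lists of productions.
GRAMMAR = {
    "<stmt>": [["<call>", ";"],
               ["if", "(", "<expr>", ")", "{", "<stmt>", "}"]],
    "<call>": [["<func>", "(", "<expr>", ")"]],
    "<func>": [["malloc"], ["free"], ["memcpy"]],
    "<expr>": [["<ident>"], ["<number>"], ["<ident>", "+", "<number>"]],
    "<ident>": [["buf"], ["len"], ["ptr"]],
    "<number>": [["0"], ["16"], ["4096"]],
}

def generate(symbol, rng, depth=0, max_depth=6):
    """Expand a symbol into a list of terminal tokens by random derivation."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal token, emitted as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest production to force termination.
        productions = [min(productions, key=len)]
    tokens = []
    for part in rng.choice(productions):
        tokens.extend(generate(part, rng, depth + 1, max_depth))
    return tokens

rng = random.Random(0)  # fixed seed so runs are repeatable
for _ in range(3):
    print(" ".join(generate("<stmt>", rng)))
```

Every output obeys the grammar, so it is syntactically plausible by construction. Whether any generated sequence corresponds to a bug is a separate question that a scoring model would have to answer.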

References:

https://medium.com/huggingface/launching-a-deep-learning-for-nlp-study-group-60ae8aca48ac
