I was going back through the history to see how the current security scan (the paranoid one) came about:
(iframe|\\<\\?|\\<%|html|eval|body|script\W|[CF]WS[\x01-\x0C]) //Improved regular expression detection
(iframe|\\<\\?php|\\<\\?[\s=]|\\<%[\s=]|html|eval|body|script\W) // Don't allow the word 'description' to trigger a false positive.
(iframe|\\<\\?php|\\<\\?[\s=]|\\<%[\s=]|html|eval|body|script) // Added protection against <?= and <%=
(iframe|\\<\\?php|\\<\\?\s|\\<%\s|html|eval|body|script) // Relax the conditions for an avatar to be refused.
(iframe|\\<\\?php|\\<\\?|\\<%|html|eval|body|script) // Prevent certain ascii data to appear in avatars
The current one looks for \< or \<\ or \<% and fails the file if it finds any of them. That seems pretty strict to me, so strict in fact that probably no one uses it, since the odds of finding \< are darn good.
Looking at the progression, I don't think that was the intention, but I wanted to get some others' thoughts on that. I'm not sure what the signature in the file would be. Even the earlier ones with |\\<\\?php, which would mean \<php or \<\php, don't make sense to me; I could see \\<\\\?php or even \\<\\?\?php.
Any of you hackers at heart have insight on this one?
I searched a bit and found a simple example here:
http://ha.ckers.org/blog/20070604/passing-malicious-php-through-getimagesize/
I copied the image code into a hex editor and it works (the image, I mean; I can see the phpinfo output).
Then I used that image to test Elk's paranoid checks and they work (I also tried the short tags and they are detected).
So the patterns (at least some of them) are correct and detect the expected code.
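For anyone who wants to repeat that kind of test without a full install, here is a rough sketch of a chunked scan. It is not Elk's actual code: the function name looks_suspicious, the chunk size and the 16-byte overlap are all made up for illustration, and the pattern is simply one of the revisions quoted above.

```php
<?php
// Hypothetical helper, NOT Elk's real implementation: scan a file chunk by chunk
// for script-like tokens using one of the patterns quoted earlier in this thread.
function looks_suspicious($path, $chunk_size = 8192)
{
    $pattern = '~(iframe|\\<\\?php|\\<\\?[\s=]|\\<%[\s=]|html|eval|body|script)~i';

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return true; // unreadable file: treat it as unsafe
    }

    $tail = '';
    $found = false;
    while (!feof($handle)) {
        $cur_chunk = fread($handle, $chunk_size);
        if ($cur_chunk === false) {
            break;
        }
        // Prepend the end of the previous chunk so a token split across two reads is still seen.
        if (preg_match($pattern, $tail . $cur_chunk) === 1) {
            $found = true;
            break;
        }
        $tail = substr($cur_chunk, -16);
    }
    fclose($handle);

    return $found;
}
```

Pointing it at the test image from the link above should flag the file, assuming the embedded PHP tag is stored as plain bytes in the image.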
Then, to verify, I put the last of these regexps in here:
http://rubular.com/
meh, as it is, it doesn't match <?php
With the double escapes converted to single escapes, it is able to match things.
I finally decided to test the preg_match in isolation:
$cur_chunk = '<?';
echo preg_match('~(\\<\\?)~i', $cur_chunk);
???
Why does it give 1?
Shouldn't this regexp match a \< with or without a \ after it?
/me is confused...
Better go to bed and forget about things I don't understand... lol
I see this note in the preg section
In some testing:
~\\<~ matches < or \<
~\\\\<~ only matches \<, not <
~\\?~ matches ? and \?
~\\<\\?~ will match \<? or <? but not \<\?, and I would have expected it to based on the above.
So this appears to be another PHP thing: PHP uses \ as the escape character in a string, on top of preg's use of \ as its own escape character. Still not clear on the rules though.
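A minimal sketch of the two layers, in case it helps anyone hitting the same confusion. It assumes nothing beyond plain PHP with the preg functions available, and the variable names are only for illustration. The single-quoted PHP string collapses each \\ into one \ before preg_match ever sees the pattern, and the regex engine then reads \< and \? as a literal < and a literal ?, not as "an optional backslash".

```php
<?php
// Layer 1: PHP string parsing. In a single-quoted string, \\ becomes one backslash.
$pattern = '~\\<\\?~';
echo $pattern, "\n"; // prints ~\<\?~ which is what the regex engine receives

// Layer 2: the regex engine. \< is a literal <, and \? is a literal question mark.
$subjects = array('<?', '\\<?', '\\<\\?', '<');
foreach ($subjects as $subject) {
    // preg_match returns 1 on a match and 0 on no match
    echo str_pad($subject, 6), ' => ', preg_match($pattern, $subject), "\n";
}
// Results in order: 1, 1, 0, 0

// To require a real backslash in the subject, the engine needs \\,
// which has to be written as \\\\ in PHP source.
$needs_backslash = '~\\\\<~';
echo preg_match($needs_backslash, '<'), "\n";   // 0
echo preg_match($needs_backslash, '\\<'), "\n"; // 1
```

The \<? subject matches only because preg_match is unanchored and finds <? inside it; the leading backslash is simply ignored.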
Do you remember if something was changed here?
Nope ... It was mostly my confusion over both php and preg using \ as the escape character. Forgetting that had me looking at that regex going wtf ?
Then I move it out. :P