Regex question of the day.
I have added a dummy class to mimic a database.
The class takes a query string, parses the SELECT and returns an array with the appropriate indexes (sort of).
To do so I used the following code:
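    // trim surrounding whitespace
    $string = trim($string);
    // remove the leading SELECT (6 characters)
    $string = trim(substr($string, 6));
    // keep only the column list, i.e. everything before FROM
    $from = strpos($string, 'FROM {');
    $string = substr($string, 0, $from);
    // split on commas that are not followed by a closing parenthesis
    $chunks = array_map('trim', preg_split('~,(?! [^(]*\))~i', str_replace("\n", '', $string)));
    $array = array();
    foreach ($chunks as $chunk)
    {
        // use the alias as key when there is one, the column name otherwise
        $res = explode(' as ', strtolower($chunk));
        if (count($res) == 2)
            $array[$res[1]] = '';
        else
            $array[$res[0]] = '';
    }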
The idea looked simple: strip everything that is not needed, split on the commas, and use the alias after "as" as the key when there is one.
It works on things like:
SELECT something, something2 FROM whatever
and:
SELECT something, IFNULL(something2, 0) FROM whatever
But of course we have weirder things like:
SELECT something, '1, 0' as something2 FROM whatever
and here it fails.
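A minimal reproduction of that failure, as a sketch: it applies the same preg_split pattern from the code above directly to the column list, with the SELECT and FROM parts already stripped. The helper name split_columns is made up for this illustration.

```php
<?php
// Hypothetical helper for illustration: splits a column list using the
// same pattern as the code above.
function split_columns(string $columns): array
{
    return array_map('trim', preg_split('~,(?! [^(]*\))~i', $columns));
}

// The comma inside IFNULL(...) is protected by the negative lookahead.
$ok = split_columns("something, IFNULL(something2, 0)");

// The comma inside the quoted literal is not protected, so it is split too.
$broken = split_columns("something, '1, 0' as something2");

print_r($ok);     // two chunks: something / IFNULL(something2, 0)
print_r($broken); // three chunks: something / '1 / 0' as something2
```

The second call gives three chunks instead of two, because nothing in the pattern knows about quotes.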
I could add the single quotes to the regex I already used, something like:
preg_split('~,(?! [^(\']*[\)\'])~i',
but we have even weirder things, like:
SELECT usepostcounts != 'yes' AS count_posts, '-1,0' AS member_groups FROM whatever
and that regex would fail.
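To illustrate, here is a sketch that applies that extended pattern to the column list of the query above (split_with_quotes is a made-up name). The comma between the two columns is followed by a space and a quote, so the lookahead blocks the split there, while the comma inside '-1,0' has no space after it and does get split.

```php
<?php
// Hypothetical helper for illustration: same split as before, but with
// the single quotes added to the lookahead.
function split_with_quotes(string $columns): array
{
    return array_map('trim', preg_split('~,(?! [^(\']*[\)\'])~i', $columns));
}

$chunks = split_with_quotes("usepostcounts != 'yes' AS count_posts, '-1,0' AS member_groups");

// Wanted: the two columns. Actual result: the split lands inside the
// quoted literal instead of between the columns.
print_r($chunks);
```

So the result is still two chunks, but both are wrong: the first ends with '-1 and the second starts with 0'.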
Is there any way to split a string like this using only a regex?
Or do I have to write a parser?