Friday, April 22, 2011

Passing Information to PHP

There are various ways of passing information to a PHP program, and we’ll cover many of them in the database-related chapters that follow, and when we come to talk about HTML forms later on. But here’s a fun example for now, which illustrates a couple of important points about PHP.
Open PSPad, delete the contents of demo.php and replace it with the following:
<?php
$person = $_GET["name"];
echo "<HTML>";
echo "Hello ";
echo $person;
echo ", how are you today?";
echo "</HTML>";
?>
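Surf to the demo.php page and you’ll see a message that says "Hello , how are you today?". The name is missing because we haven’t yet passed one to the program. (Depending on your PHP settings, you may also see a notice about an undefined index, for the same reason.) To pass a name, add it to the end of the URL, like this:

http://www.websitename.com/demo.php?name=Robert

You can pass more than one variable on a URL. If we were passing 2 variables called firstname and surname, only the first variable would be preceded by a question mark. Any subsequent variables must be preceded by an ampersand.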
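
As a rough sketch of the two-variable case, the snippet below reads firstname and surname as they would arrive from a URL ending in ?firstname=Robert&surname=Smith. The surname value Smith and the greet() helper are made up for illustration. parse_str() is used only so that the example can be tried from the command line; it stands in for what PHP does when it fills $_GET.

```php
<?php
// Hypothetical helper: builds a greeting from an array of URL parameters.
// In a real page you would pass $_GET; here the array is passed in explicitly
// so that the function can be tried without a web server.
function greet(array $params): string
{
    // The ?? operator (PHP 7+) supplies an empty string when a parameter is missing.
    $firstname = $params["firstname"] ?? "";
    $surname   = $params["surname"] ?? "";
    return "Hello " . $firstname . " " . $surname;
}

// parse_str() splits a query string in the same way as PHP does for $_GET.
// The "?" is not part of the query string, and "&" separates the variables.
parse_str("firstname=Robert&surname=Smith", $params);
echo greet($params), "\n";   // prints "Hello Robert Smith"
```

In demo.php itself, the call would be greet($_GET). Note that this sketch does no sanitizing yet, which matters as soon as the result is sent to a browser.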
Back to our demo.php program. Assuming you entered a name of Robert on the URL, you’ll see the message "Hello Robert, how are you today?" in your web browser.

To understand how this works, take another look at the PHP program code. The only line that will be unfamiliar to you is $person = $_GET["name"];

The $_GET[" "] syntax is how you retrieve variables (known as parameters in this context) that were added to the end of the URL. In this case, because we referred to the parameter as name in the URL, $_GET["name"] retrieves it for us. I’m retrieving it into a variable called $person rather than $name, but that’s entirely up to you.
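Note that the parameter name in the square brackets doesn’t have a dollar sign at the start, so $_GET["$name"] wouldn’t work, because PHP would treat $name as one of its own variables rather than as the name of the URL parameter. And note that GET is in upper case. This is important too, because variable names in PHP are case-sensitive.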

If you’ve got a few minutes spare, here’s something that you can try. Enhance the program so that, if no name is entered on the URL, the program displays an error message. To do this, you’ll need to know that the strlen() function returns the length, in characters, of a string. So having retrieved the name from the URL, code such as:
if (strlen($person) == 0)
will allow you to find out whether any name was supplied.
Alternatively, adapt the program to display not just the name that was entered, but also the number of characters in the name.
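
If you’d like to compare notes afterwards, here is one possible sketch covering both suggestions. The greeting_or_error() helper name and the wording of the messages are my own inventions, and the function takes an array so that it can be tried from the command line. In demo.php you would pass it $_GET.

```php
<?php
// Hypothetical helper for the exercise: returns either an error message or
// the greeting together with the length of the name.
function greeting_or_error(array $params): string
{
    // Fall back to an empty string when no name parameter is on the URL.
    $person = $params["name"] ?? "";

    // strlen() counts bytes, so accented letters may count as more than one.
    if (strlen($person) == 0) {
        return "Error: please add ?name=YourName to the URL.";
    }

    return "Hello " . $person . ", your name has " . strlen($person) . " characters.";
}

echo greeting_or_error(["name" => "Robert"]), "\n"; // Hello Robert, your name has 6 characters.
echo greeting_or_error([]), "\n";                    // Error: please add ?name=YourName to the URL.
```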

Never Forget to Sanitize
Being able to pass information to a PHP program so easily is a useful feature, and you’ll find that you use it a lot. Mostly, it’s used in forms. For example, your visitor fills in a web-based form with a name and address, which gets retrieved by a PHP program and displayed on screen or added to a database. But the ease with which PHP can accept information from visitors to your site hides a very serious security flaw. You can’t trust that information, because the visitor is free to enter anything he or she likes.
Here’s a very simple example. Surf to demo.php again by using the following URL:

http://www.websitename.com/demo.php?name=<b>Robert</b>

This is what you’ll see in your web browser: the same greeting as before, but with the name Robert in bold.
See how my name is now displayed in bold? That’s because I added the relevant HTML tags to the URL. The PHP program uses the echo statement to insert, into the dynamically generated web page, whatever was specified on the URL. In this case it’s not just some text, but some actual HTML code too. You can see it if you use the View Source option in your web browser.
But why is this bad? Because, at a trivial level, someone can mess up the look of your web page by forcing it to display lots of stray HTML tags. At the other end of the scale, consider what would happen if someone entered a "name" of <script> followed by a load of JavaScript code. Would the web browser then execute that code? Yes, it would.

All of which leads to one of the most fundamental rules of security when it comes to PHP programming: filter or sanitize information whose source you can’t be sure of. Quite how you do this will depend on circumstances, and on the precise nature of the data.
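
When the data is only going to be displayed on a web page, one widely used option is to escape it on output with PHP’s built-in htmlspecialchars() function. This is a different technique from the character filtering described in the rest of this article, and it is shown here only as a minimal sketch. The safe_greeting() helper name is made up for the example.

```php
<?php
// Hypothetical helper: builds the greeting with the name made safe for HTML.
function safe_greeting(string $person): string
{
    // htmlspecialchars() turns &, <, > and both kinds of quote into HTML
    // entities, so the browser displays them instead of acting on them.
    $escaped = htmlspecialchars($person, ENT_QUOTES, "UTF-8");
    return "Hello " . $escaped . ", how are you today?";
}

echo safe_greeting("<b>Robert</b>"), "\n";
// prints: Hello &lt;b&gt;Robert&lt;/b&gt;, how are you today?
```

Escaping leaves the visitor’s text intact but harmless when displayed, whereas filtering removes the unwanted characters altogether.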

In our current example, where we’re expecting the visitor to enter a name, it’s obvious that the only characters we need to allow are the letters a to z (and A to Z). Any other character can be removed. This will include the pointy brackets which would allow someone to enter HTML code, and also many other unnecessary punctuation symbols. You’ll see later, when we talk about SQL Injection attacks, that some of those other punctuation symbols are just as dangerous.

One way to filter or sanitize data is to use the PHP str_replace function. This stands for "string replace". It can quickly and easily replace any characters in a string (ie, in a variable that corresponds to some text) with another character. Or, if you prefer, it can replace them with nothing and thus delete them.
For example, having retrieved the name from the URL into the $person variable, we could then add a line which says:
$person = str_replace("<","",$person);

As you can see, str_replace requires 3 parameters: what to search for, what to replace it with, and the variable within which to do it. This line would have the effect of replacing all < symbols in $person with nothing, ie deleting them.
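
To see the 3 parameters in action, here is a small sketch that can be run on its own. The sample value of $person is just an illustration. The second call relies on the fact that str_replace() will also accept an array of strings to search for.

```php
<?php
$person = "<b>Robert</b>";

// Three parameters: what to search for, what to replace it with,
// and the string to search in.
$person = str_replace("<", "", $person);
echo $person, "\n";   // prints "b>Robert/b>"

// An array of search strings removes several characters in one call.
$person = str_replace(["<", ">", "/"], "", "<b>Robert</b>");
echo $person, "\n";   // prints "bRobertb"
```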
Although this would work, you’d need to list every unwanted character that you want to filter out, and it’s easy to miss one. But thankfully there’s a neater way, using a function called preg_replace. (Older tutorials use ereg_replace for this job, but that function was deprecated in PHP 5.3 and removed in PHP 7.) Take a look at this:
$person = preg_replace("/[^A-Za-z0-9 .,';:?]/", "", $person);
This is similar to str_replace, in that you specify what you want to look for, what you want to replace it with, and the string to operate on. But take a closer look at what we’re searching for. No longer are we specifying a single character, but:
/[^A-Za-z0-9 .,';:?]/
Although this might look like gobbledygook, it’s actually just shorthand for the entire list of characters to search for. This kind of shorthand is known as a regular expression. Let’s go through it in detail. First, the slashes at each end simply mark where the pattern starts and finishes, because that’s the rule for preg_replace. Next, the list of characters goes in square brackets, which means "any one of these characters". A-Z means every character from A to Z.

Equally, a-z means all the lower-case letters, and 0-9 means all the digits. Then, in addition to those 62 characters, we list a few others. Namely the space, full stop, comma, apostrophe, semicolon, colon, and question mark.
The most important character in this whole collection, though, is the ^ at the start. Inside square brackets, this is shorthand for "everything except". So what the whole command actually does, is to replace every character in $person with nothing (ie, to delete that character), EXCEPT where the character is a letter, a digit, a space, a comma, an apostrophe, and so on.
You don’t really have to understand it. Just add this line to your PHP code, after the line which retrieves $person from the URL. Now surf to demo.php again and try entering "forbidden" characters. You’ll notice that they don’t appear in the generated web page.
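
Putting the pieces together, demo.php with the filtering line added might look something like the sketch below. It uses preg_replace(), because ereg_replace() no longer exists in PHP 7 and later. The sanitize_name() and build_page() helper names are made up for this example, and the program is split into functions only so that it can be tried from the command line.

```php
<?php
// Hypothetical helper: deletes every character that is not on the allowed list.
function sanitize_name(string $raw): string
{
    // The slashes mark the start and end of the pattern. [^...] matches any
    // character that is NOT a letter, a digit, a space or one of . , ' ; : ?
    return preg_replace("/[^A-Za-z0-9 .,';:?]/", "", $raw);
}

// Hypothetical helper: builds the page from an array of URL parameters.
// In demo.php you would call it as: echo build_page($_GET);
function build_page(array $params): string
{
    // A missing name is treated as an empty string.
    $person = sanitize_name($params["name"] ?? "");
    return "<HTML>Hello " . $person . ", how are you today?</HTML>";
}

echo build_page(["name" => "<b>Robert</b>"]), "\n";
// prints: <HTML>Hello bRobertb, how are you today?</HTML>
```

The angle brackets and the slash have gone, so what remains is plain text that the browser won’t treat as a tag.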

Sanitizing strings is fiddly but, thankfully, it normally takes just a line or two of extra code. And it is vital that you do sanitize every string that comes from a visitor to your site, because you have no idea what the visitor has typed. Get into the habit now. Every time you use $_GET to retrieve information from a URL, sanitize it before using it. Failure to do this will mean that your site WILL get hacked.

