However, one of the most useful features of PHP is its ability to
talk to server-based databases, so we first need to understand such
things. There are various server-based database products which store
their information on a server
rather than on the user’s local PC. That server doesn’t necessarily have to be a web server – the main point is that the database is held centrally and then accessed by multiple users, from multiple locations.
In the world of databases, the most popular language for interrogating a database is something called Structured Query Language, or SQL. Many database products use SQL, or something like it. Most SQL or SQL-like database server products cost serious money, such as Oracle or Microsoft SQL Server. One, though, is totally free of charge, and unsurprisingly this is the one that most web hosting companies install for your use. The product is called MySQL (pronounced my ess queue ell, or sometimes my sequel). Don’t be put off by the price tag – MySQL is corporate-strength software, just as happy storing tens of millions of items as a mere handful.
In this article I’ll show you how to manually connect to your web server via a browser and create a database. In the next chapter, you’ll see how to access your database using PHP, which will finally allow you to create web-based applications and database-driven web sites.
You won’t be able to try any of the examples that follow unless your web host offers MySQL.
Almost all of them do, though some make a small additional charge to enable it. In the case of the hosting companies that we’re featuring throughout this book, namely site5.com and hostmonster.com, MySQL is included.
Remember that MySQL is a server-based product, installed on your web server rather than your local PC. Equally, all of the data in its databases is stored on the server too. To access the system in order to create and browse databases, you use a web browser. In the case of many hosting companies, and both site5 and hostmonster are no different, accessing MySQL requires you to access two different web sites. One location is for creating the initial databases.
The other location is for managing those databases, which means: creating tables and fields; adding and deleting data; searching the database tables. The reason for this demarcation is because of security considerations.
What you probably think of as a database is actually referred to as a table. A table is a collection of rows and columns. Think of it just like a spreadsheet. You might have a table called "contacts". The columns are headed "name", "address" and "phone_number". A row might consist of 3 fields (just like spreadsheet cells) containing the values "John Smith", "65 High Street, London", and "01234 567890".
A database is a collection of tables. Typically, those tables are all related but there’s no rule that says they have to be. For example, imagine that you’re creating a database application for a system that manages the running of a hotel. The database itself might be called "hms" (for hotel management system). Within the database might be 3 separate tables. One, called "customers", contains the contact details of all past, present and future guests. Another table, called "staff", might contain the contact and salary details for staff. A final table, called "rooms", might contain details of the bedrooms, with column headings such as room_number, occupied_by, bed_type, phone_number, needs_cleaning,
date_last_occupied and so on.
Many hosting companies limit the number of databases that you can create. Typically, to between 1 and 5. However, this is not a problem as you are rarely limited in the number of tables that you can create within each database. Even if you only have one database, you can still create hundreds of tables in it. Obviously, being able to group your tables into related functions by using separate databases is handy, but it’s not essential. If you only have the one database, and you want to create the hotel management system, the standard way of doing such things is to add a prefix to each of the related tables in order to indicate that they are linked. So we might choose an hms_ prefix for our hotel management system, and therefore name the tables hms_rooms, hms_guests, hms_staff and so on. Other tables that
are for use with other systems would have their own prefix.
The golden rule of database design is: "don’t store anything more than once. Instead, store it just once, give it a unique reference number, and refer to it by that number in future". Why?
Because it’s a much safer way to ensure that the database content remains accurate. If you need to change something, you only need to change it once, and all references to it are automatically up to date. Let me give you an example.
Consider our hotel management system. There’s a table called "rooms", with columns called:
room_number
occupied_by
bed_type
phone_number
needs_cleaning
date_last_occupied
There’s also a "customers" table, with columns called:
name
address
phone_number
Within the customers table is John Smith, and he’s currently resident in room 225. How do we represent this in the database? The obvious answer is to add him to the customers table (because he’s a customer), and then create a new row in the "rooms" table, too, as follows:
data:image/s3,"s3://crabby-images/eab1b/eab1bca487df7f363fa852435d601cd01fae88d4" alt="Untitled"
However, this breaks our golden rule of only storing something once. John Smith appears in both the "customers" table and the "rooms" table. It’s a waste of space storing this information twice. Also, if John Smith calls the hotel and says that we’ve spelled his name wrong on his bill (it should be Jon), we’d have to search every table looking for instances to
change.
There’s one further problem, too, which is even more serious. What happens if a new guest, also called Jon Smith, checks into the hotel? There are now 2 Jon Smiths in the customers table, but there’s no clue in the "rooms" table as to which of them is currently in room 225. Which will play havoc with the billing system.
So here’s how we do it, according to the golden rule of giving everything a reference number and then referring to things by their number. We create the customers table like this:
id
name
address
phone_number
The difference is that we’ve added a reference number column. We’ve called it "id" rather than "reference number", simply because that’s the convention that most MySQL database designers use, but it’s not essential.
So now, Jon Smith’s entry in the customers table might look like this:
data:image/s3,"s3://crabby-images/a112d/a112d40df8b3f773c49142858dcb319f69b27a26" alt="tb1"
And the reference to him in the "rooms" table looks like this:
data:image/s3,"s3://crabby-images/11dc4/11dc4d09abfbb29e3fabeede9afba12de5950f24" alt="tb2"
No longer is room 225 occupied by Jon Smith. It’s now occupied by guest number 64. And by checking the customers table and searching for the person whose id is 64, we know exactly which Jon Smith is in the room. Furthermore, if Jon changes his name back to John, all we need to change is the single reference in the customers table. All other references to him in the rooms table, the billing table, the restaurant bookings table, and so on, will only refer to him as customer 64 and not by name.
So now you know why large companies always want to know your customer reference number when you telephone them with an enquiry!
Designing databases in this way, ie with information referenced by a unique number, is known as normalization. It’s often harder than it looks, but is well worth the effort invested. It ensures that your database, ie your collection of related tables, is working at peak efficiency.
As it happens, we've already missed a trick in the design of our hotel system. You'll see that there's a ate_last_occupied field in the rooms table. But why do we need this? Presumably, hotel bookings will be held in a "bookings" table which has columns called id, customer, arrival date, departure date and room number. So in order to find out which guest last occupied a room, we don’t actually have to store any data at all. We just search the bookings table instead. It's this sort of planning that takes just seconds to think about when you're designing a system, but days or weeks to implement if you discover, towards the end of the project, that you've been working with an inherent design flaw. So, always give lengthy consideration to your database design before you start writing any programs.
MySQL is known as a relational database. Hopefully you can now see why. Within the database is a collection of tables, and the information in those tables is often related. In the case of our rooms table, for example, there’s a relationship with the guests table.
Consider the example above, where room 225 is currently occupied by guest number 64. Five years from now, you notice that Jon Smith has stopped visiting your hotel so you decide to remove him from the customer database. Or perhaps he calls the hotel and asks to be taken off the mailing list, and you do this by simply removing his entry from the customers table.
As far as the customers table is concerned, there's no problem. Where once was record number 64, now there's nothing. But customer 64 is still referred to in the bookings table, and probably some other tables too. Which is going to cause problems. For example, when you run the "show me everyone who's stayed in room 225 over the past 7 years" report, and the system tries to find out the name and address of the person whose customer id is 64, there will be nothing on file. The integrity of the relational database has failed.
It's important that you handle such instances, and that you handle them correctly. In the case of the deleted customer, you can do one of three things:
1. Delete the customer from the table. In addition, also delete every affected row in every other table. Which in this case would mean deleting all of customer 64's roombookings, restaurant reservations, customer comments, and whatever other tables have a customer_id reference. The problem here is that you lose valuable data.
2. Replace the contents of customer's record with text such as "deleted". Leave the id number in place, but change the firstname, surname, phone number, address etc. Now, all your reports will still work correctly but, as per Mr Smith's wishes, his information has been deleted from your system.
3. Add a field to the customers table called, say, is_live, which specifies whether this is a live customer record or not. Keep it set to Y for all current customer records. To delete a customer, you don't remove anything from the customers table. Instead, you just set the customer's is_live flag from Y to N. Now, in the code that produces your reports, you check whether a customer is live or not and, if not, you display a message that says "non-live customer" rather than the real data. The old data is still there, but you simply choose not to display it. Bear in mind, though, that this can sometimes fall foul of data protection legislation in some countries, which states that you should delete personal customer information if you no longer need it or if the customer asks you to. Merely marking it as non-live doesn't count as having deleted it.
rather than on the user’s local PC. That server doesn’t necessarily have to be a web server – the main point is that the database is held centrally and then accessed by multiple users, from multiple locations.
In the world of databases, the most popular language for interrogating a database is something called Structured Query Language, or SQL. Many database products use SQL, or something like it. Most SQL or SQL-like database server products cost serious money, such as Oracle or Microsoft SQL Server. One, though, is totally free of charge, and unsurprisingly this is the one that most web hosting companies install for your use. The product is called MySQL (pronounced my ess queue ell, or sometimes my sequel). Don’t be put off by the price tag – MySQL is corporate-strength software, just as happy storing tens of millions of items as a mere handful.
In this article I’ll show you how to manually connect to your web server via a browser and create a database. In the next chapter, you’ll see how to access your database using PHP, which will finally allow you to create web-based applications and database-driven web sites.
You won’t be able to try any of the examples that follow unless your web host offers MySQL.
Almost all of them do, though some make a small additional charge to enable it. In the case of the hosting companies that we’re featuring throughout this book, namely site5.com and hostmonster.com, MySQL is included.
Remember that MySQL is a server-based product, installed on your web server rather than your local PC. Equally, all of the data in its databases is stored on the server too. To access the system in order to create and browse databases, you use a web browser. In the case of many hosting companies, and both site5 and hostmonster are no different, accessing MySQL requires you to access two different web sites. One location is for creating the initial databases.
The other location is for managing those databases, which means: creating tables and fields; adding and deleting data; searching the database tables. The reason for this demarcation is because of security considerations.
Databases, Tables, Fields, Rows and Columns
At this point, you’re probably getting confused by the mention of databases, tables, fields and so on. So before we continue, let’s clear up the differences. It’s vital that you understand some basic terminology in order to avoid confusion later on.What you probably think of as a database is actually referred to as a table. A table is a collection of rows and columns. Think of it just like a spreadsheet. You might have a table called "contacts". The columns are headed "name", "address" and "phone_number". A row might consist of 3 fields (just like spreadsheet cells) containing the values "John Smith", "65 High Street, London", and "01234 567890".
A database is a collection of tables. Typically, those tables are all related but there’s no rule that says they have to be. For example, imagine that you’re creating a database application for a system that manages the running of a hotel. The database itself might be called "hms" (for hotel management system). Within the database might be 3 separate tables. One, called "customers", contains the contact details of all past, present and future guests. Another table, called "staff", might contain the contact and salary details for staff. A final table, called "rooms", might contain details of the bedrooms, with column headings such as room_number, occupied_by, bed_type, phone_number, needs_cleaning,
date_last_occupied and so on.
Many hosting companies limit the number of databases that you can create. Typically, to between 1 and 5. However, this is not a problem as you are rarely limited in the number of tables that you can create within each database. Even if you only have one database, you can still create hundreds of tables in it. Obviously, being able to group your tables into related functions by using separate databases is handy, but it’s not essential. If you only have the one database, and you want to create the hotel management system, the standard way of doing such things is to add a prefix to each of the related tables in order to indicate that they are linked. So we might choose an hms_ prefix for our hotel management system, and therefore name the tables hms_rooms, hms_guests, hms_staff and so on. Other tables that
are for use with other systems would have their own prefix.
Normalization
While we’re discussing MySQL terminology, there’s one more thing that you need to know before we can start creating databases and tables for real. It’s nothing to do with MySQL or web sites specifically, but is more to do with basic computer science and the creation of a computer database.The golden rule of database design is: "don’t store anything more than once. Instead, store it just once, give it a unique reference number, and refer to it by that number in future". Why?
Because it’s a much safer way to ensure that the database content remains accurate. If you need to change something, you only need to change it once, and all references to it are automatically up to date. Let me give you an example.
Consider our hotel management system. There’s a table called "rooms", with columns called:
room_number
occupied_by
bed_type
phone_number
needs_cleaning
date_last_occupied
There’s also a "customers" table, with columns called:
name
address
phone_number
Within the customers table is John Smith, and he’s currently resident in room 225. How do we represent this in the database? The obvious answer is to add him to the customers table (because he’s a customer), and then create a new row in the "rooms" table, too, as follows:
However, this breaks our golden rule of only storing something once. John Smith appears in both the "customers" table and the "rooms" table. It’s a waste of space storing this information twice. Also, if John Smith calls the hotel and says that we’ve spelled his name wrong on his bill (it should be Jon), we’d have to search every table looking for instances to
change.
There’s one further problem, too, which is even more serious. What happens if a new guest, also called Jon Smith, checks into the hotel? There are now 2 Jon Smiths in the customers table, but there’s no clue in the "rooms" table as to which of them is currently in room 225. Which will play havoc with the billing system.
So here’s how we do it, according to the golden rule of giving everything a reference number and then referring to things by their number. We create the customers table like this:
id
name
address
phone_number
The difference is that we’ve added a reference number column. We’ve called it "id" rather than "reference number", simply because that’s the convention that most MySQL database designers use, but it’s not essential.
So now, Jon Smith’s entry in the customers table might look like this:
And the reference to him in the "rooms" table looks like this:
No longer is room 225 occupied by Jon Smith. It’s now occupied by guest number 64. And by checking the customers table and searching for the person whose id is 64, we know exactly which Jon Smith is in the room. Furthermore, if Jon changes his name back to John, all we need to change is the single reference in the customers table. All other references to him in the rooms table, the billing table, the restaurant bookings table, and so on, will only refer to him as customer 64 and not by name.
So now you know why large companies always want to know your customer reference number when you telephone them with an enquiry!
Designing databases in this way, ie with information referenced by a unique number, is known as normalization. It’s often harder than it looks, but is well worth the effort invested. It ensures that your database, ie your collection of related tables, is working at peak efficiency.
As it happens, we've already missed a trick in the design of our hotel system. You'll see that there's a ate_last_occupied field in the rooms table. But why do we need this? Presumably, hotel bookings will be held in a "bookings" table which has columns called id, customer, arrival date, departure date and room number. So in order to find out which guest last occupied a room, we don’t actually have to store any data at all. We just search the bookings table instead. It's this sort of planning that takes just seconds to think about when you're designing a system, but days or weeks to implement if you discover, towards the end of the project, that you've been working with an inherent design flaw. So, always give lengthy consideration to your database design before you start writing any programs.
MySQL is known as a relational database. Hopefully you can now see why. Within the database is a collection of tables, and the information in those tables is often related. In the case of our rooms table, for example, there’s a relationship with the guests table.
Referential Integrity
A normalised database, where each type of data item is held in a separate table with a unique id, makes a great deal of sense. It's by far the most efficient way to store and manage information, whether you're designing a booking system for a 10-room hotel or an inventory manager for a web site the size of Amazon. However, it's important to ensure that you don't end up with tables that refer to non-existing information. In database terminology, this means making sure that you maintain the referential integrity.Consider the example above, where room 225 is currently occupied by guest number 64. Five years from now, you notice that Jon Smith has stopped visiting your hotel so you decide to remove him from the customer database. Or perhaps he calls the hotel and asks to be taken off the mailing list, and you do this by simply removing his entry from the customers table.
As far as the customers table is concerned, there's no problem. Where once was record number 64, now there's nothing. But customer 64 is still referred to in the bookings table, and probably some other tables too. Which is going to cause problems. For example, when you run the "show me everyone who's stayed in room 225 over the past 7 years" report, and the system tries to find out the name and address of the person whose customer id is 64, there will be nothing on file. The integrity of the relational database has failed.
It's important that you handle such instances, and that you handle them correctly. In the case of the deleted customer, you can do one of three things:
1. Delete the customer from the table. In addition, also delete every affected row in every other table. Which in this case would mean deleting all of customer 64's roombookings, restaurant reservations, customer comments, and whatever other tables have a customer_id reference. The problem here is that you lose valuable data.
2. Replace the contents of customer's record with text such as "deleted". Leave the id number in place, but change the firstname, surname, phone number, address etc. Now, all your reports will still work correctly but, as per Mr Smith's wishes, his information has been deleted from your system.
3. Add a field to the customers table called, say, is_live, which specifies whether this is a live customer record or not. Keep it set to Y for all current customer records. To delete a customer, you don't remove anything from the customers table. Instead, you just set the customer's is_live flag from Y to N. Now, in the code that produces your reports, you check whether a customer is live or not and, if not, you display a message that says "non-live customer" rather than the real data. The old data is still there, but you simply choose not to display it. Bear in mind, though, that this can sometimes fall foul of data protection legislation in some countries, which states that you should delete personal customer information if you no longer need it or if the customer asks you to. Merely marking it as non-live doesn't count as having deleted it.
0 comments:
Post a Comment