One of the most powerful tools available to the web developer is the HTTP cookie.
Used and, more often than not, misused, cookies power nearly every major
website on the net today. Did you know cookies can be tagged secure or hidden
from JavaScript? Here's the 411 on how cookies work, complete with public domain
JavaScript and PHP functions to use them, and an image hot-link protection
example that uses cookies in addition to referrers.
Introduction
Long before the web, Unix programs exchanged information via a process known
as a "magic cookie". When the browser became popular the "magic cookie" concept
was introduced to allow browsers to retain small chunks of persistent data which could then
be passed back to the server. Today, because of their widespread use, cookies are not
quite as magical or mysterious as they used to be. However, with the Internet
deep within the throes of Web 2.0 they are more important than ever.
One possible source of the term, which cannot be proven definitely
today, was the comic strip Odd Bodkins by Dan O'Neill published in the
San Francisco Chronicle during most of the 1960s. Several of O'Neill's characters
ate Magic Cookies from the Magic Cookie Bush, many thought this was a
euphemism for LSD mentionable in a major newspaper. In any case,
the Magic Cookie transported the eater into Magic Cookie Land; thus, perhaps,
a small token produced a whole context or experience. -- Wikipedia
The Limits
Cookies, from a programming perspective, are highly unreliable. It's like
having your data stored on a hard drive that sometimes will be missing, corrupt,
or missing the data you expected. There are a LOT of people who surf with
cookies off or opt-in (where the user manually approves each site allowed to
store cookies on his or her computer).
Even if a user surfs with cookies fully allowed, the cookies your site stores
may suddenly disappear if the browser decides it needs to make some space. The
cookie may also be deleted if the user opts to "purge private data", and more
than a few "tune-up" programs arbitrarily purge cookies along with temporary
internet files as part of their "enhancements." If the user just fiddles around
with his system clock and advances the date a few years all of your stored cookies
could suddenly disappear.
The original cookie specification requires browsers to be able to store at least
300 cookies in total, and fortunately most browsers store far more than this. It also
sets minimums of 20 cookies per domain name, with each cookie up to 4096 bytes (4k)
in size. Browsers differ in how far they go beyond those numbers, so the
20 cookies per site and 4k size limits are the only ones you can safely count on.
That is actually quite a lot of data, however. Each domain can store 81,920 bytes, or
80k, in the browser. Now here's the reason why you shouldn't even consider storing
more than a few hundred bytes: a site's cookies are transmitted back to the server
with every single HTTP request. If the web page wants an image, the cookies
are transmitted. If the web page wants an external style sheet or JavaScript file, the
cookies are transmitted. If you have an Ajax routine which queries the database
each time the user hits a key on the keyboard, then those cookies will be transmitted
along with the request every time the user hits a key. Even on broadband 80k a keystroke
is a burden; on dial-up it will break most user experiences irretrievably.
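To put rough numbers on that, here is a small back-of-the-envelope sketch. The function name and the figure of 40 requests per page are assumptions made up for illustration; the 20 x 4096 worst case comes from the limits above.

```javascript
// Rough estimate of the upstream bytes spent on cookies alone.
// cookieBytes: total size of the site's cookies.
// requests: number of HTTP requests that carry them (pages, images, scripts, Ajax calls).
function cookieOverheadBytes(cookieBytes, requests) {
  return cookieBytes * requests;
}

// Worst case from the text: 20 cookies of 4096 bytes each.
const worstCase = 20 * 4096; // 81920 bytes

// Hypothetical page that triggers 40 requests.
console.log(cookieOverheadBytes(worstCase, 40)); // 3276800 bytes, a little over 3 MB
console.log(cookieOverheadBytes(200, 40));       // 8000 bytes with a lean 200-byte cookie
```

The gap between the two results is the whole argument for keeping cookies down to a few hundred bytes.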
Best Practice
Unless you just have no access to server-side scripting at all, you will want to
keep your cookie data as small as possible. Ideally you would want just a user id
code that identifies the browser to the database and then have your database supply
all the data you would have ordinarily kept in the browser's cookie. Not only is
this friendlier to your users in terms of bandwidth and response times, but it's also
safer for you since the user data will be stored in your database and not in the
extremely unreliable client.
If you're matching the user id in your database the worst that can happen if
the user loses his cookie is that he or she will have to log in again to get his
or her cookie back.
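The idea of keeping only an id in the cookie can be sketched as follows. This is a minimal illustration rather than a production session system: a Map stands in for the database, it runs under Node.js, and the names createSession and lookupSession are made up for this example.

```javascript
const nodeCrypto = require('crypto');

// In-memory stand-in for the database table that maps a cookie id to user data.
const sessions = new Map();

// Called after a successful login: store the data server-side and return
// the short random id that goes into the cookie.
function createSession(userData) {
  const id = nodeCrypto.randomBytes(16).toString('hex'); // 32 hex characters
  sessions.set(id, userData);
  return id;
}

// Called on every request: look the id up; null means "please log in again".
function lookupSession(id) {
  return sessions.has(id) ? sessions.get(id) : null;
}

const sessionId = createSession({ name: 'alice', theme: 'dark' });
console.log(sessionId.length);               // 32
console.log(lookupSession(sessionId).theme); // dark
console.log(lookupSession('lost-cookie'));   // null
```

Only the 32-character id ever travels with each request; everything else stays on the server, where a lost cookie costs the user nothing more than a fresh login.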
A Cookie's Structure
A cookie is just a single line of text. When transmitted between the browser
and the server it looks much like this:
foo=bar; path=/; expires=Tue, 09-Dec-2008 13:46:00 GMT
In this case the cookie's name is "foo", the value is "bar", it's valid for
all paths on the domain and it expires at 1:46 PM GMT on December 9th 2008.
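To make the structure concrete, here is a small sketch that takes a line shaped like the one above apart. The function name parseCookieLine is an assumption for this example, and the parser is deliberately simple: it does not validate names, dates, or quoting.

```javascript
// Split a cookie line such as "foo=bar; path=/; expires=..." into its name,
// value, and a map of lower-cased attribute names to values.
function parseCookieLine(line) {
  const parts = line.split(';').map(function (part) { return part.trim(); });
  const first = parts.shift();
  const eq = first.indexOf('=');
  const cookie = {
    name: first.slice(0, eq),
    value: first.slice(eq + 1),
    attributes: {}
  };
  parts.forEach(function (part) {
    if (part === '') return;
    const i = part.indexOf('=');
    // Flag attributes such as "secure" have no "=value" part.
    const key = (i === -1 ? part : part.slice(0, i)).toLowerCase();
    cookie.attributes[key] = i === -1 ? true : part.slice(i + 1);
  });
  return cookie;
}

const parsed = parseCookieLine('foo=bar; path=/; expires=Tue, 09-Dec-2008 13:46:00 GMT');
console.log(parsed.name);               // foo
console.log(parsed.value);              // bar
console.log(parsed.attributes.path);    // /
console.log(parsed.attributes.expires); // Tue, 09-Dec-2008 13:46:00 GMT
```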
Cookie Crumbs -- Its Attributes
The attributes a cookie can have are as follows:
- Name=Value
  - Required. The name of the state information ("cookie") is NAME,
    and its value is VALUE, foo=bar for instance. NAME cannot begin
    with a dollar sign.
- Comment=comment
  - Optional. This is a bit of information a browser can display to help
    the user decide whether to approve the cookie or not. In practice this
    is not used anywhere.
- Domain=domain
  - Optional. The Domain attribute specifies the domain for which the
    cookie is valid. An explicitly specified domain must always start
    with a dot.
- Max-Age=delta-seconds
  - Optional. The Max-Age attribute defines the lifetime of the
    cookie, in seconds. The delta-seconds value is a decimal
    non-negative integer. After delta-seconds seconds elapse, the client
    should discard the cookie. A value of zero means the cookie
    should be discarded immediately. In practice most code, including the
    examples in this article, uses the older Expires attribute instead,
    which takes a date rather than a number of seconds.
- Path=path
  - Optional. The Path attribute specifies the subset of URLs to
    which this cookie applies. For instance '/images' would send the
    cookie only if the URL were some variation of 'mydomain.com/images'.
- Secure
  - Optional. When this attribute is present (no value required) the cookie will
    be transmitted ONLY when a secure connection has been initiated between the
    client and server. This should be used if you are transmitting sensitive
    information like social security numbers or credit card numbers via a cookie
    (which is bad practice).
- Version=version
  - The official RFC states that this is required, however in practice it is
    almost never specified or used by the various interfaces.
Despite all the options, most applications use only name, value and sometimes
expires with secure running a distant (but still useful) fourth.
Javascript Cookies
While JavaScript supports cookies through the use of document.cookie, it is
extremely unfortunate that the API is very weak.
document.cookie is a string that is built and maintained by the browser. If you set the
string (document.cookie='name=value') then the browser will attempt to create a cookie
from the data you set and then the string will revert to the cookies actually stored
in the browser. It's a string that acts like a function and it's very unnerving.
What's more, when you access document.cookie it shows only the name/value pairs
and no other information about the cookie, so you won't be able to tell when it expires
or what paths and domains it's good for. All document.cookie will show you will be the
cookie names and their values.
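Since all you get back is one string of name/value pairs, reading a cookie means parsing that string. Here is a minimal sketch that works on any string in the document.cookie format, so it can be tried outside a browser. The name parseCookieString is made up for this example, and it assumes values were written with encodeURIComponent.

```javascript
// Turn a document.cookie-style string ("a=1; b=2") into a plain object.
// Splitting on "; " first means a lookup for "user" can never match inside
// a longer name such as "superuser".
function parseCookieString(cookieString) {
  const jar = {};
  if (!cookieString) return jar;
  cookieString.split('; ').forEach(function (pair) {
    const eq = pair.indexOf('=');
    if (eq === -1) return; // skip malformed entries
    const name = pair.slice(0, eq);
    // Assumption: values were encoded with encodeURIComponent when set.
    jar[name] = decodeURIComponent(pair.slice(eq + 1));
  });
  return jar;
}

const cookieJar = parseCookieString('superuser=no; user=19; sky=light%20blue');
console.log(cookieJar.user);    // 19
console.log(cookieJar.sky);     // light blue
console.log(cookieJar.monster); // undefined
```

Notice that nothing about expiry, path, or domain is available to parse; the browser simply never puts it in the string.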
To effectively use JavaScript cookies you'll need four standard functions to
set, read, delete, and check your cookies. The good news is that once these
are added to your toolbox, managing your site's cookies becomes a breeze.
I'm not re-inventing the wheel here. Get and delete cookie are public domain
functions, setCookie is a modified variant of a public domain function and I quickly
wrote cookiesAllowed() to test if the browser will accept cookies or not. They're of course
all released into the public domain for you to use as you see fit.
function cookiesAllowed() {
  setCookie('checkCookie', 'test', 1);
  if (getCookie('checkCookie')) {
    deleteCookie('checkCookie');
    return true;
  }
  return false;
}
This function returns true if the user has cookies enabled (or disabled but enabled for your site),
and false if cookies are disabled. All it does is attempt to set a cookie and if
the cookie isn't subsequently found, it returns false. If the cookie is found, the
cookie is deleted and the function returns true. Note that this function will return
false if the site is using 20 cookies already, or the size of the site's cookies
doesn't allow enough space for this test to work.
function setCookie(name, value, expires, options) {
  options = options || {};
  var expires_date;
  if (expires) {
    // expires is a number of days from now
    expires_date = new Date();
    expires_date.setDate(expires_date.getDate() + expires);
  }
  // escape() pairs with the unescape() call in getCookie
  document.cookie = name + '=' + escape(value) +
    (expires ? ';expires=' + expires_date.toUTCString() : '') +
    (options.path ? ';path=' + options.path : '') +
    (options.domain ? ';domain=' + options.domain : '') +
    (options.secure ? ';secure' : '');
}
This is the most complex of the functions, but, without question, the most
important. This function accepts up to four arguments. Only name and value are
required; the rest are optional, although expires should probably always be set.
name is a string which indicates the name of your cookie. It's the
same as a variable name. You should avoid using symbols in the name of your cookie.
value is the value of the cookie. You don't have to use any other values:
you can simply call setCookie('name', 'value'), but if you'd like for
the cookie to expire you can specify the number of days to keep the cookie. For instance
for the cookie "name" to exist for only 4 days you'd specify "setCookie('name', 'value', 4);" .
The rest of the options are passed as a JavaScript object literal (the notation JSON is based on)
to allow you some flexibility for what you want to set and what you don't want to set. The names
that are looked for are path, domain, and secure. You can specify any or all of them. If you want
to set a path and secure you would call the function as such...
setCookie('name', 'value', 1, { "path" : "/", "secure" : true });
To set domain and path with a different syntax (but passing the same basic object)...
var cookieOptions = {};
cookieOptions.domain = '.mydomain.com';
cookieOptions.path = '/';
setCookie('name', 'value', 1, cookieOptions);
And if you just want to make the cookie secure using no other options...
setCookie('name', 'value', 1, { "secure" : true });
Here's a brief overview of the three options:
path (optional) specifies which paths in the URL the cookie is sent for. Generally this is '/',
which specifies a site-wide cookie. If you set path equal to '/images' then the cookie
will only be sent and usable when you're actually in the images directory.
Note that because the options object is the fourth argument, you must pass something for
expires whenever you set a path; pass 0 or null for a cookie that lasts only for the
browser session.
domain (optional) specifies which domain the cookie is for. There's a little idiosyncrasy here
in that your domain name needs to BEGIN with a period. So if you want to specify mydomain.com
then you need to pass '.mydomain.com' as the domain. Fortunately a page can only
set cookies for its own domain or a parent of it. That is, if the page is being
served from "www.mydomain.com" then you can specify ".www.mydomain.com" or
".mydomain.com", however you cannot set the cookie for ".yahoo.com" or ".com".
secure can be any non-falsy value (passing true is recommended). If secure has a truthy
value then the cookie will be flagged secure, which means it will be transmitted
ONLY when the browser and server have established an encrypted link. As with path,
the only other argument you have to fill in to set it is expires.
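How a browser decides whether a cookie's path and domain apply to a request can be sketched roughly as below. This is a simplified reading of the matching rules in RFC 6265, not a full implementation: the function names are made up for this example, and it leaves out the extra check real browsers make to refuse top-level domains such as ".com".

```javascript
// Does a cookie with this domain attribute apply to the given host?
// A leading dot is ignored; the host must equal the domain or end with "." + domain.
function domainMatches(host, cookieDomain) {
  const domain = cookieDomain.replace(/^\./, '').toLowerCase();
  const h = host.toLowerCase();
  return h === domain || h.endsWith('.' + domain);
}

// Does a cookie with this path attribute apply to the given request path?
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  // "/images" should match "/images/a.gif" but not "/images2/a.gif".
  return cookiePath.endsWith('/') || requestPath.charAt(cookiePath.length) === '/';
}

console.log(domainMatches('www.mydomain.com', '.mydomain.com')); // true
console.log(domainMatches('mydomain.com', '.mydomain.com'));     // true
console.log(domainMatches('www.yahoo.com', '.mydomain.com'));    // false
console.log(pathMatches('/images/sunset.gif', '/images'));       // true
console.log(pathMatches('/images2/sunset.gif', '/images'));      // false
console.log(pathMatches('/about.html', '/'));                    // true
```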
Here are a few examples:
setCookie('user', '19')
-- sets a cookie with a name of "user" and a value of "19".
setCookie('sky','blue',12)
-- Sets a cookie with a name of "sky", and a value of "blue"
which will expire in 12 days.
setCookie('person', '21', 12, { path: '/', domain: '.somedomain.com', secure: true })
-- Sets a cookie with a name of "person" and a value of "21"
which will be sent on all paths of "somedomain.com"
but only when the connection is encrypted.
Once a cookie has been set you'll need a way to read it.
function getCookie(name) {
  // document.cookie looks like "a=1; b=2", so split it into pairs first.
  // Comparing whole names keeps "user" from matching "superuser".
  var pairs = document.cookie.split('; ');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq !== -1 && pairs[i].substring(0, eq) === name) {
      // unescape() pairs with the escape() call in setCookie
      return unescape(pairs[i].substring(eq + 1));
    }
  }
  return null;
}
This function accepts a name of a cookie and if it exists it will pass back
the value, and if it doesn't exist it will pass back null. It's an extremely
easy function to use. Sticking with the examples we gave for setCookie...
getCookie('user') -- will return 19.
getCookie('sky') -- will return blue.
getCookie('person') -- will return 21.
getCookie('monster') -- will return null since we never set it.
Finally you'll need a way to delete the cookie.
function deleteCookie(name, path, domain) {
  if (getCookie(name)) {
    document.cookie = name + '=' +
      (path ? ';path=' + path : '') +
      (domain ? ';domain=' + domain : '') +
      ';expires=Thu, 01-Jan-1970 00:00:01 GMT';
  }
}
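This function accepts a name (mandatory) and an optional path and domain.
If the name of the cookie is found (it calls getCookie), the expires value
of the cookie is set to a date in the far past which automatically deletes the
cookie. If the cookie was set with a path or domain, pass the same path and
domain here, otherwise the browser treats it as a different cookie and leaves
the original in place.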
So to delete our examples:
deleteCookie('user');
deleteCookie('sky');
deleteCookie('person', '/', '.somedomain.com');
deleteCookie('monster');
In the above example user, sky, and person (which needs the path and domain
it was set with) would all be deleted from the browser's cookies. Monster never
existed so it won't be deleted, but it also won't throw an error or do anything
unexpected.
Server Cookies
As you might expect, PHP has several commands to handle and process cookies
on the server side. First off, all cookies are placed in the $_COOKIE array.
Using our examples, we can extract the values in PHP as follows.
$user = $_COOKIE["user"]; // $user is "19"
$sky = $_COOKIE["sky"]; // $sky is "blue"
$person = $_COOKIE["person"]; // $person is "21"
$monster = isset($_COOKIE["monster"]) ? $_COOKIE["monster"] : null; // null, it was never set
You can test to see if a cookie exists with the isset command.
if (isset($_COOKIE["sky"])) { } // This will be true
if (isset($_COOKIE["monster"])) { } // This will be false
You can set cookies with the setcookie command.
setcookie ( string name [, string value [, int expire [, string path [, string domain [, bool secure [, bool httponly]]]]]] )
This is similar to our JavaScript function but a little different. expire is
a Unix timestamp (the number of seconds since January 1, 1970) at which the cookie
expires, not a number of seconds from now. If you pass zero or leave it out, the cookie
lasts only until the browser is closed; if you pass a time in the past, the browser
deletes any existing cookie of the name you passed. To get a "days" value like
JavaScript, you'll need to do a little math: time()+(days * 60 * 60 * 24) where
days is the number of days you want to keep the cookie before it expires.
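The arithmetic is easy to check outside PHP. The sketch below does the same calculation in JavaScript, where Date.now() returns milliseconds and therefore has to be divided by 1000 to match PHP's time(). The function name expiryTimestamp and its optional second parameter are made up for this example.

```javascript
// Convert "days from now" into a Unix timestamp in seconds, the same
// quantity PHP's time() + (days * 60 * 60 * 24) produces.
// nowMs can be passed in so the result is predictable when testing.
function expiryTimestamp(days, nowMs) {
  const now = nowMs === undefined ? Date.now() : nowMs;
  return Math.floor(now / 1000) + days * 60 * 60 * 24;
}

// 12 days after the Unix epoch is 12 * 86400 seconds.
console.log(expiryTimestamp(12, 0)); // 1036800
```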
Path, domain, and secure are the same as our JavaScript function but there's an added argument
here called httponly. If you pass httponly as true, the cookie will only be visible to the
server. JavaScript will have no access to the cookie, which is very useful if you're
concerned about theft of sensitive data.
Here's the PHP equivalent of our JavaScript example.
setcookie('user', '19');
-- sets a cookie with a name of "user" and a value of "19".
setcookie('sky', 'blue', (time()+12*60*60*24));
-- Sets a cookie with a name of "sky", and a value of "blue"
which will expire in 12 days.
setcookie('person', '21', (time()+12*60*60*24), '/', 'somedomain.com', true, true);
-- Sets a cookie with a name of "person" and a value of "21"
which will be sent on all paths of "somedomain.com"
but only when the connection is encrypted and this cookie will be
visible only to the server, not to JavaScript.
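To delete a cookie in PHP you just use setcookie, specifying the name of the
cookie you want to delete, a value (which can be anything), and an expire time
in the past, such as time() - 3600. Use the same path and domain the cookie was
set with. A time of zero will not delete it; that creates a session cookie instead.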
Practical Uses
For me personally, the only valid use of a cookie is a small, discrete user
code after the user has submitted a login and password to your site. The user
code can then be used to match the browser in a database and from there access
necessary data and preferences.
There are of course other valid uses for cookies. Marketing types use
cookies to assist in filtering out hits from unique users. Some websites
store user preferences in cookies. Cookies are pretty much mandatory for e-commerce
shopping carts.
One of the more interesting uses of cookies is a means of preventing hot-linking.
Using a server language like php and a few tweaks to your web server you can
have all image requests redirected to your script. If the user has the appropriate
cookie, set by your site, then the script sends the appropriate headers and then
passes along the image/file. If the cookie isn't set then the file is processed
differently. It's not a perfect solution since there are as many users who
surf without cookies as there are who surf without referrers, but through
a combination of cookie and referrer checks you can drastically minimize the
false positives.
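The decision itself is just "cookie present OR referrer is ours". Here is a minimal sketch of that check, written as a plain function so it can be tried anywhere. The function name isOwnVisitor and its parameters are made up for this example; allowImages is the cookie name used in the hotlink example in this article.

```javascript
// Decide whether an image request looks like it came from our own pages.
// cookies: object of cookie names to values; referer: may be undefined.
function isOwnVisitor(cookies, referer, siteDomain) {
  if (cookies && cookies.allowImages !== undefined) return true;
  if (!referer) return false;
  let host;
  try {
    host = new URL(referer).hostname;
  } catch (e) {
    return false; // unparseable referrer
  }
  // Accept the site itself and any of its subdomains.
  return host === siteDomain || host.endsWith('.' + siteDomain);
}

console.log(isOwnVisitor({ allowImages: '1' }, undefined, 'mydomain.com'));            // true
console.log(isOwnVisitor({}, 'http://www.mydomain.com/gallery.html', 'mydomain.com')); // true
console.log(isOwnVisitor({}, 'http://other-site.example/page', 'mydomain.com'));       // false
console.log(isOwnVisitor({}, undefined, 'mydomain.com'));                              // false
```

Only a visitor with neither the cookie nor a matching referrer is treated as a hot-linker, which is what keeps the false positives down.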
Hotlink Protection Example
Here's a little hot-link protection system that uses a combination of
cookies and referrer tracking to prevent other sites from using your images on
Apache web servers.
The first step is to modify .htaccess. We'll intercept all requests for
/images/ and pass them to a PHP program NAMED images. If you have a URL of
"http://www.domain.com/images/pretty-sunset.gif" then pretty-sunset.gif would be
in your root directory. There is no directory named images, but there is a PHP
program NAMED "images".
<FilesMatch "^images$">
ForceType application/x-httpd-php
</FilesMatch>
This tells the server that any file named "images" is actually a php program
even though it doesn't have a .php extension.
Next, create a file named "images" in your root directory.
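<?php
// The referrer header is optional, so fall back to an empty host.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$host = (string) parse_url($referer, PHP_URL_HOST);
// Allow the request if the cookie is set or the referrer is mydomain.com or a subdomain.
if (isset($_COOKIE['allowImages']) || preg_match('/(^|\.)mydomain\.com$/i', $host)) {
    // Keep only the last part of the decoded path so nothing outside
    // this directory can be requested.
    $path = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
    $fname = basename(urldecode($path));
    // Serve only .gif files that actually exist.
    if (preg_match('/\.gif$/i', $fname) && is_file($fname)) {
        header('Content-Type: image/gif'); // tell browser we're sending a gif file
        readfile($fname);                  // send the gif file
        exit;
    }
}
header('Content-Type: image/gif'); // tell browser we're sending a gif file
readfile('error.gif');             // send error.gif
?>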
Simply put, this script looks for a cookie named allowImages, and if it exists
OR the referrer is from your domain (you'll need to change mydomain.com) then
this PHP example will show the requested gif file, otherwise it will show a file
named error.gif. Remember that this works only for .gif files; you'll need to test
fname and send the appropriate MIME types for jpg, png, etc. Also, even though
it looks like the images are in a directory named images, they should all be in your
root directory.
All you need to do is set a cookie named "allowImages" somewhere in your site. Once
you do that, even if the user doesn't transmit referrer headers the image will still
be displayed because he has the cookie which says it's ok to show the image.
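The extension test mentioned above might look something like this. It is sketched in JavaScript to stay with the language of the earlier helper functions; the same lookup table carries over to the PHP script directly. The function name mimeTypeFor and the list of supported extensions are assumptions for this example.

```javascript
// Map a file name to the Content-Type header to send.
// null means "not an image type we serve".
function mimeTypeFor(fname) {
  const types = {
    gif: 'image/gif',
    jpg: 'image/jpeg',
    jpeg: 'image/jpeg',
    png: 'image/png'
  };
  const dot = fname.lastIndexOf('.');
  if (dot === -1) return null;
  const ext = fname.slice(dot + 1).toLowerCase();
  return Object.prototype.hasOwnProperty.call(types, ext) ? types[ext] : null;
}

console.log(mimeTypeFor('pretty-sunset.gif')); // image/gif
console.log(mimeTypeFor('photo.JPG'));         // image/jpeg
console.log(mimeTypeFor('notes.txt'));         // null
```

Returning null for anything unrecognized gives the script an easy way to refuse files that are not images at all.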
Additional Reading
Cookies are like an iceberg in that they're a lot deeper than they appear.
I've tried to go into as much detail as possible, without overwhelming you, but
if you are pushing the cutting edge you'll want to check out the official
RFC and the Wikipedia entry which contain much in the way of abstract information
but little in the way of practical use. Wikipedia especially is very good at
going over the privacy and security concerns inherent in using cookies.