Using the CURL library in PHP
(Page 2 out of 3)
What's possible with the curl options
If you have a look at the manual for the curl_setopt() function you'll notice there's a huge list of different options. Let's go through the most interesting ones.
The first interesting option is CURLOPT_FOLLOWLOCATION. When this is set to true, curl will automatically follow any redirect it gets sent. For example, when you try to retrieve a PHP page, and the PHP page uses header("Location: http://new_url"), curl will automatically follow it. The example below demonstrates this:
<?php
// create a new curl resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.com/");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// grab URL, and print
curl_exec($ch);
?>
If Google decides to send a redirect, the example above will now follow to the new location. Two options that are related to this are the CURLOPT_MAXREDIRS and CURLOPT_AUTOREFERER options.
The CURLOPT_MAXREDIRS option sets the maximum number of redirects curl will follow; once that limit is reached, any further redirects won't be followed. If the CURLOPT_AUTOREFERER option is set to TRUE, curl will automatically set the Referer header on each request that follows a redirect. Not that important really, but it could be useful in certain cases.
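To make these two options a bit more concrete, here's a minimal sketch that combines them with CURLOPT_FOLLOWLOCATION. The helper names redirect_options() and fetch_following_redirects(), and the default limit of 5 redirects, are made up for this example; only the CURLOPT_* constants and the curl_* functions come from the curl extension.

```php
<?php
// Hypothetical helper: builds the redirect-related options as one array.
// The name and the default of 5 redirects are assumptions for this sketch.
function redirect_options($maxRedirects = 5)
{
    return array(
        CURLOPT_FOLLOWLOCATION => true,          // follow Location: headers
        CURLOPT_MAXREDIRS      => $maxRedirects, // give up after this many redirects
        CURLOPT_AUTOREFERER    => true,          // set Referer when following a redirect
        CURLOPT_RETURNTRANSFER => true,          // return the page instead of printing it
    );
}

// Hypothetical helper: fetches a URL using the options above.
// Returns the page as a string, or false if the request failed.
function fetch_following_redirects($url, $maxRedirects = 5)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, redirect_options($maxRedirects));
    return curl_exec($ch);
}
```

A call like fetch_following_redirects('http://www.google.com/', 3) would follow at most three redirects. If the limit is exceeded, curl_exec() fails and returns false, so it's worth checking the return value.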
Next up is the CURLOPT_POST option. This is a very useful option, as it allows you to do POST requests instead of GET requests, which means you can submit forms to other pages without having to fill in the form yourself. The example below demonstrates what I mean:
<?php
// create a new curl resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://projects/phpit/content/using%20curl%20php/demos/handle_form.php");
// Do a POST
$data = array('name' => 'Dennis', 'surname' => 'Pallett');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// grab URL, and print
curl_exec($ch);
?>
And the handle_form.php file:
<?php
echo 'Form variables I received:<br />';
echo '<pre>';
print_r($_POST);
echo '</pre>';
?>
As you can see this makes it really easy to submit forms, and it's a great way to test all your forms, without having to fill them in all the time.
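One detail that can matter when the receiving script is picky: when CURLOPT_POSTFIELDS is given an array, curl sends the data as multipart/form-data, whereas a URL-encoded string is sent as application/x-www-form-urlencoded, which is what a plain HTML form uses by default. The sketch below shows the string variant using http_build_query(). The helper name post_urlencoded() is made up for this example.

```php
<?php
// Same form fields as in the example above
$data = array('name' => 'Dennis', 'surname' => 'Pallett');

// Build the URL-encoded body ourselves. '&' is passed explicitly so the
// result doesn't depend on the arg_separator.output ini setting.
$body = http_build_query($data, '', '&');
// $body is now: name=Dennis&surname=Pallett

// Hypothetical helper (the name is an assumption for this sketch).
// Returns the response as a string, or false if the request failed.
function post_urlencoded($url, array $fields)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    // A string body is sent as application/x-www-form-urlencoded;
    // an array would be sent as multipart/form-data instead.
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields, '', '&'));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    return curl_exec($ch);
}
```

For a simple handler that only reads $_POST, both variants should look the same on the receiving end, so the array form from the article is fine there.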
The CURLOPT_CONNECTTIMEOUT option sets how long, in seconds, curl should wait whilst trying to connect. This is a very important option: if you set it too low, requests could fail, but if you set it too high (e.g. 1000 or 0 for unlimited) your PHP scripts could hang for a long time waiting on an unresponsive server. A related option is CURLOPT_TIMEOUT, which sets how long, in seconds, a curl request is allowed to execute as a whole. If you set this to a low value, slow pages might come back incomplete, since they take a while to download.
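As a rough illustration, the sketch below sets both timeouts and checks whether the request failed. The helper names timeout_options() and fetch_with_timeouts(), and the values of 5 and 20 seconds, are assumptions picked for this example rather than recommended settings.

```php
<?php
// Hypothetical helper: both timeouts are given in seconds.
// The defaults of 5 and 20 are arbitrary values for this sketch.
function timeout_options($connectSeconds = 5, $totalSeconds = 20)
{
    return array(
        CURLOPT_CONNECTTIMEOUT => $connectSeconds, // max time to establish the connection
        CURLOPT_TIMEOUT        => $totalSeconds,   // max time for the whole request
        CURLOPT_RETURNTRANSFER => true,            // return the page instead of printing it
    );
}

// Hypothetical helper: returns an array with the keys 'ok', 'body' and 'error'.
function fetch_with_timeouts($url, $connectSeconds = 5, $totalSeconds = 20)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, timeout_options($connectSeconds, $totalSeconds));
    $body = curl_exec($ch);
    if ($body === false) {
        // curl_errno() reports 28 when one of the timeouts was hit
        return array('ok' => false, 'body' => null, 'error' => curl_error($ch));
    }
    return array('ok' => true, 'body' => $body, 'error' => '');
}
```

Checking the return value of curl_exec() against false is what tells you a timeout (or any other failure) happened; curl_error() then gives a readable message.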
The final interesting option is the CURLOPT_USERAGENT option, which allows you to set the user agent of the request. This makes it possible to create your own web spiders, with their own user agent, like so:
<?php
// create a new curl resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.useragent.org/");
curl_setopt($ch, CURLOPT_USERAGENT, 'My custom web spider/0.1');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// grab URL, and print
curl_exec($ch);
?>
Now that we've covered most of the interesting options, let's have a look at the curl_getinfo() function and what it can do for us.
Getting info about the page
The curl_getinfo() function is used to get all kinds of different information about the page that was retrieved and the request itself. You can either specify what information you want by setting the second argument, or you can simply leave the second argument out and get an associative array with every detail. The example below demonstrates this:
<?php
// create a new curl resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.google.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FILETIME, true);
// grab URL
$output = curl_exec($ch);
// Print info
echo '<pre>';
print_r(curl_getinfo($ch));
echo '</pre>';
?>
Most of the information returned is about the request itself, like the amount of time it took and the HTTP status code that was returned, but there's also some information on the page, like the content type and the last modified time (only if you explicitly ask for the last modified time by setting CURLOPT_FILETIME, like I did in the example).
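The example above leaves the second argument out. Here's a short sketch of the other form, where you ask for one piece of information at a time using a CURLINFO_* constant. The helper name fetch_summary() and the keys of the array it returns are made up for this example.

```php
<?php
// Hypothetical helper: fetches a URL and returns a few selected details.
// The function name and the keys of the returned array are assumptions.
function fetch_summary($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);

    // With a second argument, curl_getinfo() returns a single value
    return array(
        'status'       => curl_getinfo($ch, CURLINFO_HTTP_CODE),    // e.g. 200
        'content_type' => curl_getinfo($ch, CURLINFO_CONTENT_TYPE), // e.g. text/html
        'seconds'      => curl_getinfo($ch, CURLINFO_TOTAL_TIME),   // total request time
    );
}
```

Something like print_r(fetch_summary('http://www.google.com')) would then print just those three values instead of the full array.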
That's all about curl_getinfo(), so let's have a look at some practical uses now.
April 25th, 2006 at 4:15 am
You can also use curl to submit XML requests to XML providers, like credit card clearing houses. While I don’t know how efficient this is, below is a small sample.
$xml = "
$MerchantID
$Account
$OrderID
$Amount
$CardNumber
$CardExpiryDate
$CardType
";
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $PaymentServer);
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $xml);
$response = curl_exec ($curl);
curl_close ($curl);
$returnedXML = simplexml_load_string($response);
April 25th, 2006 at 4:17 am
Sorry, forgot that you won’t see xml tags, so here I’ll try again:
$xml = "
$MerchantID
$Account
$OrderID
$Amount
$CardNumber
$CardExpiryDate
$CardType
";
April 25th, 2006 at 5:59 pm
[…] I’ve heard a few references to CURL but never knew much about it. CURL allows you to scrape/use other web pages as data. The most interesting use (I found) is you can automatically fill out form data and retrieve the $_POST array. CURL… it’s one bad mutha… http://phpit.net/article/using-curl-php/1 […]
April 26th, 2006 at 1:29 pm
Excellent***
Very very easy to understand
April 28th, 2006 at 7:44 am
As a Java/J2EE developer who also happens to use PHP for a few projects, I must conclude that this is the worst HTTP API that I have ever seen. It is not OO; you have to use curl_setopt() to set everything, including the URL and the parameters; by default output is printed to the browser instead of returned to the user, …
Maybe it’s not a bad idea to create a wrapper around it.
April 29th, 2006 at 10:38 am
Jan, there are probably tons of wrappers already on phpclasses.org, and it’d be pretty easy to write your own as well.
August 23rd, 2006 at 1:45 pm
CURL is certainly a good thing. Using this I wrote a couple of programs to remotely log in to a site and get the data posted there. This is simply an amazing thing for every PHP programmer to work with.