Tag Archives: Curl

Simple PHP proxy for cross origin HEAD requests

I have some JavaScript that runs through all the URLs on a site of mine to prime the cache and check for dead links and other problems. The problem is that the site also includes embedded video files hosted by someone else. Requests for these fail, of course, because of cross-origin restrictions, and since I’m not in control of their servers I needed a workaround.

I wrote this small PHP script to work as a proxy and as far as I can see it works pretty great. Short and easy to follow as well, so that’s always fun :)

So basically the script just pipes through whatever headers it gets to the target URL and pipes back whatever headers it gets in the response. Seems to work, but let me know of weaknesses and easy improvements if you see any :)

<?php

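// Forward the browser's request headers, minus the ones that would confuse the target.
$request_headers = getallheaders();
unset($request_headers['Cookie']);
unset($request_headers['Host']);

// cURL expects a list of "Name: value" strings.
foreach($request_headers as $key => &$value)
    $value = $key.': '.$value;
unset($value); // break the reference left behind by the foreach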

$url = $_SERVER['QUERY_STRING'];
$limit = 20;

$curl = curl_init();
do
{
    curl_setopt_array($curl, array
    (
        CURLOPT_URL => $url,
        CURLOPT_HTTPHEADER => $request_headers,
       
        CURLOPT_RETURNTRANSFER => TRUE,
        CURLOPT_HEADER => TRUE,
        CURLOPT_NOBODY => TRUE,

        CURLOPT_FOLLOWLOCATION => TRUE,
        CURLOPT_MAXREDIRS => $limit--,
    ));

    ob_start();
    $headers = trim(curl_exec($curl));
    $url = curl_getinfo($curl, CURLINFO_REDIRECT_URL);
    ob_end_clean();
}
while($url and $limit > 0);

curl_close($curl);

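// If cURL followed redirects itself, the output holds one header block per hop; keep only the last.
$headers = trim(substr($headers, strrpos($headers, "\r\n\r\n")));
// Drop any headers PHP has queued, then relay the target's status line and headers.
header_remove();
foreach(explode("\r\n", $headers) as $h)
    header($h);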

Notes

  • The Cookie and Host headers sent from the browser are removed so they don’t mess up the request.
  • cURL wasn’t acting properly where I deployed this. It didn’t follow redirects, and curl_exec output content to the browser even with return transfer set to true. So the script has a manual workaround for following the redirects and uses output buffering to make sure nothing goes to the browser before we want it to.
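
A quick way to poke at the proxy from a terminal, as a rough sketch only. The proxy address below is a made-up placeholder, and the helper names proxy_url and head_via_proxy are just for illustration. Since the script uses the raw query string as the target, the target URL is appended after the ? as-is, without URL-encoding.

```shell
# Hypothetical location of the proxy script; change it to wherever you deployed it.
PROXY="https://example.com/head-proxy.php"

# Build the proxy URL. The script reads the raw query string as the target,
# so the target URL is appended as-is, without URL-encoding.
proxy_url() {
    printf '%s?%s\n' "$PROXY" "$1"
}

# Send a HEAD request through the proxy and print the relayed response headers.
head_via_proxy() {
    curl --silent --head "$(proxy_url "$1")"
}
```

Calling head_via_proxy 'https://videos.example.net/clip.mp4' (another placeholder URL) should then print the status line and headers the proxy relays back from the target, assuming the proxy is reachable.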

How to install software on SunOS

Solaris is a major pain, and unfortunately several servers I have to deal with run on it. Why is it a pain? Because for some reason it seems all Solaris machines are set up differently. Different sets of installed binaries, different places where things are configured, and so on. Probably all caused by lazy sysadmins and/or developers.

Anyway, today I found out that curl wasn’t installed on a server where I needed to test a SOAP request. After some hours of digging I finally managed to get a working curl installed, and figured I should document the procedure here in case I need it again, which I probably will…


Pretty print XML on Unix command line

Needed to check some XML output from a CalDAV service so I used curl, which is nice and simple. Only problem was that all the XML came on a single long unreadable line. Figured out this was quite simple to fix.

$ curl --digest --user usr:pwd -X PROPFIND  | xmllint --format -

The key part here is of course the piping into xmllint. --format tells it to format the XML and the - tells it to read the XML from standard input. The dash can be swapped with the path to an XML file if you need to format XML you have already downloaded.

$ xmllint --format file.xml
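
If you do this often, the pipe can be wrapped in a small shell function. This is just a sketch: the name pretty_xml is made up, the sample XML is a stand-in rather than real CalDAV output, and falling back to cat when xmllint is missing is my own assumption about what you would want.

```shell
# Pretty-print XML from standard input. Uses xmllint when it is installed,
# and otherwise passes the input through unchanged.
pretty_xml() {
    if command -v xmllint >/dev/null 2>&1; then
        xmllint --format -
    else
        cat
    fi
}

# Made-up single-line sample standing in for a real server response.
printf '%s\n' '<multistatus><response><href>/calendars/</href></response></multistatus>' | pretty_xml
```

The same function can sit at the end of any curl pipeline in place of the bare xmllint call.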

Simple pimple dimple :)

PHP: How to get all images from an HTML page

I was curious about how I could make something similar to what Facebook does when you add a link. Somehow it loads images found on the page your link leads to, and then it presents them to you so you can select the one you want to use as a thumbnail.

Well, step one to solve this is of course to find all the images on a page, and that is what I will present in this post. It will be sort of like a backend service we can use later from an AJAX call. You post it a URL, and you get back all the image URLs it found. Let’s put the pedal to the metal!
