Wednesday, November 2, 2011

Swedish browser statistics Q4 2011

It's been a while since I last did one of these, but it's time for another rundown of who uses what in the Swedish (and Nordic) market.

These figures are extracted from a 24-hour statistics collection period and represent around five million pageviews, of which roughly 350K were mobile views, from around one million unique visitors. The sources are mainstream sites that serve mainly Swedish content and target a wide range of audiences.

Desktop browsers:

Rank  Browser                    Now       Feb 2011   Change
1.    Firefox                    43.42%    (51.92%)   -8.5%
2.    Chrome                     42.07%    (31.57%)   +10.5%
3.    Internet Explorer          9.25%     (12.12%)   -3.13%
4.    Safari                     3.23%     (2.16%)    +1.08%
5.    Opera                      1.60%     (1.81%)    -0.21%
6.    Mozilla Compatible Agent   0.20%     (0.01%)    +0.19%
7.    RockMelt                   0.08%     (0.11%)    -0.03%
8.    BlackBerry9700             0.04%     ---        +0.04%
9.    IE with Chrome Frame       0.02%     ---        +0.02%
10.   Mozilla                    0.01%     (0.03%)    -0.02%

Mobile browsers:

Browser          Views     Share
iPhone           396,326   70.9%
Android          153,449   27.4%
Symbian          8,076     1.4%
BlackBerry       992       0.2%
Windows Phone    351       0.1%
Windows CE       21        0%
Unknown mobile   17        0%
WebOS            2         0%

Conclusion:
IE is steadily declining. Firefox and Chrome dominate the desktop market. iPhone dominates the mobile market, with Android a distant second - although on Android, browser brands and versions seem obscured.

Mobile continues to grow as a platform, now at 14.7% of total visitors. iPad accounts for 1.5% of that figure, while Android tablets are hard to identify (and Google Analytics doesn't group screen resolutions very well), so I'll leave that figure dangling until I get better data.

For us CSS and HTML jockeys, it sure looks like we get to play with the nice toys a bit more next year, since Chrome and Firefox together dominate the market with a whopping 85.49%. Cross-browser hell just became a little cooler.

Wednesday, October 5, 2011

Node.JS-based real-time web tracking

This all started out of a need. A need for a statistics service that did not suck and that could handle our quite intense traffic (we peak at 1,500 req/sec). Google Analytics failed miserably, GetClicky failed a little less miserably, and other options were quite simply too expensive for our taste.

Enter Node.JS and a little bit of pre-alpha code known as Hummingbird. I'll let Hummingbird introduce itself:

"Hummingbird lets you see how visitors are interacting with your website in real time. And by “real time” we don’t mean it refreshes every 5 minutes—WebSockets enable Hummingbird to update 20 times per second. Hummingbird is built on top of Node.js, a new javascript web toolkit that can handle large amounts of traffic and many concurrent users."

Sounds good to me, where do I sign up? Well, as is customary these days, the project lives on GitHub and is installed by first cloning the repo and then running "npm install" in the hummingbird subdirectory. Of course, you need a recent Node.js core, MongoDB, and libgeoip-dev 1.4.7+ if you want IP geolocation working (Hummingbird has a nice little map display in the demo setup). Unfortunately there is not much documentation written (yet), so be prepared for a lot of debugging and guesswork if you want it to do anything besides the graph / total / map included in the demo.
Note: On Debian Lenny, libgeoip is at 1.4.4, which means you'll have to load it from lenny-backports to get the correct version (1.4.7+). Instructions for installing from backports can be found here: http://backports-master.debian.org/Instructions/
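
For reference, getting a copy up and running boils down to something like this (a minimal sketch - the GitHub location is the one the project used at the time of writing, so double-check it):

git clone https://github.com/mnutt/hummingbird.git
cd hummingbird
npm install
# mongod must be up and running before you start tracking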

So how does it work? Well, the core is really a pixel tracker - including a 1x1 pixel from the system in your content registers a view with Hummingbird, in effect saving a little bit of data to a MongoDB collection. Hummingbird also runs a dashboard (if enabled) that can be viewed over HTTP. The dashboard uses WebSockets for rapid updates and binds the updates to jQuery elements on the dashboard page.
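
Embedding the pixel is then just a matter of adding an image tag to your pages. A minimal sketch (the tracker host and pixel path are assumptions; adapt them to your own install):

<img src="http://tracker.example.com/tracking_pixel.gif" width="1" height="1" alt="" />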

This is about as far as I have gotten right now. The lack of documentation makes it a bit hard to develop more widgets for the dashboard, but that's nothing a little trial-and-error won't fix :)

Update:
Managed to get another piece of the puzzle working - a custom widget for the dashboard page. I started with the backend connection by combining the code from cart_adds.js and total_views.js to get a starting point for an event-driven counter widget. I followed up by creating a very rudimentary widget endpoint that picks up whatever is processed on the backend by hooking into the event system. Any tracking that has a specific event attached to it (basically parts of the query string for the tracking pixel) will result in a display on the dashboard that reacts to new hits by incrementing a value or otherwise adjusting the style or layout of the widget. Nothing new really, but still a valuable tool for our editors, who are constantly chasing those extra page views.
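
For the curious, the dashboard side of such a counter widget can be sketched roughly like this. This is my own illustration rather than Hummingbird's actual widget API - the endpoint, message format and element ID are all assumptions:

// Listen for tracking events pushed over the dashboard's WebSocket
var socket = new WebSocket('ws://tracker.example.com:8080/stream');
var hits = 0;

socket.onmessage = function(msg) {
  var data = JSON.parse(msg.data);
  // Only react to updates carrying our custom event name
  if (data.event === 'frontpage_click') {
    hits++;
    // Bump the counter and flash it, jQuery style
    $('#frontpage-counter').text(hits).fadeOut(50).fadeIn(50);
  }
};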

I'll update again once I have a better implementation of it all.

Thursday, August 18, 2011

How to ip-migrate web sites using Varnish

This is a quick guide on how to migrate a bunch of web sites using Varnish as an intermediary, allowing low-latency, zero-downtime migration between IP addresses or domain names.

Requirements: Varnish 2.x up and running, access to the Apache config, access to the DNS config.

1) Configure Varnish

Start out by defining your existing web sites as backends in the Varnish config (default.vcl) by adding them in this format at the beginning of your VCL:

backend website1 {
  // NO CACHING
  // website1.company.com
  .host = "10.0.0.1";
  .port = "80";
  .connect_timeout = 160s;
  .first_byte_timeout = 55s;
  .between_bytes_timeout = 25s;
}

backend website2 {
  ...
}

backend website3 {
  ...
}

Then add some clever host/backend switching logic to your vcl:

sub vcl_recv {
  # Find out which backend to use for the request
  if (req.http.Host == "website1.company.com") {
    set req.backend = website1;
    return(pass);
  }
  else if (req.http.Host == "website2.company.com") {
    set req.backend = website2;
    return(pass);
  }
  else {
    # maybe switch to your default backend here if you're using varnish in production
  }
  .....
}


You're now ready to start using Varnish as your intermediary frontend router. Varnish will pass any requests to these sites, and you are now free to change the IPs of the sites without the lag of DNS. You remembered to load and use your Varnish config too, right?
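
If not, loading and activating the updated VCL without a restart looks like this (same approach as in my varnishadm rundown further down; "migrate01" is just an arbitrary label):

varnishadm -t localhost:6082 vcl.load migrate01 /etc/varnish/default.vcl
varnishadm -t localhost:6082 vcl.use migrate01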

2) Alter DNS for the sites
Update your DNS records; all www pointers should be changed to the IP address of your Varnish server.
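
In a BIND-style zone file the change might look like this (the Varnish IP 10.0.0.100 is a placeholder, and the low TTL helps the switch propagate quickly):

website1.company.com.  300  IN  A  10.0.0.100
website2.company.com.  300  IN  A  10.0.0.100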

3) Migrate!
When the changes have propagated (and you've verified that everything works as intended), you are free to change IPs or whatever you need on your sites; Varnish will maintain the dnsname -> backend mapping as long as you remember to update your backend config in the VCL. And changes are instant, allowing fast rollbacks if needed.

Wednesday, August 17, 2011

How to use bash to find all used IPs in your Apache configs (following the Debian config model)

for i in /etc/apache2/sites-enabled/*.vhost; do
  echo "Working file: $i"
  echo "ServerName: $(awk '/ServerName/ {print $2}' "$i")"
  echo "IP address: $(sed -n 's/.*<VirtualHost \([^>]*\)>.*/\1/p' "$i")"
done

Tuesday, March 22, 2011

Top 5 Varnish commands

Here's a brief list of my top 5 varnish commands

varnishstat
This is what I use to get a bird's-eye view of what we have cooking. It provides all the info you need to spot cache misses and errors.
Usage: varnishstat for a continuous display, or varnishstat -1 for a quick one-off. More on this command: http://www.varnish-cache.org/docs/2.1/reference/varnishstat.html

varnishhist
Provides a histogram view of cache hits and misses.
Usage: varnishhist More on this command: http://www.varnish-cache.org/docs/2.1/reference/varnishhist.html

varnishlog
Provides detailed information on requests. I use it to debug cache operations, for example by issuing varnishlog -c -o ReqStart [insert ip here], which lets me see exactly what happens when I request a cached object from my browser.
Usage: varnishlog -c -o ReqStart 192.168.0.1 More on this command: http://www.varnish-cache.org/docs/2.1/reference/varnishlog.html

varnishtop
I use this mainly to get lists of things, like this one that gives me a list of the top URLs hitting the backend (pass).
Usage: varnishtop -b -i TxURL More on this command: http://www.varnish-cache.org/docs/2.1/reference/varnishtop.html

varnishadm
Command-line Varnish administration. I mainly use it to reload VCL and purge URLs.
Usage: varnishadm -t host:port command
Reloading a vcl from the command line can be done like this:
Load (your changed vcl): varnishadm -t localhost:6082 vcl.load load01 /etc/varnish/default.vcl
Use (apply) the vcl: varnishadm -t localhost:6082 vcl.use load01
More on this command: http://www.varnish-cache.org/docs/2.1/reference/varnishadm.html
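
Purging follows the same pattern. A quick sketch, assuming Varnish 2.1 where the management command is purge.url (it takes a regular expression, so anchor it to avoid nuking more than intended):

varnishadm -t localhost:6082 purge.url "^/news/some-article$"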

Tuesday, March 15, 2011

Facebook iframe like box tracking with Clicky (or Google Analytics)

This is how to do GetClicky tracking for Facebook's XFBML iframe Like box and the XFBML Like button.

Without going into details about the Facebook app ID and such, this is a working method to implement GetClicky tracking for Facebook Like boxes and buttons.

Standard fb script:
<div id="fb-root"></div>
<script type="text/javascript">
//<![CDATA[
window.fbAsyncInit = function() {
  FB.init({appId: '157019890979609', status: true, cookie: true,
           xfbml: true});
};
(function() {
  var e = document.createElement('script');
  e.async = true;
  e.src = document.location.protocol +
    '//connect.facebook.net/sv_SE/all.js';
  document.getElementById('fb-root').appendChild(e);
}());
//]]>
</script>

The facebook "like" event hook:
FB.Event.subscribe('edge.create', function(href, widget) {
  // do something here
});

The clicky call:
if (typeof(clicky) != "undefined") {
  clicky.log('page url', 'facebook like clicked', 'click');
}


Combined:
<div id="fb-root"></div>
<script type="text/javascript">
//<![CDATA[
window.fbAsyncInit = function() {
FB.init({appId: '157019890979609', status: true, cookie: true,
xfbml: true});
FB.Event.subscribe('edge.create', function(href, widget) {
if(typeof(clicky) != "undefined"){
clicky.log('page url','facebook like clicked','click');
}
});
};
(function() {
var e = document.createElement('script'); 
e.async = true;
e.src = document.location.protocol +
'//connect.facebook.net/sv_SE/all.js';
document.getElementById('fb-root').appendChild(e);
}());
//]]>
</script>

Of course the same works for Google Analytics; just swap clicky.log for _gaq.push.
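
With classic Google Analytics (ga.js) the hook might look like this - the category/action/label strings are just examples:

FB.Event.subscribe('edge.create', function(href, widget) {
  if (typeof(_gaq) != "undefined") {
    _gaq.push(['_trackEvent', 'Facebook', 'Like', href]);
  }
});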

Monday, March 14, 2011

Custom Drupal RSS Feed with CCK

Ever needed to do a quick RSS feed one-off out of your Drupal site? This is how I do it.

Setup: Drupal 6, CCK, Views enabled.

First, create a new content type and name it rssarticle. Add three text fields named heading, text and link. Save the content type.

Now we'll use Views as a query builder to get the SQL you'll run to build the feed. Start by setting up a Node view, set Style to Unformatted and Row style to Fields. Select the fields you want in your feed, then add a node type filter to restrict the view to the newly added rssarticle content type. Keep the rest of the settings untouched.

Now hit the Preview button. You should see some nodes rendered and a Query field that displays the query, like so:

SELECT node.nid AS nid,
   node_data_field_heading.field_heading_value AS 
   node_data_field_heading_field_heading_value,
   node.type AS node_type,
   node.vid AS node_vid,
   node_data_field_heading.field_link_value AS 
   node_data_field_heading_field_link_value,
   node_data_field_heading.field_text_value AS 
   node_data_field_heading_field_text_value
 FROM node node 
 LEFT JOIN content_type_rssarticle node_data_field_heading ON 
   node.vid = node_data_field_heading.vid
 WHERE node.type in ('rssarticle')

Copy that SQL query and fire up your favorite PHP editor. Create a new PHP file named mycustomfeed.php and start by adding the following lines:

<?php
// Load necessary Drupal parts
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// XML header
echo('<?xml version="1.0" encoding="utf-8" ?>');

// Get nodes from db
$results = db_query("");

// Loop over the result
while ($result = db_fetch_object($results)) {
 print $result->nid;
}
?>

Now, this won't do much on its own; we need the SQL you just copied to go in here too. Complete the code with the following, placing the query inside the db_query("") call.

<?php
// Load necessary Drupal parts
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// XML header
echo('<?xml version="1.0" encoding="utf-8" ?>');

// Get nodes from db
$results = db_query("SELECT node.nid AS nid, 
node_data_field_heading.field_heading_value AS 
node_data_field_heading_field_heading_value, node.type AS 
node_type, node.vid AS node_vid, 
node_data_field_heading.field_link_value AS 
node_data_field_heading_field_link_value, 
node_data_field_heading.field_text_value AS 
node_data_field_heading_field_text_value FROM node node 
LEFT JOIN content_type_rssarticle node_data_field_heading 
ON node.vid = node_data_field_heading.vid WHERE 
node.type in ('rssarticle')");

// Loop over the result
while ($result = db_fetch_object($results)) {
 print $result->nid . "\n";
}
?>

Upload the file to your Drupal root and run it either from a browser or from the command line.

It should output an XML header and a list of NIDs. Check the error logs for any warnings or SQL errors and correct your code accordingly.

Now that we have the query working, it's time to finish the job: adding the XML parts required for a valid RSS 2.0 feed, as well as adding a LIMIT clause and performing some date conversion magic (RSS 2.0 wants RFC 822-style dates).

<?php
// Load necessary Drupal parts
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// XML header
echo('<?xml version="1.0" encoding="utf-8" ?>');
?>

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>My Custom Feed</title>
    <description>This is My Custom Feed</description>
    <link>http://www.domain.com/mycustomfeed</link>

<?php

// Get nodes from db
$results = db_query("SELECT node.nid AS nid,
node_data_field_heading.field_heading_value AS 
node_data_field_heading_field_heading_value, 
node.type AS node_type, node.vid AS node_vid, 
node_data_field_heading.field_link_value AS 
node_data_field_heading_field_link_value, 
node_data_field_heading.field_text_value AS 
node_data_field_heading_field_text_value,
node.changed AS node_changed FROM node node 
LEFT JOIN content_type_rssarticle node_data_field_heading 
ON node.vid = node_data_field_heading.vid
WHERE node.type in ('rssarticle') LIMIT 0,10");
// Execute query and loop over the result
while ($result = db_fetch_object($results)) {
?>
<item>
<title>
<?php echo $result->node_data_field_heading_field_heading_value?>
</title>
<description>
<?php echo $result->node_data_field_heading_field_text_value?>
</description>
<link>
<?php echo $result->node_data_field_heading_field_link_value?>
</link>
<author>noreply@domain.com (Corporate Inc)</author>
<dc:creator>Head.Honcho</dc:creator>
<category>CustomCategory</category>
<guid isPermaLink="false"><?php echo $result->nid?></guid>
<pubDate>
<?php echo date('D, d M Y H:i:s T',$result->node_changed)?>
</pubDate>
</item>
<?php } // End while loop ?>
</channel>
</rss>

For more on custom RSS from Drupal, check out my post on RSS with ImageCache images and media enclosures.

Thursday, March 3, 2011

Drupal RSS 2 with imagecache images in media enclosures

I got a new project to do. It involves publishing news clips to various sites via valid RSS 2.0. The feed includes a headline, a link, a short text, and an image. But the image has to go in an enclosure like the following:

<enclosure url="http://domain.com/file.jpg" 
length="3737" 
type="image/jpeg"/>

The way I chose to do this was to set up a view to handle the listings and then do a custom template based on the dead-simple Basic theme. I removed all the trimmings and just added the outline of a valid RSS 2.0 feed structure by creating my own page.tpl.php, node-view-viewname.tpl.php, and views-view.tpl.php, where node-view-viewname is the template file that prints the actual item. Easy.
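
Roughly, the feed's page template just prints the feed shell around the view output. A minimal sketch of such a page.tpl.php (titles and URLs are placeholders; the XML declaration is echoed from PHP so it doesn't clash with PHP's open tags):

<?php echo '<?xml version="1.0" encoding="utf-8" ?>'; ?>
<rss version="2.0">
  <channel>
    <title>News clips</title>
    <link>http://www.domain.com/newsclips</link>
    <description>News clips for syndication</description>
    <?php print $content; // the rendered view rows ?>
  </channel>
</rss>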

Now here comes a problem... To cope with the never-ending creativity of our dear editors, we use ImageCache. Without it we would be lost in a world of image sizes and formats. If you are at all familiar with ImageCache, you know that it's lazy and only generates an image when a request for it triggers a 404. This is fine for most purposes, but since RSS media enclosures require certain metadata about the file, and the file doesn't exist yet, we need to force ImageCache to generate the image before we can fill the enclosure tag with the correct values.
We can do this in template.php by adding the following function:

/**
 * This function pregenerates ImageCache images
 * to give access to the file metadata.
 * @param $vars
 */
function basic_attach_ic_metadata(&$vars) {
  $presetname = "100x100";
  $filepath = $vars['field_bild'][0]['filepath'];
  $preset = imagecache_preset_by_name($presetname);
  $dst = imagecache_create_path($presetname, $filepath);
  if (!file_exists($dst)) {
    imagecache_build_derivative($preset['actions'], $filepath, $dst);
  }
  $vars['genimage'] = $dst;
}

This in turn gets called from a preprocess hook in template.php, like so:

<?php
function basic_preprocess_node(&$vars, $hook) {
  // $vars is already passed by reference - no & at call time
  basic_attach_ic_metadata($vars);
  ...
}
?>
Now, when the node is prepared for display, the variable $genimage will be set to the path of a generated image. This is now available for your node template to consume, and we can print the enclosure tag:

<enclosure url="http://domain.com/<?php echo $genimage;?>"
length="<?php echo filesize($genimage);?>" 
type="image/jpeg"/>

All done!

Tuesday, March 1, 2011

External ads in Drupal

Alrighty. So someone is twisting your arm, trying to make you add blocks of ad network script tags to your perfect Drupal site? Polluting your DOM? No way, not on my shift! Here's how to keep it safe.

First you need a basic understanding of a few concepts. I'll use Taxonomy, CCK, Views, and some funky template preprocess magic, all bound together by our beloved page.tpl.php. The external ads are presumably a bunch of script tags.

Start out by creating a .php or .html file for each ad and place these in a folder somewhere within your Drupal root - why not name it "ads". Any cache-busting techniques should be done with PHP. Note the full URLs.
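
For example, a single ad file can wrap the network's script tag and bust caches with a timestamp (the ad network URL is of course a placeholder):

<script type="text/javascript"
  src="http://ads.example.com/ad.js?cb=<?php echo time(); ?>">
</script>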

Now for some Drupal hands-on: create a vocabulary that will hold the terms telling Drupal which nodes get which ads. Name it "Customers". It should be a flat vocabulary.

Next, create a content type that will serve as the holder for the dreaded external ads. Name it "Customer". Add as many text fields as you have ad slots. The text fields will only hold URLs and should be plain text only. Save when done. (You can easily add more custom fields here for your nodes to use.)

Move on to Taxonomy -> Vocabularies. Configure the previously added vocabulary so that it is enabled for your "regular" content type, letting you relate content to its Customer. Now add as many terms to the vocabulary as you have customer sections, e.g. companya, companyb, companyc.

Create a Customer node. Populate the fields with the URLs from the first step. Fill the title field with one of your terms from Customers, for example "companya". Note: this needs to be unique.

Now, edit one of your "regular" nodes. Add the term "companya" to the node and save it.

Set up a view to load your Customer data by creating a new view named "customerdata". The view should be Unformatted with Row style set to Node. Add Node: Title as an argument, accepting the default 404/hide view settings. Add a filter on node type and set it to Customer. Save it.

The time has come to bring it all together and display the ads. This is where your template.php hacking skills come into play. Fire up an editor and open template.php for the theme you'll be using. You need to create a function that loads the data from the Customer node and attaches it to the node before rendering, as sketched below.
You also need to make sure it gets called when the page is constructed. You might need it at the page level or the node level, depending on your setup.

template.php:
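
A sketch of what that function can look like (the theme name, the field name field_ad1, and the page-level hook are assumptions; the view is the "customerdata" view from the previous step):

function mytheme_preprocess_page(&$vars) {
  // Grab the first taxonomy term on the node being viewed...
  if (isset($vars['node']) && !empty($vars['node']->taxonomy)) {
    $term = reset($vars['node']->taxonomy);
    // ...and feed it as the argument to the customerdata view
    $view = views_get_view('customerdata');
    $view->set_arguments(array($term->name));
    $view->execute();
    if (!empty($view->result)) {
      // Load the matching Customer node and expose its ad URL(s)
      $customer = node_load($view->result[0]->nid);
      $vars['ad1'] = $customer->field_ad1[0]['value'];
    }
  }
}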


The last part is your page.tpl.php. To print your now-safer external ads, construct something along the lines of:
<?php echo '<iframe src="' . $ad1 . '"></iframe>'; ?>

The same can be repeated for any custom field. The process is the same: add it to the Customer content type, fill out the Customer node, tag your node(s) with the appropriate term, and output it in your page.tpl.php.

Upload template.php and page.tpl.php, flush your caches, and load one of your tagged nodes. You should now see the ad being displayed. If not, check the HTML source for your iframe tag or flush your caches again.

This is just a basic example and can be improved upon endlessly - a few possibilities being Rules integration, error checking, and Views caching.