Schema.org examples showing search result snippets in Google

Schema has been big news for SEO for quite a while now, yet very few people are using it. Whatever the project I’m involved in, I make sure schema forms part of the basic on-site recommendations. Why wouldn’t you take advantage of improved snippets, especially when all that is required is some simple HTML decoration?

 

This post shows a few examples of the impact of using schema for improvements in search result snippets in Google.

Example MusicRecording Schema

Music Tracks in search results


The MusicRecording schema has been used to enable the rich snippet above. The code for the MusicRecording schema was as follows:

MusicRecording Schema


 

Example Recipe Schema

hrecipe is currently the most popular mark-up method; however, the schema.org Recipe type can help dominate the SERPs.

This is how the common hrecipe microformat appears in SERPs:

Using hrecipe and appearance in SERPs


In contrast, using the Recipe schema can help you occupy more space, and the more ingredients you add, the more space the snippet takes…

SchemaRecipe helps dominate SERPs


The code used to achieve the result above is as follows:

Code for SchemaRecipe with several ingredients


It’s easy to see how schema could be used for tactical snippets.
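As a rough illustration of how little markup is involved, the Python sketch below renders a recipe with schema.org microdata. It is not the code from the screenshot above: the function name and the sample recipe are made up, and the ingredient property uses the current schema.org name `recipeIngredient`, which may differ from the property in use when the screenshot was taken.

```python
from html import escape


def recipe_microdata(name, ingredients):
    """Render a recipe as HTML annotated with schema.org Recipe microdata."""
    lines = [
        '<div itemscope itemtype="http://schema.org/Recipe">',
        f'  <h1 itemprop="name">{escape(name)}</h1>',
        "  <ul>",
    ]
    # One list item per ingredient; each one is a separate marked-up property.
    for ingredient in ingredients:
        lines.append(
            f'    <li itemprop="recipeIngredient">{escape(ingredient)}</li>'
        )
    lines += ["  </ul>", "</div>"]
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
html_snippet = recipe_microdata("Pancakes", ["flour", "eggs", "milk"])
print(html_snippet)
```

Each extra ingredient adds one more marked-up list item, which is the part of the markup that lets the snippet grow.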

Example Movie Schema

Movie Schema in SERPs


The code to create a snippet for movies as above is as follows:

Code example for the Movie Schema


The movie SERPs show the beginnings of how decorated data can be used to create readable text. ‘Directed by xxx, starring xxxx’ is the first step towards producing content created entirely automatically from an enriched data source.
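To show what 'readable text from decorated data' might look like in practice, here is a minimal Python sketch that builds such a sentence from structured movie data. How Google actually assembles the snippet is not known; the function name is hypothetical, the dictionary keys borrow the schema.org Movie property names `director` and `actor`, and the sample values are placeholders.

```python
def movie_summary(movie):
    """Build a 'Directed by X, starring Y, Z' line from structured data."""
    parts = []
    director = movie.get("director")
    actors = movie.get("actor", [])
    if director:
        parts.append(f"Directed by {director}")
    if actors:
        parts.append("starring " + ", ".join(actors))
    text = ", ".join(parts)
    # Capitalise the first letter in case the director is missing.
    return text[:1].upper() + text[1:]


# Placeholder values, for illustration only.
example = {"director": "Director Name", "actor": ["Actor One", "Actor Two"]}
print(movie_summary(example))
```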

Example of TV Series Schema

The TV Series schema offers ratings information:

TV Series in SERPs


To create the TV snippet above the following code was used:

TV Series Schema code


If they’d included the optional TVSeries schema information above, then the snippet could have been better still.

Example Event Schema

Events and news get special treatment in Google (fresh content), so if you want to get ahead in SERPs, then add a properly marked-up events page to your site.

Using vevent, the search results will look like this:

Using vevent in serps


The code for vevent to create the snippet above:

The vevent html code


Very few sites currently use the schema Event markup; here’s an example from the SES site:

Schema Event in SERPs


The code used to create the snippet above is a minimal implementation. Schema.org offers many types of richer event markup, so this entry could have included much more information.

Code to create Event Schema


 

Sites are still not using schema.org to its full potential, despite the major search engines announcing that they support this format as the standard. There is substantial scope for all sites to incorporate pages with schema in order to improve appearance in SERPs and encourage the click.

Why is this taking so long to catch on?

 

 


SEO Colour Experiment Update

The Google images experiment has been running for 4 days, and so far, just two of the ten images published in the previous post (laying out the details of the experiment) have been indexed; namely the purple image and the gray image.

Getting images indexed in Google

Getting images indexed in Google - purple preferred

Getting Google to own up to indexing the gray image requires the advanced search options; it only shows up when filtering for black and white or gray images:

Indexed images revealed using black and white filter


Indexed images revealed using gray filter


 

There are some early takeaways from the experiment so far:

  1. Google chose to index just two of the available images in the post, a purple image and a gray image. This might suggest that galleries of images are not a good idea for SEO, as some of those images might not make it into the index. This could have been caused by the naming convention I used in the test – most images had very similar filenames – so perhaps the indexing of just two of a page full of test images is to be expected ? Galleries may cause Google to make a choice about which image to index and relate to the content…
  2. Looking back at the number of results in Google for different coloured images, it is interesting to notice that there were only 55k purple results and 61k gray results… perhaps this is why those two images were chosen and indexed by Google, to boost the number of available search results for those colours (purple and gray)? As with keyword research for universal search results, picking under-competed image types could provide quick wins…

What else is interesting so far ?

Searching using the keyword ‘experiment’ and then filtering for ‘large’ images and selecting ‘purple’ images shows our test image in #1 as might have been predicted ! After all, it was a large purple image with ‘experiment’ in both the filename and alt text.

Results for search 'experiment'

Results for search 'experiment' with large/purple filters set

The other images in the search results had the following filenames:

#2 http://i259.photobucket.com/albums/hh289/LynnViehl/Experiment2.jpg

#3 http://www.eu-atp.org/wordpress/wp-content/uploads/2009/07/experimento-1024×768.jpg

#4 http://4.bp.blogspot.com/-_34x_4u4wgM/TzBW2aQ0SzI/AAAAAAAAAp4/yr-2nm5TeXI/s320/the%2Bgolden%2Blight%2Bexperiment%2Bpurple%2B%2526%2Bhay%2B%2528preview%2529.jpg

#5 http://www.tablix.org/~avian/blog/images2/2012/02/two_spectrograms_recorded_at_the_crew_munich-t.jpg

So the obvious difference is that the test image had the keyword ‘purple’ as well as the word ‘experiment’ in the filename. Of the other four images, only #4 also has both words in its filename.
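The filename comparison can be checked mechanically. The Python sketch below decodes each of the four competing URLs listed above and reports which of the two keywords appear in the filename; the function and variable names are illustrative, and the match is a simple case-insensitive substring test.

```python
from urllib.parse import unquote, urlsplit


def keywords_in_filename(url, keywords):
    """Return the keywords found in the (URL-decoded, lowercased) filename."""
    filename = unquote(urlsplit(url).path).rsplit("/", 1)[-1].lower()
    return [kw for kw in keywords if kw in filename]


# The competing image URLs for positions #2 to #5, as listed above.
competitors = [
    "http://i259.photobucket.com/albums/hh289/LynnViehl/Experiment2.jpg",
    "http://www.eu-atp.org/wordpress/wp-content/uploads/2009/07/experimento-1024×768.jpg",
    "http://4.bp.blogspot.com/-_34x_4u4wgM/TzBW2aQ0SzI/AAAAAAAAAp4/yr-2nm5TeXI/s320/the%2Bgolden%2Blight%2Bexperiment%2Bpurple%2B%2526%2Bhay%2B%2528preview%2529.jpg",
    "http://www.tablix.org/~avian/blog/images2/2012/02/two_spectrograms_recorded_at_the_crew_munich-t.jpg",
]

matches = [keywords_in_filename(u, ["experiment", "purple"]) for u in competitors]
for position, found in enumerate(matches, start=2):
    print(f"#{position}: {found}")
```

A substring test is deliberately crude: it counts ‘experimento’ as a match for ‘experiment’, which may or may not reflect how a search engine tokenises filenames.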

What about the gray image ?

It appears in position #4 in SERPs when using the same filters:

Keyword experiment and filters 'large' and 'gray'


The competitive images for this search have the following filenames:

#1 http://jn.physiology.org/content/103/6/2938/F10.medium.gif

#2 http://www.sns.ias.edu/%7Ejnb/JohnphotosHtml/images/John%20Bahcall%2C%20Homestake%20mine%2C%20Cl%20Solar%20Neutrino%20Experiment%2C%20SD%2C%20%7E%201964.jpg

#3 http://arkansasagnews.uark.edu/Eason_rdax_250x345.jpg

#4 http://www.seoeditors.com/wp-content/uploads/2012/05/gray-seo-colour-experiment.jpg

#5 https://share.sandia.gov/news/resources/releases/2006/images/mat-celina_nr.jpg

So, on closer inspection, the images at the top of the SERPs are all from authority sites… the Journal of Neurophysiology, two .edu sites and a .gov.

The test image from this blog for the keyword ‘experiment’ performs quite well given the competition here !

By searching specifically for ‘gray experiment’, the image performs better and is returned in #1:

Search specifically for gray experiment


Positions #2 and #3 discuss the YouTube ‘gray’ channel theme.

At this stage, not many more insights are possible, so I will return to this experiment in a few days’ time when more of the original set of images may have been indexed.

 

 


SEO colour experiment

Having spent a long time as a C++ image processing software engineer and SEO, I’ve always thought I had a good handle on what works best for image search. A background in the field provides insights into the possibilities for image classifiers, such as skeletonization, histograms and more complex image transformations.

So I thought I’d share an experiment to put some theories to the test, starting with colour.

As with any test in SEO, there can be no conclusion as the rules change (Google keeps updating algorithms), and many SEO factors will interfere as they cannot be isolated adequately for statistical rigour. The following test is subject to competition from images already indexed and the inter-connectivity of the web through time. So this test is pseudo-scientific. That said, I hope it still proves to be interesting…

The SEO colour experiment involves a series of coloured images.

Pink SEO experiment images

pink seo colour experiment


The current results for ‘pink seo colour experiment’ are:

current pink seo colour experiment


 

Green SEO experiment images

There are two green SEO experiment images

Green SEO experiment images


The current results for ‘green seo colour experiment’ searches are:

current green seo colour experiment


 

Yellow SEO experiment images

Yellow SEO experiment images


The current results for ‘yellow seo colour experiment’ searches are:

current yellow seo colour experiment


Purple SEO experiment images

Purple SEO experiment images


The current results for ‘purple seo colour experiment’ searches are:

purple seo colour experiment


White SEO experiment images

White SEO experiment images


The current results for ‘white seo colour experiment’ searches are:

current white seo colour experiment


Gray SEO experiment images

Gray SEO experiment images


The current results for ‘gray seo colour experiment’ searches are:

gray seo colour experiment


 

So, the scene is set. A selection of big block colour images, with a ‘gray’ version of each image colour so we can see whether the filename and other descriptors can influence the decision for the classifier, or whether the information in the image histogram overrides this deliberate misdirection.

The number of indexed images is in the order of billions, so the capacity for complex image processing may be reduced. In other words, image classifiers may still be relatively simple to avoid huge processing overheads, hitting only the filter options offered in image search.

I’ll revisit this post when search results change.

A work in progress !

Is SEO bounded by the PPC paradox ?

Data suggests PPC may harm your ability to rank…

One of many holy grails in search is striking the right balance between PPC spend and getting appropriate traffic through SEO. By SEO, in this article, I’m referring to the rather narrow definition of position in organic SERPs.

There are 3 main scenarios involving optimisation, paid search and organic rankings:

Scenario A – A small business with limited budget

A client is ranking on page one, and receiving a paltry amount of traffic for a head term because their URL is ranking beneath competitors. The top positions are way out of reach, because the client is small, new to market, or has an unknown brand. Whatever the reason, the problem for the client is that the competitors are simply gaining the click throughs because they are in top spots.

In order to compete, the client turns to PPC and pays to be seen at the top of the page for important keywords, with optimised ad copy and a well organised PPC campaign structure.

 

Scenario B – A cash-rich business in a competitive market

A client is ranking well on page one, but wants more, much more. In fact, they are bent on annihilating all competitors and are prepared to spend on PPC to achieve this aim. This happens in competitive startups where brands vie for mindshare, or when impressions are all important to boost brand awareness metrics. This scenario is generally a declaration of war amongst marketeers, with PPC adverts littered with brand terms and brand references.

 

Scenario C – Most medium-sized businesses with in-house SEM teams

Even established campaigns need optimisation (!)

Professionals in search are target driven, whether the target is to increase ROI, leads or conversions, or to support marketing efforts in a timely fashion. The overriding objective is often to control budgets and minimise waste whilst optimising PPC campaigns automatically through APIs, through tools such as Kenshoo or Marin, or manually via AdWords Editor. The process is likely to involve a pivot table or two. CTR for PPC campaigns is undeniably a very important metric. The best PPC managers will also pay attention to natural rankings for keywords used in PPC campaigns. The noble objective is to marry organic ranking data with that of PPC bids/positions and achieve the targets set.

When budgets are cut, a common tactic is to look at the keywords capable of ranking organically in 1st place, and then trim the PPC budget accordingly – the logical question asked is “Why waste money on PPC, when we are getting 40+% of the traffic from being in 1st place organically?”

Conversely, if a competitor is ranking in 1st place organically, pushing your business into 2nd or 3rd place, then there is the fear that turning off top-placed PPC slots will be akin to “handing your traffic to the competition on a plate”. That’s the hold PPC can have on your business and this is where achieving a balance between SEO and PPC becomes paradoxical…

Let’s step out of the PPC world for a moment and consider organic search.

Most people in search would agree that there are at least 200 different factors involved in determining where a URL ranks in search results. Hold the front page! Furthermore, as time passes it is a safe bet that the algorithms competing and collaborating to determine the mix of results in the final ‘sort order’ become ever more sophisticated. Nothing too contentious there either I hope!

Amongst the myriad of ranking factors, it is quite likely that CTR is included in the decision taken. For SEO, Google makes CTR available through GA and GWMT and it can provide useful information when addressing poor calls to action in search results.

SEOs have noticed for some time that new URLs are often ‘tested’ in search results – perhaps to determine an initial CTR score?

See: http://www.coconutheadphones.com/does-google-use-click-through-rate-as-an-organic-ranking-factor-answer-maybe/

See: http://www.stonetemple.com/search-algorithms-and-bing-webmaster-tools-with-duane-forrester/

The underlying principle here is that a URL attracting all the clicks could easily be the most useful page in the search results, especially if visitors hang around on the site post-click.

So, it makes logical sense that a URL attracting all the clicks should be promoted in the search results. After all, it is likely to be the best result available.

Conversely, a URL with abysmal CTR is unlikely to be kept in 1st place for very long… this seems like a good way to filter out bad sites – URLs in high ranking positions with poor calls to action, poor Ux, or badly laid out snippets will attract fewer clicks and over time drop in search results.

My question is: Do you think CTR may be a contributory factor in the calculations to determine the future position of a URL in the SERP?

My answer is that there may be a time-weighted relationship between CTR and rank, i.e. a time-lagged causal relationship rather than just a correlation.

Part of the work of an SEO is to optimise snippets to attract clicks (the analogous activity to improving PPC ad copy). One can monitor improvements in the GWMT (With Change) statistics, which show how the new improved snippet has changed the CTR for the better. Over a period of weeks or months, the URLs with improved CTR often move into higher positions in SERPs. Improving CTRs can help towards an improved rank.

The diagram below shows a typical screen layout following a search on Google. The blocks represent positions of PPC and organic search results that could be representative of any of our PPC scenarios.

In this sample layout, your client’s PPC ad is dominating the page in position 1, forcing the PPC advert from Competitor B into position 2. The aim of aggressive PPC bidding may be to counteract the competitor’s strong organic position #2, in an attempt to attract their clicks.

PPC SEO paradox

PPC used to compensate for poor SEO

The paradox here is that if CTR in organic results is used to determine rank over time, then by bidding through PPC you are harming your organic CTR.

The clicks for your brand will go to your PPC advert at the top of the page, starving your organic result of clicks.

A visitor looking for your company or brand will see the PPC advert first, and the organic result in third place will attract fewer clicks than it deserves. At least some clicks will be lost to the PPC advert at least some of the time.

What makes this worse is that the resulting lower CTR for the organic result can be exaggerated by the very presence of the PPC advert, whether the ad copy is good or bad…

If a visitor sees the PPC advert for your site at the top of the page but chooses not to click on it, then they may be less inclined to click on your organic result too. The decision ‘not to click’ on the PPC ad could be taken when the ad copy is poor or irrelevant, or when they’ve seen the advert before and it didn’t give them the landing page they wanted.

Whatever the reason, the PPC advert that you place in #1 must be good else you risk ensuring that the click goes to your competitor.

…and…

the better your PPC advert, the fewer clicks your organic result receives

So, running PPC is not a simple decision. A bad advert can turn people away from your brand, and a good advert will also steal clicks away from your organic result, increasing costs.

If, as is logical, CTR is used in the calculation of position of your URL in the organic SERPs, then by reducing your CTR as a consequence of running PPC campaigns (optimised or otherwise!), you are more likely to drop in search results than to improve.

Therein lies the paradox of SEO and PPC.

Time to test this theory and show some interesting results !

Introduction to test methodology

First, pick a single, highly competitive head term, for which the site ranks sustainably in the top 4 and for which there is a history of considerable PPC spend.

A term attracting many thousands of clicks per day through PPC was chosen (typically over 3k visits per day), making it an ideal test candidate for a CTR impact study.

Test Stage 1: Does Turning PPC off improve CTR and rank ?

For 2 days at the end of March, we turned PPC for this high traffic term off in all our major campaigns. The impact on PPC traffic, for this term alone, can be seen below.

Turning off PPC to test impact on organic CTR

PPC turned off for 2 days on 30th March to test impact on CTR

Let’s compare the PPC traffic graph above to the CTR and Average Position data for this same head term below. A discussion of the impact that turning off PPC has had on organic CTR follows…

Improved organic CTR led to rank change

Improved organic CTR led to rank change the following day

  • When PPC was running, our head term averaged 3rd place organically, with a relatively poor CTR (for third place) of just 5%.
  • When PPC was turned off on the 30th March, CTR improved instantly to 9.23% (the term remained in 3rd place)
  • The following day (31st March), PPC campaigns were still switched off. The CTR for this term increased to over 15% and average position had improved to 2.5.
  • The really interesting day came next, because the average position improved to 1.3. The organic CTR was just 12% because PPC had been turned back on.

If PPC had remained off, then the organic CTR in first place would have been higher.
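To make the lag explicit, the reported figures can be lined up so that each day’s change in CTR sits next to the following day’s change in position. This is only a sketch in Python: the four rows are the values quoted in the bullets above, ‘over 15%’ is entered as 15.0, the last row is labelled 1 April on the assumption that it is the day after 31st March, and the function name is hypothetical.

```python
# (label, organic CTR in %, average organic position), taken from the bullets.
# "Over 15%" is recorded as 15.0, so treat that value as a lower bound.
observations = [
    ("PPC on (baseline)", 5.0, 3.0),
    ("30 March, PPC off", 9.23, 3.0),
    ("31 March, PPC off", 15.0, 2.5),
    ("1 April, PPC back on", 12.0, 1.3),  # date assumed: day after 31 March
]


def lagged_changes(rows):
    """Pair each day's CTR change with the NEXT day's change in position."""
    pairs = []
    for previous, current, following in zip(rows, rows[1:], rows[2:]):
        ctr_change = current[1] - previous[1]
        # A negative position change means the URL moved up the page.
        position_change = following[2] - current[2]
        pairs.append((current[0], round(ctr_change, 2), round(position_change, 2)))
    return pairs


lagged = lagged_changes(observations)
for label, ctr_change, position_change in lagged:
    print(f"{label}: CTR {ctr_change:+.2f} pts, next-day position {position_change:+.1f}")
```

In both pairs a rise in CTR is followed by an improvement in position the next day, which is consistent with the time-lagged idea, but two data points cannot establish a causal link.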

 

Stage 2: Turning PPC back on

By the 2nd of April, PPC was once again running at near previous levels (2.5k visitors a day) for the head term.

Once again, the average position for the head term dropped back to 3.0 and the reported organic CTR had dropped back to 5%.

Our new aim was to determine if there was a ‘sweet spot’ for PPC alongside SEO, so we decided to aim to get the organic CTR as close to 9% as possible. This was the level at which organic rank seemed to improve…

Stage 3: Optimising organic CTR by dialing in PPC

The theory was that a CTR of 9% had been sufficient to improve the average position from third to first at the end of March within a day… (stage 1).

In order to test this idea, we estimated that by reducing PPC spend by half and waiting for CTR results to come through (as shown above), we might be able to improve organic rank once again.

This test started on 4th April and it took until Monday 9th April before we saw improvement in average position. By Tuesday 10th April (one week later), the chosen head term had an average position of 1.2.

Reducing PPC to help improve organic CTR

One week later, aiming to support an organic CTR of 9% in third place, average position improved. By the next day, the head term was in first place.

Stage 4: Perturbation testing

Now that the head term is in first place organically and has a reported CTR of 25%, the next step is to increase PPC once more, to see if there is a point at which we lose this organic position and uncover correlations to reported organic CTR.

You can see in the graphs above, that on 13th April, PPC traffic has been increasing and the organic CTR has already dropped back to 12%.

At the time of writing, the head term is still in first place organically and our traffic from organic search looks like this:

Traffic from organic search during CTR test

Optimising organic search traffic by dialing down parasitic PPC - a test in progress

There is some cannibalisation of search traffic by PPC once more, yet for the time being the head term remains in first place…

In summary

We continue to test running PPC at various levels against top organic results. Early results suggest that CTR may lead rank change, so there are two important points to consider:

  • you could be stealing clicks away from your organic result – harming the ability of a head term to climb in rank organically
  • with bad PPC copy, you could even be increasing CTR for your biggest competitor, boosting their ability to rank well organically…

Going on a date with Google – Dublin Core and More

It’s all about the first date… and this post covers some of the devious lengths Google will go to to get it.

Why does Google want to get a date out of you?

Time is generally accepted as an essential ingredient in the calculations of relevancy in SERPs. Some events will be promoted in SERPs because they are news; others, such as recurring events, may well appear in results at appropriate times. Updates and edits to site content can help revive an older story. All these examples are dependent upon the date attributed to the article, the origin and the changes made. The minty freshness update back in November 2011 had a noticeable impact on around 10% of searches.

To try to benefit from these changes, many site owners looked for ways to touch up older pages and content to make it at least look more current. It is the equivalent of software folks in the Unix world reaching for the ‘touch’ command to change the ‘last modified dates’ on all their files to ‘get things noticed’.

 

But what is Google looking for?

With the drive to discover fresh content and attempts to improve the understanding of timelines, Google will try to ‘datestamp’ everything it finds… and it can make mistakes when generating a datestamp for your content that impacts your relevancy to search, especially if ‘date order’ is selected as an advanced search option.

 

Unique IDs as Dates

Many e-Com sites use unique identifiers in page titles in order to reduce duplications and create unique, informative pages. Readers are likely to be familiar with GWMT diagnostics section on HTML suggestions:

Non Informative Title Tags and duplications

 

In order to reduce the number of duplicated page titles, a typical workaround would be to use the following format for product details pages:

{brand} {product} {key feature} – {unique identifier}

The purpose of the unique ID is to ensure that varieties of the same product (e.g. sizes, colours) each have a unique page title. [You may have noticed GWMT also reports 'noninformative' title tags... strongly suggesting that the title remains a key source of information for Google's indexing algorithms]

For the average SEO, making page titles unique is just the start. Introducing an ID can guarantee uniqueness and introduce useful information (win-win!), but it can have unintended consequences (oops!).

The purpose of this post is to cover just one of these unintended consequences – misinterpreting a product ID for a datestamp.

The search result below shows an example of Google misinterpreting a product ID and using it as a datestamp in results:

Product ID used as date stamp

The unique identifier 11171977 has been interpreted as a date. Now this listing carries the datestamp 17 Nov 1977, which has been taken from the product ID (11-17-1977).

That’s inconvenient if you’re trying to sell this house: it looks like it’s been on the market since 1977!

Here’s another example (pink camera)- this time it’s the model number that is being treated as a date:

Pink Camera product ID as date stamp

The DSC-T70 8.1 in the title is being treated as if it means 1st August 1970!
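One way to audit IDs before they go into page titles is to check whether an 8-digit number can be read as a calendar date. The Python sketch below tries three common digit orders; how Google actually detects dates is not known, so the list of formats and the function name are assumptions made for illustration.

```python
from datetime import datetime

# Digit orders to try: month-day-year, day-month-year, year-month-day.
# These are assumptions, not a description of Google's parser.
FORMATS = ("%m%d%Y", "%d%m%Y", "%Y%m%d")


def date_readings(token):
    """Return every ISO date that an 8-digit token could be read as."""
    if len(token) != 8 or not token.isdigit():
        return []
    readings = set()
    for fmt in FORMATS:
        try:
            readings.add(datetime.strptime(token, fmt).date().isoformat())
        except ValueError:
            # Not a valid calendar date in this digit order.
            continue
    return sorted(readings)


print(date_readings("11171977"))
```

Any ID that returns a non-empty list is a candidate for being mistaken for a date, and the house listing ID from the first example above is one such case.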

Dates in Content

When an article is posted on a blog, there is often a time-stamp associated with the post and prominently displayed at the top of the page. Google finds this on the page and the association is made, in this case, quite sensibly assuming your server time is correct and you aren’t manually fiddling the dates.

Less obviously, if you have a table or other date source within your content, there is a chance Google will take that date to be an appropriate datestamp. For instance, if you reference the last time a certain event happened, but fail to state the current date, then the entire article could be catalogued as the time of the referenced event rather than the current one.

Be careful to check your content for date references and help Google find an appropriate date for your content.

If you see a strange datestamp in SERPs, then look at these on-page elements for clues as to the cause of the mix-up:

a) Footer for software version numbers

b) Tables of reference data

c) References to dates within text (ensure the first date referenced is the date you’d like to have associated with the post)

d) Telephone numbers

 

Avoiding a bad first date:

Here are a few ways you can help make your first date with search engines more memorable:

a) Place the preferred date between the article’s title and the article’s text in a separate line of HTML, and remove all other date references from the body of the article to avoid confusion.

b) Clearly mark up the datestamp you want associated with the article using the Dublin Core meta tag:

<code><meta name="DC.date.issued" content="YYYY-MM-DD"></code>

or more completely:

<code><link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DCTERMS.created" content="YYYY-MM-DD" />
<meta name="DCTERMS.modified" content="YYYY-MM-DD" />
</code>

c) Use the <news:publication_date> tag inside news sitemap XML files

d) If using OpenGraph, then add the article:published_time, article:modified_time tags

e) Use microformats and schema mark-up to ensure telephone numbers are parsed correctly (see Schema.org for these definitions), i.e. not treated as dates.

f) Modify the way product IDs are displayed. Introduce characters or spaces to break up the ID so it becomes less ‘date-like’, yet retains uniqueness.
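Suggestion (f) can be sketched in a few lines of Python. The grouping size, separator and ‘REF’ prefix are arbitrary choices made for illustration, and there is no guarantee that any particular format will stop a search engine reading the number as a date; the point is only that the original digits survive, so uniqueness is retained.

```python
def break_up_id(product_id, group=3, prefix="REF", sep="-"):
    """Split an ID into short groups behind a letter prefix."""
    chunks = [product_id[i:i + group] for i in range(0, len(product_id), group)]
    return sep.join([prefix] + chunks)


def digits_only(text):
    """Recover the original digits, to confirm nothing was lost."""
    return "".join(ch for ch in text if ch.isdigit())


display_id = break_up_id("11171977")
print(display_id)
```

Groups of three are used here rather than four because a split such as 1117-1977 could still look like a pair of years.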

If erroneous Google datestamps have caused you problems with search relevancy, please let me know !

 


301 redirect URLs reappearing in Google – Phantom menace

Over the last few months, phantom 301 URLs have been showing up in Google search results.

What is a phantom 301 URL?

These are old URLs that have been previously 301d and dropped from the Google index, but which are now showing up again. (Read: 301 redirected showing up again after 1 year)

Example 1:

PropertyFinder was bought by Zoopla and 301 redirected correctly with a link map that matched appropriate pages to each other across domains.

Until recently, the PropertyFinder website would not show up for the search, as the site had been ‘de-indexed’ in Google. However, looking at the screenshot below (Jan 2012), it is clear that PropertyFinder now has a Phantom URL in first place in Google search results:

Phantom 301 URL

 

Example 2:

Dothomes.co.uk was bought by the DPG Group and redirected to FindaProperty.com

Again, as with PropertyFinder, the DotHomes website would not show up in results until recently. Performing a site: search for this old domain suggests Google has acknowledged the permanent redirect:

Google acknowledges the 301 redirect

However, performing a direct search for the old domain (dothomes.co.uk) shows a mixed-up result that:

  • displays the old 301d URL (dothomes.co.uk/)
  • uses the old 301d URL as destination for the current FindaProperty.com title
  • shows the current FindaProperty.com description and sitelinks

DotHomes shows in search results

 

Phantom 301s have been appearing more frequently recently (since Dec. 2011), and understanding the full implications of this change in Google search results is important if:

1. You have a website founded on a hotchpotch of old websites that have been 301′d

2. You are considering a redirection project, such as conversion to friendly URLs from unfriendly, parameter based URLs

 

What is the impact of a phantom 301?

Websites that have been bought out, or taken over and had their pages correctly 301 redirected to new sites several years ago are most affected by this change.

The historic 301d URLs from these old sites are re-appearing in search results, taking up positions they used to occupy before the 301 was put in place. The impact is at least threefold:

  • There is a new duplicate content risk for the page redirected to (as the redirection has ‘failed’)
  • The search results show an old entry, with a less optimised call to action (historic titles, URL & snippet)
  • Equity may not be passed to the destination URL (as the redirection has ‘failed’)

Before you say, ‘I know how 301s work, and what you are saying simply isn’t true’, read on!

How 301s used to work in the ideal world

What used to happen is that Google would crawl the old URL, receive the 301 header response and new location, then pass ‘some’ link equity to the new URL, index it and de-index the old URL over a period of a few days.
The upshot is that your new URL would rank immediately, and the old URL would be gone forever…
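Before blaming Google, it is worth confirming that an old URL still answers with a 301 and the expected Location header. The Python sketch below does this with the standard library, which does not follow redirects on its own; the function names and the example.com addresses are placeholders, and the check assumes the server returns an absolute URL in the Location header.

```python
import http.client
from urllib.parse import urlsplit


def head(url):
    """Send a HEAD request and return (status code, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        connection.request("HEAD", target)
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def is_permanent_redirect(old_url, expected_target, fetch=head):
    """True if old_url answers 301 and points at expected_target."""
    status, location = fetch(old_url)
    return status == 301 and location == expected_target


# Usage against a live site (placeholder addresses):
# is_permanent_redirect("http://old.example.com/", "http://new.example.com/")
```

The `fetch` parameter is there so the check can be run against recorded responses as well as live requests.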

What’s changed ? Is this Panda related ?

What has started to happen is that these old URLs have popped back into SERPs.
There have been no technical changes: these old URLs are still 301′d (no changes made for many months / years in some cases), but they are being presented in SERPs carrying the old site name and titles, showing the old URLs and often having much lower quality CTAs.

So the smoking gun points to Google tweaking the algorithm…

It is as if 301 redirects are no longer guaranteed to remove a URL from the index.

If Google believes the old URL is the best result for the search, then it will show it.

Digression follows:

My personal hunch is that Panda is strongly related to CTR testing and bounce rates. Sites that have suffered under Panda have tended to have the sort of content you’d ignore (packed with adverts, poor layout, poor spelling etc.), or to be the URL in the search results that you’d usually skip over because the last time you clicked through to that site it was full of articles based entirely on spun content.

Panda nailed those sites: they were the sites people didn’t like to click on, or stay on.

The phantom 301 URLs could be the latest extension of presenting and testing the URLs that people prefer to click. They are back in the popularity contest, despite the permanent redirect. An old URL may have had the best CTR. The only way to find out is to re-present the old URL in the search results and see how it fares. If it receives a lot of clicks, then the phantom menace may be there to stay, whether it is 301d or not.

This change may leave Google open to abuse and may change the way we work with old domains

Here are some ideas:

  • Create a lot of duplicate pages with poor titles and CTAs, then 301 them to competitor URLs in the hope that some of the ‘fresh’ pages create phantoms and devalue the competitor’s original URL.
  • Occupy more search positions by creating multiple URLs within a site (such as changing to friendly URLs), to allow ranking of the old unfriendly URL that has been 301d alongside the new friendly URL. Continue to add new URLs and 301 to old URLs to create a 301 farm. Google may show some of the new URLs alongside their duplicates for CTR testing.
  • Link build to old domains (if they were popular) that currently 301 to your site. These old domains are often recalled strongly and continue to be entered in search results, even when a company’s name has changed. For example, even though Santander bought Abbey, searching for ‘Abbey National’ still has a lot of suggestions, and shows phantom 301 results in the SERPs:
Abbey National Phantom search result

If you have any experience of this phantom 301 phenomenon and how it could be used to advantage, please get in touch.


fb_xd_fragment added to URLs with IE 7.0

Seeing URLs with fb_xd_fragment= reported in Google Analytics may not mean that your SEO is suffering, so don’t panic!

Your alarm might rise further when you see that the presence of the fb_xd_fragment= parameter causes the entire page content to become hidden whilst the page is loading. Still, don’t panic – investigate!

There are three main situations when panic is justifiable:

1. Google has indexed pages from your site with the additional &fb_xd_fragment= parameter. (Hint: use the site: and inurl: operators to check, e.g. site:yourdomain.com inurl:fb_xd_fragment)

2. Users are reporting that they are seeing ‘blank’ pages on your site (e.g. http://www.letsrun.com/forum/flat_read.php?thread=4349142). If you have come to this page looking for a solution as a user of a site, try deleting the ‘&fb_xd_fragment=’ from the end of the URL and reloading the page. If that lets you see the page, contact the webmaster and direct them to this post so they can implement a fix.

3. Since the confirmation that Google is indexing Facebook comments (cf. Matt Cutts http://www.searchenginejournal.com/google-indexing-facebook-comments/35594/), the indexation of bad URLs from your site may grow as a consequence. Google may read the javascript on your site and generate these URLs to find additional pages to crawl.

Why are URLs including &fb_xd_fragment= being visited in the first place?

The standard implementation of Facebook Javascript SDK for ‘Like’ buttons has been cited as the cause of fb_xd_fragment being appended to URLs for users of “certain browsers”.

It appears that the Facebook Javascript SDK causes these ‘phantom’ visits when a real visitor clicks on a ‘Like’ button (and overwhelmingly, the visits come from visitors using IE 7.0). It is the combination of the XFBML version of the Facebook plugins and IE 7 that gives rise to these URLs being generated by javascript and then reported in GA.

Now for a run through of the solutions to the fb_xd_fragment bug…

Here’s a recipe for various solutions depending upon the scale of your problem:

fb_xd_fragment solution A:

Use a custom channelURL file (channel.html) stored at root level. It is passed to the FB.init() call by setting the channelUrl parameter value when FB initialises. The contents of channel.html are simply:

<script src="//connect.facebook.net/en_GB/all.js"></script>

The file above must be cached for as long as possible to provide a smooth user experience and avoid time delays.

This file can fix cross-domain communication issues (see http://developers.facebook.com/docs/reference/javascript/) – Facebook dev resources refer to this issue with tongue in cheek: “The channel file addresses some issues with cross domain communication in certain browsers.”

Finally, add the following code to your pages (on detection of IE 7.0):

<div id="fb-root"></div>
<script>
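// Called by the SDK once it has loaded; channelUrl points at the custom channel file.
window.fbAsyncInit = function() {
   FB.init({appId  : 'MY APP ID',  status : true,  cookie : true, xfbml  : true,
            channelUrl  : 'http://www.yourdomain.com/channel.html'});
};

// Load the SDK asynchronously so it does not block page rendering.
(function() {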
var e = document.createElement('script');
e.src = document.location.protocol + '//connect.facebook.net/en_GB/all.js';
e.async = true;
document.getElementById('fb-root').appendChild(e);
}());
</script>

 

fb_xd_fragment solution B:

Use javascript to undo the nasty side effects of the fb_xd_fragment= parameter and make the hidden html visible once again. Use onLoad() to run the following script. NB. As with all scripts, this solution may not work every time, as it is essentially a sticking plaster to undo what has already been done.

<!-- Correct fb_xd_fragment Bug Start -->
<script>
document.getElementsByTagName('html')[0].style.display='block';
</script>
<!-- Correct fb_xd_fragment Bug End -->

fb_xd_fragment solution C:

For those of you that would rather not use a custom FB.init and channelUrl, or add script to undo this ‘blanking’ effect, then the cheaper option is to use a redirect. Use 301 redirection to ensure that any URL requested with the fb_xd_fragment is redirected to the same URL only without this fragment.
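
The redirect logic for this option can be sketched as a small function that works out where a request should be 301′d to. This is an illustration only: the function name is hypothetical, www.yourdomain.com is a placeholder, and wiring the result into your server’s redirect mechanism is left to whatever platform you run.

```javascript
// Hypothetical helper: returns the URL to 301 to, or null when the
// request does not carry the fb_xd_fragment parameter.
function redirectTargetWithoutFbParam(requestUrl) {
  const url = new URL(requestUrl);
  if (!url.searchParams.has('fb_xd_fragment')) return null;
  // Remove only the offending parameter; other parameters are kept.
  url.searchParams.delete('fb_xd_fragment');
  return url.href;
}

// Example: the thread parameter survives, the Facebook parameter does not.
const target = redirectTargetWithoutFbParam(
  'http://www.yourdomain.com/forum/flat_read.php?thread=4349142&fb_xd_fragment=');
console.log(target);
```

Returning null for clean URLs matters: redirecting a URL to itself would create a redirect loop.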

 

fb_xd_fragment solution D:

Taking another step back from addressing the cause, you could try to prevent indexing of these pages by using the rel canonical link. Ensure that rel canonical is defined without the fb_xd_fragment on all pages that have Like buttons (use Browser/User Agent detection to set the canonical to keep page weight down, if preferred).

fb_xd_fragment solution E:

Use the Webmaster Tools Site Configuration / URL Parameter settings to set the fb_xd_fragment parameter to ‘ignore’. This informs Google that URLs with this parameter are not important, and should not be indexed.

fb_xd_fragment solution F:

Fix (or more accurately, ‘hide’) the reporting. To keep these ‘phantom’ additional page views out of GA, add a filter in Google Analytics to strip all pages that include the fb_xd_fragment. Ensure that you have identified and fixed (if necessary) the cause before filtering these pages from GA! If you have implemented Solution A, then you don’t need to use a filter. The problem of reporting the ‘phantom’ pages should go away with Solution A!

 

As with most SEO work, pick the solution appropriate to your needs.

Footnote:

If you see visitors appearing to come from a Google cache with this parameter, it is likely that these visitors use iGoogle and they are following a cached link from their dashboard. Look at the rlz parameter in the cache string for G1 to identify the source as iGoogle:

iGoogle cache with Internet Explorer 7 rlz parameter

 


Browser page prefetching causing high bounce rates

Monitoring pages with high bounce rates and deciding what to do about them should be a standard part of a professional SEO’s day-to-day activity. Bounce rates in themselves don’t give the entire story, however; it is nearly always necessary to dig a little deeper before a full understanding can be reached and action taken.

Here’s an example of where investigation sheds light on unusually high bounce rates:

On analysis, one page showed unusually high bounce rates – searchresults.aspx

Over time since 20th July 2011, the number of direct visits to this particular page has been steadily increasing – see screenshot from Google Analytics below.

The danger is that analysis of visits to searchresults pages alone could show a fake growth, as well as distorting bounce rate statistics… further analysis of the cause of this increase in bounce rate and action to remove these pages from any future reporting is essential.

 

 

As you can see, Safari is the browser at the root of this particular evil.

The table above shows the majority of visits to the searchresults.aspx page are coming directly from Safari.

On analysis of the browser version, it turns out that there was a release of Safari 534 on 20th July 2011, so what we see here is a steady take-up of the new browser over time. This issue is likely to continue to get worse and already accounts for 2,000+ daily visits…

 

The Safari 534.50, 534.48.3 and 534.51.22 versions are all calling the searchresults.aspx page directly – as part of a prefetch cycle – and skewing the statistics in Google Analytics as a consequence.
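
If you have raw hit data (server logs, for instance), the same segmentation can be sketched in code. This is a rough illustration under stated assumptions: the hit objects with page and userAgent fields are hypothetical, and the check relies on the build number that follows “Safari/” in the user-agent string.

```javascript
// Extract the build token that follows "Safari/" in a user-agent string.
function safariBuild(userAgent) {
  const match = /Safari\/(\d+(?:\.\d+)*)/.exec(userAgent);
  return match ? match[1] : null;
}

// True for Safari builds in the 534 series. Chrome also sends a
// "Safari/" token, so it is excluded explicitly.
function isSafari534(userAgent) {
  if (/Chrome\//.test(userAgent)) return false;
  const build = safariBuild(userAgent);
  return build !== null && build.split('.')[0] === '534';
}

// Keep only hits on the search results page made by Safari 534 builds.
function safari534SearchHits(hits) {
  return hits.filter(h =>
    h.page.includes('searchresults.aspx') && isSafari534(h.userAgent));
}
```

Hits matched this way are candidates for exclusion from bounce rate reporting, not proof of prefetching on their own.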

Tip: Make sure you understand what is causing bounce statistics before taking decisions.

 

 

Disruptive advertising: The internet is not TV

When big brands want your attention, they are very skilled at getting it, especially on TV.

That’s mostly because TV adverts have been around for far longer than the internet and best practices have been honed over decades, and drilled into budding brand managers and media buyers.

But the world is changing. The internet is still in its infancy, yet it is re-writing the rule book when it comes to audience selection, reach and attention.

TV has, until Smart TV, followed rigid rules, making the advertiser’s job relatively straightforward. The brand demographic lends itself to a select group of TV shows and a well-bounded time of day, so the best time to show your advert is pretty much decided. All you need do is write the cheque and hope your brand awareness and sales rocket.

This all works because your audience is a sitting duck: sitting on the sofa watching their favourite TV programme and every so often being bombarded with advertisements in the ad breaks. The louder, more repetitive adverts, or those with an element of humour, are the ones that disrupt the usual attempts to ignore the ad and get through to the mind of the target audience.

Loud, repetitive and humorous are disruptive techniques that work well for TV.

Then along came the internet and in the 1990s, the equivalent of the disruptive ad was the pop-up. Typically these ads were full screen and obliterated whatever you were trying to browse in Netscape.

More pernicious sites would fire hundreds of pop-ups at the visitor, faster than the 133MHz processors and Windows 3.1 machines of the day could keep up with. The only way out was to click the ad.

This was disruptive advertising on the internet and it wasn’t long before pop-up blockers were being used to stop this annoying experience.

Today, pop-up blocking is part of any good browser as standard, so there are very few sites that attempt this form of evil advertising – except for scam sites, malware sites and porn sites in the darker recesses of the net.

Advertising on the internet is no longer disruptive; it is very different.

Most large corporates have sites that can process transactions any time of day or night. Until smart TV arrives, this makes advertising a very different beast on the net compared to the disruptive techniques used on current ‘traditional’ TV.

The TOP 5 ways advertising differs on the net:

  •  Choice

  •  Conversions

  •  Community

  •  Calls to Action

  •  Counting

Taking each in turn, let’s start with Choice.

When you are browsing the net it is an elective experience. You actively choose to go from site to site, or use Google / Bing / Yahoo to search. Advertising has to keep up with your moves, and many sites will drop cookies on your machine to record the details of your visit: the products you looked at, the keywords you searched for, how close you were to completing a purchase. This information is used to prepare the adverts you see on the next site, to entice you back to a previous site to complete a transaction or to view alternative relevant products. The key is that successful adverts on the net are not disruptive. The adverts that work are the adverts you choose to click on. The CTRs are recorded and the best adverts are used on subsequent visitors.

It is the choice on the internet that hones and tunes the adverts everyone is shown, from PPC ads to adverts shown through re-targeting. Choices made by thousands of people determine the winning advert, the one that appeals the most. This is rarely the most disruptive, often the most useful instead.

If you want your adverts to perform well, it is imperative that you provide adequate choice and let visitors decide which advert they like best. Unlike TV, the visitor is in control on the web and can choose to leave a website at any time: if they find the adverts annoying, they simply go to a competitor website. A website with appropriate adverts, by contrast, will find that visitors choose to use it in preference to one where the ads appear disruptive. Google’s Adsense program is the direct result of the importance of appropriate advertising on a website.

Google’s recent Panda updates may be penalising sites that choose to run with disruptive or inappropriate adverts. Such sites repel visitors. Google has openly declared that it is working to promote the sites that visitors like best in search results; those are the sites without off-putting advertisements.

 

The second difference is Conversions.

Conversions can be recorded and attributed to paid advertising campaigns, be they PPC, video, re-targeting or simply direct visits. This data is crucial to making effective advertisements. Unlike TV, there is a direct measurable impact of running a good advert against a bad one on the internet. So long as the attribution modeling for the various channels is appropriate, it is possible to tell exactly what works and what doesn’t when running adverts on websites.

After a TV campaign, dubious data is collected from questionnaires with leading questions about awareness and the likelihood of buying a product now you’ve seen the ad. At best, this is a vague finger in the air. Even with technology recording viewer metrics it is hard if not impossible to relate this TV activity to subsequent sales.

On the internet, conversion data is in black and white. PPC managers continuously tune campaigns to cull and grow keywords, and tweak text, video and audio creatives. They not only know whether an advert worked, but they know how to make variants of a performing advert work even harder in a matter of days. TV advertising can’t touch that for an optimisation process!

But it doesn’t end there.

The TV shows had to be decided and ad space booked months in advance. If you’ve picked the wrong show for your demographic profile it is then very hard to change it. With the internet, if an advert performs poorly on one website, it can be removed in an instant and displayed on an alternative site (still fitting the same demographic profile).

The ability to constantly evolve adverts and the sites that the ads appear on based on performance (e.g. ROI or CTR), puts the internet in a very different place to TV.

 

The third difference is Community:

It doesn’t take long for a good advertisement to be shared around the world. Have you seen the VW advert with the child in the Darth Vader outfit?

A strong advert on television has people ‘looking out for it’. They may never see it if they happen to be part of a demographic that wasn’t being targeted on TV.

A strong advert on the internet can be seen by everyone, stretching outside the target demographic and into new measurable communities in an instant. Whole new markets open up when this happens, and the data is readily available for use. Visitors to YouTube or other video sharing sites can be analysed and communities identified.

What brand wouldn’t like to know whether its assumed 80:20 male to female ABC1 demographic still holds? After all, what if the data showed that many more women were interested in the product as a result of the advert being shared?

The community aspect of the internet is all powerful, with Facebook, StumbleUpon and Twitter leading the way in relating brand sentiment.

Monitoring the effectiveness of a TV advert through social media is possible, of course, but the TV advert alone doesn’t engage with communities.

Advertising on the internet (when done properly) can engage and build advocacy that will resonate in on-line communities for years.

 

The fourth major difference between TV and the internet is the use of Calls to Action.

With TV, the options are somewhat limited. Go out and buy it, is the message.

With the internet, CTAs can be much more subtle and elaborate.

The basic message is often ‘Come and buy it’ in primitive PPC campaigns, but there are more effective messages available too.

Visitors can be asked to:

1. Enter competitions

2. Tell a friend

3. Leave a review

4. Join a newsletter

5. Leave feedback

6. Come back and get a discount

7. Vote on polls

8. Complete questionnaires

9. Rate a product review for its usefulness

10. Consider buying related products at the checkout

I’ll stop there as there are so many CTAs possible on the internet. A good customer relationship strategy is essential to order and maximise the impact of each call to action.

Compared to the internet, the Calls to Action possible on TV are embarrassingly poor.

 

 

The final difference is in Counting.

The king metric for TV is TVRs. These TV ratings hope to predict how large the audience will be for the disruptive advertisement. The TVR count is a vague attempt at counting the numbers of people exposed to the advertisement. Some celebrity or live competition TV programmes attract millions of viewers, others only a handful.

Depending upon the demographic and hence the programmes selected, the TVR count can be very different.

Disruptive advertising holds true to the linear nature of marketing and sales. The more people you can expose to the advertisement, the higher sales will be. The more you spend on advertising, the greater the sales in a linear relationship.

This linear relationship that works well in TV, doesn’t translate to the web, where people can choose not to watch the advert and skip to the next site, or return to Google to revise their search. The ability to avoid an advert by not clicking on it makes the internet a very different beast when it comes to counting and projecting sales. In simple terms, the relationship between exposure and sales seen in TV is a much less direct relationship on the internet.

The ability of a website visitor to choose not to watch an advert, not to click an advert and even to leave a website with unpleasant looking advertising breaks the linear relationship between marketing spend and sales. The linear nature is broken further by the optimisation process possible on the internet. For no additional spend, conversion rates can be doubled or tripled by an experienced PPC campaign manager. In fact, it is often a foolish pursuit to add budget to an already well-optimised campaign in the hope of gaining additional visitor and sales counts.

As with all optimisation processes, there is a sigmoid curve to follow and a sweet-spot to occupy to keep conversions up and costs down. Increasing the ‘count’ will only serve to reach the maximum spend with diminishing returns on investment.

At the start of a campaign on the internet, the adverts are poorly optimised (unless you are very lucky!). The numbers of visitors choosing to click on the advert (as well as landing pages, quality of the host site etc.) will determine the cost of the advert itself. Over time, some adverts will outperform others, and through active management the best adverts (usually the adverts that lead to conversions) are retained and the costs controlled. Increasing the budget will initially invite more clicks and lead to more conversions. Beyond a certain point, however, the additional budget will simply cause the adverts to be more prevalent and exposed to those with less intent to buy and a greater propensity to click on sites that are less suitable.

In extremis, the desire to spend a very large budget in order to increase visitor counts could see adverts being shown across inappropriate networks with clicks costing more than the optimal amount and leading to reduced conversions. Such campaigns should be scaled back until the ‘sweet spot’ in the sigmoid curve is found once again.
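
The diminishing returns on a sigmoid curve can be made concrete with a toy model. Everything here is an assumption for illustration: the logistic form, the parameter names and the numbers are hypothetical, not measurements from a real campaign.

```javascript
// Toy logistic model of conversions as a function of spend.
// maxConversions, midpoint and steepness are illustrative parameters.
function conversionsAtSpend(spend, curve) {
  const { maxConversions, midpoint, steepness } = curve;
  return maxConversions / (1 + Math.exp(-steepness * (spend - midpoint)));
}

// Extra conversions gained by adding `step` more budget at a given spend.
function marginalConversions(spend, step, curve) {
  return conversionsAtSpend(spend + step, curve) -
         conversionsAtSpend(spend, curve);
}

const curve = { maxConversions: 1000, midpoint: 5000, steepness: 0.001 };
// The same extra budget buys far fewer conversions near the top of the
// curve than it does around the midpoint.
console.log(marginalConversions(5000, 500, curve));
console.log(marginalConversions(9000, 500, curve));
```

In this model the ‘sweet spot’ sits around the steep middle of the curve; beyond it, each additional unit of budget adds progressively fewer conversions.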

It is important to understand that visitors are not simply ‘counted’ on the internet. They are monitored, optimised and constantly changing depending upon a myriad of factors. Visitors may be more likely to convert on a Monday, for example, from certain sites, and through social media on a weekend. The relationship between marketing spend and sales is not linear on the internet, and optimisations are continuous to squeeze the most ROI.

This may mean not spending all the money if conversions are low because the sun came out and everyone stopped browsing the web!

The optimal ‘count’ on the internet is constantly changing. Adverts, positioning and bids can all be optimised to keep costs low and conversions high. The concept of a ‘count’ for the internet is an epiphenomenon – it’s like asking: Exactly how many ants built that anthill?

I don’t expect ants to be predictable. Buying more ants may not have the desired effect – in fact it could affect the building of the anthill in unpredictable ways. By optimising location, soil and sunlight and providing the ingredients for an anthill, the right numbers of ants ‘will simply happen’ to build the anthill in the most effective way.

There’s no point counting the ants, when it’s getting all the building blocks together for the anthill that matters.

 

 

Brand Search Impressions Drop in Webmaster Tools

In mid August I noticed huge drops in brand term impressions showing in Webmaster Tools, but hardly any change in brand term click through volumes in Google Analytics.

 

Brand Terms Dropped in Google Webmaster Tools


What was going on?

Impressions have dropped by 66%!

My first thought was that some sort of penalty had been applied to the brands, perhaps a Panda related change meaning that the brand no longer appeared in search results. However, the number of clicks remained unchanged.

Furthermore, even though the volume of impressions had dropped by around 66% for brand terms, other organic search terms were completely unaffected.

If this was Panda, it was a ‘brand keyword selective’ Panda and that didn’t make any sense at all. From studying other sites, the effect of the Panda update has been seen to help or support known brands with very few exceptions. And any Panda related penalty would have dropped impressions for all keywords, not just brand terms.

Digging deeper…the date of this recorded impression drop was 16th / 17th August 2011.

This coincides precisely with the date that the enlarged sitelinks were released in the UK.

Before the expanded sitelinks were introduced, a typical brand search may return up to 7 entries in organic search results in Google. [Typical brands results showed up to 4 URL entries, however on occasion some dominant brands were able to claim 7 spots in the 1st page]

Since 17th August, a brand search gets just ONE slot in SERPs with MANY sitelinks beneath (up to 8 or 12 for dominant brands). The single entry is counted as an impression in GWT, and the individual sitelinks beneath are not included in the impression count. The total number of impressions recorded for a brand term search is therefore much reduced.

A drop of 50% means that the brand previously had two entries in the SERPs, where now it has just one.

A drop of 75% would mean that where the brand previously claimed four entries in SERPs, it now just has one.

So for a typical brand search, I’d expect GWT to be reporting drops of 50 – 86% for brand term impressions, but similar levels of click throughs and a higher reported average CTR.
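
The arithmetic behind these figures can be sketched as follows. It assumes, as described above, that a brand search now records one impression where it previously recorded one per organic entry, and that clicks are unchanged; the function names are hypothetical.

```javascript
// Fractional drop in impressions when n organic entries collapse into
// a single entry with sitelinks: 1 - 1/n.
function expectedImpressionDrop(entriesBefore) {
  if (!Number.isInteger(entriesBefore) || entriesBefore < 1) {
    throw new RangeError('entriesBefore must be a positive integer');
  }
  return 1 - 1 / entriesBefore;
}

// With clicks unchanged, reported CTR rises by the same factor n.
function ctrMultiplier(entriesBefore) {
  return entriesBefore;
}

for (const n of [2, 4, 7]) {
  const pct = Math.round(expectedImpressionDrop(n) * 100);
  console.log(n + ' entries before: ' + pct + '% drop, CTR x' + ctrMultiplier(n));
}
```

Two, four and seven entries give drops of 50%, 75% and roughly 86%. By the same reasoning, a drop of around two thirds would be consistent with a brand that previously held three entries.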

TIP:

Exclude ‘brand’ terms when examining impression data spanning the date of this change in GWT. Brand search usually accounts for a significant % of traffic, so changes to sitelinks will skew your analysis. This is especially true if you are looking for GWT impression uplifts following SEO efforts… don’t let the drop in brand term impressions caused by Google’s sitelinks change spoil your analysis!