Episode 131

WordPress SEO Deconstructed by the Master with Joost de Valk

As a listener of this podcast, there’s a good chance that you know enough about SEO to realize that keyword density concern is a thing of the past (within reason, anyway). What may be less clear is what you should be worrying about instead. It’s easy to say that the key to good SEO is quality content. But what exactly counts as quality content? How do you know whether your content is good enough to optimize your site?

Today’s guest, Joost de Valk, is here to answer these questions. He’s the creator of the Yoast SEO plugin, which is currently active on over 8.5 million WordPress websites. We’ll talk about how his plugin helps you assess the quality of your content, and also take a deep dive into WordPress SEO in general.

In this Episode

  • [01:30] – Joost starts things off by talking about the best ways to maximise Yoast’s potential for SEO. He then talks more about readability and what it means.
  • [04:00] – What are examples of some of the actionable feedback that Joost has been talking about?
  • [07:21] – As algorithms dive into our copy more and more, they have a hard time digesting text that is hard to read. This means that easy-to-read copy will generally rank better.
  • [08:26] – Joost discusses keyword density, which isn’t relevant except in edge cases. He and Stephan then dive into the subject of keyword prominence.
  • [10:44] – Are there red herrings that listeners should look out for, or false information that should be corrected?
  • [14:09] – Stephan suggests a feature that identifies how many clicks away each page (or post) is from an external link source.
  • [16:45] – We move into other types of settings in Yoast that optimize SEO.
  • [20:50] – Joost points out that Yoast tries to go with decisions, not options.
  • [22:19] – We hear more about tags on posts in WordPress, with Joost pointing out that you should never have more tags than posts. He and Stephan then talk about using tags and categories.
  • [27:05] – Stephan clarifies another important point about no-indexing content, and then gives an example of a bad site map.
  • [29:40] – What does Joost think of the spider bites page that Stephan has just been talking about?
  • [31:20] – Joost discusses whether he sees a benefit of creating an HTML sitemap page on a typical website.
  • [34:00] – Stephan’s extreme preference is not to include the dates in the URL. Joost agrees that it’s a bad idea.
  • [36:56] – Does Joost have any particular accessibility features that may not be important for SEO but that he highly recommends?
  • [38:50] – Joost talks about some of the capabilities of his plugin in terms of JSON-LD, Schema, and so on.
  • [41:21] – We hear Joost’s thoughts on whether his video plugin or local plugin are must-haves for website owners.
  • [43:12] – Stephan moves onto talking about RSS optimization, which he explains is important for podcasts.
  • [45:10] – What is Joost’s take on scrapers and pursuing copyright infringement issues?
  • [46:24] – Pagination is stupid and should never be done, Joost says.
  • [47:16] – How can people reach out to Joost to work with him?


Welcome to yet another super geeky SEO episode. This is episode number 131. We’re gonna geek out specifically about WordPress SEO. Who better to talk about SEO than Joost de Valk? He is the creator of the Yoast SEO plugin, which is currently active on over 8.5 million WordPress websites. It is the SEO plugin for WordPress. Joost is a 36-year-old web developer, SEO, and online marketer. He’s also the founder of css3.info, the biggest CSS3 resource on the web. He founded it in 2006 and sold it in 2009. Joost, it’s great to have you on the show.

Thank you, thanks for having me.

Yeah. Let’s talk about the WordPress plugin first of all because it is, after all, running on 8.5 million WordPress websites. It’s a pretty important piece of software for a lot of people. Let’s talk about the best ways to utilize it to maximize its potential. What would be some of the settings and some of the configurations that would maximize the SEO when you have it installed?

The first and foremost thing that you have to do is use it. When I say use it, I mean that when you have it installed, it gives feedback on your writing and it gives some SEO tips. If you’re an experienced SEO, you might not necessarily need the SEO side of it, but the readability side of it actually makes almost everyone I’ve met a better writer because it gives tips on how to improve readability, how to use passive voice less, how to use transition words, and all these things. What we find more and more is that the texts on the sites that we work with ourselves, and that we optimize the readability for, actually tend to rank better as well because they are just more easily digestible, both by people and by search engines.

Let’s talk a bit more about readability and what that means. There’s this score that you’re using, there are other scores out there.

There are many different scores, and I don’t necessarily stick to one score. In English we have the Flesch Reading Ease score in our plugin, which we mostly have because NASA at one point requested me to add it. We actually have a lot more in terms of actionable feedback. The problem with a score like that is that the score itself doesn’t really say what you have to improve.

Right, right. It’s Flesch-Kincaid.

The Flesch-Kincaid Reading Ease test, and there are several variations of that in English. There are several translated versions of that for some other languages. In Europe, we have some other grading systems for texts. We try to do the ones that are most common in the languages we support, which is a growing set.

How many are there?

Right now we do eight. We’re currently working on Russian, which is a new challenge in many ways because it’s another alphabet, and we’re working on Japanese, which is even harder because it doesn’t have spaces. Text recognition becomes a different beast. But we have four language scientists here who work almost full time on adding more languages and making more of those tests, looking at how we can further improve people’s writing and how we can give them actionable feedback that would actually make their texts easier to understand.

Can you give some of that actionable feedback like make the paragraph shorter, or sentences shorter, or use more–

Yeah. Making sentences shorter and paragraphs shorter is not necessarily always a good thing, although too long is a bad thing. But what we worry more about is sentence length and passive voice. Especially in the English language, passive voice is very easy to use, but it makes texts pretty hard to digest, especially if you’re not a native English speaker. It’s funny because of course I’m not American or English, I’m Dutch. English is not my native language, nor is it anywhere close. Even for me, I find that when people have actually taken the time to work on their texts a bit more, those texts are a lot easier to understand than the scientific texts with very long sentences and lots of passive voice that sometimes occur, or even worse, the traditional writing that can happen when people are writing about a law or things like that. They tend to make very long sentences with lots of add-on sentences and all these things, and that makes it harder for people to read.

Yeah. Something that’s kind of related to this is the use of related keywords or other words in the same topic space, which shows that it’s a comprehensive article instead of a surface-level one. Let’s say you’re writing an article about lawn mowers and you don’t mention lawns or lawn care or landscaping or summer or anything like that anywhere in the article; then that looks pretty surface level. Brian Dean calls these LSI keywords. Personally, I don’t think LSI is a thing with Google these days.

There are a lot of different terms around that: there’s LSI, there’s TF-IDF, which is a lot more prominent in Europe. There are a lot of ways to look at that. I just think that people have to focus on writing good copy. If you do that, then all of that will come by itself. That’s easier said than done. One of the challenges we have with Yoast SEO is that all of our analysis happens in your browser, in JavaScript, so we don’t have the processing power nor the indices on the server to do any analysis on related words, because we just don’t have that data. We don’t have access to that data when you’re writing that copy, so we don’t do any analysis on that. I would actually tell people to think about it: “Okay, so what is the topic space that you’re covering?” One of the things that we do in Yoast SEO Premium is let you have several keywords to optimize for. The way I’d always tell people to use that is for synonyms, and to make sure that you’re not overdoing it. We’re adding some more support for that in the coming year of development. One of the things we do, which a lot of old-school SEOs find laughable, is that we still calculate a keyword density, but it’s a very helpful thing to look at the text and say, “Hey, I wanna rank for keyword X, but I haven’t actually used that keyword.” People will say, “You don’t need to use a keyword to rank for it.” But that’s not true. Certainly, it’s less true when you’re not writing in English. What we try to do is tell people about synonyms and tell people how to write good text, how to use transition words, which are things that they usually learn in primary school and then forget to actively use when they write later on.

Right. People get complacent with their writing style and they don’t check in with the audience to make sure that they’re getting the full value out of it.

Yeah. As algorithms dive into our copy more and more, they have a hard time digesting text that is hard to read as well, because if it’s hard for a person, it’s gonna be hard for an AI, since it’s something that it encounters less. Easier-to-read copy is probably also gonna rank better, simply because digesting the topic out of an easy-to-read piece of content is easier for an AI than getting a topic out of a hard-to-read piece of content.

We’ll come back to AI because it’s gonna really affect SEO in the coming years, but let’s save that topic for just a little bit later. Keyword density, that is kind of a term that I don’t really mention, I haven’t for a very long time. It’s not something that you really wanna look at other than edge cases, I don’t think. Edge cases being like you haven’t used the keyword that you’re targeting at all on the page or you’ve used it a number of times and it looks really spammy.

Yeah. Those are pretty much the edge cases that we look for. There’s a lot of middle ground there that we don’t say anything about. I think that we currently give you what we call a green bullet, so a good score, at like 0.5% keyword density, and we start giving you a red bullet when you go over like 3.5%. It’s a very relaxed test. But if you have a keyword density of 4% in a text, well, we’ve all seen those texts in the past, when they still worked, and all of that doesn’t work anymore. We have to prevent some of that for people.
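
For readers who want to see the arithmetic behind a density check like this, here is a minimal sketch in Python. It is not the plugin's code (the plugin's analysis runs in JavaScript in the browser); the function names and the "low" label are hypothetical, and the 0.5% and 3.5% thresholds are simply the approximate figures quoted above.

```python
import re

# Thresholds are the approximate figures quoted in the conversation,
# not values read from the plugin's source.
GREEN_FROM = 0.5  # percent
RED_ABOVE = 3.5   # percent


def keyword_density(text: str, keyword: str) -> float:
    """Return occurrences of `keyword` (as a phrase) per 100 words of `text`."""
    words = re.findall(r"[\w'-]+", text.lower())
    key = keyword.lower().split()
    if not words or not key:
        return 0.0
    n = len(key)
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == key)
    return 100.0 * hits / len(words)


def density_bullet(density: float) -> str:
    """Map a density percentage to a hypothetical traffic-light label."""
    if density > RED_ABOVE:
        return "red"    # keyword used so often that it looks spammy
    if density >= GREEN_FROM:
        return "green"
    return "low"        # keyword barely used, or not used at all
```

Calling `density_bullet(keyword_density(copy, "lawn mower"))` on a draft would flag only the two edge cases discussed here: a keyword that is almost absent, and a keyword that is overused.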

Yeah, yeah. Keyword prominence is something that’s different from keyword density. That, I think, is more important in good SEO. Let’s say you’re writing page copy and you forget to mention the keyword that you’re targeting until the very last paragraph. That’s not ideal. Is that something that your plugin will check for?

Currently, we check the first paragraph and we check whether you use it in a heading. One of the things that we’ll be adding, I think about two months from now, is almost done internally, but we need to test the settings that we have to give it. We’re testing a keyword distribution check that will actually check whether you’ve used that keyword a couple of times in your text, distributed across the entire text, instead of just using the keyword three times in one paragraph and nowhere else.
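
A rough way to picture such a distribution check is sketched below in Python. This is an illustration under stated assumptions, not the plugin's algorithm: paragraphs are assumed to be separated by blank lines, matching is a plain case-insensitive substring test, and the function name and result keys are hypothetical.

```python
def keyword_spread(text: str, keyword: str) -> dict:
    """Report where `keyword` appears across blank-line-separated paragraphs."""
    paragraphs = [p.strip().lower() for p in text.split("\n\n") if p.strip()]
    key = keyword.lower()
    # Plain substring match, so "mower" also matches "mowers".
    hits = [key in p for p in paragraphs]
    return {
        "in_first_paragraph": bool(hits) and hits[0],
        "paragraphs_with_keyword": sum(hits),
        "paragraphs_total": len(paragraphs),
        # True when every occurrence sits in one paragraph of a longer text.
        "clustered": sum(hits) == 1 and len(paragraphs) > 1,
    }
```

A text that mentions the keyword only in its last paragraph would come back with `in_first_paragraph` false and `clustered` true, which is the pattern described above.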

That’s awesome. Looking forward to seeing that. Would you say that there are certain red herrings that our listeners should look out for, things that don’t really matter or where there’s misinformation or disinformation around? For example, H1 tags are taught as a best practice for SEO, but for years and years they’ve not had an impact. People come up with these invalid experiments, these tests where they say, “I added H1 tags and it helped my rankings.” Actually what they did is they added keyword-rich copy in a large font size at the top of the page, and that increased the keyword prominence and then improved their SEO, their rankings. But H1s versus H6s or font tags, that doesn’t make a difference.

No. I agree on that part. It doesn’t make a difference for SEO. It does seriously make a difference for accessibility which is why I would encourage people to use H1-H6 tags in the proper way and use them to properly structure text.

Right. When you say in a proper way, you mean that if you were to extract just the H1 through H6s out of the page and turn that into kind of an outline…

You’d have a proper outline.

Right, right.

That’s important for accessibility reasons. I don’t think that any search engine really uses that as a ranking factor because it’s way too much work.
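
The outline test described in this exchange can be tried with a few lines of Python using only the standard library's `html.parser`. This is a minimal sketch: the class and function names are hypothetical, and treating any jump of more than one heading level as a problem is an assumption about what a "proper outline" means, not a rule stated in the conversation.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in a page."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buf = []

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._buf).strip()))
            self._level = None


def skipped_levels(outline):
    """Return headings that jump more than one level deeper than the previous one."""
    problems, prev = [], 0
    for level, text in outline:
        if level > prev + 1:
            problems.append((level, text))
        prev = level
    return problems
```

Feeding a page into `HeadingOutline().feed(html)` and printing `outline` gives the extracted outline; `skipped_levels` then lists headings such as an H4 placed directly under an H2.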

Right. Also, there’s not enough compliance across all these websites.

I agree with you that there are a lot of tests like that where people say, “This is very important.” I had a question today from someone. We do a lot of Ask Yoast videos, which are a bit like the old videos where people ask questions and I’ll just give a straight-up answer; it’s a lot of me shooting from the hip. One of the questions I was asked today was, “We do a lot of interviews. Should we use the H2 heading for the questions?” I’m like, “I don’t care. Just do good interviews. Focusing more on the content of your interviews than on which headings you’re using is probably gonna lead to a whole lot more viewers than doing any SEO on that.” The funny thing is that I think we are finally getting to the point where we, as SEOs, are telling people what Google has been telling them for 10 years, because now it’s finally true. Google can actually analyze that content properly and can reward pretty good content. Of course, it has its wrongs. But I truly do think, seeing the WordPress world especially, that just getting people to use a plugin like ours and to focus on “What am I actually writing about? How am I gonna write good copy about that?” is way more valuable than thinking about which heading to use and all of these other tricks.

What would be some of the other red herrings in SEO that people should not be focusing on besides what we’ve already discussed?

XML sitemaps, basically, because they’re rubbish. Our plugin does them and people care a lot about them, but I think people believe that they do a lot more than they do. They’re important for discoverability, but that’s it. Search engines use them to discover your new content. If they weren’t gonna discover your new content without your XML sitemaps, something is extremely wrong. I don’t think many sites actually need an XML sitemap.

Right. If they’re doing them wrong, they’re actually doing more of a disservice to the search engines and themselves than if they didn’t have them at all.



But telling people that is hard.

Why is that hard? Because they just don’t wanna listen?

Because Google keeps telling them that they need an XML sitemap.

If you have a URL that’s essentially an orphan page, where you’re not linking to it yourself and there are no external links pointing to it, and you include it in an XML sitemap and expect it to rank, you’re gonna be very disappointed.

Yeah. It won’t work. This is one of the reasons we’ve added an orphan page detector in Yoast SEO Premium that will actually tell you whether a page doesn’t have any internal links.
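
The idea of an orphan page can be made concrete with a small Python sketch. It assumes you already have a map of internal links (for example from a crawl or a database export); the function name and the data shape are hypothetical and say nothing about how Yoast SEO Premium implements its detector.

```python
def orphan_pages(links: dict[str, set[str]]) -> set[str]:
    """Return pages that no other page on the site links to.

    `links` maps each page URL to the set of internal URLs it links to.
    """
    linked_to = set()
    for source, targets in links.items():
        # A page linking to itself does not rescue it from being an orphan.
        linked_to |= {t for t in targets if t != source}
    return set(links) - linked_to
```

Note that the homepage will show up in the result if nothing links back to it, so in practice you would probably exclude known entry points before acting on the list.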

That’s awesome.

Because that’s the sort of stuff that happens in CMS. But yeah, it’s hard. I can see why Google wants XML sitemaps and I can see why they’re useful for a very large site or for new sites that want to rank very fast. But that’s about it.

Alright. This would be a really cool feature. Don’t know if you’ll ever incorporate it in the Yoast Premium but I think it’s a good idea. Imagine a tool that would help you identify how many clicks away each page is or each post from an external link source. If it’s too many clicks away from an external link source or source of link equity then it’s not a very important page. I don’t think the search engines would look at it as an important page. If you have a lot of your content that falls under that more than two clicks away threshold I would say you have a pretty lousy website. What about having some sort of tool that checks that for you?
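
The click-distance idea floated here amounts to a breadth-first search over the internal link graph, starting from every page that has external links pointing at it. The Python below is a minimal sketch under stated assumptions: `links` and `entry_pages` are hypothetical inputs you would have to build from a crawl plus third-party backlink data, and the default of two clicks is just the threshold mentioned above.

```python
from collections import deque


def click_depths(links, entry_pages):
    """Breadth-first search from pages that have external links pointing at them.

    `links` maps a page URL to an iterable of internal URLs it links to.
    `entry_pages` holds the URLs with known external inbound links.
    Returns {url: clicks from the nearest entry page}; unreachable pages are absent.
    """
    depth = {page: 0 for page in entry_pages}
    queue = deque(entry_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


def too_deep(links, entry_pages, max_clicks=2):
    """Pages more than `max_clicks` away from every entry page, or unreachable."""
    depth = click_depths(links, entry_pages)
    pages = set(links) | {t for targets in links.values() for t in targets}
    return {p for p in pages if depth.get(p, max_clicks + 1) > max_clicks}
```

Because the search starts from all entry pages at once, each page gets its distance to the nearest source of link equity rather than to the homepage.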

What we need to be able to do that is external data in terms of your link profile. We need to know where you have incoming links.

Right. You’d probably need to have the website owner give an API key for whatever tool, Majestic, whatever.

Yeah. If Google would expose this in their API, then this would be very cool to do.

I wouldn’t expect Google to do it. I’d think we’d have to just use a third party.

To be honest, the data is in Google Search Console to some extent, so getting it through an API is not that far away. But doing this with friends such as Majestic seems like something that can definitely happen. I can see us doing that. At the same time, I always try to make people focus on their own stuff instead of looking at stuff outside of their site. There’s so much in SEO that focuses on all these numbers and things, whereas we can just see which stuff is not doing as well as other stuff on your site, try to bring it up to the level of the best stuff on your site, and then go through that as a continuous cycle to make your site better. I think we’ll focus more on that in the short term, and longer term I can definitely see us do some integrations like that and see if we can get that data to create something meaningful. The problem you run into there is that most of these WordPress sites that run Yoast SEO run on relatively cheap machines, so doing any type of large-scale analysis is not necessarily an option.

Yeah, yeah. That makes sense. Let’s move into more of the other types of settings in Yoast that would really optimize the SEO. Like for example no indexing certain page types or sections of the site like date based archives, or tag pages, and things like that.

Yeah. We have a lot of options for that. We’ve actually just released Yoast SEO 7.0, in which we’ve introduced something that’s entirely new, which I like a lot, but I’m curious what the SEO community will think. We added an option to disable attachment URLs entirely.

Oh, I like that.

It always redirects the attachment URL to the attachment itself, because those URLs are stupid, and that’s actually the default for newer sites that install Yoast SEO. At that point you won’t have media URLs anymore.

Just to clarify for those listeners who don’t understand what attachment URLs are. There are entire pages, thin content pages that WordPress will create that contain just simply the image or the piece of media that you uploaded.

Yeah. If you put in a title, then that will display there, and that’s it. It’s thin content in its purest form. It’s a leftover from how WordPress creates uploads and uses media. It creates a URL for every image that you upload. In most cases, you’d expect to have a URL that has the image itself, and it creates that as well. But it also creates a separate URL, called an attachment URL, that has nothing more on it than the title and that image. Those are stupid and they’re not needed. That’s why in Yoast SEO 7.0 we introduced, under Search Appearance, Media, the option to “redirect attachment URLs to the attachment itself”. When you do that, you don’t have any attachment URLs. I actually think that’s something that should be on in like 99.5% of websites.

Yeah. I would agree, for sure. There are other types of thin content or duplicate content. If you install a plugin, for example, that creates printer-friendly pages as an alternative, or “email this post to a friend” pages, then those all get indexed, and that’s a bunch of duplicate content.

Yeah. Let’s say plugins create archives or post types for that; you can very easily set them to not be visible in search results, and then we noindex them and remove them from the XML sitemaps. The same goes for author archives. I always tell people, “If you have a blog, is that a personal blog?” They go, “Yeah.” I’m like, “If this is a personal blog, are you the only one writing on it?” “Yeah.” “Why do you have an author archive? Because then your homepage is basically exactly the same as your author archive.” They’d be like, “Oh, yeah.” You can disable author archives entirely in Yoast SEO.

Yeah. Even if it’s a multi-author blog, a lot of times those author pages are really not very valuable. It’s not like the page has a bio or history of that author with a picture of them and so forth; it’s just another way to slice and dice the same repository of articles and blog posts.

Yeah. Usually, you can and should do a lot better than combining them by author. You should have a category or a tag or a taxonomy that ties articles that belong together together. That’s what taxonomies are for; that’s not necessarily how author archives work. Date archives fall into a similar category in that they are not very useful on a lot of topic-based websites. If you have a news website, I actually think that date archives are very useful, but if you don’t have a news website, then I don’t really think that you need date archives.

Yeah. With date-based archives, even on a news website, I think they’re only valuable if you’re looking at the most recent months or weeks. I’m not gonna go to your blog and say, “Oh, I wonder what happened on this blog in May of 2014.”

I agree. That’s absolutely not useful. We have the option in Yoast SEO to disable date archives entirely and that redirects them to the homepage.

Yeah. All those redirects are 301s?

Yeah, yeah.

Is there an option in certain use cases to make that 301 actually be a 302?

If you can code, then yes. We tend to follow one of WordPress’s core philosophies, which is Decisions, not Options. We try to reduce the number of options we have a lot.

Got it. Yeah. Date-based archives not useful, author archives not useful from SEO standpoint, usually from a user standpoint too.

Yeah. There’s a special case there. Should you find your author archives useful because you’ve optimized them or you’ve made them look pretty, etc., I’m fine with that. If you want to rank for the author name, that could be worthwhile in some cases. Then there’s an option as well to show the author archives in search results but not for authors or users on your blog that do not have any posts. Because, friendly enough, WordPress will create an author URL for everyone on your site, regardless of whether they have posts or not. It’s friendly like that.

Yeah. That makes for a real mess. Then all that gets indexed, and it’s a bunch of thin content that makes your whole site look really not very nice to Google. Tag pages: I’m not a fan of these sites that tag on the fly without putting any thought into it, and then they just use all these ridiculous overly generic or overly specific tag keywords, which creates all these tag pages that are useless from an SEO standpoint and actually hurt your SEO. Can you break that apart as well?

This happens a lot, and of course over the course of our existence we have reviewed a lot of websites. The conclusion so far is that if you have more tags than you have posts (you’d think that wouldn’t happen, but it happens a lot), then you’re using tags absolutely, completely wrong. A tag should combine pieces of content that belong together. If you have a series, you can combine the posts together with a tag. If you write about a specific topic 8 or 10 times a year, then I would put that in a tag. What we usually do is use categories for the main categorization of the site and use tags to be slightly more granular and to cut across categories. But having too many tags is a very poor user experience, whereas having good tag pages that are optimized for a specific topic that you’ve written about a lot can actually be very useful.
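
The "more tags than posts" rule of thumb is easy to check against an export of your own content. The Python below is a small sketch; the function name, the input shape (post slug mapped to its list of tags), and the extra "single-use tags" report are assumptions added for illustration.

```python
from collections import Counter


def tag_audit(post_tags: dict[str, list[str]]) -> dict:
    """Summarise tag usage. `post_tags` maps a post slug to its list of tags."""
    # Count each tag once per post, even if it was attached twice by mistake.
    usage = Counter(tag for tags in post_tags.values() for tag in set(tags))
    return {
        "posts": len(post_tags),
        "tags": len(usage),
        "more_tags_than_posts": len(usage) > len(post_tags),
        # Tags attached to a single post do not group anything together.
        "single_use_tags": sorted(t for t, n in usage.items() if n == 1),
    }
```

The single-use list is a reasonable starting point for a cleanup, since those tag pages each contain exactly one post.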

Yeah. Let’s clarify for our listeners: a tag page versus a category page versus, let’s say, more generically, a topic page. For example, on The New York Times website there are these topic pages. You put the keyword Iraq into Google and the topic page on The New York Times will rank very highly, and it includes recent articles about Iraq as well as a summary of what’s happening in Iraq, a little bit of an article, though not a Wikipedia-style one.

It’s a good overview of their content, and usually on important pages it actually has content specifically written for that archive page as well.

Right, right. I think that’s a great strategy, and we can apply that with WordPress using category pages and to some degree tag pages. But it’s so easy to make a mess of things just by shooting from the hip too much when coming up with tag keywords.

Yeah. In larger organizations, I’ve done a lot of work for The Guardian. The Guardian has something that I think every major site should have. They have someone with–his job title is tag manager. No one in the company can create new tags except for him.

I like that.

It’s awesome because it means that the tags have meaning. Even though they have like 2,000 tags, which with a body of content like The Guardian’s is not weird, those tags have meaning, and when you go to one of those tag pages, they actually have posts that are related in a proper way. That’s useful. I think topics.nytimes.com is one of the things that Marshall Simmonds came up with way back in the day. It’s a very valid strategy. Whether it’s categories, tags, topics, whatever you call them, however you create them technically, none is necessarily better than the other. In fact, in WordPress itself, tags and categories are pretty much the same underneath in the code. How you use them is what matters, and you have to do that with a bit of care. Your site structure, which is basically what that creates for you, is something that you have to actively think about and actively manage.

Right. Ideally, you wanna think through your taxonomy, the way that you categorize and subcategorize and sub-subcategorize all these topics, and reflect that in your categories rather than in your tags.

Yeah. The difference between categories and tags in WordPress is that categories are structured, so they can have parents and children, and tags cannot. Again, it’s how you use it that matters more than which one you choose. But I would agree that if you’re doing this well from the beginning, it’s probably easier with categories and subcategories.

Yeah. If it’s not really clean and well thought out, your tagging—I’m talking about tags as in WordPress tags not HTML tags—then just noindex all that mess, the tag pages are gonna really be more of a detriment than a benefit from an SEO perspective.

Absolutely. Although I will say that I consider using noindex for that sort of thing a short term solution and a long term solution would be actually cleaning it up. I’ve seen sites that had 200 posts and 4,000 tags and Google would crawl like 20 pages a day and their new posts would not get indexed because Google is busy crawling all these tag URLs.

Right. You’re squandering your crawl equity.

Yeah. Crawl budget.

Crawl budget, yeah, whichever term you like to use. Another thing that’s an important nuance about this: when you’re noindexing a large swath of content, and let’s say that these pages are providing different pathways into your posts or pages, you’re setting noindex, follow and expecting the link equity to flow to these individual posts. But Google’s gone on record to say that they actually treat a long-term noindex as a noindex, nofollow over time, so the noindex will neuter the ability of these pages to pass link equity.

Yeah. It will actually slow down crawling an awful lot. We used to have an option to noindex subpages of archives, so you could noindex page two and further of any archive within WordPress. We’ve removed that option because we actually saw that Google started crawling those subpages less and less and, as a result, crawled the articles that were linked on those subpages less and less. There is really no need to do any of that anymore.

Yeah. That’s great. You mentioned the topics on The New York Times; that’s topics.nytimes.com? I’m gonna include links in the show notes to examples like this. The New York Times example that I think is still silly is spiderbites.nytimes.com. It’s shocking to me. I use that as a worst-case example, not of best practices but the opposite. It’s a worst practice. First of all, the name: you might as well call it seospam.nytimes.com. And then the usability of that is just atrocious, like I’m gonna go into free archives and then 1999 and then May and then part one. Crazy.

But doesn’t Spider Bites only serve like sitemaps?

Yeah. It’s an HTML sitemap. It’s not XML sitemap files; it actually provides links into old content that would normally be very, very deep in the site hierarchy. I think the idea is, “Oh, we should do HTML sitemaps because XML sitemaps are not enough, because we’re not actively linking to these old articles from 20 years ago in any other way. This is the way in.” Then they call it Spider Bites. What do you think about that?

I like it, I like it. But that’s just because it’s so geeky. It’s so SEO geeky.

I’m not a big fan of stuff that is only there for spiders and not for humans. It’s not for human consumption at all.

No. What would be even more fun is if they actually blocked Google from it and then linked over to Bing or something like that. It would be such an SEO geeky thing to do.

Yeah, yeah. Anyway, HTML sitemaps versus XML sitemaps, do you see a benefit of creating an HTML sitemap page on a typical website?

If your site needs it, then you probably have a horrible site structure. In systems like WordPress, where it’s so easy to categorize content, it shouldn’t be needed at all.

Right. Let’s say that you decide to move some things around, delete some things, change, let’s say, category names so the slug changes, or change post slugs or the URL of some posts. The WordPress core handles the 301 redirect from the old URL to the new one automatically.

For a post, it does, yeah.

For what page types or content types does it not handle that?

For any category or tag.

Does your plugin handle that?

Yoast SEO Premium creates those redirects automatically, yeah. Yoast SEO Premium comes with a redirect manager that allows you to create redirects, which in itself is cool, but I think the real strength of it is that it automatically redirects for things that otherwise would just be, excuse my language, f***ed up. It’s very easy to change a URL in WordPress even though we call it a permalink. If you change a URL in WordPress, then only with posts will it redirect the old one to the new one, and nowhere else. We try to catch all the other cases. Also, when you have Yoast SEO Premium and you delete a page, we’ll ask you where you want to redirect it to instead of just leaving a 404.

That’s nice. Awesome. Then you also remove that URL from the XML sitemaps?

Yes. The XML sitemaps actually in Yoast SEO, they’re not static files, we generate them on the fly.

Yeah. That’s based on whatever the current posts pages…

The XML sitemaps in Yoast SEO are entirely based on the content that’s on your site at that point in time. We create a sitemap index and then have sub-sitemaps per post type and per taxonomy.
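
As a rough illustration of a sitemap index that points at one sub-sitemap per post type and per taxonomy, the following Python sketch builds the index XML on the fly. The function name, the `example.com` domain, and the `<name>-sitemap.xml` file naming are assumptions for illustration; the element names and namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, post_types, taxonomies):
    """Return sitemap index XML with one <sitemap> entry per content group."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    base = base_url.rstrip("/")
    for name in list(post_types) + list(taxonomies):
        entry = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(entry, "loc")
        # Hypothetical naming scheme: one sub-sitemap file per group.
        loc.text = f"{base}/{name}-sitemap.xml"
    return ET.tostring(root, encoding="unicode")


print(build_sitemap_index("https://example.com", ["post", "page"], ["category"]))
```

Because the index is generated from whatever content groups exist at request time, a deleted post type or taxonomy simply stops appearing, with no static file to clean up.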

Yeah, got it. An XML sitemap is a canonicalization signal, so as far as Google is concerned, if you’re providing non-canonical URLs, duplicate content, or URLs that contain tracking parameters (utm_source, utm_medium, that sort of stuff), then you’re using XML sitemaps wrong.

Yeah. Absolutely. We use the same code in Yoast SEO to generate the canonical URL on your page that we use to create the URLs in the XML sitemap.
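
The point about using one piece of code for both the canonical URL and the sitemap URL can be sketched like this in Python. This is only an illustration under stated assumptions: `canonical_url` and `sitemap_urls` are hypothetical helpers, and treating every `utm_` parameter as tracking noise is a simplification.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: any query parameter starting with one of these is tracking only.
TRACKING_PREFIXES = ("utm_",)


def canonical_url(url):
    """Strip tracking parameters and the fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def sitemap_urls(urls):
    """Reuse canonical_url so sitemap entries always match the canonical tag."""
    return list(dict.fromkeys(canonical_url(u) for u in urls))


print(sitemap_urls([
    "https://example.com/post/?utm_source=newsletter&utm_medium=email",
    "https://example.com/post/",
]))
```

Routing both outputs through the same function means the sitemap cannot list a URL that the page itself declares non-canonical.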

Great. Awesome. What about the year and month in the URL? You can set your permalink structure to contain that or not contain that. My strong preference is not to include dates in the URL structure. I think it prematurely dates your content. If you had an evergreen post that was amazing and it’s got 2012 in the URL structure, that’s not good.

I agree. It’s absolutely horrendous, and it leads to all these other problems because Google still treats the web as though we’re in 1999. If you have slashes like that in your URL, it will take everything after a slash off and spider that URL regardless of whether you link to it or not. I would definitely agree that dating your URLs is detrimental. Luckily, if you have dates in your URLs, you can just go to your permalink settings in WordPress and take them out, and WordPress will take care of the redirects by itself.

That’s cool. The dates in the URL are bad, what else in the URL structure would you recommend against that you keep seeing time and time again?

I would keep them as clean as possible and as short as possible. Don’t do anything weird like .html at the end, etc. All of that is very old school and not needed.

What about the fact that your post title would be turned into your post slug, and if you write really long post titles, post names, that could be like a 15 word long file name essentially.

Yeah. It’s problematic. The funny thing is that Google is showing fewer and fewer URLs in its search results and more and more breadcrumbs, so it’s becoming less of a problem. Breadcrumbs are a feature in Yoast SEO that you can enable if your theme asks for it, or you can add support for it to your theme. We’re talking to Google right at the moment about whether, even when we don’t output a breadcrumb, we should include the JSON-LD that we have for it, so that we still describe the structure of what the breadcrumb should look like on that page.
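
To make the breadcrumb JSON-LD idea concrete, here is a small Python sketch that describes a breadcrumb trail with the schema.org `BreadcrumbList` type. It is an illustration only, not the plugin’s actual output; the function name and the example trail are hypothetical.

```python
import json


def breadcrumb_jsonld(trail):
    """Build a BreadcrumbList script tag from (name, url) pairs, root first."""
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


print(breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("SEO", "https://example.com/category/seo/"),
    ("Clean URLs", "https://example.com/clean-urls/"),
]))
```

The markup describes the trail even if the theme never renders a visible breadcrumb, which is the trade-off discussed in the conversation.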

That’s a cool idea, I like that.

The problem is that if I do that without talking to Google, I do it on millions of websites at once and basically force them to support it. There’s a bit of a trade-off there.

Do you have a bat phone that you can pick up? I don’t know if you ever saw Batman, but one that goes right to the Googleplex.

We have a good working relationship with all the search engines. At our scale, you need to talk to them relatively often.

Yeah, very cool. Any particular accessibility features that aren’t really important for SEO but that you highly recommend, we mentioned one already.

There’s one that always drives people nuts, we have a green bullet in our checks, in our SEO checks actually that only lights up when you have at least one external link in your post.

That’s funny. Okay.

I know it’s not because I think that you’re gonna be a better person or your content is gonna rank better because of that external link. But I do think that actually makes the web better.

Yeah. I agree. This is an important distinction. When you are creating content and maintaining a website, try to be a good citizen of the web and not just a taker but a giver. When you’re linking out to quality content that’s relevant, you’re part of making the world and the web a better place, but it’s not gonna help your SEO. It’s a confusing thing for some newbie SEO people who are not into SEO at all. They think, “I need links, so I’m gonna link out to a bunch of stuff, I’m gonna link out to a Wikipedia article that relates to this topic, etc.” That’s not adding any significant value. It’s not an SEO tactic; it’s kind of misguided, confused advice.

Yeah. We like sometimes to put the SEO banner on things that actually make the web better but not necessarily rankings.

Yeah. Like using H1s, H2s, etc. in a way that makes sense and you wouldn’t fail the test if you turned in the outline in primary school. That makes the web a better place although it’s not gonna help really at all with your SEO.


Cool. JSON-LD, let’s just quickly talk about that and what that means because not everybody who’s listening is familiar with JSON-LD or even Schema, let’s talk about some of the capabilities of your plugin and just WordPress in general in terms of Schema Markup, JSON-LD, all that.

Schema is basically structured markup that clarifies what type of content something is or that clarifies specific bits about your site. Yoast SEO has a couple of things that it does under the hood that probably no one ever sees if they don’t look at the code. We output things like your site name and where to find the search pages on your site into a JSON-LD bit of code that goes into the head of your page. Normally a user won’t see that, but a spider will definitely see it when it spiders a page. We do that for your social profiles too if you fill them out in the plugin. We basically give Google all the data it needs to generate a knowledge graph block for you, the one you sometimes see on the right-hand side when you Google a brand. I think if you actually Google Yoast, you’ll see our knowledge graph block on the right-hand side in Google. Much of the data that is in that block can come from your site if you fill that out and you put it in JSON-LD blocks on your homepage.
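
For listeners unfamiliar with JSON-LD, the following Python sketch shows roughly what such a block of organization data could look like, using the schema.org `Organization` type with `name`, `url`, `logo`, and `sameAs` for social profiles. The function and the Example Co values are hypothetical, and this is not the exact markup the plugin emits.

```python
import json


def organization_jsonld(name, url, logo_url, social_profiles):
    """Return JSON-LD text describing an organization and its social profiles."""
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "Organization",
            "name": name,
            "url": url,
            "logo": logo_url,
            # sameAs lists other pages that represent the same entity.
            "sameAs": list(social_profiles),
        },
        indent=2,
    )


print(organization_jsonld(
    "Example Co",
    "https://example.com/",
    "https://example.com/logo.png",
    ["https://twitter.com/example", "https://www.facebook.com/example"],
))
```

The resulting text would sit inside a `<script type="application/ld+json">` element in the page head, where visitors never see it but spiders do.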

Some of the data though comes from other places like Wikidata.

Yeah. You can influence the logo that’s there, the name that’s there, and the social profile, that’s about it. Almost everything else comes from other data sources that you might or might not be able to get into.


We do that. When you search for the name of a slightly bigger website, you’ll sometimes see a search bar within the search results, and you can influence where that search bar leads to: whether it leads to a search internally on Google or a search on your site. What we automatically do is make that lead to a search on your site, so that if someone uses it, they don’t stay on Google but come to your site and you can lead them the right way.
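
The search-bar behavior described here relies on a schema.org `WebSite` entity with a `SearchAction`. The Python sketch below assumes a WordPress-style search URL that uses the `s` query parameter; the function name and site details are hypothetical, and it is meant as an illustration, not the plugin’s exact output.

```python
import json


def site_search_jsonld(site_url, site_name):
    """Return JSON-LD telling search engines where on-site search lives."""
    home = site_url.rstrip("/") + "/"
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "WebSite",
            "url": home,
            "name": site_name,
            "potentialAction": {
                "@type": "SearchAction",
                # {search_term_string} is a placeholder the search engine fills in.
                "target": home + "?s={search_term_string}",
                "query-input": "required name=search_term_string",
            },
        },
        indent=2,
    )


print(site_search_jsonld("https://example.com", "Example Co"))
```

Pointing `target` at the site’s own search results is what sends the searcher to your site rather than keeping them on the search engine.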

Yeah, that’s great.

We have some other add-ons to Yoast SEO that do more things. There’s a local SEO plugin that allows you to do all sorts of local business markup; it helps you rank in Google Maps and other local search engines. And we have a video SEO plugin that adds data for videos. All of these output JSON-LD or other structured markup, but we’re converting all of it to JSON-LD.

Would you say that your video plugin or your local plugin are must haves for website owners? Or is that kind of an unusual use case? I know you can’t unplug these additional plugins, but I think a lot of people don’t even know that those plugins exist, they just know that I need to have the Yoast SEO plugin installed.

Yeah. We need to do better at promoting our own stuff, although we can’t complain. We’ve been growing at like 80% a year for the last 6 years, so we’re doing well.

You are, yeah.

No, our Local SEO plugin I think is very useful if you’re a local business with one or multiple locations. Our video SEO plugin is only useful if you really do share a lot of video content. Mostly it’s more of a social sharing winner than necessarily a search engine winner, although it will help you show up in video search results in Google too. In the past it was very easy to get video snippets into the normal search results in Google as well, but those days are over.

Yeah. That’s sad. I really like those days.

Yeah, it was very easy. But I think we might have made that slightly too easy with our video SEO plugin and that’s one of the reasons that it got killed.

Yeah. The rich snippets for pages on your site that have a video embedded would actually have a little image thumbnail of the video and give more visibility to your listing in Google. Pretty much no sites get that now except for youtube.com, vimeo.com, and a handful of other video sharing sites.

There’s a list of like 70. The funny thing is that there are a couple of news sites that still get them. The Guardian still has them, for instance.

Wow. That’s cool. Hopefully this doesn’t alert any Google engineer who’s listening, like, “Ah, The Guardian, I wasn’t meaning for that to still show up, let’s take care of that.”

They make that list by hand.

Alright. Is that a list that’s public that you can find?

No, I don’t think so, actually. I know it because when I was still at The Guardian, we ran a test on which sites still have that.

Alright. Interesting. RSS optimization. That’s really important for example if you have a podcast, we have this podcast, I have my other podcast show, Get Yourself Optimized. Each of those podcasts have their own website and they’re of course WordPress based websites and I wanna have good optimized RSS feeds because if you don’t have an RSS feed, by definition you don’t have a podcast.


Let’s talk about the RSS optimization side of things.

Yeah, what do you wanna know?

What are some of the key elements that you want to optimize? People know, “Alright, I need to have good keywords in my title tag, I need to have a meta description (although that doesn’t help you with your rankings, it’s good for click-through).” There are some basic, fundamental SEO tactics for regular web content, but when we’re talking about RSS feeds, there are other similarly basic optimization techniques, like making sure you have good keywords in the item titles and in the site title and so forth. Do you have any tips for RSS?

No, not really. To be honest. It’s been a long time. We do some optimization within Yoast SEO but that’s mostly about linking back to the original post. Because doing that automatically makes scrapers far less effective. But that’s about the only thing that we do about RSS at this point.

Okay. That’s fine. Actually, that’s a good segue to scrapers. Some people get obsessed with going after each scraper site that’s stealing their content, even though Google knows that this is a scraper site, it’s garbage, and it will never rank it. What’s your take on scrapers, on using tools like Copyscape to look for copyright infringement, filling out the DMCA takedown notice and sending that to the web host and to Google and so forth? What’s your take on all that?

If it’s really hard on you, then you do it. But be sure that you only do it if it’s really hard on you.


I would suggest going into the Yoast SEO settings, to RSS, and checking whether you can do that automatically. We add things in our settings there that link back to your site. Something like 33% to 50% of scrapers will then just automatically link back to your site, which is usually enough to kill the entire problem.
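
The link-back technique amounts to appending a short attribution line to every feed item, so that a scraper copying the feed verbatim also copies a link to the original. Here is a hedged Python sketch; the function name, the footer wording, and the example values are illustrative assumptions rather than the plugin’s actual template.

```python
from html import escape


def add_rss_footer(item_html, post_title, post_url, site_name, site_url):
    """Append an attribution line linking to the original post and site."""
    footer = (
        '<p>The post <a href="{post_url}">{post_title}</a> appeared first on '
        '<a href="{site_url}">{site_name}</a>.</p>'
    ).format(
        post_url=escape(post_url, quote=True),
        post_title=escape(post_title),
        site_url=escape(site_url, quote=True),
        site_name=escape(site_name),
    )
    return item_html + "\n" + footer


print(add_rss_footer(
    "<p>Full text of the post.</p>",
    "My Post",
    "https://example.com/my-post/",
    "Example Blog",
    "https://example.com/",
))
```

The footer lives only in the feed, so readers on the site never see it, while copies republished from the feed carry the link home.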

Great. This is a really important point here: Google can easily identify the originator, the source of the content. Let’s say it’s a blog post that you wrote. If it’s appearing on 30 other websites and those are all scraper sites that have stolen your content without your permission, and a third to a half of them are linking back to the original post, that’s a very clear signal to Google that you’re the originator of the content.

Exactly. Yeah. Even though not all of them might pick that up, it actually fixes the problem.

Yeah. What’s your take on pagination? We briefly mentioned rel=“prev” and rel=“next”, and I’m not a huge fan of breaking up a large article into, like, five parts or whatever in pagination.

Pagination within posts is stupid and should never be done. It’s such a bad user experience that it’s ridiculous.

Yeah. Pagination is not something that you should do unless we’re talking about category pages and page two, page three, page four etc.

Exactly. That being said, we have full support for both versions of it in Yoast SEO, so we will do a proper rel=“next” and rel=“prev” between pages and all of that, both on category archives and on paginated posts.
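
As an illustration of what rel=“prev” and rel=“next” between pages means in practice, this Python sketch returns the link tags for one page of a paginated archive. It assumes a WordPress-style `/page/N/` URL pattern; the function name and the example URL are hypothetical.

```python
def pagination_links(base_url, page, total_pages):
    """Return rel="prev"/rel="next" link tags for one page of an archive."""
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")

    base = base_url.rstrip("/") + "/"

    def page_url(number):
        # Page 1 lives at the archive's base URL; later pages at /page/N/.
        return base if number == 1 else f"{base}page/{number}/"

    tags = []
    if page > 1:
        tags.append(f'<link rel="prev" href="{page_url(page - 1)}" />')
    if page < total_pages:
        tags.append(f'<link rel="next" href="{page_url(page + 1)}" />')
    return tags


for tag in pagination_links("https://example.com/category/seo", 2, 3):
    print(tag)
```

The first page has no rel=“prev” and the last page has no rel=“next”, so a single-page archive gets no pagination tags at all.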

Got it.

But from a usability perspective, I absolutely hate it.

Yes, me too. Awesome. This has been just packed with awesome knowledge, and I really appreciate you taking the time to share your cutting-edge advice with us. How would people reach out to you, your team, or your company to work with you? Is that even an option? Or should they just install the plugin?

No. They should just install the plugin, we don’t do consulting. They should follow yoast.com or follow us on Twitter @yoast, me personally on Twitter that’s @jdevalk and that’s about it, just keep up.

Alright. Also watch your Ask Yoast videos.

Yeah, absolutely. They’re good fun.

Alright. Thank you, Joost. Thank you listeners, we’ll catch you on the next episode of Marketing Speak. This is your host, Stephan Spencer signing off.

Your Checklist of Actions to Take

☑ Focus on good and valuable content. It’s the first and most important step in site optimization.

☑ Prioritize link quality by doing a regular audit on content in my site. If I have a page without substance, my site will not look credible in Google’s eyes.

☑ Be aware of my keyword density. Overusing keywords may seem like I am writing for the bots instead of actual humans that need information.

☑ Use proper H1 to H6 tags for great and easy usability. This will make my blogs and articles look better.

☑ Disregard attachment URLs and disable them in Yoast so that they redirect to the attachment itself.

☑ Noindex pages that don’t need to rank, such as date-based archives and tag pages.

☑ Disregard date-based and author-based archives on my site. These archives are not useful from an SEO or user standpoint.

☑ Refrain from overusing meaningless tags. Instead, use categories to classify content about a specific topic.

☑ Avoid duplicate content and copyright infringement with the help of Copyscape.

☑ Don’t forget to install the Yoast plugin on my WordPress Dashboard.

About Joost De Valk

Joost is a 36 year old web developer, SEO and online marketeer. In the early days of his career, he worked in several companies, ranging from enterprise hosting to online marketing agencies. This allowed him to work with several of the largest brands in the world. He founded CSS3.info – the biggest CSS3 resource on the web – in 2006 and sold it in 2009.

In 2010, Joost created Yoast, which focuses on software, training and services for website optimization. Team Yoast currently consists of more than 60 people around the world. Yoast SEO, Yoast’s main software product, is currently active in over 8.5 million WordPress websites.

