Kaspar, it’s so great to have you on the show.
Stephan, thanks for having me. It is a real pleasure and honor.
Let’s talk about Google and how penalties work. Because some folks think they have a penalty and they don’t, some folks don’t realize they have a penalty, but they do. Then, there are different types of penalties, there are algorithm updates that might be misconstrued as penalties but they’re just core algorithm changes. Let’s set everybody straight on what exactly is a penalty, how do you tell, and all that good stuff.
That’s a mixed bag of a lot of really important questions. I’d like to dive in here and go over these one by one. But first and foremost, the most important thing to clarify, because this is a very common misconception, is the fact that there is no such thing as an algorithmic penalty. Now, it is being tossed around in the industry as a term quite a lot, and it basically stands for websites that do not quite live up to the site owner’s expectations in Google Search.
Nonetheless, all the experience during my time working at Google Search subsequently confirms it doesn’t exist. There are algorithms and manual penalties. Those can have a rather similar impact. They can feel like they are the same thing, but it is very important to understand that they are very different in the way they need to be handled.
Let’s start off with penalties as the real thing. Nowadays, Google is very transparent about what they refer to a little bit euphemistically as manual spam actions. That’s just another term for penalties. These are stopgap measures. Whenever algorithms do not work 100% as desired and websites get ahead of the game by cutting corners, that’s where manual penalties kick in.
I said initially Google was transparent about it because Google does communicate these via Google Search Console quite clearly. There’s a number of different messages, all of which highlight what issues have been identified on- or off-page. Off-page is, of course, almost all the time link building. On-page, there is a variety of things that can be wrong according to Google, such as the quality of content, the type of data markup being used, or the purpose of landing pages, specifically doorway pages. All of these things may trigger a manual spam action, a penalty. They can overlap and there can be several penalties applied to the very same website.
This is very important. There is a very dedicated process to having these lifted, to have the penalty removed, that is being referred to as a reconsideration request. One has to actually go through the process in order to get rid of the penalty, otherwise, it will be persistent and it will remain in place for an extended period of time, really way too long in order to wait and sit out for the penalty to actually expire. It’s also important to remember that even if a penalty expired after that long period of time if one wished to actually wait that long, it could be reapplied by Google if the reasons still stand their ground for the penalty being placed.
There is a very dedicated process that addresses the issue of a manual spam action applied to the website. Now the consequences of manual spam actions can be manifold. It can be a very abrupt nosedive, a website almost disappearing overnight from Google Search to the extent that it is still visible as being indexed, but it doesn’t rank. A gradual decline is a possible alternative that happens as well.
What is also possible is that the website is flagged either as a website that isn’t really safe or as a website that is compromised, something that very much impacts the way that users see a website, because users are not very much inclined to visit a website that is clearly flagged by Google as a compromised one. Last but not least, there is also the option of losing search real estate: the site shows up less prominently, without reviews and ratings, if you will. All of that is a distinct possibility.
Of course, there’s also the worst type of penalty, which is applied for major spam, and that is a complete removal. The website in question is deemed by Google to be so bad that it’s not to be seen anywhere, not even with the site: operator.
Now, as opposed to those manual spam actions and those penalties, there are algorithms. Algorithms are mathematical models, calculations if you want. These do not serve the purpose of penalizing a website or boosting a website for that matter. There is no whitelist or blacklist of any sort. What they do is calculate. They pick up on signals, SEO-relevant signals: content for one, backlinks for another. If certain thresholds that are not public knowledge are hit and exceeded, certain reactions are triggered.
For instance, if I were to drive around in my car and I would use a navigation system—Google Maps or anything else—if I pick and choose to go another route just because I don’t want to pay for toll, for instance, that fact alone wouldn’t break my car. It wouldn’t stop my car from going forward. I would just take another route. The signals would have changed, the input would have changed. That’s exactly what’s happening when algorithms kick in.
There is a handful of algorithms that are famous, or rather infamous, I suppose, Panda and Penguin being on top of that list. Panda is an algorithm that addressed content quality issues, Penguin being one that deals with backlink issues. In fact, there are hundreds and thousands of algorithms, unnamed and never really known to the public, that are out there used by Google. In fact, while we’re speaking, there are probably releases being made.
What I’m trying to say here is that it is really not possible, and not a good way of spending time as a professional in the industry, to try and identify individual algorithms, where they may have kicked in, or whether they have been updated. Also, as much as Google is transparent about manual actions, they’d rather not share much information nowadays on the algorithms. That does translate into some insecurity and confusion in our industry. But we also have to keep in mind that Google does these things not for us as an industry. They also don’t do these things because of us or against us. Rather, they introduce those algorithms in order to provide a better service to the wider audience, to the public. It is not surprising, therefore, that there isn’t that much communication going around toward the SEO industry, which is, as I said, very much a niche interest.
Would you say that there’s misinformation that is being spread by Google engineers or Google representatives that is meant to confuse or at least take our eyes off of the thing that they don’t want us to focus on?
It’s an interesting question. Smoke screens of sorts. I have never experienced this happening during my active time at Google while I was the head webmaster of the outreach operation. It would have never crossed my mind that it’s maybe a viable option to intentionally deceive.
I still don’t think it’s happening nowadays. However, what may be happening, because there is so little communication in regard to new releases, is that rumors may spread around and are not being addressed. In my personal opinion, Google isn’t doing themselves a great favor here by just keeping a low profile and not addressing those misconceptions that are being contemplated by us in the industry. But then again, I suppose, Google does care more about the bigger picture, maybe general media, rather than the small niche that SEO represents.
Okay. There are a lot of conspiracy theories out there in every industry and SEO is one of them. You hear a lot of rumors that Google is trying to spread disinformation or misinformation in order to confuse us or to make it less likely we’re going to be successful at our “manipulation.” But what you’re saying is you don’t think that’s the case; they’re just working to make the algorithms better, rewarding the best, most relevant, most authoritative content, and putting it at the top.
I actually do not think that’s the case. Again, this is my personal experience both as a user, but also having worked for a really long time on Google Search, I don’t think that rewarding the best website has ever been part of the consideration. I’d rather think that the idea is just to provide a genuinely good user experience.
The focus isn’t really on websites. For all I have seen, Google doesn’t really care which website ranks; they’re really indifferent about that. It is rather their users, user happiness, that is the overarching goal. After all, the entire Google business model is completely dependent on it. It’s a critical factor of the Google business model that users remain loyal to Google Search. If they were to go somewhere else, the Google empire would be facing a certain issue in terms of revenue generation.
After all, it is really an opportunity for every website to put their best foot forward, but one should not really expect gratitude, rankings, or any benefits from providing a really good website because that’s not the focus. The focus really should be entirely on users.
Generally—this has been particularly my experience while working with starting websites—if the focus is shifted from Google—what Google wants, what Google desires, how do we please Google? If that focus is shifted entirely towards users, how can we make our website better? How do we provide a more compelling and unique selling proposition for our users?
If that’s embraced as a strategy, it is really something that works. It is also rewarded by Google but not intentionally. It’s just rewarded by Google because Google tends to like websites that are popular with users.
I still don’t think that Google is intentionally providing misinformation. Of course, I can only speak of my personal view. I don’t have an existing special channel to my fellow peers.
By the way, when was your tenure at Google and when did it end?
I joined Google in 2006. That’s a century ago in internet times. I left after seven years, so that was 2013. Nowadays, I still maintain a rather cordial and very friendly communication with my former peers and friends at Google. In fact, I met my spouse at Google. Very positive experiences, a lot of really good relations resulting from there. But it doesn’t translate to me having a special channel in terms of, “We’re about to release that thing. You need to know about that.” That’s just not happening today. Integrity is really something that’s being valued.
But just to finish my thought, the communication could be much improved, I can promise that. It is just that we have to think in terms of scale. I understand both sides. On one hand, we in our industry are really wishing for a higher level of certainty and more clarity on what’s happening on Google’s side. On the other hand, Google, a corporation, is essentially committing resources, even though they are in a comfortable situation, to a non-revenue-generating channel that is addressing the needs of a minority, which we are. We have to see that as a fact.
The more resources they would commit, the bigger the chance that we’re going to be actually asking for more, and then quite potentially say, “Yeah, but you didn’t comment on that particular article,” or something. I understand why they aren’t committing more human resources.
What I’m missing currently is the situation that we used to have a couple of years back at the time when Matt was still around, where there was one person who could tell for sure with the complete authority of Google Search, “This is exactly what’s been happening. This is where the speculation ends.”
Now, Danny is around, that’s for sure. There’s a couple of really committed people that do not necessarily deal with spam but with Search, such as John Mueller. But what’s possibly missing is really a person that would be the PR manager just for Google Search, if that’s a possibility. But that’s just my personal opinion.
Right. For our listeners who are not really entrenched into this world of SEO, Matt Cutts is the Matt you were referring to. He’s no longer there.
Correct. I believe he joined forces with a governmental agency working for the US, I think.
Yeah. It’s the Digital something. You mentioned John Mueller or @JohnMu, that is his handle or nickname. You didn’t actually mention Gary Illyes yet, but he fits into the equation as somewhat of a Matt Cutts replacement, but he’s not nearly as vocal as Matt was. He does speak at some conferences and he does tweet, but he doesn’t answer in-depth questions to the degree that Matt did or have the same level of transparency.
Then you mentioned Danny, who is Danny Sullivan, who used to run the SMX conference and the Search Engine Land website with his business partner, Chris Sherman. Then he got hired away by Google and he became their search liaison. Those are the main characters that our listener needs to know about. Anyone else on the Google side that they should know about besides those guys?
I think these are the main public faces nowadays. If you take a look at the Google Webmaster Central Blog, there are sporadic blog posts from other authors. No one, however, is quite as vocal and quite as visible as John Mueller. I had the opportunity, and again, this has been an honor, of working with John Mueller for a long time. We still run into each other at conferences. I last saw him at SMX in Munich.
No one else is quite as active in the industry as John Mueller. It deserves mentioning here that the work he does is really superb. He is, however, just one man. There’s only so much that can be done in terms of our outreach and communication, especially when it comes to one-on-one. He will be quite often approached with individual problems as I used to be and I know he’s just as good in that scalable approach.
Let’s go back to this idea of you get this manual penalty, a manual action. Your position is that there are no algorithmic penalties, there are just algorithmic updates. There are manual penalties and those are referred to by Google as manual actions.
Let’s talk about these different types of manual actions. You can have pure spam, you can have just unnatural links, all these different messages inside of Google Search Console depending on the type of manual action you’ve been hit with. Could you rattle off a handful of what some of these might be other than pure spam and unnatural links? Those are a couple of examples, but I don’t have the full list memorized.
I think I can. There are hacked sites. That’s third parties abusing the website by injecting content, something Google does highlight.
Yeah, malware-infected, etc.?
For instance, yes, or with content injected that is just abusive. There is user-generated spam, something Google does highlight. There is also spammy free hosts, to the best of my knowledge rarely applied. At least, I haven’t seen that happen very frequently in the last maybe two or three years. There is spammy structured markup, which essentially translates to losing search real estate if it’s applied.
Even though every single manual action or every single penalty can be lifted or can be removed, when that happens in the case of a spammy structured markup penalty, it still takes a long time to get back in Google’s good graces, about six months, I suppose. That’s just a side note.
Now, there are unnatural links penalties that are applied for linking out, Google basically saying, “We believe you’re taking a commission or you’re taking money for linking to websites that don’t deserve to be linked to.” And there is, of course, what is probably the most commonly applied penalty, which is for building links. That’s for unnatural links to a website rather than from a website.
There are penalties in relation to content. You already mentioned pure spam or major spam; the wording seems to be changing there. There are penalties in relation to cloaking or sneaky redirects, cloaking being displaying substantially different content to bots in comparison to users. A slightly nuanced penalty is being applied for thin content, or content with very little additional value. At the same time, there are penalties that are being applied for keyword stuffing or hidden content.
There have been penalties for AMP content mismatch and redirects. These are very rare. I have hardly seen any. Most of the time, of the penalties just mentioned, we see penalties that are being applied for building links or for low-quality content, where Google says, “This isn’t good content, this is low-quality content, and your site isn’t going to be ranking quite as well.” These are the most common ones nowadays, I suppose.
However, I also have to say, what myself and my business partner and fellow former Googler, Fili Wiese, see with our clients is, of course, just a small sample of all the penalties that are being issued over the year. A couple of hundred cases per year is not quite representative in comparison to the thousands and thousands that are being applied on a quarterly basis.
Yeah, understood. Now, you didn’t mention duplicate content as being one of those potential penalties or manual actions. Some people think that duplicate content is a penalty, but it’s not. It’s filtered. You might feel like you got penalized because you had duplicate content, but you didn’t.
Google wants diversity in the search results and if those search results look too similar to each other because the content is duplicate, maybe the titles are duplicate, the snippets, and so forth. That’s not a great user experience and it’s all about the user as you said earlier. Do you want to elaborate a bit on duplicate content?
This was spot-on, exactly my thoughts. Indeed, there is no penalty being applied for the duplicate content. It is something that Google basically doesn’t want to rank because it is duplicate content. Obviously, it doesn’t add value to Google Search. If they assume that it isn’t the original source, that type of content isn’t going to rank well. It may feel like a penalty, you said that already, but it is possible to figure out whether it is an actual penalty or not.
Now, in contrast to all those algorithms and filters, which may or may not feel like a penalty and which do not trigger any notification whatsoever, actual penalties, or actual manual spam actions, do trigger a Google Search Console message. Anybody in doubt about what’s happening to their website can actually check Google Search Console. One may be a fan of the new Google Search Console or not, but it does provide unique insights. In that sense, it is very much recommended.
Yeah. This is an important point that every manual action that you’ll ever get or that you’ve ever gotten will be reported in the Google Search Console as long as it’s active and hasn’t been lifted. You’re not getting some of the information in terms of your manual penalties. You’re getting all the information. Google said that they report on all of them, is that right?
This is again something I feel a little bit ambivalent about because, ideally, it is exactly what you just said. We have all manual actions highlighted so we know what to do about them. Now, slip-ups happen with all types of software and all types of systems. There have been slip-ups with Google Search Console in the past. It is not quite possible to simply say yes. There may be penalties that are not being highlighted, for whatever reason.
I’m a little bit hesitant to say yes, they show everything. I’d say they show almost everything. It is also quite possible for Google to change their internal policies without making that public. They don’t have to. Again, the general population of users wouldn’t care.
For all we know, nowadays, yes, all manual actions are being highlighted. Is it going to stay like that forever? It’s hard to say. Impossible to say, in fact. I would not really want to make a large emphasis on the fact that all messages are being highlighted all the time. I’m going to give you a good example of why that’s the case.
For instance, while working with clients—something you’re very familiar with—you may have a new website added to your given account. What we see quite often is that if this is a restricted account in Google Search Console, you wouldn’t see all past communication. In the old console, you wouldn’t actually see all past penalties that have been applied. I’m very hesitant to say it’s always the case. I’d say it is almost always the case, more frequently than not, but I wouldn’t say it’s 100%.
Okay. Past Google representatives have said that it’s 100% of the time but that was then. As you say, times change and today versus that time when they made that announcement, some policies could have changed quietly in the background and no announcement was made. It’s good for us to recognize that whatever information Google is giving us, what Google gives, Google can take away at any time.
I suppose, yes, they can. After all, it’s a private index. They can actually do what the hell they want with that index and Google Search Console is not a paid product so we aren’t the customers there. We’re users of something that is being provided free of charge. It’s almost an understandable approach even though it is regrettable if we’re not getting all the information that we wish we did.
Yeah. It’s important for listeners to understand that there’s this army of human reviewers that are looking at websites including our own and they’re called manual raters. They have these guidelines that used to be confidential that Google would try and keep quiet and keep internal and then those would get leaked. Then a new update would be made that would eventually leak as well. Then Google finally decided to make the quality rater guidelines public so that anybody can read them. I’ll include a link in the show notes to this huge PDF, it’s like 160-some pages or something.
There’s some interesting stuff in there about things like EAT, which stands for expertise, authoritativeness, and trustworthiness. I wrote a Search Engine Land article about this that is worth reading, and I’ll include that in the show notes. There is also the YMYL category, which refers to “your money or your life” types of websites in the quality rater guidelines, and EAT in particular is applied to YMYL.
My question for you, Kaspar, how do you relate the signals that the manual raters are giving by saying this is high quality, low quality, this is failing to pass this test and that sort of thing? They’re not hitting a big red button to penalize your site, they don’t have that kind of control, they’re not part of the webspam team. But what they can do is provide signals that Google uses to determine if sites should get penalized or not and also if the algorithms need to get more sophisticated in spam detection and in reranking sites differently. Do you want to comment on that?
I certainly do. A topic close to my heart and one that I feel rather passionate about. Let me talk about that for a moment because you said a couple of things that are really important to understand and how to differentiate. Yes, the document is being applied by, you said, an army. I’m not sure how many there are of those quality raters. These are not Google employees, these are people hired by third parties, external agencies to test on behalf of Google, applying those Google raters guidelines.
The purpose of those tests is essentially to stress-test new or improved algorithms. It never translates to anything that is happening to live index rankings, and it doesn’t translate or transpire to any actual action being taken on websites.
A rater review of any given website doesn’t have the power to penalize, as you mentioned correctly, and they also don’t have any means of communication to say, “I really dislike that website and something should be done about that,” or maybe “This website is in violation and something should be done about that,” that’s not the purpose of their work. Their work really is to apply the human ingenuity, the human factor, instincts along the lines with the guidelines to see if new software, if the new algorithm produced better results.
A machine learning algorithm or an AI with a huge data set makes that AI much more effective. If there’s, I call it an army, we don’t know how many, but it’s probably thousands or maybe tens of thousands and it changes. So, these contractors that are doing the quality rating, those are signals that then can be used by machine learning algorithms, correct?
There is absolutely potential for that. I have no doubts about it. I suppose there may actually be a better source to train some algorithms. Again, this is my suspicion, not a confirmed fact. For instance, if I were to utilize data on a large scale, the disavow file is something I would consider a great source for data mining and for identifying both link-selling and link-buying websites. But that’s just speculation.
However, what I’d like to say as well—I didn’t mention it previously; this is something that is really close to my heart—I profoundly care about not only just Google Search as a product that I like that I used to work on but also about our industry. The one single blunder that Google committed was to publish those raters guidelines without giving the broader context of what these are about and how they can be applied for webmasters and SEOs or if at all.
That’s the point: for experts, for the best of the best, this is an interesting source to contemplate how Google sees things. As you said yourself, it’s a very thick document. It takes time to really dive into it. I have no doubt that for somebody like yourself, this is an interesting source. But for the average webmaster who is merely trying to make their website better in order to improve rankings over time and grow revenue, this is a source of confusion.
The number of times I was addressing the issue at conferences, speaking in public, or talking to peers and clients in person, this is happening all the time. People think like, “Yeah, we have to abide by those Google raters guidelines in order to make our website better.” This is completely not the case.
I find it, at a personal level, really unfortunate that the great work of a handful of people engaged in webmaster outreach is being compromised by such releases without the broader context. The fact that these documents had leaked before really shouldn’t have forced Google’s hand there, because it’s creating more confusion than actually helping people to understand how to make their websites better. That question is being answered by Google, maybe not frequently enough, but the answer to that question is certainly not in the Google raters guidelines.
Yeah. The quality rater guidelines are very much an advanced document for an SEO and not meant for general consumption, not meant for just a website owner to consume. That’s why I would write articles like the one about EAT to help provide context and eliminate confusion. There’s a lot of misinformation about EAT, a lot of speculation that’s wrong about EAT and other aspects of the quality rater guidelines. A lot of conjecture and stuff that doesn’t help the website owner to rank their site better.
Now, you mentioned the disavow file as a great source of data for training a machine learning algorithm. We should probably give a little bit more context around that disavow file for our listeners and what that link cleanup process looks like. It’s potentially not just the disavow process of listing the websites that are spammy that are linking to you in this disavow file, but it’s also potentially an outreach process requesting or demanding that these spammy websites remove the link to your site.
There are tools out there that do that like Link Research Tools has Link Detox that works in conjunction with Pitchbox so that you can do outreach to make those requests to get links removed that are spammy. What’s your position on that and where does that fit into the manual penalty versus algorithmic update side of things?
Let’s start with the first and foremost important fact: link building works. It is a Google Webmaster Guidelines violation, yet it can work. Before that gets quoted out of context, it is still something that can also get a website very much in trouble. That it works is the reason why people do link building to begin with, and because that can obviously be a double-edged sword, Google does provide the disavow file for websites to disassociate themselves from sites they do not want to be linked from anymore, to repent, so to say, and clean up their history.
The tool itself is a blessing, so to say. It can be very labor-intensive to utilize, especially with backlink profiles that go into the hundreds of millions or actually billions of backlinks, which is not quite as uncommon as one might think. But it is possible to actually clean up a backlink profile and get back into Google’s good graces, to have a manual spam action removed.
It is a very useful tool, but the way it is being applied is very frequently flawed. One common mistake that we see quite often is to go for granular patterns rather than the domain: operator, inevitably leaving loose ends and some stuff that Google’s going to frown upon. Again, we could have an entire session talking just about that.
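The granular-versus-domain point can be illustrated with a minimal Python sketch. It assumes the plain-text disavow format Google documents for the tool (one entry per line, either a full URL or a `domain:` entry, with `#` marking comments). The function name `to_domain_entries`, the `.example` hostnames, and the choice to strip a leading `www.` are illustrative assumptions, not anything described in the interview.

```python
from urllib.parse import urlparse


def to_domain_entries(lines):
    """Collapse URL-level disavow entries into domain: entries.

    Hypothetical helper for illustration; subdomains other than
    'www.' are kept as distinct hosts (a simplification).
    """
    domains = set()
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        if line.startswith("domain:"):
            host = line[len("domain:"):].strip()
        else:
            # urlparse only finds a hostname when a scheme is present.
            if "://" not in line:
                line = "http://" + line
            host = urlparse(line).hostname or ""
        host = host.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.add(host)
    return [f"domain:{d}" for d in sorted(domains)]


if __name__ == "__main__":
    granular = [
        "# links found in the last audit",
        "http://www.spam.example/widgets/page1.html",
        "http://spam.example/widgets/page2.html",
        "domain:Bad.example",
    ]
    print("\n".join(to_domain_entries(granular)))
```

Two URL-level entries for the same site collapse into a single `domain:` line, so a page on that site that was missed in the audit is still covered. Whether a whole domain should be disavowed remains a judgment call for whoever reviews the backlink profile.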
In the context of our conversation here, in regard to these manual spam actions, it is a useful tool that can address any website’s link concerns. I also want to say here that I’m a big fan of Link Research Tools, a great source of data that we love to utilize among others. That is useful.
However, I also want to say removing links is tedious, very time-consuming, and in some industries, verticals, and languages it has become almost extortion. “We’re not going to remove links unless you pay” tends to be the answer to inquiries. For that reason, the disavow tool is probably more workable in terms of how it can be applied. I don’t have to ask for a removal, I just disassociate my website from the links and move on. Of course, links can be removed. It makes for a stronger reconsideration request case, it makes a better rationale.
None of this really reflects the potential that disavow files represent as a data source, for Google only and for no one else. No one else has that volume of disavow files. Purely hypothetically speaking, and again, I’m not privy to that information anymore, one can easily imagine that out of all the disavow files that Google has collected over time, they could take a very small sample, say 0.01%, of the patterns that are included all the time, run these against algorithms that are specifically tasked with identifying such links and websites, run the results again past the Google raters, and see if those results can be reproduced and are meaningful. This could theoretically pose a great opportunity to build more algorithms that can identify link building.
There is just no way of disguising or stealthing link building. This is just not possible. You can disguise it from your competitors, but you won’t ever be able to disguise it from Google because, obviously, they need to see it so it works. Because links are still very much an issue to Google, this is probably an opportunity that somebody already considered a long time ago, and I’d be greatly surprised if they aren’t working on it. Then again, it is unlikely they will come around and communicate anything at this time. Even if there was a new release of an algorithm specifically built on that scenario, it is unlikely that Google would actually confirm how it came into existence. This is just a thought experiment, but the potential is undoubtedly there.
Yeah, I would say it’s highly likely that Google is either already using that rich data source or will be in the future. Now, reconsideration requests, let’s talk a bit more about that. If you have been hit with an algorithmic update that took all your traffic away, filing a reconsideration request is not going to be an option. A reconsideration request is only for manual actions. If you don’t have any manual action showing in Search Console, then you can’t even file a reconsideration request, right?
Correct. It gets even worse if it happens to be a large website with a crawl budget that is either not managed at all or completely mismanaged. It can actually be the case that you don’t know how long it would take for Google to recrawl the website so they pick up on new and improved signals, or you figure out it’s going to take a year or longer.
In this context, if I may share a very actionable, really quick piece of advice with the listeners: save your server logs. Save and preserve them, not just for a month or two, but literally preserve them. Server logs are a treasure trove of information for a lot of reasons. But for medium and large websites, they are actually the only way to actively manage the crawl budget and see how that’s panning out for the website.
It is also something that’s going to be very useful when a website has presumably been hit by an algorithm, or maybe it is a technical flaw that is causing the website to fail in search. In all these scenarios, what needs to be done is a technical audit, something that can very well be done within the capacity of the webmaster, the website owner, and the developers.
In that context, it is, of course, of the greatest importance to answer the question that everybody’s going to be raising on the very first day: how long is it going to take for the website to recover? We can’t just ask Google. We can’t just send a reconsideration request, of course. That option just isn’t available. It is an algorithm.
What needs to be known is how long it is going to take for Google to recrawl the website, or whatever are the desirable bits and pieces of the website. That’s what needs to be done in every single instance. For small websites, it’s a smaller exercise. With large websites, this can actually become quite existential.
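As a rough illustration of the server log advice above, here is a minimal Python sketch that counts how many distinct URLs Googlebot requests per day and turns that into a very crude recrawl estimate. It assumes logs in the common "combined" access log format and identifies Googlebot only by its user agent string, which can be spoofed. The function names and the `total_urls` figure are hypothetical, and the estimate is optimistic because the same URLs are often recrawled on different days.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: "combined" access log format, as written by many web servers.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_crawl_stats(lines):
    """Return {date: number of distinct paths requested by Googlebot}."""
    paths_per_day = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        # User agent check only; a real audit would also verify the IP.
        if not match or "Googlebot" not in match.group("ua"):
            continue
        day = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z").date()
        paths_per_day[day].add(match.group("path"))
    return {day: len(paths) for day, paths in paths_per_day.items()}


def estimate_recrawl_days(stats, total_urls):
    """Crude estimate: total URLs divided by average distinct URLs per day."""
    if not stats:
        return None  # no Googlebot activity in the logs provided
    average_per_day = sum(stats.values()) / len(stats)
    return total_urls / average_per_day
```

Fed with several months of preserved logs, the per-day counts show whether crawling is speeding up or slowing down, which is the kind of trend that only becomes visible when logs are kept long term.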
Now, you mentioned crawl budget. That’s a term that we throw around as SEOs, but our listeners may not be familiar with that terminology. It refers to how much budget (I guess that’s the only word I can think of as appropriate here) Google has to crawl your site. If you have a very large website with millions of pages, it may take a while for Google to crawl all of it, especially if your web server is not that responsive, because Google doesn’t want to overwhelm it by sending too many Googlebot requests in short order.
Crawl budget helps to manage the server load and Google’s resources and not waste it or squander it on low-value websites. If you have very low PageRank, you’re not going to get much crawl budget and you won’t get crawled very frequently or very quickly. Also, if your server is not very responsive, it will be throttled that way as well.
Exactly. The latter point is one worth stressing, especially for low-performance websites that are essentially slow. Google’s going to be slow crawling them. Now, the combination of what you said, a very large website with millions of landing pages and a very low-performance server, that’s just deadly. Whatever is happening on the website, good or bad, new products being released, the website being improved, new markup being introduced, none of these things matter unless Google crawls the site to see them and reflect upon them. Improved rankings, you’re not going to see that. So essentially, without managing the crawl budget, it almost doesn’t matter what is being done with the website because Google may or may not be able to actually see it over time.
Now, one thing that I don’t think we discussed yet and it’s worth pointing out is when you get a manual action, it can be at the site level, or it could be at a page level, or a part of the site level. You can also get keyword level penalties as well, right?
Manual spam actions can be applied just like you said. It can be at the site level, on the root, and Google actually communicates that very frequently as well. It can be very granular: a subdomain, a directory, or literally an individual landing page, a single HTML document. That’s all possible. There are no penalties, no manual spam actions that say, “We’re not going to let you rank for yellow fluffy bunny,” if that is the term you want to rank for, “and the reason for that is XYZ.” No such penalty is currently in place. I have never seen anything like that being introduced or being communicated by Google.
However, what is happening quite a lot is that websites lose their rankings for the most commercial terms. If blue fluffy bunny is the term we’re going to use for the moment, and there’s been a lot of link building done on blue fluffy bunnies, with that specific anchor including “buy now” or “check here,” it is very obvious from Google’s perspective that the link building is intentional. There is the possibility that this is going to translate to either an algorithmic intervention, if you want, where this gets picked up because the linking sites are of very low quality, or a manual spam action that literally just says, “Yes, you’ve got a penalty for link building.” That’s “unnatural links to your website,” which is the official term. The two are not mutually exclusive. Either way, it’s going to feel just about the same.
In each case, the remedy really is to come back and audit. Again, in this instance, it is of the greatest importance to have a critical volume of data, which is very individual to every website, and the data provided by Google Search Console is very useful in this context. However, it is typically not sufficient to actually do an audit because it’s just not the full picture.
Again, there is Google Search Console, which we’ve mentioned here. There are going to be other great sources of data and other tools. All of these data sources should be combined so an audit can be performed to see which backlinks we actually have. Are these links that we paid for, or not, or that somebody else paid for? Have these been spreading? Has this situation been magnified without our knowledge? Can we disavow those links?
In essence, to answer your question in a nutshell, it is possible that this effect kicks in, but Google will not be saying, “You got a penalty for a particular keyword.” It would be saying, “You’ve got a penalty for those links.”
Right. An SEO is potentially going to say you got a keyword-level algorithmic penalty, but that’s an imprecise way of stating it because it is not an algorithmic penalty. It’s an algorithm update, and it’s felt at the keyword level because there was over-aggressive targeting of a commercial keyword with an overabundance of anchor text. That’s a good summary of what you’re describing, is that right?
Spot on. Yeah, that’s exactly the case. Again, here, it’s not just the experience of working at Google, literally applying those manual spam actions. It is also the experience of working with clients, removing those very same penalties or the very same effect we’re talking about here. It is literally just as you summarized.
All right. You mentioned doing an analysis or audit of the links. I do link audits for clients as well as technical audits, content audits, and so forth. When I do a link audit, I’m looking for unnatural patterns, a lack of diversity in the link profile and the link graph. If there’s an overabundance of low trust or low importance pages linking, if there’s an overabundance of a certain TLD that looks unnatural, anything that looks unnatural in the link profile, I’m going to spot with a link audit. I’m going to look for toxicity levels with tools like Link Research Tools’ Link Detox and report on all that with my analysis and my recommendations. Potentially then, we have to do a link cleanup and disavow process. What else do you want to weigh in with as far as what the link audit process entails for the way you do it or just what should be done as far as you’re concerned?
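To make the pattern checks described above a little more concrete, here is a minimal Python sketch that summarizes a backlink export by anchor text and by top-level domain. It is only an illustration: the input shape (pairs of source URL and anchor text), the `money_terms` list, and the function name are assumptions, the TLD is taken naively as the last label of the hostname, and what counts as an unnatural share is left to the auditor.

```python
from collections import Counter
from urllib.parse import urlparse


def link_profile_summary(backlinks, money_terms):
    """Summarize anchor text and TLD distribution of (source_url, anchor) pairs."""
    anchors = Counter()
    tlds = Counter()
    commercial = 0
    for source_url, anchor in backlinks:
        text = anchor.strip().lower()
        anchors[text] += 1
        host = urlparse(source_url).hostname or ""
        # Naive TLD: last hostname label (does not handle e.g. "co.uk").
        tlds[host.rsplit(".", 1)[-1]] += 1
        if any(term in text for term in money_terms):
            commercial += 1
    total = len(backlinks)
    return {
        "commercial_share": commercial / total if total else 0.0,
        "top_anchors": anchors.most_common(3),
        "tld_counts": dict(tlds),
    }
```

A high share of commercial anchors, or one TLD dominating the counts, does not prove anything by itself, but it points to where a manual review of the linking pages should start.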
Patterns are very important. That’s just what you said. Anything that really strikes, that smacks of intentionally created links, including all the legacy or old school stuff where occasionally the client would be saying, “Yes, but we built those 10 or 15 years ago.” Directory starts asking us in the Netherlands. All of these things need to be included. They are part of the overall picture. I’m not sure if I mentioned that we also look into the quality of the pages that are linking to the client’s site. We also look into where else they link to. Do those pages look like they are selling links? This is, again, something we do factor in.
Overall, I think the distribution and any patterns are the most important factors. I always say, if in doubt, it is probably better to disavow patterns when one isn’t quite sure. For starters, the client is going to sleep better, it is best practice, and also the disavow tool does not translate to any loss of the traffic coming through those links.
This is something that is very important to remember because it is literally just a tool that tells Google, “We disassociate ourselves from this website,” which is, by the way, a recommendation to Google. It is not a directive. Google may or may not abide by that recommendation. They wouldn’t penalize a website that has disavowed suspicious links anymore. They can’t do that, that’s part of their policy.
However, if there is actually converting traffic coming from those undesirable, spammy, or maybe otherwise legitimate but a little bit suspicious backlinks, that traffic will continue to be forthcoming because the disavow file has no impact on it whatsoever. This is something that is very important. In my experience, it opens up a conversation and leads towards a better consensus around what needs to be disavowed and how much.
Yeah. I’m sure you’ve come across clients and I have, where they’ve over disavowed and it’s actually taking good links and throwing them out with the bathwater. They will lose rankings because those links that are valuable are being devalued.
Like you just said, I’ve come across these cases. We’ve literally come across clients that had been disavowing, in their respective disavow files, the rest of their own portfolio as part of being very cautious.
Their own sites that are linking.
Their own websites, and that precaution was obviously backfiring massively and unnecessarily. But that’s part of sharing best practice and educating. This is a great opportunity to highlight to the client and to the site owner that this literally doesn’t need to be done. It is absolutely legitimate to cross-link one’s own portfolio. If in doubt, one should make sure that it’s obvious those sites are parts of the same portfolio and are connected to each other, rather than try to hide it. Cross-links between them are something that can even be used as a positive signal.
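The disavow discussion above can be sketched in a few lines of Python. This is a hedged example, not a recommended workflow: it turns a list of suspicious backlink URLs into the plain-text format the disavow tool accepts (one `domain:` entry or URL per line, with `#` marking a comment) while skipping the site owner's own portfolio, which is the mistake described above. The function name, its parameters, and the example domains are hypothetical, and the choice to disavow whole domains rather than individual URLs is an assumption.

```python
from urllib.parse import urlparse


def build_disavow_file(suspicious_urls, own_domains, note=""):
    """Build disavow file text, never listing the owner's own domains."""
    own = {domain.lower() for domain in own_domains}
    domains = set()
    for url in suspicious_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Skip unparseable URLs and anything in the owner's own portfolio,
        # including its subdomains.
        if not host or any(host == d or host.endswith("." + d) for d in own):
            continue
        domains.add(host)
    lines = [f"# {note}"] if note else []
    lines += [f"domain:{domain}" for domain in sorted(domains)]
    return "\n".join(lines) + "\n"


print(build_disavow_file(
    ["http://www.spam-directory.example/links?id=1",
     "https://blog.my-shop.example/post"],
    own_domains=["my-shop.example"],
    note="link audit",
))
```

Keeping the list of own domains next to the audit data makes the safeguard explicit, so a cautious cleanup cannot quietly disavow the rest of the portfolio.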
Cool. I know we’re out of time here. Could you point our listeners to one or two helpful resources in regards to penalties, what to look out for, and what to do to mitigate and ameliorate the situation if you do have a penalty? Do you have some checklist, worksheet, or article that you want to send our listeners to? And then also your website, we should mention that in the episode as well.
Absolutely. Happy to do so. Because the topic has come up so frequently, I had the opportunity to submit evergreen content, written specifically with this topic in mind, to Search Engine Land. That article specifically describes the types of penalties, the impact they have, and what to do about it. That’s part of the evergreen content on Search Engine Land. I’ll be happy to share that as well.
I also penned a more recent article, about a month ago, on manual actions, penalties, and how they differ from algorithms. Again, this is on Search Engine Land. Again, this is something I’m very happy to share. I’m also very happy to receive the listeners’ feedback if there are questions.
Anyone wishing to get in touch to talk about SEO, Google, you name it, we are easy to find on searchbrothers.com. Also, you can find me on Twitter @kas_tweets, where I’ll again be happy to share these once the show has aired, I suppose.
Yeah. All those links, we’ll include in the show notes. If you’re driving or you’re working out at the gym, don’t worry, we got you covered. Just go to marketingspeak.com/205 and you will be able to get directly to the show notes for this episode.
Thank you so much, Kaspar. This was a really great, informative interview. Our listeners are going to be much more empowered now to deal with penalties or to even just future-proof their sites against future penalties that could come to pass if they hadn’t listened to this episode. Thank you so much.
Stephan, thank you so much for having me. It’s been a real pleasure. I really enjoyed our conversation. If there are any follow-up questions from the audience, maybe we will have another opportunity to talk, at a conference, of course, I’m thinking, or maybe in another recording on a different topic, if that’s what you wish to be doing. I’ll be happy to contribute again. Thanks again for having me, a real pleasure.
- Kaspar Szymanski
- @kas_tweets – Twitter
- Google penalties and messages explained — Search Engine Land’s ultimate guide
- Google algorithms are not Google penalties
- Google Quality Rater Guidelines (PDF)
- There’s no shortcut to authority: Why you need to take E-A-T seriously
- Google Maps
- John Mueller
- @JohnMu – Twitter
- Matt Cutts
- Danny Sullivan
- SMX conference
- Search Engine Land
- Chris Sherman
- Google Webmaster Central Blog
- Google Search Console
- Fili Wiese
- Link Research Tools
Your Checklist of Actions to Take
Determine whether the drastic change in my site ranking is because of a penalty or a change in the Google algorithm. Examples of popular algorithm changes are Panda and Penguin. To find out which is which, let this article be my guide.
I will become familiar with Google’s manual raters’ guidelines and ensure I use them as rules for best practice on my site.
I will create an effective and convenient web design that enables visitors to browse efficiently.
Get reviews and testimonials from my clients to improve my authority in my niche.
Get rid of spammy content so that Googlebot will find my site trustworthy.
Clean out unnatural backlinks as they can harm the overall ranking of my website.
Avoid cloaking at all costs. This is considered black hat SEO and could result in a Google penalty.
Check for duplicate content within my pages and on other websites that may have plagiarized my content. Make sure that everything I publish is relevant and unique.
Consider the acronym EAT when developing my SEO strategy. It stands for expertise, authoritativeness, and trustworthiness.
Check out Kaspar’s site searchbrothers.com for more info on how to prevent a Google penalty and advice for sites that have already been hit by one.
About Kaspar Szymanski
Kaspar Szymanski is an information scientist with a passion for content and brand building, a former web spam fighter, a manual spam action expert, an SEO consultant specializing in backlink analysis, reconsideration requests and site recovery, an accomplished writer and conference speaker, and an aviator and marathon runner in his spare time.