Google has been quietly delisting articles from its search results because of a bug, and a journalist only found out by complete accident. The flaw has already been exploited maliciously, but Google says the problem has now been addressed.

This SEO trick lets someone delist specific web pages from the search engine using Google’s Refresh Outdated Content tool. The tool is meant for requesting that pages be recrawled and relisted after an update. The whole issue came down to capitalizing different letters in the URL submitted to this tool, which ultimately caused the delisting.

Basically, when a link was submitted with altered capitalization, Google’s system got confused by the change. It concluded that the page no longer existed and then de-indexed all variations of that URL, even the legitimate one.
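Google has not published how its index handles this internally, so the following is only a toy model of the behavior described above, not Google’s implementation. It assumes a case-sensitive origin server and an index that groups URLs by their lowercased form; the `ToyIndex` class and the `example.org` URL are hypothetical.

```python
class ToyIndex:
    """Toy search index that (wrongly) treats case variants of a URL as one entry."""

    def __init__(self, live_urls):
        # Assumption: the origin server is case-sensitive, so only these
        # exact URLs return 200. Everything else is a 404.
        self.live_urls = set(live_urls)
        self.indexed = set(live_urls)

    def fetch_status(self, url):
        return 200 if url in self.live_urls else 404

    def refresh(self, url):
        """Recrawl a URL; on 404, drop every indexed URL matching it case-insensitively."""
        status = self.fetch_status(url)
        if status == 404:
            key = url.lower()
            self.indexed = {u for u in self.indexed if u.lower() != key}
        return status


index = ToyIndex(["https://example.org/anatomy-of-a-story"])

# Submitting a variant with one capitalized letter returns 404 on the
# origin server, yet the live lowercase URL is removed from the index too.
status = index.refresh("https://example.org/Anatomy-of-a-story")
print(status, index.indexed)
```

In this sketch the request for the capitalized variant returns 404 and leaves the index empty, even though the real article is still live at its original URL.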

Journalist Jack Poulson stumbled upon this problem when he was Googling for one of his own articles, and it just wouldn’t show up, even when he typed in the exact title. It turns out that in 2023, Poulson published an article about tech CEO Delwin Maurice Blackman’s 2021 arrest on a felony domestic violence charge. After Poulson published Blackman’s arrest records, the CEO tried to suppress the story in a number of ways, including lawsuits and DMCA takedown requests.

Ahmed Zidan, the Deputy Director of Audience for the Freedom of the Press Foundation, also had an article about Poulson’s fight against censorship delisted from Google. When Poulson noticed his articles were gone, he alerted Zidan, who did some digging and figured out what was happening. Zidan looked in the Google Search Console (GSC), which is where website owners can monitor and manage their site’s presence in search results, and he found repeated requests to recrawl the article about Poulson and Blackman.

What’s interesting is that apparently nobody was notified. Whenever I ask the GSC to index a page or do something similar, I get emails with the request and the result. It’s unconfirmed whether website owners were emailed when these articles were being recrawled, but they likely are now.

These requests started in May and ended in June, and in every single one, the capitalization of letters in the URL had been changed. For example, one request capitalized the ‘A’ in “anatomy,” and after that expired, another request capitalized the ‘n’ in “anatomy.” When Google tried to index these URLs with messed-up capitalization, it got a 404 error, and then, instead of just indexing the 404 page, Google would de-index all variations, including the live, legitimate article.
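For site owners who want to check their own logs for this pattern, here is a minimal sketch. It assumes you have already parsed your access log into `(path, status)` pairs and that your server is case-sensitive; the function name and the sample paths are hypothetical and not taken from the actual incident.

```python
from collections import defaultdict


def find_case_variant_404s(known_paths, requests):
    """Map each known path to the 404 requests that differ from it only by letter case."""
    by_lower = {p.lower(): p for p in known_paths}
    hits = defaultdict(list)
    for path, status in requests:
        canonical = by_lower.get(path.lower())
        # Flag only 404s for a path that matches a real article once case is ignored.
        if status == 404 and canonical is not None and path != canonical:
            hits[canonical].append(path)
    return dict(hits)


known = ["/articles/anatomy-of-a-takedown"]  # hypothetical article path
log = [
    ("/articles/anatomy-of-a-takedown", 200),
    ("/articles/Anatomy-of-a-takedown", 404),
    ("/articles/aNatomy-of-a-takedown", 404),
    ("/articles/unrelated", 404),
]

suspicious = find_case_variant_404s(known, log)
for canonical, variants in suspicious.items():
    print(canonical, variants)
```

A cluster of 404s on case variants of a single live article is not proof of abuse on its own, but it is a cheap signal worth checking against what the GSC shows.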

Luckily, since Google has fixed this vulnerability, it shouldn’t affect anyone anymore. However, there’s no guarantee that every affected page was restored, only the ones that were found.