How to Spell Check a Website at Scale

Checking a single document for spelling and grammar errors can be quite difficult, but imagine checking a massive website with thousands of pages.

To be sure, spelling and grammar tools are ubiquitous for individual documents. However, the availability of tools is no guarantee of perfection.

Mistakes made

I submitted this article to Google Docs’ built-in spelling and grammar checker (Command-Option-X on a Mac) and to Grammarly. Still, the editor will undoubtedly have plenty of opportunity for corrections and amendments. Likely suspects include articles (a, an, the), word endings, and typos.

Grammarly finds a spelling mistake in the proper name Screaming Frog.

Now imagine the same task on a large scale.

Here is a scenario. You just bought a blog with 17,000 posts describing do-it-yourself products. The idea was to use the blog to drive traffic to your online craft supply store. But you noticed that the previous owners had left many grammatical and spelling errors.

You don’t like the idea of checking 17,000 items individually. So what do you do?

Here are some options.

Not too technical

If your technical skills are limited to using off-the-shelf software, there are a few options for spell-checking an entire site, including the 17,000-post DIY blog discussed above.

Screaming Frog. Screaming Frog SEO Spider is an essential search engine optimization and keyword research tool. It will also check the spelling of an entire website.

The company has a detailed tutorial on the implementation of spelling and grammar analyses. Enable spelling and grammar checking, and like magic, SEO Spider will identify and report errors. You can also export a list of pages to update. SEO Spider also supports multiple languages.

This is a premium feature requiring the licensed version, which at the time of writing was £149.00 per year (around $195.95).

Screenshot of the Screaming Frog spellcheck page

Screaming Frog makes it easy to add spelling and grammar checking to crawls.

SortSite. SortSite by PowerMapper is a go-to tool for broken link monitoring and website accessibility testing. The tool also checks spelling, finding misspelled words and placeholders such as “lorem ipsum”. And, when configured, it can recognize unusual words or names.

A perpetual license for the desktop version of SortSite was $149 at the time of writing.

Screenshot of the SortSite homepage

SortSite is a powerful tool that also includes a good spellcheck.

Various online tools. A quick Google search yields many free online spell checkers. For example, Internet Marketing Ninjas offers a free spell checker for up to 1,000 pages. But the tool has a limited dictionary. It doesn’t recognize “podcast”, for example.

Technical

There are more options for full-featured spelling and grammar checkers through an application programming interface or command-line software. Both require more work to set up than SEO Spider or SortSite, but they can offer a more robust review.

Plus, it might be worth it for a blog with 17,000 posts.

In each case, you would pass to the API (or Aspell, below) the text of each page. This could be from a database connection, an export, or a web crawler. The API would then return a list of spelling and grammar errors.
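Whichever source you use, the checker needs plain text rather than markup. A minimal sketch of that step, using only Python's standard library, might look like the following. The names TextExtractor and html_to_text are made up for illustration, and a real site may need extra rules (for example, skipping navigation or footer text).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return " ".join(self._parts)


def html_to_text(html):
    """Return the visible text of an HTML page as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling html_to_text on the HTML of each post yields a string that can be handed to an API or to a command-line checker.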

Bing Spell Check API. Search engines such as Microsoft Bing need to understand the spelling and grammar of Internet users.

The Bing Spell Check API is driven by machine learning and goes beyond matching words in a dictionary. It is one of the best choices in terms of the quality of the results.

But it has limits. In “proof” mode, the API will only allow text strings of 4,096 characters or fewer. That’s roughly 800 words. Longer items should be split and sent as several “transactions”.
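Splitting a long post is straightforward to script. The sketch below breaks text on whitespace so that no chunk exceeds the 4,096-character limit mentioned above. The function name split_text is hypothetical, and the code does not call the API itself; each returned chunk would be sent as its own transaction.

```python
def split_text(text, max_chars=4096):
    """Split text into chunks of at most max_chars characters.

    Chunks break on whitespace, so words stay intact unless a single
    word is longer than max_chars, in which case it is hard-split.
    """
    chunks = []
    current = ""
    for word in text.split():
        # Hard-split any single word that cannot fit in one chunk.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this word would exceed the limit: start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Note that runs of whitespace are collapsed to single spaces, which is harmless for spell checking but means character offsets in the results refer to the chunk, not the original post.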

Pricing is tier-based. In March 2022, one would expect to pay $7 for 25,000 monthly transactions.

WProofreader SDK. Using the WebSpellChecker software development kit is analogous to deploying a jackhammer to drive in a nail, but it will definitely do the trick.

The SDK has components for adding spelling and grammar checks to apps, but for this use case it also has a standalone HTTP API. Rates vary by usage.

Other APIs. Other API options beyond Bing and WebSpellChecker include GrammarBot, TextGears, and Perfect Tense.

GNU Aspell. This command-line spell checker is free and often already installed on Linux systems (Linux runs most websites).

Using Aspell will still require some coding, but relatively less than the other technical solutions above. Get the web pages in a text format, then write a script to call Aspell for each file.
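Such a script could be as short as the sketch below. It assumes Aspell and an English dictionary are installed, and that the pages have been saved as UTF-8 .txt files in one directory. The function names and the directory layout are illustrative assumptions. The script relies on Aspell's list command, which reads text from standard input and prints each word it does not recognize on its own line.

```python
import subprocess
from pathlib import Path


def parse_aspell_output(output):
    """Turn the newline-separated output of `aspell list` into sorted unique words."""
    return sorted({line.strip() for line in output.splitlines() if line.strip()})


def misspelled_words(text, lang="en"):
    """Pipe text to `aspell list` and return the words Aspell does not recognize."""
    result = subprocess.run(
        ["aspell", "--lang=" + lang, "list"],
        input=text,
        capture_output=True,
        encoding="utf-8",
        check=True,  # raise an error if Aspell exits with a failure code
    )
    return parse_aspell_output(result.stdout)


def check_directory(directory, lang="en"):
    """Map each .txt file name in a directory to its list of suspect words."""
    report = {}
    for path in sorted(Path(directory).glob("*.txt")):
        words = misspelled_words(path.read_text(encoding="utf-8"), lang)
        if words:
            report[path.name] = words
    return report
```

Running check_directory("exported_posts") (a hypothetical folder name) would return a dictionary of file names and flagged words. Aspell checks spelling only, not grammar, and product or brand names will likely be flagged until they are added to a personal word list.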

Sherry J. Basler