About The Word Lists

Litscape.com Default Word List

The words we use.
The best word list ever, anywhere.

The Litscape.com default word list is the best word list ever, anywhere, because it is made up of the words we use. The goal was to get the best possible word list for our word finder tool to operate on. Many word lists are available on the internet from various sources. These lists vary in the number of words and in the quality of the words. Presumably these lists have been reviewed by many people over time, but my confidence level varies from list to list. Many words are so questionable that I can't see anyone using them in a written context. Marker words (strings of letters that look like words but aren't words) have also been inserted by various groups in order to identify derivative lists.

In producing the Litscape.com default word list, we made the basic assumption that a word has higher credibility if it is used in a written context, in multiple and unrelated sources, and in reasonable frequencies. Because it was not practical to write sentences for every word in all of these lists, over half a million words collectively, we asked the following questions:

  • Has anyone used these words in a sentence?
  • Does the word appear in a written context anywhere?
  • How many sources does it show up in and how many times is it used?

We studied the text of thousands of public domain books and millions of webpages. The computer performs beautifully here, as it is very tedious and difficult for a human to gather and process this amount of data. While we can't study all sources, we can study a lot of sources and get a pretty good idea of what the real words are. By applying our proprietary algorithms to the data collected from thousands of book sources and millions of web sources, we were able to establish the Litscape.com default word list, a word list with approximately 221,719 words.
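The algorithms themselves are proprietary and are not described on this page. Purely as an illustration of the idea that a word gains credibility when it shows up in multiple unrelated sources and in reasonable numbers, here is a minimal Python sketch. The function name `credible_words`, the thresholds `min_sources` and `min_total`, and the letters-only tokenizer are all assumptions made for this example, not the actual method.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters; real text needs more care.
WORD_RE = re.compile(r"[a-z]+")


def credible_words(sources, min_sources=2, min_total=3):
    """Return words found in at least `min_sources` sources and used at
    least `min_total` times overall. Both thresholds are hypothetical."""
    total = Counter()         # how many times each word is used overall
    source_count = Counter()  # how many distinct sources contain each word
    for text in sources:
        words = WORD_RE.findall(text.lower())
        total.update(words)
        source_count.update(set(words))  # count each source once per word
    return {
        w for w in total
        if source_count[w] >= min_sources and total[w] >= min_total
    }


books = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "a cat and a qzx",
]
print(sorted(credible_words(books)))  # ['cat', 'the']
```

In this toy input, "qzx" appears once in a single source and drops out, which is roughly how a one-off OCR error or marker word would be expected to sink under any scheme of this kind.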

Books that fall into the public domain are typically older books, and have likely undergone the scrutiny of an editor. Getting these books into a textual format involves a process known as Optical Character Recognition (OCR). This basically tries to identify each character on an image of a book page, and output a text-based representation of the book. This technology is not perfect and book printing is not perfect. The quality of OCR varies from source to source. Errors happen, but fortunately many words are transcribed correctly from the image form. Words resulting from OCR errors rank very low in our algorithms and essentially vanish. Public domain books are typically old books, and the state of things in medicine, science, and technology has changed dramatically over the years. A lot more words, technical words, are in play today than were used before 1922. The books gave us a pretty good set of words, though somewhat limited. The words from public domain books were only part of the equation. We needed another source.

According to NetCraft.com, in March 2014, there were 919,533,715 websites in the world. There are a lot of sites to choose from, on any topic you can imagine, and that adds up to a lot of sources and a lot of words, and not just old words. Not all websites are created equal, and there are many spelling errors, intentional or not, and other nonwords, but that is all right. Some sites try to stuff keywords into their pages in an attempt to get search engine traffic. We can ignore pages that have words used in unreasonable numbers. We extracted and gathered words from millions of webpages just like we did for the books. Over millions of pages, the cream of the word crop floats to the top, and the nonwords stay at the bottom.
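One simple way to picture "words used in unreasonable numbers" is to flag a page when a single word accounts for too large a share of all the words on it. The sketch below is only an illustration under that assumption. The function name `looks_stuffed` and the cutoffs `max_share` and `min_words` are hypothetical, and the page does not say how stuffed pages are actually detected.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z]+")  # assumption: words are runs of ASCII letters


def looks_stuffed(page_text, max_share=0.2, min_words=20):
    """Return True when one word makes up more than `max_share` of the page.
    Pages shorter than `min_words` words are never flagged."""
    words = WORD_RE.findall(page_text.lower())
    if not words or len(words) < min_words:
        return False  # too little text to judge
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) > max_share


normal = ("we extracted and gathered words from millions of webpages just "
          "like we did for the books and kept the ones used in reasonable numbers")
stuffed = "cheap shoes " * 30

print(looks_stuffed(normal))   # False
print(looks_stuffed(stuffed))  # True
```

A fixed share is a blunt instrument, since very common words such as "the" naturally take up a few percent of ordinary text, so any real cutoff would need tuning.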

The Litscape.com default word list is not a static word list. It is not a derivative list. It is an ever-evolving list, with changes happening from the study of additional sources, user submissions, manual reviews, algorithm tinkering, and the ever-changing language. If you can say it, you can write it. A word is a word if you can use it in a sentence. The Litscape.com default word list, in my opinion, is the best word list ever, anywhere, because it is made up of words we use.


The Scrabbleable Word List at Litscape.com

The scrabbleable word list at Litscape.com is a subset of words in the default word list that are scrabbleable. It has to be possible to make the words using the letters in Scrabble®, staying within the bounds of the letter frequency distribution, including blanks, and the words need to fit on the board. Because the Scrabble® board is 15 squares wide, you could potentially extend a word to that length, so the length of the words in this scrabbleable word list is limited to between 2 and 15 letters. Note that we have no official Scrabble® word lists, so use your own judgement.
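The test described above can be sketched in a few lines of Python: count the letters a word needs, see how many fall outside the available tiles, and check that the blanks can make up the difference. The tile counts below are an assumption (the standard English-language set of 98 letter tiles plus 2 blanks), and the function name `is_scrabbleable` is hypothetical. This is an illustration, not the code used to build the list.

```python
from collections import Counter

# Assumption: standard English-language tile counts (98 letters, 2 blanks).
SCRABBLE_TILES = {
    "a": 9, "b": 2, "c": 2, "d": 4, "e": 12, "f": 2, "g": 3, "h": 2, "i": 9,
    "j": 1, "k": 1, "l": 4, "m": 2, "n": 6, "o": 8, "p": 2, "q": 1, "r": 6,
    "s": 4, "t": 6, "u": 4, "v": 2, "w": 2, "x": 1, "y": 2, "z": 1,
}
BLANKS = 2
BOARD_WIDTH = 15


def is_scrabbleable(word, tiles=SCRABBLE_TILES, blanks=BLANKS,
                    max_len=BOARD_WIDTH):
    """Return True if `word` fits on the board and can be built from the
    tile set, using blanks for any letters that run out."""
    word = word.lower()
    if not word.isalpha() or not 2 <= len(word) <= max_len:
        return False
    needed = Counter(word)
    # Letters required beyond the available tiles must be covered by blanks.
    shortfall = sum(max(0, n - tiles.get(ch, 0)) for ch, n in needed.items())
    return shortfall <= blanks


print(is_scrabbleable("jazz"))     # True: one Z tile plus one blank
print(is_scrabbleable("pizzazz"))  # False: four Zs need three blanks
print(is_scrabbleable("a"))        # False: shorter than 2 letters
```

Passing a different `tiles` dictionary would apply the same check to another game's letter distribution.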


The Friendable Word List at Litscape.com

The friendable word list at Litscape.com is a subset of words in the default word list that are friendable. This means that they can be made using the set of Words With Friends™ letters staying within the bounds of the letter frequency distribution (blanks included), and they have to fit on the Words With Friends™ board. Because the Words With Friends™ board is 15 squares wide and you could potentially extend a word to that length, the length of the words in this friendable word list is limited to between 2 and 15 letters. Note that we have no official word lists for the game, so use your own judgement.


What is the significance of STRESSLESSNESS?

STRESSLESSNESS is the only word that we know of that you can make in one game but not the other.

Stresslessness requires 7 S tiles. Both games have two blanks. Words With Friends™ has 5 S tiles, so its 5 S tiles plus 2 blanks cover all 7. Scrabble® has only 4 S tiles, so its 4 S tiles plus 2 blanks cover only 6, one short. If you look at the word counts in the lists for each game, they differ by only one word, and this is why. All 209,203 words in the Scrabbleable list can also be made in Words With Friends™.
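The arithmetic can be checked with a short Python sketch that uses only the numbers given above (4 or 5 S tiles, 2 blanks). The helper name `can_cover` is hypothetical, and it looks at a single letter only, on the assumption that the other letters in the word are not in short supply.

```python
def can_cover(word, letter, tiles_of_letter, blanks):
    """Return True if the copies of `letter` in `word` can be supplied by
    that letter's tiles plus the blanks."""
    needed = word.upper().count(letter.upper())
    return needed <= tiles_of_letter + blanks


word = "STRESSLESSNESS"
print(word.count("S"))                                    # 7
print(can_cover(word, "S", tiles_of_letter=4, blanks=2))  # Scrabble: False
print(can_cover(word, "S", tiles_of_letter=5, blanks=2))  # Words With Friends: True
```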


Word Censorship at Litscape.com

How many offensive words can you think of?

Litscape.com has scoured the internet for lists of offensive words, and compiled them into one big list. The end result was approximately 2500 offensive words. This list is truly the cesspool of the English language, and because of this, we will not make it available. It certainly isn't appropriate for viewing by children or those easily offended. If any of these words occur in the censored dictionaries, they are removed. The uncensored lists still have some of them, and because of this, the uncensored lists are not suitable for all audiences.
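Removing the offensive words from a dictionary amounts to filtering one list against another. The following Python sketch shows the general idea only. The function name `censor` and the placeholder words are hypothetical, and the case-insensitive matching is an assumption.

```python
def censor(word_list, offensive_words):
    """Return the words from `word_list` that are not in `offensive_words`,
    comparing case-insensitively and keeping the original order."""
    blocked = {w.lower() for w in offensive_words}
    return [w for w in word_list if w.lower() not in blocked]


uncensored = ["apple", "badword", "cherry"]  # placeholder entries
offensive = ["BADWORD"]                      # placeholder entry
print(censor(uncensored, offensive))  # ['apple', 'cherry']
```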

Our rule of thumb is that if little Cindy-Lou Who were to ask her mother what the word means, and if Mom couldn't come up with a respectable answer, in our judgment, then the word remains censored. If it is a proper anatomical term, the word is deemed suitable.

The word lists are best effort, with no guarantee of suitability.

Suggest a word, Censor a word, Remove a word.

Ideally, our censored word lists are meant to be suitable for all audiences, free from profanities, racial slurs, and other offensive words. Be forewarned: we have a very small staff and, despite our best efforts, words may have been missed. If you spot any nasty words in our censored lists, please help with the effort and submit inappropriate words using the tool at the above link.

We are forever striving to increase our word counts and to provide better search results. If you notice a word missing that you think should be displaying (English language only), please submit this word to us also. Likewise, if you see a word that shouldn't be there because it is not a word, let us know and we will check it.

To ensure the quality of results, all submissions are manually reviewed. Thank you for your cooperation in this endeavor.


Baby Names List

The names in the Litscape.com names list were obtained mostly from information made available by government agencies worldwide. This included census data, social security data, birth records, death records, etc. They are first names (a.k.a. given names, forenames, personal names, Christian names) and middle names of individuals. Names flagged by our English language profanity filters were removed from the lists (best effort, no guarantee). There has been no profanity filtering in any other language, so always research a name before you name your child.

The name tools provide suggestions for names based only on letter sequences and combinations. Maybe you can find a name by merging the names or initials of the mother and the father. If you find a name you like, be sure to research it further.

We commonly use the terms girls' names and boys' names. Many names have been used for both genders. We can think of names as predominantly male or female, but the terms exclusively male or female do not seem to apply. Johnny Cash sang about A Boy Named Sue. It is not that uncommon. However, naming a baby of one gender with a name predominantly used for the other gender needs careful consideration. People are people and ridicule happens, so bear this in mind when naming your baby.


Summary of the Word Lists at Litscape.com

| Word List | Word Count | Word Lengths | Word Source |
|---|---|---|---|
| Litscape.com Censored | 221,719 words | 2 - 52 letters | Words we use. Censored for offensive language. |
| Enable Censored | 170,695 words | 2 - 28 letters | Enable public domain word list. Censored for offensive language. |
| Enable Uncensored | 171,298 words | 2 - 28 letters | Enable public domain word list. Uncensored, contains bad words. |
| Mammoth Censored | 343,463 words | 2 - 52 letters | All in one list: Litscape.com Default Word List, Enable Word List, Sowpods Word List. Censored for offensive language. |
| Mammoth Uncensored | 344,306 words | 2 - 52 letters | All in one list: Litscape.com Default Word List, Enable Word List, Sowpods Word List. Uncensored, contains bad words. |
| Scrabbleable Censored | 209,203 words | 2 - 15 letters | Words we use that are scrabbleable. A subset of words from the Litscape.com default word list. Censored for offensive language. |
| Scrabbleable Uncensored | 209,695 words | 2 - 15 letters | Words we use that are scrabbleable. A subset of words from the Litscape.com default word list. Uncensored, contains bad words. |
| Friendable Censored | 209,204 words | 2 - 15 letters | Words we use that are friendable. A subset of words from the Litscape.com default word list. Censored for offensive language. |
| Friendable Uncensored | 209,696 words | 2 - 15 letters | Words we use that are friendable. A subset of words from the Litscape.com default word list. Uncensored, contains bad words. |
| Names List | 144,359 names | 2 - 24 letters | Extracted mostly from government sources all over the world. Censored for offensive English words. No foreign language censoring has been done, so always research a name before you give it to your child. Best effort, no guarantees. |

Baby Naming Tips and Name Finder Search Tools