About The Word Lists

Litscape.com Default Word List

The best word list ever, anywhere.

The Litscape.com default word list is the best word list ever, anywhere, because it is made up of the words we use. The goal was to get the best possible word list for our word finder tool to operate on. There are many word lists on the internet at various sources. These lists vary in the number of words and in the quality of words. Presumably these lists have been reviewed by many people over time, but my confidence level varies from list to list. There are many words that are so questionable that I can't see anyone using them in a written context. Marker words (strings of letters that look like words but aren't words) have been inserted by various groups in order to identify derivative lists.

In producing the Litscape.com default word list, we made the basic assumption that a word has higher credibility if it is used in a written context, in multiple and unrelated sources, and in reasonable frequencies. Because it was not practical to write sentences for every word in all of these lists, over half a million words collectively, we asked the following questions:

We studied the text of thousands of public domain books and millions of webpages. The computer performs beautifully here, as it is very tedious and difficult for a human to gather and process this amount of data. While we can't study all sources, we can study a lot of sources and get a pretty good idea of what real words are. By applying our proprietary algorithms to the data collected from thousands of book sources and millions of web sources, we were able to establish the Litscape.com default word list, a word list with approximately 138,374 words.
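The Litscape.com algorithms are proprietary and are not described here, so the following is only a minimal sketch of the general idea stated above: a word gains credibility when it appears in several unrelated sources and often enough overall. The function name `credible_words`, the thresholds `min_sources` and `min_total`, and the simple letters-only tokenizer are all assumptions made for illustration.

```python
import re
from collections import Counter


def credible_words(sources, min_sources=2, min_total=3):
    """Return words seen in enough distinct sources and often enough overall.

    `sources` maps a source name (a book, a webpage) to its text.
    The thresholds are illustrative guesses, not Litscape.com's values.
    """
    source_count = Counter()  # number of distinct sources containing each word
    total_count = Counter()   # total occurrences across all sources
    for text in sources.values():
        # Simplistic tokenizer: runs of ASCII letters only, so hyphens
        # and apostrophes split words.
        words = re.findall(r"[a-z]+", text.lower())
        total_count.update(words)
        source_count.update(set(words))
    return sorted(
        word
        for word in total_count
        if source_count[word] >= min_sources and total_count[word] >= min_total
    )
```

In this sketch a string produced by an OCR error or a typo in a single source never reaches the `min_sources` threshold, so it drops out, which mirrors the way low-ranking words are said to vanish.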

Books that fall into the public domain are typically older books, and have likely undergone the scrutiny of an editor. Getting these books into a textual format involves a process known as Optical Character Recognition (OCR). This basically tries to identify each character on an image of a book page, and output a text-based representation of the book. This technology is not perfect and book printing is not perfect. The quality of OCR varies from source to source. Errors happen, but fortunately many words are transcribed correctly from the image form. Words resulting from OCR errors rank very low in our algorithms and essentially vanish. Public domain books are typically old books, and the state of things in medicine, science, and technology has changed dramatically over the years. A lot more words, technical words, are in play today than were used before 1922. The books gave us a pretty good set of words, though somewhat limited. The words from public domain books were only part of the equation. We needed another source.

According to NetCraft.com, in March 2014, there were 919,533,715 websites in the world. There are a lot of sites to choose from, on any topic you can imagine, and that adds up to a lot of sources and a lot of words, and not just old words. Not all websites are created equal, and there are many spelling errors, intentional or not, and other nonwords, but that is all right. Some sites try to stuff keywords into their pages in an attempt to get search engine traffic. We can ignore pages that have words used in unreasonable numbers. We extracted and gathered words from millions of webpages just like we did for the books. Over millions of pages, the cream of the word crop floats to the top, and the nonwords stay at the bottom.
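One possible way to ignore pages that use words in unreasonable numbers is to check how much of a page is taken up by its single most frequent word. This is a rough sketch, not the actual Litscape.com filter: the function name `is_stuffed` and the cutoffs `max_share` and `min_words` are hypothetical.

```python
import re
from collections import Counter


def is_stuffed(page_text, max_share=0.2, min_words=20):
    """Guess whether a page repeats one word unreasonably often.

    Returns True when the most frequent word makes up more than
    `max_share` of all words. Pages shorter than `min_words` are not
    flagged, because a share computed on a few words means little.
    Both cutoffs are assumptions chosen for illustration.
    """
    words = re.findall(r"[a-z]+", page_text.lower())
    if len(words) < min_words:
        return False
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) > max_share
```

A page flagged this way would simply be left out before its words are counted alongside the other web sources.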

The Litscape.com default word list is not a static word list. It is not a derivative list. It is an ever-evolving list, with changes happening from further study of other sources, user submissions, manual reviews, algorithm tinkering, and the ever-changing language. If you can say it, you can write it. A word is a word if you can use it in a sentence. The Litscape.com default word list, in my opinion, is the best word list ever, anywhere, because it is made up of words we use.

The Scrabbleable Word List at Litscape.com

The scrabbleable word list at Litscape.com is a subset of words in the default word list that are scrabbleable (i.e., they can be made using the set of Scrabble® letters and they fit on the Scrabble® board). Because the Scrabble® board is 15 squares wide and you could potentially extend a word to that length, the length of the words in this scrabbleable word list is limited to between 2 and 15 letters. Also, you have to be able to make the word using the letters in a Scrabble® game, staying within the bounds of the letter frequency distribution, including blanks. Note that this word list is not consistent with Scrabble® rules, in terms of the use of compound words, proper nouns and so on. Use your own judgment.
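The two conditions above (length between 2 and 15 letters, and letters available in the tile set with blanks filling any shortfall) can be sketched as follows. This assumes the standard English-language tile distribution of 98 letter tiles plus 2 blanks. The function name `is_scrabbleable` is hypothetical, and this is not necessarily how Litscape.com implements the check.

```python
from collections import Counter

# Standard English tile counts (98 letter tiles); assumed here.
TILES = Counter({
    "a": 9, "b": 2, "c": 2, "d": 4, "e": 12, "f": 2, "g": 3, "h": 2,
    "i": 9, "j": 1, "k": 1, "l": 4, "m": 2, "n": 6, "o": 8, "p": 2,
    "q": 1, "r": 6, "s": 4, "t": 6, "u": 4, "v": 2, "w": 2, "x": 1,
    "y": 2, "z": 1,
})
BLANKS = 2        # blank tiles can stand in for any letter
BOARD_WIDTH = 15  # longest word that fits on the board


def is_scrabbleable(word):
    """Check length and tile availability only, not Scrabble word rules."""
    word = word.lower()
    if not 2 <= len(word) <= BOARD_WIDTH:
        return False
    if any(letter not in TILES for letter in word):
        return False  # hyphens, apostrophes, accented letters, digits
    # Count how many letters exceed the available tiles; blanks cover those.
    shortfall = sum(
        max(0, needed - TILES[letter])
        for letter, needed in Counter(word).items()
    )
    return shortfall <= BLANKS
```

For example, "jazz" needs two Zs but only one Z tile exists, so one blank covers the gap and the word passes. "pizzazz" needs four Zs, which would take three blanks, so it fails.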

Word Censorship at Litscape.com

How many offensive words can you think of?

Litscape.com has scoured the internet for lists of offensive words, and compiled them into one big list. The end result was approximately 2500 offensive words. This list is truly the cesspool of the English language, and because of this, we will not make it available. It certainly isn't appropriate for viewing by children or those easily offended. If any of these words occur in the censored dictionaries, they are removed. The uncensored lists still have some of them, and because of this, the uncensored lists are not suitable for all audiences.
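Removing listed words from a dictionary amounts to a set-membership filter. The sketch below illustrates that step only. The function name `censor` is hypothetical, and the case-insensitive matching is an assumption, since the source does not say how words are compared.

```python
def censor(words, offensive):
    """Return `words` without any entry found in `offensive`.

    Matching ignores case (an assumption); order of `words` is kept.
    """
    blocked = {word.lower() for word in offensive}
    return [word for word in words if word.lower() not in blocked]
```

Applying this to an uncensored list with the compiled offensive list would yield the corresponding censored list, subject to whatever the offensive list happens to miss.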

Our rule of thumb is that if little Cindy-Lou Who were to ask her mother what the word means, and if Mom couldn't come up with a respectable answer, in our judgment, then the word remains censored. If it is a proper anatomical term, the word is deemed suitable.

The word lists are best effort, with no guarantee of suitability.

Suggest a word, Censor a word, Remove a word.

Ideally, our censored word lists are meant to be suitable for all audiences, free from profanities, racial slurs, and other offensive words. Be forewarned: despite our best efforts, we have a very small staff, and words may have been missed. If you spot any nasty words in our censored lists, please help with the effort and submit inappropriate words using the tool at the above link.

We are forever striving to increase our word counts and to provide better search results. If you notice a word missing that you think should be displaying (English language only), please submit this word to us also. Likewise, if you see a word that shouldn't be there because it is not a word, let us know and we will check it.

To ensure the quality of results, all submissions are manually reviewed. Thank you for your cooperation in this endeavor.

Baby Names List

The names in the Litscape.com names list were obtained mostly from information made available by government agencies worldwide. This included census data, social security data, birth records, death records, etc. They are first names (a.k.a. given names, forenames, personal names, Christian names) and middle names of individuals. Names flagged by our profanity filters were removed (best effort, no guarantee).

The name tools provide suggestions for names based only on letter sequences and combinations. Maybe you can find a name by merging the names or initials of the mother and the father. If you find a name you like, be sure to research it further.

We commonly use the terms girls' names and boys' names. Many names have been used for both genders. We can think of names as predominantly male or female, but the terms exclusively male or female do not seem to apply. Johnny Cash sang about A Boy Named Sue. It is not that uncommon. However, naming a baby of one gender with a name predominantly used for the other gender needs careful consideration. People are people and ridicule happens, so bear this in mind when naming your baby.

Summary of the Word Lists at Litscape.com

Word List | Word Count | Word Lengths | Word Source
Litscape.com Censored | 138,374 words | Currently 1-52 letters. No length limit. | Words we use. Censored for offensive language.
Scrabbleable Censored | 133,000 words | 2-15 letters | Words we use that are scrabbleable. A subset of words from the Litscape.com default word list. Censored for offensive language.
Scrabbleable Uncensored | 134,000 words | 2-15 letters | Words we use that are scrabbleable. A subset of words from the Litscape.com default word list.
Enable Censored | 170,736 words | 28 letters | Enable public domain word list. Censored for offensive language.
Enable Uncensored | 171,328 words | 28 letters | Enable public domain word list.
Mammoth Censored | 294,335 words | 52 letters | All in one list: Litscape.com Default Word List, Enable Word List, Sowpods Word List. Censored for offensive language.
Mammoth Uncensored | 295,163 words | 52 letters | All in one list: Litscape.com Default Word List, Enable Word List, Sowpods Word List.
Names List | 144,795 names | 24 letters | Extracted mostly from government sources all over the world. Censored for offensive language. Best effort, no guarantees.

Always research a name before giving it to your child.