August 06, 2008

GUESS THE COMMONEST WORDS.

This is both frustrating and fun. Type in English words you think might be among the 100 most common, and if you're right, the word will appear in its box. Note that if you start typing in a three-letter word that happens to start with a two-letter word, the latter will appear instead, but you can then go back and start typing again and it will accept the longer word (if, of course, it's on the list). Warning: I'm pretty sure the list is flawed, because some of the words I (and people on the MetaFilter thread where I found it) tried have to be more common than a few of the ones they include. My score: 47. One person at MeFi claims to have gotten 74; I'm not sure I believe him. You have five minutes, and it's harder than you think.

Posted by languagehat at August 6, 2008 09:21 AM
Comments

I got 53, which is kinda weak since I used to work with stop word lists all the time--words you wouldn't normally want your search engine keeping track of, for example, like "the".

You are right that some of those words can't possibly be in the top 100. ("water"?)

They must have culled the list themselves from some small, slightly skewed sample.

Wikipedia has a list taken from the billion-word-plus Oxford English Corpus. Following the link before taking the quiz is cheating!

http://en.wikipedia.org/wiki/Most_common_words_in_English

Posted by: Trey at August 6, 2008 09:52 AM

I got 53.

FWIW, I wouldn't count a and an as separate words and would probably count different forms of the same verb as the same (pronouns as well). I'd probably leave will and would as separate words though.

Posted by: michael farris at August 6, 2008 09:53 AM

That was fun! I got 61. Kicking myself over a few very easy ones.

Posted by: Karen at August 6, 2008 11:10 AM

Well, I only got 28, but I messed up a little in the middle. I wasn't looking at the screen, and then suddenly I saw that all the words I was typing had stayed piled up in the box. So I did the test again. The second time I got 27 (that included water, which I had read about). How come very isn't on it? A depressing ten minutes, but at least nothing bad is going to happen, like I'm going to have to stop practising medicine until I get 53 like everyone else-- not that I have been, I hasten to add.

Posted by: Crown, A. J. at August 6, 2008 11:41 AM

I lost interest when it didn't recognize "am" and "can't."

Posted by: rootlesscosmo at August 6, 2008 11:50 AM

I think I agree with Mr Farris. And I'm trying to suss out the reasoning over at Oxon whereby is is the same word as be, while him is a different word from he. IANALexicographer. Anyone?

Also for some reason I find it satisfying that one word in the top 100 is of Etruscan origin.

I got 46.

Posted by: komfo,amonan at August 6, 2008 01:17 PM

"IANALexicographer." I don't care what sort of exicographer you are.

Posted by: dearieme at August 6, 2008 01:29 PM

Trey: "I got 53, which is kinda weak"

As someone who got 31, I feel your pain.

I really don't think that people who work with words in a professional capacity should attempt this quiz because they're on a hiding to nothing if they do.

Posted by: Glyn at August 6, 2008 02:19 PM

So very true!

Posted by: language hat at August 6, 2008 03:34 PM

My score was 47 as well. It would have been worse but I've seen various lists of the most common English words before, which allowed me at least a few I wouldn't have thought of otherwise. I can just believe 74, barely.

I'm curious though, what's the Etruscan-originated word?

Posted by: Dee at August 6, 2008 04:35 PM

I got a 52. I don't know about the frequencies of some of these words, though. Maybe we should try putting them into the Washington University neighborhood database and getting their frequencies?
http://128.252.27.56/neighborhood/Home.asp

Posted by: M. Oxley at August 6, 2008 04:46 PM

@dearieme: WIN.
@Dee: Hint: a key ingredient in Soylent Green.

Posted by: komfo,amonan at August 6, 2008 05:45 PM

37. Once I was done with prepositions, pronouns and some basic verbs, total blank. So much so, that I didn't even get 'have'. I'll go hide in a corner now.

Posted by: bulbul at August 6, 2008 06:24 PM

Oh, Soylent Green is . . . people! I wouldn't have known that just a couple of months ago, but my American cultural knowledge is on the increase of late. My etymological dictionary takes it back only so far as the Latin "populus," and a few other sources hint at the possibility that its original root is Etruscan . . . but maddeningly to me, they never explain how they've come to suspect such a thing.

Posted by: Dee at August 6, 2008 08:04 PM

Dammit, the timer didn't work for me. I'm on a new computer, so it could be some scripting feature that is not properly configured yet. But I got 63 in about seven minutes, I suppose.

Posted by: Noetica at August 6, 2008 08:12 PM

I got 52.

Sour grapes aside, I think the design is flawed: I didn't get "is", though I remember typing it in very early. It turned into "I", the box cleared, and I ended up with an "s", which threw me off... and I forgot to type "is" again. A system where you type and hit enter would be more representative of "words you can think of" rather than "words you happen to type in [while possibly thinking of something quite different]".

Posted by: Matt at August 6, 2008 08:35 PM

Bitch, bitch, bitch. Next you'll be complaining they don't let you enter words in katakana.

Posted by: language hat at August 6, 2008 09:09 PM

Yeah, I'm saving that one for a blog post.

I guess I'm mainly just annoyed at the site for proving to me that I'm incapable of remembering the word "is" for more than 2 seconds at a time.

Posted by: Matt at August 7, 2008 12:40 AM

...I don't want to boast, but with 26 and 27 I've still got the lowest score, the only two in the twenties, in fact...

Posted by: Crown, A. J. at August 7, 2008 03:07 AM

I got 56. But this test is strange. How can 'get' not be there?

Posted by: Jordan at August 7, 2008 03:09 AM

46. There were a few I missed that I was sure I typed in, but in any case, that was fun.

Posted by: uneasy rhetoric at August 7, 2008 03:28 AM

how does the list compare with http://wordcount.org/ ?

Posted by: ubu at August 7, 2008 09:08 AM

Perhaps I was thinking of 'person', which seems to have a specific Etruscan word thought to be its source. The pre-Latin origins of 'people' seem to be lost. I'm guessing the Etruscan guess is due to the absence of contemporary IE cognates ... ?

Posted by: komfo,amonan at August 7, 2008 11:40 AM

Perhaps I was thinking of 'person', which seems to have a specific Etruscan word thought to be its source.

Yes, that is conjectured: against a common wishful etymology per sona that supposes a mask through which the actor sounds forth. A lovely thought. Would that it were so!

Posted by: Noetica at August 7, 2008 07:35 PM

Yes, so many folk etymologies are beautiful things!

Posted by: language hat at August 7, 2008 09:35 PM

45 here

Posted by: strosseI at August 8, 2008 08:20 PM

Most words are ultimately Etruscan.

Posted by: John Emerson at August 9, 2008 07:40 PM

Well, I got 72 (honestly) but that's only because I teach a course that features a discussion of this very thing. I missed most of their verbs, though.

Posted by: The Ridger at August 9, 2008 08:07 PM

Most words are ultimately Etruscan.

And by "Etruscan," I presume you mean Dravidian.

Posted by: language hat at August 9, 2008 08:25 PM