Google: Three Times The Useless Information Of Any Other Search Engine

In Internet years, I’m a dinosaur. If you want to learn about dinosaurs, type “dinosaur” into Google and you’ll get, as of this minute, a tickle over 16 million hits. I was able to examine the first few million, but beyond that I found it quite impossible to evaluate which were worthy of further pursuit.

By any account, 16 million is a lot of references. Surely, competent thinkers willing to press their sharp wit against the prospect could cut the fluff to find which among them was spot on. Within a few years, anyway.

History tells us that libraries used to have what they called a “Reference” section. Ordinarily, the books from this section could not be borrowed, so that their contents would be constantly available. In this ancient time, the inquirer would first consult an encyclopedia for a broad overview of a topic, then consult references from the encyclopedia article or card catalog until the topic had been exhausted or sufficiently covered.

It is important to consider how awfully difficult it is to get a book published, especially by a respectable publisher. Given the arduousness of the process, one could rest fairly assured that the books in the stacks and on the reference shelves would have been well-researched and written and would be, shall we say, reliable.

Let’s say that I wanted to publish a book about sandpaper. My first step would be to compile and devour every source that I could find. Then, I would write up a thesis and an outline, or most generally, provide a structure for my text. Structure in hand, I would write a chapter or two in an effort to work out ideas and style for potential publishers. Then, I’d start to query relevant publishers, pitching the book as a must-read for a reference-starved demographic, thus guaranteeing bestseller sales.

Once accepted, I’d have probably six months to a year to deliver the manuscript. Then the editing process would begin, I’d get proofs to review, consult on cover design, and finally, a couple of years into the process, my book might be published.

On the other hand, if I wanted to write an Internet article about sandpaper, I’d write it, format it as a web page, and upload it to a server on which I purchase space.

It could be that my Internet article about sandpaper was the most intriguing sandpaper exposé ever written. But who would know? Even if it was well written, no one would have cross-checked my facts or evaluated the structure of my arguments. There would be no consensus about the quality and accuracy of my Internet article.

That’s just about where we stand with the 16 million pages that mention dinosaurs. Very little edit / review /oversight. Maybe incomplete sentences. Or speling mistook.

It’s not an Earth-shattering case that I’m proposing here: there’s little question that the Internet sacrifices quality for quantity, and leaves us, hungry knowledge consumers, with little capacity to sift through the rubble to find the edible crumbs.

Google’s recent claim to have “indexed” three times the number of pages of their nearest competitors is a less than impressive claim, since it implies that it will take even longer to sift through their triple-sized pile.

I think that it’s fair to describe their pile this way because, let’s be honest, most of the pages out there are junk. Hardly any are peer reviewed, most people edit their own work, and there is no consequence for passing off bad information.

What we really need is an encyclopedia.

And that’s just what we have in the form of Wikipedia, an altogether fantastic, peer-reviewed resource, with already over a million articles. What makes the Wiki so useful is that it is wide open. You and I can edit it at any time. The first temptation might be to think that this is a terrible editorial model, but it works. It works because articles go through what amounts to a perpetual peer review process. If erroneous information gets posted, not long thereafter, it will be corrected. Plus, you can look at old versions of every article and review the progress for yourself. If you are knowledgeable in a field, you can feel free to contribute that knowledge to the article. In sum, the openness of the process makes it self-correcting.

What amuses me most is how we have come full circle. The Internet was supposed to become an indispensable research tool, and it still is to some extent. But it’s not the infallible source that some of us felt that it could have been. The Internet’s fundamental drawback is its lack of structure — this is what separates a pile of unrelated articles from an encyclopedia. The structural deficiency, combined with a lack of peer review and editorial voice, renders much of the Internet’s available information suspect at best, drivel at worst.

When dinosaurs like me roamed the Earth, the encyclopedia, with its structure, its review process and its consistent voice, was always the best general starting point for much research. When printed encyclopedias started losing their central place in reference sections, Internet searches tried to take their place. But with the way that the Internet seems to resist structure and elude relevant searches, I find myself right back where I started, and where I was always comfortable: with the encyclopedia now called “Wiki”.

Ed/Pub:LisaM

About Brian Sorrell

Writer, Storyteller, Philosopher, Expat, Father
  • http://paperfrigate.blogspot.com DrPat

    One of the big advantages of Firefox over IE is a search tool that lets me plug a search-phrase into “Wiki+Google” and get the best of both worlds.

    But, yeah, sometimes it’s like drinking from the fire hose.

  • http://www.scoopstories.typepad.com Scott Butki

    What a cynical attitude. That’s like saying the NYC library has three times as much useless info as a small library for you to sift through.

  • http://www.dorksandlosers.com Tan The Man

    The problem with Wikipedia is that there is no REAL authority over the information on it. They have changed this so that some topics cannot be edited and are starting to come up with an editorial review board.

  • http://www.thebluesmokeband.com Brian Sorrell

    Good heavens! Of course I thought of the library counter-example — that’s where the publishing discussion came from. The point is that much of the material catalogued by engines like Google is, well, crap.

    A good example is that if I want to learn about Kevin Trudeau’s snake-oil empire, a satire that I wrote and posted here is in the top 30 hits for his name! It’s just absurd — my unedited crap is that high on the list!

    And I’m not saying that Wiki is the end-all be-all solution here — it’s just amusing to me that it’s about the best that we’ve got on the ‘net, yet it represents a return to a tried-and-true structure rather than an advance in form.

    The move, as I see it, is a glimmer of recognition that authority matters on many weighty topics. That’s a healthy reality check, not cynicism.

    Brian.

  • http://geocities.com/nstix Nicholas Stix

    This is a lovely, elegant essay, just the sort that its author suggests I shouldn’t be able to find on the Internet. But find it I did — at blogcritics.

    “On the other hand, if I wanted to write an Internet article about sandpaper, I’d write it, format it as a web page, and upload it to a server on which I purchase space.

    “It could be that my Internet article about sandpaper was the most intriguing sandpaper exposé ever written. But who would know? Even if it was well written, no one would have cross-checked my facts or evaluated the structure of my arguments. There would be no consensus about the quality and accuracy of my Internet article.”

    You are romanticizing the pre-Net world. Had you written the same article for a daily newspaper, it would not have been fact-checked. Had you written it for a slick magazine, it might have. But what if you were a brilliant scholar of sandpaper in the pre-Net world, but had no editorial connections? Who would know? Your work might never be published. Or you might have pitched it to a “respectable” media organization or publisher, only to have it stolen, and published under someone else’s name. Thus did H.G. Wells steal Florence Deeks’ History of the World ms. via his publisher, Macmillan & Co., and although the cases are hardly equal, the New York Times once stole an article idea from me, assigned it to one of its top writers, and published his version as a Magazine cover story. With the Internet, I don’t waste my time pitching to organizations like the Times. I just post my finished article.

    You also speak of consensus, but consensus is irrelevant; it’s truth that’s important. Liars can arrive at a consensus; they do it all the time.

    Speaking of liars, let’s take the argument yet another step, to academia. You talk about peer review. Peer review is only as good as the peers doing the reviewing. In the natural sciences, academic peer review may be of value, but in the humanities, social sciences, and hate sciences, it’s a joke. The fact that a tenured professor gets an article or book published, is no reason not to take her/his/its research with a grain of salt. Academic hoaxes are today produced assembly-line style.

    “That’s just about where we stand with the 16 million pages that mention dinosaurs. Very little edit / review /oversight. Maybe incomplete sentences. Or speling mistook.”

    “It’s not an Earth-shattering case that I’m proposing here: there’s little question that the Internet sacrifices quantity for quality, and leaves us, hungry knowledge consumers, with little capacity to sift through the rubble to find the edible crumbs.”

    The Internet has nothing to do with our “capacity to sift through the rubble …” The more time you devote to a subject, the quicker you will be able to sift through the rubble. And until you’ve devoted a good deal of time to a subject, you’ll have no idea whether the information you’re being given is true.

    Kids attending universities today are no better off. Professors are supposed to have gone through a process that will weed out lazy, dishonest scholars, but today, things tend to go in the opposite direction — the tenure process weeds out anyone who isn’t a liar.

    How does one weed out google entries? I don’t know how one does it for sandpaper, but when I needed to know whom it was that H.G. Wells plagiarized, I googled under “H.G. Wells” and then under “plagiarism.” Voila!

    How does one determine the quality of entries? I’m glad you asked. With time, one figures out that certain sites or individuals are more reliable than others. And with the Internet, one can often make that determination more rapidly than one could with scholars. For instance, if I am doing research on crime, I expect an author publishing on the Web to provide links to such resources as the FBI’s uniform crime reports. If he doesn’t, he immediately loses credibility. By contrast, when Sandra Stotsky wrote Losing Our Language, she was confronted by hundreds of articles by tenured education “scholars” such as Carl Grant that claimed to be based on years of “research.” Stotsky had to spend years in libraries digging through education journals, in order to determine that the “research” didn’t exist. Had the journals been online, she could have done her research in a fraction of the time.

    As for encyclopedias, the only one I own is the Encyclopedia of New York City. It looks great. It was edited by Columbia University’s Kenneth Jackson. It is based on consensus. And yet, many of the articles it contains are dishonest, and there are crucial elements of NYC history that are simply not told (e.g., the rise of institutionalized black racism). Jackson hired people who shared his politics.

    Ultimately, there are no authorities. One must figure things out for oneself.

  • http://www.roblogpolitics.blogspot.com RJ

    “there’s little question that the Internet sacrifices quantity for quality”

    You have that reversed, I believe…

  • http://3point1.blogspot.com Yashin

    Just as Google’s extensive indexing can give you three times more useless information, it can also give you three times more *useful* information.

    I use Google as one of many research tools in my job, and the kind of information I’m looking for simply cannot be found in Wikipedia, nor is it ever likely to be found there.

    Instead I have to trawl through company annual reports, local news, country gazetteers, industry papers, academic papers, press releases and the millions of public domain listings out there on the web.

    As an analyst, I have to be careful when taking any ‘facts’ at face value. A company statement could often be treated with just as much scepticism as a blog post!

    I find myself wading through Google listings on a daily basis, but I go back to Google because the other search engines consistently fail to deliver the same amount of *useful* information.

    A user who knows the key words they’re looking for and has a rudimentary knowledge of how the search engine works, can always filter out most of the extraneous pages when presented with 16 million returns.

  • Les Slater

    > How does one determine the quality of entries? I’m glad you asked. With time, one figures out that certain sites or individuals are more reliable than others.

    Or blog entries for that matter.

    Well, I do like Google so I went there and entered Nicholas Stix, no quotes or other qualifiers.

    I didn’t have to look far. Google’s most relevant popped up several that are sure to be the Nicholas Stix that’s posting here.

    I noticed a pattern that would fit Mr. Stix’s criterion of “figure[ing] out that certain sites or individuals are more reliable than others.” It has less to do with objective reality than who’s doing the ‘figuring’.

    Mr. Stix’s essays appeal to those with a certain preconceived notion of the world. Those that have these notions will have them reinforced. They are full of wild allegations that like other people of his ilk never make an objective verifiable reference.

    So it still boils down to this: the majority of info on the net is just crap. One can find things one agrees with, and that might leave one better armed to argue them. It may make the crap smell a bit sweeter.

    One still has to test one’s knowledge, wherever one might get it from, against objective criteria.

    There is no easy road.

  • http://www.thebluesmokeband.com Brian Sorrell

    Rather than shed worthwhile hours of the day on point-by-point responses:

    “Ultimately, there are no authorities. One must figure things out for oneself.”

    This must be meant as a joke, or its author does not recognize that it is self-defeating. If there are NO authorities, then “oneself” cannot be an authority. If it is a serious statement, then it represents a relativistic position that has slipped into solipsism. There’s no fighting a committed solipsist.

    Brian.

  • http://geocities.com/nstix Nicholas Stix

    Rather than shed worthwhile hours of the day on point-by-point responses:

    “Ultimately, there are no authorities. One must figure things out for oneself.”

    This must be meant as a joke, or its author does not recognize that it is self-defeating. If there are NO authorities, then “oneself” cannot be an authority. If it is a serious statement, then it represents a relativistic position that has slipped into solipsism. There’s no fighting a committed solipsist.

    Brian.

    The statement is not self-defeating, relativistic or solipsistic. The traditional meaning of authority has always referred to a power or knowledge outside of oneself.

    http://dictionary.reference.com/search?q=authority

    You also misused the term “solipsist.” A common error these days. During the mid or late 1990s, the misuse of “solipsist” instead of “subjectivist” or “narcissist” was popularized by some semi-literate journalists.

  • http://geocities.com/nstix Nicholas Stix

    Comment 8 posted by Les Slater on October 4, 2005 10:13 AM:

    > How does one determine the quality of entries? I’m glad you asked. With time, one figures out that certain sites or individuals are more reliable than others.

    Or blog entries for that matter.

    Well, I do like Google so I went there and entered Nicholas Stix, no quotes or other qualifiers.

    I didn’t have to look far. Google’s most relevant popped up several that are sure to be the Nicholas Stix that’s posting here.

    Why, if you “like” google, would you enter a name without quotation marks? Had you simply entered my name the standard way, with quotes, you would have found 38,000 entries, all but perhaps two of which would refer to me.

    I noticed a pattern that would fit Mr. Stix’s criterion of “figure[ing] out that certain sites or individuals are more reliable than others.” It has less to do with objective reality than who’s doing the ‘figuring’.

    Mr. Stix’s essays appeal to those with a certain preconceived notion of the world. Those that have these notions will have them reinforced. They are full of wild allegations that like other people of his ilk never make an objective verifiable reference.

    You’re a liar. Either you never read any of my essays, or you did and lied about their content. That would explain the vague character of your smear. Few if any journalists whose work is posted on the Internet make as many references that may be objectively corroborated, or link to as many external sources, which may in turn be objectively corroborated, as I do. However, the corroborating requires a work ethic and intellectual integrity, neither of which you have. And you are the last person to be complaining about “those with a certain preconceived notion of the world.” Don’t tell me: You’re open-minded and tolerant, but those who disagree with you are close-minded and intolerant.

    P.S. I wrote the above response before I read your cowardly, dishonest race-baiting of William Bennett. And that is yet another ground for annoyance — you put me in the position of having to defend Bennett.

    http://blogcritics.org/archives/2005/10/01/145336.php

  • Les Slater

    >> Mr. Stix’s essays appeal to those with a certain preconceived notion of the world. Those that have these notions will have them reinforced. They are full of wild allegations that like other people of his ilk never make an objective verifiable reference.

    > You’re a liar. Either you never read any of my essays, or you did and lied about their content.

    Below is excerpted from ‘911 – Four Years Later’

    WEBCommentary Contributor
    Author: Nicholas Stix
    Date: September 11, 2005
    http://www.webcommentary.com/asp/ShowArticle.asp?id=stixn&date=050911

    In New York, briefly after the London bombings, police were checking the bags of every fifth subway rider seeking to board trains in Manhattan. (I went to Manhattan about two weeks ago, and the practice was no longer in effect.) I’m not sure they were checking the bags of young Arab men, even when they were the fifth person. All an Arab homicide bomber would have to have done to kill everyone on a subway train would be to … look like an Arab homicide bomber. As in bearded, with a swarthy, Arabian complexion, wearing a heavy coat on a summer’s day, with lots of objects creating lumps around his middle and upper body, and maybe for good measure, muttering “jihad” and “Sheik bin Laden” to himself in Arabic. The NYPD officers, trained for years in “diversity seminars” to look away from individuals who fit criminal profiles and fight off their own judgment and experience, wouldn’t dare pull over someone who looked like a terrorist. That would be “racist.” In such a situation, after letting the terrorist pass onto the train unmolested, an officer might search the line for the most All-American, masculine-looking white guy he could find, and mercilessly hassle him, until the explosion knocked the cop onto the ground.

  • http://geocities.com/nstix Nicholas Stix

    Re: Comment 12 posted by Les Slater on October 5, 2005 09:09 AM

    And your point would be … exactly what? That it is intolerant to report the truth? Do you deny that since 911, law enforcement and security personnel have gone out of their way to avoid searching people who meet the profile of suicide bombers, while going out of their way to harass octogenarian, non-Muslim grandmothers and Medal of Honor winners, in one case attempting to confiscate the Medal of Honor itself?

    Oh, I forgot. You don’t deal in facts or arguments, just innuendos, smears, and lies.

    I just hope you don’t present yourself to anyone anywhere as a knowledgeable, honest source of cultural information. At least when people see your posts here, they can see what you’re about.

  • http://www.thebluesmokeband.com Brian Sorrell

    I use solipsist in the philosophical sense, not a journalistic sense. I am a philosopher after all. In philosophy, we speak of solipsism as a position that holds that only the self can be verified (that’s a rough definition). The comment that the self is the only source of authority is, thus, solipsistic. Subjectivist and narcissist are quite different.

    And I suspect from your comments that you did not read my original post particularly closely. I suggest that you re-read it, keeping in mind what we, in philosophy, call the “principle of charity”. Really, read it slowly and you might pick up on the point.

    And you might want to refrain from admonishing the use of words that you do not actually know. Bear in mind also that dictionary.com is not a particularly good source for explanations of complicated philosophical positions.

    For a fascinating discussion on solipsism (one of my favorite topics by the way), I suggest Wittgenstein’s “Philosophical Investigations” and to some extent, the later remarks in his “Tractatus”. It’s relevant to the points that I’m making in the article.

    Brian.

  • Les Slater

    > That it is intolerant to report the truth? Do you deny that since 911, law enforcement and security opersonnel hav egone out of their way to avoid searching people who meet the profile of suicide bombers, while going out of their way to harass octogenarian, non-Muslim grandmothers and Medal of Honor winners, in one case attempting to confiscate the Medal of Honor itself?

    Listen to yourself, you’re nuts. You might be prone to hyperbole, but in this case it exposes more of you than what you’re writing about.

    Do you really believe the shit you’re writing? Going out of their way to search out octogenarian, non-Muslim grandmothers and Medal of Honor winners? Finding a couple examples does not in any way justify such a sweeping generalization.

    You know nothing, you project the sick shit you believe.