by Frederick Rustam
Part Four, ORDINARY CITIZENS AS SCHOLARS
Questor Institute is a new, experimental technical school where
bright-but-poor high-school graduates on full scholarships spend
two years seeking to become wizards of Internet sorcery by studying
the science and philosophy of information retrieval from textual
databases such as the World Wide Web. In Part Three, "The Nearness
of You," Kevin and Marylou studied the tricky process of subject
qualification, learned the greater effectiveness of proximity
searching with the NEAR logical operator, and began a search
for an inspiring mountain.
Kevin and Marylou's studies at Questor Institute were centered on the
technology of Internet textword searching. The Institute also taught
them the historical and philosophical foundations of this primary
medium from which they retrieved information. Most of their exposure
to the Institute's Internet philosophy occurred in the course,
History and Nature of Computerized Information Systems.
"Today, we're gonna get orientated 'bout the 'Net," joked Kevin.
"Oriented," corrected Marylou, not allowing for his humor. She pointed
to the course outline displayed on their workstation screens. "Today's
lecture is supposed to be a brief introduction to Internet history
"The universal Internet is only a few years old," began the teacher,
who was the Institute's elderly Rector. "Yet it's already become
the most significant instrumentality for the storage and retrieval
of publicly-available information since the advent of the public
library. We all assume that technology drives and steers society
to some degree. Printing helped to change medieval society and bring
about the Renaissance by making more-diverse literature more-widely
available. Internet technology is changing today's global society by
making public information from a wide variety of sources and viewpoints
universally-available, and by making that information more-easily
"The Internet was not, as our media often say, devised by the Dept.
of Defense for military telecommunication after a nuclear strike on
our nation. The TCP/IP packet-switching standard---Transmission Control
Protocol/Internet Protocol---and the ARPAnet to make use of it were
devised so that universities doing defense research sponsored by the
Advanced Research Projects Agency could quickly share their technical
papers and computer programs.
"Later, other government agencies and commercial outfits built similar
national data transmission networks, and for a while it was uncertain
which technical networking standard would triumph over the others.
When the Europeans saw that TCP/IP might become the de facto
standard for an international network, they rushed to create their
own version and got that adopted by the International Standards
Organization---but too late. TCP/IP became the world's Internet
protocol because people wanted an international computer network
right now, and they chose to use the functioning ARPA
standard, instead of waiting for something better to develop.
"It wasn't until the National Science Foundation built a civilian
network on the ARPAnet standard that a national network for all our
citizens became possible. When the NSF relinquished operation of its
NSFnet to a three-level combination of long-distance telecom companies,
regional telecom companies, and local, commercial Internet service
providers, the national citizen's computer network we know as the
Internet was born. It's important for us to appreciate this fact: it
wasn't until the government transferred one of its taxpayer-financed,
civilian computer networks to free-enterprise operation that a true
Internet became available for all our citizens. Our current Internet
wasn't built by the Defense Department as a gift for the general
public, although that's what some civilian visionaries in ARPA
hoped their network would eventually become."
The Rector tapped his head as if to elicit thought by his audience.
"But what is our Internet, really?... It's much more than a bunch of
telephone lines and routers. It's a bunch of agreements, agreements
between millions of information holders to use the same methods of
formatting and transmitting information and to make that information
available to everyone who can access it. The earliest of the info-
sharing technical agreements were email and 'FTP.' File Transfer
Protocol allowed users anywhere on the network to transfer their
data files and computer programs to others by using the same
"Then came 'telnet,' a protocol which allowed all Internet users
to retrieve data by remotely manipulating the databases of other
users. Such a database might be the computerized catalog of a big
university library, a place where bibliographic data about rare
books might be found. This general database-sharing protocol opened
a big door to information retrieval: the acquisition of information,
routinely and without formal request. We didn't have to ask somebody
for their information; we just connected to them on the network and
took what we wanted---and we did so because the holders of that info
wanted us to have it. It was this new, free-information mode which
established the spirit of today's Internet.
"To retrieve information by telnet, however, we have to know the
technical procedures by which individual info-holders operate their
databases. These operating procedures vary, often in slightly-but-
annoyingly-different ways; yet they have to be employed correctly,
or we can't retrieve the information.
"Then a researcher had a terrific idea. A British physicist, Tim
Berners-Lee---while working at CERN, a European research facility---
came up with a superprotocol for information-sharing on the Internet.
He called it 'HTTP, HyperText Transfer Protocol.' It's an agreement
by all Internet users to format much of their Internet information in a
standard fashion and allow access to it by a commonly-employed means
of retrieval, transmission, and display. This protocol allowed the
establishment of the World Wide Web.
"HTTP also provided for site-to-site hyperlinks embedded within the
text of webpages. These links offer Web users an important way to
discover more information about a subject than they can find on the
pages of the website they're currently viewing. The importance of
hyperlinks can't be overstressed. Following them can be like following
guidemarks on the walls of a dimly-lighted tunnel system onward to the
"The advantages of a single Web-document protocol have since been
undermined by the proliferation of webpages in formats requiring
specialized software to view them. But the HTML format does have
limitations, and these can now be sidestepped by those document
authors who prefer more-versatile formats for their works while
making these works publicly-available via standard Web browsers.
"In March 1995, a milestone was reached when Internet Web traffic
first exceeded FTP traffic. HTTP was a quantum jump for the Internet
and a giant leap for humankind---one far-greater than traveling to
the moon, I assure you. It was HTTP that changed the Internet from
a techno-tool for an elite to an information system for the masses.
When HTTP spawned the World Wide Web, and commercial firms offered
the common folk access to it, the whole world went online to post
information and to retrieve the information of others."
"Wow," scoffed Kevin, too loudly. Marylou elbowed him. "Shhhhh."
"The Web is, however, a relatively 'freeform' database, in that its
only organization---if we can term it that---is that it exists on
webpages. Its information isn't naturally classified or grouped like
books in a library or want ads in a newspaper, where similar items
are gathered together for retrieval convenience. Nor are the Web's
webpages placed into neatly-packaged information containers like
those of structured databases, where categorized tables of data
elements can be retrieved individually or in combination.... But
this doesn't mean that Web information is almost-irretrievable,
as some critics have recklessly claimed.
"Millions of the world's people are accessing the Web, and they're
delighted---but frustrated, as well---by that vast information system.
Consider why this is so. In a library, when you want information about
a person, you can consult a general encyclopedia, a biographical
reference work, or a specific biography---depending upon the depth of
info you require. When you want in-depth information about any specific
subject, you can go to a place on the shelves where books about that
subject are located. Use of the library's subject catalog is desirable
but not always necessary.
"But on the Internet---unless you can directly access or follow a
hyperlink to another webpage of known relevance---you face the
terabytes of a vast, distributed megacorpus of text, universal in
its subject coverage, in which your desired subject is completely
scattered among billions of webpages in no order, and which is
retrievable only by means of textword indexes which more-or-less
tell you where subjects can be found---if you can successfully
search those indexes.
"There are some specialized search engines available which index
only webpages within a narrow area of specialization. But most Web
users prefer to use general search engines. These engines index
many more pages, they're updated more frequently, and they usually
offer more search features. This preference for general engines,
however, virtually guarantees users more retrieval frustration.
"The daunting challenge which the World Wide Web and the Usenet offer
information seekers is the reason for the establishment of Questor
Institute, and it's why you're here. In meeting this challenge
successfully, you'll find that you've often retrieved useful info
from those of your fellow citizens often labeled as 'amateurs.'
"There are many Usenet newsgroups; and there are, in effect, several
Webs. There is the publicized and well-known Commercial Web, where
people are trying to make money. The Academic Web now educates many
more people than registered college students by making non-confidential
academic documents universally available. The Government Web is where
Big Brother interacts with the Internet---not to control it but to
merely use it as the rest of us do."
As he intoned this reality, the Rector rolled his eyes skyward and
clasped his hands in mock-prayer. "Thank you, Lord, for this blessed
decentralization of the People's Medium." The audience applauded.
"There's the Media Web, a means for our news/feature media to extend
their influence over us, even into a place where we like to think that
we've escaped their traditional, one-way communication. The Media Web
is a part of the Commercial Web, although it hemorrhages money at an
enormous rate. Nonetheless, the media strongly feel that they have to
have a Web presence. And that's good for information seekers because
media websites are a valuable source of timely information---but not
necessarily 'authoritative' information. Media error seems to be
amplified by the modalities of the Digital Age.
"But most of the Web---nobody can say for sure how much---is the
province of ordinary citizens. You've noticed this, I trust. The
Citizen Web is composed of millions of websites created by ordinary
people like you and me. After the government surrendered the data
transmission infrastructure of the Internet to commercial telecom
firms, citizens who were not 'empowered' by colleges, governments,
or corporations---or favored by traditional publishers---could
more-easily contribute their information to society's info-pool.
And they did. The first trickle of citizen participation quickly
became a flood which is quietly sweeping our old informational
hierarchies into the drainpipe of history."
The Rector assumed a conspiratorial stance. "But don't tell this
to the Information Elite. They still believe themselves to be the
sole creators and disseminators of authoritative information."
Kevin whispered to Marylou, "I don't even have a homepage with an
authoritative picture of my cat."
"You will when you get a cat," she purred. "They're so lovable
we just have to tell everybody about them."
As if he'd heard that conversation, the teacher continued, "What
do ordinary citizens have to offer the Web that isn't offered by
academia, government, and commerce?... The same thing, really---
information---but without the customary authority those societal
institutions claim for themselves. Citizens have put up on the Web
billions of pages of text, sound, and graphics in which they have
an interest. I'm not just talking about homely 'homepages,' even
though personal websites can be useful sources of information.
"Many citizen websters are hobbyists who pursue their hobbies online
as well as offline. Hobbyist websites are often quite elaborate, and
their pages speak with an authority which most media pages lack because
the info comes from the heart as well as the head. 'Interest websites,'
those which reflect the interests of their entrepreneurs, are also
info-valuable. Even 'fansites,' where citizen fans exhibit their
enthusiasm for celebrities, can be useful sources of information.
"I once heard a popular radio musicologist introduce a medieval
selection as what sounded to me like 'Lo Mar May.' I was curious
enough to access the musicologist's website to find this title in his
program playlist. Although most of his numbered programs were listed,
a few were not, and #109 was one of those missing! For months, again
and again, I accessed the page to see if Program #109 had been added.
It hadn't been. Finally, I decided to 'go general,' that is, to make
a general search of the Web for another source of this guy's program
playlists. I did, and I quickly found a citizen's fansite about the
musicologist which had them. Checking Program #109, I found that the
title the musicologist had spoken was the French phrase, 'l'Homme
Arme'---'a-r-m-e' with an acute accent on the e---'The Armed Man.'
If that citizen website hadn't existed, I might still be waiting to
find out how 'Lo Mar May' was spelled. Such redundancy on the Web
is a good thing for information seekers.
"Our media, both offline and online, rarely refer to the Citizen Web.
When they do, it's usually to denigrate some part of it. Media folks
seem to feel they have to label 'amateur' Web information as being
unreliable. For a few years, the media relentlessly reported the
Internet as unreliable and dangerous. They did this because they
feared the competition this new medium presented. Newspapers feared
losing readers, television feared losing viewers, and these media
tried hard to stigmatize the new People's Medium---a medium which
they barely understood, and which today they still poorly know."
The Rector's hard, calculated look of contempt changed into a
wicked grin of pleasure.
"But when the media rushed onto the Web to gain a presence there,
their scare stories about the Internet greatly diminished. When
editors and reporters realized the value of the People's Medium
for their purposes, they became part of the Web community---but,
thank goodness, not a controlling part. Then, they realized they
couldn't write or say bad things about the Internet which might
discourage the computer-owning, educated class from accessing
media websites. But the Media Elite still keeps its collective
nose elevated well above the Citizen Web, even as they utilize
its information freely and often without attribution.
"So why do citizens spend their money to create informational
websites with elaborate pages of content? Ultimately, I guess,
they do it for personal accomplishment. But many feel strongly
about their hobbies, interests, and causes, and they desire to
express themselves on the Web to further these. In the end, who
cares why they do it?... Strong motives can produce information
distortion, but let's be thankful that so many citizens have
put their information on the People's Medium. As professional
information retrievers, you'll make use of data from citizen
webpages because it's handy and relevant.
"Academia, government, and commerce have no monopoly on scholarship.
You don't have to go to college or have an intellectual vocation
to become a scholar. Every citizen can be an armchair scholar or
even a deadly-serious one. You'll see a good example of this when
you discover the 'Witchfinder General' webpage in your homework
assignment." The Rector smiled, benignly.
"If you publish 'original' information---that's what you write by
your own efforts, not what you copy from others---then in Questor
Institute's view, you're a citizen scholar. Citizens now have the
People's Medium to display their scholarship, but they could enjoy
some recognition of their works by the intellectual hierarchs of
our society. In that vein, we hope to establish at Questor a Website
Excellence award program to recognize the outstanding contributions
of the citizenry to society's online information pool. Yes, I know
that there are a multitude of questionable Web awards out there in
cyberspace, but as Questor Institute becomes better-known, its Web
awards will, I trust, be the most-highly valued by their recipients.
"Today's homework assignment is this, then: locate a citizen webpage
which provides a short-but-good biography of the 17th-century English
Puritan witch hunter who styled himself as 'The Witchfinder General'
---and which provides a complete list of his many victims and their
'disposal.' In your report about this site, answer these questions:
(1) Does the Englishman who put up this page seem to have
a college degree?
(2) How does his 'authority'---or seeming lack of it---
matter to you?
"I'd like to end today's session with an anecdote about scholarship.
In California, there's an academic who's a self-appointed Internet
'contrarian'---gadfly. In 1995, he wrote this in a national magazine:
What the Internet hucksters won't tell you is that the Internet
is an ocean of unedited data without any pretense of completeness.
Lacking editors, reviewers, or critics, the Internet has become
a wasteland of unfiltered data. You don't know what to ignore
and what's worth reading.
Logged onto the World Wide Web, I hunt for the date of the
Battle of Trafalgar. Hundreds of files show up, and it takes
15 minutes to unravel them---one's a biography written by an
eighth grader, the second is a computer game that doesn't work,
and the third is an image of a London monument. None answers
"Recently, I read that passage reprinted in a book on search engines.
So I logged onto the Web, and in 15 seconds---not minutes---
I retrieved a search-results page on which six of the eight items
clearly stated the date of the Battle of Trafalgar. I didn't have
to click on those items; I found the answer in their annotations.
How can it be that I was so quickly successful, and that academic
'expert' was not?... I don't know how he searched for the date of
the Battle of Trafalgar, but I searched like a Questor would."
The Rector wrote on the whiteboard:
"The battle of Trafalgar was fought on October 21, 1805."
 <-REDUNDANT-> <-UNIQUE SEARCHPHRASE-> <---UNKNOWN---->
SEARCH INPUT: <"trafalgar was fought on">
"As you already know from your retrieval class, searching for
natural-language text requires that we employ natural-language
searchwords. Unlike Dr. Contrarian, I visualized a relevant text,
subject-analyzed it, and searched appropriately. How many of you
Questors would have searched this way?" Suddenly, the classroom was
filled with waving hands, not all of them honestly flourished, but
all of them strongly motivated.
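[Editorial aside: the Rector's whiteboard method can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions: the snippets are invented stand-ins for webpage text, and
phrase_search is a hypothetical helper, not any search engine's
interface. A real engine consults an index instead of scanning text.]

```python
import re

# Hypothetical snippets standing in for webpage text (invented for illustration).
SNIPPETS = [
    "The battle of Trafalgar was fought on October 21, 1805.",
    "Nelson's Column is a monument in Trafalgar Square, London.",
    "Trafalgar was fought on 21 October 1805 off the coast of Spain.",
    "A biography of Admiral Nelson, who died at Trafalgar.",
]


def phrase_search(phrase, snippets):
    """Find snippets containing the exact phrase, ignoring case.

    Returns a list of (snippet, remainder) pairs, where remainder is the
    text that follows the phrase: the 'unknown' part of the sentence.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    hits = []
    for text in snippets:
        match = pattern.search(text)
        if match:
            hits.append((text, text[match.end():].strip()))
    return hits


# A single keyword matches every snippet, relevant or not.
keyword_hits = phrase_search("trafalgar", SNIPPETS)

# The unique searchphrase matches only sentences that state the date.
hits = phrase_search("trafalgar was fought on", SNIPPETS)
for snippet, remainder in hits:
    print(remainder)
```

[In this toy data the lone keyword matches all four snippets, while the
searchphrase matches only the two that go on to give the date.]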
"Of course, without examining the six search-return items I couldn't
establish an 'authority' for the date each offered. But I can assume
an 'inherent authority.' When six webpages agree on the date of a
historical event, this amounts to a practical authoritativeness for
that date. It seems unlikely that all six webpages would have gotten
it wrong, especially when there was no other date in the results.
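[Editorial aside: the 'inherent authority' test amounts to counting how
many sources agree. The sketch below assumes the dates have already been
read from the result annotations and normalized to one format; the
consensus helper and the sample data are hypothetical.]

```python
from collections import Counter


def consensus(values):
    """Return the most frequent value and the fraction of sources agreeing on it."""
    if not values:
        raise ValueError("no values to compare")
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


# Hypothetical dates, as if taken from six search-result annotations.
dates = ["1805-10-21"] * 6
value, share = consensus(dates)
print(value, share)
```

[A share of 1.0 means every source agrees. Agreement is practical
evidence rather than proof, since webpages can copy one another.]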
"Many who take it upon themselves to harshly criticize the Internet
simply don't know the medium as well as they should. As Questors,
you'll study the Internet until it becomes 'second nature' to you.
You'll learn more than just how to retrieve from it---you'll learn
to respect it. Without that respect, the Internet surrenders its
secrets reluctantly.... When you approach a dog with love in your
heart, isn't it less-likely to bite you?"
"That was a powerful message," whispered Kevin. "I'm inspired."
Ignoring his sophomoric cynicism, Marylou replied, "So am I.
And it makes me feel good to know that I'm studying to enter the
latter-day equivalent of a medieval craft guild. When our class
graduates and shows the world what the word 'Questor' now means,
students'll be thronging to get in here. And we'll be remembered
as the Institute's pioneer class."
"Yippee! Circle the wagons!"
"That's not what I meant, Kevin."
"Hey sister, don't get me wrong. I can dig the guild thing."
THE END OF PART FOUR
Next: Part Five, "What Does Information Want?"
© 2002 by Frederick Rustam. Frederick Rustam is a retired civil
servant. He formerly indexed technical reports for the Department of
Defense. He writes science fiction for Web ezines as a hobby. He
studies and enjoys the Internet as a hobby.