It’s not Spiderman’s latest web slinging tool but something that’s more real world.
The Invisible Web (or The Deep Web) refers to the part of the Internet that’s not indexed by the search engines. Most of us think that search powerhouses like Google and Bing are like the Great Oracle: they see everything.
Unfortunately, they can’t because they aren’t divine at all; they are just web spiders who index pages by following one hyperlink after the other. And, there are some places where a spider cannot enter.
Take library databases which need a password for access. Or even pages that belong to private networks of organizations. Dynamically generated web pages in response to a query are often left un-indexed by search engine spiders.
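To make the idea concrete, here is a minimal sketch of a link-following spider. It is a toy model, not how any real search engine works: the page names and the "login" and "query" labels are made up for illustration, and the "site" is just a dictionary in memory. It only shows why pages behind a password or a search form never make it into the index.

```python
from collections import deque

# Hypothetical toy site: each page lists the links it contains.
# "login" and "query" stand in for password-protected pages and
# pages generated on the fly in response to a search.
PAGES = {
    "/home": {"links": ["/about", "/library-login"], "kind": "static"},
    "/about": {"links": ["/home"], "kind": "static"},
    "/library-login": {"links": ["/library/catalog"], "kind": "login"},
    "/library/catalog": {"links": [], "kind": "static"},
    "/search?q=history": {"links": [], "kind": "query"},
}


def crawl(start):
    """Breadth-first crawl that follows hyperlinks from static pages only."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        page = PAGES[url]
        if page["kind"] != "static":
            # The spider cannot log in or fill out a form, so it stops here.
            continue
        indexed.append(url)
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


indexed = crawl("/home")
invisible = sorted(set(PAGES) - set(indexed))
print("Indexed:", indexed)
print("Invisible:", invisible)
```

In this sketch the catalog page stays invisible even though it is an ordinary page, because the only link to it sits behind the login. The search results page stays invisible because nothing links to it at all.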
Search engine technology has progressed by leaps and bounds. Today, we have real time search and the capability to index Flash based and PDF content. Even then, there remain large swathes of the web which a general search engine cannot penetrate. The terms Deep Net, Deep Web, and Invisible Web linger on, however much of a misnomer they may be.
It’s not that you can’t access the invisible web at all. It’s just that you must use the right tools to do so. Here are ten online indexes and search tools you should hit.
This is considered to be the oldest catalog on the web and was started by Tim Berners-Lee, the creator of the web. So, isn’t it strange that it finds a place in the list of invisible web resources? Maybe, but the WWW Virtual Library lists quite a lot of relevant resources on quite a lot of subjects.
For instance, there are 300 sub-libraries with their own categories within the main library. The History sub-library is a good example.
You can go vertically into the categories or use the search bar. The screenshot shows the alphabetical arrangement of subjects covered at the site. Even as many deep web resources have come and gone, the WWW Virtual Library keeps on going even after 26 years.
This is the official site of the U.S. Government and the portal to all the public information you need on every federal agency or state, local, and tribal government. The site has an A-Z index of all topics on the portal and it’s a better way to pinpoint the information you want.
Apart from the direct access, use filters like “Only USA.gov,” “Images,” or “Videos” at the top of the page for more specific results. And while you are here, don’t forget the partner sites like Kids.USA.gov and Publications.USA.gov which are other specialized information mines.
The blurb on the home page says it all. The scientific search engine taps into 60 databases and over 2,200 scientific websites that cover federal science information including the latest research and development results. Try the advanced search engine page for a deep web search across the government databases that exist in the country.
The federal search tool can be your first doorway for multidisciplinary research that covers everything from agricultural information to the current trends in science education for schools in the U.S. It is also a primary source for searching Federally-sponsored opportunities and programs for STEM students.
The Map Topics and the images alone could be worth the price of admission. There is none because the government site is free. The job of the organization is to broadcast real-time or near real-time data and information on current conditions and Earth observations. But it is also a goldmine for academic and even casual research.
Try the National Geologic Map Database catalog. Query satellite photos and Earth images with the EarthExplorer tool. Or, search through 100,000+ scientific publications and books. And there is so much more.
Search for articles in Open Access journals. These are academic papers that are available to anyone “without financial, legal, or technical barriers other than those inseparable from gaining access to the Internet itself.” In short, the knowledge is free.
DOAJ maintains quality control with rigorous peer review. The current repository has 9000+ journals with almost 2.5 million articles across all subjects. This information may not be available with a Google Search, though Google Scholar may be able to access some of it. But DOAJ is a better research tool as it is neatly curated with a well-designed advanced search engine.
Studying literature? Try Voice of The Shuttle. It is a rich directory of online resources on literature, the humanities, and cultural studies. The search directory has evolved to include topics like Sci-Tech and Culture, Cyberculture, and Technology of Writing to keep pace with the times.
It started as a support tool for the English Dept. of the University of California at Santa Barbara in 1994. Today, it continues to be updated and you can browse through both primary and secondary resources.
Though Google does offer medical information when you search with your symptoms, you need all the help you can get. RxList is a comprehensive database of US prescription medications. The index is a prescription drug encyclopedia, pill identifier, and pharmacy locator rolled into one.
With the rise of supplements, there is also a dedicated part of the site for vitamins, herbs, and dietary supplements. Each section has its own search tool and/or an alphabetical listing.
The medical resource is a quality offshoot of the WebMD network. The data comes from sources like the FDA, Cerner Multum, and First Data Bank, Inc.
This online encyclopedia started its life as a radio quiz show in 1938. Today, Infoplease is an information portal with a host of features. Using the site, you can tap into a good number of databases, electronic journals, almanacs, electronic books, thesaurus, atlas, bulletin boards, mailing lists, online library card catalogs, articles, and directories of researchers.
Think of it as the search engine for brick-and-mortar libraries across the world. The meta-catalog for 72,000 libraries in 170 countries can help you find papers, books, theses, videos, multimedia assets, and even museum artifacts stored somewhere.
The best way to use this massive database is with the advanced search tool. The information found here is also useful for creating citations for your research paper.
A search on WorldCat.org will return links to the resources in these databases. But to access these resources, you have to log in with a valid library membership. You can also use the “Ask a Librarian” feature to ask for help from librarians in charge.
This is a non-governmental resource for unclassified security documents. It is the largest repository of such documents outside the U.S. government. Set up to check rising government secrecy, the site uses a custom Google Search to give you access to more than 10 million pages of government documents.
The papers are primary source material for journalists, security evangelists, and researchers. A growing collection of Electronic Briefing Books is a smaller part of the documents but gives you a curated look at U.S. national security, foreign policy, diplomatic and military history, and intelligence policy.
More Deep Web Sites Worth a Mention
- Free Lunch
- Clinical Trials
- Project Gutenberg
- The Library of Congress
- Internet Archive (Including the Way Back Machine)
- The National Gallery of Art
How Do You Surf the Deep Web?
It is difficult to pin down the size of the Deep Web, but it is estimated to be several times larger than the web we know so well. What is true is that the topical focus of the invisible web makes it a ripe area to hunt for information, especially when most of us, by habit, barely click through to Page 2 of the Google Search results.
It is also important here to make a distinction between the “invisible web” and the “Dark Net”. The invisible web is within the reach of a normal web browser while the dark net is dominated by TOR sites (and disreputable anonymous services) that need some technical wizardry to access.
Just like general web search, searching the invisible web is also about looking for the needle in the haystack. Only here, the haystack is much bigger. The invisible web is definitely not for the casual searcher. It is deep but not dark because if you know what you are searching for, information is a few keywords away.
Do you venture into the Invisible Web? Which is your preferred search tool?
Image Credit: By Crispy Fish Images via Shutterstock