Mike Bergman, who is credited with coining the term “Deep Web,” said that searching the internet today is like dragging a fishing net across the surface of the ocean: you can catch a lot, but you come nowhere close to tapping the real contents of the ocean’s depths. So it is with the internet. Most of the information online is buried under so many layers of code and structure that standard search engines never find it. It is a huge amount of information; by some estimates, as much as 97 percent of the internet is not reached by search engines and is therefore unavailable to most people.
Concerns About The Dark Net
Some government agencies and prosecutors are concerned that the internet has deep levels accessible only to those who have designed specific coded entry points. They believe there are active criminal dark net venues where illegal information is stored and traded. These dark net pages do not exist as fixed destinations; they are generated dynamically only when a specific coded query opens them up.
There are quite a few ways information can be stored so that search engines will not find it. Examples include non-HTML/text content, such as textual content embedded in multimedia files like images or video, and pages deliberately marked so that search engine crawlers will not read or index them. There are also limited-access pages and sites that are protected by code to block entry, evade search discovery, and escape the notice of other sophisticated coders.
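One mundane mechanism for keeping content out of search results is the robots exclusion standard: a site publishes a robots.txt file that well-behaved crawlers consult before indexing anything. The sketch below, using Python’s standard urllib.robotparser module, shows how a crawler decides whether it may fetch a page; the example.com address and the paths are hypothetical.

```python
from urllib import robotparser

# Hypothetical site: its robots.txt tells crawlers which paths are off limits.
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the robots.txt file

# A polite crawler checks before fetching; pages it may not fetch
# simply never make it into the search index.
for path in ("/public/article.html", "/private/records/"):
    url = "https://example.com" + path
    allowed = rp.can_fetch("*", url)
    print(f"{url}: {'crawlable' if allowed else 'hidden from search engines'}")
```

Content excluded this way never shows up in standard search results, even though anyone who already knows the URL can still visit it directly.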
The deep web is a world of code that is not connected to the well-organized and accessible surface web. On the surface web, everyone is identified by their entry-point machines and by their service provider’s records about each computer on its network. The deep web eludes these organizational features. Tracking it requires a more detailed approach. One way of doing this is to search in two levels: first by topic (e.g., health, travel, automobiles), and then by sub-topics chosen according to the nature of the content in the underlying databases.
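A minimal sketch of that two-level strategy, in Python, might look like the following; the topic names, sub-topics, and database endpoints here are invented placeholders, not real deep web resources.

```python
# Hypothetical directory mapping broad topics to sub-topics, and each
# sub-topic to the specialized database that would have to be queried directly.
TOPIC_DIRECTORY = {
    "health": {
        "clinical-trials": "https://trials.example.org/search",
        "drug-interactions": "https://pharma.example.org/query",
    },
    "travel": {
        "flight-fares": "https://fares.example.net/api",
        "visa-rules": "https://visas.example.net/lookup",
    },
}

def find_source(topic: str, subtopic: str):
    """First narrow by topic, then by sub-topic, to locate the right database."""
    return TOPIC_DIRECTORY.get(topic, {}).get(subtopic)

print(find_source("health", "clinical-trials"))
# -> https://trials.example.org/search (a placeholder endpoint)
```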
Even with an alternative search strategy, the deep web poses difficult challenges. Deep web searches don’t produce URLs the way a traditional search engine does. The difficulty lies in finding and mapping very different data elements from many disparate sources and then arranging them in a unified format that makes them accessible and meaningful. In a huge digital sphere full of unique coding and unusual presentations designed to best display each kind of content, every inquiry demands its own search strategy.
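The “unified format” problem is essentially record normalization. The short Python sketch below illustrates the idea under invented assumptions: two hypothetical sources describe the same kind of record with different field names and shapes, and a mapping step folds them into one common schema.

```python
# Two hypothetical deep web sources describing similar records differently.
source_a = [{"title": "Ocean survey 2012", "link": "https://a.example/doc/17"}]
source_b = [{"doc_name": "Harbor depth tables", "location": {"href": "https://b.example/42"}}]

def normalize_a(rec):
    return {"title": rec["title"], "url": rec["link"], "origin": "source_a"}

def normalize_b(rec):
    return {"title": rec["doc_name"], "url": rec["location"]["href"], "origin": "source_b"}

# Arrange everything in one unified format so results can be compared side by side.
unified = [normalize_a(r) for r in source_a] + [normalize_b(r) for r in source_b]
for row in unified:
    print(row["origin"], "->", row["title"], row["url"])
```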
Researchers are exploring how the deep web can be crawled automatically. The Sitemap Protocol (introduced by Google in 2005) and mod_oai are mechanisms that let search engines discover resources on deep web servers. Both tools allow a web server to advertise the URLs that are accessible on it, providing a path from otherwise hidden data on the server back to the surface web. Special algorithms are designed to characterize the content and relate it to a more searchable format. The process is slower than surface-level search: these methods can issue on the order of a thousand queries per second against deep web content, but it still takes longer to pull the information into a presentable format.
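A sitemap is just an XML file listing the URLs a server wants crawlers to know about. The sketch below, a minimal example assuming a hypothetical sitemap address, uses Python’s standard library to fetch and list those entries; the namespace string is the one the Sitemap Protocol defines.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical server exposing its deep content through the Sitemap Protocol.
SITEMAP_URL = "https://deep.example.org/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}  # standard sitemap namespace

with urllib.request.urlopen(SITEMAP_URL) as resp:
    tree = ET.parse(resp)

# Each <url><loc> entry is a resource the server is volunteering to crawlers,
# giving surface search engines a path to content they could not discover on their own.
for loc in tree.getroot().findall("sm:url/sm:loc", NS):
    print(loc.text)
```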
Finding Your Way In
Highlighting the difficulties of searching the deep web doesn’t mean it can’t be explored. There is no way in with a standard web browser; Firefox and Internet Explorer won’t get you past the door. It takes a purpose-built browser to enter the dark and deep realms of the darknet. Tor, or more precisely the Tor Browser, is probably the most recognized of these. Anyone can download Tor and begin using it, but it is a different system from standard browsers and search engines. When you enter the deep net you are anonymous: your browsing habits and your location are not revealed to anyone else. The cookie model used on the surface web isn’t in play on the deep web.
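Under the hood, Tor relays traffic through a local SOCKS proxy rather than connecting you directly. The sketch below assumes a Tor client is already running locally on the conventional port 9050 and that the requests and PySocks packages are installed; it is an illustration of the plumbing, not a recommended setup.

```python
import requests  # third-party; `pip install requests[socks]` pulls in PySocks too

# Route the request through the Tor client's local SOCKS5 proxy.
# "socks5h" means DNS resolution also happens inside Tor, not on your machine.
TOR_PROXY = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

# The Tor Project's check service reports whether a request arrived via the Tor network.
resp = requests.get("https://check.torproject.org/api/ip", proxies=TOR_PROXY, timeout=30)
print(resp.json())  # e.g. {"IsTor": true, "IP": "<exit node address>"}
```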
Basically, your activities on the deep web cannot be monitored; nothing you do will be tied back to you. That is an attractive feature for many deep web users who no longer trust that their privacy is being respected on the internet. Obviously, revelations about how the NSA and UK intelligence agencies monitor web traffic have turned a lot of people off. The deep web offers an alternative, but it is also a digital venue that can lead you into trouble. The FBI pays attention, as best it can, to what is happening on the deep web, and you can come to its attention when you arrive at deep web locations it monitors. If the internet seems like the wild wild west, then the deep web is a particularly large and uncharted part of that wild west.