To be sure, the quantitative estimates of the rates of overblocking apply only to those four commercially available filters analyzed by plaintiffs’ and defendants’ expert witnesses. Nonetheless, given the inherent limitations in the current state of the art of automated classification systems, and the limits of human review in relation to the size, rate of growth, and rate of change of the Web, there is a tradeoff between underblocking and overblocking that is inherent in any filtering technology, as our findings of fact have demonstrated. We credit the testimony of plaintiffs’ expert witness, Dr. Geoffrey Nunberg, that no software exists that can automatically distinguish visual depictions that are obscene, child pornography, or harmful to minors, from those that are not. Nor can software, through keyword analysis or more sophisticated techniques, consistently distinguish web pages that contain such content from web pages that do not.
In light of the absence of any automated method of classifying Web pages, filtering companies are left with the Sisyphean task of using human review to identify, from among the approximately two billion web pages that exist, the 1.5 million new pages that are created daily, and the many thousands of pages whose content changes from day to day, those particular web pages to be blocked. To cope with the Web’s extraordinary size, rate of growth, and rate of change, filtering companies that rely solely on human review to block access to material falling within their category definitions must use a variety of techniques that will necessarily introduce substantial amounts of overblocking. These techniques include blocking every page of a Web site that contains only some content falling within the filtering companies’ category definitions, blocking every Web site that shares an IP address with a Web site whose content falls within the category definitions, blocking “loophole sites,” such as anonymizers, cache sites, and translation sites, and allocating staff resources to reviewing content of uncategorized pages rather than re-reviewing pages, domain names, or IP addresses that have already been categorized to determine whether their