Problems at the USPTO: Databases (Part 3 of 4)

This article is the third of a four-part series examining the USPTO’s role in administering the patent system. Other articles in this series: Part I: Introduction. Part II: Human Resources. Part IV: Experts and Expertise.

Last week I explored difficulties the USPTO faces because of limited human resources. The time constraints placed on examiners detrimentally affect the quality of their work in judging the patentability of inventions. This week I am particularly interested in how these constraints limit each examiner’s ability to conduct an effective prior art search, specifically with respect to the databases they search for relevant prior art when examining a particular patent application.

The United States Code limits patentability to inventions which have not been described in previously printed publications. Later interpretations of this provision allow the definition to encompass all documents that have been made sufficiently available to persons ordinarily skilled in the art. This definition is staggeringly broad, and includes not just USPTO-issued patents, but also peer-reviewed journal articles, documents available on the internet, technical specifications, theses, source code, product descriptions and advertisements, textbooks, and the list goes on. Furthermore, these documents are not limited to those available in English, as documents in other languages are also fair game. Patent examiners certainly have their work cut out for them.

In practice, examiners have only a short time to perform their prior art searches, and therefore evaluate only a small portion of possible databases for prior art. Common practice dictates that prior art falls into one of three categories: U.S. patent literature, foreign patent literature, and non-patent literature. A study by Bhaven Sampat of patents issued between 2001 and 2003 found that 67% of all sources cited were other issued patents and applications, while only 18% were non-patent literature and 15% were foreign patents. This article will focus on the USPTO’s searches of U.S. patent literature and non-patent literature, but foreign patent literature also forms a substantial portion of overlooked prior art. The focus of patent examiners on searching for prior art among existing patents, and the resulting underrepresentation of non-patent literature, is a significant factor contributing to the increased number of “bad” patents issued by the USPTO in recent years.

Patent Classification

Previous patents make up the large majority of references cited in USPTO patent literature, outnumbering non-patent literature by more than a factor of three. From this and other data, Sampat concluded that “examiner capabilities for searching U.S. patents far exceed their capabilities for searching other sources.” The main reason for this is that the USPTO patent database is convenient and easy for patent examiners to search, in large part due to a long-standing tradition of categorization by idea that the USPTO has developed for generations. For most of its existence the classification system has been a useful alternative to simple keyword searches, but categorization is a double-edged sword. While it does help avoid the pitfalls of varying terminology, it also imposes a rigid structure on ideas which increasingly limits the examiner’s field of search.
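As a quick check on the "factor of three" figure, the ratios can be worked out directly from the citation shares quoted above from Sampat's study. This is only a sketch of the arithmetic; the dictionary keys are labels chosen here for illustration, not terms from the study.

```python
# Citation shares (in percent) as quoted from Sampat's study of
# patents issued between 2001 and 2003. Key names are illustrative.
shares = {"us_patents": 67, "non_patent": 18, "foreign_patents": 15}

# The three categories account for all cited sources.
total = sum(shares.values())

# How many times more often U.S. patents are cited than each other category.
ratio_non_patent = shares["us_patents"] / shares["non_patent"]
ratio_foreign = shares["us_patents"] / shares["foreign_patents"]

print(f"Total share: {total}%")
print(f"U.S. patents vs. non-patent literature: {ratio_non_patent:.1f}x")
print(f"U.S. patents vs. foreign patents: {ratio_foreign:.1f}x")
```

On these numbers, U.S. patents are cited a little under four times as often as non-patent literature, which is where the "more than a factor of three" comes from.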

The USPTO classification system emerged organically from the need to find similar ideas that may be described using different words or phrases. Instead of searching the patent database for all possible keywords related to an idea, the examiner need only locate the relevant category for the idea and search patents within that category. The current system has over 400 classes, and nearly 150,000 subclasses. A recent Queen’s University paper examining patent classification reform elucidates the main problem with this classification by asking us to envision “a train with 1,000 cars, with each car representing a single utility, design or plant class. Within each car are dozens or hundreds of boxes representing subclasses. The order of the cars (classes) and the arrangement of the boxes (subclasses) within the cars depend on the theories of classification that existed when the class was last revised.”

These classes and subclasses impose an inherently limited database for the examiners to search. Because each examiner is so constrained in time, they will only look through some of those subclasses and not others. This is problematic because the classification system quickly becomes out-of-date as new inventions and ideas appear that may not fit conveniently into old systems of classification. As a result, two similar ideas can easily be assigned different classifications, and examiners may then fail to locate relevant prior art, even within the database they are most capable of searching.
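The trade-off described above can be made concrete with a toy example. The sketch below is not how the USPTO's search tools work; the patent numbers, classification codes, and abstracts are all invented for illustration, and `search_by_class` and `search_by_keyword` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patent:
    number: str          # invented identifier
    classification: str  # invented "class/subclass" code
    abstract: str        # invented abstract text


# A toy corpus. A-1 and A-2 describe the same idea in different words and
# share a subclass. A-3 is closely related but was filed under another class.
CORPUS = [
    Patent("A-1", "X10/01", "programmable logic array with configurable gates"),
    Patent("A-2", "X10/01", "field-configurable gate matrix for digital circuits"),
    Patent("A-3", "Y20/07", "software tool that maps designs onto a programmable logic array"),
    Patent("A-4", "Z30/02", "connector housing for circuit boards"),
]


def search_by_class(corpus, codes):
    """Return patents whose classification is one of the given codes."""
    return [p for p in corpus if p.classification in codes]


def search_by_keyword(corpus, phrase):
    """Return patents whose abstract contains the phrase (case-insensitive)."""
    return [p for p in corpus if phrase.lower() in p.abstract.lower()]


class_hits = [p.number for p in search_by_class(CORPUS, {"X10/01"})]
keyword_hits = [p.number for p in search_by_keyword(CORPUS, "programmable logic array")]

# Relevant art the class-based search never sees.
missed_by_class = sorted(set(keyword_hits) - set(class_hits))

print("class search:  ", class_hits)
print("keyword search:", keyword_hits)
print("missed by class search:", missed_by_class)
```

The class-based search picks up A-2 even though it never uses the phrase "programmable logic array", which is the benefit of classification. It also misses A-3 entirely, because that record sits in a subclass the examiner did not open, which is the cost.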

With so many classes and subclasses, it is easy to see how this classification quickly goes out of date--especially given the rate at which modern technology is evolving. This sort of out-of-date classification does more harm than good, restricting the scope of examiner prior art searches. This classification system also forms the basis of the USPTO search templates, the guides examiners use to locate prior art amongst non-patent literature.

Non-Patent Literature

Non-patent literature is by far the broadest of the three identified classes of prior art, but at 18% of citations is vastly underrepresented. Many patentable inventions emerge from academic fields of study for which there is a well-established base of peer-reviewed literature. This wealth of literature should form the basis for prior art in these areas. This underrepresentation is indicative of the USPTO’s inability to properly search existing non-patent databases.

Take for example class 326, “Electronic digital logic circuitry.” The search template for this category lists nine non-patent databases for search. One of these sources is the IEEE Xplore database, which currently indexes more than two million records--more than a quarter of the number of all patents issued by the USPTO. This, together with the other sources indicated in the template, represents a body of prior art with at least as many entries as there are patents that could be relevant prior art for inventions in this field. These databases are much more extensive and difficult to search, so it is no surprise that patent examiners cite prior art from these sources much less frequently. Still, non-patent literature should form a substantial portion of prior art in many fields. The fact that it does not indicates disproportionate inattention from examiners.

What's Missing

Even more important is what’s not included. Because, as we’ve seen, examiners are so pressed for time when examining applications, they are unlikely to stray far beyond these templates. The templates necessarily miss a significant amount of prior art. Even if they didn't, they could not remain perpetually up-to-date. From the WIPO Guide to Technology Databases: “The landscape of patent and non-patent databases is constantly changing, with new services coming onto the market and existing services being expanded, merged or discontinued.”

This difficulty is particularly prominent in fields where it’s not clear how best to search through the databases of patentable ideas. Pharmaceuticals, chemical processes, and software are good examples. Jinseok Park, an experienced Korean patent attorney, comments on the difficulties of prior art search in these fields in a paper, saying that “indeed the USPTO may not even have access to the relevant non-patent databases in some new fields.” Park concludes that “a more sophisticated approach to non-patent information is essential to improve the quality of searches for computer-implemented inventions.” Park's conclusion is about software patents specifically, but the point applies to many other technical fields as well.

The USPTO cannot possibly perform an effective prior art search of the non-patent literature it is aware of, and it is inherently limited in the databases it searches in the first place. Sampat notes that patent examiners rely on applicant disclosures of prior art, but we cannot guarantee that applicants will perform as thorough a prior art search as is necessary. Thus, the USPTO issues many (if not most) patents without performing a sufficiently thorough prior art search. The result is a system full of patents which should be invalid under 35 U.S.C. § 102, and which is therefore deficient in its role of protecting and promoting intellectual property rights.