Matching USPTO Patent Assignees to Compustat Public Firms and SDC Private Firms

This data project is a systematic effort to match assignee names on USPTO patent records, which are sometimes abbreviated or misspelled, to the universe of public firms in Compustat and to all private firms that have been involved in alliances or M&A deals in SDC Platinum. Current coverage runs from 1976 to 2018 for Compustat firms and from 1985 to 2017 for SDC firms.

Our algorithm leverages the Bing web search engine and significantly improves upon fuzzy name matching, a common practice in the literature. This document presents a step-by-step guide to our searching and matching algorithm. All code is publicly available on the GitHub page:

The data is used in my paper “Technology, Information, and Firm Boundary” (with Danqing Mei). We plan to make this data publicly available after the paper is published. We acknowledge the research grant from Columbia Business School. Detailed documentation of the data construction algorithm is here.
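
For readers unfamiliar with the fuzzy name matching baseline mentioned above, the following is a minimal sketch of that general technique. It is not our search-based algorithm and is not taken from the project code. It uses Python's standard `difflib` module. The suffix list, the 0.85 threshold, and the function names `normalize` and `best_match` are all hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to strip; not taken from the project.
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def best_match(assignee: str, firms: list[str], threshold: float = 0.85):
    """Return (firm, score) for the most similar firm name,
    or None if no firm reaches the (assumed) similarity threshold."""
    if not firms:
        return None
    target = normalize(assignee)
    scored = [
        (firm, SequenceMatcher(None, target, normalize(firm)).ratio())
        for firm in firms
    ]
    firm, score = max(scored, key=lambda pair: pair[1])
    return (firm, score) if score >= threshold else None


firms = ["International Business Machines Corp", "Microsoft Corp"]

# A misspelled assignee name is recovered by string similarity.
print(best_match("Microsft Corporation", firms))

# An abbreviated name falls below the threshold and is left unmatched.
print(best_match("Intl. Business Machines Corp.", firms))
```

In this sketch the misspelled name is matched, while the abbreviated one is not, because "intl" and "international" share too few characters relative to their length. Lowering the threshold would recover it, at the cost of more false matches on other names. This trade-off is one reason to look beyond pure string similarity.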