Patstat beyond Europe. By Gianluca Tarasconi. Madrid, 9/12/2010. An insight into Patstat data from patent authorities other than EPO.
What is PATSTAT
PATSTAT stands for EPO Worldwide Patent Statistical Database. It contains a snapshot of the EPO master documentation database (DOCDB), which holds data from about 90 national and international patent offices with differing degrees of coverage. Data include bibliographic data, citations and family links. The database is designed for statistical research and requires the data to be loaded into the customer's own database.
Data from other patent authorities may help to:
Validate algorithms against other spellings/conventions;
Fill in missing data or correct existing data (e.g. address/city) using data from equivalents;
Use patent family (1) data to improve algorithms, using the additional data to compute a similarity score;
(1) For a list of patent family definitions see: C. Martinez, Insight into Different Types of Patent Families, STI Working Paper 2010/2
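As an illustration of the similarity-score idea, here is a minimal sketch that compares two name spellings and raises the score when both records belong to the same patent family. The function names, the normalization rules and the 0.2 bonus are assumptions made for this example; they are not part of PATSTAT or of any specific disambiguation algorithm.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def name_similarity(a, b):
    """Return a similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def family_bonus_score(a, b, same_family, bonus=0.2):
    """Hypothetical scoring rule: add a fixed bonus (capped at 1.0) when the
    two records come from the same patent family."""
    score = name_similarity(a, b)
    if same_family:
        score = min(1.0, score + bonus)
    return score
```

A fixed bonus is only one possible design; a real matcher would probably weight family membership together with address and co-inventor evidence.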
6 different spellings for name, 3 different addresses
In this case name and city are better parsed in the equivalent US patent data;
WO patent data confirm that the correct address is 43111 Robbins street;
The US patent tells us that A. stands for Antony
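The example above can be sketched as a simple fill-in step: fields that are empty in one record are copied from the first equivalent that has them. The record layout (plain dictionaries with `name`, `address` and `city` keys) and the sample values other than the street address are hypothetical and do not reflect the actual PATSTAT table structure.

```python
def fill_from_equivalents(record, equivalents, fields=("address", "city")):
    """Return a copy of `record` where empty fields are taken from the first
    equivalent record that provides a non-empty value."""
    filled = dict(record)
    for field in fields:
        if filled.get(field):
            continue  # keep data that is already present
        for other in equivalents:
            if other.get(field):
                filled[field] = other[field]
                break
    return filled


# Hypothetical records belonging to the same patent family
ep_record = {"name": "A. Example", "address": "", "city": None}
family = [
    {"authority": "US", "city": "Example City"},
    {"authority": "WO", "address": "43111 Robbins street"},
]
completed = fill_from_equivalents(ep_record, family)
```

Taking the first non-empty value is the simplest rule; when equivalents disagree, a majority vote or a preferred-authority order may be a better choice.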
Patstat contains 92 application authorities;
45 are inside Europe;
47 are outside Europe;
Contains regional/international authorities (WIPO; ARIPO…);
Also contains ‘terminated’ authorities (DDR, USSR)
A) Data coverage (% of coverage by year)
Are data from patent authority X fully (100%) included in Patstat from year W to year Z?
B) Data transmission delays
How long does it take a non-EPO patent to reach PATSTAT?
C) Completeness of geographic data
How good is the quality (and coverage) of address / city / country code?
EPO gives partial information
The total number of applications is given, but not the share of all applications filed (EPO gives what it gets)
Patstat, as supplied by EPO, reports 66219 Indian applications
The Indian Patent Office reports 28,882 applications filed in 2006 alone
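If per-year counts were available from both sources, coverage could be estimated as the ratio between the two. The sketch below assumes two dictionaries keyed by filing year; only the 2006 office total (28,882) comes from the figures above, while the Patstat count for 2006 is a made-up value used purely to show the arithmetic.

```python
def coverage_by_year(patstat_counts, office_counts):
    """Return {year: share} for every year with a known, non-zero office total."""
    coverage = {}
    for year, official in office_counts.items():
        if not official:
            continue  # avoid dividing by zero or by a missing total
        coverage[year] = patstat_counts.get(year, 0) / official
    return coverage


office_counts = {2006: 28882}   # total reported by the Indian Patent Office
patstat_counts = {2006: 14441}  # hypothetical count, NOT a real Patstat figure
shares = coverage_by_year(patstat_counts, office_counts)
```

With these illustrative inputs the share for 2006 is 0.5, i.e. half of the filed applications would be present in the database.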
We study time series 2003-2008 for BR, CN, JP, DE, KR and IN compared to EP;
Graph differences suggest that publication lags and data transmission lags differ from country to country;
Time series may also highlight ‘holes’ or changes of population (e.g. USPTO from 2000 onward)
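One simple way to spot such ‘holes’ is to flag years whose count is missing or far below the typical level of the series. The sketch below uses the median as the typical level and a 50% threshold; both choices, as well as the sample counts, are assumptions for illustration only.

```python
from statistics import median


def find_holes(counts, first_year, last_year, threshold=0.5):
    """Return the years whose count is missing or below threshold * median."""
    years = list(range(first_year, last_year + 1))
    values = [counts.get(year, 0) for year in years]
    typical = median(values)
    return [year for year, value in zip(years, values) if value < threshold * typical]


# Hypothetical yearly application counts for one authority
sample = {2003: 100, 2004: 110, 2005: 0, 2006: 105, 2007: 120, 2008: 30}
holes = find_holes(sample, 2003, 2008)
```

A low count in the most recent years may reflect publication or transmission lag rather than a real gap, so flagged years still need manual inspection.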
Table for the TOP 20 by inventor count;
13 authorities have more than 80% of records with no country code;
12 authorities have 0% of address/city;
However, in many cases address data are stored inside the first name field (e.g. DE)
(data from patstat 09/2009)
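Shares like the ones above can be computed by counting, for each authority, the records where a given field is empty. The following sketch assumes records are plain dictionaries with an `authority` key; the field names and the sample rows are hypothetical and do not correspond to actual PATSTAT columns.

```python
from collections import defaultdict


def missing_share(records, field):
    """Return {authority: share of records where `field` is empty or absent}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        authority = rec["authority"]
        totals[authority] += 1
        if not rec.get(field):
            missing[authority] += 1
    return {authority: missing[authority] / totals[authority] for authority in totals}


# Hypothetical inventor records
rows = [
    {"authority": "AA", "country": "AA"},
    {"authority": "AA", "country": ""},
    {"authority": "BB", "country": None},
    {"authority": "BB"},
]
share_without_country = missing_share(rows, "country")
```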
Non-EPO data have coverage, quality and ‘spelling’ that may change a lot from patent authority to patent authority;
Data can be used as an additional source of information but not as the main source (BONUS not MALUS);
EPO could probably improve the quality of this data, especially by adding more addresses (e.g. WO address data will be released in April 2011); it is up to users to demand more on this topic.