Data and Methodology

Our Methodology

We build on our academic expertise [1,2,3,4,5] in analyzing economic networks to aggregate the universe of seaborne shipment-level import data for the United States. Based on these data, we construct a comprehensive set of supply disruption indices for overall US imports and for their subcomponents by US Census region, by shipments’ country of origin, and by product category. The high-quality data and the derived indices are important for understanding how supply disruptions evolve and for identifying the key forces that drive them over time and across categories.


We analyze nearly 200 million transactions, which span the universe of US seaborne imports since 2007. The raw data are updated in near real time and consist of bills of lading (BoL), which detail the cargo in each shipment, including shipper and consignee names and addresses, a description of the goods, vessel name, weight, quantity, and container information. We aggregate the BoL data to the importer-exporter-pair level and focus on the ultimate parent companies of US consignees. We extensively inspect and refine the data to assemble a high-quality dataset that is consistent across several dimensions and robust to outliers and to redaction by firms.

In creating these data, we build on our prior research in aggregating shipment-level data to study supply chain disruptions [1].
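The aggregation of bills of lading to the importer-exporter-pair level can be illustrated with a small sketch. It is not the authors' pipeline: the record fields (`consignee`, `shipper`, `arrival_date`), the `parent_of` lookup table, and the month-index convention are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from datetime import date


def month_index(d, origin_year=2007):
    """Map a date to an integer month index, with January of origin_year = 0."""
    return (d.year - origin_year) * 12 + (d.month - 1)


def aggregate_to_pairs(records, parent_of):
    """Collapse shipment records into {(importer_parent, exporter): set of active months}.

    records   : iterable of dicts with hypothetical keys
                'consignee', 'shipper', 'arrival_date'
    parent_of : hypothetical mapping from consignee name to ultimate parent;
                consignees missing from the mapping are treated as their own parent
    """
    pairs = defaultdict(set)
    for rec in records:
        importer = parent_of.get(rec["consignee"], rec["consignee"])
        pairs[(importer, rec["shipper"])].add(month_index(rec["arrival_date"]))
    return dict(pairs)


# Toy input: two subsidiaries of the same parent import from one exporter.
records = [
    {"consignee": "Acme West", "shipper": "Foo Ltd", "arrival_date": date(2007, 1, 15)},
    {"consignee": "Acme East", "shipper": "Foo Ltd", "arrival_date": date(2007, 3, 2)},
    {"consignee": "Other Co", "shipper": "Bar SA", "arrival_date": date(2008, 1, 1)},
]
parent_of = {"Acme West": "Acme", "Acme East": "Acme"}
pairs = aggregate_to_pairs(records, parent_of)
```

Storing each pair's activity as a set of month indices keeps the later question "was this pair active in month t?" a constant-time lookup.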

Index methodology

Our monthly index of supply disruptions is constructed to reflect the activity of pairs of US importers and foreign exporters over time. We define "established" trade pairs as those that have traded sufficiently frequently at some point in our sample. In any month, we measure the "disruption rate" for a given Census region, product category, or country of origin as the fraction of temporarily inactive established trade pairs among all established trade pairs that were active in the recent past. We scale the indices so that each averages to zero prior to 2020. Identifying temporarily inactive trade pairs is challenging toward the end of the sample because we cannot yet observe which inactive trade pairs will recover in the future. We therefore devise an imputation algorithm that leverages the very stable relative recovery rates of trade pairs over different time horizons to construct a timely index of supply disruptions that can be updated in near real time.
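To make the disruption-rate definition concrete, the following is a minimal sketch under stated assumptions rather than the actual algorithm. The thresholds (`min_active`, `window`, `lookback`, `max_gap`) and their default values are hypothetical, a pair counts as "temporarily inactive" only if it is observed trading again within `max_gap` months, and the end-of-sample imputation step is omitted entirely.

```python
def is_established(active_months, min_active=3, window=12):
    """Assumed rule: active in at least min_active months within some window-month span."""
    months = sorted(active_months)
    for i, start in enumerate(months):
        if sum(1 for m in months[i:] if m < start + window) >= min_active:
            return True
    return False


def disruption_rate(pairs, month, lookback=6, max_gap=12, min_active=3, window=12):
    """Share of temporarily inactive pairs among established, recently active pairs.

    pairs: dict mapping a pair id to the set of integer month indices with shipments.
    """
    at_risk = 0
    disrupted = 0
    for active in pairs.values():
        if not is_established(active, min_active, window):
            continue
        # Denominator: established pairs active at some point in the lookback window.
        if not any(month - lookback <= m < month for m in active):
            continue
        at_risk += 1
        # Numerator: inactive this month, but observed trading again within max_gap.
        if month not in active and any(month < m <= month + max_gap for m in active):
            disrupted += 1
    return disrupted / at_risk if at_risk else float("nan")


def center_on_baseline(rates, baseline_months):
    """Shift a {month: rate} series so that it averages zero over baseline_months."""
    offset = sum(rates[m] for m in baseline_months) / len(baseline_months)
    return {m: r - offset for m, r in rates.items()}


pairs = {
    "A": {0, 1, 2, 3, 4, 5},  # established and active in month 4
    "B": {0, 1, 2, 6},        # established, inactive in month 4, recovers in month 6
    "C": {0, 1, 2},           # established, inactive in month 4, never recovers
    "D": {3},                 # never established, so excluded
}
rate = disruption_rate(pairs, month=4)  # 1 disrupted pair out of 3 at risk
```

In this toy example, pair C stays in the denominator but does not count as disrupted, because its inactivity is not temporary. Near the end of the sample that distinction cannot be observed directly, which is the gap the imputation algorithm described above is meant to fill.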

Our algorithm has several tuning parameters, e.g., how frequently a trade pair has to be active to qualify as established, how long a period of inactivity can last, etc. We construct a complete set of time series spanning all feasible combinations of tuning parameters. Each time series is then de-seasonalized and smoothed using a 3-month rolling window. Finally, we aggregate all constructed time series to obtain the resulting index of supply disruptions.
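The grid-and-aggregate procedure can be sketched as follows. Several choices here are assumptions, since the text does not specify them: de-seasonalization is done by removing calendar-month averages, the 3-month window is a trailing mean, and the final aggregation is a simple average across series. The function `raw_series_for` and the parameter names in `grid` are hypothetical placeholders for whatever produces one disruption-rate series per parameter combination.

```python
from itertools import product
from statistics import mean


def deseasonalize(series):
    """Remove calendar-month averages (assumed method); series[0] is taken to be a January."""
    by_month = {}
    for i, value in enumerate(series):
        by_month.setdefault(i % 12, []).append(value)
    month_mean = {k: mean(v) for k, v in by_month.items()}
    overall = mean(series)
    # Subtract the seasonal component while preserving the overall level.
    return [v - month_mean[i % 12] + overall for i, v in enumerate(series)]


def rolling_mean(series, window=3):
    """Trailing rolling mean; early points use as many observations as are available."""
    return [mean(series[max(0, i - window + 1): i + 1]) for i in range(len(series))]


def build_index(raw_series_for, grid):
    """Average the processed series over every combination of tuning parameters."""
    processed = []
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        raw = raw_series_for(**params)
        processed.append(rolling_mean(deseasonalize(raw), window=3))
    # Simple cross-sectional average (assumed aggregation rule).
    return [mean(column) for column in zip(*processed)]


# Toy stand-in: each parameter combination yields a constant 24-month series.
def toy_series(min_active, max_gap):
    return [min_active + max_gap] * 24


grid = {"min_active": [2, 4], "max_gap": [6, 12]}
index = build_index(toy_series, grid)
```

Averaging over the full parameter grid means the resulting index does not hinge on any single threshold choice, which is one plausible reading of why all feasible combinations are constructed.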

The underlying data and methodology are described in more detail in the technical report.

We update the index on the 16th of each month. 

Our Research