Data-finding guide · economics and social science primer

What panel data is, and where to find it

Panel data is a single table of observations on several units across several years, for example real GDP per capita for multiple countries over multiple years. It carries more information than cross-sectional data, which covers a single point in time, or a time series, which follows a single unit. Below we explain the difference in plain language, then list the public panel sources commonly used in economics and social science research.

The short answer first

Panel data is data with multiple units × multiple points in time — for example, real GDP per capita for multiple countries over several consecutive years, with one row per country per year. The public panel sources commonly used in economics and social science research include the Penn World Table (PWT), the World Bank's World Development Indicators (WDI), OECD statistics, and the CEPII gravity database with BACI bilateral trade data, all of which are free to obtain. When you use them, the three things to check most carefully are definitions, units and missing values.

Panel, cross-sectional and time series — what's the difference

These three concepts are often confused. Here is each in a single sentence:

  • Cross-sectional data: a snapshot of multiple units at one point in time. For example, the GDP of each province in 2023 — a single year, comparing different provinces side by side.
  • Time series: the change of a single unit over multiple points in time. For example, one province's GDP over ten consecutive years — a single unit, followed over time.
  • Panel data: both multiple units and multiple time points — a combination of the two above. For example, the GDP of each province over ten consecutive years, letting you compare provinces side by side and follow them across years.

The advantage of panel data is that it lets you control at the same time for fixed differences between units (individual effects) and for shocks or trends common to all units in a given period (time effects). Estimates are then less exposed to bias from unobserved unit characteristics that do not change over time, which makes panel data better suited than a single cross-section or a single time series to studying the effects of policies, institutions and other factors that change over time.
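To make the idea of individual and time effects concrete, here is a minimal sketch using only the Python standard library. It assumes a small, balanced, made-up panel in which the outcome is built as a unit effect plus a year effect plus 2 times x; the unit names, years and all numbers are invented for illustration. Subtracting unit means and year means (and adding back the grand mean) is one common way to remove both effects in a balanced panel. In real work you would more likely use a statistics package than hand-written sums.

```python
from collections import defaultdict
from statistics import mean

# Made-up effects, for illustration only: y = unit effect + year effect + 2 * x.
unit_effect = {"A": 10.0, "B": 20.0, "C": 30.0}
year_effect = {2020: 0.0, 2021: 1.0, 2022: 3.0}
x_values = {"A": [1.0, 2.0, 4.0], "B": [2.0, 2.0, 5.0], "C": [0.0, 3.0, 3.0]}
TRUE_SLOPE = 2.0

# Long format: one row per unit per year.
panel = [
    {"unit": u, "year": t, "x": x,
     "y": unit_effect[u] + year_effect[t] + TRUE_SLOPE * x}
    for u, xs in x_values.items()
    for t, x in zip(sorted(year_effect), xs)
]


def group_means(rows, key, var):
    """Mean of `var` within each group defined by `key`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r[var])
    return {k: mean(v) for k, v in groups.items()}


def two_way_demean(rows, var):
    """Subtract unit and year means, add back the grand mean (balanced panels only)."""
    by_unit = group_means(rows, "unit", var)
    by_year = group_means(rows, "year", var)
    grand = mean(r[var] for r in rows)
    return [r[var] - by_unit[r["unit"]] - by_year[r["year"]] + grand for r in rows]


def slope(xs, ys):
    """OLS slope of ys on xs, with an intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Pooled regression ignores the unit and year effects.
pooled = slope([r["x"] for r in panel], [r["y"] for r in panel])
# Two-way demeaning removes them before estimating the slope.
within = slope(two_way_demean(panel, "x"), two_way_demean(panel, "y"))

print(f"pooled slope: {pooled:.3f}")
print(f"two-way fixed-effects slope: {within:.3f}")
```

With these invented numbers the pooled slope comes out at about 2.20, because differences between units and years leak into the estimate, while the demeaned regression recovers the slope of 2 that was used to build the data. The simple demeaning formula relies on the panel being balanced; with gaps in the table, a dedicated estimator is the safer choice.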

Type | Number of units | Number of time points | Example
Cross-sectional data | Many | One | GDP of each province in 2023
Time series | One | Many | One province's GDP, 2014–2023
Panel data | Many | Many | GDP of each province, 2014–2023
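The three shapes compared above can also be seen in a few lines of code. The sketch below assumes a panel stored in "long" format, with one row per province per year. The province names and GDP figures are invented, and the helper functions are hypothetical names chosen for this illustration. A cross-section is the panel filtered to one year, and a time series is the panel filtered to one unit.

```python
# Hypothetical long-format panel: one row per (province, year). All values are invented.
panel = [
    {"province": "North", "year": 2021, "gdp": 100.0},
    {"province": "North", "year": 2022, "gdp": 104.0},
    {"province": "North", "year": 2023, "gdp": 109.0},
    {"province": "South", "year": 2021, "gdp": 80.0},
    {"province": "South", "year": 2022, "gdp": 83.0},
    {"province": "South", "year": 2023, "gdp": 85.0},
    {"province": "East", "year": 2021, "gdp": 120.0},
    {"province": "East", "year": 2022, "gdp": 118.0},
    {"province": "East", "year": 2023, "gdp": 125.0},
]


def cross_section(rows, year):
    """Many units, one point in time."""
    return {r["province"]: r["gdp"] for r in rows if r["year"] == year}


def time_series(rows, province):
    """One unit, many points in time."""
    return {r["year"]: r["gdp"] for r in rows if r["province"] == province}


def is_balanced(rows):
    """True if every unit appears exactly once in every year."""
    units = {r["province"] for r in rows}
    years = {r["year"] for r in rows}
    cells = {(r["province"], r["year"]) for r in rows}
    return len(cells) == len(units) * len(years) == len(rows)


print(cross_section(panel, 2023))
print(time_series(panel, "North"))
print(is_balanced(panel))
```

Keeping the data in long format makes both slices a simple filter, and the balance check is a quick way to spot missing or duplicated unit-year rows before any analysis.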

Public panel sources commonly used in economics and social science

The sources below are widely used by researchers and free to obtain. We have also written reference cards for several of them on our curated datasets page, setting out their fields, definitions and uses.

Penn World Table (PWT)

National accounts data maintained by the Groningen Growth and Development Centre at the University of Groningen, covering real GDP, productivity, employment, population and price levels for more than 180 countries across many years. It is released under the CC BY 4.0 license and suits long-run cross-country comparisons of economic growth and productivity. See the reference card →

World Bank World Development Indicators (WDI)

The core collection within the World Bank's open data, covering around 220 economies and well over a thousand indicators, each available as a time series. Its topics span economic growth, education, health, trade, energy, the environment and more. It is licensed under CC BY 4.0 and can be downloaded free by country and year, making it a foundational panel source for cross-country development research.

OECD statistics

The statistics repository of the Organisation for Economic Co-operation and Development, covering member and partner economies. Topics include the economy, the labor market, productivity, prices, education, health and population. Access is free and registration is optional. Because its coverage centers on member economies, it is well suited to research on developed economies and to cross-country comparisons among them.

CEPII gravity database and BACI bilateral trade

Maintained by France's CEPII: BACI provides product-level bilateral trade flow data classified by HS code, released free under the Etalab Open License 2.0; the gravity database brings together variables such as bilateral distance, population and GDP for use in trade gravity models. Both are common panel sources for empirical international trade research. See the reference card →

The traps you are most likely to hit with panel data

  • Inconsistent definitions: indicators with the same name from different sources may be measured differently (for example, nominal versus real, or whether certain sectors are included). Confirm each source's definition before merging.
  • Units and base years not aligned: cross-country economic data often uses different currency units and price base years, so direct comparison can be misleading. Standardize to the same unit and base year first.
  • Missing values not distinguished: some country-year cells in a panel are empty. Work out whether each value is genuinely missing, has not yet been observed, or refers to something that simply did not exist for that country and year. The handling differs completely in each case, so do not treat all empty cells as zero and do not delete them all.
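Two of the checks above can be sketched in code. The example below uses only the Python standard library and invented inputs: `observed`, `exists_from` and `LAST_RELEASED_YEAR` are hypothetical names standing in for information you would take from each source's documentation, not fields of any real database. The first part lays out the full unit-by-year grid and labels every empty cell by the reason it is empty. The second part re-expresses an index series on a common base year.

```python
from collections import Counter

# Hypothetical inputs; the values and names are invented for illustration.
observed = {
    ("A", 2019): 1.0, ("A", 2020): 1.1, ("A", 2021): 1.2, ("A", 2022): 1.3,
    ("B", 2019): 2.0, ("B", 2020): 2.1, ("B", 2022): 2.4,
    ("C", 2021): 0.5, ("C", 2022): 0.6,
}
exists_from = {"A": 2019, "B": 2019, "C": 2021}  # first year each unit existed
LAST_RELEASED_YEAR = 2022                        # latest year the source has published
years = range(2019, 2024)


def classify_cell(unit, year):
    """Label one unit-year cell by why it is (or is not) empty."""
    if (unit, year) in observed:
        return "observed"
    if year < exists_from[unit]:
        return "not_applicable"      # the unit did not exist yet
    if year > LAST_RELEASED_YEAR:
        return "not_yet_released"    # the source has not published this year
    return "missing"                 # should exist, but no value was reported


status = {(u, t): classify_cell(u, t) for u in exists_from for t in years}
summary = Counter(status.values())
print(summary)


def rebase(index_by_year, base_year):
    """Re-express an index series so that base_year equals 100."""
    base_value = index_by_year[base_year]
    return {year: value * 100.0 / base_value for year, value in index_by_year.items()}


# An invented index with 2020 = 100, moved to 2015 = 100.
print(rebase({2015: 80.0, 2020: 100.0, 2022: 110.0}, 2015))
```

Keeping the reason for each gap as its own label means later steps can drop cells that are not applicable, wait for those not yet released, and decide separately how to treat the genuinely missing ones. The rebasing helper only covers the simple case of an index series. Converting currency units or real values between base years needs the relevant exchange rates or deflators from the sources themselves.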

Don't want to assemble the panel yourself? Leave it to us

Aligning panel data from multiple sources into a single table you can run a regression on directly often takes more time than you'd expect. If you would rather not handle definitions, units and missing values yourself, set out your research goals and the conditions they must meet, and we will start with a free data availability assessment: we run real searches on authoritative data platforms and judge matches and gaps item by item against your required list. Once there is a match, we can also organize the data so that field definitions are consistent, gaps are documented and the result is reproducible. Even if no fully fitting dataset is found, the search directions, approximate sources and item-by-item findings are presented faithfully for your reference.

See research services →
