Skip to contents

Package data

Household Budget Survey

The microsimulation model in which medusa is based is built up with the microdata from the Household Budget Survey (HBS), a common statistic in all EU countries which is increasingly standardized and which has relevant potential due the large amount of socioeconomic information that it collects. The HBS provides information about household final consumption expenditure on goods and services and information on some socioeconomic and demographic characteristics of each household. The HBS provides information at two levels: one for households and their expenditures and the other for household members. For more information about the HBS click here.

Data included in medusa

medusa includes pre-processed HBS microdata with some adjustments to enhance usability. These modifications are common to both Spain and EU data and include:

  1. Renaming some variables to make them more intuitive for users.
  2. Creating new socioeconomic indicators, such as income quintiles, deciles, ventiles, and percentiles — calculated at the national level for both Spain and EU data, and additionally at the EU level for EU data.
  3. Adding gender-sensitive variables:
    • Gender of the household reference person (already included in the HBS).
    • Degree of household feminization, calculated based on the proportion of women over 14 years old within the household:
      • FD1: 0–20% (lowest feminization)
      • FD2: 20–40%
      • FD3: 40–60%
      • FD4: 60–80%
      • FD5: 80–100% (highest feminization)

For a full list of variables available for distributional analysis, click here.

Spain: medusa includes Spanish HBS microdata for the period 2006–2021. The raw microdata can be downloaded from INE and processed using load_rawhbs().

EU: medusa does not include Eurostat HBS microdata, as these are confidential and access is restricted. The available waves are 2010, 2015 and 2020. To request access to the microdata, visit the Eurostat microdata access page. Once access is granted, the data can be processed using hbs_eu(). See the EU tutorials for details.

Expenditure variables

HBS expenditure data follows the Classification of Individual Consumption by Purpose (COICOP), an internationally recognised classification developed by the United Nations Statistics Division. This system categorises household expenditures into groups such as food, clothing, housing, water, electricity, gas, and other fuels. For more details on the COICOP classification click here.

Population census

Although the HBS covers a representative sample of the population, the total population does not coincide with the data collected in the National Accounts. Therefore, in order to make the HBS data consistent with the National Accounts data, the population data from the survey would first have to be adjusted. To do this, medusa takes the Eurostat census data as of 1 January and calculates adjustment coefficients. The population data is available here.

National accounting

Despite the fact that the HBS provides a very detailed image of the annual consumption of households, the aggregate costs of the survey are not aligned with the principles and data of the National Accounts. Therefore, sometimes (e.g. when price shocks come from a macro model), before the simulation the HBS data should be adjusted to make them consistent with the macroeconomic dimension. medusa takes the National Accounts consumption data and calculates an adjustment factor at the highest possible level of disaggregation.

  • Spain: The national accounting data used in medusa can be downloaded here.
  • EU: Household final consumption expenditure data by COICOP category is retrieved from Eurostat. The relevant dataset is nama_10_co3_p3, accessible via the Eurostat data browser or directly through the restatapi package.

Data processing in medusa

Spain

medusa includes pre-processed versions of the Spanish HBS data (2006–2021), so no prior data preparation is required to use the Spain functions. Users who wish to work directly with the raw data can do so using load_rawhbs(). The main modifications applied are:

  • Merging household and expenditure datasets.
  • Adding the socioeconomic and gender-sensitive variables described above.

Aligning expenditures with National Accounts is optional and only performed when users set elevate = TRUE in calc_di().

EU

medusa does not include pre-processed EU data. To use the EU functions, users must:

  1. Obtain access to Eurostat HBS microdata (see Preparing the data).
  2. Process the raw microdata using hbs_eu(), which handles merging, variable creation, and standardisation.
  3. Use the resulting dataset as input to calc_di_eu(), calc_ep_eu(), and calc_tp_eu().