Package data

Household Budget Survey

The microsimulation model on which medusa is based is built from the microdata of the Household Budget Survey (HBS), a statistical operation common to all EU countries that is increasingly standardized and has considerable potential due to the large amount of socioeconomic information it collects. The HBS records household final consumption expenditure on goods and services, together with socioeconomic and demographic characteristics of each household. The information is provided at two levels: one for households and their expenditures, and one for household members. For more information about the HBS, click here.

Data included in medusa

Currently, medusa includes Spanish HBS microdata for the period 2006-2021 (soon to be updated for 2022). The original survey data has been minimally processed to enhance usability and incorporate additional socioeconomic variables. These modifications include:

1. Renaming some variables to make them more intuitive for users.
2. Creating new socioeconomic indicators, such as income quintiles (QUINTIL), deciles (DECIL), ventiles (VENTIL), and percentiles (PERCENTIL).
3. Adding gender-sensitive variables, aligned with medusa’s goal of facilitating gender-disaggregated analysis:
   - Gender of the household reference person (already included in the HBS).
   - Degree of household feminization, calculated based on the proportion of women over 14 years old (the age at which individuals are considered to have decision-making capacity) within the household:
     - FD1: 0-20% (lowest feminization)
     - FD2: 20%-40%
     - FD3: 40%-60%
     - FD4: 60%-80%
     - FD5: 80%-100% (highest feminization)

For a full list of socioeconomic and demographic variables available for distributional analysis, click here.
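The feminization bands can be illustrated with a small sketch. This is not medusa's code (medusa is an R package); it is a minimal Python illustration under stated assumptions: the function name and the (sex, age) member encoding are hypothetical, the share is taken over household members older than 14, and each band includes its upper bound, since the text does not say which band a boundary value such as exactly 20% falls into.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical encoding of a household member: (sex, age), with "F" for women.
Member = Tuple[str, int]


def feminization_degree(members: Sequence[Member]) -> Optional[str]:
    """Return FD1..FD5 for a household, or None if nobody is older than 14."""
    # Assumption: only members over 14 enter both numerator and denominator.
    over_14 = [m for m in members if m[1] > 14]
    if not over_14:
        return None
    women = sum(1 for sex, _ in over_14 if sex == "F")
    # Integer comparison (women / n <= k / 5) avoids floating-point edge cases.
    # Assumption: each band includes its upper bound (20% -> FD1, 40% -> FD2, ...).
    for k, label in enumerate(("FD1", "FD2", "FD3", "FD4"), start=1):
        if women * 5 <= k * len(over_14):
            return label
    return "FD5"


# One woman and one man over 14, plus a child who is not counted: 50% -> FD3.
example = feminization_degree([("F", 40), ("M", 42), ("F", 10)])
```

Under these assumptions, a household whose only member over 14 is a woman falls in FD5, and one whose only member over 14 is a man falls in FD1.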

Expenditure variables

HBS expenditure data follows the Classification of Individual Consumption by Purpose (COICOP), an internationally recognized classification developed by the United Nations Statistics Division. This system categorizes household expenditures into groups such as food and non-alcoholic beverages; clothing and footwear; and housing, water, electricity, gas and other fuels. For more details on the COICOP classification, click here.

Population census

Although the HBS covers a representative sample of the population, the total population it implies does not coincide with the population used in the National Accounts. Therefore, to make the HBS data consistent with the National Accounts data, the survey population must first be adjusted. To do this, medusa takes the Eurostat census data as of 1 January and calculates adjustment coefficients. The population data is available here.
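The text does not spell out how the adjustment coefficients are computed. As a rough sketch of the idea only, one common approach is a simple ratio between the census population and the population implied by the survey weights. The Python function below is hypothetical and is not part of medusa; the ratio formula, the argument names, and the figures are all assumptions made for illustration.

```python
from typing import Sequence


def population_adjustment_coefficient(
    household_weights: Sequence[float],
    household_sizes: Sequence[int],
    census_population: float,
) -> float:
    """Ratio of the census population to the survey-implied population.

    Assumption: the survey-implied population is the sum over households of
    (grossing-up weight x number of members).
    """
    if len(household_weights) != len(household_sizes):
        raise ValueError("weights and sizes must have the same length")
    survey_population = sum(
        w * n for w, n in zip(household_weights, household_sizes)
    )
    if survey_population <= 0:
        raise ValueError("survey-implied population must be positive")
    return census_population / survey_population


# Invented figures: two sampled households representing 8,000 people in total,
# against a census population of 8,400.
weights = [1000.0, 1500.0]
sizes = [2, 4]
coefficient = population_adjustment_coefficient(weights, sizes, 8400.0)

# Scaling every weight by the coefficient makes the survey total match the census.
adjusted_weights = [w * coefficient for w in weights]
```

In this sketch a coefficient above 1 means the survey weights understate the census population and are scaled up accordingly.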

National accounting

Although the HBS provides a very detailed picture of annual household consumption, the aggregate expenditure in the survey is not aligned with the principles and data of the National Accounts, which build their macroeconomic aggregates from more complete sources of information. Therefore, in some cases (e.g. when price shocks come from a macro model), the HBS data should be adjusted before the simulation to make them consistent with the macroeconomic dimension. In these cases, medusa takes the National Accounts consumption data and calculates an adjustment factor at the highest possible level of disaggregation. To download the latest data, click here.
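To make the idea of a per-category adjustment factor concrete, the sketch below computes, for each expenditure category, the ratio of the National Accounts total to the weighted HBS total. It is a hypothetical Python illustration, not medusa's implementation: the function names, the data layout, the ratio formula, and all figures are assumptions, and the category labels "01" and "04" are used only as placeholders.

```python
from typing import Dict, List


def hbs_totals(households: List[dict]) -> Dict[str, float]:
    """Weighted aggregate expenditure per category across all households."""
    totals: Dict[str, float] = {}
    for hh in households:
        for category, amount in hh["spend"].items():
            totals[category] = totals.get(category, 0.0) + hh["weight"] * amount
    return totals


def national_accounts_factors(
    survey_totals: Dict[str, float], na_totals: Dict[str, float]
) -> Dict[str, float]:
    """Factor per category: National Accounts total / HBS weighted total.

    Categories missing from the National Accounts data, or with a zero
    survey total, are skipped and left unadjusted (an assumption).
    """
    return {
        category: na_totals[category] / total
        for category, total in survey_totals.items()
        if category in na_totals and total > 0
    }


# Invented example data: two households with grossing-up weights.
households = [
    {"weight": 100.0, "spend": {"01": 50.0, "04": 200.0}},
    {"weight": 200.0, "spend": {"01": 25.0, "04": 100.0}},
]
na = {"01": 12000.0, "04": 50000.0}  # invented National Accounts totals

factors = national_accounts_factors(hbs_totals(households), na)

# Applying the factors rescales each household's spending by category.
adjusted = [
    {c: amt * factors.get(c, 1.0) for c, amt in hh["spend"].items()}
    for hh in households
]
```

Computing one factor per category, rather than a single global factor, reflects the text's point that the adjustment is made at the highest possible level of disaggregation.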

Data processing in medusa

While medusa includes pre-processed versions of the HBS data, users who wish to work directly with the raw data can do so. The main modifications made to the survey data are:

- Merging household and expenditure datasets.
- Adding the new socioeconomic and gender-sensitive variables mentioned above.

However, medusa does not apply further adjustments unless explicitly requested by the user. For example:

- Aligning expenditures with National Accounts is optional and is only performed when users select this adjustment within the calc_di() function.
- The adjustment process is handled by the function load_rawhbs(), which is included in the package.

Initially, we considered including the raw HBS microdata in medusa, but given the large file sizes, running the function would be too slow for users. Instead, we processed the data externally and included the modified version in the package. However, users can download the raw microdata from here and process it themselves using load_rawhbs().