The microsimulation model in which medusa
is based is
built up with the microdata from the Household Budget Survey (HBS), a
common statistic in all EU countries which is increasingly standardized
and which has relevant potential due the large amount of socioeconomic
information that it collects. The HBS provides information about
household final consumption expenditure on goods and services and
information on some socioeconomic and demographic characteristics of
each household. The HBS provides information at two levels: one for
households and their expenditures and the other for household members.
For more information about the HBS click here.
Currently, medusa includes Spanish HBS microdata for the period 2006-2021 (soon to be updated for 2022). The original survey data has been minimally processed to enhance usability and incorporate additional socioeconomic variables. These modifications include: 1. Renaming some variables to make them more intuitive for users. 2. Creating new socioeconomic indicators, such as income quintiles (QUINTIL), deciles (DECIL), ventiles (VENTIL), and percentiles (PERCENTIL). 3. Adding gender-sensitive variables, aligned with medusa’s goal of facilitating gender-disaggregated analysis: - Gender of the household reference person (already included in the HBS). - Degree of household feminization, calculated based on the proportion of women over 14 years old (the age at which individuals are considered to have decision-making capacity) within the household: - FD1: 0-20% (lowest feminization) - FD2: 20%-40% - FD3: 40%-60% - FD4: 60%-80% - FD5: 80%-100% (highest feminization) For a full list of socioeconomic and demographic variables available for distributional analysis, click here.
HBS expenditure data follows the Classification of Individual Consumption by Purpose (COICOP), an internationally recognized classification developed by the United Nations Statistics Division. This system categorizes household expenditures into groups such as food, clothing, housing, water, electricity, gas, and other fuels. For more details on the COICOP classification click here.
Although the HBS covers a representative sample of the population,
the total population does not coincide with the data collected in the
National Accounts. Therefore, in order to make the HBS data consistent
with the National Accounts data, the population data from the survey
would first have to be adjusted. To do this, medusa
takes
the EUROSTAT census data as of 1 January and calculates adjustment
coefficients. The population data is available here.
Despite the fact that the HBS provides a very detailed image of the
annual consumption of households, the aggregate costs of the survey are
not aligned with the principles and data of the National Accounts, which
builds its macroeconomic aggregates based on more complete sources of
information. Therefore, sometimes (e.g. when price shocks come from a
macro model), before the simulation the HBS data should be adjusted to
make them consistent with the macroeconomic dimension. In these cases,
medusa
takes the National Accounts consumption data and
calculates an adjustment factor at the highest possible level of
disaggregation. To download the latest data click here.
medusa
While medusa
includes pre-processed versions of
the HBS data, users who wish to work directly with the raw data
can do so. The main modifications made to the survey
data are: - Merging household and expenditure datasets. - Adding the new
socioeconomic and gender-sensitive variables mentioned above.
However, medusa
does not apply further adjustments
unless explicitly requested by the user. For example: -
Aligning expenditures with National Accounts is optional*and is only
performed when users select this adjustment within the
calc_di()
function. - The adjustment process is handled by
the function load_rawhbs()
, which is included in the
package.
Initially, we considered including the raw HBS microdata in
medusa
, but given the large file sizes, running the
function would be too slow for users. Instead, we processed the data
externally and included the modified version in the package. However,
users can download the raw microdata from here
and process it themselves using load_rawhbs()
.