The simulated insurance data contain information about medical charges billed to 1338 Americans during 2003.

insurance

Format

A data frame with 1338 observations on the following 7 variables:

age

age (in years)

sex

sex of individual, male or female

bmi

body mass index (in kg/m^2)

children

number of dependent children

smoker

binary categorical variable, yes if the person is a smoker and no otherwise

region

categorical variable, the geographical location within the USA, one of southwest, southeast, northwest or northeast

charges

individual medical costs billed by health insurance (in USD).

Source

Lantz, Brett (2013), Machine Learning with R, Packt Publishing.
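
Examples

The layout described under Format can be illustrated with a minimal sketch. The rows below are made up for illustration and are not taken from the real data; the object name toy and the helper check_insurance are hypothetical. Storing sex, smoker and region as factors and age and children as integers is an assumption, since the page does not state the storage types.

```r
# Hypothetical toy rows (NOT real observations) that follow the documented layout.
toy <- data.frame(
  age      = c(23L, 45L, 61L),
  sex      = factor(c("female", "male", "female"), levels = c("female", "male")),
  bmi      = c(24.5, 31.2, 28.0),
  children = c(0L, 2L, 1L),
  smoker   = factor(c("no", "yes", "no"), levels = c("no", "yes")),
  region   = factor(c("southwest", "northeast", "southeast"),
                    levels = c("northeast", "northwest", "southeast", "southwest")),
  charges  = c(2100.50, 39500.00, 12800.75)
)

# Hypothetical helper: stops with an error unless the data frame has the
# documented column names, allowed category values, and numeric charges.
check_insurance <- function(df) {
  expected <- c("age", "sex", "bmi", "children", "smoker", "region", "charges")
  stopifnot(
    identical(names(df), expected),
    all(df$sex %in% c("male", "female")),
    all(df$smoker %in% c("yes", "no")),
    all(df$region %in% c("southwest", "southeast", "northwest", "northeast")),
    is.numeric(df$charges)
  )
  invisible(TRUE)
}

check_insurance(toy)

# Mean billed charges by smoking status (toy values only).
charges_by_smoker <- aggregate(charges ~ smoker, data = toy, FUN = mean)
```

The same calls should work on the insurance data frame itself, provided its columns match the names listed under Format.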