Eric French and John Bailey Jones, "On the Distribution and Dynamics of
Health Care Costs", Journal of Applied Econometrics, Vol. 19, No. 6,
2004, pp. 705-721.

The data used in this paper were extracted from the first five waves of the
Health and Retirement Study (HRS), which is available online at
http://hrsonline.isr.umich.edu/.  The data are contained in two zip files,
hcdata-ascii.zip and hcdata-stata.zip.  Both contain the same 34,893
observations, but in different formats.  The data files themselves are
healdn1.txt, an ASCII file in DOS format, and healdn1.dta, a Stata file.

The data are ordered as follows:  

Column 1:   age
Column 2:   ageold (=1 if age>65)
Column 3:   assets
Column 4:   capy (capital income)
Column 5:   dead
Column 6:   dentisth (household dentist visits)
Column 7:   drtimes (head's doctor visits)
Column 8:   drtimesh (household doctor visits)
Column 9:   drugch (drug costs)
Column 10:  epins (employer provided insurance)
Column 11:  faminc (bottomcoded family income)
Column 12:  hhinc (family income)
Column 13:  hosph (hospital nights)
Column 14:  hrscoh (hrs cohort)
Column 15:  indnum (person id)
Column 16:  insnone (no insurance)
Column 17:  inspriv (private insurance)
Column 18:  ipremh (insurance premia)
Column 19:  lassets (log assets)
Column 20:  linc (log income)
Column 21:  lowinc (income<5000)
Column 22:  male
Column 23:  married
Column 24:  medcare (no insurance other than medicare)
Column 25:  medcosth (household medical costs)
Column 26:  medcosthr (household medical costs)
Column 27:  Medicaid
Column 28:  nodrugh (did not take prescription drugs for financial reasons)
Column 29:  nursing (nursing home nights)
Column 30:  nursingh (household nursing home nights)
Column 31:  ooph (out of pocket medical costs)
Column 32:  outsurgh (outpatient surgery)
Column 33:  wave
Columns 34-38:  wave1-wave5
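
As an illustration of how the column order above maps onto the ASCII file,
the following Python sketch attaches the variable names to each row.  It is
not part of the original programs.  It assumes that healdn1.txt is
whitespace-delimited with one observation per line and that a lone "." marks
a missing value; neither assumption is stated in this file, so check them
against the data before relying on the result.  The names COLUMNS and
parse_rows are hypothetical.

```python
# Variable names in the column order listed above (38 columns in total).
COLUMNS = [
    "age", "ageold", "assets", "capy", "dead", "dentisth", "drtimes",
    "drtimesh", "drugch", "epins", "faminc", "hhinc", "hosph", "hrscoh",
    "indnum", "insnone", "inspriv", "ipremh", "lassets", "linc", "lowinc",
    "male", "married", "medcare", "medcosth", "medcosthr", "Medicaid",
    "nodrugh", "nursing", "nursingh", "ooph", "outsurgh", "wave",
    "wave1", "wave2", "wave3", "wave4", "wave5",
]


def parse_rows(lines):
    """Yield one dict per observation from whitespace-delimited text lines.

    ASSUMPTIONS: fields are separated by whitespace, and "." denotes a
    missing value (returned as None).  str.split() also discards the
    carriage return that ends each line of a DOS-format file.
    """
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {lineno}: expected {len(COLUMNS)} fields, "
                f"got {len(fields)}"
            )
        yield {
            name: (None if field == "." else float(field))
            for name, field in zip(COLUMNS, fields)
        }


# Usage (uncomment once healdn1.txt has been unzipped locally):
# with open("healdn1.txt", newline="") as fh:
#     rows = list(parse_rows(fh))
```

If the assumptions hold, the resulting list should contain the 34,893
observations mentioned above.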

Below is a list of the programs used to generate the results shown in the
paper.  All these programs are contained in the file hcprgs.zip.

STATA FILES

momentdm.do: creates momentm.out, plus other output that is not needed.
  (This program is coded so that medcosth=250 if medcosth<250.  Changing the
  bottom-coding rule from medcosth=250 if medcosth<250 to drop if medcosth<=0
  and changing the .out file to momentz.out gives the HSZ coding of the
  data.)  Turn the .out files into GAUSS (.dht) files using DBMSCOPY.

healep11.do: creates hccosts2.txt and johndat.txt (ASCII).  Load johndat.txt
  into GAUSS and convert it to *.fmt format using GAUSS's "save" command.

healep12.do: creates hccosts3.txt and johndatz.txt (ASCII), the HSZ-coded
  data.  Convert johndatz.txt into *.fmt format using GAUSS.
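
The two coding rules described under momentdm.do differ only in how low
values of medcosth are treated.  The Python sketch below illustrates that
difference on a plain list of numbers.  It is not a translation of the Stata
programs, and the function names bottom_code and hsz_code are hypothetical.

```python
def bottom_code(costs, floor=250.0):
    """Default coding: set every value below `floor` equal to `floor`.

    Mirrors the rule "medcosth=250 if medcosth<250".
    """
    return [max(c, floor) for c in costs]


def hsz_code(costs):
    """HSZ coding: drop observations with non-positive costs.

    Mirrors the rule "drop if medcosth<=0"; remaining values are unchanged.
    """
    return [c for c in costs if c > 0]


# Illustrative values only, not taken from the data.
example = [0.0, 100.0, 250.0, 1000.0]
print(bottom_code(example))  # keeps all four observations
print(hsz_code(example))     # drops the zero-cost observation
```

The first rule keeps every observation but compresses the lower tail, while
the second changes the sample size, so moments computed under the two codings
are not directly comparable.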

GAUSS FILES

healcv2x.gau:  runs all models one at a time

healcv2xx.gau: runs all models in one big batch

healcv2xz.gau: the same as healcv2x.gau, but also includes an
  MA(1)+AR(1)+permanent person-specific random effect specification

mleest4: analyzes upper tail of cross-section (N.B.: looks for files in
  d:\hccosts\data)

mleest5: analyzes upper tail of cross-section, using HSZ-coded data

gradp2 and hessp2: robust numerical gradients and hessians.  Put these in the
  same directory as mleest4.

grdata4:  graphs output of mleest4

ghinter2:  equivalent differential calculations and graphs 

ghinter3: equivalent differential calculations and graphs, using HSZ-coded
  data

hclyr: estimates one-year analog to complete fitted lognormal model (to
  utilize 1 million simulations, we had to run this on a UNIX mainframe).

hclyr2: estimates one-year analog to complete standard lognormal model

hclyrz: estimates one-year analog to complete fitted lognormal model, using
  HSZ-coded data

hclyr2z: estimates one-year analog to complete standard lognormal model,
  using HSZ-coded data
  
hcshks5:  simulates lifetime health cost histories
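
For readers unfamiliar with the error structure named under healcv2xz.gau,
the Python sketch below simulates one person's history from a generic
textbook process: the sum of a permanent person-specific effect, an AR(1)
component, and an MA(1) component.  It is not taken from the GAUSS programs
and may differ from the paper's exact specification.  All parameter names
and values are hypothetical, shocks are assumed to be normal, and the AR(1)
state is started at zero for simplicity.

```python
import random


def simulate_person(T, rho, theta, sd_perm, sd_ar, sd_ma, rng):
    """Simulate T periods of permanent + AR(1) + MA(1) for one person.

    ASSUMPTIONS (illustrative only): normal shocks, AR(1) state and lagged
    MA shock both initialized at zero.
    """
    alpha = rng.gauss(0.0, sd_perm)  # permanent effect, drawn once per person
    ar = 0.0                         # AR(1) state
    prev_shock = 0.0                 # last period's MA innovation
    path = []
    for _ in range(T):
        ar = rho * ar + rng.gauss(0.0, sd_ar)   # a_t = rho*a_{t-1} + eps_t
        shock = rng.gauss(0.0, sd_ma)
        ma = shock + theta * prev_shock          # u_t = xi_t + theta*xi_{t-1}
        prev_shock = shock
        path.append(alpha + ar + ma)
    return path


# Five periods, matching the five waves; parameter values are made up.
rng = random.Random(0)
print(simulate_person(T=5, rho=0.9, theta=0.3,
                      sd_perm=1.0, sd_ar=0.5, sd_ma=0.5, rng=rng))
```

Setting theta to zero reduces the MA(1) term to white noise, which gives a
permanent + AR(1) + white noise process.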




