Explanation of Data Standards for Race, Ethnicity, Sex, Primary Language, and Disability

HHS examined current Federal data collection standards, the adequacy of prior testing, and the quality of the data produced in prior surveys; consulted with statistical agencies and programs; reviewed Office of Management and Budget (OMB) data collection standards and the Institute of Medicine (IOM) report Race, Ethnicity, and Language Data: Standardization for Health Care Quality Improvement; and built on its members' experience with collecting and analyzing demographic data i ii. HHS also paid special attention to current data collection policies for major HHS surveys and those of the Census Bureau.

The following criteria guided the development of data standards for each of the required variables:

  1. Evidence-based and demonstrated to have worked well in practice for national survey data collection.

  2. Represent a minimum data standard, with agencies permitted to collect as much additional detail as desired, provided that the additional detail could be aggregated back to the minimum standard.

  3. Standards mandated by Office of Management and Budget (OMB) would serve as the starting point for any minimum standard.

  4. Standards would be for person-level data collection, where respondents either self-report information or serve as the most knowledgeable respondent for all persons in a household survey.

I. Data Collection Standards and Rationale for Selection

A. Race and Ethnicity

The starting point for the race and ethnicity data collection standards is OMB's current government-wide standard, issued in 1997 after a comprehensive public engagement process and extensive field testing. The principles underlying these government-wide standards are described below. The justifications for these principles are described by OMB in detail at http://www.whitehouse.gov/omb/fedreg_1997standards/.
  • Self-identification is the preferred means of obtaining information about an individual's race and ethnicity, except in instances where observer identification is more practical. The surveyor should not tell an individual who he or she is, or specify how an individual should classify himself or herself.
  • To provide flexibility and ensure data quality, separate questions for race and ethnicity should be used wherever feasible. Specifically, when self-reporting or other self-identification approaches are used, ethnicity is asked first, and then race. The standard acknowledges that this approach might not work in other contexts (e.g., administrative records).
  • The specified race and ethnicity categories provide a minimum set of categories except when the collection involves a sample of such size that the data on the smaller categories would be unreliable, or when the collection effort focuses on a specific racial or ethnic group.
    • The OMB minimum categories for race are: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White.
    • The OMB minimum categories for ethnicity are: Hispanic or Latino and Not Hispanic or Latino.
  • When self-reporting or other self-identification approaches are used, respondents who wish to identify their multi-racial heritage may choose more than one race; there is no "multi-racial" category.
  • OMB encourages additional granularity where it is supported by sample size and as long as the additional detail can be aggregated back to the minimum standard set of race and ethnicity categories.
  • Any other variation will have to be specifically authorized by the OMB through the information collection clearance process. In those cases where the data collection is not subject to the information collection clearance process, a direct request for a variance should be made to OMB.

The categories for HHS data standards for race and ethnicity are based on the disaggregation of the OMB standard used in the American Community Survey (ACS) and the 2000 and 2010 Decennial Census. The data standard for race and ethnicity is listed below. Race and ethnicity data collection applies to survey participants of all ages.

Ethnicity Data Standard Categories
Are you Hispanic, Latino/a, or Spanish origin?
(One or more categories may be selected)
  1. ____No, not of Hispanic, Latino/a, or Spanish origin
  2. ____Yes, Mexican, Mexican American, Chicano/a
  3. ____Yes, Puerto Rican
  4. ____Yes, Cuban
  5. ____Yes, Another Hispanic, Latino, or Spanish origin
These categories roll up to the Hispanic or Latino category of the OMB standard
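The roll-up of the ethnicity item to the OMB minimum categories can be sketched in code. The following is a minimal illustration, not part of the standard: the integer codes 1 to 5 simply mirror the numbering of the answer options above, and the function name `rollup_ethnicity` is hypothetical. The standard does not say how to treat a response that selects "No" together with a "Yes" option, so this sketch assumes such responses are flagged for review rather than silently coded.

```python
# Assumed codes: they mirror the numbering of the ethnicity answer options above.
NOT_HISPANIC = 1
HISPANIC_CODES = {2, 3, 4, 5}  # Mexican, Puerto Rican, Cuban, Another origin


def rollup_ethnicity(selected):
    """Map the selected HHS ethnicity codes to an OMB ethnicity category.

    Returns None when nothing was selected (no response).
    """
    selected = set(selected)
    if not selected:
        return None
    unknown = selected - ({NOT_HISPANIC} | HISPANIC_CODES)
    if unknown:
        raise ValueError(f"unknown ethnicity codes: {sorted(unknown)}")
    if NOT_HISPANIC in selected and selected & HISPANIC_CODES:
        # Assumption: contradictory answers are surfaced, not resolved here.
        raise ValueError("response selects both 'No' and a 'Yes' category")
    if selected & HISPANIC_CODES:
        return "Hispanic or Latino"
    return "Not Hispanic or Latino"
```

For example, a respondent who marks both "Mexican" and "Puerto Rican" (codes 2 and 3) is still counted once, in the OMB category "Hispanic or Latino".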


Race Data Standard Categories
What is your race?
(One or more categories may be selected)
  1. ____White
  2. ____Black or African American
  3. ____American Indian or Alaska Native

These categories are part of the current OMB standard


  1. ____Asian Indian
  2. ____Chinese
  3. ____Filipino
  4. ____Japanese
  5. ____Korean
  6. ____Vietnamese
  7. ____Other Asian

These categories roll up to the Asian category of the OMB standard


  1. ____Native Hawaiian
  2. ____Guamanian or Chamorro
  3. ____Samoan
  4. ____Other Pacific Islander

These categories roll up to the Native Hawaiian or Other Pacific Islander category of the OMB standard
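The race roll-up described above amounts to a lookup from each of the 14 detailed categories to one of the five OMB minimum categories. The sketch below is illustrative only; the dictionary name `HHS_TO_OMB_RACE` and the function `rollup_race` are hypothetical, and category labels are used as keys purely for readability. Because respondents may select more than one category and there is no "multi-racial" category, the function returns every OMB category that applies.

```python
# Lookup from each HHS race category to its OMB minimum category.
HHS_TO_OMB_RACE = {
    "White": "White",
    "Black or African American": "Black or African American",
    "American Indian or Alaska Native": "American Indian or Alaska Native",
    "Asian Indian": "Asian",
    "Chinese": "Asian",
    "Filipino": "Asian",
    "Japanese": "Asian",
    "Korean": "Asian",
    "Vietnamese": "Asian",
    "Other Asian": "Asian",
    "Native Hawaiian": "Native Hawaiian or Other Pacific Islander",
    "Guamanian or Chamorro": "Native Hawaiian or Other Pacific Islander",
    "Samoan": "Native Hawaiian or Other Pacific Islander",
    "Other Pacific Islander": "Native Hawaiian or Other Pacific Islander",
}


def rollup_race(selected):
    """Return the sorted OMB race categories for the selected HHS categories.

    A KeyError signals a label that is not part of the data standard.
    """
    return sorted({HHS_TO_OMB_RACE[category] for category in selected})
```

A respondent who selects "Chinese" and "Filipino" rolls up to the single OMB category "Asian", while one who selects "Samoan" and "White" rolls up to two OMB categories.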

Rationale for Race and Ethnicity Data Standards

As a result of the 1997 HHS data inclusion policy, the basic OMB standard is already included in most HHS data collection initiatives. The new HHS data standards for race and ethnicity include additional granularity, but all categories roll up to the OMB standard. Because additional granularity in the race and ethnicity categories is important for documenting and tracking health disparities, large federal surveys such as the National Health Interview Survey (NHIS), the Current Population Survey (CPS), and the ACS have already implemented a more granular strategy, particularly for Hispanic and Asian subpopulations.

Accordingly, the new data standards for race and ethnicity are a slightly modified version of the ACS and Decennial Census questions. These items provide additional granularity for Hispanic (four additional categories) and Asian (seven additional categories) subpopulations beyond the OMB minimum standard categories. The race and ethnicity categories for the ACS and recent Decennial Census have been tested and structured to increase response rates, validity, and reliability iii. The more detailed ACS and recent Decennial Census race categories roll up to the five OMB standard categories: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. As with OMB standards, respondents are instructed to mark all categories that apply (i.e., they may select more than one racial category). The ACS and Decennial Census ethnicity categories roll up to the OMB standard categories: Hispanic or Latino and Not Hispanic or Latino iv v. Respondents are also able to select more than one ethnicity category. The recommended standard conforms to the methods, logistics, practices, and limitations of major HHS surveys, where population estimates are the goal.

HHS agencies may request permission from OMB during the Paperwork Reduction Act clearance process to add a write-in option of "other" to interviewer-administered surveys. This respondent-specified race must then be coded by the agency to the OMB and HHS standards before results are publicly reported.

B. Sex

The data standard for sex is male and female. Sex data collection applies to survey participants of all ages.

Sex Data Standard
What is your sex?
  1. ____Male
  2. ____Female

Rationale for Sex Data Standard

For the purpose of this report, the category of sex was defined as biologic sex. Sexual orientation and gender identity were considered as separate concepts. The Department has developed a data progression plan for collecting sexual orientation data and has conducted gender identity data collection listening sessions.

C. Primary Language

The standard for primary language is a measure of English proficiency. The recommended question is based on that used on the ACS. The question applies to survey participants aged five years and above.

Data Standard for Primary Language
How well do you speak English? (5 years old or older)
  1. ____Very well
  2. ____Well
  3. ____Not well
  4. ____Not at all

The primary language data standard represents a minimum standard and the question and answer categories cannot be changed. Additional questions on language may be added to any survey as long as the minimum standard is included.

Optional Granularity

For agencies that wish to collect data on the specific language spoken, the Data Council recommends collecting data on language spoken at home. The recommended survey items are used in the ACS (see below). Collecting this additional information would be optional and at the discretion of the agency.

Data Collection for Spoken Language

  1. Do you speak a language other than English at home? (5 years old or older)
    1. ____Yes
    2. ____No

    For persons speaking a language other than English (answering yes to the question above):
  2. What is this language? (5 years old or older)
    1. ____Spanish
    2. ____Other Language (Identify)

For agencies that desire to collect information on specific languages beyond Spanish, and that have sufficient sample sizes to support such estimates, HHS would publish on the HHS website a list of the ten most prevalent languages spoken in the U.S., as reported by the ACS, along with technical notes to assist in coding. These languages would roll up to the "Other Language" category. Spanish as a category is reported about 60 percent of the time in the ACS vi.
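The language items above follow a simple skip pattern: all apply only to persons 5 years old or older, and the question about the specific language is asked only of persons who report speaking a language other than English at home. The following is a minimal sketch of that logic under stated assumptions; the function name `language_items_to_ask` and its parameters are hypothetical, and the sketch assumes the agency has chosen to include the optional language-at-home items.

```python
MIN_AGE_LANGUAGE = 5  # all language items apply to persons 5 years old or older

ENGLISH_PROFICIENCY = "How well do you speak English?"
OTHER_LANGUAGE_AT_HOME = "Do you speak a language other than English at home?"
WHICH_LANGUAGE = "What is this language?"


def language_items_to_ask(age, speaks_other_language_at_home=None):
    """Return the language questions that apply to one person.

    `speaks_other_language_at_home` is True or False once the at-home
    question has been answered, and None before it has been asked.
    """
    if age < MIN_AGE_LANGUAGE:
        return []
    items = [ENGLISH_PROFICIENCY, OTHER_LANGUAGE_AT_HOME]
    if speaks_other_language_at_home:
        # Follow-up is asked only after a "Yes" to the at-home question.
        items.append(WHICH_LANGUAGE)
    return items
```

The English proficiency question is the required minimum; an agency that collects only the minimum standard would ask that item alone.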

Rationale for Primary Language Data Standard

The survey item selected for the minimum standard is based on the ACS, which assesses both English proficiency and language spoken other than English, and has been collected by the Census Bureau since 1980.

For statistical, planning, analytical, and research purposes, disparities have been associated with English language proficiency rather than with the specific language spoken. For clinical purposes relating to an individual, specific language and proficiency would both be needed. This recommendation is consistent with language recommendations from the Institute of Medicine report Race, Ethnicity, and Language Data: Standardization for Health Care Quality Improvement.

Several HHS surveys currently collect data on language or English proficiency, primarily for administrative purposes in the preliminary screening phase of in-person or telephone interview surveys, to determine how or in what language the interview will be administered. It is not the intent of this standard to disrupt those screening practices.

D. Disability Status

The six-item set of questions used on the ACS and other major surveys to gauge disability is the data standard for survey questions on disability. Note the age thresholds for survey participants for the different disability questions.

Data Standard for Disability Status

  1. Are you deaf or do you have serious difficulty hearing?
    1. ____Yes
    2. ____No

  2. Are you blind or do you have serious difficulty seeing, even when wearing glasses?
    1. ____Yes
    2. ____No

  3. Because of a physical, mental, or emotional condition, do you have serious difficulty concentrating, remembering, or making decisions? (5 years old and older)
    1. ____Yes
    2. ____No

  4. Do you have serious difficulty walking or climbing stairs? (5 years old and older)
    1. ____Yes
    2. ____No

  5. Do you have difficulty dressing or bathing? (5 years old and older)
    1. ____Yes
    2. ____No

  6. Because of a physical, mental, or emotional condition, do you have difficulty doing errands alone such as visiting a doctor's office or shopping? (15 years old and older)
    1. ____Yes
    2. ____No

The six-item disability standard represents a minimum standard and the questions and answer categories cannot be changed. Additional questions on disability may be added to any survey as long as the minimum standard is included. If the ACS changes the disability questions in the future, HHS will revisit the standard and modify as necessary.
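The age thresholds attached to the six disability questions determine which items apply to a given respondent. The sketch below illustrates that selection; the short item labels (such as "hearing" and "self_care") and the function name `applicable_disability_items` are hypothetical shorthand introduced here, not terms from the standard. Items with no stated age threshold are assumed to apply at all ages.

```python
# (hypothetical short label, minimum age in years) in question order 1-6.
DISABILITY_ITEMS = [
    ("hearing", 0),              # question 1: no age threshold stated
    ("vision", 0),               # question 2: no age threshold stated
    ("cognitive", 5),            # question 3: 5 years old and older
    ("ambulatory", 5),           # question 4: 5 years old and older
    ("self_care", 5),            # question 5: 5 years old and older
    ("independent_living", 15),  # question 6: 15 years old and older
]


def applicable_disability_items(age):
    """Return the labels of the disability questions that apply at this age."""
    return [label for label, min_age in DISABILITY_ITEMS if age >= min_age]
```

Under these thresholds a 3-year-old is asked only the hearing and vision questions, a 10-year-old is asked the first five, and anyone 15 or older is asked the full set of six.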

Rationale for Disability Data Standard

The six-item set of questions used on the ACS and other major surveys to measure disability was developed by a federal interagency committee and reflects the change in how disability is conceptualized, consistent with the International Classification of Functioning, Disability, and Health. The question set defines disability from a functional perspective and was developed so that disparities between the 'disabled' and 'nondisabled' populations can be monitored. The question set went through several rounds of cognitive and field testing and has been adopted in many federal data collection systems. OMB has encouraged the use of this question set by other federal agencies conducting similar population studies because of the extensive testing used in the development of these measures, including the finding that alternative measures did not test as well. Cognitive testing of these questions revealed that the six questions must be used as a set to assure a meaningful measure of disability vii.


i. OMB (Office of Management and Budget). (1977). Statistical Policy Directive No. 15, Race and Ethnic Standards for Federal Statistics and Administrative Reporting.

ii. IOM (Institute of Medicine). (2009). Race, Ethnicity, and Language Data: Standardization for Health Care Quality Improvement. Washington, DC: The National Academies Press.

iii. Alberti, N. (2006). The 2005 National Census Test: Analysis of the Race and Ethnicity Questions. Final Report, 2005 National Census Test Analysis. U.S. Census Bureau.

iv. Office of Management and Budget. (1997a). Recommendation from the Interagency Committee for the Review of the Racial and Ethnic Standards to the Office of Management and Budget Concerning Changes to the Standards for Classification of Federal Data on Race and Ethnicity, Federal Register: 62: 36873-36946, July 9.

v. Office of Management and Budget. (1997b). Revisions to the Standards for the Classification of Federal Data on Race and Ethnicity, Federal Register: 62: No. 210, October 30.

vi. Shin, Hyon B. and R. Kominski. (2010). Language Use in the United States: 2007, American Community Survey Reports, ACS-12. U.S. Census Bureau, Washington, DC.

vii. Brault, M., S. Stern, D. Raglin. (2007). Evaluation Report Covering Disability, American Community Survey Content Test Report P.4. U.S. Census Bureau, Washington, DC.