High-Quality Curated Data for Healthcare AI - Free to Download

The Open Data initiative is dedicated to making quality healthcare data accessible to developers, researchers, and analysts who are passionate about solving AI problems in healthcare.

Physician Dictation Audio Files

A set of 16 hours of audio in which physicians dictate patients' clinical conditions and plans of care, based on physician-patient encounters in hospital or clinical settings.


Verbatim Transcribed Text Files

A set of transcripts corresponding to the dictation audio dataset. The transcription is verbatim, as required for training the acoustic and vocabulary models used in speech recognition.