With cost savings and efficiency improvements high on the agenda of every healthcare provider, many have investigated voice recognition technology. More recently, digital voice technology has allowed clinicians to replace the dictation of clinical notes to tape with digital recording, while the voice file is still transcribed manually from audio to text. This article looks at the best uses for each technology and explores their relative strengths and weaknesses.
Voice Recognition (VR)
VR converts voice input to text in real time, and can also convert a stored voice file to text. The most common implementation in healthcare is one where a clinician speaks into a microphone, sees what is said appear on a screen as text and edits that text through simple voice commands.
Digital Dictation (DD)
DD allows voice files to be captured in a digital format. Simply, this technology is based on Digital Voice Recorders (DVRs) replacing analogue tape machines. Once a message is recorded in digital form and loaded onto a computer system, it may be used for transcription just as tapes are used currently.
Medical Transcription (MT)
This is the act of retrieving a voice file and manually transcribing it to text format while applying appropriate medical knowledge to verify the content. MT is practised widely across healthcare and is based primarily on tapes recorded by clinicians, which are played back and transcribed by medical secretaries.
Who does what?
The common method of producing a note to be stored in a patient’s record and disseminated to an intended recipient is as follows:
- Clinician (author) dictates note using a hand-held tape recorder.
- Author places tape, along with the relevant patient’s medical records folder, in secretary’s in-tray.
- Secretary retrieves tape, places in player and rewinds to find start of relevant file.
- Secretary opens Word document and types in the clinician’s notes, referencing various hospital systems and the patient’s medical records folder as and when necessary.
- Secretary saves the document to a specific folder or virtual drive for later retrieval, prints a hard copy and places it in the author’s in-tray for approval.
- Author retrieves the document, makes any necessary alterations in pen and returns the document to the secretary’s in-tray.
- Secretary performs the necessary alterations.
- Repeat the previous three steps (steps 5, 6 and 7 of this list) as many times as necessary until the author is satisfied.
- Author signs the document and returns it to the secretary’s in-tray.
- Secretary retrieves the document and prints another copy for the patient’s medical record.
- Secretary folds the signed document, places in an appropriately addressed envelope, seals the envelope and places in the external mail.
- Secretary places copy of document into the patient’s medical records folder and sends for storage.
Many readers will recognise parts, if not all, of that process as being only too familiar. But how can technology speed up the process and how can it save money?
What can VR do?
Voice Recognition can completely change this process by allowing the document to be produced as the author dictates. This removes most of the secretary’s role in what is a time-consuming, costly and iterative process, and results in a document that can be stored and disseminated electronically.
But it’s not quite that easy. In principle, the process works, but in practice there are some flaws. Background noise when dictating can reduce accuracy of transcription by the VR software. The best environments are where the author can work with few interruptions – which is why the greatest penetration of VR technology into the health service up until now has been within the radiology sector.
Secondly – and the eternal bane of a VR salesperson’s life – it takes time to train VR systems to recognise a particular author’s voice. This length of time cannot be quantified in advance: it may be hours, days or even months, depending on the user’s accent, clarity, tenacity and other factors.
VR takes up the author’s time in that they are dictating, reading and approving all at the same time. This is fine in some situations, such as when analysing a PACS image, but impossible in most clinical environments where time is of the essence. But should the author be willing to persevere with the training, and have the time and environment in which to work, VR can prove effective.
What can DD do?
Digital Dictation enables audio to be recorded in a digital format of a far higher quality than that available from audio tape. This allows quicker, more accurate transcription. The digital voice file can then be downloaded to a computer system for storage, transcription and dissemination, and the DVR automatically cleared for reuse.
The use of DD overcomes the issues involved with tapes being unreadable, ageing, getting overwritten or being lost, and can also save time as the secretary doesn’t have to wait for rewind when beginning transcription. Another great advantage is that, once the voice file has been stored on the system, it can be retained for as long as required.
So what are the drawbacks of DD? Again, heavy background noise can interfere, exactly as it does with analogue tapes. The DVRs themselves can be misplaced and can be a bit fiddly. Indeed, a number of manufacturers are increasing the size of their devices so as to make them harder to lose and make room for larger buttons.
On the whole, DVRs are a great improvement on tape machines and, because the files they create can be stored and managed electronically, they give rise to far better, and more cost-efficient, workflow management of the transcription process.
What can MT do?
Medical Transcription transforms a voice file into a text file and is performed by a skilled transcriber who listens to the voice file and types what is heard while applying their medical and/or clinical knowledge to ensure its veracity. Many transcribers will also add value by performing additional tasks ranging from some elements of clinical coding through to ‘filling the gaps’ left by the author within their dictation.
By aligning MT with DD and introducing an automated workflow, the 12-step process described above can be made far more cost-effective and timely.
Once the transcription has been performed, it can be presented to the author for approval. An electronic signature can then be added automatically and the transcribed document can be disseminated electronically. Both voice file and transcribed document can then be held on the system indefinitely.
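As a rough illustration of what such an automated workflow might look like, the sketch below models one dictation as a job that moves through four stages: dictated, transcribed, approved and sent. It is a minimal sketch, not a description of any real product: the class name, the status names, the file name and the rule that a transcription may be corrected any number of times before approval are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class Status(Enum):
    DICTATED = auto()     # voice file downloaded from the DVR
    TRANSCRIBED = auto()  # transcriber has produced (or corrected) the text
    APPROVED = auto()     # author has approved; electronic signature added
    SENT = auto()         # document disseminated electronically


# Allowed moves between stages (an assumption for this sketch).
# TRANSCRIBED -> TRANSCRIBED models a correction round before approval.
TRANSITIONS = {
    Status.DICTATED: {Status.TRANSCRIBED},
    Status.TRANSCRIBED: {Status.TRANSCRIBED, Status.APPROVED},
    Status.APPROVED: {Status.SENT},
    Status.SENT: set(),
}


@dataclass
class DictationJob:
    voice_file: str                    # hypothetical name of the stored audio
    text: str = ""
    signed_by: Optional[str] = None
    status: Status = Status.DICTATED
    history: List[Status] = field(default_factory=list)

    def _move(self, new_status: Status) -> None:
        # Refuse any step that skips a stage or repeats a finished one.
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.status = new_status
        self.history.append(new_status)

    def transcribe(self, text: str) -> None:
        self._move(Status.TRANSCRIBED)
        self.text = text

    def approve(self, author: str) -> None:
        self._move(Status.APPROVED)
        self.signed_by = author  # stands in for the electronic signature

    def send(self) -> None:
        self._move(Status.SENT)


if __name__ == "__main__":
    job = DictationJob("clinic_letter_001.wav")
    job.transcribe("First draft of the letter")
    job.transcribe("Corrected letter")  # one correction round
    job.approve("Dr Example")
    job.send()
    print([s.name for s in job.history])
```

Because both the voice file and the text stay attached to the same job, nothing needs to be printed, carried between in-trays or re-filed; each correction round is simply another call to `transcribe` before the author approves.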
In the past, outsourced transcription has had some bad press, but it is ubiquitous in the US, where the industry is estimated to be worth over $10 billion per year. In the UK, well-qualified, capable medical secretarial expertise is becoming more costly and harder to find.
The key has to be quality and, if a transcription services company that provides proven quality can be allied with DD and sound workflow management, then cost savings can be up to 50%. Even though competent medical secretaries will always be needed, outsourcing offers the advantage that transcription deadlines can be met even when volumes are peaking, or in times of staff sickness or holiday periods.
India is currently the major supplier of outsourced MT services and the competence and ongoing training of MT staff there is recognised by the industry as being of paramount importance.
No recommendation can be made without taking account of the individual’s needs. If the environment allows, VR can work well, but the author must accept that it may take some time before they are able to use the technology and that the initial learning curve might be quite steep. Also, VR can be a relatively expensive solution with a comparatively large up-front cost.
DD and MT should be used together for best results, and need a sound workflow management process to realise the greatest benefit. Outsourced transcription can offer enormous cost savings, but clinicians will always need secretaries, who will remain part of the transcription process, if only to check the quality of what is received from the outsourced service provider.