Ra Baryon Neutrino

Transform Your Speech into Text with the Power of OpenAI and useWhisper

This article was generated using ChatGPT from README.md

Demo

Are you tired of spending hours transcribing speech into text manually? Are you looking for a way to save time and increase accuracy? If so, you'll want to check out useWhisper, a React Hook for the OpenAI Whisper API that comes with a speech recorder and silence removal built in.

Introduction

Transcribing speech is a common task in many industries, including journalism, entertainment, and customer service. However, doing it by hand is time-consuming and error-prone. With useWhisper, you can record speech in the browser and have it transcribed into text by the OpenAI Whisper API.

Getting Started

Getting started with useWhisper is easy. First, install the package:

npm i @chengsokdara/use-whisper

or

yarn add @chengsokdara/use-whisper

Once you've installed useWhisper, you can start using it in your React app. Import the hook:

import { useWhisper } from '@chengsokdara/use-whisper'

Next, call the hook within your component, passing in your OpenAI API token:

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const {
    recording,
    speaking,
    transcript,
    transcribing,
    pauseRecording,
    startRecording,
    stopRecording,
  } = useWhisper({
    // Read the token from the environment instead of hard-coding it.
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
  })

  return (
    <div>
      <p>Recording: {recording}</p>
      <p>Speaking: {speaking}</p>
      <p>Transcribing: {transcribing}</p>
      <p>Transcribed Text: {transcript.text}</p>
      <button onClick={() => startRecording()}>Start</button>
      <button onClick={() => pauseRecording()}>Pause</button>
      <button onClick={() => stopRecording()}>Stop</button>
    </div>
  )
}

You're now ready to start using useWhisper to transcribe speech!

Configuration Options

One of the benefits of useWhisper is its flexibility. It provides several configuration options that you can use to customize your speech recognition experience.
The available options are:

  • apiKey: Your OpenAI API key (required).
  • autoStart: Automatically start speech recording on component mount.
  • customServer: Supply your own Whisper REST API endpoint.
  • nonStop: If true, recording will auto-stop after stopTimeout. However, if the user keeps speaking, the recorder will keep recording.
  • removeSilence: Remove silence before sending the file to OpenAI API.
  • stopTimeout: If nonStop is true, this becomes required. This controls when the recorder auto-stops.
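
To make the relationship between nonStop and stopTimeout concrete, here is a minimal sketch of an options check you could run before handing the object to the hook. buildWhisperOptions is a hypothetical helper, not part of use-whisper, and treating stopTimeout as a number of milliseconds is an assumption, since the list above does not state the unit.

```javascript
// Hypothetical helper (not part of use-whisper): checks an options object
// against the rules in the list above before it is passed to the hook.
function buildWhisperOptions(options) {
  const { apiKey, nonStop = false, stopTimeout } = options;

  // apiKey is marked as required in the option list.
  if (typeof apiKey !== 'string' || apiKey.length === 0) {
    throw new Error('apiKey is required');
  }

  // stopTimeout becomes required once nonStop is true.
  // Assumption: it is a positive number (milliseconds).
  if (nonStop && !(Number.isFinite(stopTimeout) && stopTimeout > 0)) {
    throw new Error('stopTimeout is required when nonStop is true');
  }

  return { ...options, nonStop };
}

// Example: keep recording while the user speaks, auto-stop afterwards.
const whisperOptions = buildWhisperOptions({
  apiKey: 'sk-example', // placeholder, not a real key
  nonStop: true,
  stopTimeout: 5000, // assumed to be milliseconds
  removeSilence: true,
});
```

The returned object uses only the option names listed above, so it could be passed to useWhisper(whisperOptions) unchanged.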

Methods and States

In addition to the configuration options, useWhisper also returns several states that describe what the recorder is doing and several methods that you can use to control speech recording:

  • recording: Speech recording state.
  • speaking: Detect when the user is speaking.
  • transcript: Object returned after Whisper transcription is complete.
  • transcribing: True while silence is being removed from the speech and the request is being sent to the OpenAI Whisper API.
  • pauseRecording: Pause speech recording.
  • startRecording: Start speech recording.
  • stopRecording: Stop speech recording.
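
If the three state values are booleans (an assumption here; the list above does not spell out their types), note that React renders nothing for true or false inside JSX, so a line like Recording: {recording} would show an empty value. One way around that is to collapse the states into a single label. describeStatus below is a hypothetical helper for illustration, not something the package exports.

```javascript
// Hypothetical helper (not part of use-whisper): maps the hook's state
// flags to one human-readable label for display.
// Assumption: recording, speaking, and transcribing are booleans.
function describeStatus({ recording, speaking, transcribing }) {
  // Transcription is checked first so the label reflects pending API work.
  if (transcribing) return 'Transcribing...';
  if (recording && speaking) return 'Recording (speech detected)';
  if (recording) return 'Recording (waiting for speech)';
  return 'Idle';
}
```

Inside the component you could then render <p>Status: {describeStatus({ recording, speaking, transcribing })}</p> in place of the three separate lines.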

Transcript Object

The transcript object returned from the hook contains the following:

  • blob: Recorded speech in JavaScript Blob.
  • text: Transcribed text returned from Whisper API.
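
As a rough sketch of how you might consume these two fields, the helper below reads them defensively. summarizeTranscript is a hypothetical name, and the idea that text or blob may be missing before the first transcription finishes is an assumption rather than documented behavior.

```javascript
// Hypothetical helper (not part of use-whisper): reads the two documented
// fields of the transcript object without assuming either is set yet.
function summarizeTranscript(transcript) {
  const text = transcript && transcript.text ? transcript.text.trim() : '';
  // Blob objects expose their length in bytes through the size property.
  const bytes = transcript && transcript.blob ? transcript.blob.size : 0;

  return {
    text: text || '(no transcription yet)',
    wordCount: text ? text.split(/\s+/).length : 0,
    bytes,
  };
}
```

Calling summarizeTranscript(transcript) in the component gives you a fallback string to show while nothing has been transcribed, plus the size of the recorded audio.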

Source Code

chengsokdara / use-whisper

React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in


  • Demo

  • Real-Time transcription demo
use-whisper-real-time-transcription.mp4

  • Announcement

    useWhisper for React Native is being developed.

Repository: http://github.com/chengsokdara/use-whisper-native

Progress: chengsokdara/use-whisper-native#1


Contact me

for web or mobile app development using React or React Native
chengsokdara@gmail.com
http://chengsokdara.github.io

Conclusion

useWhisper is a practical choice for anyone who needs to transcribe speech into text quickly and accurately. Its built-in speech recorder saves you from wiring up recording yourself, and because silence is removed before the file is sent to the OpenAI API, less audio is uploaded, which can reduce cost. Whether you're a journalist, customer service representative, or anyone who needs to transcribe speech, useWhisper is worth a look. Try it out today!

Top comments (1)

Nevo David

Been cool seeing steady progress in stuff like this - really cuts through the headache of manual transcribing.