DEV Community

Ra Baryon Neutrino

Posted on Jun 4

Transform Your Speech into Text with the Power of OpenAI and useWhisper

#react #whisper #openai #nextjs

This article was generated using ChatGPT from README.md

Are you tired of spending hours transcribing speech into text manually? Are you looking for a way to save time and increase accuracy? If so, check out useWhisper, a React Hook for the OpenAI Whisper API that comes with a speech recorder and silence removal built in.

Introduction

Transcribing speech is a common task in many industries, including journalism, entertainment, and customer service. However, manual transcription is time-consuming and error-prone. With useWhisper, you can record speech in the browser and transcribe it into text through the OpenAI Whisper API.

Getting Started

Getting started with useWhisper is easy. First, install the package with npm or yarn:

npm i @chengsokdara/use-whisper

yarn add @chengsokdara/use-whisper

Once you've installed useWhisper, you can start using it in your React app. Import the hook:

import { useWhisper } from '@chengsokdara/use-whisper'

Next, call the hook within your component, passing in your OpenAI API token:

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const {
    recording,
    speaking,
    transcript,
    transcribing,
    pauseRecording,
    startRecording,
    stopRecording,
  } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
  })

  return (
    <div>
      {/* String() makes boolean states visible; React renders nothing for raw booleans */}
      <p>Recording: {String(recording)}</p>
      <p>Speaking: {String(speaking)}</p>
      <p>Transcribing: {String(transcribing)}</p>
      <p>Transcribed Text: {transcript.text}</p>
      <button onClick={() => startRecording()}>Start</button>
      <button onClick={() => pauseRecording()}>Pause</button>
      <button onClick={() => stopRecording()}>Stop</button>
    </div>
  )
}

You're now ready to start using useWhisper to transcribe speech!

Configuration Options

One of the benefits of useWhisper is its flexibility. It provides several configuration options that you can use to customize your speech recognition experience.
The available options are:

apiKey: Your OpenAI API key (required).
autoStart: Automatically start speech recording on component mount.
customServer: Supply your own Whisper REST API endpoint.
nonStop: If true, recording auto-stops after stopTimeout. However, if the user keeps speaking, the recorder keeps recording.
removeSilence: Remove silence before sending the file to the OpenAI API.
stopTimeout: Required if nonStop is true. Controls when the recorder auto-stops.
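
The rule that stopTimeout becomes required once nonStop is enabled can be checked before the options reach the hook. The sketch below uses a hypothetical helper, buildWhisperOptions, which is not part of useWhisper. It relies only on the option names listed above; the default values and the idea that stopTimeout is a number of milliseconds are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of useWhisper: merges user options with
// illustrative defaults and enforces the nonStop/stopTimeout rule.
function buildWhisperOptions(userOptions) {
  const options = {
    autoStart: false, // assumed default, for illustration only
    nonStop: false, // assumed default
    removeSilence: false, // assumed default
    ...userOptions,
  }

  // apiKey is listed as required.
  if (typeof options.apiKey !== 'string' || options.apiKey === '') {
    throw new Error('apiKey is required')
  }

  // stopTimeout is only mandatory when nonStop is enabled.
  const hasTimeout =
    Number.isFinite(options.stopTimeout) && options.stopTimeout > 0
  if (options.nonStop && !hasTimeout) {
    throw new Error('stopTimeout is required when nonStop is true')
  }

  return options
}

// Usage: the result would be passed to the hook as useWhisper(options).
const options = buildWhisperOptions({
  apiKey: 'YOUR_OPEN_AI_TOKEN',
  nonStop: true,
  stopTimeout: 5000, // assumed to be milliseconds
})
console.log(options.nonStop, options.stopTimeout)
```

Failing early like this keeps a misconfigured nonStop setup from surfacing only after the user has started recording.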

Methods and States

In addition to the configuration options, useWhisper returns several states and methods that you can use to monitor and control speech recording:

recording: Speech recording state.
speaking: Detect when the user is speaking.
transcript: Object returned after Whisper transcription is complete.
transcribing: Set while silence is being removed from the speech and the request is being sent to the OpenAI Whisper API.
pauseRecording: Pause speech recording.
startRecording: Start speech recording.
stopRecording: Stop speech recording.
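
The state values can be combined into a single status label for the UI. This is a minimal sketch using a hypothetical helper, describeStatus. It assumes that recording, speaking and transcribing are booleans, which the list above does not state explicitly.

```javascript
// Hypothetical helper: maps the hook's state values to one status label.
// Assumes recording, speaking and transcribing are booleans.
function describeStatus({ recording, speaking, transcribing }) {
  if (transcribing) return 'Transcribing...'
  if (recording && speaking) return 'Recording (speech detected)'
  if (recording) return 'Recording (waiting for speech)'
  return 'Idle'
}

// Usage: inside a component this could be rendered as
// <p>{describeStatus({ recording, speaking, transcribing })}</p>
console.log(
  describeStatus({ recording: true, speaking: false, transcribing: false })
)
```

Checking transcribing first is a design choice: it means the label reports the pending request even if the recording flag has not been reset yet.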

Transcript Object

The transcript object returned from the hook contains the following:

blob: Recorded speech as a JavaScript Blob.
text: Transcribed text returned from Whisper API.
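
A component usually has to cope with the transcript not being populated yet. The sketch below uses a hypothetical helper, summarizeTranscript, and assumes that blob and text may be undefined until a transcription finishes. Only the blob and text fields come from the list above; size and type are standard properties of a JavaScript Blob.

```javascript
// Hypothetical helper: reads the two documented fields defensively.
function summarizeTranscript(transcript) {
  const text = (transcript?.text ?? '').trim()
  const blob = transcript?.blob
  return {
    text,
    hasAudio: blob !== undefined && blob !== null,
    // Blob.size is the length of the recorded data in bytes.
    audioBytes: blob ? blob.size : 0,
    // Blob.type is the MIME type, or an empty string when unknown.
    audioType: blob ? blob.type : '',
  }
}

// Usage with a stand-in Blob (global in browsers and in Node.js 18+).
const sample = {
  blob: new Blob(['abc'], { type: 'audio/mpeg' }),
  text: ' hello world ',
}
console.log(summarizeTranscript(sample))
```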

Source Code

chengsokdara / use-whisper

React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in

Announcement

useWhisper for React Native is being developed.

Repository: http://github.com/chengsokdara/use-whisper-native

Progress: chengsokdara/use-whisper-native#1

Contact me

for web or mobile app development using React or React Native
chengsokdara@gmail.com
http://chengsokdara.github.io

Conclusion

useWhisper is a game-changer for anyone who needs to transcribe speech into text quickly and accurately. With its built-in speech recorder and silence removal, you can save time and reduce cost. Whether you're a journalist, customer service representative, or anyone who needs to transcribe speech, useWhisper is the tool for you. Try it out today!