This open-source project provides a self-hostable API for speech-to-text transcription using a fine-tuned Whisper ASR model. The API converts audio files to text through HTTP requests, which makes it a good fit for adding speech recognition to your applications.
Key Features
- Accurate Speech Recognition: Uses a fine-tuned Whisper model for high-quality speech-to-text conversion.
- Simple HTTP API: Transcribe audio files with plain HTTP requests.
- User-Level Access: Manage usage with API keys for different users.
- Self-Hostable: Deploy your own speech transcription service for privacy and control.
- Optimized for Speed: Uses a quantized model for fast and efficient inference.
- Open Source: Fully transparent and customizable to fit your needs.
Installation
To install the necessary dependencies, run the following commands:
# Install ffmpeg for Audio Processing
sudo apt install ffmpeg
# Install Python Package
pip install -r requirements.txt
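As a quick check after installation, the sketch below looks for the ffmpeg executable on PATH. The helper name `find_missing_tools` is hypothetical and not part of this project; it only confirms that the binary can be found, not that the installed version is suitable.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# ffmpeg is the only external tool this README asks you to install.
missing = find_missing_tools(["ffmpeg"])
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("ffmpeg found at", shutil.which("ffmpeg"))
```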
Running the Project
To run the project, use the following command:
uvicorn app.main:app --reload
Get Your Token
To get your token, use the following command:
curl -X 'POST' \
'https://innovatorved-whisper-api.hf.space/api/v1/users/get_token' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"email": "example@domain.com",
"password": "password"
}'
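The same token request can be sketched in Python with only the standard library. This assumes the endpoint, headers, and JSON fields from the curl command above; the function names are hypothetical, and because the response schema is not documented here, the decoded JSON is returned unchanged.

```python
import json
import urllib.request

# Assumption: same endpoint as the curl command in this README.
TOKEN_URL = "https://innovatorved-whisper-api.hf.space/api/v1/users/get_token"


def build_token_request(email, password, url=TOKEN_URL):
    """Build (but do not send) the POST request that asks for a token."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
    )


def fetch_token_response(email, password):
    """Send the request and return the decoded JSON body as a dict.

    The layout of the response is not described in this README,
    so no field is extracted here.
    """
    request = build_token_request(email, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_token_response("example@domain.com", "password")` performs the network request; inspect the returned dict to find the token field.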
Example to Transcribe a File
To upload a file and transcribe it, send a POST request with the file, your token, and a model name, as in the command at the end of this section.
Note: The token in the example is a dummy and will not work. Use the token provided by the admin.
The following models are available:
tiny.en
tiny.en.q5
base.en.q5
# Replace the token and audioFilePath.wav with your own values
curl -X 'POST' \
'http://localhost:8000/api/v1/transcribe/?model=tiny.en.q5' \
-H 'accept: application/json' \
-H 'Authentication: e9b7658aa93342c492fa64153849c68b8md9uBmaqCwKq4VcgkuBD0G54FmsE8JT' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@audioFilePath.wav;type=audio/wav'
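For callers who prefer Python to curl, here is a minimal sketch of the same upload using only the standard library. It assumes the URL, the `Authentication` header, the `file` form field, and the model names shown above; `build_transcribe_request` is a hypothetical helper, and the multipart body is assembled by hand for a single WAV file.

```python
import urllib.parse
import urllib.request
import uuid

# Model names listed in this README.
AVAILABLE_MODELS = ("tiny.en", "tiny.en.q5", "base.en.q5")


def build_transcribe_request(audio_name, audio_bytes, token,
                             model="tiny.en.q5",
                             base_url="http://localhost:8000"):
    """Build (but do not send) a multipart POST to the transcribe endpoint."""
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model {model!r}; choose from {AVAILABLE_MODELS}")

    # A random boundary separates the parts of the multipart body.
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{audio_name}"\r\n'.encode(),
        b"Content-Type: audio/wav\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    query = urllib.parse.urlencode({"model": model})
    return urllib.request.Request(
        f"{base_url}/api/v1/transcribe/?{query}",
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Authentication": token,  # header name taken from the curl example
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, read the audio with `pathlib.Path("audioFilePath.wav").read_bytes()`, pass the resulting request to `urllib.request.urlopen`, and decode the JSON response. The response fields are not documented here, so check them against your running server.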