# Edge TTS Web UI

A Progressive Web App (PWA) for converting text to speech using Microsoft Edge's online TTS service.

## Features

- 🎙️ **Text to Speech**: Convert any text to natural-sounding speech
- 🌍 **Multiple Languages**: Support for 100+ voices in various languages
- 🎛️ **Voice Customization**: Adjust speed, volume, and pitch
- 📱 **PWA Support**: Install as an app on any device
- 💾 **Offline Support**: Service worker caching for offline usage
- 📝 **History**: Keep track of recent generations
- ⬇️ **Download**: Save generated audio as MP3 files

## Installation

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)

### Setup

1. Navigate to the web directory:
   ```bash
   cd web
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

## Usage

### Start the Server

```bash
python server.py
```

Or with custom options:
```bash
python server.py --host 0.0.0.0 --port 8000
```

Options:
- `--host`: Host to bind to (default: 0.0.0.0)
- `--port`: Port to bind to (default: 8000)
- `--reload`: Enable auto-reload for development
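
For readers who want to see how options like these are typically wired up, here is a minimal sketch using `argparse`. It is not taken from `server.py`; the `build_parser` function name and the help strings are assumptions, and only the three flags and their defaults come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented --host, --port, --reload flags."""
    parser = argparse.ArgumentParser(description="Edge TTS Web UI server (sketch)")
    parser.add_argument("--host", default="0.0.0.0", help="Host to bind to")
    parser.add_argument("--port", type=int, default=8000, help="Port to bind to")
    parser.add_argument(
        "--reload", action="store_true", help="Enable auto-reload for development"
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--port", "8080", "--reload"])
print(f"host={args.host} port={args.port} reload={args.reload}")
```

Because `--port` is declared with `type=int`, a non-numeric value is rejected by `argparse` before the server would start.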

### Access the Web UI

Open your browser and navigate to:
```
http://localhost:8000
```

### Install as PWA

1. Open the web UI in a modern browser (Chrome, Edge, Safari, Firefox)
2. Look for the install prompt or click the "Install App" button
3. The app will be added to your home screen or app drawer

## API Endpoints

The server provides the following REST API endpoints:

### GET /api/health
Health check endpoint

**Response:**
```json
{
  "status": "healthy",
  "service": "edge-tts-api"
}
```
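
A health endpoint like this is handy for waiting until the server is ready, for example in a start-up script. The sketch below polls it with the standard library only. The helper names `fetch_health` and `wait_until_healthy`, the retry count, and the delay are assumptions; the URL path and the `"status": "healthy"` field come from the response above.

```python
import json
import time
import urllib.request


def fetch_health(base_url: str = "http://localhost:8000") -> dict:
    """GET /api/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/health", timeout=5) as resp:
        return json.load(resp)


def wait_until_healthy(fetch=fetch_health, attempts: int = 5, delay: float = 1.0) -> bool:
    """Call `fetch` until it reports a healthy status or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch().get("status") == "healthy":
                return True
        except OSError:
            pass  # Connection refused or timed out: the server is not up yet.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the fetch function as a parameter keeps the retry loop testable without a running server.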

### GET /api/voices
Get list of all available voices

**Response:**
```json
[
  {
    "Name": "en-US-EmmaMultilingualNeural",
    "ShortName": "en-US-EmmaMultilingualNeural",
    "Gender": "Female",
    "Locale": "en-US",
    "LocaleName": "English (United States)",
    ...
  }
]
```
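
With 100+ voices in the list, a client will usually want to group or filter them. The following sketch groups voice short names by locale, optionally filtered by gender. The function name `voices_by_locale` is hypothetical, and the sample entries are illustrative data shaped like the response above, trimmed to the fields the function reads.

```python
from collections import defaultdict


def voices_by_locale(voices, gender=None):
    """Group ShortName values by Locale, optionally keeping one gender only."""
    grouped = defaultdict(list)
    for voice in voices:
        if gender is not None and voice.get("Gender") != gender:
            continue
        grouped[voice["Locale"]].append(voice["ShortName"])
    # Sort locales and names so the result is stable for display.
    return {locale: sorted(names) for locale, names in sorted(grouped.items())}


# Illustrative sample, not a real server response.
sample = [
    {"ShortName": "en-US-EmmaMultilingualNeural", "Gender": "Female", "Locale": "en-US"},
    {"ShortName": "en-US-GuyNeural", "Gender": "Male", "Locale": "en-US"},
    {"ShortName": "de-DE-KatjaNeural", "Gender": "Female", "Locale": "de-DE"},
]

female_voices = voices_by_locale(sample, gender="Female")
print(female_voices)
```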

### POST /api/synthesize
Synthesize speech from text

**Request Body:**
```json
{
  "text": "Hello, world!",
  "voice": "en-US-EmmaMultilingualNeural",
  "rate": "+0%",
  "volume": "+0%",
  "pitch": "+0Hz"
}
```

**Response:**
Returns MP3 audio file

**Parameters:**
- `text` (required): Text to convert (max 5000 characters)
- `voice` (optional): Voice name (default: "en-US-EmmaMultilingualNeural")
- `rate` (optional): Speech rate from -100% to +100% (default: "+0%")
- `volume` (optional): Volume from -100% to +100% (default: "+0%")
- `pitch` (optional): Pitch from -500Hz to +500Hz (default: "+0Hz")
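
The parameter limits above can be checked on the client before a request is sent. This is a minimal sketch using only the standard library: `build_payload` and `synthesize` are hypothetical helper names, and the assumption that values always carry an explicit sign (as in `"+0%"` and `"+0Hz"`) is inferred from the examples, not from the server code. The server may validate differently.

```python
import json
import re
import urllib.request

MAX_TEXT_LENGTH = 5000
_PERCENT = re.compile(r"^[+-](\d+)%$")
_HERTZ = re.compile(r"^[+-](\d+)Hz$")


def _check_offset(name, value, pattern, limit):
    """Validate a signed offset such as '+10%' or '-50Hz' against a limit."""
    match = pattern.match(value)
    if match is None or int(match.group(1)) > limit:
        raise ValueError(f"{name} must be a signed value within +/-{limit}, got {value!r}")
    return value


def build_payload(text, voice="en-US-EmmaMultilingualNeural",
                  rate="+0%", volume="+0%", pitch="+0Hz"):
    """Build the JSON body for POST /api/synthesize using the documented limits."""
    if not text or len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text must be 1 to {MAX_TEXT_LENGTH} characters")
    return {
        "text": text,
        "voice": voice,
        "rate": _check_offset("rate", rate, _PERCENT, 100),
        "volume": _check_offset("volume", volume, _PERCENT, 100),
        "pitch": _check_offset("pitch", pitch, _HERTZ, 500),
    }


def synthesize(payload, base_url="http://localhost:8000"):
    """POST the payload and return the MP3 bytes (requires a running server)."""
    request = urllib.request.Request(
        f"{base_url}/api/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return resp.read()
```

With the server running, `open("speech.mp3", "wb").write(synthesize(build_payload("Hello, world!")))` would save the result to disk.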

### POST /api/synthesize-with-subtitles
Synthesize speech with subtitle generation

**Request Body:**
Same as /api/synthesize

**Response:**
```json
{
  "audio": "base64_encoded_audio_data",
  "subtitles": "SRT formatted subtitles",
  "format": "mp3"
}
```
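
Since the audio arrives base64-encoded inside JSON, the client has to decode it before saving. Here is a short sketch of that step. The function name `save_synthesis_result` and the output file naming are assumptions; the `audio`, `subtitles`, and `format` keys come from the response above.

```python
import base64
from pathlib import Path


def save_synthesis_result(result, stem, out_dir="."):
    """Write the decoded audio and the SRT text next to each other.

    `result` is the parsed JSON body of /api/synthesize-with-subtitles.
    Returns the two paths that were written.
    """
    out = Path(out_dir)
    audio_path = out / f"{stem}.{result.get('format', 'mp3')}"
    subtitle_path = out / f"{stem}.srt"
    audio_path.write_bytes(base64.b64decode(result["audio"]))
    subtitle_path.write_text(result["subtitles"], encoding="utf-8")
    return audio_path, subtitle_path
```

Giving both files the same stem lets most media players pick up the subtitles automatically.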

## File Structure

```
web/
├── index.html          # Main HTML page
├── styles.css          # Styles and theme
├── app.js              # Client-side JavaScript
├── manifest.json       # PWA manifest
├── sw.js               # Service worker
├── server.py           # FastAPI backend server
├── requirements.txt    # Python dependencies
├── icon-192.png        # App icon (192x192)
├── icon-512.png        # App icon (512x512)
└── README.md           # This file
```

## Development

### Running in Development Mode

```bash
python server.py --reload
```

This enables auto-reload when you modify the code.

### Testing

Test the API endpoints using curl:

```bash
# Get voices
curl http://localhost:8000/api/voices

# Synthesize speech
curl -X POST http://localhost:8000/api/synthesize \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello world","voice":"en-US-EmmaMultilingualNeural"}' \
  --output speech.mp3
```

### Customization

#### Update Icons

Replace `icon-192.png` and `icon-512.png` with your own icons.

For best results, create:
- 192x192 PNG for mobile devices
- 512x512 PNG for high-resolution displays

#### Update Theme Color

Edit the `--primary-color` variable in [styles.css](styles.css):

```css
:root {
    --primary-color: #2563eb;  /* Change this color */
}
```

Also update `theme_color` in [manifest.json](manifest.json).

## Browser Support

### PWA Features
- ✅ Chrome/Edge (Desktop & Mobile)
- ✅ Safari (iOS 11.3+)
- ✅ Firefox (Desktop & Android)
- ✅ Samsung Internet

### Service Worker
- ✅ All modern browsers
- ❌ IE11 (not supported)

## Troubleshooting

### Port Already in Use

If port 8000 is already in use:
```bash
python server.py --port 8080
```
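
To find out which port is free before starting the server, a bind test with the standard `socket` module is usually enough. This is only a sketch: the helper names `is_port_free` and `first_free_port` are hypothetical, and another process can still grab the port between this check and the server start.

```python
import socket


def is_port_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # Typically "address already in use".
    return True


def first_free_port(start=8000, attempts=20, host="0.0.0.0"):
    """Scan upward from `start` and return the first port that can be bound."""
    for port in range(start, start + attempts):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

The result can then be passed on the command line, for example `python server.py --port 8001` if `first_free_port()` returned 8001.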

### Icons Not Showing

Make sure `icon-192.png` and `icon-512.png` exist in the web directory.

### Voices Not Loading

Check the server logs for errors. The server needs an internet connection to fetch voices from Microsoft's API.

### CORS Issues

The server is configured to allow all origins for development. For production, update the CORS settings in [server.py](server.py):

```python
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://yourdomain.com"],  # Update this
    ...
)
```

## Deployment

### Production Considerations

1. **Use a production ASGI server**: Uvicorn with multiple workers
   ```bash
   uvicorn server:app --host 0.0.0.0 --port 8000 --workers 4
   ```

2. **Use a reverse proxy**: nginx or Apache for SSL/TLS

3. **Set environment variables**:
   ```bash
   export EDGE_TTS_HOST=0.0.0.0
   export EDGE_TTS_PORT=8000
   ```

4. **Update CORS settings**: Restrict to your domain

5. **Enable HTTPS**: Required for PWA installation
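
This README does not show how `server.py` consumes `EDGE_TTS_HOST` and `EDGE_TTS_PORT`, so the following is only a sketch of one common pattern: read the variables with the documented defaults as fallbacks. The function name `server_settings` and the port range check are assumptions.

```python
import os


def server_settings(environ=None):
    """Read host and port from the environment, falling back to the defaults.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if environ is None else environ
    host = env.get("EDGE_TTS_HOST", "0.0.0.0")
    port = int(env.get("EDGE_TTS_PORT", "8000"))  # Raises ValueError if not numeric.
    if not 1 <= port <= 65535:
        raise ValueError(f"EDGE_TTS_PORT out of range: {port}")
    return host, port


host, port = server_settings({"EDGE_TTS_HOST": "127.0.0.1", "EDGE_TTS_PORT": "9000"})
print(f"Would bind to {host}:{port}")
```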

### Docker Deployment

Create a `Dockerfile`:

```dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000

CMD ["python", "server.py", "--host", "0.0.0.0", "--port", "8000"]
```

Build and run:
```bash
docker build -t edge-tts-web .
docker run -p 8000:8000 edge-tts-web
```

## License

This web UI is built on top of [edge-tts](https://github.com/rany2/edge-tts).

## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

## Credits

- **edge-tts**: The underlying TTS library by [@rany2](https://github.com/rany2)
- **Microsoft Edge TTS**: The text-to-speech service