TTS stands for Text-to-Speech – a technology that transforms written text into spoken language. A previously created voice clone reads the text aloud, delivering natural-sounding voice output with ease.
STS stands for Speech-to-Speech – a form of speech synthesis that uses an existing voice recording as input. The voice clone speaks with the same intonation and speed as the original, preserving the natural rhythm and expression.
STT stands for Speech-to-Text. This feature converts spoken audio into written text, automatically recognizing the language used in the recording.
A voice operation refers to either creating or editing a voice clone or a voice design/remix. Each subscription includes a specific number of voice operations, and every operation counts toward your monthly quota. You can find the exact number included in your subscription details.
A voice clone is a virtual reproduction of a real voice, created using AI technology. It mimics the tone, style, and character of the original speaker so naturally that it sounds like the real person is talking.
Each subscription includes a curated library of high-quality voices. In addition, every plan comes with a quota for custom voice clones that users can create themselves.*
Voice design is a feature that allows users to create custom voices from scratch by describing them through text prompts. You can specify attributes such as age, gender, accent, tone and emotion. This enables you to generate entirely new, realistic voices tailored precisely to your needs.
Voice Remixing allows you to modify a cloned or custom-created voice by entering a text prompt. You can specify attributes such as age, gender, accent, tone, and emotion, making it easy to adjust a voice to fit your exact needs.
Voice Remixing is available for voices in the VoiceWunder library. It is not supported for shared voices.
With the Voice Sharing plan, a voice talent can make up to 20 different clones of their own voice available to other VoiceWunder users with Studio plans – for example, to capture different emphasis styles or tonal variations. This may help optimize workflows, especially with producers a voice talent works with regularly.
Granting access is simple and fully managed by the voice talent. To enable access, the talent requests the studio’s VoiceWunder ID and enters it in their user area, where access can also be revoked at any time with equal ease.
While standard VoiceWunder speech synthesis does not create any logs, voice sharing provides a transparent usage record that is visible only to the voice talent. This record shows exactly which user generated which text with the shared voice, and when – almost as if the talent had been present in the recording studio.
The voice talent is solely responsible for granting the user the necessary usage rights and for obtaining the user’s prior authorization for activity logging before granting access. All agreements regarding such rights and authorizations are concluded directly between the voice talent and the respective user. The sharing of voice clones is strictly limited to a talent’s own voice and is available exclusively with the Voice Sharing subscription. Shared voices can be used only by users with a Studio subscription.
VoiceWunder offers a range of default voices in its Voice Library, tailored to each subscription plan.
Basic Plan: Includes a curated selection of professional-quality default voices.
Studio Plan: Includes all Basic voices plus an additional extended voice library featuring exceptionally high-quality voices.
For both the Basic and Studio plans, commercial use of the default voices is permitted and royalty-free – even after the subscription has ended.
For best results, use the highest possible audio quality – free from background noise. We recommend a loudness level of approximately -23 dB to -18 dB RMS, with a True Peak of -6 dB to prevent distortion. The recording should be spoken in a consistent tone and volume throughout. The more consistent the input, the more natural and coherent the voice clone will sound. A recording duration of 1–3 minutes is usually sufficient.*
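The level recommendations above can be checked before uploading a recording. The following is a minimal sketch, not part of VoiceWunder: it assumes the samples are already decoded to floats between -1.0 and 1.0, that the dB values are meant relative to digital full scale, and that the highest sample value is an acceptable stand-in for True Peak (a real True Peak meter oversamples the signal first). The function names are hypothetical.

```python
import math

RMS_RANGE_DB = (-23.0, -18.0)  # recommended RMS window from the text above
PEAK_LIMIT_DB = -6.0           # recommended peak ceiling from the text above


def to_db(value):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20.0 * math.log10(value) if value > 0 else float("-inf")


def rms_db(samples):
    """RMS level of a non-empty list of samples, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def sample_peak_db(samples):
    """Highest absolute sample value in dB (only an approximation of True Peak)."""
    return to_db(max(abs(s) for s in samples))


def check_levels(samples):
    """Return (rms, peak, ok) for float samples in the range -1.0 to 1.0."""
    if not samples:
        raise ValueError("samples must not be empty")
    rms = rms_db(samples)
    peak = sample_peak_db(samples)
    ok = RMS_RANGE_DB[0] <= rms <= RMS_RANGE_DB[1] and peak <= PEAK_LIMIT_DB
    return rms, peak, ok
```

If `ok` is false, the returned RMS and peak values show whether the recording is too quiet, too loud, or peaking too high.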
Each subscription includes a specific number of coins, which are used for different forms of speech synthesis:
Unused coins expire at the end of the month, and the coin balance is refilled at the beginning of each month according to the subscription plan.
Due to legal requirements, it is essential to obtain the explicit consent of the respective speaker before creating a voice clone. Use of the plug-in or platform is permitted only with material for which the user holds the appropriate rights or authorization.
All AI speech synthesis from VoiceWunder must be labeled. We will comply with the EU AI rules once they are finalized, expected by late 2025 – current EU requirements are not yet defined. Please also follow platform-specific rules (e.g., YouTube, TikTok).
If the voice output doesn’t meet your expectations, try the following adjustments:
Use advanced mode and emotional tags. Emotional tags are inline cues placed in square brackets (e.g., [sigh], [excited]) within the text you want to synthesize. They guide how the voice clone speaks by adding emotional, non-verbal, or stylistic elements. To enable advanced mode, open the settings page and select “Advanced Mode” under “Speech Generation.” Once enabled, you can also choose emotional tags from a pop-up list by clicking the icon in the lower-right corner of the text box. You can also try specifying a language here if the voice has a particular accent or coloration.
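To illustrate how emotional tags sit inside a script, here is a small sketch that separates the tags from the spoken text, which can help when proofreading a script before synthesis. It is not VoiceWunder's own parser: the pattern assumes that a tag is any short text in square brackets, and the function name is hypothetical.

```python
import re

# Assumption: a tag is any text enclosed in square brackets, e.g. [sigh].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_tags(text):
    """Return (tags, spoken_text) for a script containing inline tags."""
    tags = TAG_PATTERN.findall(text)
    spoken = TAG_PATTERN.sub("", text)
    # Collapse the extra whitespace left behind where tags were removed.
    spoken = re.sub(r"\s+", " ", spoken).strip()
    return tags, spoken


script = "[excited] We finally finished the mix! [sigh] It took all night."
tags, spoken = split_tags(script)
print(tags)    # ['excited', 'sigh']
print(spoken)  # We finally finished the mix! It took all night.
```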
To achieve the best results with Speech-to-Speech (STS), consider the following guidelines:
If the voice you designed doesn’t sound quite right, try these tips to fine-tune the result:
If the voice you remixed doesn’t sound quite right, try these tips to fine-tune the result:
Yes. With a paid plan, you can install the plug-in on multiple workstations (Basic: up to 3, Studio: up to 5). However, the plug-in can only be active on one workstation at a time. To switch devices, simply close the plug-in window on your current workstation before opening it on another.
All voice output generated using voice design/remix or the provided voice clones can be used without restrictions – including for commercial purposes – if created under a Basic or Studio subscription. This usage remains valid even after the subscription ends.
For custom voice clones you create, commercial use depends on your individual agreement with the respective speaker.*
Please note: Commercial use of any material generated under a Free subscription is strictly prohibited.
Any use that violates ethical standards or applicable laws is strictly prohibited. This includes, but is not limited to:
The plug-in is compatible with any internet-enabled Mac or Windows PC running Avid Pro Tools (version 2025.6), Steinberg Nuendo/Cubase (version 14), Apple Logic Pro (version 11.2 with Rosetta enabled), PreSonus Studio One Pro (version 7) or Cockos Reaper (version 7.43). A high-speed internet connection is strongly recommended, as audio transmission may involve large data volumes.
You can easily cancel your subscription at https://voicewunder.ai in the user area under “Subscription.”
After the subscription period ends, your account and all voice clones and voice designs you created will be deleted.
Your personal data is used exclusively for generating the requested speech synthesis and is never utilized for model training purposes. All data is transmitted securely using SSL encryption.
We offer European data residency and adhere fully to GDPR and CCPA regulations. For your security, our team will never request your password.
Currently up to 74 languages are supported: Afrikaans, Arabic, Armenian, Assamese, Azerbaijani, Belarusian, Bengali, Bosnian, Bulgarian, Catalan, Cebuano, Chichewa, Croatian, Czech, Danish, Dutch, English, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Kirghiz, Korean, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malay, Malayalam, Mandarin Chinese, Marathi, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Sindhi, Slovak, Slovenian, Somali, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh.
All prices are net prices; additional VAT (19%) may apply.
The products are only available to commercial customers.
Powered by VoiceWunder® & ElevenLabs.