shimmerloid-ai · 9 months ago

Introduction - Vocal Synth Terminology - Part 1

This post will be split into multiple parts due to Tumblr's character limit.

If you are new to the Vocal Synth community, you may encounter some words and phrases you don’t understand. For instance, someone may tell you about Rin and Len’s appends, and you may confuse that term with the difficulty level in Project Sekai! Colorful Stage! Or you may have heard someone discussing USTs, but cannot find a definition anywhere or figure out what the hell they are talking about.

Well, I made a dictionary of sorts to help newbie fans get used to Vocal Synth jargon. The keyword is “Vocal Synth” as these apply to other software as well. These definitions have a greater focus on the programs themselves than the characters.

Credits to Vocaloid Wiki and Minnemi on YouTube for some of these definitions.

Vocal Synthesizer: A digital instrument that creates tracks like any other DAW, but instead of piano notes, guitar strums, or drum beats, you compose vocals! Also known as “vocal synths”. Examples of vocal synthesizers include VOCALOID, UTAU, SynthesizerV, CeVIO, and Piapro Studio.

Voicebank: A collection of recordings of the sounds that make up a language. These sounds are typically vowels and consonants, but depending on the voicebank, you may also get breath notes and pronunciation effects. Or, in simpler terms, the singers that are used in vocal synths! There are a ton of voicebanks in the vocal synth community, with some of the popular ones being Hatsune Miku (VOCALOID + Piapro Studio), Kagamine Rin and Len (VOCALOID + Piapro Studio), Megurine Luka (VOCALOID + Piapro Studio), Kasane Teto (UTAU + SynthesizerV), Megpoid Gumi (VOCALOID + SynthesizerV + A.I. VOICE, FineSpeech Ver3), flower (VOCALOID + Gynoid Talk + CeVIO), IA (VOCALOID + CeVIO), and KAFU (CeVIO + SynthesizerV)! Individual vocal synth characters can also have different versions of their voice, such as Yuzuki Yukari’s Onn (soft) and Lin (power) voicebanks!

Voice Provider: The person whose voice a voicebank is created from. Voice providers record samples of their voice (specifically vowels and consonants) at a certain key (for instance A3), which are turned into a voicebank with the company’s black magic (I’m kidding, I don’t know how they process and put the vocals together). For instance, PIKO is Utatane Piko’s voice provider, Satoshi Fukase is Fukase’s voice provider, and Naoto Fuga (shown below) is KAITO’s voice provider!

Crypton Future Media: The brains behind some of the most popular VOCALOIDs, which are Hatsune Miku, Kagamine Rin, Kagamine Len, Megurine Luka, KAITO, and MEIKO. Aside from voicebanks, they have created games, concerts, merchandise, and much more relating to these beloved VOCALOIDs! Cryptonloids are… VOCALOIDs created by Crypton. Later, Crypton departed from Yamaha and made its own vocal synthesizer, named Piapro Studio, in affiliation with another company called Piapro. There are two versions of this software: Piapro Studio NT and Piapro Studio V4x.

UTAU: A vocal synthesizer that is considered the “sister” software to VOCALOID. Unlike VOCALOID, this software is 100% free and you can create your own voicebank. There are thousands of UTAUloids at this point in time, giving you a huge selection of different ranges and strengths. Popular UTAUloids include Utatane “Defoko” Uta, Kasane Teto, Namine Ritsu, Momo Momone, Yowane Ruko, Sukone Tei, Rook, Gahata Meiji (shown below), Yamane Renri, Matsudappoiyo, Keine Ron, Kohaku Merry, Gekiyaku, Kazehiki, Adachi Rei, Ooka Mika, and so many others! There is also an open-source version of UTAU called OpenUtau, which is much easier to install and use (it has a dark mode!). Vipperloids are the classic UTAUloids known for their surnames ending in “-ne” and their VOCALOIDish designs. These include Utatane “Defoko” Uta, Kasane Teto, Namine Ritsu, Momo Momone, Yowane Ruko, Sukone Tei, and many others.

SynthesizerV Studio: Also known as SynthV, this is a vocal synthesizer made by Dreamtonics that is well-known for its AI voicebanks. For a software that is smaller than VOCALOID, it is extremely advanced, with realistic-sounding voicebanks, piano-roll tuning, rap vocals, and so many other features. It’s also much cheaper (thank you, Yamaha money sharks). In addition, Dreamtonics has two free versions: SynthesizerV Studio R1 and SynthesizerV Studio Basic R2. Popular SynthV voicebanks include Eleanor Forte, Koharu Rikka, GENBU, Tsurumaki Maki, SAKI, SOLARIA, KEVIN (fan design by ivylare shown below), Stardust, ROSE, POPPY, and Kasane Teto AI!

CeVIO Project: A collection of voice synthesizers created in collaboration between five different companies, including Techno Speech and Frontier Works. Not only do they make vocal synthesizers, but their software has speech interfaces as well. As of now, their most popular program is CeVIO AI, a next-generation vocal synthesizer that uses AI technology to create powerful vocals like those in SynthesizerV. Popular voicebanks include Chis-A (shown below), KAFU, Sato Sasara, IA AI, ONE, Yuzuki Yukari Rei, CiFlower, POPPY, ROSE, and many others.

Tuning: Essentially, how you shape the way a song or cover sounds. By editing the parameters of the individual notes and of the voicebank itself (including the pitch, volume, strength, sharpness, and breaths), you can get an entirely different result in how the singer sings the same notes. This blog is dedicated to teaching people how to tune, so I’ll show a variety of tuning styles in the software.

V_: The VOCALOID software edition. As of now, there are six editions of the software, which are VOCALOID, VOCALOID2, VOCALOID3, VOCALOID4, VOCALOID5, and VOCALOID6. A lot of VOCALOID voicebanks are named after the edition they were designed for, such as Gackpoid V4.

VSQ/VSQx/VPR/UST/SVP: The different vocal file formats through which the note, lyric, and tuning data are saved in different vocal synthesizers. These files are not exactly specific to a single editor as they can be converted to the appropriate formats:

VSQ: VOCALOID2

VSQx: VOCALOID3 and VOCALOID4

VPR: VOCALOID5 and VOCALOID6

UST: UTAU and OpenUtau

SVP: SynthesizerV Studio
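If you ever end up with a folder full of project files and can't remember which program opens what, the list above can be turned into a tiny lookup. This is only a rough sketch: the `FORMAT_TO_SYNTH` table and the `synth_for` function are names I made up for illustration, and it only tells you the software family, not the exact edition.

```python
from pathlib import Path

# Hypothetical lookup table built from the format list above.
# It maps a file extension to the software family that uses it.
FORMAT_TO_SYNTH = {
    ".vsq": "VOCALOID",
    ".vsqx": "VOCALOID",
    ".vpr": "VOCALOID",
    ".ust": "UTAU",
    ".svp": "SynthesizerV Studio",
}


def synth_for(filename):
    """Return the vocal synth family for a project file, or "unknown"."""
    suffix = Path(filename).suffix.lower()  # ignore upper/lower case in the extension
    return FORMAT_TO_SYNTH.get(suffix, "unknown")


if __name__ == "__main__":
    for name in ["lost_ones_weeping.vsqx", "cover.UST", "notes.txt"]:
        print(name, "->", synth_for(name))
```

Anything that isn't in the table simply comes back as "unknown", so converted or renamed files won't crash the lookup.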

Phonemes: In linguistics and developmental psychology, phonemes are the smallest units of sound that distinguish one word from another. Similarly, in vocal synths, they are the building blocks of the individual lyrics that are read by the voicebank. Phonemes differ from the lyrics in a vocal synth file: the lyrics are the actual syllables of the language, while the phonemes are based on the X-SAMPA system. For instance, let’s examine and compare lyrics from “The Lost One’s Weeping” by neru to the phonemes that would be written in a vocal synth.

Romaji lyrics (Source - Vocaloid Lyric Wiki): kokuban no kono kanji ga yomemasu ka?

Romaji lyrics in VOCALOID4: [ko] [ku] [ba] [n] [no] [ko] [no] [ka] [n] [ji] [ga] [yo] [me] [ma] [su] [ka]

Phonemes in VOCALOID4: [k o] [k M] [b a] [n] [n o] [k o] [n o] [k a] [n] [dZ i] [g a] [j o] [m e] [m a] [s M] [k a]

As we can see here, the phonemes of a song can differ significantly from the lyrics that are entered into a program. You can also edit the phonemes of a lyric for better pronunciation (for instance, for the word “you’d”, you can try [y M d]), or split them up into vowels and consonants in note bending. In addition, there are entirely different phonemes for voicebanks designed for different languages; for instance, VOCALOID has Japanese, English, Chinese, Korean, and Spanish voicebanks. However, it is possible to make voicebanks sing in different languages, like how Utsu-P makes Miku V4 English sing in fluent Japanese. There are also phonemes for breaths and glottal stops, as well as pronunciation effects that are exclusive to some voicebanks, like Enhanced Voice Expression Control (E.V.E.C.) in the V4x Cryptonloids. I will go into greater depth on phonemes in a future post.
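To make the lyric-versus-phoneme idea a bit more concrete, here is a small sketch of the conversion as a plain lookup. The table only holds the pairs from the example line above, written with a space between each phoneme, and the names `LYRIC_TO_PHONEMES`, `to_phonemes`, and `format_phonemes` are my own inventions. Real editors use a much bigger dictionary and I don't know how they store it internally.

```python
# Lyric-to-phoneme pairs taken from the example line above.
# This is an illustration only, not the editor's real dictionary.
LYRIC_TO_PHONEMES = {
    "ko": "k o", "ku": "k M", "ba": "b a", "n": "n",
    "no": "n o", "ka": "k a", "ji": "dZ i", "ga": "g a",
    "yo": "j o", "me": "m e", "ma": "m a", "su": "s M",
}


def to_phonemes(lyrics):
    """Look up the phoneme string for each lyric syllable."""
    return [LYRIC_TO_PHONEMES[lyric] for lyric in lyrics]


def format_phonemes(phonemes):
    """Wrap each phoneme string in brackets, the way they are written in this post."""
    return " ".join(f"[{p}]" for p in phonemes)


# "kokuban no kono kanji ga yomemasu ka?" split into one syllable per note.
LINE = ["ko", "ku", "ba", "n", "no", "ko", "no", "ka",
        "n", "ji", "ga", "yo", "me", "ma", "su", "ka"]

if __name__ == "__main__":
    print(format_phonemes(to_phonemes(LINE)))
```

A syllable that is missing from the table raises a `KeyError`, which is a handy reminder that the phoneme set depends on the voicebank's language.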

Pitch bending: The effect where one note slides to another in a clean fashion without sounding flat. Usually, when people mention pitch bending in a vocal synth, they are referring to the tuning style where you alter the pitch using the “pitch bend” and “pitch bend sensitivity” parameters. If you have seen tuning streams or covers where people show their editors, you may have noticed dynamic and sometimes dramatic lines either on top of the notes or in a box beneath the piano roll. These are pitch bends! By drawing pitch curves in different ways, you can change how the notes are sung. You can then increase or decrease the pitch bend sensitivity of certain notes to change how many semitones the pitch will jump or fall by when the pitch bend parameter is brought to its maximum or minimum value. To paint a better picture of this concept, I made a quick VSQx of the word "watashi" ([w a] [t a] [S i]). The curves cutting through the green box are my pitch bends, and the thin red line running through the notes is the result. The transparent box behind it is my pitch bend sensitivity, which I increased for the [w a] and [t a] notes to make them more sensitive, and decreased for the [S i] phoneme to make it less sensitive.
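For anyone who likes seeing the math, here is a rough sketch of how the two parameters combine. I'm assuming the pitch bend value runs from -8192 to 8191 and that the sensitivity is measured in semitones, which is how I understand VOCALOID's parameters to work; the function names are made up, so double-check the ranges in your own editor.

```python
# Assumed range of the pitch bend parameter (an assumption, not taken from the post).
PIT_MIN = -8192
PIT_MAX = 8191


def bend_semitones(pit, sensitivity):
    """Semitone offset for a pitch bend value at a given sensitivity.

    A bend at the minimum value lowers the note by the full sensitivity;
    a bend of 0 leaves the note untouched.
    """
    if not PIT_MIN <= pit <= PIT_MAX:
        raise ValueError("pitch bend value out of range")
    return pit / 8192 * sensitivity


def bent_frequency(base_hz, pit, sensitivity):
    """Frequency after bending, using 12 equal-tempered semitones per octave."""
    return base_hz * 2 ** (bend_semitones(pit, sensitivity) / 12)


if __name__ == "__main__":
    # Same curve height, different sensitivity: the higher the sensitivity,
    # the further the same drawn curve moves the pitch.
    for sensitivity in (2, 12):
        print(sensitivity, bend_semitones(4096, sensitivity))
```

This is why the same drawn curve can sound subtle on one note and dramatic on another: the curve sets the shape, and the sensitivity sets how far that shape is allowed to travel.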

Note bending: A tuning style where you manipulate the pitch by splitting notes into smaller notes. You can move the notes up and down or edit the phonemes to obtain different effects in notes. If you would like to break down the phrase [w a] [t a] [S i], you can write the notes out as [w a] [a] [a] [a] [a] [t a] [a] [S i] [i] [i]! This is my preferred method of tuning as I do not enjoy drawing lines and like the nostalgic effect of the clean, slightly robotic sounds.
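Here is a small sketch of what splitting a note looks like as data. The `Note` class and `split_note` function are hypothetical and don't match any editor's real file format; the idea is just that the first piece keeps the full phoneme while the following pieces repeat only the vowel, each at its own pitch.

```python
from dataclasses import dataclass


@dataclass
class Note:
    start: int    # position in ticks
    length: int   # duration in ticks
    pitch: int    # MIDI note number
    phoneme: str  # phonemes separated by spaces, e.g. "w a"


def split_note(note, pitch_offsets):
    """Split one note into len(pitch_offsets) shorter notes.

    Each piece is shifted by its semitone offset. The first piece keeps the
    whole phoneme; later pieces sing only the final vowel.
    """
    if not pitch_offsets:
        raise ValueError("need at least one pitch offset")
    vowel = note.phoneme.split()[-1]
    base, extra = divmod(note.length, len(pitch_offsets))
    pieces = []
    start = note.start
    for i, offset in enumerate(pitch_offsets):
        length = base + (1 if i < extra else 0)  # spread leftover ticks over the first pieces
        phoneme = note.phoneme if i == 0 else vowel
        pieces.append(Note(start, length, note.pitch + offset, phoneme))
        start += length
    return pieces


if __name__ == "__main__":
    wa = Note(start=0, length=480, pitch=60, phoneme="w a")
    for piece in split_note(wa, [0, 1, 2, 1, 0]):
        print(piece)
```

Running it on the [w a] note with five offsets gives the [w a] [a] [a] [a] [a] pattern from the example, with the pitch stepping up and back down instead of gliding.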

Portamento Timing: This term can have multiple definitions, but the general meaning is a slide from one note to the next. Do not confuse this with pitch bending, as the way that notes transition in portamento is different. In VOCALOID, portamento timing is a parameter that allows you to alter the timing of the pitch change. Increasing the value delays the pitch change, and decreasing it causes the new pitch to be sung earlier. In UTAU and SynthesizerV, portamento refers to the editable points in a pitch curve. Adding more points gives you more freedom in creating pitch bends.

Pitchsnap Mode: A setting in vocal synthesizers that causes the pitch curves to “snap” from one note to another. This setting yields a more autotuney and robotic tone in tuning. While I prefer to tune with this feature shut off, I have heard that the pitchsnap function makes pitch-bending much easier. Remember our "The Lost One's Weeping" example? Here is an amazing cover of it by our lord and saviour Jade S. with Fukase and Miku V3 Solid that showcases how beautiful the pitchsnap function can make the vocals sound when used correctly!

Mixing: The process of blending vocals with an off-vocal or instrumental so the singing fits in with the rest of the music. It's more than just plugging in an audio track; you need to ensure that the vocals are cleaned up, are at an appropriate volume, and do not sound out of place. People can get super creative with mixing by adding reverb, radio-like effects, growls, and “adlibs” during instrumental breaks! All in all, the mixing of vocals is just as important as the tuning.

Producer: Anyone who makes music using vocal synths. This title was initially reserved for people who make original songs but can be used to describe cover artists like myself as well. Popular producers include ryo (supercell), kz (livetune), wowaka (shown below; Rest in Peace), neru, DECO*27, and many others!

“-P”: Standing for “producer”, this suffix originated from the IDOLM@STER fandom and refers to anyone who makes music with vocal synths, or in other words, vocal synth producers! For instance, why do we call Circus-P by his name with the "-P" suffix? Because that is what he is, a producer! You can also use the title “vocalo-p” to address synth users.

#vocaloid4 #vocal synth #vocaloid #vocaloid tutorial #vocabulary #vocaloid jargon #long post #resource #dictionary #utau #utauloid #synthesizer v #synthv #cevio
