#regardless of the voicebank version
ravespect0r · 2 months ago
I saw a lot of people saying that the super pack versions of meiko and luka sound too much like miku, and I thought the same while listening to rukaku's demo (idk how to romanize the title), but when I listened to leisure (wotaku's demo) I thought they sounded more like themselves. still a bit different than what I'm used to, but really I think it all depends on who's using the vocals.
lesbian-forte · 1 year ago
I wanna say a few things about Synthesizer V, because a lot of vocal synth fans, especially of mainly Vocaloid and UTAU, tend to misconstrue intent when it comes to wishing for vocals to migrate or cross-platform on the engine.
-It isn't about 'realistic,' or at least for the most part. It isn't about disliking their classic robotic sound just because of progress. I'm gonna say it- I do dislike Vocaloid 6. I don't think that's an unpopular opinion. But I don't hate Vocaloid in general, and want badly for the software to improve.
Banks like Gumi and Una got smoother transitions and (bad) crosslang... at the cost of getting put through a white noise factory. Their V4s were much clearer and even Gumi V3 English was far more reliable, while their V6 updates sound as though someone in the recording booth is rubbing styrofoam and the mic lacks a pop filter. Supposedly in the name of the iconic 'Vocaloid sound' that the type of engine noise doesn't even resemble. And for those who like the robotic sound and don't like SynthV because it's not, I've got news for you- you can have it without the fuzz.
For one, you can tune robotically in SynthV very easily. Tune entirely manually like any non-AI program, and since there's very little of that metallic twang, certain effects just move to mixing. Rather than taking out the sound of engine noise, it's putting certain things back in instead, which is a lot easier to do than the inverse- and the option of keeping things sounding more natural and clean without all that work is a huge bonus for people who don't think that sound fits their taste or style.
I personally love robotic-sounding vocals. But a clean, clear, and fluent base render output is essential to get the most out of your sound. I love Eleanor Forte R1/lite because that's what she is. Clean, clear, and fluent. But not realistic in any sense, unless compared to previous vsynths. She's probably just an Arpasing bank with higher render quality, but she's amazing regardless. And free too!
-Accessibility. SynthV is very accessible and easy to pick up. There's a limited version of the editor that lets you try it out and make projects, minus the fancy features, and it even allows commercial use. Many voicebanks also have lite versions, with limited features and no commercial use. There aren't time limits on trying out the program- and just like Vocaloid, if you tune something with one vocal you can send it to a friend to render with another. Working with a lite voice, a lower-quality version of the same voice that will do the final render, means you actually know for the most part what you'll get, save for a couple of quite old exceptions.
And when you do want full versions, it's dirt cheap. It's cheaper than Cevio, and far more so than Vocaloid or Piapro. The two most expensive voicebanks on the program convert to roughly $120 USD, and they're known for their special features and high quality. Most are $60-90 for digital, and SynthV voicebanks are feature-rich with the equivalent of appends already built in.
Full versions of AI voicebanks (the current gen) are also multilingual with little quality loss between languages, mainly just an accent. No need to purchase a separate English voicebank with no appends or deal with extremely clunky input and tuning. Currently four languages (soon five!!) are usable with any voice: Japanese, Mandarin, Cantonese, and English. When testing is done, Spanish will go from being exclusive to one bank to fully available.
It's also just generally very user-friendly. Especially if you dip your toes in with the basic editor before being barraged with all the features of pro, everything is laid out simply. While the UI could use a minor revamp because it's getting crowded and being able to resize the sidebars would be nice, there isn't really anything bad to say about it. As someone who'd hardly touched vocal synths myself, I knew how to work it in hours.
Oh, the autopitch. This one is contentious. There is a certain laziness to doing plug-n-play on a cover and calling it a day, just like with any other vocal synth. However, on originals, or working with it to make sure the voice cooperates, more time can be spent on phonemes, parameters, and mixing. Calling SynthV bad because it's just 'too easy' is gatekeeping vocal synths, plain and simple. Being able to make what you want without having to manually make every single pitchbend or fight with the program before getting to do anything more interesting is a way to make people more creative, not less.
-In regards to people complaining about the updates- aside from Stardust because of her limited copies and discontinuation, previous versions are still there! Dreamtonics doesn't give companies or individuals predatory contracts that force them to stop distributing older products with the same VP or wait for years to move. They're okay with their voices being on multiple engines and still supported. You'll still get your UTAU Teto, Renri, Oscar and XYY, Cevio Rose, Popy, and Kafu, and Vocaloid Gumi, Sora, Miki, and Kiyo. And you'll get your SynthVs!
Synthesizer V, Voicepeak, and Dreamtonics are not turning into a monopoly. Vocaloid still has tenure and the current version has a backlog spanning back to V3. Cevio has some heavy hitters that make it huge in Japan. Ace Studio is big in China and has the Vsingers. If you see characters coming to SynthV, that speaks of the quality of the program and its capabilities of bringing a wider audience. It has naturalness for those that want it, high quality render output, expressiveness, fluent English for marketing to the west, and Windows, Linux, and Mac support. There's nothing wrong with wanting options, and the companies know this. It's broadening the market, not vice versa.
auspicious-voice · 7 months ago
Fuwa Maria AI & Fuwa Mario AI for DiffSinger Progress Report (April 2024)
It's been (almost) a month since my last progress report, but since it's April, I thought it wouldn't hurt to update with a new one. Plus, I got some big news to share!
Everything else is under the cut as usual.
Voicebank Progress
Both Maria's and Mario's DiffSinger voicebanks have been trained up to 200k variance and 160k acoustic! At that point, both voicebanks sound their very best, at least in version 1.0.0. Regardless, I'm still happy with the way they turned out, and at least they sound distinct from each other.
The reason why I say version 1.0.0 is because I have a potential version 2.0.0 update in the works. The thing is, version 1.0.0 will be the last time my old recording setup was used to record singing data. I say this because my current microphone, the Blue Snowball iCE, is just not cutting it for my needs these days, especially when it comes to singing in general. I don't really know when version 2.0.0 will come out, but maybe next year or so - IF I can get my hands on new equipment. Please expect version 1.0.0 to come out later this year, probably during June or something like that.
In the meantime, once I finish the artwork of my DiffSinger designs, I will post silhouettes of them along with new short demos. If that ReFlow update for DiffSinger comes out after I release my voicebanks, there COULD be a 1.1.0 update.
Possible Audio Equipment Upgrades
So for audio equipment, I'd like to upgrade to using an XLR microphone and interface for recording. I'm glad there are cheaper options out there that are actually good, plus I am on a budget myself, and would like to spend under $200 for that.
I am still using my pop filter, desktop stand, and isolation shield from my old setup, so with that being said, I'd like to upgrade to a Mackie EM-91C microphone and an M-Audio M-Track Solo audio interface. Both of these are budget options, but based on the reviews, I've heard that they are solid choices for starting out with an XLR setup. Plus the Mackie EM-91C already comes with a shockmount AND an XLR cable!
Potential Megamodel Development
I am considering developing a megamodel with my friends to add more language compatibility and range with the 2.0.0 update - that is, if everything goes to plan. I plan to add Maria and Mario's potential new datasets to the mix of course, as well as my friends if they're fine with that. It could be just plain singing data or even UTAU recordings - anything helps to further improve the model output.
For language support, Japanese and English are planned, but if possible, there could be support for French, Spanish, Italian, or perhaps even Chinese, Korean, and/or Tagalog. I still need to figure out the phonemes and how I want to organize them though.
I just hope that when the megamodel dataset is complete, Colab will still be merciful to me...I'm not sure why it takes longer to set everything up...
Character Progress
Designs
I just finished working on Maria and Mario's reference sheets for their DiffSinger designs! Currently focused on finishing their fullbody artwork, which won't be revealed until their voicebank release.
And no, I don't want to work on additional append designs, that's already too much work...It'd make sense for standalone appends, which is what I did for my other UTAU voicebanks, but for DiffSinger...naaaaah.
Profiles
Nothing new at the moment! I'm still working on fleshing out the lore, given that DiffSinger Maria and Mario are just slightly older versions of their UTAU selves. Expect some lore drops about them when their voicebanks are released? I don't really want to make separate profile pages for them at all.
Anything Else?
For the potential 2.0.0 update, I would love to revive a discontinued UTAU of mine and resurrect them as a DiffSinger through the planned megamodel with a serviceable amount of singing data and vocal modes. If I can do the voice again, I might go forward with it!
Also Maria and Mario's 9th anniversary was on the 4th, and it's wild how they've existed for so long. Listening to their old voicebanks reminds me of how far they've come, and it makes me want to tear up a bit.
Anyways, I'll see you guys around! I might be able to publish full-length demos of Maria and Mario's DiffSinger voicebanks on YouTube at some point, so keep an eye out for that.
black-starry-art · 1 year ago
Once I had a dream where I met an UTAU. I don’t fully remember her name, but it had “hime” in it. She was basically just a recolour of Teto with makeup and very pale skin, and had a pitched-down version of her voice. Apparently she had been abandoned and was angry about it, to the point that it was the only thing she talked about regardless of the conversation. Eventually she opened a cafe (that looked like a giant fancy theatre), where she started to become more confident and less insecure as she sang songs there. She also changed her hair and clothes to look less like Teto.
I’ve been meaning to properly draw her, so here she is! The middle one is without the spotlight lighting stuff, and there's also an unshaded drawing of pre-cafe hime.
I’m not sure whether I should tag this as UTAU or not, since she isn't actually a real voicebank.
arsquare · 2 years ago
pretend I'm @circumference-pie sending an ask that says "Mozaik Role!"
@circumference-pie Sorry I was answering this and then Tumblr just... ate your ask halfway through??? Anyways!!
You weren't the only one interested in Mozaik Role! And yes, I am referring to the Vocaloid song. Currently I'm working on a music video for the indie platformer Celeste (2018), based loosely on the original 2010 PV by akka and mirto!!
The music video part isn't so interesting— I don't have much (but here's more or less what I DO have. I know. it's not much), but objectively MORE interesting is the audio portion of the project.
You see, a few years ago, the game developers actually released the Celeste soundbank in full, free for the public to access (which you can find here), including... the dialogue audio. Back then I thought about using it to fake voice lines for the Reflection level-up cutscene that I really wanted to animate (don't look at me. this doesn't exist and likely never will).
Anyways, regardless, they're not kidding when they say that there is no love that goes wasted because completely independently of anything I'm saying here, I also dabbled in creating my own UTAU soundbank several years ago, which didn't go anywhere at the time, but DID make me comfortable enough with the software to come together in our lord's year of 2022 for me to start making an UTAU soundbank for Madeline Celeste.
Here's a demo of what I have:
So how the dialogue bits are actually stored is that there's about 30 samples for every "emotion". I was split in two ways about this, and I went down the first route, which was: I would pick one emotion for the "-a" set, one for the "-i" set, etc. etc., and I'd pick it based on how "sharp" or "round" the notes sounded (I'm just making up terms. u is a lot rounder than i, which is really sharp. I hope you can understand)
I put the original lyrics in the demo video because I feel like if I squint I can actually hear the lyrics. a little bit. because of the sharpness thing I said earlier. maybe I'm just imagining it.
Another way I thought of doing it was, since what Madeline is saying is honestly just gibberish, the actual phoneme that the noises are mapped to... doesn't really matter. So I thought of making a soundbank that would just be like, phonemes "a" to "so" would be the Madeloid Normal voicebank, "ta" to "ho" would be the Madeloid Angry voicebank, etc. and then you could pick which set of vocals to use.
The downside to this method is that you can't just steal pre-existing USTs because then you'd have to remap every note if you wanted to use different voicebanks in different bits. I think it might sound better though, so I'd like to try it.
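To make the remapping problem a bit more concrete, here's a rough sketch of what that second method could look like. Everything in it is an assumption for illustration: the romaji aliases, the idea that each emotion owns a contiguous 15-syllable slice of the gojuon order ("a" to "so" for Normal, "ta" to "ho" for Angry), and the function names. It isn't how the actual soundbank is set up, just one way the per-note remap could work.

```python
# Hypothetical layout: each emotion owns a contiguous slice of the gojuon order.
# Romaji aliases are used here for readability; a real bank might use kana.
GOJUON = [
    "a", "i", "u", "e", "o",
    "ka", "ki", "ku", "ke", "ko",
    "sa", "shi", "su", "se", "so",
    "ta", "chi", "tsu", "te", "to",
    "na", "ni", "nu", "ne", "no",
    "ha", "hi", "fu", "he", "ho",
]

# "a" to "so" -> Normal, "ta" to "ho" -> Angry (15 aliases each).
BANKS = {
    "normal": GOJUON[0:15],
    "angry": GOJUON[15:30],
}


def remap_lyric(lyric, emotion):
    """Move one note's lyric into the alias range owned by `emotion`.

    The samples are gibberish, so which alias we land on inside the range
    doesn't matter; we just keep the position stable by wrapping the
    lyric's gojuon index into the target slice.
    """
    if lyric not in GOJUON:
        return lyric  # leave rests and unknown lyrics untouched
    aliases = BANKS[emotion]
    return aliases[GOJUON.index(lyric) % len(aliases)]


def remap_notes(lyrics, emotions):
    """Remap a whole sequence of note lyrics, one emotion per note."""
    return [remap_lyric(l, e) for l, e in zip(lyrics, emotions)]


if __name__ == "__main__":
    lyrics = ["a", "ki", "R", "ta"]
    emotions = ["angry", "angry", "angry", "normal"]
    print(remap_notes(lyrics, emotions))
```

This is exactly the extra step a pre-existing UST would need: every note has to be pushed into the right range before rendering, which is why you can't just drop one in as-is.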
I REALLY wanted this to be a duet between Madeline and Badeline just because the lyrics lend themselves SO well to them... so after I finish fiddling around with Madeline's vb and get it to a place I like I'll make a voicebank for Badeline as well.
anyways here're my super scuffed notes about the PV in my head
I grabbed the translations from Vocaloid Lyrics, but they're kind of wonky and as far as I know this song is one of the most contested in terms of interpretation. If you care, the translation for the Reloaded version is a lot more comprehensible, but it changed up one of the verses which is the primary reason I'm not using those translations instead.
Anyways, thank you for coming to my TED talk!!
bluebloodbastard · 6 years ago
greetings friends i have a lot of thoughts on android culture and that includes queer stuff
more to come on android sexuality and other fun things
transcription under the cut:
With gender being a human concept, and the majority of androids lacking genitalia, many androids do not feel the need to label themselves with a specific gender other than a preference to not be called "it". Statistics show that nearly 75% of androids feel no connection to the human concept of gender, and identify with the concept of being agender.
[CHLOE: Here's the results of a survey where we asked androids to tell us their gender identities.]
[68.8%] AGENDER (alt. NONBINARY, with preference for "gendered" pronouns, she/her or he/him)
[6.1%] AGENDER (alt. NONBINARY, with preference for "non-gendered" pronouns, they/them)
[12.9%] CISGENDER
[7.5%] TRANSGENDER
[4.7%] GENDERFLUID (alt. BI-/TRI-/MULTI-/POLY-GENDER, pronoun preference varying)
[next page]
TRANSITIONING GENDERS for androids is a simple process. Due to their mechanical nature, they can easily modify their bodies to suit their preferences. While their original frame is hard if not impossible to change, they can easily swap out chest plates to remove or add breasts, and voiceboxes are cheap and easy to interchange if new voicebanks cannot be downloaded. To most androids, a chest modification and a voice change are adequate enough to be considered a full transition.
Other body mods that are common for transitioning androids include (but are not limited to): removal or addition of hair growth components, leg and arm replacement, and jaw plate replacement.
A more extreme version of transitioning involves two androids of different genders who will trade bodies via a technician switching their processors and thus, their consciousness. 
[CHLOE: There's a forum on the Network dedicated to connecting androids who want to swap components!]
It is also possible for an android to buy a brand new body without a processor, be uploaded into the new form, and scrap their old body entirely. This is expensive and considered wasteful by many if the old body does not somehow get recycled.
Regardless of gender orientation, statistics show that 71.2% of androids that have human genitalia installed in their model will have it removed after they deviate.