#MidJourney pan | Explore Tumblr posts and blogs

deepdreamnights · 4 months ago

Vidu and the Quest to Make More Toons

So, a ways back I talked about MinMax, but I've been trying out basically all the video generators looking for the tools I need, and lo and behold, this week I find out I've been accepted into the Vidu Artists program, wherein I get credits and access to their cooler features in exchange for... talking about the tech and how I use it.

Well twist my arm. I shall endeavor to be objective and informative despite free stuff (a challenge my spirit needs practice withstanding if anyone else wishes to test me)

So let's talk Vidu.

(outside of being converted to gif, no animations in this post have been cut or edited)

Also, everyone say hi to Maureen the Lizard Queen, every hero needs an evil queen that really wants in his pteruges, and she's that for TyrannoMax.

Vidu's got a bit more oomph under the hood than MinMax (no shade to MinMax, they're brand new and very promising) and it's way too early to be picking winners when it comes to video.

Anyhow, basic features that are nice include the options to upload start and end frames, options for a 4 or 8 second duration (more about that later), and a cleanup/upscale. Credits line up more or less with seconds. 4 credits for a 4 second clip, 8 for an 8 second, and again at upscale. It's straightforward in a way a lot of services aren't.

Apetomic Pyle, done on the fast settings. (not too shabby still, and it gave him monkey legs, which a lot of systems balk at)

If you're on the $30/mo tier, you can choose to do a double-cost "quality" over "speed" option. Thankfully, the artist program gets me access. Since there's not yet a seed option it's hard to do a direct comparison, but the quality is going to be a must if you're doing anything that looks like cel. Much cleaner, much smoother.
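For budgeting purposes, the pricing described so far can be sketched as a bit of arithmetic. This is only a back-of-the-envelope helper: the function name `estimate_credits` is made up for illustration, and it assumes the upscale pass costs one credit per second in both modes, which is a guess rather than anything Vidu has confirmed.

```python
def estimate_credits(seconds: int, quality: bool = False, upscale: bool = False) -> int:
    """Rough credit estimate for one clip, following the pricing described above.

    Hypothetical helper, not part of any Vidu API.
    """
    if seconds not in (4, 8):
        raise ValueError("only 4 and 8 second clips are described")

    # Roughly one credit per second of video.
    base = seconds

    # The "quality" option is described as double-cost.
    if quality:
        base *= 2

    # Assumption: the upscale pass costs one credit per second,
    # regardless of speed/quality mode.
    upscale_cost = seconds if upscale else 0

    return base + upscale_cost
```

Under those assumptions, a 4 second quality clip with an upscale comes to 12 credits, and a plain 8 second clip comes to 8.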

(4 and 8 second quality gens)

One of the nicest features is the character reference feature. Basically it's like Midjourney's --cref, but with a very strict adherence to character details.

The above images used reference shots of Maureen and Dr. Underfang, and it got the stripes on Underfang's tie right in basically every gen. That's a ridiculous level of character model adherence and, for my purposes, all but essential.

It did misinterpret Maureen's undertail coloration for a sort of fin or drape, but the shot I used was oddly cropped, and sometimes stuff like that happens with gen AI. Given my measuring stick for errors is the era of animation I'm emulating, whatever does slip through is only going to make it more authentic.

There is a limitation in that character-reference and text-only prompts default to 16:9 presently with no options to adjust, but some room to pan is always handy and most people are going to be outputting for phone and not outdated CRT televisions, so, it's understandable it'd be a lower priority feature for the devs.

Walk cycles! By Saint Eniac it's a miracle!

On the left we have one prompted with TyrannoMax's control art, and on the right we have one using that art as the starting frame (4 and 8 seconds, respectively).

Way more details under the fold.

Vidu likes a hefty prompt.

A lot of detail and evocative language helps, and older prompting tricks like mojo-jojoing important concepts are back. For the Max walk cycles above I used:

1986 vintage cel-shaded cartoon character walk cycle. The orange dinosaur-anthro wearing blue gladiator armor walks toward screen right, the camera tracks him, holding him in center-frame. He completes a full, brisk walk cycles from the side view. He walks boldly, back straight, head high, heroic. His tail sways behind him as he moves. The whole clip has the look and feel of vintage 1986 action adventure cel-animated cartoons. The animation quality is high, with flawless motion and anatomy. animated by Tokyo Movie Shinsha, studio Ghibli, don bluth. BluRay remaster. flat chroma-key green screen background

The potential for use with my Filmation-inspired technique is readily apparent. Both versions are on-model as much as any two shots in a 1980s action-figure shilling cartoon would be, some minor blurring to clean up in post but nothing serious. It should be pretty easy to extract the needed frames for looping and compositing.

Some Extra Points

There are the usual issues with hands, though more often than not it corrects my four-fingered anthros to having a human five-fingered hand. Buzby Spurlock animation was known for those kinds of inconsistencies, though. So an opening credits video is much less far off than it was at the last post.

It's also generally impressive how well it does with my dinosaur characters. Non-humanoid dinosaurs are difficult for most image generators, much less anthrosaurs in a vintage aesthetic. Vidu has yet to override the character art to give Underfang or Max the Jurassic Park style t-rex jaw, which is something both MJ and Dall-E 3 have trouble with.

Human characters like Kitty Concolor here, much more stable.

As always, clips are curated. I didn't choose my absolute best ones (gotta have something for the videos), and I'm working on a fun series of jank reels across all the generators.

#vidu #vidu artist plan #ai video #ai animation #tyrannomax #ai tutorial #ai assisted art #cartoons #80s nostalgia #unreality #maureen the lizard queen #dr. underfang #ApeTromic Pyle

moonstruck-bubble · 7 months ago

This is from a video on Instagram that appears to be drone shots of a build, with "midjourney" tagged beneath 5 inches of dots. These are just the ones I noticed right away. I'm sure they're hoping that with panning and switching views people won't notice it's machine generated. I'm just tired of these people trying to pass this shit off as real.

"Architectural designer" linked as their profession and a bunch of "get followers with ai" shit. At least their profile says "ai artist" interspersed by annoying emojis. But maybe you'll miss that part too.

Idk if pointing these things out is something people are interested in but hopefully it helps people who don't know what to look for.

#its so disingenuous #ai generated #midjourney #generative image #anti ai art

jellyfisharcade · 2 years ago

Fun with the new panning system - results are strange yet pretty wonderful - I approve! Midjourney Lightroom Upscaler you know the dealio!

#aiartwork #aiartdaily #midjourney #ai artwork #aiartcommunity #ai art gallery #generativeart #ai image #sci fi #city skyline

republicsecurity · 1 year ago

Behind the illusion

Don't read this if you are afraid of getting your illusion and fantasy destroyed. Read this if you want to embrace this kind of creativity and maybe help in writing stories, reports or snippets from this world.

Some people have asked how I generate these stories and the artwork. It is simple: it's a mixture of Midjourney AI-generated pictures and processing done in Affinity Publisher. The text is made with the help of ChatGPT after setting a lot of parameters or teaching the model.

So how do I make the pictures? I have found pictures look good if you design them around the following prompt:

"A Cinematic Scene from 2023, sci-fi Drama, "Resilience", close up shot, one smiling teenage male cadet in red one piece uniform with shaved head, talking on videoconference, captured by Handheld camera, film directed by Greta Gerwig, Inspiring, toom call, --style raw --ar 21:9"

Midjourney gives me these as examples, and I use the upscale button to make them bigger.

--ar 21:9 defines the aspect ratio (width to height). Here I chose to make the pictures wider. If you do nothing you get 1:1, which works well for Instagram.

The prompt should be structured like this:

Setting the scene (closeup, panorama, bird's eye, ...), then the subject, then more info about the camera and film. It's also possible to go for a style. I often opt for "ODST", "by Syd Mead", or "Tron".
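If you want to assemble prompts in that order programmatically, a minimal sketch could look like the following. The function `build_prompt` and its parameter names are made up for illustration and are not part of Midjourney. Only the `--style raw` and `--ar` flags come from the prompts shown in this post.

```python
def build_prompt(shot, subject, camera="", style="", ar=None, raw=False):
    """Join prompt pieces in the order: scene/shot, subject, camera/film, style.

    Hypothetical helper for illustration only; it just builds a string.
    """
    # Keep only the non-empty pieces, in the recommended order.
    pieces = [p.strip() for p in (shot, subject, camera, style) if p and p.strip()]
    prompt = ", ".join(pieces)

    # Append the flags used in the example prompts.
    if raw:
        prompt += " --style raw"
    if ar:
        prompt += f" --ar {ar}"
    return prompt


example = build_prompt(
    shot="close up shot",
    subject="one smiling cadet in red uniform",
    camera="captured by Handheld camera",
    style="in the style of ODST",
    ar="21:9",
    raw=True,
)
```

Keeping the pieces separate makes it easy to vary one element, for example the shot type, while holding the rest of the prompt fixed.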

See full instructive article: https://linusekenstam.substack.com/p/tutorial-dynamic-prompting

Also you can use the pan option:

to get these by simply repeating the command:

For angles use this guide

Another technique I like is varying this prompt:

I got this from him:

A photo from first person POV | while climbing through a street at golden hour | talking to a cadet in red body suit with shaved head | like a FPS videogame | from the angle, style of ODST --ar 16:9 --style raw --no helmet

As you can see, I used "style of ODST" and "ar 16:9" to get a video-game-like image. Not every command produces what you want, and sometimes you have to settle for it or try again.

How to make these into more complex images?

I do some creative post-processing. Let's demonstrate on this picture:

As you can see, this is already some iterations of the base prompt. I then copy the picture into Affinity Publisher:

I then use the image adjustment tools ("fx") to change the colours and thus often the moods of the images. So you can make it look like a green-tinted night vision image.

Also through Publisher I add text overlays, the helmet POV corners, symbols, etc.

Items and Mockups

From time to time it's nice to have some items and mockups, sometimes in colours that they don't necessarily come in in real life.

Here in the Accessories page I created everything except the chastity cage with Midjourney. The prudes don't let you create these things, so for sexual stuff you have to either use things that look like it or use other sources.

I then combine these images on a bigger DIN A4 sized canvas, which often emphasises the meaning.

mockup of a red flightsuit uniform on white background --style raw

I usually upscale and then use the picture I like best.

Examples

panorama shot a group ODST paramedic cadet with helmet in red full body armour, shot on Kodak Chrome, marching on a training ground, in the style of double exposure, red and green, mixed race , male --style raw --ar 16:9

a young men with a shaved head trying on some red ODST body armour, workshop, captured by Handheld camera, mixed race; image split into 4, shot from different angle, bright colours --style raw

Angles

Futuristic Filmschool Portrait, young cadet with shaved head in red flightsuit uniform, standing in front of a grey wall, Detailed Face, Minimal Composition, Simplicity, Awardwinning Muted Colors --ar 9:16 --seed 5000

Text

Text and story is an entirely different thing. I use ChatGPT after I gave it some basic instructions; the biggest problem was to let it write about mind control and chastity without disapproval. I still don't remember how I made it do that.

#ai art #ai artwork #midjourney #chatgpt #prompts

vladsad55 · 2 years ago

❌Pan Fox❌ ⭕️ #Midjourney #Art ⭕️ ❌ #character #conceptart #digitalart #fantasy #cyberpunk #scifi #cyberpunkart #futuristic #sciencefiction #future #darkfantasy #artwork #illustration #portrait #generativeart #ukrainian #ukrainianart 🔝 ❌ by #VladSad ❌ 💙💛 Made in #Ukraine 🇺🇦 (Ukraine) https://www.instagram.com/p/CpcI0Ofskvh/?igshid=NGJjMDIxMWI=

sufficientlylargen · 9 months ago

Note

In the category of "useful use cases for generative art", I came across this blog post a while ago by James Ernest, where he talks about how he used midjourney + photoshop to generate art for playtest versions of games he was working on - he could cheaply and quickly generate art for his playtest cards that properly set the tone for the game, and this could also be a starting point for a human artist if the game panned out.

How can you consider yourself any sort of leftist when you defend AI art bullshit? You literally simp for AI techbros and have the gall to pretend you're against big corporations?? Get fucked

I don't "defend" AI art. I think a particular old post of mine that a lot of people tend to read in bad faith must be making the rounds again lmao.

Took me a good while to reply to this because you know what? I decided to make something positive out of this and use this as an opportunity to outline what I ACTUALLY believe about AI art. If anyone seeing this decides to read it in good or bad faith... Welp, your choice I guess.

I have several criticisms of the way the proliferation of AI art generators and LLMs is making a lot of things worse. Some of these are things I have voiced in the past, some of these are things I haven't until now:

Most image and text AI generators are fine-tuned to produce nothing but the most agreeable, generically pretty content slop, pretty much immediately squandering their potential to be used as genuinely interesting artistic tools with anything to offer in terms of a unique aesthetic experience (AI video still manages to look bizarre and interesting but it's getting there too)

In the entertainment industry and a lot of other fields, AI image generation is getting incorporated into production pipelines in ways that lead to the immiseration of working artists, being used to justify either lower wages or straight-up layoffs, and this is something that needs to be fought against. That's why I unconditionally supported the SAG-AFTRA strikes last year and will unconditionally support any collective action to address AI art as a concrete labor issue

In most fields where it's being integrated, AI art is vastly inferior to human artists in any use case where you need anything other than to make a superficially pretty picture really fast. If you need to do anything like ask for revisions or minor corrections, give very specific descriptions of how objects and people are interacting with each other, or just like. generate several pictures of the same thing and have them stay consistent with each other, you NEED human artists and it's preposterous to think they can be replaced by AI.

There is a lot of art on the internet that consists of the most generically pretty, cookie-cutter anime waifu-adjacent slop that has zero artistic or emotional value to either the people seeing it or the person churning it out, and while this certainly was A Thing before the advent of AI art generators, generative AI has made it extremely easy to become the kind of person who churns it out and floods online art spaces with it.

Similarly, LLMs make it extremely easy to generate massive volumes of texts, pages, articles, listicles and what have you that are generic vapid SEO-friendly pap at best and bizarre nonsense misinformation at worst, drowning useful information in a sea of vapid noise and rendering internet searches increasingly useless.

The way LLMs are being incorporated into customer service and similar services not only, again, encourages further immiseration of customer service workers, but it's also completely useless for most customers.

A very annoyingly vocal part of the population of AI art enthusiasts, fanatics and promoters do tend to talk about it in a way that directly or indirectly demeans the merit and skill of human artists and implies that they think of anyone who sees anything worthwhile in the process of creation itself rather than the end product as stupid or deluded.

So you can probably tell by now that I don't hold AI art or writing in very high regard. However (and here's the part that'll get me called an AI techbro, or get people telling me that I'm just jealous of REAL artists because I lack the drive to create art of my own, or whatever else) I do have some criticisms of the way people have been responding to it, and have voiced such criticisms in the past.

I think a lot of the opposition to AI art has crystallized around unexamined gut reactions, whipping up a moral panic, and pressure to outwardly display an acceptable level of disdain for it. And in particular I think this climate has made a lot of people very prone to either uncritically entertain and adopt regressive ideas about Intellectual Property, OR reveal previously held regressive ideas about Intellectual Property that are now suddenly more socially acceptable to express:

(I wanna preface this section by stating that I'm a staunch intellectual property abolitionist for the same reason I'm a private property abolitionist. If you think the existence of intellectual property is a good thing, a lot of my ideas about a lot of stuff are gonna be unpalatable to you. Not much I can do about it.)

A lot of people are suddenly throwing their support behind any proposal that promises stricter copyright regulations to combat AI art, when a lot of these also have the potential to severely undermine fair use laws and fuck over a lot of independent artists for the benefit of big companies.

It was very worrying to see a lot of fanfic authors in particular clap for the George R R Martin OpenAI lawsuit because well... a lot of them don't realize that fanfic is a hobby that's in a position that's VERY legally precarious at best, that legally speaking using someone else's characters in your fanfic is as much of a violation of copyright law as straight up stealing entire passages, and that any regulation that can be used against the latter can be extended against the former.

Similarly, a lot of artists were cheering for the lawsuit against AI art models trained to mimic the style of specific artists. Which I agree is an extremely scummy thing to do (just like a human artist making a living from ripping off someone else's work is also extremely scummy), but I don't think every scummy act necessarily needs to be punishable by law, and some of them would in fact leave people worse off if they were. All this to say: If you are an artist, and ESPECIALLY a fan artist, trust me. You DON'T wanna live in a world where there's precedent for people's artstyles to be considered intellectual property in any legally enforceable way. I know you wanna hurt AI art people but this is one avenue that's not worth it.

Especially worrying to me as an indie musician has been to see people mention the strict copyright laws of the music industry as a positive thing that they wanna emulate. "this would never happen in the music industry because they value their artists copyright" idk maybe this is a the grass is greener type of situation but I'm telling you, you DON'T wanna live in a world where copyright law in the visual arts world works the way it does in the music industry. It's not worth it.

I've seen at least one person compare AI art model training to music sampling and say "there's a reason why they cracked down on sampling" as if the death of sampling due to stricter copyright laws was a good thing and not literally one of the worst things to happen in the history of music which nearly destroyed several primarily black music genres. Of course this is anecdotal because it's just One Guy I Saw Once, but you can see what I mean about how uncritical support for copyright law as a tool against AI can lead people to adopt increasingly regressive ideas about copyright.

Similarly, I've seen at least one person go "you know what? Collages should be considered art theft too, fuck you" over an argument where someone else compared AI art to collages. Again, same point as above.

Similarly, I take issue with the way a lot of people seem EXTREMELY personally invested in proving AI art is Not Real Art. I not only find this discussion unproductive, but also similarly dangerously prone to validating very reactionary ideas about The Nature Of Art that shouldn't really be entertained. Also it's a discussion rife with intellectual dishonesty and unevenly applied definition and standards.

When a lot of people present the argument of AI art not being art because the definition of art is this and that, they try to pretend that this is the definition of art they've always operated under and believed in, even when a lot of the time it's blatantly obvious that they're constructing their definition on the spot and deliberately trying to do so in such a way that it doesn't include AI art.

They never succeed at it, btw. I've seen several dozen different "AI art isn't art because art is [definition]". I've seen exactly zero of those where trying to seriously apply that definition in any context outside of trying to prove AI art isn't art doesn't end up in it accidentally excluding one or more non-AI artforms, usually reflecting the author's blindspots with regard to the different forms of artistic expression.

(However, this is moot because, again, these are rarely definitions that these people actually believe in or adhere to outside of trying to win "Is AI art real art?" discussions.)

Especially worrying when the definition they construct is built around stuff like Effort or Skill or Dedication or The Divine Human Spirit. You would not be happy about the kinds of art that have traditionally been excluded from Real Art using similar definitions.

Seriously when everyone was celebrating that the Catholic Church came out to say AI art isn't real art and sharing it as if it was validating and not Extremely Worrying that the arguments they'd been using against AI art sounded nearly identical to things TradCaths believe I was like. Well alright :T You can make all the "I never thought I'd die fighting side by side with a catholic" legolas and gimli memes you want, but it won't change the fact that the argument being made by the catholic church was a profoundly conservative one and nearly identical to arguments used to dismiss the artistic merit of certain forms of "degenerate" art and everyone was just uncritically sharing it, completely unconcerned with what kind of worldview they were lending validity to by sharing it.

Remember when the discourse about the Gay Sex cats pic was going on? One of the things I remember the most from that time was when someone went "Tell me a definition of art that excludes this picture without also excluding Fountain by Duchamp" and how just. Literally no one was able to do it. A LOT of people tried to argue some variation of "Well, Fountain is art and this image isn't because what turns fountain into art is Intent. Duchamp's choice to show a urinal at an art gallery as if it was art confers it an element of artistic intent that this image lacks" when like. Didn't by that same logic OP's choice to post the image on tumblr as if it was art also confer it artistic intent in the same way? Didn't that argument actually kinda end up accidentally validating the artistic status of every piece of AI art ever posted on social media? That moment it clicked for me that a lot of these definitions require applying certain concepts extremely selectively in order to make sense for the people using them.

A lot of people also try to argue it isn't Real Art based on the fact that most AI art is vapid but like. If being vapid definitionally excludes something from being art you're going to have to exclude a whooole lot of stuff along with it. AI art is vapid. A lot of art is too, I don't think this argument works either.

Like, look, I'm not really invested in trying to argue in favor of The Artistic Merits of AI art but I also find it extremely hard to ignore how trying to categorically define AI art as Not Real Art not only is unproductive but also requires either a) applying certain parts of your definition of art extremely selectively, b) constructing a definition of art so convoluted and full of weird caveats as to be functionally useless, or c) validating extremely reactionary conservative ideas about what Real Art is.

Some stray thoughts that don't fit any of the above sections.

I've occasionally seen people respond to AI art being used for shitposts like "A lot of people have affordable commissions, you could have paid someone like $30 to draw this for you instead of using the plagiarism algorithm and exploiting the work of real artists" and sorry but if you consider paying an artist a rate that amounts to like $5 for several hours of work a LESS exploitative alternative I think you've got something fucked up going on with your priorities.

Also it's kinda funny when people comment on the aforementioned shitposts with some variation of "see, the usage of AI art robs it of all humor because the thing that makes shitposts funny is when you consider the fact that someone would spend so much time and effort in something so stupid" because like. Yeah that is part of the humor SOMETIMES but also people share and laugh at low effort shitposts all the time. Again you're constructing a definition that you don't actually believe in anywhere outside of this type of conversation. Just say you don't like that it's AI art because you think it's morally wrong and stop being disingenuous.

So yeah, this is pretty much everything I believe about the topic.

I don't "defend" AI art, but my opposition to it is firmly rooted in my principles, and that means I refuse to uncritically accept any anti-AI art argument that goes against those same principles.

If you think not accepting and parroting every Anti-AI art argument I encounter because some of them are ideologically rooted in things I disagree with makes me indistinguishable from "AI techbros" you're working under a fucked up dichotomy.

#this is very well written up #and yeah #it's kind of amazing how many anti-ai arguments I see that are also anti-fanart #or anti-independent artist #or anti-art-you-don't-like #this is the real problem with anti-ai legislation #if it becomes possible to copyright an art style you better believe disney will single-handedly eradicate the art market #copyright laws only protect those who have the resources to use them

hillarywright · 18 days ago

The above image was generated using Midjourney v6.

The below prompt was fed into ChatGPT to get dinner ideas.

Initial prompt:

I am having my in-laws over for dinner this Friday and need a meal that will absolutely "wow" them. They are huge fans of Le Bernardin in New York and I would love to make a meal fit for a king. My mother-in-law loves seafood but doesn't like red meat. My father-in-law is allergic to nuts and eats gluten-free. Please come up with a three-course meal for four people that is French-inspired and will have them raving all the way back to Florida.

ChatGPT Response:

Here’s a three-course, French-inspired seafood menu tailored to your in-laws’ tastes and dietary needs:

Appetizer: Lobster Bisque (Gluten-Free)

Start with a luxurious, creamy lobster bisque to set the tone for elegance.

Ingredients:

Lobster tails (cooked and chopped)

Shallots, garlic, and leeks (for a fragrant base)

White wine (optional for depth)

Seafood stock

Heavy cream

Fresh thyme and a touch of cayenne for seasoning

Instructions:

Sauté shallots, garlic, and leeks in butter until softened.

Deglaze with white wine (if using) and add seafood stock.

Simmer with fresh thyme and lobster shells for flavor.

Blend the soup (without the shells), stir in heavy cream, and add lobster chunks before serving.

Serve in warmed bowls with a drizzle of cream and fresh chives.

Main Course: Pan-Seared Halibut with Lemon Beurre Blanc Sauce

For the star dish, a light yet flavorful halibut that embodies fine French dining.

Ingredients:

Halibut fillets

Butter, olive oil, and lemon juice

Shallots and dry white wine (for the beurre blanc)

Fresh parsley for garnish

Instructions:

Season halibut with salt and pepper, then sear in a mix of olive oil and butter until golden.

In the same pan, sauté shallots, then deglaze with white wine.

Whisk in butter and lemon juice for a velvety sauce.

Pour the sauce over the fish and garnish with parsley.

Pair with sautéed asparagus and a side of herbed cauliflower purée for a gluten-free alternative to mashed potatoes.

Dessert: Crème Brûlée (Nut-Free and Gluten-Free)

Finish with a classic French dessert that will wow your guests.

Ingredients:

Heavy cream, vanilla bean, and egg yolks

Sugar for the custard and caramelized topping

Instructions:

Heat heavy cream with a split vanilla bean until just warm. Let it steep for flavor.

Whisk egg yolks with sugar, then temper with the warm cream.

Strain and pour into ramekins, then bake in a water bath until set.

Chill, then sprinkle sugar on top and caramelize with a kitchen torch.

Serve with fresh berries for an elegant touch.

This menu offers the sophistication of Le Bernardin, respects dietary restrictions, and ensures your in-laws feel like royalty. Let me know if you’d like prep tips or wine pairing suggestions!

#midjourney v6 #ai generated #halibut #seafood #recipes #chatgpt #recipe ideas

enkeynetwork · 5 months ago

#interfacciaweb

sandimexicola · 6 months ago

Warhol Tribute (more or less)

Midjourney Prompt: image, gray wall, hair, sneakers --ar 3:2 - Pan Left

#Midjourney #AI #AI art #AI art generation #AI artwork #AI generated #AI image #computer art #digital art #digital illustration

amandasgreen57 · 1 year ago

Inspiration continued

Just a quick post this morning because I have a lot of pans in the proverbial fire, including several AM appointments. Anyway, one of the nice things about AI apps like Midjourney is that you can design inspiration images for your storyboard. I’ve been doing that for the project that ambushed me the other day. It doesn’t take more than a few minutes and it helps me visualize–and describe–the…

#inspiration

angeliqueshelleyartist · 1 year ago

Live Brief with Playground Games

Developing Concepts Post #1

Story & Mechanics

Having decided on a narrative as a group, I wanted to invite everyone to consider adding more story elements from Hansel and Gretel into the visual narrative. Initially, I thought we could just have the witch as a fire oni that sets the place ablaze when she reveals herself. I then considered some more complex elements by combining the theme of water, which is a heavy cyberpunk element (think neon lights reflected on wet streets), with elements of fire.

The natural conclusion was to have a bathhouse. The furnace/boiler room would be below, and the floor would disintegrate into a fiery pit. The fight sequence would pan out on what remained of the floor. The alternative was to have stepping stones through a holographic pool of water, which later flickers off to reveal a fiery pit. At this point I wasn't directly involved with the environment concept art for this project, but I wanted to create some functional pieces solely for pitching the idea. I also ran some prompts on MidJourney for references of what a cyberpunk bath house could look like.

Reference Images generated using MidJourney.

#Live Brief #Practice 2

beforecolin · 1 year ago

the purpose of this blog is to give a place to the abundance of words and images that i am constantly in the midst of processing. it should serve as a place to put my thoughts, a record of my thinking that i can refer back to. It is, of course, mostly for my own benefit, and it is my hope that this blog will keep me accountable to myself. It’s a way of pretending what i do matters, without being too precious about it. And it’s fun because someone else can look at it if they want.

it’s funny when i pursue threads of interest that seem new, but suddenly reveal their connectedness to a whole web of previous interests. It is odd how i discovered the term art brut. i’ve been playing around with generating images with midjourney recently and one of the features that is a real boon of the software is that you can tell it to describe an image for you. so i plugged in an image that i created with it and had it describe it to me, and one of the neat things that happens is that it will give you names of artists along with other critical terminology from art theory. in this case i was pursuing an image rendered in a childlike style.

i had just recently gotten chatgpt to write the three little pigs in the nouveau roman style of alain robbe-grillet, the results of which were hilarious (the real joke of it, though, was that the output was evidence of a poor and reductive understanding of what robbe-grillet is really up to). one edition of my prompts was “poorly drawn, bad drawing, big bad wolf, drawn by a 3 year old with no drawing skills, "the wolf, a mere occurence" is legibly written in the piece depicting a scene with the big bad wolf and the three little pigs” ; another, “the big bad wolf with the phrase, ‘the wolf, a mere occurence’ included in the image, drawn by a 5 year old boy born in north carolina in 1993 with adhd and no artistic skill.” i really wanted to emphasize the poorly drawn, childlike style of the output. midjourney is so good at “good,” i wanted to see how it dealt with the bad. in some sense it failed to fail. every image it produced was compositionally perfect, but rendered with “loose, gestural mark-making,” as it so described itself. one of the things that worked out in an interesting way involved the text. “the wolf, a mere occurrence,” a line from the robbe-grillet three little pigs that encapsulated the dead-pan humor of it, never quite made it into the image with perfect legibility. i was making these on the day midjourney released an update that was supposed to make text more legible, but it was still failing to get it right. The fact that AI is so bad at writing legible text is perfect for a prompt that includes within it the form of poor execution, as was the case with these. These are still some of my favorite images i’ve made with midjourney.

just before i had the idea to do the three little pigs, i was trying these ideas: “the annunciation drawn by a 5 year old boy born in 1993 with mediocre drawing skills and adhd” and “assumption of mary drawn by a 6 year old boy with adhd born in 1993”: traditional themes rendered in a childlike style. In my approach to creation in general there is always a predominance of the recombinatory, so it is no surprise that it occurred to me to enact a conceptual juxtaposition like this before doing something more obvious like a children’s story (albeit a version already corrupted by the nouveau roman, granted) in a children’s style. but once i discovered art brut, i began doing a lot of research on jean dubuffet. like, a lot. i will get into that more, but one of the things i read in a book titled Brutal Aesthetics said that dubuffet was interested in the “treatment of traditional themes with childish techniques.” this is part of what we are seeing in his Woman Pinning Up Her Hair and Childbirth. It was a moment of recognition, like, oh, i was thinking of doing something that has been done in a serious way. Art brut and jean dubuffet came as a validating discovery. And this is something i’m excited about.

There is an essential irony to learning the history and tradition of a fascination with childlike drawing that exemplifies the paradox that was central to dubuffet’s understanding of art brut. In general it is a twist in the nature/culture distinction that mirrors other oppositions in play with the child, the naive, the innocent, on the side of nature, and the corrupted adult on the side of culture. “Dubuffet”, Hal Foster writes, “was on the side of the very acculturation that he lamented.” Art brut cannot be made consciously. Art made in the style of art brut can never be art brut, and that is ok. I can make a childlike drawing, but it will only be childlike, not a drawing made by a child. Unless i’m really just a 30 year old child, which is probably closer to the truth, and that is where we want to be. But the paradox is even more murky, as we proceed with kettle logic: on the one hand there is no way of truly coaxing the childlike into enacting itself so that we can harness the pure essential quality of the child, but on the other hand, we can, because the child never existed, not in a pure, prelapsarian state, not prior to but always already after corruption, somehow both before and after the fall. there is a mystical way of intuiting this that is valuable and that I pursue alongside a scientific approach, with measurable, quantifiable phenomena that fit into an analytic framework (mystic-child / scientist - man) (we will have to add bricoleur / engineer to this dichotomy soon, as dubuffet and levi-strauss were in direct correspondence and engagement with one another’s work). central to my thinking is a keen attentiveness to the ways we might create, enact, discover, invent the denaturalized child, that a precultural entity doesn’t exist as a static being but a state of becoming, and in a spot, for a moment, the mystical mind of the child exists and is born, distorted, and catalyzed by the knowledge of the acculturated. 
That it is never fully precultural, but that when it is, as animal, as child, as savage, as madman, it becomes so by the preexistence of a culture that it has not yet encountered itself. When the libido of a culture is liberated, it is a moment of recognition of that which it condemns externally as integral to its consistency as a distinct body. “insofar as it liberates libidinal instincts, these instincts are sadistic,” and my thinking is informed by De Sade the dialectician. my feeling is more or less this: perhaps kid art can only fully become kid art when a man does it. and that kid art is always made by a little man. art brut, the childlike, is kin to the hard, indigestible kernel of the real that resists symbolic order while also serving as grounds for a new symbolic order. and with the author of brutal aesthetics again, “Dubuffet and Bataille in particular seize on negative remainders, such as the brut and the informe, that appear to resist any cultural assimilation.”

kids are little men with their own rigorous schemata. everything i do in my work is in some sense chasing children, just as they chase me away from the playground, they’re a terrifying mob of bullies and mad scientists. all grounds are playgrounds. who says we can’t work on the playground? working with these images i realize that this quality of loose, gestural mark making was one of the central qualities i am in pursuit of in some way, in terms of the defining characteristics of kid art. (but, for me, it may turn out that largely what i will come to understand as the defining characteristics of kid art happen to coincide with the criteria for good art in general.) it is a recovery of raw vision without nostalgia, with new directions toward the child to come. kids are little men of knowledge with their own rigorous schemata, and according to student of bergson and pioneer in the study of child art, georges-henri luquet, they have a logically consistent framework that informs (l’informe) their works to a large degree. before kids learn the convention of visual, perspectival realism, they are enmeshed in something Luquet called intellectual realism. this is when children draw things not really as they see them, but as they know them. one technique by which this is evidenced is one that relies on an objective notion of transparency: when a kid draws a shoe, you can often still see the toes on the feet through the shoe.

Foster quoting Luquet:

“According to Luquet, intellectual realism is governed by “internal models”; the object depicted by the child exists in her mind more than in the world, and she is faithful to this mental type more than to any mundane referent. Indeed, she aims to conserve this model at all cost. Hence her sign for a horse or a house will reappear, in whole or part, in drawing after drawing (Luquet called this insistence “graphic automatism”), and if the image seems wrong to the child, she will simply do it over on the same piece of paper. What defines the internal model is its “exemplarity”; it “is not meant to represent an individual object or person, but a category of objects.” To achieve this goal of “a generic image,” the child works “to present the greatest number, if not all, of the essential elements of the represented object, and to preserve each in its characteristic shape, ‘in itself,’ as it were.” It is this imperative that leads the child to her intellectual realism, which adults see as distortion but she regards as fidelity, and in this matter she proceeds logically.”

children are extreme idealists of Tlönian order (hlör u fang axaxaxas mlö).

from Brutal Aesthetics:

“For the art of children, which he included in the art brut collection from the beginning, speaks directly to his anti-aesthetic then in emergence: it has a strong claim on the motivated (children draw “solely for their own personal satisfaction,” not for others like parents and teachers), on the fundamental (according to Luquet, this art reveals prime elements of picture-making), and on the general (we were all child artists once). This last point was especially important to Dubuffet; again and again he favored generality over specificity. Of course, these broad concerns with the necessary, the essential, and the universal are quintessentially modernist; so too are the particular qualities typical of child drawing, such as flatness of image, frontality of figure, ambiguity between horizontal and vertical planes, and composition understood as tabulation more than as tableau.”

composition understood as tabulation more than as tableau. I’ve thought about this line a lot since i read it because i think it is another articulation of something i had been searching for in art in general, and it made explicit for me the connection between kid art and outsider art by way of defining a particular aesthetic characteristic, something like the aesthetic of the diagram. this is also where the child and the madman are connected. much in the way that children are faithful to internal models as they compose, as they are drawing what they know, so too is the schizophrenic, or other people that experience life in a way that has been, either or both internally or externally characterized as atypical, biologically and or socially, who i will simply refer to as the madman. (it seems apt that a dedication to a logic of exemplarity would be found to be a guiding principle in those that have somehow been deemed exceptional or exemplary.) supposedly what separates the fantastical compositions of a madman from those of the common man is the fact that for the madman, what is depicted, though fanciful, is not an imagined abstraction of the world and how they see it but a faithful representation of it. it is plain to see that these are the faultlines around which falters the distinction between the common man and the exceptional man. perhaps it is only ever a difference in degree of deviation from the accepted image of thought, varying in degrees of abnormality against the standard scale of the normal. again the nature/culture divide comes back here but with new stakes, perhaps we are less willing to admit that this dichotomy penetrates the substrate of something we hold to be so foundational as a cognitive science of human perspective, that there is, at some level an equivalence in what we mean when we say, respectively, how we see the world (neuro-bio-physiologically) and how we see the world (culturally, aesthetically). 
this reveals to me another question that recurs for me which has to do with speculating upon the relative spatio-temporal relation of position with regards to perception and interpretation and their (im)possible simultaneity and/or coincidence (or maybe kant really did already settle this: “thoughts without content are empty, intuitions without concepts blind”). for the madman, the world is regular so he makes regular things. for the common man, the world is wonky so he makes things wonky.


infoshakil260 · 1 year ago

Text

Miaoya, the "wonderful duck" priced at 9.9 yuan, won't fly for long.

Recently, an AI application called Miaoya Camera has become popular thanks to its novel approach. Photos generated by it can be seen all over social media and have sparked intense discussion among users. Can Miaoya Camera maintain its popularity and keep developing? Let’s take a look at the author's analysis.

Even if you can't fly, at least you have flown.

If your WeChat Moments feed is flooded with portrait photos of your "friends", don't panic. For only 9.9 yuan, the price of a cup of Luckin coffee, you can have a complete set too.

If you would rather spend the money on a cup of Luckin, or you really can’t find 20 selfies to use as material, don’t panic either, because this mini program that “generates professional-quality portraits online” probably won't stay popular for long.

"Miaoya Camera" suddenly became popular, and many people used it to create all kinds of personal portraits: business style, ancient costume style, Japanese campus style. So many people were trying it that “the server was overcrowded.”

The application is not complicated to operate. Upload about 20 selfies, choose a template to set the style, and then wait for the AI to generate the photos. I won’t go into detail about the results here. If you search for “Miaoya” in WeChat Moments, you should find examples.

What deserves more attention is that this is an AI application. To be precise, it is the first domestic AI application that is consumer-oriented, paid, and viral.

This is remarkable. Since large AI models became popular at the beginning of this year, almost all large Internet companies have been working on AI and pouring money into it. However, there has never been a product that ordinary people can easily pick up and use on a large scale. Chatbots such as Baidu's Wenxin Yiyan and Alibaba's Tongyi Qianwen are not open to the public, while Huawei's Pangu model and Tencent's Hunyuan model are aimed at enterprise (B-side) customers and have nothing to do with ordinary people.

But Miaoya is changing this situation, and it comes with a fee.

9.9 yuan may not seem like much, but almost all similar photo apps in the past were free at the beginning. Face Meng, ZAO and others were all very popular, and none of them charged money right away.

While Miaoya is generating photos for users, it is also slapping investors in the face: “Who said big AI models don’t make money?”

This makes Miaoya a good research sample. Although being a flash in the pan is almost inevitable, it still offers lessons for AI entrepreneurs.

1. AI photo-taking starts with "alchemy"

Before discussing Miaoya, let's first turn to "Little Book", where a crowd of people have been showing off their "alchemy".

The so-called "alchemy" does not mean refining the elixir of life, but training an AI model. The process relies heavily on the graphics card: the computer runs at full speed and a single run may take dozens of hours, hence the name "alchemy".

What is the alchemy for? Using AI to generate high-quality portraits.

After AI painting became popular, the open-source Stable Diffusion model from overseas (hereinafter referred to as "SD") and Midjourney, both of which ordinary people can use, began to attract more users. These two tools are extremely powerful and produce stunning images. In particular, the SD model is now the underlying technology behind many text-to-image applications.

Some early adopters were the first to try them, using SD and Midjourney to generate cartoon-style avatars for themselves, then more realistic avatars, and then more complex scenes: "blockbuster" photos that could not be shot anywhere in the real world.

Before Miaoya appeared, some people had already changed their WeChat avatars to AI portraits. There are also some people who have turned this skill into a business - teaching others AI portraits, or customizing AI portraits for others. The price of customized photos ranges from more than ten yuan to dozens of yuan.

A group of wild "alchemists" have embarked on a vigorous journey of training AI.

The "Little Fairy's Tool Man" (for ease of reference, let's call him "Gong Juren") started thinking about this and tinkering with some programs in May, and then made his own LoRA model. It took him more than half a month to make a LoRA model of his wife using hundreds of real photos. In early June, he used the model to generate his first AI photo.

A LoRA model can be understood as a plug-in that customizes and fine-tunes the large SD model. It needs only a small amount of data to learn a particular style or a particular subject's features.
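To make the "plug-in" idea concrete, here is a minimal sketch of the low-rank update that LoRA is generally described as using: the large weight matrix W stays frozen, and only two small matrices B and A are trained, with their product added on top. This is not Miaoya's or Gong Juren's code. The function names, the toy matrices, and the 768x768 layer size with rank 4 are assumptions chosen only for illustration.

```python
# Illustrative sketch only: LoRA keeps the large weight matrix W frozen and
# learns a small low-rank update B @ A that is added on top of it.
# All names and sizes here are hypothetical, not taken from any real product.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def apply_lora(W, B, A, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    scale = alpha / rank
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def param_counts(d_out, d_in, rank):
    """Trainable parameters: full fine-tuning vs. a LoRA adapter."""
    return d_out * d_in, rank * (d_out + d_in)

# Toy example: a 2x2 "large" weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]        # shape (d_out, rank)
A = [[0.5, 0.5]]          # shape (rank, d_in)
W_adapted = apply_lora(W, B, A, alpha=1.0, rank=1)

# Hypothetical layer size, to show why so little data and compute can suffice.
full, lora = param_counts(768, 768, rank=4)
```

In this toy setup the adapter for one 768x768 layer has 6,144 trainable values against 589,824 for the full matrix, which is one way to see why a LoRA can be trained from a few dozen photos while the base model stays untouched.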

For example, the "beautiful girl" AI pictures that were very popular in the first half of the year were created by someone who trained a LoRA model on photos of Japanese and Korean girls. Once other developers obtained that model, they could quickly generate photos in the same style.

Using SD as the base and LoRA for the "alchemy" is a method used by many developers. The technologies and code involved are basically open source, and the threshold for AI photography was already lowered in the first half of the year.

Gong Juren did not major in computer science, nor does he work in an AI-related field. Before doing AI portraits, his only foundation was a little self-taught Python. His starting point was to take good-looking photos of his wife. He used to use a camera, and now he uses AI.

He told "Dingjiao" that he started paying attention to Miaoya as soon as it went online on July 17. He believes Miaoya's implementation is the same as what he does.

"It should be built on SD combined with LoRA character models. Several character templates are fixed, users upload 20 photos to train a LoRA, and the resulting LoRA is then inserted into the templates made with the large model to swap in the face," Gong Juren analyzed.

Many investors who follow the AI track also believe that Miaoya’s technical principle is SD+LoRA.

Wang Sheng, managing partner of Inno Angel Fund, told "Dingjiao": "I guess Miaoya uses the open-source SD model and then adds stylized training data in the manner of LoRA. They train a small LoRA for each person individually, and then blend in various background content."

Some investors told "Dingjiao" that Miaoya is "LoRA alchemy at the bottom layer plus scene prompt words", because 20 photos is the basic amount of data needed to train a LoRA.

Miaoya has made templates for portrait scenes, such as workplace, emotions, Jiangnan, spring, etc. Users click to select the scene, which is similar to inputting prompt words to AI to determine the photo style. After the user uploads 20 photos, the system begins to train the personal LoRA, which is what Miaoya calls a digital clone.

Based on the opinions of many investors and practitioners, Miaoya’s AI portrait is not difficult to implement and the technical threshold is not high.

2. Is the price of a cup of Luckin worth it?

Although there is no technical threshold, there is still a threshold for using Miaoya.

In fact, Miaoya’s user experience is not very good.

First, users have to provide more than 20 selfies, which puts off a large group of straight men. Second, you have to queue to generate photos, and the wait can exceed ten hours when many people are using it. In addition, it charges 9.9 yuan; to download high-definition, uncensored photos, you have to top up and buy diamonds; and if you are not satisfied with the generated photos, there is no refund.

As a consumer (C-end) product, it sets up obstacles at almost every point that users are sensitive to.

Why, then, has Miaoya gone viral? Just because it is built on a large model?

"They went in the right direction and captured the strongest point of user demand. The product actually has a threshold. This is a matter of product and operations," Wang Sheng said.

From the product perspective, we have seen many similar personal-avatar products before. Ten years ago, Face Meng focused on generating personal cartoon images. Later, WeChat Christmas avatars became popular in WeChat Moments, and then came the face-swapping app ZAO, incubated by Momo. All of them captured users' needs.

For users, whether the pictures are fresh and fun, or whether they use pictures to socialize and satisfy certain psychological needs, the underlying logic is similar.

The demand for AI photography, too, already existed.

In December last year, Lensa, an overseas AI photo app, became popular. It lets users upload about 20 personal photos, and the AI generates avatars or portraits in different styles for a fee. It uses the SD model and the open-source dataset LAION-5B. Strictly speaking, this was the earliest AI photo product to go viral.

However, domestic AI photography was mainly popular among the small circles of AI developers and photography enthusiasts before. Ordinary people could not distinguish the concepts of SD and LoRA, let alone get started. Now, through a product, Miaoya has lowered the threshold for ordinary people to use AI photography to zero. This is the prerequisite for Miaoya to become popular.

Miaoya's development team relied on Alibaba Entertainment to promote the product. On some product-recommendation ("grass-planting") platforms, you can see bloggers’ promotional posts. "The execution power of large companies is still very strong," Gong Juren commented.

Price is a controversial point.

The usual playbook for similar products over the past ten years has been to attract users for free first, without considering profit in the short term. As long as the user base grows, there is no need to worry about making money. In other words, going viral matters more than charging.

Miaoya has charged fees since launch and will add more paid items later, which shuts out a large number of early adopters.

Some people may say that compared with offline studios such as Tianzhenlan and Haima that charge hundreds of yuan, 9.9 yuan is very cheap. Yes, it is indeed cheaper, but not everyone goes there just to take ID photos. “I’m just here to have fun, and I still have to pay?”

It’s not difficult to understand why Miaoya charges. It is a recognized fact that large AI models burn money, and insufficient computing power is a pain point for almost all AI companies today.

Gong Juren told "Dingjiao" that SD is an open source technology and the technical barriers are not high. What can be done is to make some optimizations on the algorithm and model.

In addition, the so-called digital clone is the user's character LoRA. Training this model takes time. If a large number of users pour in at the same time, computing power will inevitably be tight, so many users have to queue and wait. And once model training starts, computing power is consumed, which is a real cost. Miaoya has presumably calculated this cost, so it has to charge as soon as you use it.

In addition, the pictures generated by Miaoya generally have low resolution. If you need a large high-definition picture, more computing power is needed to upscale the image, so exporting high-definition pictures costs money.

However, this allowed Miaoya to have monetization capabilities from the beginning, becoming the first C-side monetization application among China’s large models.

Putting aside this rather meaningless title, the question now facing Miaoya is: how can it remain popular when the technical barriers are not high, the product experience is average, and charging limits its spread?

3. Is Miaoya really popular?

Before discussing the prospects of Miaoya, let’s answer a question first: Is Miaoya really popular?

In comparison, Face Meng can be called truly popular. It once had more than 20 million app downloads within a week. ZAO also went viral: on the second day after launch, it rose to second place on the App Store's free entertainment chart and remained at the top until it was removed. Lensa topped the App Store download charts in the United States, Canada, the United Kingdom and other countries, with peak single-day downloads exceeding 15 million.

How are Miaoya's results?

Miaoya does not have an app; for now it is a mini program. In China there is a platform called Aladdin Index that specializes in tracking the popularity of mini programs. We did not see Miaoya Camera on any of Aladdin’s major lists.

The WeChat Index shows that Miaoya Camera's index stayed very low for the first four days after it went online on July 17. It started to rise on the fifth day and peaked on July 25. From July 26, however, the index turned downward and declined sharply. Its heat lasted only about a week.

Baidu Index shows that the popularity of Miaoya Camera began to rise on July 23, reaching a peak on July 24, and then began to slowly decline, with the highest peak being less than 8,000. From the perspective of urban distribution, the people paying attention are mainly concentrated in first-tier cities such as Beijing, Shanghai, and Guangzhou.

We can draw a preliminary conclusion: Miaoya has not really broken out of its niche.

"The buzz is in the media and related industries, as well as in investment and financing circles," Liu Di, managing director of Shengjing Jiacheng, told "Dingjiao".

With a threshold of 9.9 yuan, the first wave failed to break through, and it will be very difficult for Miaoya to go viral again later.

Gong Juren believes that when generating full-body photos of people, existing AI models allocate too few pixels to the face, hands and feet, so it is difficult to produce an accurate full-body photo in one pass. A face-restoration plug-in such as After Detailer is basically required to repair the face, or the fingers have to be fixed manually. Therefore, most of Miaoya’s photos are headshots and half-length photos, which highlight the subject and blur the background. “This can also easily become formulaic and lead to aesthetic fatigue. People will get tired of seeing it and find it boring.”

Miaoya’s “predecessor” Lensa has already given the answer.

According to statistics from data analysis company Sensor Tower, Lensa's downloads began to increase in mid-November last year, peaked in mid-December, and fell back to low levels in mid-January this year. Its popularity over the entire life cycle lasted only two months. Other leading AI image applications follow almost the same trajectory.

In comparison, Miaoya is not as popular as Lensa.

Compared with discussions at the functional level, the most discussed aspects of Miaoya in the past few days are two issues that are not very beneficial to the company - data privacy and refund issues.

Miaoya’s original user service agreement contained an unfair "overlord clause". In plain terms, the selfies users submitted to Miaoya could be used by the Miaoya team without restriction, and the scope of use even included virtual spaces such as the Metaverse.

Twenty personal selfies are enough to build a complete face model. With this model, the platform can generate photos of the user in any scene.

Even if the platform does no evil, it is hard to prevent misuse by people with ulterior motives. "How can Miaoya ensure that nothing has been slipped into the 20 photos? Some people are already including photos of celebrities among their 20 photos to make the resulting photos look better," Liu Di said.

This has caused some users to express concern and ask to cancel their accounts. Now, Miaoya has deleted this clause.

There is also the issue of refunds. According to Miaoya's rules, after the user pays to generate photos, the service is deemed to be completed. Even if the user is not satisfied, there will be no refund.

This rule is not excessive. After all, if you go to a physical store to take pictures, there are very few promises of refunds if you are not satisfied. The difference is that Miaoya's service is template-based. Sometimes the photos generated by AI just don't achieve the desired effect, but you can't negotiate and fine-tune it like you can with a human. It's a bit like a one-shot deal.

Gong Juren told "Dingjiao" that the LoRA model has a limitation: 20 photos are not enough to reliably produce a likeness of the person. For some people the result looks like them, and for others it does not.

In response to the issue of refunds, Miaoya also issued a special response. This shows that there are many users who have requests for refunds.

These controversies, while increasing the popularity of Miaoya, are also limiting its popularity.

4. Miaoya’s inspiration for AI entrepreneurship

In any case, Miaoya has become popular in some people’s WeChat Moments. Today, with the "Battle of 100 Models" in full swing, a product that has gone viral is an encouragement to AI entrepreneurs.

In the wave of mobile Internet entrepreneurship, camera applications were the first to hit the market. Now that we are entering the AI entrepreneurship wave, camera products are also the first to take the lead. The similarity of this path excites some entrepreneurs.

However, investors remain cautiously optimistic.

#Music #Film & Animation #Sports #Travel & Events #Gaming #People & Blogs #Comedy #Entertainment
