#what is video encoding and decoding
getreview4u · 1 year
Photo
Tumblr media
(via What is Video Encoding and Decoding?)
0 notes
zephiris · 8 months
Text
Being autistic feels like having to emulate brain hardware that most other people have. Being allistic is like having a social chip in the brain that handles converting thoughts into social communication and vice versa while being autistic is like using the CPU to essentially emulate what that social chip does in allistic people.
Skip this paragraph if you know about video codec hardware on GPUs. Similarly, some computers have hardware chips specifically meant for encoding and decoding specific video formats like H.264 (usually located in the GPU), while other computers might not have those chips built in, so encoding and decoding videos must be done “by hand” on the CPU. The CPU method usually takes longer but is also usually more configurable, so its output quality can sometimes surpass the hardware chip’s, depending on the settings used for the CPU encoding.
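(If you're curious what that difference looks like in practice, here's a rough sketch using ffmpeg from Python. The encoder names are common ones (libx264 for the CPU route, h264_nvenc for the dedicated chip on an NVIDIA GPU), but whether you have a hardware encoder at all depends on your machine.)

```python
import subprocess

# Software ("do it on the CPU") encode: slower, but exposes lots of knobs,
# like libx264's preset and CRF quality factor.
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-c:v", "libx264", "-preset", "slow", "-crf", "18",
    "output_cpu.mp4",
])

# Hardware ("dedicated chip") encode: much faster, fewer knobs.
# h264_nvenc uses the encoder block on an NVIDIA GPU, if you have one.
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-c:v", "h264_nvenc",
    "output_gpu.mp4",
])
```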
In conclusion, hardware versus CPU video encoding and decoding in computers is like allistic versus autistic social encoding and decoding in people.
153 notes · View notes
feminerds · 9 months
Text
Tumblr media Tumblr media Tumblr media
Couple of pages, one from about this time last year and two from last week-ish.
Page 1. 17 Nov 2022 - Flower on Head Bunny.
Soft Sweet Naive Tender Bunny.
She's a Rider - Caroline Polachek "Bunny is a Rider"
of Montreal "Bunny Ain't No Kind of Rider"
Listening on OGION (my computer, all my towers have been named after famous fictional wizards, which I kinda didn't realise until I got this new one last year and was like I gotta name it!)
Page 2. 06 Dec 2023 - Billions and Billionaires.
I've been thinking about Caroline Polachek's Desire I Want to Turn Into You (DIWTTIY) a LOT, I can tell because Kits has started calling her "Caro Polo" in a funny singsong voice that implies we're talking about the -thing- again. I will try to write all the loosely assembled thoughts down in a continue reading jump. Maybe. I dunno.
Erstwhile indie darling, goose screamer, k*yne scooper and acclaimed quirked-up vocalist Caro Polo has stopped explaining, intellectualising, labeling, making sense! I'm sure there's theory or a name for this idea or reoccurring expression(ism) ;) because of course there's always a framework, context and philosophy that one must know of and employ effectively to place their work in the culture - but good pop music doesn't make sense, or have a basis in theory, and it will not explain itself!
I think PONY by Ginuwine is the most clear example, to mind, like you get it... you get what he's on about, the vibe could not be more legible, tbh, but the song lyrics themselves are not in clear support of the thesis, nor is the odd farting bass, but nonetheless you vibe!
It’s giving C'mon stop trying to hit me and hit me. Morpheus in the Matrix, who was of course shitting on the mother toilet when he said that.
I dunno for sure, really the intentionality of it or her work generally, and it is beside the point — to be encoded or decoded merely makes it a signal, not a sign, not a message, not a meaning. Is this all in there or are you projecting? It doesn't matter so much, as it is a successful attempt at unfocussing that third mythic eye*, feeling that intangible gestalt enough that she's tuning into the desperate leitmotifs of the current moment and amplifying/refracting them through this soaring album.
OSTENSION. DÉTOURNEMENT. RECUPERATION.
Par Avion
Re: The Billion____s and The Billionaires
Page 3. 06 Dec 2023 - Année du Lapin (2023) Year of Bunny
Low Pixel
Certified Top 0.1% fan of Caroline Polachek in 2023
Bunny is Hustle Culture?
Bunny is the girls on Epst*in's Private Island? This has legs but is grim and requires more thoughtful elaboration than the general rambling I’m currently on.
Is DIWTTIY about the weird, real and unreal relationship we all have with billionaires because of their inescapable influence over lives?
Is that what the Grimes feat. is about? Is that why Caro Polo did the Harambe song? I've got a Hyper-Chain-Link-Fence of Theories.
That Trace Dominguez video - You Love Billionaires?
Why Bunny?
and What makes her a Rider?
*Probbo I know!
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Any ways here's my Tally Tunes playlist for 2023.
22 notes · View notes
michaeldswanson · 5 months
Text
Apple’s Mysterious Fisheye Projection
If you’ve read my first post about Spatial Video, the second about Encoding Spatial Video, or if you’ve used my command-line tool, you may recall a mention of Apple’s mysterious “fisheye” projection format. Mysterious because they’ve documented a CMProjectionType.fisheye enumeration with no elaboration, they stream their immersive Apple TV+ videos in this format, yet they’ve provided no method to produce or playback third-party content using this projection type.
Additionally, the format is undocumented, they haven’t responded to an open question on the Apple Discussion Forums asking for more detail, and they didn’t cover it in their WWDC23 sessions. As someone who has experience in this area – and a relentless curiosity – I’ve spent time digging-in to Apple’s fisheye projection format, and this post shares what I’ve learned.
As stated in my prior post, I am not an Apple employee, and everything I’ve written here is based on my own history, experience (specifically my time at immersive video startup, Pixvana, from 2016-2020), research, and experimentation. I’m sure that some of this is incorrect, and I hope we’ll all learn more at WWDC24.
Spherical Content
Imagine sitting in a swivel chair and looking straight ahead. If you tilt your head to look straight up (at the zenith), that’s 90 degrees. Likewise, if you were looking straight ahead and tilted your head all the way down (at the nadir), that’s also 90 degrees. So, your reality has a total vertical field-of-view of 90 + 90 = 180 degrees.
Sitting in that same chair, if you swivel 90 degrees to the left or 90 degrees to the right, you’re able to view a full 90 + 90 = 180 degrees of horizontal content (your horizontal field-of-view). If you spun your chair all the way around to look at the “back half” of your environment, you would spin past a full 360 degrees of content.
When we talk about immersive video, it’s common to only refer to the horizontal field-of-view (like 180 or 360) with the assumption that the vertical field-of-view is always 180. Of course, this doesn’t have to be true, because we can capture whatever we’d like, edit whatever we’d like, and playback whatever we’d like.
But when someone says something like VR180, they really mean immersive video that has a 180-degree horizontal field-of-view and a 180-degree vertical field-of-view. Similarly, 360 video is 360-degrees horizontally by 180-degrees vertically.
Projections
When immersive video is played back in a device like the Apple Vision Pro, the Meta Quest, or others, the content is displayed as if a viewer’s eyes are at the center of a sphere watching video that is displayed on its inner surface. For 180-degree content, this is a hemisphere. For 360-degree content, this is a full sphere. But it can really be anything in between; at Pixvana, we sometimes referred to this as any-degree video.
It's here where we run into a small problem. How do we encode this immersive, spherical content? All the common video codecs (H.264, VP9, HEVC, MV-HEVC, AVC1, etc.) are designed to encode and decode data to and from a rectangular frame. So how do you take something like a spherical image of the Earth (i.e. a globe) and store it in a rectangular shape? That sounds like a map to me. And indeed, that transformation is referred to as a map projection.
Equirectangular
While there are many different projection types that each have useful properties in specific situations, spherical video and images most commonly use an equirectangular projection. This is a very simple transformation to perform (it looks more complicated than it is). Each x location on a rectangular image represents a longitude value on a sphere, and each y location represents a latitude. That’s it. Because of these relationships, this kind of projection can also be called a lat/long.
Imagine “peeling” thin one-degree-tall strips from a globe, starting at the equator. We start there because it’s the longest strip. To transform it to a rectangular shape, start by pasting that strip horizontally across the middle of a sheet of paper (in landscape orientation). Then, continue peeling and pasting up or down in one-degree increments. Be sure to stretch each strip to be as long as the first, meaning that the very short strips at the north and south poles are stretched a lot. Don’t break them! When you’re done, you’ll have a 360-degree equirectangular projection that looks like this.
If you did this exact same thing with half of the globe, you’d end up with a 180-degree equirectangular projection, sometimes called a half-equirect. Performed digitally, it’s common to allocate the same number of pixels to each degree of image data. So, for a full 360-degree by 180-degree equirect, the rectangular video frame would have an aspect ratio of 2:1 (the horizontal dimension is twice the vertical dimension). For 180-degree by 180-degree video, it’d be 1:1 (a square). Like many things, these aren’t hard and fast rules, and for technical reasons, sometimes frames are stretched horizontally or vertically to fit within the capabilities of an encoder or playback device.
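To make that lat/long relationship concrete, here's a small Python sketch of the mapping. This is just my own illustration of the general convention described above (axis orientations and ranges vary between tools), not code from any particular pipeline.

```python
import math

def equirect_pixel_to_latlong(x, y, width, height):
    """Map a pixel in a full 360x180 equirectangular frame to (lat, long) in degrees.

    x runs left-to-right across 360 degrees of longitude; y runs top-to-bottom
    from +90 (zenith) to -90 (nadir) degrees of latitude.
    """
    longitude = (x / width) * 360.0 - 180.0   # -180 .. +180
    latitude = 90.0 - (y / height) * 180.0    # +90 .. -90
    return latitude, longitude

def latlong_to_direction(lat_deg, long_deg):
    """Convert (lat, long) to a unit direction vector (one common y-up convention)."""
    lat, lon = math.radians(lat_deg), math.radians(long_deg)
    return (math.cos(lat) * math.sin(lon),   # x: right
            math.sin(lat),                   # y: up
            math.cos(lat) * math.cos(lon))   # z: forward

# A 2:1 frame (e.g. 4096x2048) allocates the same number of pixels per degree
# in both dimensions, which is why full 360x180 equirects are usually 2:1.
print(equirect_pixel_to_latlong(2048, 1024, 4096, 2048))  # the center: (0.0, 0.0)
```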
This is a 180-degree half equirectangular image overlaid with a grid to illustrate its distortions. It was created from the standard fisheye image further below. Watch an animated version of this transformation.
Tumblr media
What we’ve described so far is equivalent to monoscopic (2D) video. For stereoscopic (3D) video, we need to pack two of these images into each frame…one for each eye. This is usually accomplished by arranging two images in a side-by-side or over/under layout. For full 360-degree stereoscopic video in an over/under layout, this makes the final video frame 1:1 (because we now have 360 degrees of image data in both dimensions). As described in my prior post on Encoding Spatial Video, though, Apple has chosen to encode stereo video using MV-HEVC, so each eye’s projection is stored in its own dedicated video layer, meaning that the reported video dimensions match that of a single eye.
Standard Fisheye
Most immersive video cameras feature one or more fisheye lenses. For 180-degree stereo (the short way of saying stereoscopic) video, this is almost always two lenses in a side-by-side configuration, separated by ~63-65mm, very much like human eyes (some 180 cameras).
The raw frames that are captured by these cameras are recorded as fisheye images where each circular image area represents ~180 degrees (or more) of visual content. In most workflows, these raw fisheye images are transformed into an equirectangular or half-equirectangular projection for final delivery and playback.
This is a 180 degree standard fisheye image overlaid with a grid. This image is the source of the other images in this post.
Tumblr media
Apple’s Fisheye
This brings us to the topic of this post. As I stated in the introduction, Apple has encoded the raw frames of their immersive videos in a “fisheye” projection format. I know this, because I’ve monitored the network traffic to my Apple Vision Pro, and I’ve seen the HLS streaming manifests that describe each of the network streams. This is how I originally discovered and reported that these streams – in their highest quality representations – are ~50Mbps, HDR10, 4320x4320 per eye, at 90fps.
While I can see the streaming manifests, I am unable to view the raw video frames, because all the immersive videos are protected by DRM. This makes perfect sense, and while I’m a curious engineer who would love to see a raw fisheye frame, I am unwilling to go any further. So, in an earlier post, I asked anyone who knew more about the fisheye projection type to contact me directly. Otherwise, I figured I’d just have to wait for WWDC24.
Lo and behold, not a week or two after my post, an acquaintance introduced me to Andrew Chang who said that he had also monitored his network traffic and noticed that the Apple TV+ intro clip (an immersive version of this) is streamed in-the-clear. And indeed, it is encoded in the same fisheye projection. Bingo! Thank you, Andrew!
Now, I can finally see a raw fisheye video frame. Unfortunately, the frame is mostly black and featureless, including only an Apple TV+ logo and some God rays. Not a lot to go on. Still, having a lot of experience with both practical and experimental projection types, I figured I’d see what I could figure out. And before you ask, no, I’m not including the actual logo, raw frame, or video in this post, because it’s not mine to distribute.
Immediately, just based on logo distortions, it’s clear that Apple’s fisheye projection format isn’t the same as a standard fisheye recording. This isn’t too surprising, given that it makes little sense to encode only a circular region in the center of a square frame and leave the remainder black; you typically want to use all the pixels in the frame to send as much data as possible (like the equirectangular format described earlier).
Additionally, instead of seeing the logo horizontally aligned, it’s rotated 45 degrees clockwise, aligning it with the diagonal that runs from the upper-left to the lower-right of the frame. This makes sense, because the diagonal is the longest dimension of the frame, and as a result, it can store more horizontal (post-rotation) pixels than if the frame wasn’t rotated at all.
This is the same standard fisheye image from above transformed into a format that seems very similar to Apple’s fisheye format. Watch an animated version of this transformation.
Tumblr media
Likewise, the diagonal from the lower-left to the upper-right represents the vertical dimension of playback (again, post-rotation) providing a similar increase in available pixels. This means that – during rotated playback – the now-diagonal directions should contain the least amount of image data. Correctly-tuned, this likely isn’t visible, but it’s interesting to note.
More Pixels
You might be asking, where do these “extra” pixels come from? I mean, if we start with a traditional raw circular fisheye image captured from a camera and just stretch it out to cover a square frame, what have we gained? Those are great questions that have many possible answers.
This is why I liken video processing to turning knobs in a 747 cockpit: if you turn one of those knobs, you more-than-likely need to change something else to balance it out. Which leads to turning more knobs, and so on. Video processing is frequently an optimization problem just like this. Some initial thoughts:
It could be that the source video is captured at a higher resolution, and when transforming the video to a lower resolution, the “extra” image data is preserved by taking advantage of the square frame.
Perhaps the camera optically transforms the circular fisheye image (using physical lenses) to fill more of the rectangular sensor during capture. This means that we have additional image data to start and storing it in this expanded fisheye format allows us to preserve more of it.
Similarly, if we record the image using more than two lenses, there may be more data to preserve during the transformation. For what it’s worth, it appears that Apple captures their immersive videos with a two-lens pair, and you can see them hiding in the speaker cabinets in the Alicia Keys video.
There are many other factors beyond the scope of this post that can influence the design of Apple’s fisheye format. Some of them include distortion handling, the size of the area that’s allocated to each pixel, where the “most important” pixels are located in the frame, how high-frequency details affect encoder performance, how the distorted motion in the transformed frame influences motion estimation efficiency, how the pixels are sampled and displayed during playback, and much more.
Blender
But let’s get back to that raw Apple fisheye frame. Knowing that the image represents ~180 degrees, I loaded up Blender and started to guess at a possible geometry for playback based on the visible distortions. At that point, I wasn’t sure if the frame encodes faces of the playback geometry or if the distortions are related to another kind of mathematical mapping. Some of the distortions are more severe than expected, though, and my mind couldn’t imagine what kind of mesh corrected for those distortions (so tempted to blame my aphantasia here, but my spatial senses are otherwise excellent).
One of the many meshes and UV maps that I’ve experimented with in Blender.
Tumblr media
Radial Stretching
If you’ve ever worked with projection mappings, fisheye lenses, equirectangular images, camera calibration, cube mapping techniques, and so much more, Google has inevitably led you to one of Paul Bourke’s many fantastic articles. I’ve exchanged a few e-mails with Paul over the years, so I reached out to see if he had any insight.
After some back-and-forth discussion over a couple of weeks, we both agreed that Apple’s fisheye projection is most similar to a technique called radial stretching (with that 45-degree clockwise rotation thrown in). You can read more about this technique and others in Mappings between Sphere, Disc, and Square and Marc B. Reynolds’ interactive page on Square/Disc mappings.
Basically, though, imagine a traditional centered, circular fisheye image that touches each edge of a square frame. Now, similar to the equirectangular strip-peeling exercise I described earlier with the globe, imagine peeling one-degree wide strips radially from the center of the image and stretching those along the same angle until they touch the edge of the square frame. As the name implies, that’s radial stretching. It’s probably the technique you’d invent on your own if you had to come up with something.
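Here's a minimal Python sketch of that basic radial stretch, just my own illustration of the technique from those references; it leaves out the 45-degree rotation and the bevel/diagonal adjustments discussed below.

```python
import math

def disc_to_square(x, y):
    """Radially stretch a point in the unit disc out to fill the square [-1, 1]^2.

    Each point moves outward along its own ray so that the unit circle lands
    exactly on the square's boundary.
    """
    r = math.hypot(x, y)
    if r == 0.0:
        return 0.0, 0.0
    # Along this ray, the square's edge sits at distance r / max(|x|, |y|).
    scale = r / max(abs(x), abs(y))
    return x * scale, y * scale

def square_to_disc(u, v):
    """Inverse mapping: radially compress the square back onto the unit disc."""
    r = math.hypot(u, v)
    if r == 0.0:
        return 0.0, 0.0
    scale = max(abs(u), abs(v)) / r
    return u * scale, v * scale

# A point on the unit circle at 45 degrees stretches out to the square's corner...
print(disc_to_square(math.cos(math.pi / 4), math.sin(math.pi / 4)))  # ~(1.0, 1.0)
# ...and compresses back again.
print(square_to_disc(1.0, 1.0))  # ~(0.707, 0.707)
```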
By performing the reverse of this operation on a raw Apple fisheye frame, you end up with a pretty good looking version of the Apple TV+ logo. But, it’s not 100% correct. It appears that there is some additional logic being used along the diagonals to reduce the amount of radial stretching and distortion (and perhaps to keep image data away from the encoded corners). I’ve experimented with many approaches, but I still can’t achieve a 100% match. My best guess so far uses simple beveled corners, and this is the same transformation I used for the earlier image.
Tumblr media
It's also possible that this last bit of distortion could be explained by a specific projection geometry, and I’ve iterated over many permutations that get close…but not all the way there. For what it’s worth, I would be slightly surprised if Apple was encoding to a specific geometry because it adds unnecessary complexity to the toolchain and reduces overall flexibility.
While I have been able to playback the Apple TV+ logo using the techniques I’ve described, the frame lacks any real detail beyond its center. So, it’s still possible that the mapping I’ve arrived at falls apart along the periphery. Guess I’ll continue to cross my fingers and hope that we learn more at WWDC24.
Conclusion
This post covered my experimentation with the technical aspects of Apple’s fisheye projection format. Along the way, it’s been fun to collaborate with Andrew, Paul, and others to work through the details. And while we were unable to arrive at a 100% solution, we’re most definitely within range.
The remaining questions I have relate to why someone would choose this projection format over half-equirectangular. Clearly Apple believes there are worthwhile benefits, or they wouldn’t have bothered to build a toolchain to capture, process, and stream video in this format. I can imagine many possible advantages, and I’ve enumerated some of them in this post. With time, I’m sure we’ll learn more from Apple themselves and from experiments that all of us can run when their fisheye format is supported by existing tools.
It's an exciting time to be revisiting immersive video, and we have Apple to thank for it.
As always, I love hearing from you. It keeps me motivated! Thank you for reading.
12 notes · View notes
fettesans · 2 months
Text
Tumblr media
Top, Gold oval brooch with a band of diamonds within a blue glass guilloche border surrounded by white enamel (1890). Lady’s blue right eye with dark brow (from Lover’s Eyes: Eye Miniatures from the Skier Collection and courtesy of D Giles, Limited) Via. Bottom, screen capture of the ceremonial South Pole on Google Street View, part of a suite of Antarctica sites Google released in 360-degree panoramics on Street View on July 12, 2017. Taken by me on July 29, 2024. Via.
--
Images are mediations between the world and human beings. Human beings 'ex-ist', i.e. the world is not immediately accessible to them and therefore images are needed to make it comprehensible. However, as soon as this happens, images come between the world and human beings. They are supposed to be maps but they turn into screens: Instead of representing the world, they obscure it until human beings' lives finally become a function of the images they create. Human beings cease to decode the images and instead project them, still encoded, into the world 'out there', which meanwhile itself becomes like an image - a context of scenes, of states of things. This reversal of the function of the image can be called 'idolatry'; we can observe the process at work in the present day: The technical images currently all around us are in the process of magically restructuring our 'reality' and turning it into a 'global image scenario'. Essentially this is a question of 'amnesia'. Human beings forget they created the images in order to orientate themselves in the world. Since they are no longer able to decode them, their lives become a function of their own images: Imagination has turned into hallucination.
Vilém Flusser, from Towards a Philosophy of Photography, 1984. Translated by Anthony Mathews.
--
But. Actually what all of these people are doing, now, is using a computer. You could call the New Aesthetic the ‘Apple Mac’ Aesthetic, as that’s the computer of choice for most of these acts of creation. Images are made in Photoshop and Illustrator. Video is edited in Final Cut Pro. Buildings are rendered in Autodesk. Books are written in Scrivener. And so on. To paraphrase McLuhan “the hardware / software is the message” because while you can imitate as many different styles as you like in your digital arena of choice, ultimately they all end up interrelated by the architecture of the technology itself.
Damien Walter, from The New Aesthetic and I, posted on April 2, 2012. Via.
3 notes · View notes
mgmrosales · 7 months
Text
Ideology and Culturalism II: Pop Icons!
Tumblr media
youtube
Rihanna’s music video for “Bitch Better Have My Money” reflects the inner workings of the culture industry through the ideology discussed in the work of Adorno and Horkheimer. Firstly, to give a general overview of the music video and its role within the culture industry, Rihanna glorifies themes of wealth, power, and materialism–contributing to the ideas of commodification and dependency on economic value. The video begins with the kidnapping of the wife of a “wealthy” man; as the video progresses, we see Rihanna partaking in luxuries and borderline opulent activities while keeping the woman hostage. Money is at the very center of the music video, from driving a convertible (01:59-01:57), to lounging on a yacht (02:48-03:27), to partaking in drugs and alcohol (03:58-04:26). Adorno and Horkheimer reveal the manipulations within the culture industry–the falsification of profitable “needs.” They argued that the culture industry manipulates individuals’ desires and preferences; the music video profits off societal desires for success through wealth indicators–reinforcing materialism under a capitalist framework.
This is an industry that idealizes consumability as the way to value and gauge the necessity of a product, which disturbs the artistic process of creation. All forms of art, from podcasts to music videos, are subject to the interests of money. While this prioritization of money is explicitly promoted in the lyrics of BBHMM, the song fits the standard mold of creation. There is a repetitive nature within the lyrics; art is then removed from the artist–it becomes a commodified product, transformed and designed for profit. The industry is solely concerned with making profits–this is directly linked to pop culture and everything in between.
Tumblr media
Rihanna herself is a pop culture icon subject to a mass capitalist, consumerist mindset displayed within the music industry. She reflects the trends of the time; we can see this through the musical style of the backing track and the stylized outfits from 8 years ago, down to the makeup trends of a long, thick liner and the neutral but bold lip color, implementing the micro trends of 2015 to mold to what would reel in the masses. The content is standardized to capture the interest of the masses and to contribute to a homogeneous culture of values.
Tumblr media
During the process of encoding and decoding, the viewing audience may have varying interpretations of the music video based on their personal experience and background. Some may see the video as a form of women’s empowerment, the breakdown of a male-dominated industry: Rihanna taking control and asserting herself as a force to be reckoned with. Or, others can see this as the glorification of violence as a means of retaining wealth and power. This ties into the commodified rebellion aspect of the culture industry. We as an audience interpret and decode Rihanna as a powerful figure, empowering women for the sake of feminism, but at the same time she is primarily making profit from our sentiment, rather than directly advocating for women.
youtube
"Bitch I’m Madonna", coming out the same year as BBHMM, functions in a similar manner to Rihanna’s work. Both pop star icons, which directly support Adorno and Horkheimer’s beliefs on the culture industry, have been molded into marketable products for mass consumption by the culture industry. The name “Madonna” has become a brand itself, which is apparent in the name of the song! Her image is enhanced through her commercial appeal as a global and legendary celebrity. She also incorporates cameos from different house-hold name celebrities (02:00-02:28) like Beyonce, Kanye, Katy Perry, and Miley Cyrus–highlighting the interconnectedness of fame and strengthening the celebrity culture. Keeping it in the circle, literally, supports the capitalist notions of social status, a notion that Adorno and Horkheimer challenge in their work. 
Based on the way the music video was filmed, it features an excess of wealth and luxury through a party-like concept, with Madonna constantly surrounded by glamorous clothing and accessories. By barely having any cuts (it is almost a one-shot) and keeping Madonna as the focal point, the music video carefully crafts a branded image where success is measured by the individual’s own ability to make a name for themselves–promoting a culture of consumption. The flashy visuals, bright colors, and strong, visually appealing choreography conform to the ideas found within the culture industry and its expectations of a formulaic, mainstream media piece.
Tumblr media
Madonna received a lot of negative criticisms and feedback when premiering her music video back in 2015. The public reception consisted of a lot of people suggesting that Madonna was clawing to stay “relevant” at this time by displaying acts that are typically done by younger folk. Tying this to Stuart Hall’s findings of encoding and decoding, a viewer may receive a message of excessive narcissism within the world of Madonna. In correspondence to Hall’s ideas on oppositional reading, Madonna encodes the music video with her vibrancy and energy, showcasing empowerment for an older generation, but through decoding the message, one might suggest that the focus on materialism is detrimental to the career of the icon. The dynamic nature of decoding supports Stuart Hall's theory, emphasizing the active role of the audience in making meaning from media texts.
Tumblr media
Both iconic women showcase their adaptation to the cultural trends of their era in media. Because both celebrities are subject to the culture industry, conformity becomes a characteristic of the mass production of cultural products–this also applies to the standardized format commonly found in the music industry, which can diminish the artistic integrity found in the careers of both Rihanna and Madonna. Through the use of iconography, an audience may decode both artists’ expressions in the context of social frameworks.
Based on the conceptualization of the culture industry, are there aspects of pop culture that do not fall under the commodification of art? 
How does the interplay between the sound and visuals of both Madonna’s “Bitch I’m Madonna” and Rihanna’s “Bitch Better Have My Money” challenge or support each other? How may that influence the process of decoding the messages from the artist?
In what ways are Madonna and Rihanna presented as marketable commodities in their music videos, now and before, and how does this reflect the commercialization of art discussed in Adorno and Horkheimer’s “The Culture Industry: Enlightenment as Mass Deception”?
Max Horkheimer and Theodor Adorno, “The Culture Industry: Enlightenment as Mass Deception,” in Dialectic of Enlightenment (Stanford, CA: Stanford University Press, 2002)
Stuart Hall, “Encoding, Decoding,” in The Cultural Studies Reader (London: Routledge, 1993)
13 notes · View notes
lstine919 · 7 months
Text
Levi Stine - Ideology and Culturalism II
“Bitch Better Have My Money”
youtube
Tumblr media
Rihanna’s Bitch Better Have My Money, first and foremost, puts money at the center of human motivation. The entire concept of the video’s narrative is for Rihanna’s protagonist to obtain the money owed to her. Observing this work through the eyes of Horkheimer and Adorno, it is clear that this perpetuates the idolization of money in the culture industry. “Their ideology is business”,(1) and the ideology of this video drives the business of both the music industry and the culture industry. Horkheimer and Adorno claim that “the only escape from the work process is adaptation to it in leisure time.”(2) Viewers of this video consume a barrage of capitalistic ideals, most poignantly the way in which the kidnappers live in luxury because of their dirty, violent work. Much of the video takes place on their yacht (2:48-3:28), a testament to their wealth. The culture industry also seems to be represented by Rihanna herself. She is an unstoppable force that has come to take what’s hers, parallel to the irresistibility of the culture industry and how no one can survive without being roped into it.(3) Horkheimer and Adorno also claim that the media of the culture industry creates an illusion that is believed to still be connected to the real world,(4) and the normalization of such violence in the video can have a negative effect on viewers. Rihanna and her team of kidnappers perpetuate Horkheimer and Adorno’s notion of the all-consuming culture industry, an industry that worships money and the hard labor falsely-cited as required in order to obtain it. 
Tumblr media
Signs are in effect all throughout Bitch Better Have My Money, and are often subverted throughout the narrative. Stuart Hall states how denotation is controlled by the sign sender, while connotation, the reception of the message by the individual, is subject to numerous external factors. (5) The video opens (0:08-0:20) with a woman's legs sticking out of a wooden chest. Then, a woman is introduced in a lavish home with a formal dress and shiny earrings (0:21-0:31), denoting wealth while connoting, within the context of the story, money that has been immorally obtained. As the narrative of the story progresses and the rich woman remains held hostage, the viewer assumes that she is the one who will end up in the chest, that she is the “bitch” who “better have my money.” The video twists this on its head with the introduction of Mads Mikkelsen’s “The Accountant” (5:21), as well as the backstory of the situation, switching the connotation of Rihanna’s actions from seeming like senseless torture to being perceived as a powerful resistance against an evil man who wronged her. Hall claims that the process of encoding and decoding requires a means and a relation for social production, (6) a context for the sign to be put in. BBHMM, a hit song by a woman in an industry historically dominated by men, mirrors the female empowerment that the video displays in the climax of the narrative. The final shot of the video reveals Rihanna herself as the woman in the chest (6:02-6:37). She’s naked, smoking a cigarette, and covered in money and blood. The visual denotes the mutilation of The Accountant and a rightful repossession of funds. More intuitively, the phallic imagery of the cigarette in her mouth and her nudity, as well as the nudity present throughout the video, connotes a sort of sexual power and domination within the narrative. Where the topless hostage signified vulnerability, the final shot provides Rihanna with an upfront strength and badassery.
Discussion Questions:
How do you think the culture industry’s perspective on money would change if creation within the industry itself wasn’t so lucrative?
In what ways would Rihanna change the signs in her video if she wanted to portray herself as the antagonist, and the hostages as the protagonists?
“Radio Ga Ga”
youtube
Tumblr media
Queen’s music video for their 1984 hit “Radio Ga Ga” details the inner workings of the culture industry in true dystopian fashion, exaggerating the role of technology in our lives in order to reflect on how we use it. Horkheimer and Adorno believe that “culture today is infecting everything with sameness.” (7) Queen exemplifies this fear in a sequence that occurs during the first and second chorus (2:12-2:38 and 3:37-4:01), in which the four band members, dressed in red, rally a mass of people, dressed all in white. Queen extends a salute and the crowd echoes it. This sequence represents a society ruled by radio (“All we hear is/Radio ga ga”), conforming and oppressive. This is a scenario in which “ideology becomes the emphatic and systematic proclamation of what is,” (8) causing nothing new or creative to emerge from this fictional modern-day society. The video puts radio at the root of the future’s problems, but at the same time, radio is praised for having been so simple. The concept of a music video being nostalgic about a time in which only radio existed is a contradictory example of Horkheimer’s and Adorno’s belief that within the culture industry, the message of a work emerges from the same school of thought as the lens through which people receive it. (9) Music lovers will watch this video and further praise a technology that innovated music listening (this connection only strengthens as time goes on and nostalgia for the music video itself increases).
Tumblr media
The “Radio Ga Ga” music video effectively uses visual signs to convey the band’s anti-dystopian message. Hall cites Barthes’ notion that the connotation of the signifiers given to audiences is closely linked to the audience’s culture. (10) In the 1980s, music videos were incredibly popular, and they were an example of the many technological advancements brought on by the decade. The futurism of this video’s world, at the time of the video’s release, may have been met with celebration and excitement, while in the “postmodern” age we live in now, the video is understood to be a warning against technological domination. This video may now serve as an example of how the connotation of an encoded message can be changed over time. Further details within the video act as signs, outlining their process and function in digital media. As previously mentioned, the image of the band in red rallying and controlling the masses in white (2:12-2:38 and 3:37-4:01) connotes government/state control, as those colors are reminiscent of many harsh dictatorships throughout history. The use of clips from the German film Metropolis (1927) in the beginning of the video (0:00-0:31) shows a parallel between this video’s message and the fear of technological dystopia that is present in other important forms of media in other countries. As someone previously familiar with the film, I assumed what Hall defines as a “negotiated code” when I viewed the video, in that I understood what had been “dominantly defined” because the video presented situations and events which were “in dominance”, that I understood. (11)
Tumblr media Tumblr media
Discussion Questions:
What are some other ways in which technological advancement has increased the scope of the culture industry?
How do you think that the globalization of information has contributed to the reception of media through “negotiated code”? What sort of dystopia would Queen be fearing if they created this video today?
Tumblr media
1 Max Horkheimer and Theodor Adorno, “The Culture Industry: Enlightenment as Mass Deception,” in Dialectic of Enlightenment (Stanford, CA: Stanford University Press, 2002), 109
2 Horkheimer and Adorno, The Culture Industry, 109
3 Horkheimer and Adorno, The Culture Industry, 104
4 Horkheimer and Adorno, The Culture Industry, 99
5 Stuart Hall, “Encoding, Decoding,” in The Cultural Studies Reader (London: Routledge, 1993), 513
6 Hall, Encoding, Decoding, 508
7 Horkheimer and Adorno, The Culture Industry, 94
8 Horkheimer and Adorno, The Culture Industry, 118
9 Horkheimer and Adorno, The Culture Industry, 102
10 Hall, Encoding, Decoding, 513
11 Hall, Encoding, Decoding, 516
@theuncannyprofessoro
8 notes · View notes
a-ghostiee · 1 year
Text
Everything I found in the new DRDT MV
SPOILERS AFTER THE CUT PLEASE DO NOT READ IF YOU WANT TO DISCOVER THESE THINGS FOR YOURSELF
We’ll start with the footnotes. I found nearly all of them, the exception being [8] which I couldn’t find. I will provide timestamps for each, and try my best to explain what it means.
[1] (1:22) - It is talking about solving the crossword, meaning that J would go by Julia and Xander would be Alexander. It’s also saying that David isn’t in the crossword, yet Teruko is. I will go into more detail on the crossword later.
[2] (3:02) - Arabidopsis is a thale cress plant; Drosophila melanogaster is a fruit-fly; and E. coli is bacteria. I’m not quite sure how these link, though.
[3] (2:18) - Literally a quote from Title 17 of the United States Code, which covers copyright.
[4] (1:47) - This footnote is attached to a bit of text in the background that says “subtract 4, add to tetraphobia.”. Tetraphobia is the practice of avoiding the number 4, which is mentioned in the description.
[5] (3:10) - As it says in the description, this part of the song has been mistranslated several times, so there is no reliable translation for it.
[6] (2:02) - The little 6 can be found next to the hands that look like this: 🙏. I assume the previous hand gestures were referencing a specific prayer, but I’m not sure.
[7] (2:41) - I’m gonna be honest, there isn’t much to work with here. The footnote in the description isn’t much help and the little 7 appears to be attached to the word “mind”
[8] - I couldn’t find [8] in the video, however, I googled the quote from the description and it comes from Alice in Wonderland.
[9] (2:08) - Again, the footnote in the description isn’t that helpful. This time, its attached to “sing a degraded copy”. The phrase “degraded copy” is in pink, so it’s probably important (maybe), but I really don’t know.
[10] (2:01) - The bit in the description mentions that “10 in Roman Numerals is X” and footnote 10, can be found on the right of the big, pink X in the background. Maybe the footnote is hinting that only 10 people will die in DRDT, because the pink X is very similar to the dead portrait Xs.
[11] (1:32) - Now, this one is probably the most interesting, because it says that ⚪⚪⚪⚪⚪ ⚪⚪⚪⚪⚪ doesn’t exist. Given that this video centres around David, we can assume that this is most likely talking about his sister that he mentioned in Chapter 2 Episode 10, Diana. She fits the amount of letters perfectly. As for why this footnote is attached to “suspicious gaps”, I’m not quite sure. All I know about the gaps, is that there’s four of them.
[12] (2:02) - This part of the video shows that one person received 16 votes, and no-one else received any. This fits in with the description talking about majority vote (YTTD ref? /hj)
[13] (2:40) - Now this one is definitely next to something that’s been encoded. It seems like it’s been encoded in Base64, but when I put those letters and numbers into a decoder, I got utter nonsense. It could have been encoded or encrypted several times, or it could just not be Base64; I’m still trying to work that out. The symbol in the description appears to mean correct as well, which fits with the placement.
[14] (3:52) - This appears to be at the end of a long chain of numbers, split into several parts. If you look at the equals symbol in the back, you’ll find the little 14. The hint in the description says “word length of 256″, which I could easily link to ASCII-style byte encoding: stored as one byte per character, there are only 256 possible values (standard ASCII itself defines 128 of them). Again, I’ll figure out what all the numbers mean, and translate it into something readable at some point.
[15] (1:48) - Not quite sure what to say here. The 15 is attached to the word “happiness” and the description talks about “ignorance is bliss”
[16] (2:50) - This is found on several screenshots(?) of a music sheet, which is “Entry of the Gladiators” by Julius Fucik.
[17] (2:01) - Probably the first footnote I spotted, I think I noticed it during my first watch, when it premiered. The description is right though, “Democratic-ly” isn’t a word.
[18] (3:04) - This one can be found with the dandelions (weed). I don’t know what the description is talking about though, the flowers are beautifully drawn ^^
[19] (3:42) - This one’s quite interesting because it’s part of a conversation. Not quite sure what it’s about but I’m pretty sure one of the mystery people is David.
[20] (1:53) - The description mostly explains that the 5 stages of grief are kind of outdated because they can be classed as reductionist, only considering nature.
[21] (3:49) - Again, another pretty simple one that I’m not sure I need to really explain.
[22] - This one is literally on screen for barely a second right at the very end, just before the video stops.
Now, I’ll move on to discuss the YouTube comment type things that appeared on the screen at around 1:09. These are in the order of when they appeared on screen (i think lol).
1. “⚪⚪⚪⚪⚪ is like the byakuya/nagito/kokichi of the cast.” - This is probably talking about David, given the personality he showed during Chapter 2 Episode 11. Also, David has the right amount of letters.
2. “lets play spot the komaeda.” - Again, most likely about David.
3. “I like that ⚪⚪⚪⚪⚪⚪ is a protagonist who also plays the antag[onist]”
4. This one is likely about Teruko given how she’s our protagonist and can be quite antagonistic at times. (*cough* pulling a knife out on several people during Chapter 2 *cough*)
5. “mm ⚪⚪⚪⚪ anyone?” - Not sure who this is talking about because there are so many people who have a four-letter name. (Levi, Whit, Arei, Nico, Rose and Eden)
6. “⚪ and ⚪⚪⚪⚪ totally swapped places” - Ok, this is definitely about the J and Arei swapped theory.
7. “⚪⚪⚪⚪⚪ will obviously die in ch5″ - Could be talking about David again, but there’s something later in the video that might suggest otherwise, which I’ll talk about later on.
8. “I just hope ⚪⚪⚪⚪⚪⚪ doesn’t go crazy and kill in chapter 3. That would be way [too] predictable.” - Arturo 100%
9. “Everyone in the comment section is a fucking idiot” - Now this one’s kind of mean, but also very interesting. This could be David telling us we’re all wrong, or the creators.
At about 1:22, a crossword shows up briefly, as mentioned during [1]. I like crossword type things so I took some time to solve it. After this crossword shows up, there are several bits of text, which are matched with roman numerals, so I matched those phrases up to the roman numerals from the crossword.
Across:
I. Alexander
IV. Arei - “Right now, why do you cry?”
VI. Arturo - “mind exercises 1234”
VIII. Nico - “even if I try to think, idk!!! (lmao)”
IX. Levi - “look aside from that, give me the usual medicine” - I wasn’t sure what was going on here
XII. Eden - “But you’re in my way, aren’t you?”
XIII. Teruko - “or” / “To be or not to be”
XV. Whit - “Remaining ignorant, isn’t that “happiness”?”
XVI. Hu - “Go and cry.”
Down:
II. Rose - “Ego cogito ergo (terbatus) sum”
III. Charles - “If you doubt brittle things are broken.”
V. Ace - “Right now, why do you go insane?”
VII. Julia - “Do it like that, let’s live together”
X. Min - “Democratic-ly”
XI. Mai - “God is dead”
XIV. Veronika - “Things like substance of the arts”
And to finish off this post, I’ll talk about anything extra that couldn’t fit anywhere else. I’ll provide timestamps as well, lol.
(0:37) - Text says “I am a cat” before the word dog quickly covers the word cat. This could be a reference to how MonoTV looks like a cat, but insists they are a dog.
(1:00) - Text that is very briefly on screen says: “I did love you once so you should not have believed me.”
(1:04) - The person on screen looks like she could be Mai? Not really sure here, but she seems important.
(1:05) - Text at the bottom of the screen says “I’m guilty as charged. Sorry, we’re not there yet.” This could be a reference to how David says “I’m guilty as charged” in Chapter 2 Episode 11.
(1:28) - Text in red says “I hate the things that I love, and I love the things that I hate.” This could be a reference to the photos of Mai and Teruko.
(2:02) - “Voting results: Everyone will be executed. There is no such thing as “victory” in a killing game.” This does not look good for the DRDT cast.
(2:22) - This looks like Xander. It looks like it’s from before he got his eyepatch, but that could just be the angle. If it is from before, does this mean David knew Xander before the killing game but lost(?) those memories?
(2:38) - “Note to self: put something here” Maybe something will be added to the video later?
(3:00) - “portrait of someone dearly loved” with an arrow pointing to the photo of Mai Akasaki. Maybe David loved Mai before something tragic happened?
(3:00) - “portrait of someone dearly unloved” with an arrow pointing to the photo of Teruko Tawaki. I think Teruko has something to do with whatever happened to Mai and David hates her for it. I think David was the person at the very beginning of the very first episode of DRDT, he’s the only person that I think has a motive. You can also spot a fork in the background of the MV and at the beginning of the prologue.
(3:04) - There’s a QR code on the books to the right. I haven’t been able to scan it so I don’t know what it leads to yet.
(3:06) - The supporting cast list has Mai Akasaki scribbled out and what looks like “Ms Naegi”  cut off underneath it. On the right of the screen, there was faint text that says: “(i.e. these are the only characters who make an appearance.) which could mean that Arei, Hu and Ms Naegi are in the video. This is a stretch but maybe Hu killed Arei. Probably not, though.
(3:10) - It looks like Xander is the one holding the gun here.
(3:20) - No, that’s wrong!
(3:44) - The lyrics here say “I’ll disappear” and David disappears from the chair, leaving what looks like splattered blood. The words “Chapter 3″ flash on the screen, though it’s cut off. David dies in Chapter 3 maybe?
Thank you for reading this extremely long post; I’ll reblog it anytime I get more information.
20 notes · View notes
languagedaemon · 11 months
Text
Comprehensible Input: introduction
Tumblr media
In the midst of the pandemic, in 2020, I had a trial class with a new student, James. In the interview he told me that he didn’t want to study grammar, do homework, or do any exercises. Since he was a total beginner, I thought what he was asking me to do was quite difficult, and that he would make minimal progress. He gave me a link to a video explaining the method he wanted to use in our classes. A bit skeptical, I watched it. We had not seen this theory in the teacher’s training. It was my first contact with the natural approach, or comprehensible input, or input-based learning.
Before that, my teaching had been based on grammar and communication tasks. I used a textbook (Dicho y hecho, from UNAM), a grammar website (Lingolia), tried to follow a clear progression in topics, and saw in-class conversation only as practice or even a break. After watching James’ video, I understood that the natural approach had been the way I, and so many people of my generation, learned English: through movies, series, music, video games, websites, just interacting with the language, trying to decipher it, to go through it to get to the information we were interested in. I spent my life receiving messages in English without trying to produce them, but little by little my oral and written expression improved, effortlessly. The comprehensible input hypothesis, pioneered by Stephen Krashen, explained why. It was a total change of perspective.
In brief, what the input hypothesis proposes, in a microscopic vision, is that the only true moment of learning is when the student receives a message (encoded) and understands it (decodes it), that it is only through this process of reception that the structures and contents of the target language are assimilated, take shape in the student’s mind, and gradually become resources available for production. Thus, it is not advisable to study grammar or do exercises, but rather to focus on “passive” tasks such as reading and listening, trusting that speaking and writing will be the consequences of this.
Therefore, a study program based on comprehensible input would replace textbooks with novels, grammar charts with magazine articles, drills with real conversations, the need to memorize the basics with a dive into the language, jumping into the pool without knowing how to swim. It is, in a sense, a method without a method, a Zen method, learning the language by interacting with it, as if you already knew it, a kind of learning by doing, learning on the job. Of course, trying to mark a trajectory that goes from the simplest content (books for babies and children, for example) to the most complex.
In my experience, viewing language learning in this way generates a less stressful, less forceful study, and more fun and interesting. Personally, I think a little grammar can be useful at different times, but I generally subscribe to the ideas of Stephen Krashen and company. In the following weeks we will look at the basics of the input hypothesis in detail.
9 notes · View notes
adafruit · 1 year
Text
Come as we explore strange new video codecs 🔍🖖🎥
Our last few experiments with playing video+audio on the ESP32-S3 involved converting an MP4 to MJpeg + MP3, where MJpeg is just a bunch of jpegs glued together in a file, and MP3 is what you expect. This works, but we maxed out at 10 or 12 fps on a 480x480 display. You must manage two files, and the FPS must be hardcoded. With this demo https://github.com/moononournation/aviPlayer we are using avi files with Cinepak https://en.wikipedia.org/wiki/Cinepak and MP3 encoding - a big throwback to when we played quicktime clips on our Centris 650. The decoding speed is much better, and 30 FPS is easily handled, so the tearing is not as visible. The decoder keeps up with the SD card file, so you can use long files. This makes the board a good option for props and projects where you want to play ~480p video clips with near-instant startup and want to avoid the complexity of running Linux on a Raspberry Pi + monitor + audio amp. The only downside right now is the ffmpeg cinepak encoder is reaaaaallly slooooow.
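If you want to try something similar, a conversion along these lines should be in the right ballpark. This is just a sketch, not the exact settings the demo uses, so check the aviPlayer repo for the real conversion recipe:

```python
import subprocess

# Rough MP4 -> Cinepak+MP3 AVI conversion with ffmpeg. Resolution, frame
# rate, and audio settings here are guesses to match a 480x480 display at
# 30 fps -- and yes, ffmpeg's cinepak encoder really is that slow.
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-vf", "scale=480:480,fps=30",
    "-c:v", "cinepak",
    "-c:a", "libmp3lame", "-ar", "44100",
    "output.avi",
])
```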
11 notes · View notes
Text
I use she/he, call me Num.
TAGS:
#asknum ~ ask tag
#subnum ~ submission tag
#10%, #20%, etc ~ self explanatory
#ongoing poll ~ poll rbed before a poll finished
RULES:
ANY numbers in images count. If an image is too blurry to tell it will not be counted.
Words also count. (Seven, nineteenth, twenties, etc.)
Numbers outside of images only count if they are part of the text. Numbers in urls, number of notes, and timestamps of the post don't count as those can be changed.
Ongoing polls will be tagged accordingly and will NOT include remaining time, number of votes, and vote percentages, as those will change.
Any encoded numbers (binary, hexadecimal, equations, etc.) count solely as what they are in their base forms. I will not be decoding anything. (If you send me 8+4, that is 20% with an 8 and a 4. It is NOT 40% because it equals 12.)
Submissions and asks are fine. Please try to avoid sending me videos longer than 10 seconds, or posts in a language other than English. Tagging me in stuff is also fine.
EDIT: if I mess up and you tell me I messed up. And you're not the first person to tell me you have to send me $5 on venmo @/emeraldwhale
EDIT 2: terfs you have to kill yourselves now. And also send me $50
012 45 789
8/10
14 notes · View notes
indigosabyss · 5 months
Text
Dr Stone Characters Reincarnating into Naruto: Nanami Sai Edition
For all intents and purposes, Sai was one of the Academy civilian students that would fly under the radar. Reasonably good at academics, well above average at thrown weapons, average or below at everything else. Probably would be in the genin corps. Or a career chunin if he pushed it.
Until the codes were found.
It was one of the Academy teachers who had brought attention to it, after he found scraps of what he expected to be an encoded note to a classmate.
A long series of letters and numbers, startlingly good for his age. As in, no one was able to decode it.
At first, it started with the Intelligence Division considering extending an apprenticeship to the boy. But concerns quickly rose as inspection of the boy's belongings found thick volumes filled with the exact same code.
No plaintext, no references, no doodles that an ordinary child would be drawn to do. No gradual evolution of the code being built up, either. As if it had sprung up fully formed, yet had completely avoided being picked up by the Konoha Intelligence Division.
After some frantic deliberation, the boy was dragged to T&I for questioning.
The second they put the incriminating books in front of him, he started bawling.
"I JUST WANTED TO PLAY MONSTER QUEST."
After some panicked confusion, and a box of tissues (Torture and Interrogation wasn't equipped for crying children) they manage to coax some semblance of a story out of the kid.
"So you've figured out an architecture structure that will revolutionize our computational systems." Ibiki surmised, feeling a little lost, "And... you want to use that to make a 'video game'. Which you have painstakingly been coding for years now."
Sai sniffled and nodded.
"And you didn't think that this would instead be a better way to encrypt and store our information?" Ibiki asked him, feeling a little lost. The possibilities the boy had laid out in front of him were baffling. And he wanted to make games?
Sai looked at him, looking just as lost, "But that's no fun." He pointed out.
Well, at least they knew this really was a kid and not a child sized invader.
(notes under cut)
I think we can do a lot with the utter geniuses in child bodies running about who have absolutely no intention of helping the ninjas without being asked first.
In a full length version of the fic, I would give Sai a different name but to minimize confusion I stuck with this.
If you're familiar with my previous dabbling in dcst x naruto this past month ur prolly asking why I wrote this when I was clearly more favorable to Francois being the one who gets reincarnated. And the answer is. This is happening in the same timeline.
Ofc this ficlet in particular is a bit out there. I just wrote it to highlight the two important traits of Sai in the AU. 1) he still wants to be a programmer. 2) ppl think he's a spy for it.
I've built up lore for this you know. Most pre-petrification characters will be reborn years and years away from each other and there are going to be Reasons for why this is happening (chakra meteor reasons :DDDD) and I have literally so much in store for this AU.
I'll try to get it all out in writing on tumblr but in case i don't you can always ask me to clarify on twitch. Am on hiatus atm but will be back in the second week of June.
canmom · 2 years
Text
AI and Shannon information
there’s an argument I saw recently that says that, since an AI image generator like Stable Diffusion compresses a dataset of around 250TB down to just 4GB of weights, it can’t be said to be storing compressed copies of the images in the dataset. the argument goes that, with around 4 billion images in the dataset, each image only contributes around 4 bytes to the trained weights.
I think this is an interesting argument, and worth digging into more. ultimately I don’t think I agree with the conclusion, but it’s productive just in the sense of trying to understand what the hell these image generators are, and also to the understanding of artwork in general.
(of course this came up in an argument about copyright, but I’m going to cut that out for now.)
suppose I had a file that's just a long string of 010101... with 01 repeating N times in total. I could compress that to two pieces of data: a number (of log2(N) bits) that says how many times to repeat it, and an algorithm that repeats the pattern 01 that many times. this is an example of Run Length Encoding. a more sophisticated version of this idea gives the DEFLATE algorithm underlying zip and png files.
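(to make that concrete, here's a toy version of that scheme in python - the function names are mine, and real RLE/DEFLATE are of course cleverer than this:)

```python
def encode_repeats(data: str, unit: str) -> tuple[str, int]:
    """collapse a string that is just `unit` repeated over and over into (unit, count)"""
    count = len(data) // len(unit)
    assert data == unit * count, "data must be an exact repetition of unit"
    return unit, count

def decode_repeats(unit: str, count: int) -> str:
    """the 'algorithm that repeats the pattern N times'"""
    return unit * count

unit, count = encode_repeats("01" * 1_000_000, "01")
print(unit, count)                            # '01' 1000000 -- a handful of bytes
assert decode_repeats(unit, count) == "01" * 1_000_000
```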
that's lossless compression, meaning the compressed data can be decompressed to an exact copy of the original. by contrast, lossy compression exploits properties of the human visual system by discarding information that we are unlikely to notice. its output only approximates the original input, but it can achieve much greater compression ratios.
our compression algorithms are tuned to certain types of images, and if they're fed something that doesn't fit, like white noise with no repeating patterns to exploit or higher frequencies to discard, they'll end up making it larger, not smaller.
depending on the affinities of the algorithm, some things 'compress well', like a PNG image with a lot of flat colours, and some things 'compress poorly', like a film of snowflakes and confetti compressed with H.264. an animation created to be encoded with GIF, with hard pixel art edges and indexed colours, may perform poorly under an algorithm designed for live footage such as WebP.
now, thought experiment: suppose that I have a collection of books that are all almost identical except for a serial number. let's say the serial number is four bytes long, so that could be as many as 2^32=4,294,967,296 books. say the rest of the book is N bytes long. so in total a book is N+4 bytes. my 'dataset' is thus 2^32×(N+4) bytes. my compression algorithm is simple: the decoder holds the N shared bytes of book (in the spirit of LenPEG), the encoded file is just the four-byte serial number, and decompression simply appends the two.
how much data is then used to represent any given book in the algorithm? well there's 2^32 books, so if the algorithm holds N bytes of uncompressed book, we could make the same argument as 'any given image corresponds to just 4 bytes in Stable Diffusion's weights' to say that any given book is represented by just N/2^32 bytes! probably much less than a byte per book, wow! in fact 2^32 is arbitrary, we could push it as high as we like, and have the 'average amount of data per book' asymptote towards zero. obviously this would be disingenuous because the books are almost exactly the same, so in fact, once we take into account both the decoder and the encoded book, we’re storing the book in N+4 bytes.
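(the whole 'codec' fits in a few lines of python - the book text and serial number here are made up, obviously:)

```python
# the N bytes of book that every copy shares, baked into the decoder itself
BOOK_BODY = b"It was a dark and stormy night. " * 4096      # N = 131,072 bytes in this toy

def encode_book(book: bytes) -> bytes:
    """the entire encoded file is just the trailing four-byte serial number"""
    assert book[:-4] == BOOK_BODY
    return book[-4:]

def decode_book(serial: bytes) -> bytes:
    """decompression = shared body + serial, so decoder + encoded file is really N + 4 bytes"""
    return BOOK_BODY + serial

book = BOOK_BODY + (12345).to_bytes(4, "big")
assert decode_book(encode_book(book)) == book
print(len(encode_book(book)))   # 4 -- looks amazing until you count the N bytes hiding in the decoder
```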
so ultimately it's the combination of algorithm and encoded data together that gives us a decompressed file. of course, usually the decoder is a tiny part of this data that can be disregarded. for example, the ffmpeg binary weighs in at just a few megabytes. it's tiny! with it, I can supply (typically) hundreds of megabytes of compressed video data using, say, H.265, and it will generate bitmap data I can display on my screen at high definition. this is a great compression ratio compared to what is likely many terabytes if stored as uncompressed bitmaps, or hundreds of gigabytes of losslessly compressed frames. with new codecs like AV1 I could get it even smaller.
compression artefacts with algorithms like JPEG and H.264/5 are usually very noticeable - ‘deep frying’, macroblocking, banding etc. this is not true for all compression algorithms. there are algorithms that substitute ‘plausible looking’ data from elsewhere in the document. for example if you scan a text file, you can store just one picture of the letter A, and replace every A with that example. this is great as long as you only replace the letter A. there was a controversy a few years ago where Xerox scanners using the JBIG2 format were found to be substituting numbers with different numbers for the sake of compression - e.g. replacing a 6 with an 8. unlike the JPEG ‘deep frying’, this kind of information loss is unnoticeable unless you have the original to compare.
in fact, normal text encoding is an example of this method. I can generate the image of text - this post for example - on your screen by passing you what is essentially a buffer of indices into a font file. each letter is thus stored as one or two bytes. the font file might be, say, a hundred kilobytes. the decoding algorithm takes each codepoint, looks it up in the font file to fetch a vector drawing of a particular letter, rasterises it and displays the glyph on your screen. I could also take a screenshot of this text and encode it with, say, PNG. this would generate an equivalent pixel representation, but it would be a much larger file, maybe hundreds of kilobytes.
so UTF-8 encoded text and a suitable font renderer is a really great encoding scheme. it stores a single copy of the stuff that’s redundant between all sorts of images (the shapes of letter glyphs), it’s easily processed and analysed on the computer, and it has the benefit that a human can author it very easily using a keyboard - even easier than we could actually draw these letters on paper. compared to handwritten text, you lose the particular character of a person’s handwriting, but we don’t usually consider that important since the intent of text is to convey words as efficiently as possible.
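(very rough numbers, with invented glyph dimensions, just to show the scale of the gap:)

```python
text = "the quick brown fox jumps over the lazy dog " * 40    # ~1,760 characters of plain ASCII

utf8_bytes = len(text.encode("utf-8"))

# pretend we rendered it as pixels instead: each glyph in a 10x20 px cell at 1 bit per pixel
# (these numbers are purely illustrative)
bitmap_bytes = len(text) * 10 * 20 // 8

print(utf8_bytes, bitmap_bytes)   # ~1,760 bytes of text vs ~44,000 bytes of raw pixels,
                                  # before you even count the font file or any PNG overhead
```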
come back to AI image generators. most of that 4GB is encoding whatever common patterns the neural net's training process detected, redundant across thousands of very similar images. the text prompt is the part that becomes analogous to 'compressed data' in that it specifies a particular image from the class of images that the algorithm can produce. it’s only tens of bytes long and it’s even readable and authorable by humans. as an achievement in lossy image compression, even with its limitations, this is insanely impressive.
AI image generators of the ‘diffusion’ type spun out of research into image compression (starting with Ho et al.). the researchers discovered that it was possible to interpolate the ‘latent space’ produced by their compression system, and ‘decode’ new images that share many of the features of the images they were trying to compress and decompress.
ok, so, the point of all this.
Shannon information and the closely related 'entropy' are measures of how 'surprising' a new piece of data is. in image compression, they measure how much information you need to distinguish a particular piece of data from a 'default' set of assumptions. if you know something about the sort of data you expect to see, you need correspondingly less information to distinguish the particulars.
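(the standard formula is H = -Σ p(x) log₂ p(x); here's a quick empirical version over byte frequencies - my own toy code, not pulled from any actual codec:)

```python
import math
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """empirical Shannon entropy, -sum(p * log2 p), over the byte frequencies in data"""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(entropy_bits_per_byte(b"01" * 1000))            # ~1.0 bits/byte: very predictable
print(entropy_bits_per_byte(bytes(range(256)) * 8))   # 8.0 bits/byte: nothing to exploit
```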
image compression is all about trying to exploit commonalities between images to minimise the amount of ‘extra’ information you need to distinguish a specific image from the other ones you expect to be called on to decode. for example, in video encoding, it’s observed that often you see a patch of colours moving as a unit. so instead of storing the same block of pixels with a slight offset on successive frames, you can store it just once, and store a vector saying how far it’s moved and in which direction - this is a technique called block motion compensation. using this technique, you can save some data for videos that fit this pattern, since they’re not quite as surprising.
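(a toy block-matching search, roughly the idea behind those motion vectors - the block size, search window and test frames are all invented for illustration:)

```python
import numpy as np

def best_motion_vector(prev_frame, cur_frame, y, x, block=16, search=8):
    """find the (dy, dx) offset into prev_frame that best matches the block at (y, x)
    in cur_frame, by minimising the sum of absolute differences over a small window"""
    target = cur_frame[y:y + block, x:x + block].astype(int)
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + block > prev_frame.shape[0] or xx + block > prev_frame.shape[1]:
                continue
            cost = np.abs(prev_frame[yy:yy + block, xx:xx + block].astype(int) - target).sum()
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best   # the encoder stores this little vector (plus a small residual) instead of the block

# toy frames: a bright square that slides 3 px to the right between frames
prev = np.zeros((64, 64), dtype=np.uint8); prev[20:36, 20:36] = 255
cur  = np.zeros((64, 64), dtype=np.uint8); cur[20:36, 23:39] = 255
print(best_motion_vector(prev, cur, 20, 23))   # (0, -3): "copy this block from 3 px to the left"
```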
the success of AI does end up suggesting the rather startling conclusion that, with the right abstraction, there isn't a huge amount of Shannon information distinguishing any two works, at least at the level of approximation the AI can achieve. this isn't entirely surprising. in AI code generation - well, how much code is just boilerplate? in illustration, how many pictures depict very similar human faces from a similar 3/4 angle in similar lighting? the AI might theoretically store one representation of a human face, and then just enough information to distinguish how a particular face differs from the default face.
compare this with a Fourier series. you transform a periodic function, and you get first the broad gist (a repeating pattern -> a single sine wave), and then a series of corrections (sine waves at higher frequencies). if you stop after a few corrections, you get pretty close. that’s roughly how JPEG works, incidentally.
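(the 'keep the first few terms' idea in a nutshell - sketched with an FFT here rather than the 8×8-block cosine transform JPEG actually uses:)

```python
import numpy as np

# a periodic signal: a 3-cycle fundamental plus a smaller 7-cycle 'correction'
t = np.linspace(0, 1, 256, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 7 * t)

spectrum = np.fft.rfft(signal)

k = 8                                   # keep only the first k frequency bins
truncated = np.zeros_like(spectrum)
truncated[:k] = spectrum[:k]
approx = np.fft.irfft(truncated, n=len(signal))

print(np.max(np.abs(signal - approx)))  # ~1e-15: bins 3 and 7 carry essentially everything
```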
the AI's compression is very lossy; it will substitute a generic version that's only approximately the same as a particular picture, that only approximately realises some text prompt. since text prompts are quite open ended (they can only add so much Shannon information), there are a huge amount of possible valid ‘decodings’. the AI introduces some randomness in its process of ‘decoding’ and gives you a choice.
to get something more specific, you must fine-tune your prompt with more information, or especially provide the AI with existing images to tune itself to. one of the main ways you can fine-tune your prompt is by invoking the name of a specific artist. in the big block of encoded algorithm data and its ‘latent space’, the weights representing the way that that specific artist’s work differs from the ‘core model’ will be activated. how much information are you drawing on there? it’s hard to tell, and it will further depend on how much that artist is represented in the dataset.
by tuning to a specific artist, you provide the AI with a whole bunch of Shannon information, considerably more than four bytes - though maybe not that much once the encoding is complete, just enough to distinguish that artist’s work from other artists in the ‘latent space’. in this sense, training an AI on someone's work specifically, or using their name to prompt it to align with their work, is creating a derivative work. you very much could not create a ‘similar’ image without providing the original artworks as an input. (is it derivative enough to trigger copyright law? that’s about to be fought.)
I say this neutrally - art thrives on derivative work: collage, sampling in music, studies of other artists... and especially just plain old inspiration and iteration. but “it isn't derivative” or “it isn’t an encoding” isn't a good hill to die on in defence of AI. the Stable Diffusion lawsuit's description of the AI as an automated collage machine is a limited analogy, but it's at least as good as saying an AI is just like a human artist. the degree to which an AI generated image is derivative of any particular work is trickier to assess. the problem is you, as a user of the AI, don’t really know what you’re leaning on, because there’s a massive black box obscuring the whole process. your input is limited to the amount of Shannon information contained in a text prompt.
ok, that’s the interesting part over. everything about copyright and stuff I’ve said before. AI is bad news for artists trying to practice under capitalism, copyright expansion would also be bad news and likely fail to stop AI; some artists in the USA winning a class action does not do much for artists anywhere else. etc. etc. we’ll probably find some way to get the worst of both worlds. but anyway.
usafphantom2 · 10 months
Text
'First light': NASA receives laser beam message from 16 million kilometers away
By Fernando Valduga, 11/27/2023 - 08:43, in Space, Technology
An innovative experiment flying aboard NASA's Psyche mission has just reached its first major milestone by successfully carrying out the most distant demonstration of laser communications.
The technology demonstration may one day help NASA's missions investigate space more deeply and make more discoveries about the origins of the universe.
Launched in mid-October, Psyche is currently on its way to humanity's first glimpse of a metallic asteroid between the orbits of Mars and Jupiter. The probe will spend the next six years traveling about 3.6 billion kilometers to reach its namesake, located in the outer part of the main asteroid belt.
Along for the ride is the Deep Space Optical Communications technology demonstration, or DSOC, which is carrying out its own mission during the first two years of the trip.
The technology demonstration was designed to be the U.S. space agency's most distant test of high-bandwidth laser communications, sending and receiving data to and from Earth using an invisible near-infrared laser. The laser can carry data at 10 to 100 times the rate of the traditional radio-wave systems NASA uses on its other missions. If it proves fully successful in the coming years, this experiment could form the basis of the technology used to communicate with humans exploring Mars.
And DSOC recently achieved what engineers call "first light": the feat of successfully sending and receiving its first data.
For the first time, the experiment sent a laser encoded with data from far beyond the Moon. The test data were sent from almost 16 million kilometers away and arrived at the Hale Telescope at Caltech's Palomar Observatory in San Diego County, California.
The distance between DSOC and Hale was about 40 times greater than the Moon is from Earth.
[Image: the Psyche probe]
“Achieving first light is one of the many critical DSOC milestones in the coming months, paving the way for communications with higher data rates, capable of sending scientific information, high-definition images and streaming video in support of humanity's next giant leap: sending human beings to Mars,” Trudy Kortes, director of technology demonstrations at NASA's Space Technology Mission Directorate, said in a statement.
Sending lasers through space
The first light, which occurred on November 14, happened when the laser flight transceiver instrument in Psyche received a laser beacon sent from the Optical Communications Telescope Laboratory at the Table Mountain facility of NASA's Jet Propulsion Laboratory near Wrightwood, California.
The initial beacon received by Psyche's transceiver helped the instrument point its laser to send data back to the Hale Telescope, which is located about 160 kilometers south of Table Mountain.
“The test (of November 14) was the first to fully incorporate ground resources and the flight transceiver, requiring the DSOC and Psyche operations teams to work together,” Meera Srinivasan, DSOC operations leader at JPL, located in Pasadena, California, said in a statement. "It was a formidable challenge and we have much more work to do, but in a short time we were able to transmit, receive and decode some data."
This is not the first time that laser communications have been tested in space. The first bidirectional laser communication test took place in December 2021, when NASA's Laser Communications Relay Demonstration was launched and went into orbit about 22,000 miles (35,406 kilometers) from Earth.
Since then, experiments have sent optical communications from low Earth orbit to the Moon. And the Artemis II spacecraft will use laser communications to send high-definition video of a crewed trip around the Moon. But DSOC marks the first time that laser communications have been sent through deep space, which requires incredibly accurate aiming and pointing over millions of kilometers.
The initial test of the technology demonstration's capabilities will allow the team to work on refining the systems that control the laser's pointing accuracy. Once the team ticks this box, DSOC will be ready to send data down to the Hale Telescope, and receive it from Earth, as the spacecraft moves farther away.
Future challenges
Although DSOC, being an experiment, does not send scientific data collected by the Psyche spacecraft, the laser is used to send bits of test data encoded in its photons, or quantum particles of light.
Detector arrays on Earth can capture the Psyche signal and extract the data from the photons. This type of optical communication could change the way NASA sends and receives data from its deep space missions.
“Optical communication is a blessing for scientists and researchers who always want more from their space missions, and it will allow human exploration of deep space,” Dr. Jason Mitchell, director of the Advanced Communications and Navigation Technologies Division of NASA's Space Communications and Navigation program, said in a statement. "More data means more discoveries."
As Psyche continues its journey, more challenges await it.
The DSOC team will monitor how long it takes for laser messages to travel through space. During first light, the light took only about 50 seconds to travel from Psyche to Earth. At the farthest distance between the spacecraft and Earth, the laser signal is expected to take about 20 minutes to travel one way. And during that time, the spacecraft will continue to move and the Earth will rotate.
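(A quick sanity check on those figures, assuming straight-line distances and the vacuum speed of light; the second distance is only what the quoted travel time implies:)

```python
c = 299_792.458                 # speed of light in km/s

print(16e6 / c)                 # ~53 s: consistent with the roughly 50 seconds quoted for first light
print(20 * 60 * c / 1e6)        # ~360: a 20-minute one-way trip corresponds to about 360 million km
```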
Meanwhile, the Psyche spacecraft continues to prepare for its main mission, powering up propulsion systems and testing the scientific instruments it needs to study the asteroid when it arrives in July 2029. The mission should be able to determine whether the asteroid is the exposed core of an ancient planetary building block from the early solar system.
Source: CNN
paulfinn · 1 year
Text
Discussion Leader Presentation
"Encoding, Decoding" by Stuart Hall
In “Encoding/decoding” Stuart Hall details three positions that people may take upon decoding a television message. Hall argues that the negotiated position is one where the populace understands and partially agrees with the intended, dominant messages encoded, while also disagreeing with some elements of those messages. Hall describes negotiated decoding as, “Decoding within the negotiated version contains a mixture of adaptive and oppositional elements: it acknowledges the legitimacy of the hegemonic definitions to make the grand significations (abstract), while, at a more restricted, situational (situated) level, it makes its own ground rules - it operates with exceptions to the rule. It accords the privileged position to the dominant definitions of events while reserving the right to make a more negotiated application to ‘local conditions’, to its own more corporate positions.” Hall argues that this decoding position allows the audience to understand and often agree with the dominant message, given that they have set their own ground rules for coexisting with it.
Beastie Boys, Sabotage (1994)
The music video for "Sabotage" by The Beastie Boys is a humorous parody of 1970s cop television shows, featuring the band members as comically exaggerated characters in a high-energy chase scene filled with stunts and explosions. The majority of audiences loved the music video for its goofy, fun nature, while some also speculated about an underlying anti-authority message. For years the meaning of the song was left to speculation, until the band's memoir revealed that the song was rooted in their frustration with their producer, Mario Caldato Jr., who would push the band to finish tracks in an attempt to complete as many as possible.
Through its costumes and wild stunts, the music video was encoded to poke fun at and indirectly call out their producer for essentially sabotaging various tracks of theirs. Audiences, in turn, decoded the music video from a negotiated position: they resonated with the anti-authority sentiment while directing those feelings towards police and law enforcement at the time.
Rick Astley, Never Gonna Give You Up
In the "Never Gonna Give You Up" music video, Rick Astley performs in various 1980s settings where he passionately sings and dances while encoding a message of unwavering love and commitment to someone he deeply cares about by staying with them and never giving up on them. The video is also renowned for its connection to the internet meme "Rickrolling" where viewers are unknowingly redirected to the music video after clicking on a link. 
The song's transformation into an internet meme through "Rickrolling" illustrates how it was decoded from a negotiated position. Viewers began to decode the song in a humorous and subversive way, using it to playfully prank others online. This decoding led audiences to associate the song not only with a heartfelt declaration but also with internet culture's ironic and unexpected twists, all while keeping the theme of resilience alive - the internet, after all, never gave up on the prank.
Discussion Questions
What are some other pieces of media that received negotiated coding?
Did you pick up on any other messages from the Sabotage music video?
In which medium do you tend to find more negotiated coding?
2 notes · View notes
ivettel · 2 years
Text
behind the curtain ep 4
bit late with this one but i wanted to finish laundry first so here we go.. notes on this bitch
right off the bat, yes! i did rotoscoping!
what! after avoiding it since 2017!! shoutout to jennifer @antoniosvivaldi for inspiring me to do that, btw. you should absolutely check out her stuff if you haven't already--her style is so unique and refreshing!
for the most part, i think they turned out swell--after effects behaved itself for once, which like, thank fuck, because i was on a call with fio @maranello and others at 1 in the morning like "haha! i totally know what i'm doing!" *narrator voice* they did NOT know what they were doing, they were making educated guesses based on past horrible experiences (hence avoiding rotoscoping for years 💀).
but this is meant to be educational lol so! what is rotoscoping? simply put, it's a tracing technique. it has its roots way back in animation when tech was starting to pick up in like the 1920s and artists wanted a more efficient way of animating. rotoscoping is one of those tools that've been used differently from how it was originally intended, which is actually? so cool from like, a media arts study perspective?? because it's commonplace to use it for live-action film and vfx work as a way to mask scenes out and isolate them in addition to its original use of mapping things to isolated scenes. i won't bore you with the stuart hall encoding/decoding stuff, but just know that i find the development of digital art circa adobe dominance fascinating. i am using this century-old animation technique to impose my blorbo upon the eyes of thousands.
ANYWAY. i really liked this particular mask--it has a lot of movement but still manages to flow nicely?
me: [cuts off luke's arm]
fio: i think that's his arm
me: oh... my god
next up: the lightsaber
goodness. where do i start. well first of all i had a vision of something much more 2d when it comes to lightsaber anatomy, lol. but i extended my subscription for maxon and figured--why not take full advantage of this while i've still got it? so i got this 3d model of luke's lightsaber. it's untextured and unrigged and clunky but thankfully it had most of the inner parts so as far as i'm concerned i struck GOLD.
idk what i can really say in terms of like What Is 3d Modelling, because i think people have an understanding of that. so we'll go instead thru my process!
i added materials and added a null object (does that count as rigging? for something as straightforward as this?) to do a simple rotation animation on the first day...
and then i had an idea before bed to separate the parts like that one scene in the clone wars where they show how a lightsaber is assembled, except i haven't watched the scene so god knows how they animated it NKJFGNDFKJGDF. anyway the day after, this was kinda where i got:
[wip gif of the separated lightsaber parts]
keyframing on c4d is a Bitch because u can't just Access The Graph Editor you have to go through the dope sheet and change ur views and it's just. annoying!! coming from an after effects standpoint! but i can see how it's optimized for Actual Animation work so ughgh. we deal. onwards..
asked the team over at usergif and natalie @kenobiis suggested putting in a kyber crystal to fill out the middle. i ended up taking the og "laser" cylinder and modifying/animating it because uh THIS is the real inside of a lightsaber and i am not putting all that into a 3 second gif LMAO. but yeah i fine-tuned the animation and plopped it in after effects, then fiddled with video copilot's saber to make luke's blade.
u might notice the motion blur--that's re:vision's RSMB! i also added a little bit of depth of field with frischluft, but it doesn't show up well in gif form. speaking of things that don't render well:
there is A FUCK TON of aliasing going on. i couldn't make any anti-aliasing settings work for some reason so i ended up trying to smooth it out in ae.. to probably not a lot of effect. i got the very edges around it smoothed out with the classic gaussian blur and a matte choker method, but the black rings are killer. ugh. it's whatever, i figure i'll work something better out for the next time.
finished animation in c4d + the final gif:
[finished c4d animation + the final gif]
the rest
everything else is fairly basic and intuitive i think? obviously used shape layers + alpha mattes, my beloved. i fucked up a little on the text because i think i made my offset keyframes backwards somewhere in the middle of the process but at that point i was too lazy to go back in and fix it. oops!
anyway if u got this far hello thank u i hope this was informative in some way. if u have any questions don't be afraid to ask :D