#no spn
clarrisani · 7 days ago
Text
Seems to be true of 2024 as well.
AO3 Ship Stats: Year In Bad Data
You may have seen this AO3 Year In Review.
Tumblr media
It hasn’t crossed my tumblr dash but it sure is circulating on twitter with 3.5M views, 10K likes, 17K retweets and counting. Normally this would be great! I love data and charts and comparisons!
Except this data is GARBAGE and belongs in the TRASH.
I first noticed something fishy when I realized that Steve/Bucky – the 5th largest ship on AO3 by total fic count – wasn’t on this Top 100 list anywhere. I know Marvel’s popularity has fallen in recent years, but not that much. Especially considering some of the other ships that made it on the list. You mean to tell me a femslash HP ship (Mary MacDonald/Lily Potter) in which one half of the pairing was so minor I had to look up her name because she was only mentioned once in a single flashback scene beat fandom juggernaut Stucky? I call bullshit.
Now obviously jumping to conclusions based on gut instinct alone is horrible practice... but it is a good place to start. So let’s look at the actual numbers and discover why this entire dataset sits on a throne of lies.
Here are the results of filtering the Steve/Bucky tag for all works created between Jan 1, 2023 and Dec 31, 2023:
Tumblr media
Not only would that place Steve/Bucky at #23 on this list, if the other counts are correct (hint: they're not), it’s also well above the 1,520-new-work cutoff of the #100 spot. So how the fuck is it not on the list? Let’s check out the author’s FAQ to see if there’s some important factor we’re missing.
The first thing you’ll probably notice in the FAQ is that the data is being scraped from publicly available works. That means anything privated and only accessible to logged-in users isn’t counted. This is Sin #1. Already the data is inaccurate because we’re not actually counting all of the published fics, but the bots needed to do data collection on this scale can't easily scrape privated fics so I kinda get it. We’ll roll with this for now and see if it at least makes the numbers make more sense:
Tumblr media
Nope. Logging out only reduced the total by a couple hundred. Even if one were to choose the most restrictive possible definition of "new works" and filter out all crossovers and incomplete fics, Steve/Bucky would still have a yearly total of 2,305. Yet the list claims their total is somewhere below the 1,520 cutoff? What the fuck is going on here?
Let’s look at another ship for comparison. This time one that’s very recent and popular enough to make it on the list, so we have an actual reference value to compare against: Nick/Charlie (Heartstopper). According to the list, this ship sits at #34 this year with a total of 2,630 new works. But what does AO3 say?
Tumblr media
Off by a hundred or so but the values are much closer at least!
If we dig further into the FAQ though we discover Sin #2 (and the most egregious): the counting method. The yearly fic counts are NOT determined by filtering for a certain time period, they’re determined by simply taking a snapshot of the total number of fics in a ship tag at the end of the year and subtracting the previous end-of-year total. For example, if you check a ship tag on Jan 1, 2023 and it has 10,000 fics and check it again on Jan 1, 2024 and it now has 12,000 fics, the difference (2,000) would be the number of "new works" on this chart.
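To make the two counting approaches concrete, here's a minimal sketch. The function names and the sample publish dates are made up for illustration; this is not the chart author's actual code, just the arithmetic as the FAQ describes it next to a plain date filter.

```python
from datetime import date


def net_gain(previous_total: int, current_total: int) -> int:
    """Snapshot subtraction: this year's tag total minus last year's tag total."""
    return current_total - previous_total


def count_by_date(publish_dates, start: date, end: date) -> int:
    """Date filter: count works whose publish date falls inside [start, end]."""
    return sum(start <= d <= end for d in publish_dates)


# Hypothetical tag: 10,000 works published before 2023, 2,000 published during 2023.
publish_dates = [date(2022, 6, 1)] * 10_000 + [date(2023, 6, 1)] * 2_000

snapshot_result = net_gain(10_000, len(publish_dates))
filter_result = count_by_date(publish_dates, date(2023, 1, 1), date(2023, 12, 31))

print(snapshot_result, filter_result)
```

In this toy case both methods land on the same number, but only because every one of the 10,000 older works is still in the tag when the second snapshot is taken.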
At first glance this subtraction method might seem like a perfectly valid way to count fics, and it’s certainly the easiest way, but it can and did have major consequences to the point of making the entire dataset functionally meaningless. Why? If any older works are deleted or privated, every single one of those will be subtracted from the current year fic count. And to make the problem even worse, beginning at the end of last year there was a big scare about AI scraping fics from AO3, which caused hundreds, if not thousands, of users to lock down their fics or delete them.
The magnitude of this fuck up may not be immediately obvious so let’s look at an example to see how this works in practice.
Say we have two ships. Ship A is more than a decade old with a large fanbase. Ship B is only a couple years old but gaining traction. On Jan 1, 2023, Ship A had a catalog of 50,000 fics and Ship B had 5,000. Both ships have 3,000 new works published in 2023. However, 4% of the older works in each fandom were either privated or deleted during that same time (this percentage was just chosen to make the math easy but it’s close to reality).
Ship A: 50,000 x 4% = 2,000 removed works
Ship B: 5,000 x 4% = 200 removed works
Ship A: 3,000 - 2,000 = 1,000 "new" works
Ship B: 3,000 - 200 = 2,800 "new" works
This gives Ship A a net gain of 1,000 and Ship B a net gain of 2,800 despite both fandoms producing the exact same number of new works that year. And neither one of these reported counts is the actual new works count (3,000). THIS explains the drastic difference in ranking between a ship like Steve/Bucky and Nick/Charlie.
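If you want to poke at the Ship A / Ship B numbers yourself, here's a small sketch of the same arithmetic. The function name is hypothetical and the flat 4% removal rate is the same simplifying assumption used in the example, not a measured value.

```python
def reported_net_gain(catalog_size: int, new_works: int, removal_rate: float) -> int:
    """Net gain a snapshot-subtraction chart would report for one ship.

    Assumes removals (deleted or privated fics) hit only the pre-existing catalog.
    """
    removed = round(catalog_size * removal_rate)
    return new_works - removed


NEW_WORKS = 3_000      # actual new works published by each ship in the year
REMOVAL_RATE = 0.04    # assumed share of older works deleted or privated

catalogs = {"Ship A": 50_000, "Ship B": 5_000}
results = {
    name: reported_net_gain(size, NEW_WORKS, REMOVAL_RATE)
    for name, size in catalogs.items()
}

for name, gain in results.items():
    print(f"{name}: reported {gain}, actual {NEW_WORKS}")
# Ship A: reported 1000, actual 3000
# Ship B: reported 2800, actual 3000
```

Bump the catalog size or the removal rate and the reported number for the older ship drops further, while the actual new works count never changes.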
How is this a useful measure of anything? You can't draw any conclusions about the current size and popularity of a fandom based on this data.
With this system, not only is the reported "new works" count incorrect, the older, larger fandom will always be punished and its count disproportionately reduced simply for the sin of being an older, larger fandom. This example doesn’t even take into account that people are going to be way more likely to delete an old fic they're no longer proud of in a fandom they no longer care about than a fic that was just written, so the deletion percentage for the older fandom should theoretically be even larger in comparison.
And if that wasn't bad enough, the author of this "study" KNEW the data was tainted and chose to present it as meaningful anyway. You will only find this if you click through to the FAQ and read about the author’s methodology, something 99.99% of people will NOT do (and even those who do may not understand the true significance of this problem):
Tumblr media Tumblr media
The author may try to argue their post states that the tags “which had the greatest gain in total public fanworks” are shown on the chart, which makes it not a lie, but an error on the viewer’s part in not interpreting their data correctly. This is bullshit. Their chart CLEARLY titles the fic count column “New Works” which it explicitly is NOT, by their own admission! It should be titled “Net Gain in Works” or something similar.
Even if it were correctly titled though, the general public would not understand the difference, would interpret the numbers as new works anyway (because net gain is functionally meaningless as we've just discovered), and would base conclusions on their incorrect assumptions. There’s no getting around that… other than doing the counts correctly in the first place. This would be a much larger task but I strongly believe you shouldn’t take on a project like this if you can’t do it right.
To sum up, just because someone put a lot of work into gathering data and making a nice color-coded chart doesn’t mean the data is GOOD or VALUABLE.
4K notes · View notes
pansexual-lilychen · 2 months ago
Text
Tumblr media
91K notes · View notes
rabid-transcendentalist · 2 months ago
Text
Tumblr media
74K notes · View notes
strawlessandbraless · 3 months ago
Text
Tumblr media
What an unsurprising & completely expected turn of events that literally everyone saw coming 😮
Source 🔗
Free 🔗
60K notes · View notes
lady-raziel · 2 months ago
Text
Tumblr media
I just wanna know if love wins before America loses
52K notes · View notes
destiel-news-channel · 8 months ago
Text
Tumblr media
[Image ID: The Destiel confession meme edited so that Dean answers 'There's a petition to ban conversion therapy in the EU' to Cas' 'I love you'. /End ID]
If you are a citizen in the EU please sign this petition:
87K notes · View notes
deansbisexualflannel · 2 months ago
Text
Tumblr media
31K notes · View notes
lolstargirl · 2 months ago
Text
Tumblr media
30K notes · View notes
angel-fruitcake · 2 months ago
Text
Tumblr media
29K notes · View notes
demonicseries · 2 months ago
Text
imagine it. The night is November 5th, 2024. The election results are in. Misha Collins posts a video. The camera is facing him, as he says “I love you.” Then it pans to the other person in the room, Jensen Ackles. He responds with “Kamala Harris is the next president of the United States”
25K notes · View notes
darthlexapro · 2 months ago
Text
it only just now dawned on me that I’ll very likely learn who won the election from those Supernatural homosexuals. this is how we live now.
Tumblr media
31K notes · View notes
zener · 2 months ago
Text
YOUR gays went to superhell. MY gays went to superheaven. we are not the same.
20K notes · View notes
jackalspine · 7 months ago
Text
@schnuffel-danny hehehe
Tumblr media Tumblr media
regarding this post: from schnuffle
57K notes · View notes
Text
Pretty and spooky
Tumblr media Tumblr media Tumblr media Tumblr media
6K notes · View notes
eldritchsquared · 26 days ago
Text
a customer today looked at me, said “y’know? i think you’ll appreciate this,” and pulled his shirt down to show me his supernatural tattoo. calling me a slur would’ve been easier
13K notes · View notes