#Ai scraping
Explore tagged Tumblr posts
artsietango · 1 year ago
This Google Drive AI scraping bullshit actually makes me want to cry. My entire life is packed into Google Drive. All of my writing over the years, all of my academic documents, everything.
I’m just so overwhelmed with all the shit I’m going to have to move. I’m lucky to have Scrivener, but online data storage has been super important as I’ve had so many shitty computers, and the only reason I haven’t lost work is because Google Drive has been my backup storage unit.
My partner has recommended gitlab to move my files to - it seems useful, and I can try and explain more about what it is and how it works when I get more familiar with it. I’m unsure if it’s a text editor, or can work that way. He was explaining something about the version history that I don’t quite understand right now but might later. I’m just super overwhelmed and frustrated that this is the dystopia we live in right now.
29K notes · View notes
z-mizcellaneous-z · 1 year ago
listen. all im saying is it would be iconic as fuck if the writers on strike wrote insane amounts of horrendously smutty omegaverse fan fiction so when the studios try to AI scrape they'll be fucked over into next year
5K notes · View notes
amalgamasreal · 1 year ago
SOURCE
Bit of a long video but worth a watch.
TL;DW though is that hidden in the Terms and Conditions for Google's AI Labs is a nice little poison pill that says they get access to your entire Google Drive if you opt in.
So if you're an author of some type and you keep your unpublished works in your G-Drive, that means an AI will get to scrape all of it, and by opting in you will have given them permission to do it. The content creator goes on to predict that Google is going to roll out their own streaming service where the scripts, and potentially the art if it's animated, will be almost or entirely AI generated using that scraped data as a baseline, and the authors and artists whose work was essentially stolen in its most raw form to crib from will have zero way of fighting Google on that in our current legal system.
This is of course right in the middle of the writers and actors strike where we're seeing just what lengths studios will go to in order to screw everyone but themselves.
They go on to recommend that if you keep any creative or personal works on Google Drive that you pull it off as soon as possible and delete your entire Drive. They acknowledge that of course this doesn't mean Google really deleted the data, but if you do it before they start compulsorily opting everyone in there's a chance your work might get overlooked. They also recommend several free editing programs that aren't run by corporations like Google, with LibreOffice (the default office program of most Linux distros) being named.
Finally they go over methods of shaming Google which I feel like you just have to watch for comedy's sake so I won't describe them in full.
Now this is from me: I know the majority of people don't have the ability to build and manage a big archive just for themselves, but if you're a creative NOW IS THE TIME to educate yourself on what you can do to protect your works. Cloud storage was always iffy at best, but with AI scraping entering the mix it's now downright malignant. Get a bunch of thumb drives, buy some external hard drives, if you have the money buy a pre-built NAS, and if you really want to get into it, learn how to build your own NAS. These are the old ways before cloud and they're coming back again, more important than ever.
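If you go the external drive route, here's a rough sketch of what a local backup could look like in Python. This is just one way to do it, not anything from the video: the function names `sha256_of` and `backup_folder` are made up for illustration, and it assumes you already have your files in a local folder and a drive mounted somewhere. It copies everything over and then compares checksums so you know the copies actually match.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_folder(source: Path, destination: Path) -> list[Path]:
    """Copy every file under `source` into `destination`, keeping the
    folder layout, and return the source files whose copy does not match.

    Hypothetical helper for illustration; both paths are assumptions.
    """
    mismatched = []
    # Collect the file list first so the walk is not affected by the copy.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    for src_file in files:
        dst_file = destination / src_file.relative_to(source)
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also keeps timestamps
        if sha256_of(src_file) != sha256_of(dst_file):
            mismatched.append(src_file)
    return mismatched
```

Calling `backup_folder(Path("my_writing"), Path("/mnt/backup/my_writing"))` (both paths are placeholders) gives back an empty list when every copy checks out. Doing this to more than one drive is the cheap version of not keeping everything in one place.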
2K notes · View notes
canadiancryptid · 10 months ago
Hey so just saw this on Twitter and figured there are some people who would like to know @infinitytraincrew is apparently getting deleted tonight so if you wanna archive it do it now
417 notes · View notes
fabaulti · 1 year ago
I think most of us should take the whole ai scraping situation as a sign that we should maybe stop giving google/facebook/big corps all our data and look into alternatives that actually value your privacy.
i know this is easier said than done because everybody under the sun seems to use these services, but I promise you it’s not impossible. In fact, I made a list of a few alternatives to popular apps and services, alternatives that are privacy first, open source and don’t sell your data.
right off the bat I suggest you stop using gmail. it’s trash and not secure at all. google can read your emails. in fact, google has access to all the data on your account and while what they do with it is already shady, I don’t even want to know what the whole ai situation is going to bring. a good alternative to a few google services is skiff. they provide a secure, e2ee mail service along with a workspace that can easily import google documents, a calendar and 10 gb free storage. i’ve been using it for a while and it’s great.
a good alternative to google drive is either koofr or filen. I use filen because everything you upload on there is end to end encrypted with zero knowledge. they offer 10 gb of free storage and really affordable lifetime plans.
google docs? i don’t know her. instead, try cryptpad. I don’t have the spoons to list all the great features of this service, you just have to believe me. nothing you write there will be used to train ai and you can share it just as easily. if skiff is too limited for you and you also need stuff like sheets or forms, cryptpad is here for you. the only downside i could think of is that they don’t have a mobile app, but the site works great in a browser too.
since there is no real alternative to youtube I recommend watching your little slime videos through a streaming frontend like freetube or newpipe. besides the fact that they remove ads, they also stop google from tracking what you watch. there is a bit of functionality loss with these services, but if you just want to watch videos privately they’re great.
if you’re looking for an alternative to google photos that is secure and end to end encrypted you might want to look into stingle, although in my experience filen’s photos tab works pretty well too.
oh, also, for the love of god, stop using whatsapp, facebook messenger or instagram for messaging. just stop. signal and telegram are literally here and they’re free. spread the word, educate your friends, ask them if they really want anyone to snoop around their private conversations.
regarding browsers, you know the drill. throw google chrome/edge in the trash (they're really basically spyware disguised as browsers) and download either librewolf or brave. mozilla can be a great secure option too, with a bit of tinkering.
if you wanna get a vpn (and I recommend you do) be wary that some of them are scammy. do your research, read their terms and conditions, familiarise yourself with their model. if you don’t wanna do that and are willing to trust my word, go with mullvad. they don’t keep any logs. it’s 5 euros a month with no different pricing plans or other bullshit.
lastly, whatever alternative you decide on, what matters most is that you don’t keep all your data in one place. don’t trust a service to take care of your emails, documents, photos and messages. store all these things in different, trustworthy (preferably open source) places. there is absolutely no reason google has to know everything about you.
do your own research as well, don’t just trust the first vpn service your favourite youtuber gets sponsored by. don’t trust random tech blogs to tell you what the best cloud storage service is: they get good money for advertising one or the other. compare shit on your own or ask a tech savvy friend to help you. you’ve got this.
1K notes · View notes
thenerdyindividual · 2 years ago
So with AO3 recommending locking your fics to help prevent scraping for AI use, I know a few people (myself included) who have locked down their fics. But it’s made me curious how many people are locking so…
Also reblog this and tell me in the tags why you do or don’t plan to lock your works.
For those of you that want to lock your works but don’t want to do each fic individually, here is a tutorial for how to lock all your fics at once.
1K notes · View notes
etakeh · 3 months ago
If anyone's on the (super uncool but sometimes necessary in order to get a job) website Linkedin, they have made AI data collecting opt-OUT.
So settings and privacy → data privacy → Data for Generative AI Improvement → Off
While you're there, dedicate a good 10 minutes to going through the rest of the settings. THERE ARE SO MANY.
And they're all turned on.
80 notes · View notes
probablyasocialecologist · 5 months ago
There has been a real backlash to AI companies’ mass scraping of the internet to train their tools that can be measured by the number of website owners specifically blocking AI company scraper bots, according to a new analysis by researchers at the Data Provenance Initiative, a group of academics from MIT and universities around the world. The analysis, published Friday, is called “Consent in Crisis: The Rapid Decline of the AI Data Commons,” and has found that, in the last year, “there has been a rapid crescendo of data restrictions from web sources” restricting web scraper bots (sometimes called “user agents”) from training on their websites. Specifically, about 5 percent of the 14,000 websites analyzed had modified their robots.txt file to block AI scrapers. That may not seem like a lot, but 28 percent of the “most actively maintained, critical sources,” meaning websites that are regularly updated and are not dormant, have restricted AI scraping in the last year. An analysis of these sites’ terms of service found that, in addition to robots.txt restrictions, many sites also have added AI scraping restrictions to their terms of service documents in the last year.
[...]
The study, led by Shayne Longpre of MIT and done in conjunction with a few dozen researchers at the Data Provenance Initiative, called this change an “emerging crisis” not just for commercial AI companies like OpenAI and Perplexity, but for researchers hoping to train AI for academic purposes. The New York Times said this shows that the data used to train AI is “disappearing fast.”
23 July 2024
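For anyone wondering what "blocking AI scrapers in robots.txt" looks like in practice, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the `blocked_agents` helper are made up for illustration and are not taken from the study; `GPTBot` and `CCBot` are used here only as example crawler user-agent names, and `example.com` is a placeholder. Keep in mind robots.txt is a request, so it only stops bots that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: two AI-related crawlers are refused,
# everyone else is allowed. The contents are an assumption.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return the user agents from `agents` that the rules refuse for `url`.

    Hypothetical helper; it parses the text directly, with no network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


print(blocked_agents(ROBOTS_TXT, ["GPTBot", "CCBot", "Googlebot"]))
```

With the rules above, the two named crawlers come back as blocked while a generic crawler falls through to the catch-all `*` entry and is allowed.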
85 notes · View notes
cinnamontails-ff · 7 days ago
Thank you to everyone who's reached out to me regarding this piece of AI-scraping garbage. I already reported it to AO3 and am hoping it will be removed. If there's anything else I can do, please let me know!
In the meantime, @bananaiguana is currently working on an absolutely fantastic podfic for Accountant's Guide, so if you've ever wanted an audio version -- this is where it's at. Grass-fed and 100% human-made  ❤
Also, for the record: Scarlett would never.
25 notes · View notes
vladdyissues · 3 days ago
I hated doing it, but in light of the current Speechify scandal and the likelihood of other avaricious AI "entrepreneurs" doing the same thing in the future—harvesting fanfiction off of AO3, butchering it with AI, and putting it behind a paywall—I've locked all my works indefinitely, which means only logged in AO3 users will be able to read and comment. I apologize for any inconvenience this may cause to guests, especially my non-American readers.
If you're able, please sign up for an AO3 account. It's extremely easy and well worth it. I have a few invites left over if you'd like one. DM me if you're interested.
35 notes · View notes
ahiddenpath · 7 days ago
‘Netflix of audiobooks’ scrapes thousands of fanworks off ao3 without permission. Yours, likely, included
Another day, another damned website stealing fanfiction. Check the reddit post for details, including how to request removal if your work was stolen.
God damn it. I changed the last two open fanfics of mine to private only, to hopefully make it harder for bots to scrape. It means I get fewer hits, as guests don't see my work, but there you have it.
Please support your fandom creators, this is the bs we're dealing with.
17 notes · View notes
the960writers · 10 months ago
In light of the current shit with tumblr making us opt-out of sharing our blogs with AI scrapers, I checked the state of Wordpress for this and, not surprisingly since it's the same company, you need to opt out there too.
If you have a wordpress-blog of the NAME.wordpress.com kind, you need to go into Settings and under the section Privacy, hit the checkmark for "Prevent third-party sharing for NAME.wordpress.com".
I know some of us here at writeblr have secondary blogs on wordpress, so make sure to opt-out of AI scraping there.
99 notes · View notes
llyfrenfys · 10 months ago
By the way - I've had my work scraped by AI before. I'm protected only in that the AI sucks when it comes to minoritised languages. The site where I saw my work was a scam bookstore selling a Victorian Welsh dictionary and clearly a scraping ai saw my work had the words "Welsh" and "Dictionary" in it and went ham. Resulting in a product description with bits of my work on LGBT+ terminology in it. This anecdote in itself is funny, but the practice of ai scraping is not. I'm a writer and many thousands of writers like me depend on our written output for our livelihoods/careers. Allowing ai scraping on tumblr is putting a lot of people's livelihoods at risk. I don't even earn anything from my work- but I know many others who rely on their writing to get by and I'm so worried for all of them.
I genuinely don't want to leave this site. I refuse to move anywhere else and want to make this a better place. Rather than migrate platforms every few years.
Automattic, do better. Tagging @staff to voice concerns, but do so with the caveat I know it's none of their fault. This is an Automattic issue mainly.
64 notes · View notes
canadiancryptid · 10 months ago
New privacy setting just dropped! It's turned off by default!
It's under blog settings, for each individual sideblog. Bottom of the page. Don't know if you can get to it from the app but you definitely can in desktop mode.
80 notes · View notes
writingwife-83 · 2 years ago
Might make a poll from the perspective of readers too 🤔
280 notes · View notes
scrapethiswouldya · 2 months ago
I love that one band... What's it called? Insane clown pussy? Yea that's right. hoot hoot!
17 notes · View notes