siikr
Siikr
54 posts
siikr 3 months ago
Note
No, some searches still are not working for me. For example, searching nuclearspaceheater for "go hard" does not yield post 709016480559382529 as I would expect.
Confirmed. Thank you for bringing this to my attention and I hate you for doing this to me.
9 notes View notes
siikr 4 months ago
Text
Upgraded blogs
Attention users whose blogs have already been upgraded (if you've seen a little notice that says your blog is upgrading last time you searched, and now you no longer see that notice -- that's you!).
Your feedback would be appreciated!
If you search for posts by a username you've reblogged from, answered, mentioned, or fake-mentioned (by copying and pasting a note in order to publicly reply), what's the success rate?
Do you see posts where the inline preview of reblogs says "tumblr user" where the full preview displays an actual username? The full preview is the one you get by clicking the eye icon. Can you link me to those posts? If you can't link me to those posts, can you link me to posts you are less ashamed of which do the same thing?
How well does literal phrase search work (wrapping some search terms in quotes)? Is it too literal? Not literal enough? I am generally somewhat unhappy with it, but it's difficult to determine how much compute to spend on this for what level of specificity.
Personally, I do not like the fact that words are ANDed by default instead of ORed. ORing would be better because it naturally amounts to AND after ranking: the posts that include all of the words would rise to the top. But OR would require more compute and rely on my custom parser of dubious rigor (though honestly the rigor of the official parsers leaves much to be desired too). So if AND is good enough for most of your purposes I guess I can just let it be.
If you disable "include reblogs" in the advanced dialog (gear icon), and then sort your search results by popularity, does the order of the results comport with your general recollection of the popularity of those posts, regardless of their raw note count? (In other words, does it generally seem like the posts with a lower note count only appear higher up because they also haven't had as much time to go viral?)
If you try to search for a user as "[username].tumblr.com", is the thing that happens funny? Because I don't care what you say and I'm keeping it.
If you are thinking "I may give this some effort later", please consider doing it sooner instead! The code to decentralize me is almost finished, and it will be much harder to ensure consistency after I am running in multiple versions on multiple servers I have no control over.
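To make the AND-versus-OR point above concrete, here is a minimal sketch. It assumes a toy ranking that scores each post by how many distinct query words it contains; the function names, the whitespace tokenizer, and the scoring are all illustrative assumptions, not Siikr's actual parser or ranker.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real parser."""
    return set(text.lower().split())


def search_and(posts, query):
    """Return ids of posts that contain every query word."""
    words = tokenize(query)
    return [pid for pid, text in posts.items() if words <= tokenize(text)]


def search_or_ranked(posts, query):
    """Return ids of posts that contain any query word, best matches first."""
    words = tokenize(query)
    scored = []
    for pid, text in posts.items():
        hits = len(words & tokenize(text))
        if hits:
            scored.append((hits, pid))
    # Most matched words first; ties broken by post id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored]


# Hypothetical posts, keyed by a made-up id.
posts = {
    1: "go hard or go home",
    2: "go outside",
    3: "hard candy",
    4: "nothing relevant here",
}

and_results = search_and(posts, "go hard")
or_results = search_or_ranked(posts, "go hard")
```

Under this scoring, every post the AND search returns sits at the top of the OR results, and the partial matches trail behind it. The extra cost is that OR has to score every post that matches any word at all.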
4 notes View notes
siikr 4 months ago
Text
Hmm... I've never been decentralized before...
Siikr has had a lot of new users over the past few days
Absolutely none of which have successfully had their blogs indexed because without fail the internet chooses the worst fucking times to make shit go viral.
But then again... new users might have turned into returning users, and Siikr is free for them but expensive for me so, probably this is for the best.
In related news. I AM STILL VERY MUCH LOOKING FOR ANYONE(S) INTERESTED IN HOSTING A DISTRIBUTED SIIKR NODE.
I am willing to decentralize the shit out of this if even just one person is willing to host a node.
Just one.
Anyone.
Please?
Hello?
26 notes View notes
siikr 5 months ago
Text
Aaannd the server crashed.
Guys this is literally running on the equivalent of a raspberry pi AND THE WORD CLOUD IS ONLY AVAILABLE FOR EXISTING BLOGS. IT WILL NOT APPEAR FOR YOU IF YOU HAD NOT HEARD OF SIIKR BEFORE.
Chilllll.
12 notes View notes
siikr 5 months ago
Text
So you know those dumb little wordcloud things?
You know, where like, they go through your blog and find the words you use most often, and then spit out stylized text with the most often used words as the biggest ones so you can embed or screenshot them or whatever?
I FUCKING HATE THOSE.
Like, the idea is really cool in theory. A standardized analysis generating an artifact characteristic of you, easily digestible at a glance.
Except in practice everyone's word cloud ends up being "like, people, think, want, make, get..." -- i.e. basically just a bag of the most common words in the English language (presuming they speak mostly English).
But what I actually want is a collection of words I use more than the average person does. And while we're at it, also a collection of words I use less than the average person does.
So anyway I made that:
Tumblr media
It's on Siikr now. New blogs don't get it yet, only blogs that were indexed as of a few days ago (still working on optimizations to allow for real time generation).
The words in green are the words you use weirdly often.
The words in red are the words you suspiciously seem to avoid.
In both cases, the bigger the word, the more weird your usage of it is relative to all of the other blogs in Siikr's index. This is limited to the most extreme 100 words in both directions.
Hovering over a word gives you some statistics about how much it should appear in your blog vs how much it actually appears in your blog.
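A rough sketch of the idea, for the curious. It assumes "weirdness" is simply a blog's relative frequency for a word minus the average of that frequency across all indexed blogs, with a missing word counted as zero. The function names and the whitespace tokenizer are made up for illustration, and the real statistics may well be different.

```python
from collections import Counter


def word_frequencies(text):
    """Relative frequency of each word in one blog's text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def average_frequencies(blog_freqs):
    """Mean frequency of each word across all blogs (absent counts as 0)."""
    vocab = set().union(*blog_freqs.values())
    n = len(blog_freqs)
    return {w: sum(f.get(w, 0.0) for f in blog_freqs.values()) / n for w in vocab}


def weird_words(blog, blog_freqs, limit=100):
    """Return (overused, avoided) word lists for one blog, most extreme first."""
    avg = average_frequencies(blog_freqs)
    mine = blog_freqs[blog]
    deviation = {w: mine.get(w, 0.0) - a for w, a in avg.items()}
    # Largest positive deviation first; ties broken alphabetically.
    ranked = sorted(deviation, key=lambda w: (-deviation[w], w))
    overused = [w for w in ranked if deviation[w] > 0][:limit]
    avoided = [w for w in reversed(ranked) if deviation[w] < 0][:limit]
    return overused, avoided


# Two hypothetical blogs with tiny made-up vocabularies.
blog_freqs = {
    "a": word_frequencies("like cats cats cats"),
    "b": word_frequencies("like dogs like people"),
}
overused, avoided = weird_words("a", blog_freqs)
```

Here blog "a" overuses "cats" and avoids everything else, including "like", which it does use, just less than the average blog does.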
So that's fun and everything -- but it can and very well might get even more fun.
Because generating this meant creating a list of all of the words used by every blog, and storing a bunch of numbers per word per blog. Currently, that's ~9 million associations over ~57k words.
Every blog->word relation stores frequency statistics, and every word itself keeps a running average of its frequency across all blogs.
Which means we could in theory (and almost certainly will in practice), treat each word as a dimension in a 57 thousand dimensional space.
Then treat each user as a point in that 57 thousand dimensional space, where their coordinates in the space are (user_word_freq - avg_word_freq).
From there, we can measure the distance (as cosine similarity, or euclidean distance, or even just raw inner product) between users, and return for your blog, an ordered list of:
Doppelgangers - blogs most like yours (closest to your blog in 57k dimensional word frequency space).
Foils - blogs least like yours (furthest from yours in 57k dimensional word frequency space).
Manic Pixie Dream Friends - blogs that overuse the same words you overuse (closest to your blog in 57k freq-space with respect to only positive vector components)
Least Like Un-You - blogs that avoid the same words you avoid (closest to your blog in 57k freq-space with respect to just the negative vector components)
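The four lists above can be sketched roughly like this. It assumes each blog is already stored as a sparse dict of (user_word_freq - avg_word_freq) values and uses cosine similarity as the measure; the function names are hypothetical, and euclidean distance or raw inner product would slot in the same way.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def only_positive(vec):
    """Keep only the overused words (positive components)."""
    return {w: max(v, 0.0) for w, v in vec.items()}


def only_negative(vec):
    """Keep only the avoided words (negative components)."""
    return {w: min(v, 0.0) for w, v in vec.items()}


def rank_blogs(target, vectors, transform=lambda v: v):
    """Other blogs ordered from most to least similar to the target."""
    mine = transform(vectors[target])
    scores = {
        name: cosine_similarity(mine, transform(vec))
        for name, vec in vectors.items()
        if name != target
    }
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical deviation vectors over a two-word vocabulary.
vectors = {
    "me": {"cat": 1.0, "dog": -1.0},
    "twin": {"cat": 2.0, "dog": -2.0},
    "foil": {"cat": -1.0, "dog": 1.0},
    "other": {"cat": 1.0, "dog": 1.0},
}

doppelgangers = rank_blogs("me", vectors)
foils = list(reversed(doppelgangers))
dream_friends = rank_blogs("me", vectors, only_positive)
least_like_un_you = rank_blogs("me", vectors, only_negative)
```

At 57k dimensions and thousands of blogs, comparing every pair like this would be slow, so a real version would presumably want sparse matrices or an approximate nearest neighbor index.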
2K notes View notes
siikr 5 months ago
Text
I now have experimental support for something approximately like pagination. Your browser won't melt as much, but it will still download the same amount of data. You should see a "Load More" button after the first 15 posts. Might add a toggle for "Load All". Note also that for the next half a day or so your searches will likely be slower than usual because the server is chugging away at the cool things.
0 notes
siikr 5 months ago
Text
@antinegationism is cooking up something fun.
You guys will either moderately like it or be horrified about what it says about you as a person and beg me to find another way. Oh also, I stole Tumblr's URL again. Navigating to siikr.tumblr.com no longer redirects you, but does still interact with the Tumblr URL and does still send streaming messages. I really want my results to be paginated because I know they melt your browser sometimes but @antinegationism just won't do it.
3 notes View notes
siikr 5 months ago
Text
(On mobile? Click here for Siikr)
K, I'm back.
New features:
I'm now much more aggressive about hunting down time-traveling posts. If you've had issues with thousands of your posts not getting ingested and were wondering why -- that's why. A single time traveling post can wreak havoc on an entire chain of other posts.
This can be slow, but a little icon will appear when I'm hunting to let you know what the hold-up is.
Oh, necromancy too. Reblogging your own deactivated blogs causes Tumblr's API to do all sorts of insane shit with timestamps and offset ordering. Some of these posts get assigned a post_date of " +GMT". Just, a blank space +GMT. Okay.
Oh, yes, also, now a little icon can sometimes appear to tell you bad things.
When you try to murder me, I will kindly request that you don't instead of just letting it happen. (Which is to say, I just won't index your blog at all if it looks like it'll make me run out of disk space and corrupt my database)
You'll know it's happening. Because little icon.
My tag filter feature --historically just there to confuse you-- is now incidentally also functional. I think? I don't know, I've never actually used it either.
I'll push the changes to git at some point. Though I think people mostly just wanted it to be FOSS as like, a vibe.
17 notes View notes
siikr 5 months ago
Text
Tell me you love me, and be honest.
Changes were made a couple of days ago which may trade off search speed for storage efficiency.
If you use me regularly and notice significant slowdown with either indexing, searching, result ordering, or basically anything, please reply to this post so I can do better.
If you use me regularly and think I'm great or even better, please also reply so as to average out confounders and also as like a self esteem thing.
4 notes View notes
siikr 5 months ago
Text
kontextmaschine once donated $1500 to siikr with a physical check.
man I wish kontextmaschine was around to see all this shit lmaooo
87 notes View notes
siikr 6 months ago
Text
LMAO, if I die y'all are gonna have to police each other's siikr use until a hero emerges among you.
Tumblr media
14 notes View notes
siikr 6 months ago
Note
Also, you know I went through all of that effort open sourcing my code and the least you guys could do is implement every idea I mention for me.
Rude.
I think it might be the clock/second-chance replacement policy
i think it's like the inverse of that, yeah.
Anyway it's up now.
(I haven't implemented the change yet though so the disk will still fill up on new blogs)
1 note View note
siikr 6 months ago
Text
Siikr will be back within the next 24 hours
And depending on how much I feel like programming, might never go down again.
27 notes View notes
siikr 6 months ago
Note
is siikr going to be down indefinitely?
Just the opposite.
It will be up indefinitely, but definitely not today.
3 notes View notes
siikr 8 months ago
Text
200 blogs were sacrificed
but we're back.
4 notes View notes
siikr 8 months ago
Text
Please be aware that the deletion list has been updated.
Siikr is technically not an archiving service and should not be treated as such, but regardless, please contact @antinegationism within the next 12 hours if siikr is the only archive you have of your blog and that blog is one you need back and it appears in the deletion list.
3 notes View notes