Your Muse as the Solar System.
BOLD what applies. Italicize sometimes. Repost, don’t reblog!
SUN • egotistical • melted wax wings and fingers • stretching sunburnt skin • the most generous soul • blood in the fruit • halos • anger on fire • high vitality • thunderous laughter • is pride really a sin? • halogenic aura
MERCURY • expansion of the mind • silver-tongued • an everlasting wanderer • polyglot • high dexterity • handwritten letters • innately critical • en vogue • eyes in the trees • hidden libraries • there’s always room for improvement
VENUS • in love with strangers • iridescent waters • love potions for your mirror • selfless devotion • shattering crystal • seafoam upon sand • the golden ratio • drowning in your own passion • material value & high principles • luring • plush lips
EARTH • fresh springs • tree hugger • we can start again tomorrow • a blazing rainforest • respects survival of the fittest • nature’s adversity • lazy bones • constantly evolving • flowers sprouting from wounds • a granite altar • fossilized remains
MOON • illusory • silver shimmer off the ocean • secrets and gossip • cycles of reincarnation • a crybaby • physically ethereal • shared glances with a stranger • cat eyes • mistrusting their intuition • fear is a prison • ornate magic wands
MARS • healthy competition • attraction and repulsion • magma and rubies • a blade being forged • wrath wrath wrath • malefic • intense eye contact • cannon fodder & fireworks • blood floods • copper taste on your tongue
JUPITER • red robes and a suit of armor • beacon of stability • leader by birth • thunderbolts and lightning • guilty but can’t stop • secret rich kid • golden touch golden tears • innate optimist • failure isn’t an option • constantly reaching for more • unfinished symphonies
SATURN • traditional • overbearing energy • a sculptor of reality • this existence is a karmic one • has a heart it’s just.. way down deep • law, order & justice • avoid all necessary risk • the sound of shackles clanging • sisyphus’ struggle • grappling with the reality of time • self-governing
URANUS • psychedelic funk music • overflowing cups • a rebellion with skin • looking good in photo id • oblivious but caring • middle fingers in the air • double rainbows • icy diamond exterior • holographic • afraid of their own mediocrity • pearlescent smoke
NEPTUNE • an elegy for the lost • dissolving boundaries • white horses • the burden of mystical conditions • deceptive • escapism is their reality • a polarizing entity • artists soul • paranoia • searching for the unseen • a siren’s swan song
PLUTO • angel statues over graves • power • the cycle of necrosis • transformative • unfathomable depths • an ivory tower toppling over • screaming at the sky • violets and irises • eclipsed darkness • speaks with their shadow • sex, death, rebirth
@phdfan :
Haha, yes I fell down that rabbit hole too. :D I actually think your earlier Wiki is very close to what Obsidian is. The old pen and paper version is basically just analogue hyperlinks. Glad you found the system stimulating, even if it isn't what you end up going with! :)
Very stimulating.
So very stimulating I put everything else aside for a few days just to dig in and become motivated in the much needed (personal, professional) realm of knowledge management and workflow.
May end up using all of this week just systematizing my diversity of waterfowl into a self-organizing linear arrangement.
I set aside learning & exploration time over the past few days for Obsidian and some of its competitors (decided against them all), for Zettelkasten-style note-taking workflows (and their ‘competitors’), and for the ways people have customized all of this for their own use cases and for modern times, with the goodness of Things Computers Do.
Spent time looking at images and a partial digitization of the paper-and-file zettelkasten system that Luhmann used during his academic career -- very, very fascinating -- along with some of the information-science analyses of how he handled his workflow. (Also fascinating!) High-speed watched a bunch of YouTube vids on how different people implement their slip-box-style system (or a competing equivalent) in Obsidian (or another similar tool) and jotted down or screencapped favorite ideas and formats. Skimmed through that book you mentioned in your post (and boiled it down to 1800 words of useful info ... heh, skimmed at an ever-increasing speed after the initial chapters, aahaha).
I don’t mean to knock the “How to Take Smart Notes” book. Sure, I skimmed much that was not about the core of the slip-box/zettelkasten system, but the rest of its message (via fast-skim) spoke very strongly to a problem I have encountered oh so painfully first hand.
From prior years/decades, I theoretically still have possession of countless sets of notes, readings, and lecture materials from classes I have taken and classes I have developed and taught. Many of the graduate classes and seminars I have been in were freaking fantastic (American graduate school tends to be very course-heavy in many fields, in addition to the research training that precedes actual independent work) and ... too much of that knowledge has been lost to the sands of time. I also have boxes and boxes and boxes of literature and data sets from far too many prior research projects.
My mind is full of years and decades of blurry memories of so many books and papers and print-outs with underlined passages and absolutely obscure marginalia mixed in with truly insightful comments, questions, and connections to other ideas/authors/subjects/phenomena.
AND ... ¯\_(ツ)_/¯
It’s not like I don’t have something to show for it -- my CV has heft, so some of those notes were very useful in the moment, historically -- but I have long since outstripped the capacity of my feeble memory.
Now, every time I have a BIG JUICY INTERCONNECTED THOUGHT, as my brain 🧠 jumps around waving pompoms of excitement, the rest of me just deflates and powers off because I cannot even begin to think about how difficult it will be to do the necessary (lit/data) research all over again. Please no. Just cannot.
Which is the whole point of “How to Take Smart Notes,” and it really struck me hard, especially as I looked into Luhmann’s career and the breadth of topics he covered.
It also made me think some more about why there are so few broad generalists today (who are good and groundbreaking), and why specialists just keep specializing even more narrowly, forever copypastaing pretty much the same background section with a few new references (as in: now updated for 2020!) into every single paper they write. (There are other reasons they specialize so narrowly, but that’s beyond the scope of this post.)
Being such a narrow specialist has always bored me. .a.l.w.a.y.s. It just isn’t how my brain wants to work, or enjoys working, or can sustain work over long periods of time.
.
For my first attempt at a main “slip-box” style system:
Currently programming my own Obsidian CSS stylesheet (based on another person’s modded stylesheet), then diving into templates and designing my own knowledge management workflow and a lightweight, extendable set of heuristics for naming, folder hierarchy, tag taxonomy, note types (with templates?), etc.
When I used a wiki for my academic knowledge management system, it was TOO HEAVY WEIGHT a solution for me to enjoy it and use it well. I ran MediaWiki circa the late 00s, which requires a webserver and an SQL database (think Wikipedia). I know that other, more lightweight wikis have come along since then, but at the time it was just too heavyweight and clunky.
Yes! Obsidian is very much like a lightweight wiki but it is also so much more feature rich and workspace-oriented(!!) and it appears to be easily(?!) extended through custom CSS and API plugins. Wheee! \o/
I think the two biggest wins, though, are:
1. How fast and easy it is to just create a new note on the fly, blast nonsense into it, and be done. The tool just gets out of the way. :D
2. How flexible the tool is, especially with multi-panel windowing within a workspace so I can see multiple notes (or other useful views) at the same time -- yesssss! -- plus the ability to set up each pane very differently.
My only major issue right now is the simplicity of the tag display -- sorted only by the number of times a tag is used (yuck) -- although I assume that can be fixed with a plugin.
In addition to messing around with the very beginnings of a personal slip-box system that can hopefully help me manage broad multi-disciplinary thinking and writing (nonfiction, aca, etc), using some of the initial KM/InfoSci activities mentioned above, I have also started thinking about how to use Obsidian to:
1. plan COMPLEX fiction.
2. manage image-related data to be used for thinking with images
Complex fiction:
Complex in the sense of requiring a lot of world building or requiring a lot of interweaving of timelines, etc. (might need to write a timeline plugin?). Or complex in needing a “story bible.” Or complex in the thematic sense.
Have started using Monsters as a test case because it is a more “solved” problem, in the sense that most of the novel is either in my head (bursting the limits of my brain’s capacity), in my notes (an epic mess in multiple systems: two Scrivener binders, Trello, tags on two tumblr blogs, and a playlist in iTunes), or based on existing canon (externally documented by fandom or by the creators themselves).
My “only” problem (only: a massive one) is ordering the concepts and scenes in a non-linear timeline while keeping in mind ALL OF THE INSPO (omfg so much inspo for each scene) and creating a flow that works from a dramatic standpoint.
Scrivener has not been the correct tool for this -- at least, not for me -- and I have tried. Oh, have I tried.
Scrivener is great when I am outlining, writing actual drafts, or editing those drafts. But... otherwise it becomes a garbage dump of terribly organized notes, inspo, character info, etc.
Assuming I get a handle on Monsters in Obsidian -- and the first session went well -- I will start customizing and seeing whether Obsidian can work as a “story bible” for a difficult and previously intractable fiction project that has suffered from a lack of correct tooling.
Again, more tooling in Obsidian may be needed -- plugins, CSS -- but that option IS available and that is a win.
Thinking with images:
I’ve tried a bunch of different off-the-shelf tools (some expensive, some “freemium”) for managing images or making image databases but none really worked for me. Too heavy weight. Too much interface in the way. Too much everything. And always too proprietary.
Have been poking around the edges of Obsidian to see if this is what I am looking for because, after all, it is essentially just a system for rhizomatic thinking using markdown/html hypertext.
...tbd on this, but I have faith because tags and CSS (and plugins?) are my friend.
what’s the most annoying question to ask a nun* in 1967?
tl;dr - In 1967, a very long survey was administered to nearly 140,000 American women in Catholic ministry. I wrote this script, which makes the survey data work-ready and satisfies a very silly initial inquiry: Which survey question did the sisters find most annoying?
* The study participants are never referred to as nuns, so I kind of suspect that not all sisters are nuns, but I couldn't find a definitive answer about this during a brief search. 'Nun' seemed like an efficient shorthand for purposes of an already long title, but if this is wrong please holler at me!
During my first week at Recurse I made a quick game using a new language and a new toolset. Making a game on my own had been a long-running item on my list of arbitrary-but-personally-meaningful goals, so being able to cross it off felt pretty good!
Another such goal I’ve had for a while goes something like this: “Develop the skills to be able to find a compelling data set, ask some questions, and share the results.” As such, I spent last week familiarizing myself with Python 🐍, selecting a fun dataset, prepping it for analysis, and indulging my curiosity.
the process
On recommendation from Robert Schuessler, another Recurser in my batch, I read through the first ten chapters in Python Crash Course and did the data analysis project. This section takes you through comparing time series data using weather reports for two different locations, then through plotting country populations on a world map.
During data analysis study group, Robert suggested that we find a few datasets and write scripts to get them ready to work with, as a sample starter-pack for the group. Jeremy Singer-Vine’s collection of esoteric datasets, Data Is Plural, came to mind immediately. I was super excited to finally have an excuse to pore through it and eagerly set about picking a real mixed bag of 6 different data sets.
One of those datasets was The Sister Survey, a huge, one-of-its-kind collection of data on the opinions of American Catholic sisters about religious life. When I read the first question, I was hooked.
“It seems to me that all our concepts of God and His activity are to some degree historically and culturally conditioned, and therefore we must always be open to new ways of approaching Him.”
I decided I wanted to start with this survey and spend enough time with it to answer at least one easy question. A quick skim of the Questions and Responses file showed that of the multiple choice answer options, a recurring one was: “The statement is so annoying to me that I cannot answer.”
I thought this was a pretty funny option, especially given that participants were already tolerant enough to take such an enormous survey! How many questions can one answer before any question is too annoying to answer? 🤔 I decided it’d be fairly simple to find the most annoying question, so I started there.
I discovered pretty quickly that while the survey responses are in a large yet blessedly simple csv, the file with the question and answer key is just a big ole plain text file. My solution was to regex through every line in the txt file and build out a survey_key dict that holds the question text and another dict of the set of possible answers for each question. This works pretty well, though I’ve spotted at least one instance where the txt file is inconsistently formatted and therefore breaks answer retrieval.
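For flavor, here’s a condensed sketch of what that parsing step can look like. The “Q12.”-style question lines and indented numbered answers below are invented for illustration -- the real file’s formatting (and its inconsistencies) is what the actual regexes have to contend with:

```python
import re

def build_survey_key(path):
    """Build {question_id: {"text": ..., "answers": {code: text}}} from the txt key.

    The line formats matched here are assumptions for illustration,
    not the survey file's exact layout.
    """
    question_re = re.compile(r"^Q(\d+)\.\s+(.*)")  # e.g. "Q12. It seems to me that ..."
    answer_re = re.compile(r"^\s+(\d+)\.\s+(.*)")  # e.g. "   3. The statement is so annoying ..."

    survey_key = {}
    current_q = None
    with open(path) as f:
        for line in f:
            q_match = question_re.match(line)
            if q_match:
                current_q = q_match.group(1)
                survey_key[current_q] = {"text": q_match.group(2).strip(), "answers": {}}
                continue
            a_match = answer_re.match(line)
            if a_match and current_q is not None:
                survey_key[current_q]["answers"][a_match.group(1)] = a_match.group(2).strip()
    return survey_key
```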
Next, I ran over each question in the survey, counted how many responses include the phrase “so annoying” and selected the question with the highest count of matching responses.
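In spirit, that counting step boils down to something like the sketch below. The column names and the code-to-text lookup assume the made-up key structure from my parsing sketch above, not the actual script’s internals:

```python
import pandas as pd

ANNOYED = "so annoying"

def most_annoying_question(responses: pd.DataFrame, survey_key: dict):
    """Return (question_id, count) for the question with the most 'so annoying' answers.

    Assumes response columns are named by question id and hold numeric answer codes.
    """
    counts = {}
    for question_id, info in survey_key.items():
        if question_id not in responses.columns:
            continue
        # Translate each stored answer code to its descriptive text,
        # then count how many contain the "so annoying" phrase.
        answer_texts = responses[question_id].map(
            lambda code: info["answers"].get(str(int(code)), "") if pd.notna(code) else ""
        )
        counts[question_id] = int(answer_texts.str.contains(ANNOYED).sum())
    winner = max(counts, key=counts.get)
    return winner, counts[winner]
```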
the most annoying question
Turns out it’s this one! The survey asks participants to indicate whether they agree or disagree with the following statement:
“Christian virginity goes all the way along a road on which marriage stops half way.”
3702 sisters (3%) responded that they found the statement too annoying to answer. The most popular answer was No at 56% of respondents.
I’m not really sure how to interpret this question! So far I have two running theories about the responses:
The survey participants were also confused and boy, being confused is annoying!
The sisters generally weren’t down for claiming superiority over other women on the basis of their marital-sexual status.
Both of these interpretations align suspiciously well with my own opinions on the matter, though, so, ymmv.
9x speed improvement in one lil refactor
The first time I ran a working version of the full script it took around 27 minutes.
I didn’t (still don’t) have the experience to know if this is fast or slow for the size of the dataset, but I did figure that it was worth making at least one attempt to speed up. Half an hour is a long time to wait for a punchline!
As you can see in this commit, I originally had a function called unify that rewrote the answers in the survey from the floats they’d initially been stored as into the plain text returned from the survey_key. I figured that it made sense to build a dataframe with the complete info, then perform my queries against that dataframe alone.
However, the script was spending over 80% of its time in this function, which I knew from aggressively outputting the script’s progress and timing it. I also knew that I didn’t strictly need to be doing any answer rewriting at all. So, I spent a little while refactoring find_the_most_annoying_question to use a new function, get_answer_text, which returns the descriptive answer text when passed the answer key and its question. This shaved 9 lines (roughly 12%) off my entire script.
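To give a flavor of the change, here’s a guess at the lazy-lookup shape of things, with the old rewrite-the-whole-dataframe approach left in a comment for contrast. The key layout and the function signature are assumptions carried over from my sketches above, not the script’s exact code:

```python
import pandas as pd

def get_answer_text(survey_key: dict, question_id: str, code) -> str:
    """Translate a single stored answer code into its descriptive text, on demand.

    Key layout matches my earlier sketch and is an assumption, not the script's
    actual structure.
    """
    if pd.isna(code):
        return ""
    return survey_key[question_id]["answers"].get(str(int(code)), "")

# The pre-refactor unify() step rewrote every cell up front, roughly like:
#
#     for question_id in responses.columns:
#         responses[question_id] = responses[question_id].map(
#             lambda code: get_answer_text(survey_key, question_id, code)
#         )
#
# Looking answers up only when a comparison needs them skips all of that writing.
```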
Upon running the script post-refactor, I knew right away that this approach was much, much faster - but I still wasn’t prepared when it finished after only 3 minutes! And since I knew between one and two of those minutes were spent downloading the initial csv alone, that meant I’d effectively neutralized the most egregious time hog in the script. 👍
I still don’t know exactly why this is so much more efficient. The best explanation I have right now is “welp, writing data must be much more expensive than comparing it!” Perhaps this Nand2Tetris course I’ll be starting this week will help me better articulate these sorts of things.
flourishes 💚💛💜
Working on a script that takes forever to run foments at least two desires:
to know what the script is doing Right Now
to spruce the place up a bit
I added an otherwise unnecessary index while running over all the questions in the survey so that I could use it to cycle through a small set of characters. Last week I wrote in my mini-RC blog, "Find out wtf modulo is good for." Well, well, well.
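A minimal sketch of that modulo trick, with stand-in frames and progress text rather than the script’s actual output:

```python
import sys
import time

FRAMES = ["💛", "💚", "💜"]  # any small set of characters works

def report_progress(index: int, total: int, label: str) -> None:
    # index % len(FRAMES) cycles 0, 1, 2, 0, 1, 2, ... forever
    frame = FRAMES[index % len(FRAMES)]
    sys.stdout.write(f"\r{frame} checking question {index + 1}/{total}: {label[:40]}")
    sys.stdout.flush()

if __name__ == "__main__":
    for i in range(30):
        report_progress(i, 30, "a placeholder question title")
        time.sleep(0.1)
```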
Here’s what my script looks like when it’s iterating over each question in the survey:
I justified my vanity with the (true!) fact that it is easier to work in a friendly-feeling environment.
Plus, this was good excuse to play with constructing emojis dynamically. I thought I’d find a rainbow of hearts with sequential unicode ids, but it turns out that ❤️ 💙 and 🖤 all have very different values. ¯\_(ツ)_/¯
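They really are scattered, which is easy to check:

```python
# Print each heart emoji's codepoint(s).
for heart in ["❤️", "💙", "🖤"]:
    print([hex(ord(ch)) for ch in heart])
# ['0x2764', '0xfe0f']  red heart (plus an emoji variation selector)
# ['0x1f499']           blue heart
# ['0x1f5a4']           black heart
```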
the data set
One of the central joys of working with this dataset has been having cause to learn some history that I’d otherwise never be exposed to. Here’s a rundown of some interesting things I learned:
This dataset was only made accessible in October this year. The effort to digitize and publicly release The Sister Survey was spearheaded by Helen Hockx-Yu, Notre Dame’s Program Manager for Digital Product Access and Dissemination, and Charles Lamb, a senior archivist at Notre Dame. After attending one of her forums on digital preservation, Lamb approached Hockx-Yu with a dataset he thought “would generate enormous scholarly interest but was not publicly accessible.”
Previously, the data had been stored on “21 magnetic tapes dating from 1966 to 1990” (Ibid) and an enormous amount of work went into making it usable. This included not only transferring the raw data from the tapes, but also deciphering it once it’d been translated into a digital form.
The timing of the original survey in 1967 was not arbitrary: it was a response to the Second Vatican Council (Vatican II). Vatican II was a Big Deal! Half a century later, it remains the most recent Catholic council of its magnitude. For example, before Vatican II, mass was delivered in Latin by a priest who faced away from his congregation and Catholics were forbidden from attending Protestant services or reading from a Protestant Bible. Vatican II decreed that mass should be more participatory and conducted in the vernacular, that women should be allowed into roles as “readers, lectors, and Eucharistic ministers,” and that the Jewish people should be considered as “brothers and sisters under the same God” (Ibid).
The survey’s author, Marie Augusta Neal, SND, dedicated her life of scholarship towards studying the “sources of values and attitudes towards change” (Ibid) among religious figures. A primary criticism of the survey was that Neal’s questions were leading, and in particular, leading respondents towards greater political activation. ✊
As someone with next to zero conception of religious history, working with this dataset was a way to expand my knowledge in a few directions all at once. Pretty pumped to keep developing my working-with-data skills.