#mirko's introduction post
WELCOME TO MY LITTLE CORNER ON THE INTERNET, VISITOR!
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
Hello! My nameās Mirkoslavec, but you can refer to me as Mirko, Mirk or a secret third option (or a completely different name, if weāre close). I'm 19 years old and my birthday is on June 8th!
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
Ever since I was young, Iāve been into various things, such as drawing, writing, creating various headcanons, reading, sleeping (this one is obvious) and, obviously, wanting to be original. I started both drawing and writing when I was a little kid, and though most of my stories were either lost to time or deleted because I was embarrassed and decided to erase them from existence, I still improved over time, and the same goes for my drawings. Unlike the stories, though, most of my old drawings are still saved on my pen drive. That said, most of them are from 2019, so only about five years ago. If I ever decide to post some of my old art here, it would be as redraws of those drawings, so I wouldnāt really feel ashamed of myself.
Iām really into various topics as well, mainly the darker ones: death, loss of loved ones, addictions, dark thoughts, dealing with various mental problems, coming to terms with the past of loved ones that you recently learnt about in the most unexpected way, torture, politics (as in learning why things are one way and not another) and history (because I just love learning about the past, especially the Middle Ages and the last century). However, donāt expect me to get political - I just want to get away from the real world and enjoy every free moment I have, and I also know how heated people can get over that topic.
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
The mascot of this blog is a character called āMiroslav Bochtoā, who seems at first to be just a normal anthropomorphised food object (there is an object version of him), but actually isnāt an object at all. His reference, in his āusedā form, is below:
Miroslav will be used mainly in āpersonalā art, such as art where I celebrate various holidays (vacations, Christmas, Halloween) and art for other things, such as rants, reviews and posts in general.
Of course, there is more than one āmainā OC, but Miroslav is the primary one, as he was created specifically to represent me.
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
I currently know three languages - Polish (my native language), English (pretty good) and German (basic knowledge) - and I wish I could learn Russian, Greek, Spanish, Portuguese and Icelandic.
Another thing I want to learn soon is how to code and create 3D models. I have my reasons. With coding, I could actually create a game of my own - either a fangame for one of the fandoms I belong to, or an original game. Iām aware that I wouldnāt only have to learn how to code, but also how to improve the game, how to create events that would get players hooked, how to keep the game from being unfairly hard or unfairly easy, and so on. With modelling, itās a mix between āI want to create gamesā and āI want to create cool stuff that isnāt only 2Dā, so I might focus at first on creating 3D models purely for fun, and not for any games, unlessā¦ something changes.
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
Iām in many fandoms, so you can expect to see drawings, headcanons, opinions etc. related not just to one fandom, but to many.
I belong to:
Object Show Community (BFDI, II, LOTS, BOTO, plan to watch more Object Shows)
Five Nights at Freddyās (and fangames, such as Five Nights at Candyās, Playtime with Percy, Those Nights at Rachelās, Dayshift at Freddyās, The Joy of Creation, Popgoes, etc.)
Warrior Cats,
Cartoon communities (The Simpsons, Gravity Falls),
The Sims,
Omori,
Skyrim,
Terraria,
Stardew Valley,
OCs (as in the original universe I have for my OCs),
Undertale (and its AUs),
Doki Doki Literature Club,
Anthropomorphised animals,
Brawl Stars
And many more that I donāt really remember right now - if I remember some, Iāll either add them here or just upload a drawing from a fandom I didnāt list!
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
The tags I plan to use are divided into four categories, because I want to make the blog easy to navigate, not only for me but for anyone reading it.
The first category is āPersonal tagsā, which are tags supposed to indicate what the post is supposed to be about.
#mirko draws - a post featuring a drawing.
#mirko redraws - same as above, but for redrawing either an old piece of mine, a scene from a game/movie/comic, or something else
#mirkoās opinions - used when I share an opinion on various things
#mirko rants - as the name says, used when I rant about something
#mirko rambles - used when I ramble about something, such as a newly released episode, news about various games, etc.
#mirko reviews - indicates that the post is a review of something (a game, an episode of a TV series, or a webseries)
#mirko reblogs - as the name suggests, used when I reblog stuff from other people.
#mirko's answers - tag used for asks
#mirko vents - used only for venting (either through art or textā¦ Or both)
#mirkoās special announcements - only used when I want to announce something special
#mirkoās introductory post - only used for the post youāre reading right now
#mirkoās headcanons - used on a post featuring my headcanons for various characters
#mirkoās comics - a comic created by me.
#mirko updates stories - used when a story, either on AO3 or Wattpad, gets updated
#shitposting time - a tag for the various shitposts that will occasionally be posted, related to various events.
#mirkoās designs - designs created by me
The second category is āFandom-relatedā tags, used only for projects related to fandoms Iām part of.
#the rising moon universe - a tag used for all posts related to a BFDI AU, which asks āwhat if BFDI characters were anthropomorphised, and there was more to it?ā
#goikian stories - a tag used for all posts about an AU based on beta BFDI content (such as the Firey comic series, Total Firey Island, Total Firey Island Points, etc.)
#along came a bubble - a tag used for all posts about a BFDI AU in which Bubble snapped at everyone who mistreated her.
#among the clouds - a tag used for all posts about a BFDI AU in which TV needs to deal with a heavy loss of his own.
#battle cats - a tag used for all posts about a BFDI x Warrior Cats AU.
#paltronicsā experiments - a tag used for all posts about a PWP AU in which Paltronics decided to use technology to create an updated cast of the original characters, adding more to it.
#percyās afterparty - an AU in which, many years after the incident that changed the poodleās life completely, Percy is forced to confront the forgotten past
#the playhouse of damned ones - an AU in which the playhouse was abandoned for a reason unknown to the public, and Nick decides to see what happened in there
#our playtime - some sort of OMORI x PWP x BFDI (woah) AU
The third category is āOCā-related.
#miroslav and friends - a series of stories/comics/drawings etc. featuring my persona, Miroslav, and his gang in various situations
#lights in darkness - a series of stories/comics/drawings etc. featuring my original characters (still related to object shows) living on an island and having lots of adventures.
#the forest seven - OCs that belong to the Forest in Goikian Stories
#the forest guards - a group of OCs (both objects and cats) that have been guarding the Evil Forest for many years
#wolkrows - a group of demonic creatures that have two forms - one āhiddenā, and the real, blob-like form
The fourth category is āthings I did for peopleā-related.
#mirkoās art for people - used if the art was requested by someone
#mirkoās paid work - used if the art was bought by someone
#mirkoās gift - used if the art was a gift for someone
#mirkoās part of art-trade - used if the art was part of an art trade with someone
#mirkoās part of collab - used for my part of a collab
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
Naturally, the tags in the fourth category wonāt be used very often, for various reasons - school, being busy with life, or anything else.
Requests - CLOSED
C0mmissi0ns - CLOSED
Art-Trades - CLOSED
Collabs - CLOSED
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
You know, I wish my blog could be kid-friendly, howeverā¦ me being me makes that impossible, so I require everyone visiting my blog to be at least 15 years old! Under 15, do not interact!
Another thing I want to say is that on my blog you can find themes such as:
Gore,
Death,
Torture,
Addictions,
Nightmares,
Dealing with mental problems,
Memory loss,
Paranormal activities,
Repressed memories,
Mysterious pasts,
Breakups/bad romances,
Loss of friends/family etc.,
Kidnappings,
Body Horror,
Disturbing lore,
Suggestive themes,
So, if youāre uncomfortable with any of the topics above, donāt follow me, and donāt try to force me to stop talking about or making stories based on those topics just because you dislike them!
All of those topics will have warnings.
You can expect some ālighterā stuff here too, because even people like me need some fluff, right?
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
ā ą§ā¤ā šØšššš ā ą§ā¤ā šØšššš
My other social media accounts:
Gamejolt - Mirkoslavec
Discord - mirkoslavec
Toyhouse - Mirkoslavec
Deviantart - Mirkoslavec
Cara - soon
Hope you enjoy your stay here!
#mirko's introduction post#digital artist#artists on tumblr#object show community#object shows#miroslav bochto#osc community
ā INTRODUCTION ā
MICAH or VINCENTćā ā±ā ā he it
eighteenā ā āā ā german systemā įŖ
dni: basic dni criteria, xeno/neo antis, anti furry/therians etc, pro/com/darkshippers, maps/zoophiles etc, non traumagenic systems (includes supporters/people neutral about this), trump supporters, in general fascists, people who romanticize disorders, people who support/are transracial, terfs, people who think he/him and transmasc lesbians aren't a thing (disclaimer: transMASC people are NOT necessarily men. i do not believe men can be lesbians), transmeds, anti-recovery, people who are strictly against final fusion OR functional plurality (both are valid choices), fakeclaimers obv., ableists, zionists
NOTE: this list is not limited to these things, there might be things i forgot here and i block freely. if anything on this dni list makes you mad then leave, don't waste my time with arguments.
byi: please use tone tags for things like sarcasm/jokes/anything where the tone might be hard to read - this is to prevent misunderstandings. i will NOT disclose details about anything regarding my mental issues/disabilities/other private things, and this includes talk about the system; i will not answer questions about it. you can DM me, but do not feel offended if i don't respond or don't want to talk to you. this blog will contain nsfw sometimes - i will put warnings, and if i see any minors interact with those posts you're blocked. don't dm me if you're younger than 16 or older than 30; you can still interact, but don't dm me or ask for my other socials
more about me!!
heyo, you can call me micah or vincent; rotten dolls or doll is okay too. i'm 18 and from germany, so english is not my first language. i'm fine with whatever pronouns apart from she/her or shx/hxr - my favorites are he/it, but ask for more that i like. i will not be super active here, since i have a low social battery and work most days apart from weekends. you can ask for my discord or dm me here if you want, but don't feel offended if i don't answer. also, as this post mentions, i have DID, so some alters will use 'we' and 'us'. however, there will be little to no system talk - don't ask about it unless i've stated it's okay. i'm not interested in system discourse on here either, so please respect this. i also LOVE making rentries; if you want to see any of the ones i've made, feel free to ask in DMs, since i don't really want to post them here
current main interests: phasmophobia, insym, marvel (especially x-men, winter soldier & spider-man), jjk, bnha/mha, bsd, star wars, hannibal, banana fish, sherlock, arcane, hello charlotte
likes/interests: space, horror games, will wood, rock music (eg. skillet, bmth, slipknot, ghost, linkin park, powerwolf...), history, politics, physics, informatics, books/reading, mitski, jazmin bean, babybugs
other fandoms/interests: doctor who, attack on titan, good omens, fnaf, markiplier, fantastic beasts, genshin impact
fav characters from some media:
marvel
- iron man / tony stark
- peter parker / spider-man (all actors)
- charles xavier / professor x (movies)
- natasha romanoff / black widow
- loki
- bucky barnes & winter soldier
- clint barton / hawkeye
- sam wilson / falcon / captain america
- logan howlett / wolverine
bungo stray dogs
- oda sakunosuke
- chuuya nakahara
- nikolai gogol
- lucy montgomery
- ryunosuke akutagawa
- atsushi nakajima
- akiko yosano
- edgar allan poe
my hero academia
- tomura shigaraki
- dabi
- hawks
- himiko toga
- spinner
- mirko
- ochako
- aizawa
- shoto todoroki
jujutsu kaisen
- shoko irei
- gojo satoru
- toji fushiguro
- yuji itadori
- miwa
- megumi fushiguro
- maki zenin
- uraume
arcane
- viktor
- ekko
- jinx
- sevika
- mel
- vander
hello charlotte
- charles eyler
- charlotte wiltshire
- bennett
- vincent
- frei
favorite ships from some media:
marvel
- cherik
- pepperony
- stucky
- winterfalcon
- lokius
- peter and mj (tom holland)
bungo stray dogs
- shin soukoku
- soukoku
- kunizai
- sigzai
- fyozai
- ranpoe
- fyolai
- doa trio
- perfect crime trio
- odango
- rimlaine
my hero academia
- togachako
- spinneraki
- todobakudeku
- momojirou
- erasermic
jjk
- satosugu
- itafushi
- sukume
arcane
- jayvik
- caitvi
- timebomb
other fandoms/media
- asheiji
- johnlock
- hannigram
- aziracrow
- harry x ron x hermione (this is underrated imo)
- sherliam (moriarty the patriot)
- thoschei
- wriollette
- kazuscara
- jeanlisa
- cynari
- kaebedo
and more!!
NOTE: If these ships make you angry, just block me. I don't care what you ship as long as it's legal and not super weird, respect my opinions i'll respect yours too. Don't be weird!
#intro post#blog intro#introduction#introductory post#bsd#bungou stray dogs#bnha#mha#boku no hero academia#my hero academia#marvel#x men#spiderman#jjk#jujutsu kaisen#hannibal#bbc sherlock#good omens#star wars#genshin impact#did system#traumagenic system#endos dni#endos do not interact#endos fuck off#doctor who#aot#attack on titan#Spotify
Before I start the Liyue arc in the Sagau headcanonsā¦
Here are some stories I want to post on Wattpad, so choose for me which to post first from these introductions :]] Will do a poll after
āMommy Issuesā
You sought to kill your mother's former protƩgƩ when he returned as a teacher at your old high school, ignorant of the people who'd be affected by your petty revenge
X fem reader
Warnings:
child-neglect;
intense mentions of gore;
sadistic mannerisms;
toxic parent reader;
Very slow redemption;
Love interests;
All Might
Endeavour
Mirko
Hawks
Fat Gum
Midnight
Eraserhead
Present Mic
Vlad King
Overhaul
Oc
+
"voleuse de coeurs"
Settling down in your homeland was never the life you wished to live, so you go on a trip all over Teyvat! Learning about the mysteries of each nation, escaping each and every crime you have committed, and on the plus side! You may have stolen a couple of hearts along the way, and a couple of archons' gnoses while you're at it! wait-
X fem reader
Love interests;
Aether
Lumine
Kaeya
Diluc
Childe
Rosaria
Eula
Amber
Kazuha
Ayaka
Thoma
Xiao
Razor
Bennett
Fischl
Sucrose
Albedo
Ayato
Chongyun
Ganyu
Scaramouche
Keqing
Xingqiu
Sara
Itto
Dainsleif
Oc
+
"Ang š®šššĆ±ššš"
In which a once-sealed goddess now roams the world of Teyvat with two companions whom she tried to kill upon meeting them.
X fem reader
Love interests ;
Venti
Zhongli
Baal
Dainsleif
Xiao
Ganyu
Kaeya
Aether
Lumine
Tartaglia
Scaramouche
Diluc
Baizhu
Oc
+
#genshin x reader#god reader#lumine#bnha#bnha x reader#fem reader#aether#fatui x reader#wattpad#fanfic#all might x reader#kaeyagenshinimpact#genshin childe#albedo#zhongli#zhongli x reader#venti x reader#dainsleif x reader#dainslief#venti genshin x reader#aizawa shÅta#mirko#boku no hero academia hawks#Mha#aether x reader#oc x reader#baizhu#fischl#present mic#mha midnight
My Hero Academia Pride Headcanons (Pro Hero Edition)
If you'd like to see my headcanons for the 1-A kids plus Shinsou, it's linked here!!!
This is all about the Pro Heroes!!
Except not all the Pro Heroes, because that would be ridiculously long, and there are a lot of Pros I don't care about or just plain don't know what their sexuality would be.
You think I know the romantic trysts of a washing machine??
You think I know who Best Jeanist likes?? Mans too busy dying and resurrecting over and over to go on any dates.
So, this list will talk about only the Pros I'm interested in covering.
Disclaimer: These are my own thoughts and opinions. This is how Iām choosing to engage with this media in this post. There are other ways to engage with a chosen media and neither way of engagement invalidates the other. Art is subjective. Fandom is ultimately for fun! Donāt take me too seriously!
Now that the long introduction is out of the way:
Yagi Toshinori | All Might (he/him): Bisexual
Not only did All Might have an American Romanceā¢ļø with David Shield, not only did he have an on again/off again with Sir Nighteye, but I'm here to convince you that he is married to his Good Friendā¢ļø Detective Tsukauchi Naomasa.
Go on, tell me I'm wrong. Why does this random police officer know the OFA secret when not even David Shield does, huh??? Tell me. You can't. They're married.
Oh and some people ship All Might with Midoriya's mom and that's valid, I guess. You do your thing, people!
I just don't think All Might is straight. I gave him the bi label arbitrarily. He could feasibly be pan or gay or something else. That's the fun thing about headcanons: you can make it the Fuck up!
(but I do like to be realistic within a certain parameter to how the character is written and I just think All Might and the Detective are sus)
Todoroki Enji | Endeavor (he/him): Straight
Some people write Endeavor as homophobic, and while I don't think that's an out-there interpretation, I think it's more nuanced.
I don't think he's homophobic in the outright hate sense. I think he's obsessed with power and lineage, and that manifests in his control of Shouto specifically.
So, if Fuyumi brought home a girlfriend or Natsuo brought home a boyfriend, it wouldn't be a big deal.
But Shouto bringing a boyfriend would be an issue because Endeavor wants the Ultimate Hero. To do what he couldn't do. That would be hard if his masterpiece brought a boy home.
This is beyond the scope of this discussion, but I do think Endeavor is allowed to atone for his mistakes and abuse. I don't believe he's afforded forgiveness from anyone he's hurt, and he absolutely deserves consequences, but Enji himself is allowed to do better and want to do better.
Where it's relevant to this discussion is I believe he'd drop that mindset against Shouto once he realized it was an overreach of his power and hurting his son. He'd eventually get to a point where he's not as controlling and not as much of an asshole.
Takami Keigo | Hawks (he/him): Bisexual
He's got the charisma! The flamboyance! The Trauma! He's Pro Hero Hawks! All the women and men swoon for him and he's got his pick of the market!
That is, if he actually had time. No, he's too busy making sure his city's safe. Too busy being held in a cage by a combination of Hero Commission conditioning and his own ideals.
Oh and let's not forget that he's spending all his off-hours at the behest of his side-job as a LOV spy.
It's a good thing one of their members is hot. Literally and figuratively. At least it gives him something to look at while he's trying to keep himself together, and oh no, he's caught feelings-
Usagiyama Rumi | Mirko (she/her): Sapphic
She's a lesbian. Most of the time. She likes girls. Loves them. Would date them.
So why is she sleeping with wanted criminal Shigaraki Tomura-
ijessbest on insta made me like this cursed ship so much!! To me, at least, it's a purely physical thing for both parties.
I think she's romantically and physically attracted to women, but also physically attracted to men?? That's my idea, anyway.
Aizawa Shouta | Eraserhead (he/him): Gay
Aizawa would be happy with just his cats and coffee, really, but he just had to like men. And specifically the loudest blond on the planet.
Aizawa absolutely had a crush on Oboro in high school. It's not clear what would've happened if he'd survived. Maybe he would have ended up with Oboro. Maybe they'd have split up. Maybe he was always meant to end up with Hizashi. Aizawa doesn't dwell on it. What happened happened and he wouldn't trade what he has now with Hizashi for anything.
Yamada Hizashi | Present Mic (he/him): Bisexual
On the opposite side of things, Hizashi laid eyes on the transfer student from the support course and was immediately smitten. He was not subtle about it either. He made Shouta a playlist, a playlist, and Shouta said "thanks, but I don't use Spotify" and Hizashi cried for weeks. Nemuri won't let him live it down even to this day.
Years later, when Shouta asked him out, he nearly busted both their eardrums from the shock of it. He really thought all this time it had been unrequited, and he wasn't about to even attempt to bring up the idea of a relationship after Oboro.
Now they're happily married and Hizashi makes them do cute couple-y things together all the time. But not in front of the kids.
They're actually pretty subtle about it. Not even Midoriya picked up on it, and he picks up on everything. He only realized after Aizawa adopted Shinsou and noticed Hizashi also treating Shinsou like his own.
Kayama Nemuri | Midnight (she/her): Bisexual
Mostly men-attracted, but that doesn't make her any less bi. I also like the idea of her quirk working best on people who are attracted to her, not necessarily guys. I personally ship her with Ms. Joke actually.
Fukukado Emi | Ms. Joke (she/her): Lesbian
Emi (a lesbian) continually asks Aizawa (a gay man) to marry her because she finds it very funny! She likes annoying the shit out of Aizawa. But she likes women. When Emi asked Mic for his friend's number, Mic was about to have a fit before Emi explained it was Midnight's number she wanted. Mic quickly got with the program and (not-so-subtly) encouraged their relationship along.
Nishiya Shinji | Kamui Woods (he/him): Pansexual
As my Pan sister says: Everyone's eligible but none have applied. (or so he thinks)
I don't think he's had much luck with dating, despite being a pro. Is it because of his costume? Will he ever find love? He just hopes it's someone he likes and not someone egotistical like Mt. Lady. Could you imagine him dating her-
Tatsuma Ryuuko | Ryukyu (she/her): Lesbian
Lesbian Dragon Lady!!! Why, you may ask? Vibes. Would marry the Lesbian Dragon Lady.
Sakamata Kugo | Gang Orca (he/him): Aro/Ace
Despite his harsh exterior, he's really a people person, but not a relationship person. He'd love to make a queer-platonic connection and maybe raise a kid one day.
Takeyama Yu | Mt. Lady (she/her): Pansexual
Her ass is out for all genders. You know, like, "Nice to meet your assquaintance"?? No?? (I'm not tryna sexualize her I promise)
I think she's much more worried about ratings and numbers than a relationship. And if she did decide to start dating, she'd have her pick of the pool. She might like people regardless of gender, but she also has standards. She'd never settle for someone like Kamui Woods. Could you imagine her dating him-
Toyomitsu Taishiro | Fat Gum (he/him): Pansexual
He just loves who he loves. Comfortable in his sexuality and his body image. Truly someone to look up to.
Chatora Yawara | Tiger (he/him): Trans
Actually Canon!!! Tiger is a trans man!! It's so nice to see him open and visible! Even if it's just a small blurb. He's treated with respect by the narrative and I appreciate that!
That's it for the Pros!! Next time I'll tackle characters I've missed including some from the LOV and some students in other classes!
Until Next Time!!!
#bnha#pro heroes#all might#erasermic#aizawa shÅta#presentation michael#ms. joke#kamui woods#mt lady#lgbt pride#pride month#lgbtq#bisexual#pansexual#gay#lesbian#trans#aromantic#hawks#mha mirko#endeavor#dabihawks
How I would have fixed the pace of BNHA
Okay, one of my main criticisms of BNHA is the way the pacing seems all over the place, with characters being sidelined, then thrown into focus, then sidelined again, and with the story now rushing to the ending. So: this is how I would have fixed it, distributing the current plot over two years of school time, with the final war happening at the beginning of the third year, just in time for the students to go back to normalcy.
YEAR ONE
Year one's pace is pretty much normal for a school story. Everything can easily be kept the same, with the end of year one being the Overhaul arc.
The only thing I would add is a smaller arc focusing on the hero world and some more classes - in particular, a class about heroes in the rest of the world, so we can learn how big of a deal Star and Stripe is, plus a small villain arc about the discrimination against heteromorphic quirks, maybe focusing on Shoji. This could be added as a build-up to the Stain arc, with villains feeling emboldened by Stain's presence; a fringe of them could be anti-heteromorphic quirks, or anti-quirk in general.
Pepper in some information about how companies work with agencies, throw in Redestro's company, and everything would work the same.
Year one would end with the Overhaul arc (post-license) and Deku vs Kacchan 2 (plus La Brava and Gentle Criminal), so year two would start with their new status quo.
YEAR TWO
Because of the licensing exam and the way the kids have been thrown against actual villains like Overhaul, the focus can be given to the Hero Public Safety Commission, with the introduction of Hawks and Mirko, plus the whole situation of Endeavor becoming number one.
Add something about the impossibility of organizing a tournament for safety reasons, with class 2-A vs class 2-B as an alternative, plus Shinsou's exams.
At the same time, Redestro can keep working as a villain, and Hawks can do his thing as a spy. This would let us learn more about the Hero Public Safety Commission and give a spot to Mirko. Maybe they want to hire her too, but she is distrustful. This would introduce us to heroes who went rogue against the commission (e.g. Nagant) and would explain why Mirko doesn't want to have interns.
The rest of the second year can be divided into more classes, and subplots/arcs can be added. For example, a festival-like subplot where we see Mirio adapting to being quirkless (we could get some insight into Deku and Aoyama here) and Eri coming to the realization that she doesn't want to be scared anymore and actually wants to help him recover his quirk. This could happen in parallel with the internships, so we could have the whole Endeavor dinner and internship (and Hawks' investigation), Aizawa visiting the prison and reflecting on his past while training Shinsou and Eri, and actualā¦ other internships.
They could face some smugglers (working for Redestro) as a major villain arc (with Deku unlocking another quirk, plus an actual focus on training to catch a Kacchan), and the second year would end with the League conquering the Liberation Front and the start of the war.
This means the second year would end with the start of the war, Bakugou sacrificing himself and Deku leaving. Because Eri and Mirio were given focus, his sudden quirk recovery would also not come out of nowhere.
YEAR THREE
Deku's vigilante arc can be post-year-two, basically during the break. So we could see the students actually reacting to his letter, the parents moving in, and more about what the new world is like.
Maybe even the fall of the Commission, with their dirty secrets coming out thanks to prisoners escaping. This would make Nagant a major plotline instead of a one-chapter fridged woman. Mirko could even be thrown in here, and we could see her starting to want to work with the kids and being vindicated in her distrust of the commission.
The rest could literally be the same, with Deku coming back, plus a moment of downtime before the war - for example, something about Mirio, Nejire and Tamaki reassuring Eri, etc. During this downtime, where everyone is resting and recuperating, we could start to see that Aoyama does not seem that happy, so the sudden revelation would not come out of nowhere.
I think I am okay with no more downtime from here on, as it looks like they all have plans for the villains, but because of the surprise elements these are better revealed through flashbacks, as Hori probably will do.
The end of the war, whatever it is, could mean the beginning of normalcy again and their third year of school.
As year one is particularly dense, it would also be easy to move the license exam and Overhaul directly into year two.
First patrol (Hawks x reader)
So I got a little carried away writing the beginning of this one, but I just REALLY love Mirko. I wasnāt sure what to use as the readerās quirk, so I just went with the ability to create telekinetic force fields of energy in different shapes and shit. Also, (h/n) will mean your hero name. Once I finished writing this, I decided it was a little long, so I split it into two parts. I guess this first part can be considered a various x reader lol. Iāll post part 2 soon! Iām having a lot of fun with these, so please donāt be shy to send requests or asks! Thanks :)
----------------------------------------------------------------------------------------
āTHATāS NOT FUCKING FAIR!ā Bakugou screeched in the common room.
āI literally donāt know what you want me to say.ā You stared blankly at Bakugou as he practically foamed at the mouth. His hands began to emit smoke.
āKacchan, calm down! (Y/n), Iām happy you got such a great opportunity!ā Izuku tried to congratulate you while holding a death grip on Bakugouās arm. āYou and Mirko will make a great duo!ā You smiled at his reassurance and braced yourself for his detailed mutterings about the specifics of both your and your future mentorās quirks.
āThank you. Iām excited but nervous.ā You shifted in your seat while your hands were in tight fists. āIām excited to prove myself.ā
āIāve met Mirko before. Sheāll enjoy working with you, Iām sure of it.ā Todoroki spoke for the first time all evening from the dinner table as he slurped cold soba. You honestly had forgotten he was there.
āOh yeah! Your father and Mirko team up sometimes, right?ā Izuku mentioned as he turned on the couch to face Todoroki.
āYes.ā Todoroki took a slurp of soba before continuing. āIf you run into him, be wary. Heās more concerned about his reputation than a rookie looking for guidance or protection. Thatās why Hawks does his own thing most of the time. My dad canāt be bothered with anyone else.ā
āIām sure (y/n) will be in good hands with Mirko.ā Izuku tried to ease the tension in the room. As Todoroki is a man of few words, itās rare for him to share things like this. You decided you should head to bed to prepare for your long day tomorrow.
āAlright guys. Thanks for chatting with me. Iām off to bed.ā After replies of good night and wishes of luck, you tried to sleep off the anxiety until tomorrow.
_____________________________________________________________
āReady to rumble, (y/n)?!ā Mirko enthusiastically greeted you when you entered her office.
āYes Maāam! Thank you for letting me join you today!ā You bowed to Mirko and straightened up as you heard her walking toward you.
āNo need to be so formal!ā Mirko gave you a big slap on the back as she passed you. With your back aching and stinging, you closely followed her to the elevator. āI donāt take just anyone out on patrol with me, (Y/n). You got something special, kid.ā She gave you a large smile as the elevator door closed. You were thrilled to finally start your internship, with your idol no less.
āThank you, Rumi. It means a lot coming from you.ā You tried to calm the reddening of your face as you two descended to the lobby of her agency.
āDonāt sweat it! And remember that on the street Iām Mirko. Right, (h/n)?ā Mirko smiled at you as the elevator rang.
_____________________________________________________
After a few hours of patrol, you and Mirko still hadnāt had any calls or serious confrontations. Although paparazzi and other media outlets seemed to follow you both everywhere, they were only taking pictures from a distance so as not to interfere. āSorry that this is such a quiet day. I wanted to see you in action!ā Mirko chatted with you as you two walked.
āNo, itās alright. Something is bound to come up anyway, right?ā You smiled and continued to survey your surroundings. A teenage boy ran up.
āYouāre Mirko, right?ā His face was a deep red.
āThe one and only! Want a picture or something?ā Mirko smiled at the boy. His head whipped around before his eyes frantically landed on you.
āHi. Can you take our picture, please?ā You held up the fanās phone to take a picture with Mirko.
ā1,2,3, smile!ā You continued taking a few pictures until Mirko put her hand up to her ear intercom. You handed the phone back to the guy and awaited news. Mirko nodded at you after coming off the intercom.
āLetās go. No time to waste.ā Mirko turned serious as she dashed off to the lower part of town. You used your quirk to manifest a board to ride on in order to keep up.
As unfamiliar buildings flew past, you couldnāt recall seeing the surrounding landmarks on the sheet of information Mirko gave you about your sector.
āMirko, are we close?ā You grew anxious and unsure as you approached the scene.
āYeah,ā Mirko grinned as she gained momentum by swinging off a lamppost. āStay sharp. This is uncharted territory for you.ā You nodded and picked up speed, feeling the wind press against you.
Finally, you saw the scene you were summoned to. A monstrous villain was holding a car with a family trapped inside above his head. You didnāt recognize the villain; he was most likely an angry civilian who had snapped. He towered about thirty feet above you, angrily screaming, the veins on his neck and arms bulging and strained. It was obvious this guy had never used his quirk like this before.
āYou think he used an enhancer?ā You kept your eyes glued to the car the villain gripped.
āProbably.ā Mirkoās smirk wavered and her brows furrowed. āBunch of bastards have been juicing up and wrecking shit recently.ā The villain began to shake the car and screech in anger.
āIāll get the car, you get the guy?ā You asked Mirko as your eyes focused in on the car and you activated your quirk.
āRead my mind. Just give me a boost.ā Mirko smirked and slid a foot back in preparation to jump. āLetās go.ā Mirko leapt sideways, causing the villain to whip his head in her direction. You raised your left hand and manifested a platform under the car. Your right arm shot out as you made a small platform about seven feet in the air for Mirko to vault off of. Your eyes remained on the car as you heard Mirkoās feet pound on the platform, and you saw a swift white streak knock the villain from under the vehicle. While Mirko repeatedly kicked the villain into submission, you lowered the car with the clamoring family to the ground. You ran to the car and escorted each member to the side, where a small crowd had gathered. You turned to see Mirko with the villain in a suffocating leg triangle. The villainās screeching quieted and his body began to lose muscle and shrink.
āMirko, should we take him in for questioning?ā You pulled handcuffs out of your pocket and placed them into Mirkoās outstretched hand.
A gust of wind passed behind you making the hairs on the back of your neck stand up. You also felt an intense warmth behind you. āWe can take him off your hands. Youāre in our jurisdiction after all.ā You spun around to see Endeavor and Hawks. Your hands clenched and your chest tightened at the sight of the two top heroes.
āNumber 1 and 2, always a pleasure.ā Mirko hauled the villain to his feet. āSo what if weāre in your jurisdiction? You guys didnāt get here fast enough. Thatās why we were called.ā Mirko smirked.
āMirko,ā Endeavor began to speak, āwe were being briefed on an important future mission. Our delay was expected, so they called you and uhh... Shoutoās classmate.ā
āāShoutoās classmateā is not the name of my intern, Endeavor.ā Mirko put a hand on her hip and raised a brow at the number one hero. Her ears perked up at the arrival of an idea. āHow about this: we walk this jerk to the precinct and do introductions over some lunch?ā
āAs long as the place has chicken.ā Hawks smiled at Mirko. You wanted to admire his handsome features, but decided against it out of fear of embarrassment if he caught you. āEndeavor treats, since heās the reason weāre late!ā Endeavor crossed his arms and sighed, then began walking in the direction of the precinct. Mirko and Hawks shared a laugh, and Mirko began hauling the villain along behind Endeavor. You paused before following, your eyes still trained on Endeavor. You wondered if he would have cooperated at all if you had fought alongside him instead of Mirko. Hell, he hadnāt even bothered to learn your name, despite you being friends with Shouto for the past year.
āSo whatās your deal, kid?ā Hawks was suddenly walking by your side. You tensed at his sudden presence and looked ahead towards Mirko.
āMy deal?ā You glanced at him to see if his eyes were still on you, eyes briefly meeting before your head turned.
āYeah. Does Endeavor spook you or something?ā
āNo.ā You could feel your face getting warm. āHeās just intimidating, I guess. And hearing what Shouto has to say about him doesnāt really help.ā You didnāt like being questioned like this.
āI get that. Heās a shitty dad.ā Hawks stretched as you two walked. āHeās also a pretty difficult guy to get to know. Heās starting to change for the better though. But his social skills are still shit.ā Hawks looked over at you to make sure his remark made you smile. He knew if he kept talking youād loosen up and get more comfortable. āHowās your first patrol going?ā
You glanced at him and smiled. āI canāt complain about lunch with the top two heroes.ā Hawks laughed.
āYeah, I guess. Iād say youāre doing pretty well for your first time. Mirko doesnāt team up with just anyone, you know.ā Your face got even warmer as you became flustered once again.
āIām mainly only good for defense and rescue.ā You looked away from Hawks and started to fidget with your hands.
āDonāt be modest, kid. I saw you rescue that family back there.ā Your face was on fire upon hearing his praise. āAlso saw you kick ass at the sports festival. If it were up to me, Iād have you do more offense training.ā
āT-thanks,ā you said shyly as you scanned your surroundings for something to distract you from your own embarrassment. Things remained pretty quiet as you continued to walk to the precinct.
#bnha x reader#bnha imagines#various x reader bnha#hawks x you#hawks x reader#mirko x reader#mha x reader#mha oneshot#mha imagines#my heo academia#boku no hero fanfic#boku no hero academia#keigo takami x reader#keigo takami
ćLe Introductionć
Hello all! Ā°Ėā§ā(ā°āæā°)āā§ĖĀ° I used tumblr back in the day but forgot about it for years, then remembered it existed again. Slowly getting back into the swing of things on top of learning all the new updates that have been added! Here are some fun facts as you explore the dumpster fire that is my tumblr page:
mostly based around sharing/posting content related to JJBA
but I share/post other content such as Inuyasha, MHA, HxH, and others from time to time
totally a dog irl don't worry about it
fav character(s) of all time 5ever: dio (jjba), doppio/diavolo (jjba), jotaro (jjba), mikitaka (jjba), sesshomaru (inuyasha), koga (inuyasha), sango (inuyasha), hisoka (HxH), chrollo (HxH), mirko (mha), jin (yyh), koto (yyh), rengoku (demon slayer), and melon (beastars)
if you happen to not like any of the characters I listed above for what I post that is totally okay but please be respectful
here mostly to have fun and share common interests w/ the wonderful folks on this website
my fav foods of all time are peanut butter & raspberry jelly sammiches
Thank you for reading! Please enjoy (ā”į“„ā”) ā”
BNHA Hamilton!AU(?) part 1
However! I will flesh out the idea around it, since I have a few ideas and maybe someone actually wants to write this ^^
Genre-wise I think it would be angst (obviously) and fluff in between, but also drama, ācus yeah, Hawks going to the LOV
Since the reader is Mirkoās sibling, I guess they would have a quirk similar to hers - rabbit-related, I mean. Or, if the reader is an adoptive sibling or something, they could have something completely different.
The chapters would probably be loosely based on separate songs, but not all of them, since idk what you could write for some of them.
Alexander Hamilton/My Shot:
Basically a recap of Hawksā backstory: how his childhood went and his path to becoming a hero. The chapter would end with him being a well-liked hero, maybe before he became number 2. Also includes the plan for him to become a spy in the LOV and him accepting the mission.
The Schuyler Sisters:
Introduction to Mirko and the reader and their sibling bond. Maybe they hang out together - maybe Mirko has a day off or something and they have a nice day in the city, going shopping, going to an arcade, or some other nonsense. I imagine the reader being a bit introverted, or at least less extroverted than Mirko.
Right Hand Man:
Going into a little detail about the relationship between Hawks and Endeavor. Maybe the events of chapters 186 to 192.
A Winterās Ball/Helpless:
Possibly the announcement of the Japanese Hero Billboard Charts, where the reader is dragged along by Mirko (not really - they also want to support their sister, but nobody needs to know that).
Aaron Burr, Sir:
Hawks meeting up with Dabi to get into the LOV as a spy, and a description of the things he has to do to be seen as worthy of the league.
The Story of Tonight (+ Reprise):
Hawks being accepted into the inner circle of the LOV and getting to know them a little better. Dabi (maybe?) & Hawks talk alone at some point, where Dabi (again, maybe?) tells Hawks that he knows about the reader, and that if Hawks were ever to betray them, he would make them (the reader) hurt. He (Dabi(?)) will search for them, will find them and will kill them. Thatās where the angst comes in, if that wasnāt clear ^^āā
- - - - - - - - - - - - - - - - - - - - - - - - -
More ideas coming later (and this post probably being edited too)
Also, yes the songs are supposed to be out of order
#mha#My hero academia#my hero academia ff#bnha#Boku no hero academia#Hawks x Y/N#Hawks#Hawks x Reader#mirko#Endeavor#dabi#hamilton#mha spoilers(?)#bnha spoilers(?)
Boku No Hero Academia characters on TikTok
Class 1-A:
Izuku: wholesome pep talks
Kirishima: memes, loves the POV videos, heās always more confident after watching them
Bakugo: refuses to partake in any trend and leaves mean comments on any trend videos
Jiro: posts videos of her music
Denki: memes with all of his friends, leaves nice comments on Jiroās videos
Momo: all the DIY stuff
Todoroki: doesnāt have the app, but Midoriya always makes him watch TikToks of the heroes and their friends
Aoyama: the classic lip sync, but heās surprisingly good at it, will also take TikToks of others if they ask him, he always gets the best angles
Fumikage: takes videos of himself sitting on a chair with death metal blasting in the background
Tsuyu: loves playing with the cute filters
Mineta: is blocked from the fucking app
Shoji: Tsuyu teaches him about filters and he loves the triple screen
Ochaco: does collabs with her friends
Iida: will spend hours on the app to report people who post āinappropriateā videos
Mina: meme squad with Denki and Kiri, she always comes up with the funniest ideas
Sero: duet king
Ojiro: does videos of him holding things with his tail
Hagakure: fashion account
Sato: food videos
Koda: animal videos
Other students:
Shinsou: refuses to duet with Denki, but secretly watches all of his videos
Monoma: leaves hate comments on class 1-Aās videos and starts a fight with Bakugo and Kendo in the comments every time
Tetsutetsu: sports challenges, like the 100 push-ups etc.
Kendo: is only on the app to keep Monoma in check
Ibara: her praying with a filter and choir music playing in the background
Mirio: random videos of Tamaki and Nejire, random videos of Sir Nighteye, does challenges and duets with his friends and Bubble Girl, does sports challenges
Tamaki: only there for support, doesnāt take videos himself, but likes watching other videos
Nejire: fashion account, leaves positive comments everywhere
Mei: promoting her babies
Teachers:
All Might: doesnāt get the app at first; after an introduction from Mic he does pep talks
Aizawa: doesnāt have the app but makes involuntary guest appearances in Shinsouās, Micās and class 1-Aās videos
Mic: YELLING
Midnight: 18+ stickers everywhere, and all she does is hold her whip and wink at you (Denki once accidentally liked one of her videos)
Nezu: shares his philosophical thoughts
Vlad: takes videos of his class like a proud dad
Heroes:
Endeavour: doesnāt have the app, but gets memed a lot by Hawks
Hawks: memes Endeavor a lot and occasionally makes slow-motion videos of his quirk
Ms Joke: tells jokes obv
Best Jeanist: fashion account obv (yes, he does collabs and duets with Nejire and Hagakure)
Mirko: sports challenges and feminism
Mount Lady: sexy sexy videos, gimme likes
Fatgum: comments on Kiriās and Mirioās videos (only the ones with Tamaki in them) like a proud dad. Kirishima sometimes does funny and wholesome videos with him and Tamaki at the internship
Villains:
Shigaraki: dissolves things and people comment āsatisfyingā
Dabi: only on there for the cosplayers and DMs
Toga: is one of the cosplayers, fake blood warning but itās not fake
Magne: fashion account, duets with Toga and Twice
Spinner: Stain cosplayer
Twice: does all kinds of stuff, the trends, fucks around with filters etc.
Giran: promotes his weapons etc.
Mr Compress: posts random videos of the league
Overhaul: roasts peopleās dirty rooms
Hi, welcome to the end of this post, hereās my TikTok: @lmorgan_cosplay, so if you like cringe, cursed videos and bad cosplay, ya know the drill <3
#boku no hero academia#boku no hero academia imagine#boku no hero academia headcanons#boku no hero academia hcs#bnha#bnha imagine#bnha headcanons#bnha hcs#my hero academia#my hero academia imagine#my hero academia headcanons#my hero academia hcs#mha#mha imagine#mha headcanons#mha hcs
Every year the third Thursday in November marks World Philosophy Day, UNESCO's collaborative initiative towards building āinclusive societies, tolerance and peaceā. To celebrate, weāve curated a reading list of books and online resources on social and political philosophy, ranging from authority and democracy to human rights, as well as historical texts by philosophers who shaped the modern world. Browse the entries and start reading today.
Celebrate philosophy and explore our collection for more blog posts, articles and reading suggestions. Follow us @OUPPhilosophy on Twitter.
Knowledge and Truth in Plato: Stepping Past the Shadow of Socrates, by Catherine Rowett
Philosophy, Rhetoric, and Thomas Hobbes, by Timothy Raylor
John Locke: Literary and Historical Writings, by J.R. Milton
Adam Smith: A Very Short Introduction, by Christopher J. Berry
Thomas Paine: Britain, America, and France in the Age of Enlightenment and Revolution, by J. C. D. Clark
The Social and Political Philosophy of Mary Wollstonecraft, by Sandrine Berges and Alan M. S. J. Coffee
Differences: Rereading Beauvoir and Irigaray, edited by Emily Anne Parker and Anne van Leeuwen
Thinking the Impossible: French Philosophy Since 1960, by Gary Gutting
āConfucian Political Philosophyā by George Klosko, in The Oxford Handbook of the History of Political Philosophy
āAdam Smithās Libertarian Paternalismā by James R. Otteson, in The Oxford Handbook of Freedom
āSophists, Epicureans, and Stoicsā by Mirko Canevaro and Benjamin Gray, in The Hellenistic Reception of Classical Athenian Democracy and Political Thought, from Oxford Scholarship Online
āIn Defense of Uncivil Disobedienceā by Candice Delmas, in A Duty to Resist: When Disobedience Should Be Uncivil, from Oxford Scholarship Online
āA chastened individualism? Existentialism and social thoughtā in Existentialism: A Very Short Introduction by Thomas Flynn, from Very Short Introductions
āLooking At Rightsā in Human Rights: A Very Short Introduction by Andrew Clapham, from Very Short Introductions
āWhy do we need political philosophy?ā by David Miller, in Political Philosophy: A Very Short Introduction, from Very Short Introductions
#World Philosophy Day#philosophy#bookblr#book list#UNESCO#Plato#Adam Smith#John Locke#Human Rights#politics#holiday
Using Python to recover SEO site traffic (Part three)
When you incorporate machine learning techniques to speed up SEO recovery, the results can be amazing.
This is the third and last installment in our series on using Python to speed up SEO traffic recovery. In part one, I explained how our unique approach, which we call āwinners vs losersā, helps us quickly narrow down the pages losing traffic to find the main reason for the drop. In part two, we improved on our initial approach to manually group pages using regular expressions, which is very useful when you have sites with thousands or millions of pages, which is typically the case with ecommerce sites. In part three, we will learn something really exciting: we will learn to automatically group pages using machine learning.
As mentioned before, you can find the code used in part one, two and three in this Google Colab notebook.
Letās get started.
URL matching vs content matching
When we grouped pages manually in part two, we benefited from the fact that the URL groups had clear patterns (collections, products, and the others), but it is often the case that there are no patterns in the URL. For example, Yahoo Storesā sites use a flat URL structure with no directory paths. Our manual approach wouldnāt work in this case.
Fortunately, it is possible to group pages by their contents because most page templates have different content structures. They serve different user needs, so that needs to be the case.
How can we organize pages by their content? We can use DOM element selectors for this. We will specifically use XPaths.
For example, I can use the presence of a big product image to know the page is a product detail page. I can grab the product image address in the document (its XPath) by right-clicking on it in Chrome and choosing āInspect,ā then right-clicking to copy the XPath.
We can identify other page groups by finding page elements that are unique to them. However, note that while this would allow us to group Yahoo Store-type sites, it would still be a manual process to create the groups.
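As an illustration, here is a minimal sketch of XPath-based grouping in Python; the selector is a made-up example, and in practice you would paste the XPath copied from Chrome's Inspect panel:

```python
# Minimal sketch of XPath-based page grouping (hypothetical selector).
import requests
from lxml import html

# Hypothetical XPath for a large product image; copy the real one from Chrome.
PRODUCT_IMAGE_XPATH = "//div[@id='product-main']//img"

def classify_page(url):
    """Label a page as 'product' if the product-image element is present."""
    response = requests.get(url)
    tree = html.fromstring(response.content)
    return "product" if tree.xpath(PRODUCT_IMAGE_XPATH) else "other"
```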
A scientistās bottom-up approach
In order to group pages automatically, we need to use a statistical approach. In other words, we need to find patterns in the data that we can use to cluster similar pages together because they share similar statistics. This is a perfect problem for machine learning algorithms.
BloomReach, a digital experience platform vendor, shared their machine learning solution to this problem. To summarize it, they first manually selected cleaned features from the HTML tags, like class IDs, CSS style sheet names, and so on. Then they automatically grouped pages based on the presence and variability of these features. In their tests, they achieved around 90% accuracy, which is pretty good.
When you give problems like this to scientists and engineers with no domain expertise, they will generally come up with complicated, bottom-up solutions. The scientist will say, āHere is the data I have, let me try different computer science ideas I know until I find a good solution.ā
One of the reasons I advocate practitioners learn programming is that you can start solving problems using your domain expertise and find shortcuts like the one I will share next.
Hamletās observation and a simpler solution
For most ecommerce sites, most page templates include images (and input elements), and those generally change in quantity and size.
I decided to test the quantity and size of images, and the number of input elements, as my feature set. We were able to achieve 97.5% accuracy in our tests. This is a much simpler and more effective approach for this specific problem. All of this is possible because I didnāt start with the data I could access, but with a simpler domain-level observation.
I am not trying to say my approach is superior, as they have tested theirs in millions of pages and Iāve only tested this on a few thousand. My point is that as a practitioner you should learn this stuff so you can contribute your own expertise and creativity.
Now letās get to the fun part and get to code some machine learning code in Python!
Collecting training data
We need training data to build a model. This training data needs to come pre-labeled with ācorrectā answers so that the model can learn from the correct answers and make its own predictions on unseen data.
In our case, as discussed above, weāll use our intuition that most product pages have one or more large images on the page, and most category type pages have many smaller images on the page.
Whatās more, product pages typically have more form elements than category pages (for filling in quantity, color, and more).
Unfortunately, crawling a web page for this data requires knowledge of web browser automation, and image manipulation, which are outside the scope of this post. Feel free to study this GitHub gist we put together to learn more.
Here we load the raw data already collected.
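The collection code lives in the gist linked above; as a rough sketch of the loading step, assuming the crawl results were saved as CSV files with these hypothetical names:

```python
import pandas as pd

# Hypothetical file names; the actual collection code is in the GitHub gist.
form_counts = pd.read_csv("form_counts.csv")  # one row per URL: form and input element counts
img_counts = pd.read_csv("img_counts.csv")    # one row per image: URL, file size, height, width
```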
Feature engineering
Each row of the form_counts data frame above corresponds to a single URL and provides a count of both form elements, and input elements contained on that page.
Meanwhile, in the img_counts data frame, each row corresponds to a single image from a particular page. Each image has an associated file size, height, and width. Pages are more than likely to have multiple images on each page, and so there are many rows corresponding to each URL.
It is often the case that HTML documents donāt include explicit image dimensions. We use a little trick to compensate for this: we capture the size of the image files, which is roughly proportional to the product of the width and height of the images.
We want our image counts and image file sizes to be treated as categorical features, not numerical ones. When a numerical feature, say new visitors, increases, it generally implies improvement, but we donāt want bigger images to imply improvement. A common technique to handle this is called one-hot encoding.
Most site pages can have an arbitrary number of images. We are going to further process our dataset by bucketing images into 50 groups. This technique is called ābinningā.
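Here is a minimal sketch of both steps, assuming img_counts has 'url' and 'size' columns (the column names are assumptions, not necessarily the notebook's):

```python
import pandas as pd

# Bucket image file sizes into 50 discrete groups ("binning")
img_counts["size_bin"] = pd.cut(img_counts["size"], bins=50, labels=False)

# One-hot encode the bins so the model treats them as categories,
# not quantities where "bigger" implies "better"
size_dummies = pd.get_dummies(img_counts["size_bin"], prefix="size_bin")

# Aggregate back to one row per URL by summing the indicator columns
size_dummies["url"] = img_counts["url"]
page_features = size_dummies.groupby("url").sum()
```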
Here is what our processed data set looks like.
Adding ground truth labels
As we already have correct labels from our manual regex approach, we can use them to create the correct labels to feed the model.
We also need to split our dataset randomly into a training set and a test set. This allows us to train the machine learning model on one set of data, and test it on another set that itās never seen before. We do this to prevent our model from simply āmemorizingā the training data and doing terribly on new, unseen data. You can check it out at the link given below:
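As a minimal sketch of the split itself, assuming X holds the engineered features and y the regex-derived labels:

```python
from sklearn.model_selection import train_test_split

# Hold out 30% of pages as unseen test data; stratify to keep
# the page-group proportions similar in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
```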
Model training and grid search
Finally, the good stuff!
All the steps above, the data collection and preparation, are generally the hardest part to code. The machine learning code itself is quite simple.
We’re using the well-known scikit-learn Python library to train a number of popular models using a set of standard hyperparameters (settings for fine-tuning a model). Scikit-learn will run through all of them to find the best one. We simply feed the X variables (our engineered features above) and the Y variables (the correct labels) to each model, call the .fit() method, and voila!
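As a sketch of what that looks like in code (the model list and parameter grids here are illustrative, not the notebook’s exact setup):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Two of the candidate model families with small, standard grids.
candidates = {
    "Linear SVM": (LinearSVC(), {"C": [0.01, 0.1, 1, 10]}),
    "Logistic regression": (LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1, 10]}),
}

results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5)  # 5-fold cross-validation
    search.fit(X_train, y_train)              # this one call does the training
    results[name] = search
    print(name, round(search.best_score_, 3))
```

GridSearchCV reports the best cross-validated accuracy per model, which is where the winning scores below come from.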
Evaluating performance
After running the grid search, we find the Linear SVM to be our winning model (0.974), with logistic regression (0.968) a close second. Even with such high accuracy, a machine learning model will make mistakes. If it doesn’t make any mistakes, then there is definitely something wrong with the code.
In order to understand where the model performs best and worst, we will use another useful machine learning tool, the confusion matrix.
When looking at a confusion matrix, focus on the diagonal squares. The counts there are correct predictions, and the counts outside are failures. In the confusion matrix above we can quickly see that the model does really well labeling products, but terribly at labeling pages that are neither products nor categories. Intuitively, we can assume that such pages would not have consistent image usage.
Here is the code to put together the confusion matrix:
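A minimal equivalent with scikit-learn, reusing the grid search results from the sketch above:

```python
from sklearn.metrics import confusion_matrix

# Use the winning model from the grid search sketch.
best_model = results["Linear SVM"].best_estimator_
y_pred = best_model.predict(X_test)

labels = sorted(y.unique())  # fix the row/column order of the matrix
cm = confusion_matrix(y_test, y_pred, labels=labels)
print(cm)
```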
Finally, here is the code to plot the model evaluation:
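And a plotting sketch using scikit-learn’s ConfusionMatrixDisplay helper with matplotlib (not necessarily the plotting approach used in the original notebook):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Render the matrix as a heatmap; correct predictions sit on the diagonal.
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
disp.plot(cmap="Blues")
plt.title("Page group predictions vs. ground truth")
plt.show()
```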
Resources to learn more
You might be thinking that this is a lot of work just to tell page groups apart, and you are right!
Mirko Obkircher commented on my article for part two that there is a much simpler approach: have your client set up a Google Analytics data layer with the page group type. Very smart recommendation, Mirko!
I am using this example for illustration purposes. What if the issue requires a deeper exploratory investigation? If you already started the analysis using Python, your creativity and knowledge are the only limits.
If you want to jump onto the machine learning bandwagon, here are some resources I recommend to learn more:
Attend a PyData event. I got motivated to learn data science after attending the event they host in New York.
Hands-On Introduction To Scikit-learn (sklearn)
Scikit Learn Cheat Sheet
Efficiently Searching Optimal Tuning Parameters
If you are starting from scratch and want to learn fast, I’ve heard good things about DataCamp.
Got any tips or queries? Share them in the comments.
Hamlet BatistaĀ is the CEO and founder of RankSense, an agileĀ SEOĀ platform for online retailers and manufacturers. He can be found on TwitterĀ @hamletbatista.
The post Using Python to recover SEO site traffic (Part three) appeared first on Search Engine Watch.