#It's really not that hard to find them i just recommend having uBlock
bluefuecoco · 5 months ago
i love watching tutorials on emulating because most of them are like "Remember, you have to get your roms from dumping the files yourself [wink nudge]"
but one i saw the other day said like "You'll just have to find them mysteriously. And once you stumble across these forbidden relics--"
28 notes · View notes
kytsuine-blog · 1 year ago
A couple of notes from a techy young person who's been wrasslin' Windows for almost two years because of a bet, but is a Linux native:
* Get comfortable opening a terminal. There's a lot of useful programs that work better when the person telling you how to use them can just say "type this specifically" rather than having to learn all the ways the new update moved around all the graphical elements.
* Use a package manager. It's not worth it to try and remember to keep every little thing updated on its own. Choco can do it for you.
* Honestly? Stop the most egregious data collection from Windows, and give up on stopping everything. Do what you can to learn the ever-rotating steps it takes to block ads (because they are actually an affront to nature) - right now, uBlock Origin for your Firefox (and do use Firefox, everything else is Google these days) and searching for "tips, tricks" and/or "ads" in settings for Windows get most of them. To completely stop Windows data collection, you can use a tool called O&O ShutUp10, (package shutup10 on choco) but knowing what is and isn't safe to turn off is its own learning process. (Generally, turn off things it says are safe to.)
* If you actually need for some reason to be sure you aren't having data collected by the system, switch operating systems. Linux is the classic choice, and honestly is pretty user friendly imo these days. If you don't want to figure it out, though, honestly Macs are fine. I use them for work, they're functional, just too expensive. I'm not aware of nearly as many privacy concerns there, but I also work in about the least security-critical space possible so I've never had to check.
* Use the tool "autoruns" to decide what starts when on your computer. Google "sysinternals live" for instructions on how to run it off of the network - part of their deal with Microsoft when they were bought out is that their tools have to be available freely. They forgot to say it had to be publicised, but hey, it's at least available.
* That backup on a hard drive should really be stored somewhere that isn't your house, and that won't disappear due to interpersonal drama. Keep it with your childhood bestie you don't talk to much these days but who's always in your corner, not with your latest crush. Or with family, if you have family worth keeping it with. (I don't follow this advice, but I also high-key want to get out of that stupid bet.)
* For a specific password manager recommendation, I use Bitwarden. It's free for basically any use most folks could want from it, open source (so folks can and have checked that they aren't secretly stealing your passwords or storing them badly), and available most anywhere. Also, turn on two factor authentication. Where available, I use a YubiKey because it's easier than always being near my phone, but I also have an authentication app on my phone for the many, many places that don't take U2F or FIDO yet. (What those acronyms stand for isn't important, they're just different protocols for how to check to make sure you have a particular physical key to prove you're not just someone who guessed your password.)
Most importantly, never assume you're smart enough that you don't need to keep learning. This journey don't stop, and I don't think it should. Keep on going, and find ways to curate joy even when the digital world is becoming a digital wasteland. There's still people out there who make it worth it.
Me: oh yeah, if you think school photography is hard now, try imagining doing this with film.
The new girl: what’s film?
Me: film. Like film that goes in a film camera.
New girl: what’s that mean?
Me: before cameras were digital.
New girl: how did you do it before digital?
Me: with film? I haven’t had enough coffee for this conversation
113K notes · View notes
bluescreening · 4 years ago
Internet Safety
Yeah, I know, you’ve all sat through the talks at school telling you never to tell strangers your credit card details or whatever. But it has come to my attention that there are a worrying number of people who don’t know the actual practical things you can do to stay safe and secure while on the web. These tips cover invasions of privacy from anybody including big companies and hackers. It’s probably worthwhile to give ‘em a go.
Personal Safety
Password Safety - Use a different password for every website. I’m not kidding. If you think you’ll struggle to remember that many, you have two options. Firstly, you can use a password manager such as 1Password, which is probably the safest option. If you’re like me and can’t quite bring yourself to trust one (there’s no reason not to, it just doesn’t sit right with me) you can use variations on a password for unimportant sites, and then come up with secure ones for sites you share more personal info with.
Have I Been Pwned? - This is a website which tells you if your email has been involved in a data breach. Don’t worry if you have been pwned - you have different passwords for everything, remember! Just be aware of what data has been leaked, and change a password or two if necessary. Sign up for their email notifications to stay on top of recent breaches.
ProtonVPN - A VPN, if you don’t know, stands for virtual private network. Picture all the different connections between devices in a network, linked through WiFi or cables, as highways. VPNs section off a lane for your own private use, so nobody can see what you’re sending or receiving. It’s unlikely that anyone will be looking on your home network, but on public WiFi networks it’s important to prevent anyone seeing anything they shouldn’t - it’s not hard to packet sniff! You can also use them to bypass school and workplace website blocking, and access sites blocked in your country. Obviously ProtonVPN isn’t the only one, but I’d recommend em as they encrypt everything and have some pretty beefy systems in place to prevent tracking. It’s available on all devices for free.
ProtonMail - Yes, yes, more ProtonStuff, but this is a really good one. I’ll get onto why Google tracking you is a bad thing later, but if you want to break out of Google’s ecosystem, ProtonMail is a good alternative to GMail. It encrypts all your emails, which means nobody intercepting the email will know what it says. That means it’s great for private matters that you want to keep secret or avoid Google telling people about, like banking and stuff. It’s also a bit more customisable than GMail.
Social Media Checkup - Do you know exactly how much someone can find out about you, just by looking at your social media? Facebook is a special offender for that one (I don’t even have an account there anymore - and dear lord was deleting it a struggle) but Insta, Snapchat, Twitter and yes, even Tumblr, might provide a creep more info than you bargained for. Think about how much you want to make public, or how much the app has on you at all. There are plenty of tutorials on how to adjust your settings.
HTTPS Everywhere - A very handy extension that makes your browser use the encrypted HTTPS version of a website wherever one is available, so your data is encrypted as you send it back and forth.
Avoiding Tracking
Why? - I know it might seem weird that a large company, or even the government, might want to keep track of little old you. Sure, they can target you with relevant ads, but whatever, you use an ad-blocker anyway. That is, until you realise that behind the scenes, on almost every website you visit, data-brokers are collecting info on you and what you do online, and building a profile of you. It’s not anonymous. And it can be used for anything from determining your creditworthiness and insurance premiums to detailed surveillance. Yeah. With all the protests going on lately, it would make sense to keep these people from learning about you for your own safety and your future.
DuckDuckGo - Start by using this search engine instead of Google, and installing the Privacy Essentials extension. It’s a good search engine, for one thing. For another, it prevents tracking and lets you know whose schemes you’ve foiled, you meddling kid. It gives each site you visit a privacy rating, and lets you know how much it’s increased that by. For example, Tumblr usually receives a D, but DuckDuckGo has blocked some trackers and improved it to a B. It has also informed me that trackers have been found and dealt with on over 50% of the websites I visit. Google is unsurprisingly the main culprit.
Alternative Browsers - There are lots of things you can use instead of Chrome, and many of them work really well! I recommend Firefox, since it’s almost exactly like Chrome but open-source, and it also protects you from trackers and has lots of fun extensions. There are some other good PC ones too like Opera and Vivaldi, but I haven’t used them before so I wouldn’t know how good they are. DuckDuckGo has its own mobile browser which is currently my main one.
Adblockers - You can’t get targeted ads if you don’t get ads! You can choose who to show ads for too, so if you want to support a certain site you can whitelist them. Try uBlock Origin, or Adblock Plus. Install ‘em as extensions for whatever browser you’re using.
Privacy Checkup - Go through your Google account with a fine-toothed comb and check what is being tracked about you. Pause your YouTube history, your Maps history, your Google Assistant history. Clear what you can. Check Amazon too. Also, never ever use Cortana or Siri or Alexa or anything like that. Ever. No matter how cool having a robot assistant is.
And that should be that! I’ll try to keep updating this post with new tips as I find them, but this is everything I do for the minute to ensure I’m protected online. 
UPDATE #1 (9/8/20): I started using Vivaldi and goddammit is it brilliant!!! Extreme customisation, it's Chromium-based so you have all your fancy Chrome extensions and it has a lovely mobile app too. My current browser setup on both desktop and mobile is Vivaldi with Firefox as a backup, both with DuckDuckGo and adblockers.
102 notes · View notes
wickedbananas · 6 years ago
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have (“DianaTheIndefatigable”), which just has a renamed tracker, but uses the standard code otherwise and is implemented on-page. This is to complete the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
| Tracker | Renamed function? | GTM or on-page? | Locally hosted JavaScript file? |
| --- | --- | --- | --- |
| Default | No | GTM HTML tag | No |
| FredTheUnblockable | Yes - “tcap” | GTM HTML tag | Yes |
| AlbertTheImmutable | Yes - “buffoon” | On page | Yes |
| DianaTheIndefatigable | No | On page | No |
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools:
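For readers who want to repeat this kind of check, here is a minimal Python sketch of what to look for among the requests listed in the network panel. It assumes the trackers send standard Universal Analytics hits, whose query string carries `t=pageview` and a `tid` property ID. The example URL and the property ID `UA-000000-1` are made up for illustration and are not taken from Distilled's setup.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def is_pageview_hit(url: str) -> bool:
    """Return True if the URL looks like a Universal Analytics pageview hit."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    # Hits go to a ".../collect" endpoint and carry the hit type in "t".
    return parsed.path.endswith("/collect") and params.get("t") == ["pageview"]


def tracking_id(url: str) -> Optional[str]:
    """Return the property ID ("tid") carried by the hit, if there is one."""
    values = parse_qs(urlparse(url).query).get("tid")
    return values[0] if values else None


# Hypothetical request copied from the network panel.
example = "https://www.google-analytics.com/collect?v=1&t=pageview&tid=UA-000000-1&cid=555"
print(is_pageview_hit(example), tracking_id(example))
```

If an expected pageview request never shows up for a given tracker, that tracker counts as blocked in that environment.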
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
| Setup | Vs. Adblock | Vs. Adblock with “EasyPrivacy” enabled | Vs. uBlock Origin |
| --- | --- | --- | --- |
| GTM | Pass | Fail | Fail |
| On page | Pass | Fail | Fail |
| GTM + renamed script & function | Pass | Fail | Fail |
| On page + renamed script & function | Pass | Fail | Fail |
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of Adblock Plus, which as we’ve seen above, does not block tracking. Estimates of Adblock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.
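The arithmetic behind that estimate can be sketched in a few lines of Python. The function name is made up for this example; the inputs are the usage range and the 50% share quoted above, and the result is only as good as those assumptions.

```python
def analytics_exposure(adblock_usage: float, share_blocking_analytics: float) -> float:
    """Share of all visits that go unmeasured because an ad blocker stops analytics."""
    return adblock_usage * share_blocking_analytics


# 15-25% of users run an ad blocker; assume at most half of those block analytics.
for usage in (0.15, 0.20, 0.25):
    lost = analytics_exposure(usage, 0.5)
    print(f"{usage:.0%} ad blocker usage -> {lost:.1%} of visits unmeasured")
```

The middle of the usage range (20%) multiplied by the 50% share gives the 10% figure.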
Reason 2: Browser “do not track”
This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
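To make the "not compulsory" point concrete: the request is just an HTTP header, `DNT: 1`, and it is up to the receiving site to look at it. Below is a minimal Python sketch of a server-side check. The function name and the plain-dictionary representation of headers are assumptions for illustration, not how any particular analytics platform handles it.

```python
from typing import Mapping


def wants_no_tracking(headers: Mapping[str, str]) -> bool:
    """Return True if the request carries a "Do Not Track" preference of 1."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


# A site that chooses to honour the header would skip its analytics call here.
request_headers = {"User-Agent": "ExampleBrowser/1.0", "DNT": "1"}
if wants_no_tracking(request_headers):
    print("Visitor asked not to be tracked")
```

Nothing forces a site to run a check like this, which is consistent with the header alone having no effect in the tests below.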
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
| Setup | Chrome “do not track” | Firefox “do not track” | Firefox “tracking protection” |
| --- | --- | --- | --- |
| GTM | Pass | Pass | Fail |
| On page | Pass | Pass | Fail |
| GTM + renamed script & function | Pass | Pass | Fail |
| On page + renamed script & function | Pass | Pass | Fail |
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by-site basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late you’ve lost something you didn’t intend to.
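As a toy illustration of why the unfiltered view is worth keeping, the Python sketch below applies two filters to a handful of hits and compares the result with the untouched list. Every value here is invented, including the field names, the hits and the excluded resolution; it only shows the bookkeeping, not real traffic.

```python
# Hypothetical hits; the field names are invented for this example.
HITS = [
    {"resolution": "1920x1080", "internal": False},
    {"resolution": "1366x768", "internal": False},
    {"resolution": "800x600", "internal": False},   # niche resolution, assumed bot-heavy
    {"resolution": "1920x1080", "internal": True},  # traffic from the office
    {"resolution": "375x667", "internal": False},
]

EXCLUDED_RESOLUTIONS = {"800x600"}


def passes_filters(hit: dict) -> bool:
    """Apply both filters: drop excluded resolutions and internal traffic."""
    return hit["resolution"] not in EXCLUDED_RESOLUTIONS and not hit["internal"]


master_view = list(HITS)                                  # unfiltered copy, never touched
filtered_view = [hit for hit in HITS if passes_filters(hit)]

underreport = 1 - len(filtered_view) / len(master_view)
print(f"Filtered view shows {len(filtered_view)} of {len(master_view)} hits "
      f"({underreport:.0%} removed)")
```

With the master view still available, the removed share can be audited later if one of the filters turns out to be too aggressive.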
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag). I then weighted this against my own Google Tag Manager data to get an overall picture of all 5 setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Browser | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Chrome | 100.00% | 98.75% | 100.77% | 99.80% | 94.75% |
| Safari | 100.00% | 99.42% | 100.55% | 102.08% | 82.69% |
| Firefox | 100.00% | 99.71% | 101.16% | 101.45% | 90.68% |
| Internet Explorer | 100.00% | 80.06% | 112.31% | 113.37% | 77.18% |
There are a few main takeaways here:
* On-page code generally reports more traffic than GTM
* Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
* Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
* The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
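The "up to a third" figure can be checked directly from the browser numbers above. This Python sketch takes the "On-Page Code In <head>" and "On-Page Code Misplaced In <Body>" columns and expresses the misplaced figure as a loss relative to the properly placed code. The variable and function names are made up for the example.

```python
# Traffic as a percentage of the GTM baseline, copied from the browser table.
ON_PAGE_HEAD = {
    "Chrome": 100.77,
    "Safari": 100.55,
    "Firefox": 101.16,
    "Internet Explorer": 112.31,
}
ON_PAGE_MISPLACED = {
    "Chrome": 94.75,
    "Safari": 82.69,
    "Firefox": 90.68,
    "Internet Explorer": 77.18,
}


def relative_loss(proper: float, misplaced: float) -> float:
    """Fraction of properly measured traffic that the misplaced code misses."""
    return 1 - misplaced / proper


losses = {
    browser: relative_loss(ON_PAGE_HEAD[browser], ON_PAGE_MISPLACED[browser])
    for browser in ON_PAGE_HEAD
}

for browser, loss in sorted(losses.items(), key=lambda item: item[1], reverse=True):
    print(f"{browser}: {loss:.1%} lost vs. on-page code in <head>")
```

Internet Explorer comes out worst at a little over 31%, which is where the "up to a third" comes from. Keep in mind the bot-spam caveat above when reading the Internet Explorer row.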
I also split the data by mobile, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers, like Internet Explorer, can struggle with Google Tag Manager.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on what make-up of browsers and devices use your site. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5%, vs Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
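Here is a rough Python sketch of how the two effects stack up. It rests on an interpretation that is an assumption of this example: that the 1.7% (or 5%) figure means properly placed on-page code reports that much more traffic than GTM, and that the 7.5% loss is measured against GTM. The function name is made up.

```python
def total_loss_vs_on_page(misplaced_loss_vs_gtm: float, on_page_gain_vs_gtm: float) -> float:
    """Loss of misplaced on-page code relative to properly placed on-page code.

    Both inputs are fractions measured against a GTM baseline of 1.0.
    """
    misplaced = 1 - misplaced_loss_vs_gtm   # misplaced code, relative to GTM
    on_page = 1 + on_page_gain_vs_gtm       # proper on-page code, relative to GTM
    return 1 - misplaced / on_page


# 7.5% below GTM, with on-page code reporting 1.7% or 5% more than GTM.
for gain in (0.017, 0.05):
    loss = total_loss_vs_on_page(0.075, gain)
    print(f"GTM gap of {gain:.1%} -> total loss of {loss:.1%}")
```

Under those assumptions the total lands at roughly 9% for a Distilled-like audience and roughly 12% at the 5% end, consistent with "the 10% range".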
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:
* Untagged campaigns in email
* Untagged campaigns in apps (especially Facebook, Twitter, etc.)
* Misrepresented organic
* Data sent from botched tracking implementations (which can also appear as self-referrals)
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
Attribution
I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
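The "previous non-direct source" rule is easier to see in code. This is a simplified Python sketch, not Google Analytics' actual implementation: it takes one visitor's sessions in chronological order and ignores any lookback window or other session rules. The source labels and the function name are made up for the example.

```python
from typing import List, Optional

DIRECT = "(direct)"


def attribute_sessions(sources: List[str]) -> List[str]:
    """Credit each direct session to the previous non-direct source, if any.

    `sources` is one visitor's session sources in chronological order.
    """
    attributed = []
    last_non_direct: Optional[str] = None
    for source in sources:
        if source != DIRECT:
            last_non_direct = source
            attributed.append(source)
        else:
            # A direct visit inherits the earlier source; with none, it stays direct.
            attributed.append(last_non_direct or DIRECT)
    return attributed


visits = [DIRECT, "google / organic", DIRECT, "newsletter / email", DIRECT]
print(attribute_sessions(visits))
```

Only the very first visit stays direct here. The later direct visits are credited to organic and then to email, which is why dark traffic can quietly inflate whichever channel happened to come before it.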
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
from The Moz Blog https://ift.tt/2skU6gW via IFTTT
1 note · View note
fencesandfrogs · 4 years ago
just a reminder that switching to firefox is fast easy & free. [tips below]
okay cool you’re still here? well first off someone better than me made this guide so you can just read that but if you want a quick and dirty overview:
download firefox. you can keep chrome open for this entire thing. anyway do its auto import thing.
if u don’t have google chrome one tab, install it for easy tab porting. when you’re ready to move tabs over, just hit the button, then follow the import/export instructions (you’ll have to install one tab on firefox too but it’s easy enough)
move ur gc add ons over. not all of them will exist. mourn the ones that don’t. if you find an add on where mozilla is like “btw we don’t know if this is secure,” then make sure that whatever it provides is worth security risks. for example: i decided honey wasn’t worth it, but an add on that makes my firefox solarized color scheme was. don’t over think this tho.
also there’s a mozilla approved alternative for grammarly that i’ve started w and first off, you can turn off checking capitalization on certain sites so that’s already better than grammarly imo
next, a full list of privacy related add ons i have (these r all listed in the post i linked too): duckduckgo, ublock origin, privacy badger, https everywhere, ghostery, facebook container (only tangentially privacy related, but if you also use multi-account containers, you can fiddle with some of those settings, as well as segregating any other accounts you might wish to. extra credit, if you will)
go pick out a theme for ur hard work. i might recommend zen fox solarized because i think the solarized theme is nice, i couldn’t find a nord theme, and you get to pick what ur accent color is for both dark and light so that’s fun
now make sure firefox is ur default browser, log in to some key accounts to make sure there aren’t any issues, then close google chrome and unpin it from ur taskbar. i would suggest keeping it for a little bit in case there are issues.
congrats! go browse with a little more security.
but never forget: this is not the end of it. it’s really easy to do one thing and be like “great i’m safe forever now!” which is not how this works. i’m not saying u should give up if ur not about to cut all ties with google, every step is a good one, just don’t forget that as long as the free internet exists, people will try to profit off of you. you are not the consumer. you are the product. so make sure you know who’s making money off of you.
0 notes
imapplied · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it’s the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have (“DianaTheIndefatigable”), which just has a renamed tracker, but uses the standard code otherwise and is implemented on-page. This is to complete the set of all combinations of modified and unmodified GTM and on-page trackers.
Tumblr media
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
Tracker
Renamed function?
GTM or on-page?
Locally hosted JavaScript file?
Default
No
GTM HTML tag
No
FredTheUnblockable
Yes – “tcap”
GTM HTML tag
Yes
AlbertTheImmutable
Yes – “buffoon”
On page
Yes
DianaTheIndefatigable
No
On page
No
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools:
Tumblr media
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
Setup
Vs. Adblock
Vs. Adblock with “EasyPrivacy” enabled
Vs. uBlock Origin
GTM
Pass
Fail
Fail
On page
Pass
Fail
Fail
GTM + renamed script & function
Pass
Fail
Fail
On page + renamed script & function
Pass
Fail
Fail
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of AdBlock Plus, which as we’ve seen above, does not block tracking. Estimates of AdBlock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.
Reason 2: Browser “do not track”
This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
Setup
Chrome “do not track”
Firefox “do not track”
Firefox “tracking protection”
GTM
Pass
Pass
Fail
On page
Pass
Pass
Fail
GTM + renamed script & function
Pass
Pass
Fail
On page + renamed script & function
Pass
Pass
Fail
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by site-basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late you’ve lost something you didn’t intend to.
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag) I then weighted this against my own Google Tag Manager data to get an overall picture of all 5 setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
Google Tag Manager
Modified & Google Tag Manager
On-Page Code In <head>
Modified & On-Page Code In <head>
On-Page Code Misplaced In <Body>
Chrome
100.00%
98.75%
100.77%
99.80%
94.75%
Safari
100.00%
99.42%
100.55%
102.08%
82.69%
Firefox
100.00%
99.71%
101.16%
101.45%
90.68%
Internet Explorer
100.00%
80.06%
112.31%
113.37%
77.18%
There are a few main takeaways here:
On-page code generally reports more traffic than GTM
Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
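To make the "up to a third" figure concrete, the loss from misplaced code can be worked out against correctly placed on-page code rather than against the GTM baseline. This is only a rough sketch: the figures are copied from the browser table above, while the dictionary and function names are made up for illustration.

```python
# Traffic as a percentage of the GTM baseline, copied from the browser table.
# The names below are illustrative only, not part of any analytics tooling.
ON_PAGE_HEAD = {
    "Chrome": 100.77,
    "Safari": 100.55,
    "Firefox": 101.16,
    "Internet Explorer": 112.31,
}
MISPLACED_BODY = {
    "Chrome": 94.75,
    "Safari": 82.69,
    "Firefox": 90.68,
    "Internet Explorer": 77.18,
}


def misplaced_loss(browser: str) -> float:
    """Share of on-page <head> traffic lost when the code sits at the bottom of <body>."""
    return 1 - MISPLACED_BODY[browser] / ON_PAGE_HEAD[browser]


for browser in ON_PAGE_HEAD:
    print(f"{browser}: {misplaced_loss(browser):.1%} lost")
```

Internet Explorer comes out at roughly 31%, which is where the "up to a third" comes from; the other browsers lose noticeably less.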
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
I also split the data by device type, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers, much like Internet Explorer, can struggle with Google Tag Manager.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on the mix of browsers and devices your visitors use. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5% vs. Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
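The jump from 7.5% to "the 10% range" can be checked with a quick calculation. The sketch below assumes the two losses compound multiplicatively, which is my assumption rather than something the measurements confirm; the function name is hypothetical.

```python
def total_loss(misplaced_vs_gtm: float, gtm_vs_on_page: float) -> float:
    """Combine two relative losses into one loss versus correctly placed on-page code.

    Assumes the losses compound: misplaced code keeps (1 - a) of GTM's traffic,
    and GTM keeps (1 - b) of what correct on-page code would have reported.
    """
    remaining = (1 - misplaced_vs_gtm) * (1 - gtm_vs_on_page)
    return 1 - remaining


# 7.5% is the Teflsearch.com figure; 1.7% and 5% are the ends of the GTM range above.
for gtm_loss in (0.017, 0.05):
    print(f"GTM loss {gtm_loss:.1%}: total loss {total_loss(0.075, gtm_loss):.1%}")
```

Under that assumption the total lands at roughly 9% to 12%, which is consistent with the ~10% estimate.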
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:
Untagged campaigns in email
Untagged campaigns in apps (especially Facebook, Twitter, etc.)
Misrepresented organic
Data sent from botched tracking implementations (which can also appear as self-referrals)
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
Attribution
I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
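The "previous non-direct source" rule is easier to see on a toy example. This is a deliberately simplified sketch: it ignores any lookback window or other platform-specific rules, and the function name and source labels are made up for illustration.

```python
from typing import List, Optional


def attribute_sessions(sources: List[str]) -> List[str]:
    """Relabel each "direct" session with the most recent non-direct source, if any."""
    attributed = []
    last_non_direct: Optional[str] = None
    for source in sources:
        if source != "direct":
            last_non_direct = source
            attributed.append(source)
        else:
            # With no earlier non-direct source, the session stays direct.
            attributed.append(last_non_direct or "direct")
    return attributed


# One user's sessions in chronological order (hypothetical data).
visits = ["direct", "organic", "direct", "email", "direct"]
print(attribute_sessions(visits))
# ['direct', 'organic', 'organic', 'email', 'email']
```

Only the very first visit is reported as direct here; every later direct (or dark) visit inherits the channel that came before it.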
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
First Found Here
from https://www.imapplied.co.za/social-media/how-much-data-is-missing-from-analytics-and-other-analytics-black-holes/
0 notes
tainghekhongdaycomvn · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations (one in Google Tag Manager and one on page) run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder for ad blockers to spot. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have “DianaTheIndefatigable,” which just has a renamed tracker, but uses the standard code otherwise and is implemented on-page. This is to complete the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
| Tracker | Renamed function? | GTM or on-page? | Locally hosted JavaScript file? |
| --- | --- | --- | --- |
| Default | No | GTM HTML tag | No |
| FredTheUnblockable | Yes - “tcap” | GTM HTML tag | Yes |
| AlbertTheImmutable | Yes - “buffoon” | On page | Yes |
| DianaTheIndefatigable | No | On page | No |
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools.
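If you copy request URLs out of the developer tools network tab, a few lines of Python can tell you which of them are pageview hits. This sketch assumes the hits are sent as GET requests to a path ending in /collect, with the hit type in the `t` query parameter; the example URL and its tracking ID are placeholders, not real requests from Distilled.net.

```python
from urllib.parse import parse_qs, urlparse


def is_pageview_hit(request_url: str) -> bool:
    """Return True if the URL looks like an analytics pageview hit.

    Assumption: hits go to a path ending in /collect and carry t=pageview
    in the query string. Hits sent with a POST body are not covered.
    """
    parsed = urlparse(request_url)
    params = parse_qs(parsed.query)
    return parsed.path.endswith("/collect") and params.get("t") == ["pageview"]


# Placeholder URL for illustration only.
example = "https://www.google-analytics.com/collect?v=1&t=pageview&tid=UA-XXXXX-Y&dp=%2Fblog"
print(is_pageview_hit(example))
```

A setup "passes" in the tables below when a hit like this shows up with the blocker enabled, and "fails" when it never appears.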
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default; others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser add-ons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
| Setup | Vs. Adblock | Vs. Adblock with “EasyPrivacy” enabled | Vs. uBlock Origin |
| --- | --- | --- | --- |
| GTM | Pass | Fail | Fail |
| On page | Pass | Fail | Fail |
| GTM + renamed script & function | Pass | Fail | Fail |
| On page + renamed script & function | Pass | Fail | Fail |
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of Adblock Plus, which, as we’ve seen above, does not block tracking. Estimates of Adblock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, and take overall usage at roughly 20%, that leaves your exposure at around 10%.
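The estimate is just two percentages multiplied together, so it is easy to rerun for your own region. A minimal sketch, using the usage range and the 50% assumption from the paragraph above; the function name is made up for illustration.

```python
def analytics_exposure(adblock_usage: float, share_blocking_analytics: float) -> float:
    """Fraction of all visitors whose ad blocker also blocks analytics."""
    return adblock_usage * share_blocking_analytics


# Usage figures span the 15-25% range quoted above; 50% is the assumed upper
# bound on the share of ad blocker installs that block analytics.
for usage in (0.15, 0.20, 0.25):
    exposure = analytics_exposure(usage, 0.50)
    print(f"{usage:.0%} ad blocker usage: about {exposure:.1%} of traffic missing")
```

That gives a range of about 7.5% to 12.5%, with the ~10% headline figure sitting in the middle.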
Reason 2: Browser “do not track”
This is another privacy-motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
| Setup | Chrome “do not track” | Firefox “do not track” | Firefox “tracking protection” |
| --- | --- | --- | --- |
| GTM | Pass | Pass | Fail |
| On page | Pass | Pass | Fail |
| GTM + renamed script & function | Pass | Pass | Fail |
| On page + renamed script & function | Pass | Pass | Fail |
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
0 notes
majorasheart · 3 years ago
Text
While I can't stop you from seeing ads irl, I can provide these:
System wide ad blocker for iOS and Android
For the iOS version, you'll have to get it off the app store, while the Android version has an APK. It functions like a VPN and will pretty much always stay active except when it updates and when you restart your phone. The only downsides I've had with it are A.) Some sites will completely stop working with this enabled, and you'll have to disable it to see the site B.) Because it functions like a VPN, you can't have another VPN with this one active, and C.) There are certain apps (Youtube, Twitter, Tumblr, etc) that it can't block ads on because they're embedded within the app itself, and aren't provided by a website.
Youtube Vanced, Youtube Premium without paying
This is a version of the Youtube app that provides everything Youtube Premium offers and more, including removing ads. This one is my personal favorite, I've had such a good time with it. However, the downside is that once you get it set up, if you were to change the password of the account you signed in with you have to re-sign in, which can really be a pain in the ass with how buggy it can be sometimes.
That being said, I should also say that the only reason why it can seem buggy is because Google is really tight with their apps, so it can be hard for the developers to work around their bullshit. There's also a pretty active subreddit which you can turn to for help if necessary.
Spotify Premium app
(Android only, sorry iOS users. This is the file I used, but if it doesn't work let me know and I'll update it)
Free Spotify Premium. That's....it really. Only thing that sucks is that there isn't really a cracked desktop app since that's harder to do. If you want my personal opinion, you really shouldn't even be using Spotify. Just invest in an MP3 player app. For desktops, VLC media player has always worked wonderfully for me. It doesn't take a lot of processing power to have open, and it can have playlists too. For phones, you'll have to hunt one down yourself.
For downloading audio files, I can't recommend youtube-dl enough. I'm just going to say right now, it's pretty complicated to set up and I can't explain it all in this post, but it runs from the command line (cmd). If you figure out how to set it up, it's really nice for downloading high quality audio. Otherwise if you don't care about audio quality and want something easy to use, just use ytmp3 or something.
Other resources:
Every time you see an ad on Twitter, go to the top right of the post, tap on the three dots, and block the company. If you start doing this to every single ad you see, ads will start to become fairly rare. Though I can't do anything about ads embedded into other videos.
uBlock Origin is a browser add-on I recommend above any other ad blocker. This is because it allows you to enter custom scripts into the add-on, and if you go hunting for scripts other people have made and add them, you can block more than just ads. You can usually find scripts on the subreddit, otherwise just google around for them.
If you're an iOS user, a lot of what you can do will be limited because of how needlessly tight Apple is about giving you freedom. This is why I highly suggest learning how to jailbreak your phone. There are mounds of cracked versions of apps available if you do, which will make it much easier to find versions of apps that remove ads. However, if you do this, there's a risk of your Apple account being banned if they find out. If you decide to do this, please be careful and follow a well trusted guide very closely.
That's about all I can provide. If anyone else has anything to share, please add it to this post for others to find. Also, my inbox and messages are pretty much always open. If any of the links are broken or you need help setting up one of these programs, feel free to reach out to me.
all advertising needs to be destroyed im sick of ads on the free apps that *came with the computer i bought*.. on MY computer! im sick of 15 seconds of advertising before i watch a video made by a zillenial then paid to recite how much they love the new mocoa cocoa drink mix im sick of brands pretending to be my friend im sick of urban space used only to sell you products (later, somewhere else) im sick of subscription services im sick of copyright im sick of new roads for new customers for our new walmart im SICK!!!! burn it all down i can't live like this!!!!
90K notes · View notes
isearchgoood · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have (“DianaTheIndefatigable”), which just has a renamed tracker, but uses the standard code otherwise and is implemented on-page. This is to complete the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
Tracker
Renamed function?
GTM or on-page?
Locally hosted JavaScript file?
Default
No
GTM HTML tag
No
FredTheUnblockable
Yes - “tcap”
GTM HTML tag
Yes
AlbertTheImmutable
Yes - “buffoon”
On page
Yes
DianaTheIndefatigable
No
On page
No
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools:
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
Setup
Vs. Adblock
Vs. Adblock with “EasyPrivacy” enabled
Vs. uBlock Origin
GTM
Pass
Fail
Fail
On page
Pass
Fail
Fail
GTM + renamed script & function
Pass
Fail
Fail
On page + renamed script & function
Pass
Fail
Fail
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of AdBlock Plus, which as we’ve seen above, does not block tracking. Estimates of AdBlock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.
Reason 2: Browser “do not track”
This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
Setup
Chrome “do not track”
Firefox “do not track”
Firefox “tracking protection”
GTM
Pass
Pass
Fail
On page
Pass
Pass
Fail
GTM + renamed script & function
Pass
Pass
Fail
On page + renamed script & function
Pass
Pass
Fail
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by site-basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late you’ve lost something you didn’t intend to.
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag) I then weighted this against my own Google Tag Manager data to get an overall picture of all 5 setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
Google Tag Manager
Modified & Google Tag Manager
On-Page Code In <head>
Modified & On-Page Code In <head>
On-Page Code Misplaced In <Body>
Chrome
100.00%
98.75%
100.77%
99.80%
94.75%
Safari
100.00%
99.42%
100.55%
102.08%
82.69%
Firefox
100.00%
99.71%
101.16%
101.45%
90.68%
Internet Explorer
100.00%
80.06%
112.31%
113.37%
77.18%
There are a few main takeaways here:
On-page code generally reports more traffic than GTM
Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
I also split the data by mobile, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers, like Internet Explorer, can struggle with Google Tag Manager.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on what make-up of browsers and devices use your site. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5%, vs Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:
* Untagged campaigns in email
* Untagged campaigns in apps (especially Facebook, Twitter, etc.)
* Misrepresented organic
* Data sent from botched tracking implementations (which can also appear as self-referrals)
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
Attribution
I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
via Blogger https://ift.tt/2kwAy68
0 notes
lawrenceseitz22 · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have “DianaTheIndefatigable,” which just has a renamed tracker, but otherwise uses the standard code and is implemented on-page. This completes the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
| Tracker | Renamed function? | GTM or on-page? | Locally hosted JavaScript file? |
| --- | --- | --- | --- |
| Default | No | GTM HTML tag | No |
| FredTheUnblockable | Yes - “tcap” | GTM HTML tag | Yes |
| AlbertTheImmutable | Yes - “buffoon” | On page | Yes |
| DianaTheIndefatigable | No | On page | No |
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools.
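The same check can be scripted rather than eyeballed. The sketch below is one possible approach, not the method used for this post: it assumes you have exported the network log from developer tools as a HAR file, and that the hits are classic Google Analytics GET requests to a path ending in `/collect`. The function name, the `extra_hosts` parameter, and the sample tracking ID are all hypothetical.

```python
from urllib.parse import urlparse, parse_qs

# Host that classic Google Analytics hits are sent to.
GA_HOST = "www.google-analytics.com"


def pageview_hits(har: dict, extra_hosts: tuple = ()) -> list:
    """Return the tracking ID (tid) of each GA pageview hit found in a HAR export.

    Only GET hits are inspected; hits sent as POST bodies are not covered
    by this sketch.
    """
    hosts = {GA_HOST, *extra_hosts}
    found = []
    for entry in har.get("log", {}).get("entries", []):
        url = urlparse(entry["request"]["url"])
        if url.hostname not in hosts or not url.path.endswith("/collect"):
            continue
        params = parse_qs(url.query)
        if params.get("t") == ["pageview"]:
            found.append(params.get("tid", ["?"])[0])
    return found


# Hand-made HAR fragment: one pageview hit and one unrelated request.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://www.google-analytics.com/collect"
                                "?v=1&t=pageview&tid=UA-000000-1"}},
            {"request": {"url": "https://example.com/static/site.css"}},
        ]
    }
}

print(pageview_hits(sample_har))
```

For a real export, load the file with `json.load` and pass the resulting dictionary in. An empty list with an ad blocker enabled corresponds to a "Fail" for that setup.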
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. This has mainly been driven by users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default; others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser add-ons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
| Setup | Vs. Adblock | Vs. Adblock with “EasyPrivacy” enabled | Vs. uBlock Origin |
| --- | --- | --- | --- |
| GTM | Pass | Fail | Fail |
| On page | Pass | Fail | Fail |
| GTM + renamed script & function | Pass | Fail | Fail |
| On page + renamed script & function | Pass | Fail | Fail |
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of Adblock Plus, which, as we’ve seen above, does not block tracking. Estimates of Adblock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, and take usage at around the 20% midpoint of that range, that leaves your exposure at around 10%.
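A minimal sketch of that estimate, assuming exposure is simply ad-blocker usage multiplied by the share of ad blockers that block analytics. The function name is hypothetical; the input figures are the ranges quoted above.

```python
def analytics_exposure(adblock_usage: float, share_blocking_analytics: float) -> float:
    """Fraction of visits lost to ad blockers that also block analytics."""
    return adblock_usage * share_blocking_analytics


# Usage range quoted above, with at most half of ad blockers blocking analytics.
for usage in (0.15, 0.20, 0.25):
    exposure = analytics_exposure(usage, 0.5)
    print(f"{usage:.0%} ad-blocker usage: about {exposure:.1%} of visits unmeasured")
```

The 10% headline figure is the middle of the resulting range rather than a hard number.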
Reason 2: Browser “do not track”
This is another privacy-motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
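For context, the “do not track” request reaches the server as an HTTP request header, `DNT: 1`, and it is up to the site to act on it. The sketch below shows how a site might read that header; the function name is hypothetical, and the headers are assumed to arrive as a plain dictionary.

```python
def requests_no_tracking(headers: dict) -> bool:
    """True if the request carries a Do Not Track header set to 1."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


print(requests_no_tracking({"DNT": "1", "User-Agent": "example"}))
print(requests_no_tracking({"User-Agent": "example"}))
```

A site that does nothing with this value keeps tracking as normal, which is consistent with the request being voluntary.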
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
| Setup | Chrome “do not track” | Firefox “do not track” | Firefox “tracking protection” |
| --- | --- | --- | --- |
| GTM | Pass | Pass | Fail |
| On page | Pass | Pass | Fail |
| GTM + renamed script & function | Pass | Pass | Fail |
| On page + renamed script & function | Pass | Pass | Fail |
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by-site basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late you’ve lost something you didn’t intend to.
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag). I then weighted this against my own Google Tag Manager data to get an overall picture of all five setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Browser | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Chrome | 100.00% | 98.75% | 100.77% | 99.80% | 94.75% |
| Safari | 100.00% | 99.42% | 100.55% | 102.08% | 82.69% |
| Firefox | 100.00% | 99.71% | 101.16% | 101.45% | 90.68% |
| Internet Explorer | 100.00% | 80.06% | 112.31% | 113.37% | 77.18% |
There are a few main takeaways here:
* On-page code generally reports more traffic than GTM
* Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
* Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
* The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
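The “up to a third” figure can be checked against the browser figures above. This is a rough sketch that assumes the loss is simply one minus the ratio of the misplaced-code figure to the on-page-in-head figure for each browser; the variable and function names are hypothetical.

```python
# Traffic as a percentage of the GTM baseline, per browser.
ON_PAGE_HEAD = {"Chrome": 100.77, "Safari": 100.55,
                "Firefox": 101.16, "Internet Explorer": 112.31}
MISPLACED_BODY = {"Chrome": 94.75, "Safari": 82.69,
                  "Firefox": 90.68, "Internet Explorer": 77.18}


def relative_loss(reference: float, observed: float) -> float:
    """Share of the reference traffic that the observed setup fails to report."""
    return 1 - observed / reference


losses = {browser: relative_loss(ON_PAGE_HEAD[browser], MISPLACED_BODY[browser])
          for browser in ON_PAGE_HEAD}

for browser, loss in sorted(losses.items(), key=lambda item: item[1], reverse=True):
    print(f"{browser}: {loss:.1%} lost vs. on-page code in <head>")
```

Internet Explorer comes out worst, at a little over 31%, but that figure is sensitive to the bot spam discussed below, so it is best read as an upper bound.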
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
I also split the data by device type, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers, like Internet Explorer, can struggle with Google Tag Manager.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on what make-up of browsers and devices use your site. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
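To see how the audience mix moves this number, here is a rough sketch that weights the per-device shortfall of GTM against on-page code in <head> by a device mix. The two mixes are invented for illustration and are not Distilled’s actual traffic split, so the output will not reproduce the 1.7% figure exactly.

```python
# On-page code in <head>, as a percentage of the GTM baseline, per device.
ON_PAGE_VS_GTM = {"desktop": 100.97, "mobile": 103.78, "tablet": 104.20}


def gtm_shortfall(device_mix: dict) -> float:
    """Weighted share of on-page-measured traffic that GTM fails to report.

    device_mix maps device type to its share of traffic (shares sum to 1).
    """
    return sum(share * (1 - 100 / ON_PAGE_VS_GTM[device])
               for device, share in device_mix.items())


# Hypothetical mixes, for illustration only.
desktop_heavy = {"desktop": 0.80, "mobile": 0.15, "tablet": 0.05}
mobile_heavy = {"desktop": 0.30, "mobile": 0.60, "tablet": 0.10}

print(f"Desktop-heavy audience: {gtm_shortfall(desktop_heavy):.1%}")
print(f"Mobile-heavy audience: {gtm_shortfall(mobile_heavy):.1%}")
```

The more mobile and tablet traffic in the mix, the larger the gap, which is why the estimate is given as a range.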
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5%, vs Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
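One way to arrive at a combined figure is to treat the two shortfalls as compounding, which is an assumption of this sketch rather than something measured here. The function name is hypothetical; the inputs are the 7.5% loss above and the 1.7% to 5% range for GTM.

```python
def combined_loss(*losses: float) -> float:
    """Total share of traffic lost when independent losses compound."""
    kept = 1.0
    for loss in losses:
        kept *= 1 - loss
    return 1 - kept


# Misplaced code vs. GTM (7.5%), compounded with GTM's own shortfall.
print(f"With 1.7% GTM loss: {combined_loss(0.075, 0.017):.1%}")
print(f"With 5% GTM loss:   {combined_loss(0.075, 0.05):.1%}")
```

Both cases land roughly in the 9% to 12% range, in line with the ~10% estimate.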
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:
* Untagged campaigns in email
* Untagged campaigns in apps (especially Facebook, Twitter, etc.)
* Misrepresented organic
* Data sent from botched tracking implementations (which can also appear as self-referrals)
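The first two causes are the easiest to address: tag campaign links with the standard `utm_source`, `utm_medium`, and `utm_campaign` parameters before sending them out. The helper below is a minimal sketch of that; the function name, example URL, and campaign values are made up for illustration.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode


def tag_campaign_url(url: str, source: str, medium: str, campaign: str) -> str:
    """Add UTM campaign parameters to a URL, keeping any existing query string."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query.update({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    return urlunparse(parts._replace(query=urlencode(query)))


print(tag_campaign_url("https://www.example.com/pricing?ref=footer",
                       "newsletter", "email", "spring_sale"))
```

A link tagged this way is reported under its campaign instead of falling into direct when the referrer is missing.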
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
Attribution
I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
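The last point is easier to see with a toy example. The sketch below applies a simplified “last non-direct source” rule to one visitor’s sessions in order; it ignores any time limit on how far back the previous source is looked up, and the function name and sample sources are hypothetical.

```python
from typing import List, Optional


def attribute_sessions(sources: List[str]) -> List[str]:
    """Attribute each session to the most recent non-direct source, if any."""
    attributed = []
    last_non_direct: Optional[str] = None
    for source in sources:
        if source != "direct":
            last_non_direct = source
        # Direct sessions inherit the previous non-direct source when one exists.
        attributed.append(last_non_direct or "direct")
    return attributed


visits = ["direct", "organic", "direct", "direct", "email", "direct"]
print(attribute_sessions(visits))
```

Only the very first session stays direct; every later direct visit is credited to organic or email, which is how dark traffic can end up inflating other channels.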
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
from Blogger https://ift.tt/2KZaOKK via IFTTT
0 notes
rodneyevesuarywk · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have (“DianaTheIndefatigable”), which just has a renamed tracker, but uses the standard code otherwise and is implemented on-page. This is to complete the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
Tracker
Renamed function?
GTM or on-page?
Locally hosted JavaScript file?
Default
No
GTM HTML tag
No
FredTheUnblockable
Yes - “tcap”
GTM HTML tag
Yes
AlbertTheImmutable
Yes - “buffoon”
On page
Yes
DianaTheIndefatigable
No
On page
No
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools:
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
Setup
Vs. Adblock
Vs. Adblock with “EasyPrivacy” enabled
Vs. uBlock Origin
GTM
Pass
Fail
Fail
On page
Pass
Fail
Fail
GTM + renamed script & function
Pass
Fail
Fail
On page + renamed script & function
Pass
Fail
Fail
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of AdBlock Plus, which as we’ve seen above, does not block tracking. Estimates of AdBlock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.
Reason 2: Browser “do not track”
This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
Setup
Chrome “do not track”
Firefox “do not track”
Firefox “tracking protection”
GTM
Pass
Pass
Fail
On page
Pass
Pass
Fail
GTM + renamed script & function
Pass
Pass
Fail
On page + renamed script & function
Pass
Pass
Fail
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by-site basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late that you’ve lost something you didn’t intend to lose.
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it tends to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag). I then weighted this against my own Google Tag Manager data to get an overall picture of all five setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Browser | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <body> |
| --- | --- | --- | --- | --- | --- |
| Chrome | 100.00% | 98.75% | 100.77% | 99.80% | 94.75% |
| Safari | 100.00% | 99.42% | 100.55% | 102.08% | 82.69% |
| Firefox | 100.00% | 99.71% | 101.16% | 101.45% | 90.68% |
| Internet Explorer | 100.00% | 80.06% | 112.31% | 113.37% | 77.18% |
There are a few main takeaways here:
* On-page code generally reports more traffic than GTM
* Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
* Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
* The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
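The “up to a third” figure can be checked directly from the table by comparing the misplaced <body> column against the properly placed <head> column for each browser. This is a sketch of that arithmetic only; the dictionary keys and the `relative_loss` helper are names invented for the example.

```python
# Traffic as a percentage of the GTM baseline, copied from the browser table.
traffic = {
    "Chrome": {"on_page_head": 100.77, "misplaced_body": 94.75},
    "Safari": {"on_page_head": 100.55, "misplaced_body": 82.69},
    "Firefox": {"on_page_head": 101.16, "misplaced_body": 90.68},
    "Internet Explorer": {"on_page_head": 112.31, "misplaced_body": 77.18},
}


def relative_loss(proper: float, misplaced: float) -> float:
    """Share of properly tracked traffic that the misplaced code fails to record."""
    return 1 - misplaced / proper


losses = {
    browser: relative_loss(row["on_page_head"], row["misplaced_body"])
    for browser, row in traffic.items()
}
for browser, loss in losses.items():
    print(f"{browser}: {loss:.1%}")
```

On these figures Internet Explorer shows the largest gap, a little under a third, with Chrome the smallest.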
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
I also split the data by device type, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers can struggle with Google Tag Manager, just as Internet Explorer does.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on the mix of browsers and devices your visitors use. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
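To get a feel for how audience mix moves this number, one rough approach is to weight the per-device gap between on-page code and GTM by your own device split. The sketch below is an illustration rather than the method behind the 1.7% figure: the on-page percentages come from the device table, while the `device_mix` shares are hypothetical and should be replaced with your own.

```python
# On-page (<head>) traffic as a percentage of the GTM baseline, from the device table.
on_page_vs_gtm = {"Desktop": 100.97, "Mobile": 103.78, "Tablet": 104.20}

# Hypothetical share of sessions per device; not taken from the article.
device_mix = {"Desktop": 0.50, "Mobile": 0.40, "Tablet": 0.10}


def gtm_shortfall(on_page_pct: float) -> float:
    """Fraction of on-page-recorded traffic that GTM (the 100% baseline) misses."""
    return 1 - 100.0 / on_page_pct


weighted_loss = sum(
    share * gtm_shortfall(on_page_vs_gtm[device])
    for device, share in device_mix.items()
)
print(f"Estimated GTM underreporting for this mix: {weighted_loss:.1%}")
```

A more mobile-heavy mix pushes the estimate up, which is consistent with the point that a desktop-heavy audience sits at the low end of the range.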
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5% vs. Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is traffic reported as direct that didn’t really arrive directly, and it is becoming more and more common. Typical causes are:
* Untagged campaigns in email
* Untagged campaigns in apps (especially Facebook, Twitter, etc.)
* Misrepresented organic
* Data sent from botched tracking implementations (which can also appear as self-referrals)
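Of these causes, untagged campaign links are the most directly fixable, because Google Analytics reads the `utm_source`, `utm_medium` and `utm_campaign` query parameters from the landing URL. Here is a minimal sketch of tagging a link before it goes into an email or app post; the helper name `tag_campaign_url`, the example URL, and the campaign values are all placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_campaign_url(url: str, source: str, medium: str, campaign: str) -> str:
    """Append utm_* parameters, keeping any query string already on the URL."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder link and campaign values for illustration.
tagged = tag_campaign_url(
    "https://www.example.com/offer?ref=1", "newsletter", "email", "spring_sale"
)
print(tagged)
```

Visits arriving through a link tagged this way are reported under the named source and medium rather than falling into direct.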
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
Attribution
I’ve written about this in more detail elsewhere, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct. You might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
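The “previous non-direct source” rule is easier to see with a toy example. The sketch below is a simplified model of that behavior, not Google’s implementation: it ignores any lookback window, treats `None` as a direct visit, and the function name and source labels are invented for illustration.

```python
from typing import Optional


def attribute_sessions(sources: list[Optional[str]]) -> list[str]:
    """Map each session's raw source (None means direct) to its attributed source."""
    attributed = []
    last_non_direct: Optional[str] = None
    for source in sources:
        if source is not None:
            # A real referrer or campaign always replaces the remembered source.
            last_non_direct = source
        # Direct visits inherit the remembered source when there is one.
        attributed.append(last_non_direct or "direct")
    return attributed


# One visitor's sessions in order; None marks a direct (or dark) visit.
visits = [None, "google / organic", None, None, "newsletter / email", None]
print(attribute_sessions(visits))
```

Only the very first visit stays direct here; every later direct or dark visit is credited to whichever channel came before it.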
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
maryhare96 · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have (“DianaTheIndefatigable”), which just has a renamed tracker, but uses the standard code otherwise and is implemented on-page. This is to complete the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
Tracker
Renamed function?
GTM or on-page?
Locally hosted JavaScript file?
Default
No
GTM HTML tag
No
FredTheUnblockable
Yes - “tcap”
GTM HTML tag
Yes
AlbertTheImmutable
Yes - “buffoon”
On page
Yes
DianaTheIndefatigable
No
On page
No
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools:
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
| Setup | Vs. Adblock | Vs. Adblock with “EasyPrivacy” enabled | Vs. uBlock Origin |
| --- | --- | --- | --- |
| GTM | Pass | Fail | Fail |
| On page | Pass | Fail | Fail |
| GTM + renamed script & function | Pass | Fail | Fail |
| On page + renamed script & function | Pass | Fail | Fail |
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of AdBlock Plus, which as we’ve seen above, does not block tracking. Estimates of AdBlock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.
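The arithmetic behind that ~10% figure can be written out as a minimal sketch. The function name is made up for illustration; the usage range and the 50% ceiling are the figures quoted above, and the 20% midpoint is an assumption added to show the middle of the range.

```python
def analytics_exposure(adblock_usage: float, share_blocking_analytics: float) -> float:
    """Fraction of all visitors whose ad blocker also blocks analytics."""
    return adblock_usage * share_blocking_analytics


# 15% and 25% are the ends of the quoted usage range; 20% is an assumed midpoint.
# 0.50 is the "at most 50% of installed ad blockers block analytics" ceiling.
for usage in (0.15, 0.20, 0.25):
    lost = analytics_exposure(usage, 0.50)
    print(f"{usage:.0%} ad blocker usage -> up to {lost:.1%} of visits untracked")
```

The result spans roughly 7.5% to 12.5% depending on region, so "around 10%" is the middle of the range rather than a precise measurement.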
Reason 2: Browser “do not track”
This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
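For context, the “do not track” setting reaches a site as an HTTP request header, `DNT: 1`, and it is up to the site whether to act on it. Below is a minimal sketch of a server-side check, assuming the request headers are available as a plain mapping; the function name is hypothetical and not part of any analytics platform.

```python
from typing import Mapping


def requests_no_tracking(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries the header `DNT: 1`."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


print(requests_no_tracking({"DNT": "1"}))             # True
print(requests_no_tracking({"User-Agent": "test"}))   # False
```

A site that wanted to honor the request would skip loading its analytics snippet when this returns True; nothing forces it to.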
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
| Setup | Chrome “do not track” | Firefox “do not track” | Firefox “tracking protection” |
| --- | --- | --- | --- |
| GTM | Pass | Pass | Fail |
| On page | Pass | Pass | Fail |
| GTM + renamed script & function | Pass | Pass | Fail |
| On page + renamed script & function | Pass | Pass | Fail |
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by-site basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late you’ve lost something you didn’t intend to.
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag). I then weighted this against my own Google Tag Manager data to get an overall picture of all five setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Browser | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Chrome | 100.00% | 98.75% | 100.77% | 99.80% | 94.75% |
| Safari | 100.00% | 99.42% | 100.55% | 102.08% | 82.69% |
| Firefox | 100.00% | 99.71% | 101.16% | 101.45% | 90.68% |
| Internet Explorer | 100.00% | 80.06% | 112.31% | 113.37% | 77.18% |
There are a few main takeaways here:
* On-page code generally reports more traffic than GTM
* Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
* Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
* The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
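The “up to a third” takeaway can be checked directly against the browser table. This is a minimal sketch that compares the misplaced-in-<body> column with the unmodified on-page-in-<head> column; the variable and function names are made up for illustration, and the numbers are copied from the table above.

```python
# Traffic as a percentage of the GTM baseline, copied from the browser table:
# (on-page code in <head>, on-page code misplaced in <body>)
TRAFFIC = {
    "Chrome": (100.77, 94.75),
    "Safari": (100.55, 82.69),
    "Firefox": (101.16, 90.68),
    "Internet Explorer": (112.31, 77.18),
}


def misplaced_loss(in_head: float, in_body: float) -> float:
    """Share of properly tracked traffic that the misplaced snippet misses."""
    return 1 - in_body / in_head


for browser, (in_head, in_body) in TRAFFIC.items():
    print(f"{browser}: {misplaced_loss(in_head, in_body):.1%} lost")
```

Internet Explorer comes out at roughly 31%, which is where “up to a third” comes from; Chrome is at the other end of the range at about 6%.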
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
I also split the data by device type, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers, like Internet Explorer, can struggle with Google Tag Manager.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on what make-up of browsers and devices use your site. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5%, vs Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
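One way to see how 7.5% becomes “the 10% range” is to compound the two losses rather than add them. This sketch assumes that GTM's underreporting is measured against correctly placed on-page code and that the two effects multiply; both are assumptions for illustration, not something the data above confirms. The function name is hypothetical.

```python
def total_loss(loss_vs_gtm: float, gtm_loss: float) -> float:
    """Compound two successive losses instead of simply adding them."""
    return 1 - (1 - loss_vs_gtm) * (1 - gtm_loss)


# 7.5% is the misplaced-code loss vs. GTM; 1.7% and 5% are the ends of the
# GTM underreporting range quoted in the previous section.
for gtm_loss in (0.017, 0.05):
    print(f"GTM loss {gtm_loss:.1%} -> total loss {total_loss(0.075, gtm_loss):.1%}")
```

Under those assumptions the total lands between roughly 9% and 12%, consistent with the ~10% headline.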
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:
* Untagged campaigns in email
* Untagged campaigns in apps (especially Facebook, Twitter, etc.)
* Misrepresented organic
* Data sent from botched tracking implementations (which can also appear as self-referrals)
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
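Untagged campaigns are the most fixable cause in the list above, since Google Analytics reads the `utm_source`, `utm_medium`, and `utm_campaign` query parameters from the landing URL. Below is a minimal sketch of a helper that appends them while keeping any existing query string; the helper name, the example URL, and the campaign values are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return `url` with UTM campaign parameters added to its query string."""
    parts = urlsplit(url)
    # Preserve existing parameters, then add (or overwrite) the UTM ones.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(
        {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    )
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_utm("https://www.example.com/post?ref=1", "newsletter", "email", "june_digest"))
# https://www.example.com/post?ref=1&utm_source=newsletter&utm_medium=email&utm_campaign=june_digest
```

Links tagged this way in an email or app should show up under the named source and medium instead of falling into direct.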
Attribution
I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
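The “previous non-direct source” rule can be illustrated with a deliberately simplified sketch. It treats one user's sessions as an ordered list of source labels and ignores the lookback window that Google Analytics applies in practice; the function name and labels are made up for illustration.

```python
from typing import List, Optional


def attribute_sessions(sources: List[str]) -> List[str]:
    """Credit each 'direct' session to the most recent non-direct source, if any."""
    attributed: List[str] = []
    last_non_direct: Optional[str] = None
    for source in sources:
        if source != "direct":
            last_non_direct = source
        # A direct session keeps the label "direct" only when nothing precedes it.
        attributed.append(last_non_direct or source)
    return attributed


print(attribute_sessions(["direct", "organic", "direct", "email", "direct"]))
# ['direct', 'organic', 'organic', 'email', 'email']
```

Only the very first session stays direct here, which shows how dark traffic can quietly inflate whichever channel the user happened to arrive through earlier.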
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
https://ift.tt/2LCPWKo
0 notes
tainghekhongdaycomvn · 6 years ago
Text
How Much Data Is Missing from Analytics? And Other Analytics Black Holes
Posted by Tom.Capper
If you’ve ever compared two analytics implementations on the same site, or compared your analytics with what your business is reporting in sales, you’ve probably noticed that things don’t always match up. In this post, I’ll explain why data is missing from your web analytics platforms and how large the impact could be. Some of the issues I cover are actually quite easily addressed, and have a decent impact on traffic — there’s never been an easier way to hit your quarterly targets. ;)
I’m going to focus on GA (Google Analytics), as it's the most commonly used provider, but most on-page analytics platforms have the same issues. Platforms that rely on server logs do avoid some issues but are fairly rare, so I won’t cover them in any depth.
Side note: Our test setup (multiple trackers & customized GA)
On Distilled.net, we have a standard Google Analytics property running from an HTML tag in GTM (Google Tag Manager). In addition, for the last two years, I’ve been running three extra concurrent Google Analytics implementations, designed to measure discrepancies between different configurations.
(If you’re just interested in my findings, you can skip this section, but if you want to hear more about the methodology, continue reading. Similarly, don’t worry if you don’t understand some of the detail here — the results are easier to follow.)
Two of these extra implementations — one in Google Tag Manager and one on page — run locally hosted, renamed copies of the Google Analytics JavaScript file (e.g. www.distilled.net/static/js/au3.js, instead of www.google-analytics.com/analytics.js) to make them harder to spot for ad blockers. I also used renamed JavaScript functions (“tcap” and “Buffoon,” rather than the standard “ga”) and renamed trackers (“FredTheUnblockable” and “AlbertTheImmutable”) to avoid having duplicate trackers (which can often cause issues).
This was originally inspired by 2016-era best practice on how to get your Google Analytics setup past ad blockers. I can’t find the original article now, but you can see a very similar one from 2017 here.
Lastly, we have “DianaTheIndefatigable,” which just has a renamed tracker but otherwise uses the standard code and is implemented on-page. This completes the set of all combinations of modified and unmodified GTM and on-page trackers.
Two of Distilled’s modified on-page trackers, as seen on https://www.distilled.net/
Overall, this table summarizes our setups:
| Tracker | Renamed function? | GTM or on-page? | Locally hosted JavaScript file? |
| --- | --- | --- | --- |
| Default | No | GTM HTML tag | No |
| FredTheUnblockable | Yes - “tcap” | GTM HTML tag | Yes |
| AlbertTheImmutable | Yes - “buffoon” | On page | Yes |
| DianaTheIndefatigable | No | On page | No |
I tested their functionality in various browser/ad-block environments by watching for the pageviews appearing in browser developer tools.
Reason 1: Ad Blockers
Ad blockers, primarily as browser extensions, have been growing in popularity for some time now. Primarily this has been to do with users looking for better performance and UX on ad-laden sites, but in recent years an increased emphasis on privacy has also crept in, hence the possibility of analytics blocking.
Effect of ad blockers
Some ad blockers block web analytics platforms by default, others can be configured to do so. I tested Distilled’s site with Adblock Plus and uBlock Origin, two of the most popular ad-blocking desktop browser addons, but it’s worth noting that ad blockers are increasingly prevalent on smartphones, too.
Here’s how Distilled’s setups fared:
(All numbers shown are from April 2018)
| Setup | Vs. Adblock | Vs. Adblock with “EasyPrivacy” enabled | Vs. uBlock Origin |
| --- | --- | --- | --- |
| GTM | Pass | Fail | Fail |
| On page | Pass | Fail | Fail |
| GTM + renamed script & function | Pass | Fail | Fail |
| On page + renamed script & function | Pass | Fail | Fail |
Seems like those tweaked setups didn’t do much!
Lost data due to ad blockers: ~10%
Ad blocker usage can be in the 15–25% range depending on region, but many of these installs will be default setups of AdBlock Plus, which as we’ve seen above, does not block tracking. Estimates of AdBlock Plus’s market share among ad blockers vary from 50–70%, with more recent reports tending more towards the former. So, if we assume that at most 50% of installed ad blockers block analytics, that leaves your exposure at around 10%.
Reason 2: Browser “do not track”
This is another privacy motivated feature, this time of browsers themselves. You can enable it in the settings of most current browsers. It’s not compulsory for sites or platforms to obey the “do not track” request, but Firefox offers a stronger feature under the same set of options, which I decided to test as well.
Effect of “do not track”
Most browsers now offer the option to send a “Do not track” message. I tested the latest releases of Firefox & Chrome for Windows 10.
| Setup | Chrome “do not track” | Firefox “do not track” | Firefox “tracking protection” |
| --- | --- | --- | --- |
| GTM | Pass | Pass | Fail |
| On page | Pass | Pass | Fail |
| GTM + renamed script & function | Pass | Pass | Fail |
| On page + renamed script & function | Pass | Pass | Fail |
Again, it doesn’t seem that the tweaked setups are doing much work for us here.
Lost data due to “do not track”: <1%
Only Firefox Quantum’s “Tracking Protection,” introduced in February, had any effect on our trackers. Firefox has a 5% market share, but Tracking Protection is not enabled by default. The launch of this feature had no effect on the trend for Firefox traffic on Distilled.net.
Reason 3: Filters
It’s a bit of an obvious one, but filters you’ve set up in your analytics might intentionally or unintentionally reduce your reported traffic levels.
For example, a filter excluding certain niche screen resolutions that you believe to be mostly bots, or internal traffic, will obviously cause your setup to underreport slightly.
Lost data due to filters: ???
Impact is hard to estimate, as setup will obviously vary on a site-by-site basis. I do recommend having a duplicate, unfiltered “master” view in case you realize too late you’ve lost something you didn’t intend to.
Reason 4: GTM vs. on-page vs. misplaced on-page
Google Tag Manager has become an increasingly popular way of implementing analytics in recent years, due to its increased flexibility and the ease of making changes. However, I’ve long noticed that it can tend to underreport vs. on-page setups.
I was also curious about what would happen if you didn’t follow Google’s guidelines in setting up on-page code.
By combining my numbers with numbers from my colleague Dom Woodman’s site (you’re welcome for the link, Dom), which happens to use a Drupal analytics add-on as well as GTM, I was able to see the difference between Google Tag Manager and misplaced on-page code (right at the bottom of the <body> tag). I then weighted this against my own Google Tag Manager data to get an overall picture of all five setups.
Effect of GTM and misplaced on-page code
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Browser | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Chrome | 100.00% | 98.75% | 100.77% | 99.80% | 94.75% |
| Safari | 100.00% | 99.42% | 100.55% | 102.08% | 82.69% |
| Firefox | 100.00% | 99.71% | 101.16% | 101.45% | 90.68% |
| Internet Explorer | 100.00% | 80.06% | 112.31% | 113.37% | 77.18% |
There are a few main takeaways here:
* On-page code generally reports more traffic than GTM
* Modified code is generally within a margin of error, apart from modified GTM code on Internet Explorer (see note below)
* Misplaced analytics code will cost you up to a third of your traffic vs. properly implemented on-page code, depending on browser (!)
* The customized setups, which are designed to get more traffic by evading ad blockers, are doing nothing of the sort.
It’s worth noting also that the customized implementations actually got less traffic than the standard ones. For the on-page code, this is within the margin of error, but for Google Tag Manager, there’s another reason — because I used unfiltered profiles for the comparison, there’s a lot of bot spam in the main profile, which primarily masquerades as Internet Explorer. Our main profile is by far the most spammed, and also acting as the baseline here, so the difference between on-page code and Google Tag Manager is probably somewhat larger than what I’m reporting.
I also split the data by device type, out of curiosity:
Traffic as a percentage of baseline (standard Google Tag Manager implementation):
| Device | Google Tag Manager | Modified & Google Tag Manager | On-Page Code In <head> | Modified & On-Page Code In <head> | On-Page Code Misplaced In <Body> |
| --- | --- | --- | --- | --- | --- |
| Desktop | 100.00% | 98.31% | 100.97% | 100.89% | 93.47% |
| Mobile | 100.00% | 97.00% | 103.78% | 100.42% | 89.87% |
| Tablet | 100.00% | 97.68% | 104.20% | 102.43% | 88.13% |
The further takeaway here seems to be that mobile browsers, like Internet Explorer, can struggle with Google Tag Manager.
Lost data due to GTM: 1–5%
Google Tag Manager seems to cost you a varying amount depending on what make-up of browsers and devices use your site. On Distilled.net, the difference is around 1.7%; however, we have an unusually desktop-heavy and tech-savvy audience (not much Internet Explorer!). Depending on vertical, this could easily swell to the 5% range.
Lost data due to misplaced on-page code: ~10%
On Teflsearch.com, the impact of misplaced on-page code was around 7.5%, vs Google Tag Manager. Keeping in mind that Google Tag Manager itself underreports, the total loss could easily be in the 10% range.
Bonus round: Missing data from channels
I’ve focused above on areas where you might be missing data altogether. However, there are also lots of ways in which data can be misrepresented, or detail can be missing. I’ll cover these more briefly, but the main issues are dark traffic and attribution.
Dark traffic
Dark traffic is direct traffic that didn’t really come via direct — which is generally becoming more and more common. Typical causes are:
* Untagged campaigns in email
* Untagged campaigns in apps (especially Facebook, Twitter, etc.)
* Misrepresented organic
* Data sent from botched tracking implementations (which can also appear as self-referrals)
It’s also worth noting the trend towards genuinely direct traffic that would historically have been organic. For example, due to increasingly sophisticated browser autocompletes, cross-device history, and so on, people end up “typing” a URL that they’d have searched for historically.
Attribution
I’ve written about this in more detail here, but in general, a session in Google Analytics (and any other platform) is a fairly arbitrary construct — you might think it’s obvious how a group of hits should be grouped into one or more sessions, but in fact, the process relies on a number of fairly questionable assumptions. In particular, it’s worth noting that Google Analytics generally attributes direct traffic (including dark traffic) to the previous non-direct source, if one exists.
Discussion
I was quite surprised by some of my own findings when researching this post, but I’m sure I didn’t get everything. Can you think of any other ways in which data can end up missing from analytics?
See more at: https://ift.tt/2mXjlRS How Much Data Is Missing from Analytics? And Other Analytics Black Holes. See more at: https://ift.tt/2mb4VST to learn more about where to buy cheap wireless headphones. https://ift.tt/2GWKq1B
0 notes