Using Google Analytics to Get Insight into Your Users' Preferred Languages

Feb 12th 2019 • Google Analytics • Photo by Jon Tyson on Unsplash

Google Analytics provides a lot of data that remains undiscovered by many users. Even if some are aware, it requires a fair amount of technical work to make it actionable. Language data is one of those datasets.

We set out to answer the question "What languages do our users speak?" to drive a handful of actions for our users:

  1. Decisions around Translations
  2. Decisions around Voice and Tone
  3. Monitoring Reach

Decisions around Translations

Whether your business is a startup, a local mom-and-pop shop, or a bustling enterprise, it's critical to understand the reach and revenue opportunity that investing in translations could open up. Here in Burlington, for example, businesses on our ever-delightful and always-bustling Church Street cater to Vermonters and Québécois alike, requiring (if they're savvy!) translations for those French-speaking Canadians coming down for the weekend.

"But wait," busy business owners exclaim, "Google Translate automatically translates our website for nearly any language!" That might be true, but once a meaningful percentage of your audience is translating your website into a specific language, it's time to control your message rather than leave translations up to the technology gods. Further, text within images and other complexities mean Google Translate might not be perfect (go try to buy tickets to a show at Blue Note in Tokyo, oof).

There's no doubt translations are a hassle, but even if you're a small business without a development team, services like Squarespace offer a solution and Fiverr offers the talent to get it done inexpensively.

Decisions around Voice and Tone

So you decided investing in translations is too expensive or you don't quite have the audience in a particular language to warrant the level of effort involved in yet-another-thing for every small tweak in your copy. That's great! You made a data-driven decision, and we couldn't be happier.

But, just because you decided to stick with English doesn't mean you're free and clear. For the small percentage of traffic from other dialects and languages, will they understand your pop culture references? Your puns and memes? Not only may they not translate correctly, they may be a miss for your English-speaking audience from a different dialect (not to mention age, but that's a topic for another day!).

Monitoring Reach

If you're in a diverse area, you may intentionally try to reach a specific audience, whether you've invested in translations or not. Those northern Vermont businesses we mentioned earlier – do they have pages that target our Montreal friends? If so, monitoring growth in a particular language code may serve to complement the geographic data you're (hopefully!) already using to validate your efforts.

Another example: you're a business in New York City, a city over eight million residents strong, with over 100 languages spoken just in the school system and many more overall. Measuring your audience by geography is a great first step, but if your business focuses on the New York City area anyway, language provides another level of segmentation to ensure you're maximizing your audience by spending your marketing dollars most efficiently.

Insights from Bilinguists

If you mistake consistency for accuracy in web analytics, you're doing it wrong, and the language data supplied by browsers is no exception. We spoke to a handful of bilingual folks and found that although their first (and often preferred) language was not English, they set their browsers to English because the vast majority of the internet is in English.

Applying this learning to your business decisions, you can reasonably assume that a small percentage of users accessing your website with a non-English language code is indicative of a greater number of users who browse in English but would prefer other options. Give them an in-house, controlled translation, and you've taken one significant step towards a downright delightful user experience.

Nitty Gritty Google Analytics Details

We'd love to say Google Analytics gets you 90% of the way to building actionable recommendations for the above, but unfortunately that's just not the case. Upon navigating to the Google Analytics language report (Audience > Geo > Language), you, likely someone just not that interested in language codes, are greeted with a table of them, almost certainly including some kind of noise.

Ah yes, es-es.

Behind the scenes, Tell Analytics automates common analytical steps to clean and process the data into meaningful insight. We do this via three main steps:

  1. Split the Language from the Dialect
  2. Remove the Noise
  3. Provide Insight

Split the Language from the Dialect

First and foremost, a single language code comes in two dash-separated parts: first, the language, and second, the dialect. In most of the datasets we've worked with, it's neither immediately apparent nor easy to determine how many of your users speak English (or any other language), because several rows in the table are attributed to that language (en-us and en-gb, for example). By aggregating our metrics by the first half of the language code, we can start to better understand key languages independent of their dialect.
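
As a rough illustration of that aggregation (a minimal sketch, not Tell Analytics' actual implementation), the Python below sums sessions by the part of the code before the dash. The function names and the sample rows are made up for this example; in practice the rows would come from your own export of the language report.

```python
from collections import defaultdict


def split_language_code(code):
    """Split 'en-us' into ('en', 'us'); a code with no dash gets a dialect of None."""
    language, _, dialect = code.strip().lower().partition("-")
    return language, (dialect or None)


def sessions_by_language(rows):
    """Sum sessions per base language from (language_code, sessions) pairs."""
    totals = defaultdict(int)
    for code, sessions in rows:
        language, _ = split_language_code(code)
        totals[language] += sessions
    return dict(totals)


# Hypothetical rows standing in for an export of the language report.
sample_rows = [
    ("en-us", 1200),
    ("en-gb", 300),
    ("es-es", 90),
    ("es-mx", 60),
    ("fr", 45),
]

language_totals = sessions_by_language(sample_rows)
print(language_totals)
```

With these sample rows, the two English rows collapse into one total and the two Spanish rows into another, which is the view most stakeholders actually want first.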

Remove the Noise

Blank language codes? c? posix? All kinds of bots access websites every day, creating all sorts of strange data. We strip out language codes attributed to automation and bots to focus on the known languages and dialects set by users. As anyone who's tried to maintain the ever-elusive perfectly filtered dataset knows, try as you might, you'll always need another filter! And once that filter is added, do all your analytics users know what date range does or does not need c filtered out? Sure, you can add annotations, but do people really use those? What about your data scientists using Reporting API calls?
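
One way to keep that filter consistent for everyone, including people pulling raw data, is to apply it in code at analysis time rather than per report. The Python below is a hedged sketch of the idea: the denylist holds only the examples named above (blank, c, posix), and the loose shape check is our own assumption about what a browser-set code looks like, so expect to tune both for your data.

```python
import re

# Codes called out above as bot noise. This set is an assumption seeded from
# those examples; extend it as new junk values show up in your own data.
NOISE_CODES = {"", "c", "posix"}

# Loose shape check (an assumption, not a full standards validator):
# a 2-3 letter language, then optional dash-separated subtags.
LANGUAGE_CODE_SHAPE = re.compile(r"^[a-z]{2,3}(-[a-z0-9]{2,8})*$")


def is_user_language(code):
    """Return True when a code looks like a language set by a real browser."""
    normalized = (code or "").strip().lower()
    if normalized in NOISE_CODES:
        return False
    return LANGUAGE_CODE_SHAPE.match(normalized) is not None


def remove_noise(rows):
    """Keep only (language_code, sessions) rows with a plausible language code."""
    return [(code, sessions) for code, sessions in rows if is_user_language(code)]


# Hypothetical rows, including the kinds of noise described above.
sample_rows = [
    ("en-us", 1200),
    ("", 40),
    ("c", 25),
    ("posix", 3),
    ("es-mx", 60),
    ("zh-hans-cn", 5),
]

clean_rows = remove_noise(sample_rows)
print(clean_rows)
```

Because the filter runs over whatever rows you hand it, it applies to every date range equally, which sidesteps the question of when a given filter was added.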

Provide Insight

Finally, no analytics stakeholder wants to decipher dashboards suggesting en-us is their top language, or that es-mx is trending up. Behind the scenes, Tell Analytics maps language codes to their readable counterparts, so we can say with confidence that your website serves a primarily English-speaking audience, but Mexican Spanish should be on your radar.
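
To give a feel for that mapping, here is a small Python sketch. It is illustrative only: the two lookup tables are tiny, hand-written assumptions rather than the mapping Tell Analytics uses, and a real version would need a much fuller list of languages and regions.

```python
# Tiny, hand-written lookup tables for illustration only (assumptions).
LANGUAGE_NAMES = {"en": "English", "es": "Spanish", "fr": "French"}
REGION_NAMES = {
    "us": "United States",
    "gb": "United Kingdom",
    "mx": "Mexico",
    "es": "Spain",
    "ca": "Canada",
}


def readable_language(code):
    """Turn a code like 'es-mx' into a label like 'Spanish (Mexico)'."""
    language, _, region = code.strip().lower().partition("-")
    language_name = LANGUAGE_NAMES.get(language)
    if language_name is None:
        # Unknown language: fall back to the raw code rather than guessing.
        return code
    region_name = REGION_NAMES.get(region)
    if region_name is None:
        return language_name
    return f"{language_name} ({region_name})"


for code in ["en-us", "es-mx", "fr", "xx-yy"]:
    print(code, "->", readable_language(code))
```

Falling back to the raw code for anything unrecognized keeps unknown values visible in a report instead of silently mislabeling them.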

Wrapping Up

One key to successful data-driven decisions is running an analysis consistently. But small businesses are often staffed by individuals who wear many hats, and enterprises often can't dedicate the time to automate all the things. One quick click within Tell Analytics can provide a robust, consistently built, and readable answer to the questions you need to drive action.

For Tell Analytics, this is all in a day's work: worrying about the details and then automating and scaling enterprise-quality analysis for anyone with a stake in their digital presence.

We specialize in making data actionable.

© 2020 Tell Analytics, LLC