Business intelligence is one of the most popular and fast-growing areas of software that can add substantial value to virtually all business processes in your company.
BI analytics software delivers actionable insights and empowers your organization with valuable data, which helps you improve your decision-making process. It assists you in determining possible future market trends, measuring your efficiency, and identifying potential growth areas.
Looker is a web-based service aimed to help business users pull useful insights from the data that their company generates. It was created in 2012 by a US startup, acquired by Google in 2019, and is now a part of Google Cloud.
There are various solutions on the market apart from Looker, like Redash, Tableau, or Power BI. Those tools may differ in aspects like:
Different tools can also feature different maturity levels, resulting in various sets of available visualizations, on-the-fly transformations, etc.
Importantly, while this article focuses on Looker, Google’s competitors offer products of a very similar nature. This makes most of the following analysis universal across all of those business intelligence tools.
One of our clients recently decided to replace Redash, an open-source service that, in their case, ran on a dedicated instance in the same cloud where their data was stored. They chose Looker, a modern, commercial, hosted counterpart of Redash.
The main reason our client started looking for a Redash replacement was to reduce the time necessary to produce a dashboard that contained all the data vital to their business, from formulating their business needs all the way to actually delivering that dashboard.
In Redash’s case, every tile of each dashboard is directly based on an SQL query. Creating a new tile or modifying an existing one means writing query code, which makes SQL competence a must-have and effectively limits the number of people who can prepare dashboards. The data team’s capacity therefore became the bottleneck.
Looker is a service that connects to existing databases, data warehouses, and data lakes. It still uses SQL in the background, but the difference is that there are intermediary layers between the data and the dashboards. The most important features of Looker are: “models and views,” “explores,” “calculations,” and the “drill-down.”
The data schema is mapped into Looker “models,” which group “views.” “Views” may be direct wrappers of the actual data tables, or transformed tables created to make business insight exploration more straightforward.
What the business users see in their interface are the Looker “explores”: pages with a list of fields that you can select, format, and visualize simply by clicking, dragging and dropping, picking colors from a palette, and so on. The only time you need a keyboard is when typing in a dashboard tile title.
If you want to push the exploration even further, it’s enough to define some filters, declare a couple of “calculations” (Excel-like column operations), or select one of the fields to be the pivot of a table.
The cherry-on-top feature for any business analyst is the “drill-down”—the ability to zoom into the details behind any statistics. You can do it by, for example, clicking on a bar of the chart representing revenues by month, which would result in a more fine-grained slicing or even the complete listing of all items sold that month.
All business intelligence tools promise their users that they’ll be able to focus on clicking through all the valuable information with no need for technical knowledge and with very little time needed to deliver presentable content.
That convinced our client to invest in the Looker license and start the migration. Now let’s check if Looker is keeping its promises in practice.
While the idea of Looker “explores” is indeed convenient for the analyst, the experience won’t actually be so painless if the “explores” themselves haven’t been well prepared beforehand.
Formatting values—using them first as filters, then as charts in dashboards—is simply impossible if they are incorrectly defined and linked to data behind the scenes.
Unfortunately, you should consider all the limitations and requirements from the very beginning. They make for a true wall of technical dependencies that need to be understood and solved before the dashboards can be as insightful as they are supposed to be in Looker and other similar tools.
Let me now outline the most bothersome, complicated aspects of Looker that will serve us as a critical guide to implementing almost any business intelligence analytics software in your company.
I’ve mentioned Looker “views” before. They are defined in LookML, a hybrid of YAML and JSON that uses additional special syntax, such as Liquid, to enable some advanced features.
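To give a flavor of that syntax, here is a minimal, hypothetical LookML “view” (the table and field names are invented for illustration):

```lookml
# A sketch of a "view" wrapping a raw orders table.
view: orders {
  sql_table_name: analytics.orders ;;

  dimension: id {
    primary_key: yes
    type: number
    sql: ${TABLE}.id ;;
  }

  # One declaration generates several date/time "dimensions".
  dimension_group: created {
    type: time
    timeframes: [date, month, year]
    sql: ${TABLE}.created_at ;;
  }
}
```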
This isn’t a technical article, so I won’t go much deeper into the implementation details and the code itself, but you need to be aware that there is a steep learning curve for the technical and data teams dealing with Looker for the first time.
And even once you know LookML, there is still loads of work to be done before a business user can reap the benefits.
Some of the work is actually automated by the tool itself. For example, the “views” are generated automatically for the existing tables that Looker discovers for you.
This is crucial, because if your tables have several hundred fields, then the code for the corresponding Looker “dimensions,” which populate Looker “views,” can be pre-created for you.
And since those “dimensions” are translated into the ones you see in a Looker “explore,” you might think that the backend work is done, but that’s hardly the case. First, a developer needs to check if all the “dimensions” have the correct Looker data type and if it matches the data in the data source.
In my experience, data lakes in particular can mean a lot of such work, since their schemas tend to be less strictly defined. Changing this manually for hundreds of fields is really time-consuming.
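As an illustration of such a fix (assuming a BigQuery-backed data lake; the field names are invented), a developer might correct an auto-generated dimension like this:

```lookml
# Looker's generator guessed "string" from the loosely-typed source:
# dimension: amount { type: string sql: ${TABLE}.amount ;; }

# Corrected by hand to a properly typed, formatted number.
dimension: amount {
  type: number
  value_format_name: usd
  sql: CAST(${TABLE}.amount AS FLOAT64) ;;
}
```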
Even with all the “dimensions” defined correctly, the business user would still have very few insights to mine from. SQL can perform transformations such as joins, unions, and aggregations.
Replacing SQL will require translating those concepts to LookML.
Even though it is possible to define each such transformation with traditional SQL in Looker, it is not recommended: maintaining raw SQL code and keeping it reusable across different dashboards can prove complicated. And while “joins” can be defined in Looker “models,” table unions don’t exist yet and require some special hacks depending on the use case.
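For completeness, a “join” defined in a model file is quite compact; here is a hedged sketch with invented view names:

```lookml
# An "explore" exposing orders joined with their customers.
explore: orders {
  join: customers {
    type: left_outer
    relationship: many_to_one
    sql_on: ${orders.customer_id} = ${customers.id} ;;
  }
}
```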
Table aggregations are handled by the Looker “measures.” This vocabulary needs to be learned by the business users, as “measures” are present on the same list of fields as “dimensions” in “explores,” but they relate to aggregation functions, rather than the raw, or transformed, fields present in the original table.
For example, `amount` may be a field representing the money paid for a given order, and `sum_amount` may be a “measure” of the total amount of money paid for all orders by date.
`SELECT date, amount FROM orders` is not the same as `SELECT date, sum(amount) FROM orders GROUP BY date`, after all.
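In LookML, that distinction between the raw field and its aggregation might be expressed roughly as follows (a sketch reusing the names above):

```lookml
# The raw field, selectable as-is in an "explore".
dimension: amount {
  type: number
  sql: ${TABLE}.amount ;;
}

# The aggregation; Looker adds the GROUP BY based on
# whichever "dimensions" the user selects alongside it.
measure: sum_amount {
  type: sum
  sql: ${amount} ;;
}
```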
Looker automatically translates the selected “dimensions” and “measures” into SQL sent to the database, which is very useful. But if an analyst needs to know the ratio of the day’s total purchase to the month’s total, it won’t be there to click and select from the basic Looker “view” generated automatically.
That fact alone debunks the promise of bypassing the data team’s bottleneck.
Creating custom “dimensions” directly in the “explore” is less of a problem with Looker’s “calculations,” but first, you’ll need to know yet another special Looker syntax to define those “calculations.” Second, as those transformations are not performed on the database side, using them can negatively affect the performance of the Looker instance.
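For reference, “calculations” are written in Looker’s own expression language rather than in LookML or SQL; a ratio of each row’s total to the overall total might look something like this (the field names are invented):

```
${orders.sum_amount} / sum(${orders.sum_amount})
```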
When using filters and pivots in “explores,” you’ll soon learn that only some logical conditions can be expressed in the GUI. I’ve personally found myself unable to easily select items from a window in the past, for example, between five and three months ago.
Pivots are operations performed on the client’s side, so having too many of them can make your dashboard take a noticeably long time to refresh.
Looker’s response to these performance issues, besides its handy caching mechanism, is to persist calculations as physical tables in the data source. This is packaged as a feature the platform performs automatically for you, but it boils down to the same thing as running the respective Extract-Transform-Load (ETL) mechanisms directly on the underlying data platform.
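This mechanism is known as a persistent derived table (PDT); a hedged sketch of one, with an invented query and a simple daily rebuild trigger, might look like:

```lookml
view: daily_revenue {
  derived_table: {
    sql:
      SELECT created_date, SUM(amount) AS total
      FROM analytics.orders
      GROUP BY created_date ;;
    # Rebuild the physical table whenever this value changes,
    # i.e. once a day.
    sql_trigger_value: SELECT CURRENT_DATE() ;;
  }

  dimension: created_date {
    type: date
    sql: ${TABLE}.created_date ;;
  }

  measure: total {
    type: sum
    sql: ${TABLE}.total ;;
  }
}
```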
If you don’t have a data engineer on board, that functionality might look attractive. But in order to set everything up correctly, with schedules and triggers that perform best for your data, some data engineering skills are already required.
You might end up with a conclusion similar to mine: building some of those aggregation pipelines into the existing infrastructure might make more sense than doing them within Looker.
Finally, the drill-down functionality doesn’t come free, either. The fields that you want to see when digging deeper into the data need to be explicitly defined and can’t be changed by the end user.
The data presented that way can’t be sorted or filtered, but you may click again to start a new exploration from that detailed view. The default drill-down Looker behavior is also limited and can prove unusable for “views” using some custom metrics—so for most of them, in reality.
In such cases, Looker has a solution: even more code.
Developers can override the built-in functionality with custom HTML enhanced with Liquid conditional syntax and a panel of system variables that you might reference.
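As a rough sketch of what that looks like, a custom drill link on a dimension can combine the `html` parameter with Liquid (the URL and fields below are invented):

```lookml
dimension: order_id {
  type: number
  sql: ${TABLE}.id ;;
  # Render the value as a link into a more detailed "explore",
  # but only when a value is actually present.
  html:
    {% if value != blank %}
      <a href="/explore/shop/order_items?f[orders.id]={{ value }}">{{ rendered_value }}</a>
    {% else %}
      {{ rendered_value }}
    {% endif %} ;;
}
```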
But this also means more time to deliver meaningful components for the business user. Here, Looker also requires some more technical competencies from the tech and data team than just writing SQL queries.
Speaking of those big promises made to the business user, I’d like to mention that they might cause additional frustration for both those users and your developers.
The former might feel lost in all the Looker-related nomenclature, because nothing comes easy with the first set of basic “views”: especially when those map 1-to-1 to the relational database, knowing the schema of the tables is still required.
The latter will be challenged with learning all the functionalities of this gigantic BI tool. The dev users will also feel pressure to define all the perfect “measures,” pre-calculations, and pre-formattings, so that the data is understandable and efficiently processed from the very beginning.
There is a huge difference between presenting several tables with hundreds of fields each, exactly as they exist in the relational database, and grouping them into logical lots, giving them more human-readable labels, and making the most commonly used fields easier to reach.
On the bright side, there are very few things that are impossible with Looker. You get numerous different parameters to learn and use in your projects. If you don’t find the solution in the concepts that Looker introduces, you may fall back to the custom SQL/HTML/Liquid code. And eventually, you will have to.
Developers can also use a built-in IDE and integrations with a git-based code versioning service. Some operations, like creating git branches, committing code, and pushing to and pulling from the master branch, can be performed directly in Looker.
Pull requests and code reviews can be made mandatory but would, naturally, need to be handled on the git repository service provider’s side. More complex git operations, like rebases, would require a command line intervention.
Note that there is documentation and a community forum to help you get started and going. And should you have any questions, their support chat is pretty responsive, as well.
Looker is one of the most popular business intelligence tools. Some of its popularity is certainly due to the Google brand behind it, but the promise of having quick access to valuable insights about your business is very powerful, too.
However, as with most technologies, there is a huge technical foundation that makes it possible. The choice of a solution for your business will surely affect the cost and the set of functionalities you get for it.
But as the underlying system for data storage doesn’t change, the actual engineering work responsible for accessing and processing that data will always be required.
In Looker and similar BI solutions, the advanced data exploration functionalities promised to business users with next to no dev skills come at the cost of having someone else perform those dev tasks.
And since an intermediary layer of abstraction is introduced there, mapping the data and its schema into that layer and related concepts requires an extra initial dev effort.
Ideally, this extra entry-point work should pay off later on, when the project is mature enough for the business users to really benefit from the freedom of data exploration.
But even then, every new data source and every new custom metric would still need to be introduced in the backend code, and the most popular data transformations on the client’s side might need to be implemented as ETLs for performance optimization reasons.
Should you decide to take up Looker, be aware that it’s not easy for the technical team to prepare it so that it’s flexible, maintainable, and efficient, and that there is a considerable entry cost for both data teams and business analysts to pay.
Thank you for reading my article on Looker. I hope it will help you draw some valuable conclusions on choosing the best BI analytics software for your business.
In the end, the choice of the most suitable solution comes down to your individual needs. You should carefully look at the product features to check if the tool aligns with your business requirements.
At STX Next, we have many experienced data scientists and data engineers who will be more than happy to help you make that choice and guide you through the decision-making process.
Our data engineering services are bespoke and include data gathering, storing, processing, access management, and visualization. The result will help you transform your business into a data-driven company and allow you to make better-informed decisions that will eventually boost your growth and give you a competitive advantage in the long run.
If you’re interested in learning more about data science, data engineering, or business intelligence tools in general, here are a few resources that can help:
Do you have any more questions about business intelligence tools and big data analytics software? Or maybe you need some help developing an intelligence platform of your own? If so, our expert teams of software engineering professionals will be more than happy to support you. Reach out to us and get all the answers you seek!