Query logic (Google Analytics data)

Structure of the Google Analytics Query logic

For data from Google Analytics data source the Query logic has the following structure, you have an array of one or more request with one or more queries in them. For each query you need to specify parameters separately. Each query has to include the required parameters: dateRanges and metrics. The use of multiple requests and queries is specified in the next part. The main query parameters are described in detail below. Here is a full example of the simplest Query logic content.

"requests": [{
    "id": "sessions",
    "queries": [{
      "dateRanges": [{
        "startDate": "yesterday",
        "endDate": "yesterday"
      }],
      "metrics": [{
        "expression": "ga:sessions"
      }]
   }]
}]
Waaila tip:
While the order of the individual components is not important, it is good to follow a consistent structure for better comparison of two queries and clearer overview in case of any misspecification.

Use of multiple requests and queries

While there are powerful tests that require only a single query, often you need to compare data that cannot be fitted within a single table. For example, if you need to determine the share of data that satisfy a certain filter or if you want to compare multiple separate conditions. To receive a result with multiple tables of data, you can either put more queries into one request or specify multiple requests.

Multiple queries within one request have easier structure of results and thus easier further manipulation. They are practical for example for comparison of data with and without a filter. However, they do not allow between-query output reference and the dateRange parameter has to be the same in all queries within one request.

Multiple requests allow more complex structure. Using multiple requests, you can use part of the results in a filter of another query. For example, you can filter the transactions to analyse them only for the most visited hostname. To perform this, you first need to get the hostnames sorted by sessions, extract the hostname with maximum sessions and then use the extracted hostname in a filter of a query with transactions.

ID of request

Each request has its ID in the id parameter which serves for identifying the results. You select the ID for the request yourself to be able to easier extract the results. It is recommended to be a short word connected to the content of the queries within the request. For example, if all queries within the request are connected to sessions metric, you can assign to the id parameter the name "sessions". Alternatively, if you search in your first request for the combination of dimensions with the highest value of some metric (e.g. hostname with highest number of sessions) and in your second request you filter by this combination (e.g. to find the number of transactions for this hostname), you can use for the first id parameter "aggregated" and for the second one "filtered", such as in the following structure.

"requests": [{
    "id": "aggregated",
    "queries": [{}]
   }, {
    "id": "filtered",
    "queries": [{}]
   }]
}]

To use the results of a request, you then subset the results with the selected id parameter: results['aggregated'] and results['filtered']. While it is not obligatory to include the id parameter if you have only one request, it is still recommended for better clarity when referencing the data.

Date range

First you need to specify the dates range for which you require data. The dateRanges consists of an array with at least one object which contains a combination of date thresholds called startDate and endDate. In particular, the parameter startDate is the first date to include data for and the endDate is the last date to include data for. The dateRanges is a required parameter for the query.

There are 3 options for the format of the parameters:

  1. "today" or "yesterday" = to start or end with data from yesterday
  2. "[N]daysAgo" where instead of [N] you fill in a positive whole number
  3. Full date in format "YYYY-MM-DD", for example "2020-06-01"

A full example of the dateRanges loading data for last 7 days preceding the current date:

"dateRanges": [{
        "startDate": "7daysAgo",
        "endDate": "yesterday"
      }
]
Note:
To compare data from same queries within different time periods, you can use different sets of startDate-endDate combinations as the input of the dateRanges parameters. This allows you to compare two subsequent periods or to compare a shorter vs. longer time period (e.g. how the performance differed between last 7 days compared to last 28 days).
"dateRanges": [{
        "startDate": "7daysAgo",
        "endDate": "yesterday"
      },{
        "startDate": "28daysAgo",
        "endDate": "yesterday"
      }
]

Metrics

The second and only other required parameter is the field for metrics, individual measurements of user activity on your property, such as sessions and pageviews, that can be extracted from your Google Analytics data. When a query contains only dateRanges and metrics, the resulting data is the aggregation of the values for all selected days.

There can be one or more metrics expression, specified as array of objects. To include a metric, select an existing name of a Google Analytics metric and input it as the value for the "expression" key. The names can be searched and browsed in the official Dimensions and metrics explorer. An example of the metrics array follows:

"metrics": [{
        "expression": "ga:sessions"
      },{
        "expression": "ga:users"
      }
]
Waaila tip:
In case you want to perform a simple operation on the metrics, you can already use it as the value for "expression" key in the query. There is a limit on the operations allowed (Plus (+), Minus (-), Negation (Unary -), Divided by (/), Multiplied by (*), Parenthesis, Positive cardinal numbers (0-9), can include decimals) and on the total number of characters (up to 1024 characters). The following query asks for data with sessions and session to users ratio. Both for a single metrics name and for an operation on metrics, you can specify what the resulting name of the column will be by using the key "alias". With expressions calculated from metrics it can also be useful to specify the format of the output as the default is integer which may not be so informative in case of ratios.
"metrics": [{
        "expression": "ga:sessions",
        "alias": "Sessions",
      },{
        "expression": "ga:sessions/ga:users",
        "alias": "Sessions per User ratio",
        "formattingType": "FLOAT"
      }
]

Dimensions

To extract more than just the aggregate values for the selected metrics, you need to specify dimensions. When you include the dimensions parameter, there can be one or more dimensions names within, specified as array of objects. To include a dimension, select an existing name of a Google Analytics dimension and input it as the value for the "name" key. The names can be searched and browsed in the official Dimensions and metrics explorer. An example of the dimensions array follows:

"dimensions": [{
        "name": "ga:date"
      },{
        "name": "ga:hostname"
      }
]

Ordering

To order the requested data, you can specify the OrderBys parameter. It is an array of objects consisting of the following parameters:

  1. fieldName = name of one of your metrics or dimensions
  2. orderType = how is the order determined (default is "VALUE", see Tip 2)
  3. sortOrder = either "ASCENDING" (from smallest to largest) or "DESCENDING" (from largest to smallest)

The following example shows ordering by sessions from highest to lowest amount:

"orderBys": [{
        "fieldName": "ga:sessions",
        "orderType": "VALUE",
        "sortOrder": "DESCENDING"
      }
]
Waaila tip:
If you only need to access the maximum value of a metric, you can use the aggregate information of the request results and you don’t need to order the values. However, if you need to learn the dimensions’ values which correspond to the maximum value of a metric, it is easy to order the values descending according to this metric and use the dimensions’ and metrics’ values in the first row of the results.
Waaila tip:
If you want to order the data by a dimension which is a string but it has the same format as integer (that is it does not have leading zeros), for example product ID or group ID, it is possible to set the OrderType to "DIMENSION_AS_INTEGER". This option will order the data as if it would be a numeric value, for example:
["5", "211", "333", "10101"],
whereas setting the OrderType to "VALUE" would treat it as a string variable and would order the values alphabetically, for example:
["10101", "211", "333", "5"].
Note:
If you want to order by a metric for which you have specified a name using the alias parameter, you need to use the new name in the filedName parameter instead of the original Google Analytics name.

Filters’ options

One-line filter expression

For simpler filters it is easier to use the filtersExpression parameter. It takes as its value a string with potentially more filters combined together. Below the basics are summarized. If you need a detailed documentation, see the Google Analytics guide to filters.

The basic syntax for a single filter is:

ga:[name] [operator] [expression]

where [name] can be a name of a dimension or a metric. For metrics the operator means a comparison of the metric value and so it allows numerical comparison, such as == (equals to), != (does not equal to), > (greater than), < (smaller than) and combinations. For dimensions there are several options based on matching of the value: == (exact match), != (does not match), =@ or !=@ (contains or does not contain substring) and =~ or !~ (match or does not match regular expression).

To combine multiple filters, use

  • a comma character "," as an OR operator for including filters where at least one needs to be satisfied
  • a semicolon character ";" as an AND operator for including filters where all filters need to be satisfied

As an example of combining multiple filters, below is a filter including all groups of data which have one of the specified browsers and at the same time have at least 10 sessions.

"filtersExpression": "ga:browser==Firefox,ga:browser==Chrome;ga:sessions>=10"
Note:
An OR operator has precedence over the AND operator – that means that first all the OR operators are evaluated and then the AND operators are evaluated. Furthermore, the OR operator cannot be used on metrics and dimensions in the same expression.

Specialized filters

To include more specific filters you use the dimensionFilterClauses or metricFilterClauses which allow you to specify a more complicated expressions in a more readable way or help you when you reach the limit for characters in the filtersExpression. These filters are specified for the dimensions and metrics separately as the dimensionFilterClauses and metricFilterClauses parameters. In each parameter you can have multiple filters combined either by AND or by OR operator.

Waaila tip:
The dimensionFilterClauses parameter can be used to filter a dimension based on a list, by specifying the filter operator parameter as "IN_LIST" and inputting into the expressions list of possible values. This can be used to check measured hostnames against a list of expected hostnames: if you want to query data with hostnames within the list, leave the parameter "not" at default value "false"; if you want to query data not in the specified list, change the parameter "not" to "true".
"dimensionFilterClauses": [{
        "filters": [{
            "dimensionName": "ga:hostname",
            "not": false,
            "operator": "IN_LIST",
            "expressions": [
                "www.example.com",
                "www.another_example.com"
            ]
            }]
        }]
Waaila tip:
To keep the values of a metric that are within a given range, you can use the metricFilterClases with two filters and the operator parameters equal to "AND". There are only few types of metric filters (as specified by the operator parameter within the filter): "EQUAL", "GREATER_THEN", "LESS_THEN" and "IS_MISSING". If you want to filter values less or equal to a value, you can use the combination of the operator "GREATER_THAN" and the parameter "not" equal to "false". For example the filter below extracts the data where sessions are greater than zero but not greater than 1000 (i.e. includes data where sessions are 1000).
"metricFilterClauses": [{
        "operator": "AND",
        "filters": [{
            "metricName": "ga:session",
            "not": false,
            "operator": "GREATER_THAN",
            "comparisonValue": 0
            }, {
            "metricName": "ga:session",
            "not": true,
            "operator": "GREATER_THAN",
            "comparisonValue": 1000
            }]
        }]