Query logic of the test (AT Internet data)

The Query logic which is located on the Test editor page allows you to formulate you requests for data. There are two types of the Query logic for AT Internet data based on the version of the Data Query which Waaila is connected to. For tests on data from version 2 Data Query on AT Internet, the Query logic is just a line for inserting a link to data located above the Test logic field. For tests on data from version 3 Data Query on AT Internet, the Query logic contains a JSON-structured query and additionally there is a button to insert the query as a link. These two types are described more in detail below.

Query logic for AT Internet Data Query version 3

For version 3 Data Query the data queried can have two versions: a JSON query or a URL link. Both can be copied from the AT Internet Data Query Designer using a button "Send to" which is located above right corner of the data table. As the beginning of the link (https://api.atinternet.io/v3/beta/cdf/data/getData?) is universal, you have two options (both approaches are equivalent):

  • copy the full link and insert it using the button "Insert URL" above Query logic,
  • extract and insert only the parameters in JSON query.

The obligatory parameters in the query are space, period, columns and sort, along with the max-results and page-num parameters. The first three parameters ( space, period and columns) define which data should be obtained while the rest (sort, max-results and page-num) defines what part of the data is displayed.

space

The space parameter specifies the exact site you are querying data from. This corresponds to the dataset code that you inserted when creating a dataset. There can be multiple spaces listed within the parameter.

"space": {
    "s": [
      000000,
      000001
    ]
  }
Waaila tip:
When using AT Internet data you are not limited to the use of the pre-specified dataset. You have the option to use any data that the account has access to by specifying that in the space parameter. However, if you want to use the dataset code directly, you can use a placeholder [code] instead of the actual number. This is recommended for tests saved to the library as otherwise the query would load the same data in any dataset irrespective of the dataset code.
"space": {
    "s": "[code]"
  }

period

In the period parameter you can specify one or two data periods which you want to receive data for. Using the period parameter you can select both the granularity of the data as well as the time location. You have two options for the period

  • fixed (specifying exact date or dates)
  • relative to the current period (for example 8 days ago, last month or this year)

There are multiple granularity options: Y: year, Q: quarter, M: month, W: week, D: day. The most common granularity is daily (D). To query a specific fixed day, you set the type parameter as the granularity and use the date in format YYYY-MM-DD for both start and end parameters. You can query data for multiple days by stating the start and end of the selected period or for a single day when the start and end dates equal.

"period": {
  "p1": [
    {
      "type": "D",
      "start": "2020-09-01",
      "end": "2020-09-07"
    }
  ]
}

To query data relative (R) to current period, you need to mark the period as relative and specify the granularity and the size of the setback, i. e. how many periods back. For relative time period, you always select data only from a single day/ (calendar) week/ (calendar) month (unless you use a comparison with further periods as described in the next paragraph). For example, data from yesterday can be obtained using:

"period": {
    "p1": [
      {
        "type": "R",
        "granularity": "D",
        "offset": -1
      }
    ]
  }
Note:
You can select data from this period (i.e. today, current week, month or year) by setting the offset parameter to 0. However, data are typically available with a slight delay. When no data are available for the selected period, the query will fail with an error. This can happen both for today and for current month on the first of the month.

Multiple periods can be used to compare data with previous periods, for example you can evaluate how visits have changed for different domains yesterday compared to a week before yesterday. This comparison is especially relevant when data has weekly cycles and weekdays have some regular patterns. When multiple periods are specified, all metrics must refer to one of the periods (metrics can be included multiple times, each referring to a different period). The following example shows the full query for comparison of visits on different domains from yesterday with data from 8 days ago. The period p1 is interpreted as the main period and period p2 as the comparison (earlier period).

{
  "columns": ["site_domain", "p1.m_visits", "p2.m_visits"],
  "sort": ["-p1.m_visits"],
  "space": {
    "s": ["[code]"]
  },
  "period": {
    "p1": [
      {
        "type": "R",
        "granularity": "D",
        "offset": -1
      }
    ],
    "p2": [
      {
        "type": "R",
        "granularity": "D",
        "offset": -8
      }
    ]
  },
  "max-results": 50,
  "page-num": 1
}
Note:
When using the designer to generate a query, note that it always creates a query with fixed dates. Even if you select to display data for "yesterday", the generated query sets the period for yesterday's date. If you want to query a relative period you need to follow the guidelines above to set the relative period yourself.

columns

In the columns parameter all included metrics and dimensions are listed. Metrics' names start with prefix m_ : e.g. m_visits, m_events, m_unique_visitors, m_page_loads, m_transactions, while dimensions' names do not need to start with any prefix: src (which stands for source of traffic), date, page, site_domain (which stands for hostname). You can search the dimensions and metrics in the Data Query designer, https://dataquery.atinternet-solutions.com/#/.

When you include multiple time periods, you need to prefix each metric by one of the periods (i.e. "p1.m_events"). Additionally, you can include for each metric calculated columns with difference between periods and variation between periods. Difference is calculated as [metric value in period p1] - [metric value in period p2] and shows by how much has the metric changed from comparison period p2 to main period p1 (if negative, it has decreased). The variation is calculated as ([metric value in period p1] - [metric value in period p2]) / [metric value in period p2]. Therefore, the variation shows by how much has the metric changed from comparison period p2 to main period p1 relative to the size in period p2 (again, if negative, it has decreased). When creating the query in the designer, the calculated metrics are included by enabling comparison in the periods setting. When writing the query, you need to add the entry for given column to the calculatedColumns for every difference or variation added to columns. Below is an example of events comparison for different groups of traffic sources (like "Direct traffic", "Paid marketing" and other).

"columns": [
    "src",
    "p1.m_events",
    "p2.m_events",
    "dif_p2_p1_m_events",
    "var_p2_p1_m_events"
  ],
"calculatedColumns": {
    "metrics": [
      {
        "key": "dif_p2_p1_m_events",
        "op": "dif",
        "values": [
          "p2.m_events",
          "p1.m_events"
        ]
      },
      {
        "key": "var_p2_p1_m_events",
        "op": "var",
        "values": [
          "p2.m_events",
          "p1.m_events"
        ]
      }
    ]
  }
Note:
The variation as provided by AT Internet data has a slight disadvantage that it is both equal to zero when there has been no change between period p1 and p2 and when the value in period p2 was zero (and the variation could not be calculated). To avoid this, you can either exclude and treat differently rows with zero value in period p2 or calculate the variation as the difference divided by the value in period p2 (which gives the value of infinity for cases with zero value in period p2).

sort

The parameter sort is obligatory. It needs to contain a name of at least one of the included metrics. Sorting can be set by one or multiple columns. To sort, you list the names of the columns to sort by: "sort": [ "m_visits", "m_page_loads" ]. For descending sorting a minus sign is written before the column name: "sort": [ "-m_visits" ]. If you included multiple time periods, the column name needs to specify which period is the metric for, same as the columns parameter: "sort": [ "p1.m_visits" ].

max-results and page-num

The max-results and page-num parameters effectively separate the queried data into bins to be queried or displayed. The parameter max-results sets the maximum number of rows of data to be obtained as a results of one query run. It can be set to an integer between 1 and 10,000 (including both limits). If you need to query more data, you need to use the page-num parameter which allows you to obtain following rows. For example when querying long set of data, for values "max-results": 10000, "page-num": 1 you receive rows 1 to 10,000, while for values "max-results": 10000, "page-num": 2 you receive the following rows 10,001 to 20,000. The parameter page-num depends on the value for max-results, there is a total maximum of 200,000 rows that can be received (so for max-results 10,000, the page-num can be at most 20).

filter

In the optional filter parameter, there are all the filters on both dimensions and metrics included. Filters for multiple variables have always the 'AND' relation, that means that all of them must be satisfied for each included row.

For example, when comparing events for different groups of traffic sources (like "Direct traffic", "Paid marketing" and other) you can filter out traffic that comes from webmails by requiring that the property groups of traffic sources ("src") is not equal to "Webmails" ("$neq" stands for "not equal" operator). At the same time, the filter below includes only rows with at least 1000 events ("$gte" stands for "greater than or equal" operator). Multiple filters for properties or metrics can be wrapped in "$AND" or "$OR" objects.

 "filter": {
    "property": {
      "src": {
        "$neq": "Webmails"
      }
    },
    "metric": {
      "p1.m_events": {
        "$gte": 1000
      }
    }
  }

More about filters for v3 ATI query can be found on https://developers.atinternet-solutions.com/api-documentation/v3/#filter.

Query logic for AT Internet Data Query version 2

For version 2 Data Query the data are queried using a URL link. This link can be copied from the AT Internet Data Query Designer from the bottom of the page. The beginning of the link depends on the authentication method used (https://apirest.atinternet-solutions.com/data/v2/json/getData? for Basic Auth authentication). You can either copy the full link into the Query logic field or extract and insert only the parameters. The beginning of the link is automatically selected inside Waaila based on the authentication method for given dataset.

The obligatory parameters in the query are space, period and columns, along with the max-results and page-num parameters. The first three parameters define which data should be obtained while the last two define what part of the data is displayed.

space

The space parameter specifies the exact site you are querying data from (space={s:000000}). This corresponds to the dataset code that you inserted when creating a dataset. When using AT Internet data you are not limited to the use of the pre-specified dataset. You have the option to use any data that the account has access to by specifying that in the space parameter. However, if you want to use the dataset code directly, you can use a placeholder [code] instead of the actual number (in particular you can writespace={s:[code]}). This is recommended for tests saved to the library as otherwise the query would load the same data in any dataset irrespective of the dataset code.

period

In the period parameter you can specify one or two data periods which you want to receive data for. Using the period parameter you can select both the granularity of the data as well as the time location. You have two options for the period

  • fixed (specifying exact date or dates)
  • relative to the current period (for example 8 days ago, last month or this year)

There are multiple granularity options: Y: year, S: semester, Q: quarter, M: month, W: week, D: day, H: hour, MN: minutes. The most common granularity is daily (D). To query a specific day, you specify the granularity and write in the date in format YYYY-MM-DD: period={p1:{D:2020-09-01}}. To query data relative (R) to current period, you need to mark the period as relative and specify the granularity and the size of the setback, i. e. how many periods back. For example, data from yesterday can be obtained using: period={p1:{R:{D:'-1'}}}.

When specifying a fixed period, you can query data for multiple days even within a single period by stating the start and end of the selected period, such as period={p1:{D:{start:'2020-09-01',end:'2020-09-07'}}}. For relative time period, you cannot set a start and an end, you always select data from a single day/ (calendar) week/ (calendar) month (unless you use a comparison with further periods as described in the next paragraph). On the other hand, you can select data from last week or even this week without the need to set which dates it covers.

Note:
When multiple periods are specified, all metrics must refer to one of the periods (metrics can be included multiple times, each referring to a different period). This approach can be used to compare data with previous periods, for example you can evaluate by how have visits changed for different sources yesterday compared to a week before yesterday.
https://apirest.atinternet-solutions.com/data/v2/json/getData?&columns={d_source,p1.m_visits,p2.m_visits}&sort={-p1.m_visits}&space={s:[code]}&period={p1:{R:{D:'-1'}},p2:{R:{D:'-8'}}}&max-results=50&page-num=1

columns

In the columns parameter all included metrics and dimensions are listed. Metrics' names start with prefix m : e.g. m_visits, m_visitors, m_page_views, m_conversions1_visits, while dimensions' names start with d : d_source, d_time_date, d_page. You can search the dimensions and metrics in the Data Query designer, https://apps.atinternet-solutions.com/DataQuery/Designer/ .

max-results and page-num

The max-results and page-num parameters effectively separate the queried data into bins to be queried or displayed. The parameter max-results sets the maximum number of rows of data to be obtained as a results of one query run. It can be set to an integer between 1 and 10000 (including both limits). If you need to query more data, you need to use the page-num parameter which allows you to obtain following rows. For example when querying long set of data, for values max-results=10000&page-num=1 you receive rows 1 to 10000, while for values max-results=10000&page-num=2 you receive the following rows 10001 to 20000.

Other parameters

In the optional filter parameter, there are all the filters on both dimensions and metrics included. Filters for multiple variables have always the 'AND' relation, that means that all of them must be satisfied for each included row. More about filters for v2 ATI query can be found on https://developers.atinternet-solutions.com/rest-api-en/response-structure-parameters-rest-en/filter-parameter-rest-en/.

Another optional parameter, sort, sets the order to display data in. The sorting can be set by multiple columns. To sort, you list the names of the columns to sort by. For descending sorting a minus sign is written before the column name.