In this example,
"NMAH5373" denotes the
id of this entity in your own system or database. Here we store data to many columns (
year, etc), in several types. We use quotes for string columns, text in
"YYYY-MM-DD" format for date columns, integer values for integer columns, decimal values for decimal columns, and
false values for boolean columns.
Time-series data are those which need both a value and a date or time information. Therefore, they have two parameters:
"value", with string, integer or decimal values, and
"YYYY-MM-DDThh:mm:ssTZ" format according to ISO 8601.
Here we are storing different values to two different entities with ids
"IVI6829". Note that we don't need to store data to all columns.
A count query returns the number of unique entities in a project according to a query condition. In the example below we use the
equal parameter to get the total number of entities where the value of the
model column is
"Toyota Corolla" and the
year value is anything different than
2016. A select expression accepts the
"not" boolean operators, that would be equivalent to a JOIN in a relational database.
When querying event columns you can use the parameter
"minfreq". This parameter means the minimum number of times the event like
"New York" must occur in a particular time range like
"2016-04-01" until "2016-04-05" in order to be considered. In the example below, if this event happend for an entity at least 2 times during the specified time range, that entity will be counted as 1 for the purposes for calculating the count of unique entities for the query.
We can also group together multiple queries for the purposes of calculating the overall minimum frequency considering all conditions, for this we use the
"freqgroup" parameter. Notice that in this case the
"minfreq" is related to all queries within the
"freqgroup". In this particular example this means that: summing all the events on the queries ("New York" events between "2016-04-01" until "2016-04-05" and "Tokyo" events between "2016-04-01" until "2016-04-05") the number of events for each entity should be at least 2 for an entity to be considered (counted) in this query.
A count entities query returns the number of unique entities in a project according to a query condition. The
"range" parameter can be used in
enumerated columns to get the quantity of entities that have a value within some specific range. The example below can be read like: Give me the number of unique entities, where their
year lies from
count events query returns the number of times that a time-series event happened during a period of time. This is different from the previous example because instead of returning the total number of unique entities that had that event, it returns total number of times the event happened among all entities. The example below query returns how many times the
"New York" city was visited between the dates
"2016-01-08" and "2016-07-01".
Top values is an operation that returns the most stored values of a column. In the example below it will return the
10 most stored values for the
The aggregation is one of the most powerful queries and serves to generate insights and discovery, as it computes all the possible combinations between the values of the columns used in the aggregation. A single-level gets the top values of the given column (in the example: top 3 values of
A multi-level aggregation computes all the possible combinations between the values of the columns used in the aggregation. This operation first gets the top values of the outer query column (in the example: top 3 values of
"model"), then it crosses those values with all the stored values of the inner query column (in the example: all the values from
"year"), then it take the top values (in this case 3) most common values resulting from this crossing and returns the quantity of entities for each combination. This process continues for each inner query that we add.
It's important to note that on aggregations the order of the columns does alter the results, as it will always take the first level top values and results to compute with all the values of the second level column.
Another common use case for aggregations is to extract metrics from the stored data, such as minimum, maximum, and average, instead of returning the most common values. In the example, we want to first get the top values for the
"year" column and, next, cross them with the average value for
A result query is equivalent to a SQL
"select column_name" statement. On SlicingDice it returns the list of IDs from all entities in a project according to a query. If you want it to return any specific column value, you must pass the
"columns" parameter specifying which columns you want, otherwise it will return only the entities IDs. The example below will return all the entity IDs and their respective values for the column
"model" from the entities where its
"model" column value is different from
"Toyota Corolla". Differently from all the other types of queries, the result queries uses the
A score query returns the list of IDs from all entities in a project according to a query (just like the result query), but it also returns a score value for each entity, based on the time-series events fulfilled by the query. In the example below, for each
"New York" event from the
"travels" column, that took place between
"2016-01-08" and "2016-07-01", the entity will receive
1 point of score. In this case, the more
"New York" events an entity has, the higher its score will be. Differently from all the other types of queries, the score queries uses the
Congratulations on storing data! You are ready for the 2º step: making queries on stored data.
Congratulations on making a query! Keep learning about SlicingDice with our quickstart guide.