About Advanced Queries
Note: The Advanced Query feature is provided "as is" - Firstup Support is not able to assist with constructing queries. However, if you suspect an issue with the tool itself, please contact Firstup Support.
The query builder discussed in the Defining Group Membership Using Filters section is a flattened query optimized for using one type of operator only. If you are comfortable with query language, you may want to write an advanced query. If you write your own query, you can nest operations and control the order in which they are applied.
This article describes the Firstup query language and the syntaxes of its specific features. To write an advanced query, click Advanced in the Query Builder page.
About Our Query Language
Our query language is the syntax for building segments and querying users. It is based on the Lucene Query Syntax, but the Firstup version applies a few more assumptions, enhancements, and limitations. Our engine parses, validates, and recompiles all queries to add enhancements and ensure that the query results will be as expected once it is executed against the index.
This article focuses on the syntaxes specifically added or enhanced by the Groups Engine. Readers should first be familiar with the Lucene Query Syntax.
Expressions
Queries can consist of full-text searches, criterion expressions, or groups of expressions. Each expression may also use a variety of binary and unary operators.
In this article, we use "expression" to refer to any atomic part of a query that could stand as a query of its own. It may contain operators like 'AND', 'OR', or 'NOT', and it may or may not include a field and value to match. The term "operand" is also used in this document, and it refers to the part of the expression without the operators. For an expression such as `NOT status:not_blocked`
, "status:not_blocked"
is the operand.
Operators
Every expression implies a positive operation. The query will return any user whose data does match the expression. Expressions have the option to use a negation operator of `NOT` in order to reverse this effect, where `NOT foo` will match any user without a value "foo". Additionally, when using multiple expressions within a query, boolean operators can express how the expressions should interact with each other. A query can use the `AND` or `OR` operators between expressions to yield different inclusions.
Examples
foo OR bar
foo AND bar
foo AND bar AND baz AND zoo
NOT
is a unary operator which means an expression may use it without the existence of another operator.
NOT
foo AND NOT bar
NOT foo AND NOT bar
Conversely, AND
and OR
are binary operators. An expression using a binary operator is invalid unless it is preceded by another expression.
AND foo # INVALID
AND NOT foo # INVALID
OR foo AND bar # INVALID
Criteria Value Expressions
The most basic way to construct a query is using criteria value expressions. The operands consist of a field and value in the syntax `field:value`
, where the foo is the criteria key and the value is the string to match. For values that include spaces or other special characters, enclose the value in quotes: `field:"match a phrase"`
. All `field:value`
expressions are interpreted as criteria value expressions unless the syntax of another feature listed in this document overrides that behavior (like group references).
Whenever a criterion is referenced as a field in a query, the value accompanying it is validated at against the criterion configuration. The value must match the criterion type and any other validations the criterion applies. Some criteria may apply more specific validations such as range limits or character counts.
Examples
Match users who have a status value of "not blocked":
status:not_blocked
Match users who joined on a specific day:
joined:2017-10-25
Match users who follow one topic:
topics.count:1
Match users who have a name of "Bruce Chelmsford":
name:"Bruce Chelmsford"
Use multiple criteria value expressions:
status:not_blocked AND topics.count:1
Custom attributes are included in the list of criteria that can be queried against. All custom attribute fields are namespaced under "custom" in the index, so a query against a custom attribute named "department" would be:
custom.department:Sales
Custom attributes use a special dynamic type of criteria and are mostly treated as Text type criteria in the index.
Full-Text Search
A query operand without a field will be treated as a full-text search and is configured to search across multiple fields the way a normal index text search would do. Each term (each word) is treated as a separate expression, however, so if a phrase must be matched, wrap it in quotes.
Examples
Match users who have any value indexed as "foobar"
foobar
Use multiple expressions:
foobar AND status:not_blocked
Expression Grouping
Not to be confused with Group objects, expression grouping is a series of expressions that act as a single operand within the context of the containing expression. In this way, operators can be applied within the grouping to be isolated from the operators on the outside, and parts of a query that contains multiple expressions can logically group in order to act as a single expression. An expression grouping itself may also use binary or unary operations.
Expression grouping syntax uses parentheses: `status:not_blocked AND (joined:this_week OR joined:last_week)`
. In this example, the `joined`
expressions are grouped and act as a single operand within the root level expression. A user with a status value of "not_blocked" must also have a joined value matching this week or last week to be returned by the query.
Nesting boolean logic operators within a grouping may inherently change their expression within the query and this is what makes the language more composable and flexible. The queries `foo AND (bar OR baz)`
and `foo AND bar OR baz`
are inherently different. The desired result of the former could, of course, be expressed without expression grouping, but the mechanism can be used for better flexibility and semantics within the query's logic, and groupings may contain other nested groupings.
Examples
status:not_blocked AND (joined:this_week OR joined:last_week)
status:not_blocked AND NOT (joined:this_week OR joined:last_week)
status:not_blocked AND NOT (joined:this_week OR joined:last_week) AND topics.count:1
NOT (joined:this_week OR joined:last_week)
Expressions can be grouped in as much tree depth as necessary:
status:not_blocked AND (joined:this_week OR joined:last_week OR (joined:last_week AND last_active:this_week))
Using parenthesis around single expressions is also valid:
(status:not_blocked) AND (joined:this_week)
Mixing Expressions AND and OR
If combining different expressions such as AND and OR, you must switch from the Basic Query to Advanced Query and use parentheses to group the queries as desired.
status:not_blocked AND location:seattle OR location:"san francisco" # invalid
status:not_blocked AND (location:seattle OR location:"san francisco")
Expression Aliases
Some criteria may declare aliases that, when detected as atomic operands in a query, are replaced in place with an expanded query expression. The effect is something like a shorthand for more efficiently building queries against known possible values, and greatly enriches the use of domain-relevant language in a query expression.
Expression aliases use a prefix character "&" to indicate their intent as an alias. Each criteria defines its own expression aliases.
Examples
Given an expression alias that matches "registration_complete" exist, with a replacement expression of "status:registered":
®istration_complete AND platforms:ios
# => (status:registered) AND platforms:ios
Given an expression alias that matches "mobile_platforms" exist, with a replacement expression of "platforms:ios OR platforms:android":
&mobile_platforms
# => (platforms:ios OR platforms:android)
status:not_blocked AND &mobile_platforms
# => status:not_blocked AND (platforms:ios OR platforms:android)
Given an expression alias that matches "not_mobile_platforms" exist, with a replacement expression of "NOT (platforms:ios OR platforms:android)":
status:not_blocked AND ¬_mobile_platforms
# => status:not_blocked AND (NOT (platforms:ios OR platforms:android))
Group Referencing
One group's query can be leveraged by any other query, so long as the query is executed against the same program. During query compilation, the parser will look up the referenced group in the database and inject its query into the parent expression.
There are two ways to reference groups in a query. On uses the normal field/value syntax: `group:name_of_group`
. The other uses a prefix character similar to the way Expression Aliases works. For group references, use the hash character "#". Also similar to Expression Aliasing, the entire query is embedded in the parent expression and is wrapped in an expression grouping (parentheses) to encapsulate any operators the child may include. In a way, groups can be used by a program to create their own aliases or to componentize their own user groups.
Example
Given a program that has a custom group named "mobile_platform_users" with a query of `platforms:ios OR platforms:android`:
group:mobile_platform_users
# => (platforms:ios OR platforms:android)
status:not_blocked AND group:mobile_platform_users
# => status:not_blocked AND (platforms:ios OR platforms:android)
And equivalent expression with hashtag syntax:
#mobile_platform_users
# => (platforms:ios OR platforms:android)
status:not_blocked AND #mobile_platform_users
# => status:not_blocked AND (platforms:ios OR platforms:android)
Any type of group can be referenced in a query. There are cases where two groups with the same name may exist within a single program (such as if they are of different types). When the compiler looks up a group reference, it follows the precedence order where custom groups are given precedence over standard groups, and standard groups are given precedence over generated groups. The group type can be specified, however, using a type prefix on the group name: `group:generated.also_a_custom_group_name`
. This overloading of a name allows programs to override standard groups if desired but still have the ability to reference the standard group if necessary.
Example
Given the existence of a standard group named "members" with a query of `role:member` and a program that has a custom group named "members" with a query of `role:member AND status:not_blocked` (notice that the queries are different):
group:members
# => (role:member AND status:not_blocked)
group:custom.members
# => (role:member AND status:not_blocked)
group:standard.members
# => (role:member)
Group queries may also reference other groups so that one might find a deeply nested stack of group references within a single query. There is a limit to the depth of this stacking for performance reasons, but generally speaking, the following is valid:
Given a program that has a group named "a" with a query of "group:b" and a group named "b" with a query of "group:c" and a group named "c" with a query of "foo":
group:a
# => (((foo)))
Or if group "a" has a query of "NOT group:b" and group "b" has a query of "NOT group:c":
group:a
# => (NOT (NOT (foo)))
Queries that contain group references that create circular references will raise an error when compiled.
Given a program that has a group named "a" with a query of "group:b" and a group named "b" with a query of "group:a":
group:a
# => Will raise a compiler error
Values
There are numerous ways to express the value side of an operand, or the actual value being matched.
Case Sensitivity
The query is not case sensitive. You can query upper or lower case, and the values can be in the database in upper or lower case, and we can still match the query.
Terms and Phrases
Normally an expression's value is a simple term, but in cases where special characters or spaces exist, the value must be expressed as a phrase using quotations.
Term values:
topics.name:Home
Phrase values:
topics.name:"Other Topic"
The quotes are required to allow the parser to treat "Other Topic" as an entire value, while it would otherwise interpret the space between those words as the beginning of a new expression.
Wildcard
The `*` character is special in that it will match any value.
Match user with any name:
name:*
This syntax will specifically match any user with a value for that field. If a user does not have a name at all, they will not be matched.
Match user with partial string,
name:b*
email:*arrowlogistics.com
department:*operations*
Note, the wildcard does not work with quotation marks or spaces - we recommend that you remove quotation marks and replace spaces with the asterisk:
department:"united states"*
department:united*states*
Alternatively, if a non-alphanumeric character is necessary for your query, you can escape special characters using `\`:
department:"united states"*
department:united\ states*
Issue Note: if the wildcard is not returning the number of expected users, please switch to Advanced Query and check to see if there are quotation marks outside the wildcard. If there are quotation marks, remove them and save your changes.
email:"*arrowlogistics.com"
email:*arrowlogistics.com
Value Aliases
During compilation, every value in a criterion expression operand is passed to the criteria configuration. Some criteria types and some specific criteria may swap the value with a new value if the original matches an alias. Value aliases are most useful for replacing common but complex values, or values that should express a relativity to their runtime.
The most common value aliases are used by Timestamp type criteria, and include aliases such as "this_week", "last_week", "this_year", and "today". The returned values for an alias may be either normal values or range values (see the Range Values section below).
Given that today is April 20:
joined:this_week
# => joined:[2017-04-16 TO 2017-04-22]
joined:today
# => joined:2017-04-20
Range Values
Some fields contain inherently orderable data. Expressions against these fields offer the ability to match values within a range using a special syntax:
topics.count:[1 TO 5]
Range syntax allows for inclusive ranges, which include matches to the exact value in the expression (use `[]`
characters), or exclusive ranges which do not match three exact value in the expression (use `{}`
characters). A mix of the two is also valid:
topics.count:[1 TO 5]
# => matches users with values of 1, 2, 3, 4, or 5
topics.count:{1 TO 5}
# => matches users with values of 2, 3, or 4
topics.count:[1 TO 5}
# => matches users with values of 1, 2, 3, or 4
topics.count:{1 TO 5]
# => matches users with values of 2, 3, 4, or 5
Reversing bounds is also valid:
topics.count:[5 TO 1]
# => matches users with values of 1, 2, 3, 4, or 5
Infinity can also be used as a bound in a range using wildcards:
topics.count:[2 TO *]
# => matches users with values of 2 or greater
topics.count:[* TO 2]
# => matches users with values of 2 or less
topics.count:[* TO *]
# => matches users with any value
A shorthand syntax is also available, which implies one bound's use of infinity:
topics.count:>=2
# => matches users with values of 2 or more
topics.count:>2
# => matches users with values of 3 or more
topics.count:<=2
# => matches users with values of 2 or less
topics.count:<2
# => matches users with values of less than 2
Numeric, Timestamp, and Text type criteria can be used with ranges:
topics.name:[A TO C]
# => matches users with values that start with A, B, or C
joined:[2017-10-01 TO 2017-10-31]
# => matches users with a joined value any time in October 2017
Value aliases can be used in ranges as well, so long as the returned value is not also a range:
joined:[yesterday TO today]
# => matches users who joined yesterday or today, as the values are replaced with timestamp values relevant to the current date
joined:[last_month TO today]
# => will result in a compiler error because last_month returns a range of dates
Range values are subject to the same validations that normal values are. If a criterion requires the values to be numeric, providing a range that includes a non-numeric value will result in a compiler error:
topics.count:[A TO 5]
# => will result in a compiler error
Value Grouping
Similar to expression grouping (see the Expression Grouping section above), value grouping chains value-only expressions within the value side of an operand. These can be chained together using operators the same way normal expressions are.
topics.count:(1 OR 3 OR 5)
# => matches users with a value of 1, 3, or 5
This match could be expressed without value grouping, but in this syntax it is clearer to read.
Since some criteria can index multiple values for a given user, the AND operator may also be useful within the same field:
topics.name:(Home AND Sales)
This particular query will match users who follow both "Home" and "Sales" topics. It is equivalent to the query `topics.name:Home AND topics.name:Sales`
.
Phrases can also be used within value grouping:
topics.name:(Home AND "Human Resources")
It is valid to use expression aliases within value grouping, as each value is validated and run through the same process a normal value is. Given that today is April 20:
joined:(yesterday OR today)
# => joined:(2017-04-19 OR 2017-04-19)
Range values are also permissible within a value group:
topics.count:([1 TO 5] OR [10 TO 15])
Value grouping is not restricted to criterion expressions.
The Model
This portion of the article describes a few useful facts about our backend and the underlying data model of the Groups Engine.
Criteria and Facts
Criteria are a set of configurations at the core of the engine in its ability to determine group membership. Each describes a single type of field of data, not unlike a column definition for a database schema. Criteria result in atomic fact records for each program for each member. A fact can be thought of as the name and value of a criteria for a specific user in a simple and portable form. These configurations allow the engine to know how the fact data should be stored and how it can be used to include users in a group when queried.
Each criterion has at minimum a key and type. The key is used to reference the specific criterion wherever it is used in a query. The type determines how the engine will store the value and what types of operations can be performed with the criterion's fact data. Criteria can be one of the following types: boolean, enum, dictionary, numeric, text, or timestamp.
Facts, the atomic values of data for a specific user, are stored individually in the database to maximize individual updatability and the expandability of the list of criteria without maintaining wide or sparse tables. These data are pivoted and indexed into a row-per-program-membership document store in order to facilitate efficient querying. Queries only interact with this search index, as the granular fact data only act as the building blocks and master records for maintaining the index.
Criteria Types
- Boolean
- Enum
- Dictionary
- Numeric
- Text
- Timestamp
Query Compilation
At its core, the engine is a search engine supporting both full-text search and matching of specific criteria values. At this point, the compiler is able to validate that the query, its grammar, and its expressions meet the expectations of the engine and the features it supports. This gatekeeper role allows it to validate that searches against criteria are being performed in the way the criteria have declared their configuration to allow and adds richer debugging information than would otherwise be available.
Comments
0 comments
Article is closed for comments.