Wildcards


For effective, thorough, and quick searching, Phoenix VCM supports wildcard characters in all of its applications. Wildcards are useful for searching and filtering stored video, files and file names, directories and directory names, and content within your Information Repository. See also Filters.

The two types of wildcards Phoenix VCM recognizes are:

POSIX standard wildcards - For use in path names, filtering, and metadata searching.
Regular expressions - For use only in text or content searching, for example, finding a word in a file.

POSIX Standard Wildcards

Use these standard wildcards when creating file and directory filters or when searching for files and metadata. These wildcards are convenient to use in text boxes on windows dealing with the Vault, Media Name, and Storage Pools. Wildcards for file filters operate on file names; wildcards for directory filters operate on directory names.

A wildcard character stands for a variable part of a query, so wildcard operators let you search for words in which some characters vary.

The following are things to keep in mind when using the standard wildcards:

Wildcards can be used in combination with other wildcard characters.
Searches are not case sensitive.
These wildcard operators do not work for content searching (you must use the regular expressions).
For predictable results when using wildcards for metadata, it is best to avoid using * and #.

POSIX Standard Wildcard Table

The table below lists all the wildcard operators for conducting file, metadata, and window text box searches. The two most basic and common operators, which appear first in the table, are summarized here for quick reference:

? = a single character wildcard, to represent a single variable.

* = a string wildcard, to represent a variable string of zero or more characters.

Wildcard

Description

?

Any single character

Substitutes a single character or can be combined to represent multiple characters. Use the ? for specific alternative spellings.

? substitutes a single character

?? substitutes two characters

??? substitutes three characters, and so on.

If used in a path, it cannot replace a forward slash (/).

Use ? within or at the end of a phrase.

Examples:

The query cell? searches names containing cell+<one additional character>. (cells and cello, but not cell)

The query .xl? searches names containing .xl+<a third character>. (.xls or .xlr)

The query wom?n searches names containing wom+<a fourth character>+n. (woman or women)

The query carbon fib?? searches names containing carbon fib+<two characters>. (carbon fiber or carbon fibre)

*

Zero or more characters

Substitutes zero or more characters or can be used to truncate multiple characters. Use * for alternative spellings and for an unlimited number of characters within a word.

If used in a path, it cannot replace a forward slash (/).

Examples:

The query behavi*r searches names containing behavi+<any character or number of characters>+r. (behaviour, behavior, or behavi123.zr)

The query patent* searches names containing patent+<a character or any number of characters>. (patents, patentable, patented, patent123)

The query patent*.jpg searches names containing patent+<a character or any number of characters>+.jpg. (patents.jpg, patentable.jpg, patented.jpg, patent123.jpg, etc.)

By itself, *.jpg matches only names consisting of <any character or set of characters>+.jpg.

*?

One or more characters

Substitutes a minimum number of characters or more. The minimum number of characters matched is equal to the number of ? that follow the *.

*?? substitutes for at least two characters.

*??? substitutes for at least three characters, and so on.

If used in a path, it cannot replace a forward slash (/).

Use *? within or at the end of a word.

Examples:

The query carbon fib*?? searches for carbon fiber, carbon fibre, carbon fibers, carbon fibres, and carbon fib123.

The query actua*???? searches for filenames containing actua+<four or more additional characters>. (actuarial, actuaries, actualization, actua12345.xls, but not actual)

#

Zero or more characters in a pathname, including slashes

Substitutes for zero or more characters in a path name; valid for use only within a path.

Unlike *, # can also match the forward slash (/).

Example:

The query document#Q42007 searches for document+<any character, including slashes or no character>+Q42007. (documentQ42007, documentsQ42007, document1Q42007, and document/Q42007)

#?

One or more characters in a pathname, including slashes

Substitutes for a minimum number of characters or more in a path name; valid for use only within a path. The minimum number of characters matched is equal to the number of ? that follow the #.

Matches the forward slash (/).

Example:

The query document#?Q42007 searches for document+<one or more characters, including slashes>+Q42007. (documentsQ42007, document/Q42007, and documents/123Q42007)

[ ]

Set of characters

Matches any one of the characters in the set. To indicate a sequential run of ASCII characters, use a hyphen to separate the first character in the sequence from the last character in the sequence: [<beginning character>-<end character>].

Examples:

The query [afe] matches a, f, or e.

The query [a-m] matches any single character in the range of a to m.

[!]

Characters not in set

Matches any one character that is not in the set. To indicate a sequential run of ASCII characters, use a hyphen to separate the first character in the sequence from the last character in the sequence: [!<beginning character>-<end character>].

Examples:

The query [!afe] searches for any single character other than a, f, or e.

The query [!a-m] searches for any single character that is not in the range of a-m.

!

Entire pattern not in set

When ! is the first character of a pattern, everything that matches the rest of the pattern is excluded from the results.

Example:

The query !primary returns everything except "primary".

When using this wildcard in extended metadata searches, it is best to use "is equal to". To obtain predictable results, avoid "contains", "does not contain", and "is not equal to".

\

Escape next character

Causes the character following the backslash to be treated as a literal character. Normally, this is used with one of the wildcard characters to denote that character itself, not its wildcard expression meaning.

Example:

To process Accounting*2007, type Accounting\*2007 since * is normally interpreted by the Information Repository as a wildcard.

Multiple wildcards

Along with using wildcards individually, combine them to create powerful filters and search criteria.

Examples:

The query [abc]*.xls searches any phrase beginning with a, b, or c+<any character or number of characters>+.xls.

The query [!abc]*.xls searches any phrase not beginning with a, b, or c+<any character or number of characters>+.xls.

The query and*.??? searches and+<any number of characters>+.+<any three characters>.
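The wildcard rules in the table above can be sketched in Python by translating each pattern into an ordinary regular expression. This is only an illustration of the documented behavior, not Phoenix VCM code: the function names translate and matches are hypothetical, and the sketch assumes that a pattern must match the whole name and that a negated set may match any character.

```python
import re


def translate(pattern: str) -> str:
    """Convert a wildcard pattern (without a leading !) into regex source."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Backslash: the next character is taken literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch in "*#":
            # The ? characters that follow set the minimum number of characters.
            j = i + 1
            while j < len(pattern) and pattern[j] == "?":
                j += 1
            unit = "[^/]" if ch == "*" else "."  # only # may match a slash
            parts.append(f"{unit}{{{j - i - 1},}}")
            i = j
        elif ch == "?":
            parts.append("[^/]")  # exactly one character, never a slash
            i += 1
        elif ch == "[":
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end] if end != -1 else ""
            negated = body.startswith("!")
            if negated:
                body = body[1:]
            if body:
                # Protect characters that are special inside a regex set.
                body = re.sub(r"([\\\[^])", r"\\\1", body)
                parts.append("[" + ("^" if negated else "") + body + "]")
                i = end + 1
            else:
                parts.append(re.escape(ch))  # unterminated or empty set
                i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def matches(pattern: str, name: str) -> bool:
    """Case-insensitive whole-name match; a leading ! inverts the result."""
    negate = pattern.startswith("!")
    if negate:
        pattern = pattern[1:]
    regex = re.compile(translate(pattern), re.IGNORECASE | re.DOTALL)
    return (regex.fullmatch(name) is not None) != negate


for pattern, name in [("wom?n", "Women"), ("document#Q42007", "document/Q42007")]:
    print(pattern, name, matches(pattern, name))
```

Handling the run of ? characters together with the preceding * or # keeps the "minimum number of characters" rule in one place, and it lets #? count a slash toward that minimum.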

More POSIX Standard Wildcard Examples

The examples in the following table contrast how the wildcards behave.

Wildcard: ? (Any single character in a name)
Sample phrase: path?.doc
Matches: path1.doc, paths.doc, pathq.doc, etc.
Does not match: path, path/.doc, path.doc, path_sample.doc, etc.

Wildcard: * (Zero or more characters in a name)
Sample phrase: path*.doc
Matches: path.doc, path1.doc, path_sample.doc, etc.
Does not match: path, path/1.doc, path1.docs, etc.

Wildcard: *? (One or more characters in a name)
Sample phrase: path*?.doc
Matches: path1.doc, paths.doc, path_sample.doc, etc.
Does not match: path, path.doc, path/.doc, etc.

Wildcard: # (Zero or more characters in a pathname, including slashes)
Sample phrase: path#.doc
Matches: path.doc, path1.doc, path_sample.doc, path/1.doc, etc.
Does not match: path, path1.docs, sample/path.doc, etc.

Wildcard: #? (One or more characters in a pathname, including slashes)
Sample phrase: path#?.doc
Matches: path1.doc, path_sample.doc, path/1.doc, etc.
Does not match: path, path.doc, path1.docs, etc.

Wildcard: [ ] (Set of characters)
Sample phrase: path[1-3].doc
Matches: path1.doc, path2.doc, path3.doc
Does not match: path, path.doc, path.docs, path4.doc, path5.doc, patha.doc, pathq.doc, etc.

Wildcard: [!] (Characters not in set)
Sample phrase: path[!1,2,3].doc
Matches: path0.doc, path4.doc, path5.doc, patha.doc, pathq.doc, etc.
Does not match: path1.doc, path2.doc, path3.doc, path, path.doc, path.docs, etc.

Wildcard: ! (Entire pattern not in set)
Sample phrase: !path.doc
Matches: all other files in repository
Does not match: path.doc

Regular Expressions (Content Searching)

Use regular expressions to search for specific content or text within files.

The following are things to keep in mind when using regular expressions:

With or without quotation marks, search is not case sensitive.
These wildcard operators do not work for file or directory search and filtering (you must use the POSIX standard wildcards).

Regular Expressions Table for Content Searching

This functionality allows you to put limitations on files that are processed to those containing a particular word or phrase.

Quotation marks - To process only files that contain a specific phrase, enclose the phrase in quotation marks. Otherwise, every file containing any of the words typed is processed by the query. For example, if you search for "Information Repository", only files that contain the complete phrase "Information Repository" are processed. In contrast, without quotation marks, any file containing Information Repository, Repository Information, Information, or Repository is processed.
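The difference between a quoted and an unquoted query can be sketched in Python. This is a simplified model under stated assumptions, not the product's search engine: the helper names tokenize, contains_phrase, and contains_any_word are hypothetical, and words are taken to be runs of letters, digits, and underscores.

```python
import re


def tokenize(text: str) -> list:
    """Split text into lowercase words so that matching ignores case."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(text: str, phrase: str) -> bool:
    """Quoted query: the words must appear together and in order."""
    words, target = tokenize(text), tokenize(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def contains_any_word(text: str, query: str) -> bool:
    """Unquoted query: any single word from the query is enough."""
    words = set(tokenize(text))
    return any(word in words for word in tokenize(query))


sample = "Repository Information for the third quarter"
print(contains_phrase(sample, "Information Repository"))
print(contains_any_word(sample, "Information Repository"))
```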

The table below lists all the wildcard operators only for content searching.

Wildcard

Description

?

Any single character

Substitutes a single character or can be combined to represent multiple characters. Use the ? for specific alternative spellings.

? substitutes a single character

?? substitutes two characters

??? substitutes three characters, and so on.

If used in a path, it cannot replace a forward slash (/).

Use ? within or at the end of a phrase.

Examples:

The query cell? searches names containing cell+<one additional character>. (cells and cello, but not cell)

The query .xl? searches names containing .xl+<a third character>. (.xls or .xlr)

The query wom?n searches names containing wom+<a fourth character>+n. (woman or women)

The query carbon fib?? searches names containing carbon fib+<two characters>. (carbon fiber or carbon fibre)

*

Zero or more characters

Substitutes zero or more characters or can be used to truncate multiple characters. Use * for alternative spellings and for an unlimited number of characters within a word.

If used in a path, it cannot replace a forward slash (/).

Examples:

The query behavi*r searches names containing behavi+<any character or number of characters>+r. (behaviour, behavior, or behavi123.zr)

The query patent* searches names containing patent+<a character or any number of characters>. (patents, patentable, patented, patent123)

The query patent*.jpg searches names containing patent+<a character or any number of characters>+.jpg. (patents.jpg, patentable.jpg, patented.jpg, patent123.jpg, etc.)

By itself, *.jpg matches only names consisting of <any character or set of characters>+.jpg.

*?

One or more characters

Substitutes a minimum number of characters or more. The minimum number of characters matched is equal to the number of ? that follow the *.

*?? substitutes for at least two characters.

*??? substitutes for at least three characters, and so on.

If used in a path, it cannot replace a forward slash (/).

Use *? within or at the end of a word.

Examples:

The query carbon fib*?? searches for carbon fiber, carbon fibre, carbon fibers, carbon fibres, and carbon fib123.

The query actua*???? searches for filenames containing actua+<four or more additional characters>. (actuarial, actuaries, actualization, actua12345.xls, but not actual)

~

Proximity searches

Use a proximity search to find words that are within proximity to one another. To execute, use ~ (the tilde symbol) and a value at the end of a phrase.

Example:

To query for "compliance" and "December" within 10 words of each other, use compliance December~10.
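As a rough sketch of what a proximity search checks, the Python function below tests whether two words occur within a given number of word positions of each other. The function name within_proximity is hypothetical, and measuring distance as the difference between word positions is an assumption; the product may count distance differently.

```python
import re


def within_proximity(text: str, first: str, second: str, distance: int) -> bool:
    """Return True when the two words occur at most `distance` positions apart."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, word in enumerate(words) if word == first.lower()]
    second_positions = [i for i, word in enumerate(words) if word == second.lower()]
    return any(
        abs(a - b) <= distance
        for a in first_positions
        for b in second_positions
    )


sentence = "Compliance reports are due in December"
print(within_proximity(sentence, "compliance", "December", 10))
```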

~

Fuzzy searches

Use a fuzzy search to find an approximate spelling. To execute, use ~ (the tilde symbol) at the end of a single term.

Example:

To query for a term similar in spelling to "report" use report~. This search finds terms like deport, reports, and reporting.

You can specify the required level of similarity with a value between 0 and 1; the closer the value is to 1, the more similar a term must be to match. For example, roam~0.8. The default is 0.5 if a value is not defined.
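The similarity threshold can be illustrated with a small Python sketch. The documentation does not state how similarity is measured, so this example assumes a measure based on edit distance (the number of single-character insertions, deletions, and substitutions) scaled by the longer term's length. The function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed measure: 1.0 for identical terms, lower as more edits are needed."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a.lower(), b.lower()) / longest


def fuzzy_match(term: str, candidate: str, minimum: float = 0.5) -> bool:
    """Mirror of term~minimum, with 0.5 as the default threshold."""
    return similarity(term, candidate) >= minimum


for candidate in ["deport", "reports", "reporting"]:
    print(candidate, fuzzy_match("report", candidate))
```

Under this assumed measure, raising the threshold to 0.8 keeps close variants such as reports but drops reporting, which needs three edits.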

[]

{}

Range searches (TO)

Use range searches to match documents where field values are between the lower and upper limits specified by the search, and the searches can be inclusive or exclusive of the limits. Inclusive range searches are marked by brackets. Exclusive range searches are marked by curly braces.

Example:

To query a date, mod_date:[20020101 TO 20030101] finds files whose mod_date fields have values between 20020101 and 20030101, inclusive.

To query a non-date, title:{accounting TO sarbanes} finds files whose titles are between accounting and sarbanes, but excluding accounting and sarbanes.
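The inclusive and exclusive forms differ only in how the limits themselves are treated, as this Python sketch shows. It assumes that field values are compared as case-insensitive text in dictionary order, which the documentation does not state; the function name in_range is hypothetical.

```python
def in_range(value: str, lower: str, upper: str, inclusive: bool = True) -> bool:
    """Model of [lower TO upper] (inclusive) and {lower TO upper} (exclusive)."""
    v, low, high = value.lower(), lower.lower(), upper.lower()
    if inclusive:
        return low <= v <= high
    return low < v < high


# mod_date:[20020101 TO 20030101] includes the limits themselves.
print(in_range("20020101", "20020101", "20030101"))

# title:{accounting TO sarbanes} excludes the limits themselves.
print(in_range("accounting", "accounting", "sarbanes", inclusive=False))
```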

^

Boosting a term

Control the relevance of a document by boosting the value of a specific term to search for.

To boost a term, use ^ (the caret symbol) with a boost factor (a number) at the end of the search term. The higher the boost factor, the more relevant the returned term will be. By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (for example, 0.2).

Example:

To query corporate compliance with the term compliance treated as more relevant, place ^ and the boost factor directly after the term: corporate compliance^4.

To boost a phrase, enclose it in quotation marks: "corporate compliance"^4

OR

AND, &&

+

NOT, !, -

Boolean operators

Boolean operators allow terms to be combined through logic operators. AND, OR, +, and NOT are supported as Boolean operators.

Boolean operators must be typed in capital letters.

OR

OR links two terms and finds a matching document when either of the terms exists in a document. || can be used in place of OR. OR is the default conjunction operator, meaning that if there is no Boolean operator between two terms, OR is assumed.

Example:

To query documents that contain corporate compliance or compliance, you would use "corporate compliance" OR compliance. The quotes indicate a single phrase.

AND

AND matches documents where both terms exist anywhere in the text of a single document. && can be used in place of AND.

Example:

To query documents that contain corporate compliance and Second Quarter 2007, use "corporate compliance" AND "Second Quarter 2007". The quotes indicate a single phrase.

+

The plus symbol requires that the term after + exists somewhere in a single file.

Example:

To query files that contain compliance and may contain corporate, you would use corporate + compliance.

NOT

NOT excludes documents that contain the term after NOT. ! (exclamation point) or - (minus symbol) can be used in place of NOT.

NOT cannot be used with just one term.

Example:

To query documents that contain corporate but not compliance, use corporate NOT compliance.

Grouping

To control the Boolean logic for a query, use parentheses to group clauses to form subqueries.

Example:

To query for either corporate or compliance and Second Quarter 2007, you would use (corporate OR compliance) AND "Second Quarter 2007". This ensures the phrase after AND must exist and either term within the parentheses may exist.

Field Grouping

Use parentheses to group multiple clauses to a single field.

Example:

To query for files whose title contains both return and the phrase overdue accounts, you would use title:(+return +"overdue accounts").
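The Boolean operators and grouping described above follow ordinary logic, which can be modeled in Python. This sketch only illustrates the logic of two of the example queries; the helper has_term, the two query functions, and the sample documents are all hypothetical.

```python
import re


def has_term(text: str, term: str) -> bool:
    """True when the word or phrase occurs in the text, ignoring case."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def grouped_query(text: str) -> bool:
    """(corporate OR compliance) AND "Second Quarter 2007" """
    either = has_term(text, "corporate") or has_term(text, "compliance")
    return either and has_term(text, "Second Quarter 2007")


def not_query(text: str) -> bool:
    """corporate NOT compliance"""
    return has_term(text, "corporate") and not has_term(text, "compliance")


documents = {
    "a.txt": "Corporate results for Second Quarter 2007",
    "b.txt": "Compliance checklist, first quarter",
    "c.txt": "Second Quarter 2007 compliance audit",
}

hits = [name for name, text in documents.items() if grouped_query(text)]
print(hits)
```

The parentheses in the query correspond to evaluating the OR clause first, so the phrase after AND is required no matter which of the grouped terms is present.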

\

Escaping special characters

Escaping special characters can be a part of the query syntax. Special characters are:

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

To execute a search using these characters literally (to escape any of these characters), use \ before the character.

Example:

To query for (1+1):2, use \(1\+1\)\:2. In this case, if you do not use the escape character for (1+1):2, the query would search for ( OR 1 OR AND 1 OR ) OR : OR 2.

It is recommended to avoid using characters that Phoenix VCM interprets as special. If you need to use a character literally that Phoenix VCM uses as a wildcard, the character must be preceded by an escape character ( \ ).
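When search text comes from another program, the escaping rule above can be applied automatically. The Python sketch below is a minimal illustration rather than Phoenix VCM code: the function name escape_query is hypothetical, and placing a single backslash before the two-character operators && and || is an assumption.

```python
import re

# The special characters listed above; && and || are treated as pairs.
_SPECIAL = re.compile(r'(&&|\|\||[+\-!(){}\[\]^"~*?:\\])')


def escape_query(text: str) -> str:
    """Put a backslash before every special character so it is read literally."""
    return _SPECIAL.sub(r"\\\1", text)


print(escape_query("(1+1):2"))
```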

Metadata Ingest and Wildcards

When ingesting metadata, special characters normally interpreted as wildcards (?, *, #, !, ~, ^, &, [ ], { }, \, ") are not interpreted as wildcards. However, when searching for metadata, wildcards are recognized. If ingested metadata contains wildcard characters, any search for it must include the escape character (\) to indicate that the character is intended literally, and not as a wildcard. For example, if 16*37 is ingested as metadata and you then want to search for this expression, the search query should be 16\*37.