Querying

Introduction

Querying is the action of retrieving data from search indexes.

Preface

Plone queries are performed using the portal_catalog tool. Calling the tool instance directly is a shortcut for its searchResults() query method, but other query methods are also available:

# portal_catalog is defined in the site root
portal_catalog = site.portal_catalog

# The following call does not return the actual objects,
# but brains instead.
# The call takes index names and match values as keyword arguments.
# Get all objects whose Title matches "Foobar"
brains = portal_catalog(Title="Foobar")

Warning

Usually, if you pass in None as the match value, the query matches all objects instead of no objects.
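Because of this behaviour, it can be worth guarding against missing values before they reach the catalog. The following is a minimal sketch, not part of Plone: safe_query is a hypothetical helper, and catalog stands for any callable that behaves like portal_catalog.

```python
def safe_query(catalog, **criteria):
    """Query the catalog, but return no results if any match value is None.

    ``catalog`` is assumed to be a callable like portal_catalog;
    this helper is hypothetical and not part of Plone.
    """
    if any(value is None for value in criteria.values()):
        # A None match value would be treated as "no restriction",
        # so treat it as "no results" instead.
        return []
    return catalog(**criteria)
```

With this in place, safe_query(portal_catalog, Title=maybe_title) returns an empty list when maybe_title is None, instead of every object visible to the user.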

Brain objects

portal_catalog queries return an iterable of catalog brain objects.

Brains contain a subset of the actual content object's information. The available subset is defined by the metadata columns in portal_catalog.

You can access the brain object information by metadata column name using Python dictionary look-up:

# Return an arbitrary metadata field by metadata field name
title = brain["Title"]

To see the available metadata columns, visit the Metadata tab of portal_catalog in the ZMI.

Getting the real object

A portal_catalog() query returns brain objects built from indexed data. To get the actual object from which the search data was indexed, use the following:

# Load the actual object from the database (SLOW!)
# and modify it
obj = brain.getObject()
obj.setSomething("foobar")

Note

This has performance implications. Waking up each object requires a separate load from the database.
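One way to keep this cost bounded is to wake up only as many objects as are actually needed. The sketch below assumes nothing beyond the getObject() method shown above; first_objects is a hypothetical helper name.

```python
from itertools import islice


def first_objects(brains, limit):
    """Wake up at most ``limit`` objects from an iterable of brains.

    Hypothetical helper: only the first ``limit`` brains are touched,
    so at most ``limit`` objects are loaded from the database.
    """
    return [brain.getObject() for brain in islice(brains, limit)]
```

When a listing only needs values that are already stored as metadata columns, reading them from the brain avoids loading the object at all.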

URL of item

Example:

# Return object absolute_url()
url = brain.getURL()

Path of item

Example:

# Return object physical path (in the database) -
# this will include Plone site id inside Zope application server
path = brain.getPath()

Text format

Since most indexes use Archetypes accessors to index the field value, the returned text is UTF-8 encoded. This limitation is inherited from the early days of Plone.

To get a unicode value, e.g. for the title, you need to do the following:

title = brain["Title"]
title = title.decode("utf-8")

if title[0] == u"å":
    # Unicode text matching etc. functions work correctly now
    pass
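If some values may already be unicode (for example, values coming from different index sources), decoding unconditionally can fail. A small defensive sketch follows; to_unicode is a hypothetical helper name and UTF-8 is assumed as the default encoding, as described above.

```python
def to_unicode(value, encoding="utf-8"):
    """Return ``value`` as text, decoding it only if it is a byte string.

    Hypothetical helper: values that are already text are returned as-is.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

Usage would look like title = to_unicode(brain["Title"]).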

Dumping portal catalog content

The following is useful when debugging unit tests:

# Print all objects visible to the currently logged in user
for brain in portal_catalog():
    print(brain.getURL())

Bypassing query security check

Note

Security: All portal_catalog queries are limited to the current user permissions by default.

If you want to bypass this restriction, use the unrestrictedSearchResults() method.

Example:

# Print all content in portal_catalog, regardless of user permissions
for brain in portal_catalog.unrestrictedSearchResults():
    print(brain.getURL())

Bypassing language check

Note

All portal_catalog() queries are limited to the current user's selected language. You need to bypass the language check explicitly if you want to run multilingual queries.

Example of how to bypass the language check:

all_brains = portal_catalog(Language="all")

Expired content check

Plone and portal_catalog have a mechanism that lists only active (non-expired) content by default.

Below is an example of how the expired content check is made:

mtool = context.portal_membership
show_inactive = mtool.checkPermission('Access inactive portal content', context)

contents = context.portal_catalog.queryCatalog(show_inactive=show_inactive)

See also:

* Listing (/content/listing)

Querying by path

ExtendedPathIndex is the index used for content object paths. The path index stores the physical path of each object.

Warning

If you ever rename your Plone site instance, the path index needs to be rebuilt.

Example:

# Return myfolder and all of its child content
portal_catalog(path={"query": "/myploneinstance/myfolder"})
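The path criterion is usually built from an object's physical path. The sketch below shows one way to do that; path_query is a hypothetical helper, and the optional depth key is assumed to be supported by ExtendedPathIndex (a depth of 1 would limit results to direct children), so verify it against your installed version.

```python
def path_query(physical_path, depth=None):
    """Build the value for the ``path`` criterion.

    ``physical_path`` is a tuple of path parts, as returned by
    getPhysicalPath(). ``depth`` is assumed to be honoured by
    ExtendedPathIndex; it is left out when not given.
    """
    query = {"query": "/".join(physical_path)}
    if depth is not None:
        query["depth"] = depth
    return query
```

For example, portal_catalog(path=path_query(folder.getPhysicalPath(), depth=1)) would list the direct children of folder under that assumption.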

Query by content type

To get all catalog brains of a certain content type on the whole site:

campaign_brains = self.context.portal_catalog(portal_type="News Item")

To see the available type names, visit the portal_types tool in the ZMI.

Query published items

By default, the portal_catalog query does not care about the workflow state. You might want to limit the query to published items.

Example:

campaign_brains = self.context.portal_catalog(portal_type="News Item", review_state="published")

review_state is a portal_catalog index which reads the portal_workflow variable “review_state”. For more information, see the Contents tab of the portal_workflow tool in the ZMI.

Getting a random item

The following view snippet allows you to get one random item on the site:

import random

def getRandomCampaign(self):
    """Return a random published CampaignPage, or the current item if none is left."""
    campaign_brains = self.context.portal_catalog(portal_type="CampaignPage",
                                                  review_state="published")

    # Ids to exclude from the selection, e.g. the current item
    bad_ids = ["you", "might", "want to blacklist some ids here"]

    items = [brain for brain in campaign_brains if brain["getId"] not in bad_ids]

    if items:
        # Pick one of the items left after filtering
        chosen = random.choice(items)
        return chosen.getObject()

    # Fall back to the current content item if no random options are available
    return self.context

Querying by date

See DateIndex.

Example:

from DateTime import DateTime

items = portal_catalog(effective={
    'query': (DateTime('2002-05-08 15:16:17'), DateTime('2062-05-08 15:16:17')),
    'range': 'min:max'})

Another example, showing how to get news items for a particular year in template code:

<div metal:fill-slot="main" id="content-news"
 tal:define="boundLanguages here/portal_languages/getLanguageBindings;
             prefLang python:boundLanguages[0];
             DateTime python:modules['DateTime'].DateTime;
             start_year request/year| python: 2004;
             end_year request/year| python: 2099;
             start_year python: int(start_year);
             end_year python: int(end_year);
             results python:container.portal_catalog(
                portal_type='News Item',
                sort_on='Date',
                sort_order='reverse',
                review_state='published',
                id=prefLang,
                created={ 'query' : [DateTime(start_year,1,1), DateTime(end_year,12,31)], 'range':'min:max'}
                );
             results python:[r for r in results if r.getObject()];
             Batch python:modules['Products.CMFPlone'].Batch;
             b_start python:request.get('b_start',0);
             portal_discussion nocall:here/portal_discussion;
             isDiscussionAllowedFor nocall:portal_discussion/isDiscussionAllowedFor;
             getDiscussionFor nocall:portal_discussion/getDiscussionFor;
             home_url python: mtool.getHomeUrl;
             localized_time python: modules['Products.CMFPlone.PloneUtilities'].localized_time;">
    ...
</div>
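The year-range criterion used in the template can also be assembled in Python code. The following sketch uses the standard library datetime so that it runs outside Zope; year_range is a hypothetical helper, and whether your date index accepts datetime values instead of Zope DateTime objects is an assumption to verify.

```python
from datetime import datetime


def year_range(start_year, end_year=None):
    """Build a min:max range criterion covering whole calendar years.

    Hypothetical helper: the range runs from the first second of
    ``start_year`` to the last second of ``end_year``.
    """
    if end_year is None:
        end_year = start_year
    return {
        "query": (datetime(start_year, 1, 1),
                  datetime(end_year, 12, 31, 23, 59, 59)),
        "range": "min:max",
    }
```

A query for one year of news would then pass created=year_range(2004) alongside the other criteria.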

Query by language

You can query by language:

portal_catalog({"Language":"en"})

Note

Products.LinguaPlone must be installed.

Combining queries using Boolean operators

See AdvancedQuery.

Example:

from Products import AdvancedQuery

portal_catalog = self.portal_catalog # Acquire portal_catalog from higher hierarchy level

path = self.getPhysicalPath() # Limit the search to the current folder and its children

# object.getPhysicalPath() returns the path as tuples of path parts
# Convert path to string
path = "/".join(path)

# Limit the search to the path of the current context object and
# match all children satisfying either of two index conditions.
# AdvancedQuery objects can be combined using the Python operators &, | and ~
query = AdvancedQuery.Eq("path", path) & (
    AdvancedQuery.Eq("getMyIndexGetter1", "foo") |
    AdvancedQuery.Eq("getMyIndexGetter2", "bar"))

# The following result variable contains iterable of CatalogBrain objects
results = portal_catalog.evalAdvancedQuery(query)

# Convert the catalog brains to a Python list containing tuples of object unique ID and Title
pairs = []
for nc in results:
    pairs.append((nc["UID"], nc["Title"]))


# Another example: full-text search limited to a path.
# diagnose_path and text_query_target are strings defined elsewhere.
from Products.AdvancedQuery import Eq

query = Eq("path", diagnose_path) & Eq("SearchableText", text_query_target)

results = self.context.portal_catalog.evalAdvancedQuery(query)

Sorting results

A portal_catalog query takes a sort_on argument, which names the index used for sorting. sort_order defines the sort direction; set it to the string “reverse” for descending order.

Sorting is supported only on FieldIndexes. Because searchable text indexes store split words rather than whole strings, they cannot be used for sorting. For example, to sort by title, use the index called sortable_title.

Example of how to sort by id:

results = context.portal_catalog.searchResults(sort_on="id",
                                               portal_type="Document",
                                               sort_order="reverse")
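When the same sorting options are reused across several queries, the keyword arguments can be assembled in one place. This is a sketch only: sorted_query is a hypothetical helper, and sort_limit is assumed to be supported by your catalog version as a hint for limiting the number of sorted results.

```python
def sorted_query(sort_on, reverse=False, limit=None, **criteria):
    """Return keyword arguments for a sorted catalog query.

    Hypothetical helper: ``criteria`` are passed through unchanged.
    """
    query = dict(criteria)
    query["sort_on"] = sort_on
    if reverse:
        query["sort_order"] = "reverse"
    if limit is not None:
        # sort_limit is only a hint to the catalog, so callers
        # should still slice the results themselves.
        query["sort_limit"] = limit
    return query
```

Usage would look like portal_catalog(**sorted_query("sortable_title", limit=5, portal_type="Document"))[:5].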

Unique values

ZCatalog has a uniqueValuesFor() method to retrieve all unique values for a certain index. It is intended to work on FieldIndexes only.

Example:

# getArea() is the Archetypes accessor for the area field,
# which is a string and tells the content area.
# A custom getArea FieldIndex indexes these values
# in the portal catalog.
# The following line gives all area values
# entered on the site.
areas = portal_catalog.uniqueValuesFor("getArea")
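The unique values are often used to populate a selection list. The sketch below assumes that the returned values may include empty entries and are not necessarily in display order; build_choices is a hypothetical helper name.

```python
def build_choices(values):
    """Turn raw index values into a sorted list of non-empty choices.

    Hypothetical helper: empty strings and None are dropped before sorting.
    """
    return sorted(value for value in values if value)
```

Usage would look like choices = build_choices(portal_catalog.uniqueValuesFor("getArea")).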