Saturday, February 28, 2015

Filtering .raw fields with Python Elasticsearch DSL High-Level Client

It took me a while to figure out how to search the not_analyzed ".raw" fields that Logstash creates in Elasticsearch indices when using the high-level Python Elasticsearch client. A keyword argument name has to be a plain identifier, so it can't contain a dot, and Python raises a SyntaxError if you try it the intuitive way (this assumes you've already set up a client as es and an index as i, as shown in the docs):

In [144]: tt = Search(using=es,index=i)\
    .filter('term',TargetUserName.raw='Domain Admins')\
    .filter('term',EventID=4728)
  File "<ipython-input-144-1b746eb83e6f>", line 1
    tt = Search(using=es,index=i)\
    .filter('term',TargetUserName.raw='Domain Admins')\
    .filter('term',EventID=4728)
SyntaxError: keyword can't be an expression
Instead, put the dotted field name in a dictionary as a string key and unpack the dictionary with the ** operator:

In [142]: d
Out[142]: {'TargetUserName.raw': 'Domain Admins'}
In [143]: tt = Search(using=es,index=i)\
    .filter('term',**d).filter('term',EventID=4728)
This produces the Elasticsearch query we want:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "EventID": 4728
              }
            },
            {
              "term": {
                "TargetUserName.raw": "Domain Admins"
              }
            }
          ]
        }
      },
      "query": {
        "match_all": {}
      }
    }
  },
  "sort": [
    "@timestamp"
  ]
}
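
If you want to try the dictionary-unpacking trick without a running cluster, here is a minimal sketch in plain Python. Note that term_filter is a hypothetical stand-in I made up for illustration, not part of the Elasticsearch client; it only mimics a function that accepts arbitrary keyword arguments. The sketch shows that the dotted keyword is rejected when the code is parsed, while a string key passed through ** goes through fine.

```python
# Hypothetical stand-in for a filter builder that takes **kwargs.
# This is NOT the Elasticsearch client API, just an illustration.
def term_filter(**fields):
    """Build one 'term' clause per keyword argument."""
    return [{"term": {name: value}} for name, value in fields.items()]


# A dotted keyword is rejected by the parser before anything runs,
# so we feed the source to compile() to observe the error safely.
try:
    compile("term_filter(TargetUserName.raw='Domain Admins')", "<example>", "eval")
    dotted_keyword_ok = True
except SyntaxError:
    dotted_keyword_ok = False

# Unpacking a dict avoids the problem: keys only have to be strings,
# not valid identifiers, when the callee accepts **kwargs.
params = {"TargetUserName.raw": "Domain Admins"}
clauses = term_filter(**params) + term_filter(EventID=4728)

print(dotted_keyword_ok)
print(clauses)
```

The same pattern works for any function that collects **kwargs, which is why it carries over to the .filter() calls above. Newer releases of the DSL library also document a double-underscore shorthand (TargetUserName__raw) for dotted field names; whether your installed version supports it is something to check in its docs.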