Using SQL in Elasticsearch

Use SQL in Elasticsearch to download data from the search index. All fields within Elasticsearch can be accessed using SQL.

🚧

Important Information

You can find the Elasticsearch login credentials in the searchblox.yml file, found in the following file path:
\SearchBloxServer\webapps\ROOT\WEB-INF

Viewing Results from Elasticsearch

Using SQL, you can view search results directly from Elasticsearch using the following URL

https://localhost:9200/_opendistro/_sql?sql=select * from idx001

You can also export the results in CSV format using the following URL

https://localhost:9200/_opendistro/_sql?format=csv&sql=select url, title from idx001

Basic Queries

Use the following queries for specific search results you need.

Queries for specific file types

https://localhost:9200/_opendistro/_sql?sql=select * from idx001 where contenttype='pdf'

Queries for specific URL or filename

https://localhost:9200/_opendistro/_sql?sql=select * from idx001 where url='http://www.bbc.com'

Queries for specific keywords

https://localhost:9200/_opendistro/_sql?sql=select * from idx001 where keywords='cnn'
https://localhost:9200/_opendistro/_sql?sql=select * from idx001 where keywords IN ('cnn','world')

Results from multiple indices

https://localhost:9200/_opendistro/_sql?format=csv&sql=select * from idx001,idx003,idx003

Results with only URLs from indices

https://localhost:9200/_opendistro/_sql?format=csv&sql=select url from idx001,idx002

Query with ORDER BY

https://localhost:9200/_opendistro/_sql?format=csv&sql=select uid, title from idx001,idx002 order by size

Query with LIMIT

https://localhost:9200/_opendistro/_sql?format=csv&sql=select uid from idx001,idx002 limit 10

Query with INCLUDE, EXCLUDE
Use exclude ('fieldname') and include ('fieldname') to exclude fields or include fields in display. Partial name of fields can be given along with * character.

https://localhost:9200/_opendistro/_sql?format=csv&sql=select exclude('a*' from idx001,idx003

https://localhost:9200/_opendistro/_sql?format=csv&sql=select exclude('c*') from idx001,idx003

Aggregations

Using GROUP BY and COUNT
Group by and count together can be used to determine the count of file types in results.

https://localhost:9200/_opendistro/_sql?format=csv&sql=select count(*) from idx001 group by contenttype

Group by can be also used to determine sentiment in sentiment analysis.

https://localhost:9200/_opendistro/_sql?format=csv&sql=select count(*) from idx001 group by content.sentiment

Query with sum function

https://localhost:9200/_opendistro/_sql?format=csv&sql=SELECT sum(size) from idx001

Queries with min and max functions

https://localhost:9200/_opendistro/_sql?sql=select min(size) from idx001
https://localhost:9200/_opendistro/_sql?sql=select max(size) from idx001

Queries with statistic functions

https://localhost:9200/_opendistro/_sql?sql=select avg(size) from idx001
https://localhost:9200/_opendistro/_sql?sql=select extended_stats(size) from idx001
https://localhost:9200/_opendistro/_sql?sql=select percentiles(size) from idx001

Beyond SQL features

Query with IS MISSING, IS NOT MISSING

https://localhost:9200/_opendistro/_sql?format=csv&sql=select * from idx001 where author is missing
https://localhost:9200/_opendistro/_sql?format=csv&sql=select * from idx001 where author is not missing

Query to search for multiple terms using IN_TERMS

https://localhost:9200/_opendistro/_sql?format=csv&sql=select * from idx001 where author= IN_TERMS(hattie,alis,george)

Query to search for a specific term using TERM

https://localhost:9200/_opendistro/_sql?format=csv&sql=select * from idx001 where author= TERM(george)

Special queries related to statistics that can be used in Elasticsearch

https://localhost:9200/_opendistro/_sql?format=csv&sql=SELECT extended_stats(age) FROM idx001
https://localhost:9200/_opendistro/_sql?format=csv&sql=SELECT percentiles(age) FROM idx001