@@ -35,7 +35,7 @@ The contents of the \texttt{test-elasticsearch.sh} script are given in Listing~\
\item Add an author document type to the library index by running the \texttt{add-author.sh} script, in a similar manner as Listing~\ref{listing:start-elasticsearch-run}.
The contents of the \texttt{add-author.sh} script are given in Listing~\ref{listing:add-author.sh}. The \texttt{cURL} command creates the library index if it does not already exist, adds the author document type if it does not already exist and adds one document. The document is defined using JSON and contains one field, which is the author name. In the current version of Elasticsearch an index can only have one document type.
The contents of the \texttt{add-author.sh} script are given in Listing~\ref{listing:add-author.sh}. The \texttt{cURL} command creates the \texttt{authors} index if it does not already exist and adds one document. The document is defined using JSON and contains one field, which is the author name.
@@ -47,7 +47,7 @@ The contents of the \texttt{list-indices.sh} script are given in Listing~\ref{li
\item List the authors that are available by running the command \texttt{get-authors.sh}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run}.
The contents of the \texttt{get-authors.sh} script are given in Listing~\ref{listing:get-authors.sh}. The \texttt{cURL} command returns all authors, since no filter constraints have been given.
The contents of the \texttt{get-authors.sh} script are given in Listing~\ref{listing:get-authors.sh}. The \texttt{cURL} command returns all authors, since no filter constraints have been included.
\item Delete the library index by running the script \texttt{delete-index-authors.sh}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run}.
Elasticsearch provides two string field types, \texttt{text} and \texttt{keyword}. The \texttt{author} field is a \texttt{text} field. With the default analyzer, the contents of a \texttt{text} field are converted to lower case when they are indexed, whereas the search string of a wildcard or regular expression (regex) query is not analyzed. This means that a search string containing upper case characters, such as \texttt{"*Hig*"}, returns no results for a wildcard or regex query, even though the \texttt{author} field of a document contains the text \texttt{Hig}. To return the expected results, either give the search string in lower case or set the \texttt{"case\_insensitive"} flag to \texttt{true}. Listing~\ref{listing:wildcard-uppercase} shows a query that returns no results, and Listing~\ref{listing:wildcard-uppercase-ok} shows a query that returns the same results as Listing~\ref{listing:search-authors.sh}.
\item Add publications to a new library index by running the script \texttt{add-publications.sh}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run}.
\begin{lstlisting}[caption={An upper case query that returns no results.},label=listing:wildcard-uppercase,numbers=none,language=java]
"query": {
  "wildcard": {
    "author": {
      "value": "*Hig*"
    }
  }
}
\end{lstlisting}
\begin{lstlisting}[caption={An upper case query that returns results.},label=listing:wildcard-uppercase-ok,numbers=none,language=java]
"query": {
  "wildcard": {
    "author": {
      "value": "*Hig*",
      "case_insensitive": true
    }
  }
}
\end{lstlisting}
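The effect of lower-casing at index time can be illustrated without a running server. The following Python sketch is only a simulation under stated assumptions: \texttt{analyze} is a rough stand-in for the default analyzer (lower case, split on whitespace), \texttt{fnmatchcase} stands in for wildcard matching, and the author name is hypothetical. It is not how Elasticsearch implements the query internally.

```python
from fnmatch import fnmatchcase


def analyze(text):
    """Rough stand-in for the default analyzer: lower-case, split on whitespace."""
    return text.lower().split()


def wildcard_matches(field_text, pattern, case_insensitive=False):
    """Return True if any indexed token matches the wildcard pattern.

    The pattern is compared as given (case-sensitively) unless
    case_insensitive is set, mirroring the flag used in the query.
    """
    tokens = analyze(field_text)
    if case_insensitive:
        pattern = pattern.lower()
    return any(fnmatchcase(token, pattern) for token in tokens)


author = "Peter Higgs"  # hypothetical author name, not taken from the scripts

print(wildcard_matches(author, "*Hig*"))                         # upper case pattern
print(wildcard_matches(author, "*hig*"))                         # lower case pattern
print(wildcard_matches(author, "*Hig*", case_insensitive=True))  # flag set
```

Only the first call fails to match, because the stored tokens are \texttt{peter} and \texttt{higgs} while the pattern still contains an upper case \texttt{H}.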
\item Add publications to a publications index by running the script \texttt{add-publications.sh}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run}.
The contents of the \texttt{add-publications.sh} script are given in Listing~\ref{listing:add-publications.sh}. This script adds three publication documents with the identifiers 1, 2 and 3. Each document contains several fields, including a list, text string fields and an integer field.
...
...
@@ -91,13 +112,17 @@ The contents of the \texttt{search-publications-2.sh} script are given in Listin
The contents of the \texttt{search-publications.py} script are given in Listing~\ref{listing:search-publications.py}. This Python script opens a connection to Elasticsearch using the host \texttt{localhost} and the port number. It then uses the \texttt{search} function to build and run a search, the body of which is given in the \texttt{body} dictionary. The result of the search is assigned to \texttt{response}. Once the search has run, the script prints the number of documents found, followed by each matching document. The \texttt{pprint} function is used to produce formatted output that is easier to read.
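The steps of building a search body and printing the results can be sketched as follows. This is a minimal illustration rather than the script itself: the helper names \texttt{build\_search\_body} and \texttt{summarise\_response}, the field names and the sample response are all assumptions, and the response is assumed to have the shape returned by Elasticsearch 7 and later, where \texttt{hits.total} is an object with a \texttt{value} entry. No server is contacted.

```python
from pprint import pformat


def build_search_body(field, text):
    """Build a match query body for one field (hypothetical helper)."""
    return {"query": {"match": {field: text}}}


def summarise_response(response):
    """Return the hit count followed by each matching document, one per line."""
    hits = response["hits"]
    lines = [f"Documents found: {hits['total']['value']}"]
    for hit in hits["hits"]:
        # pformat gives the same layout that pprint would print
        lines.append(pformat(hit["_source"]))
    return "\n".join(lines)


# Stand-in for the dictionary returned by the client's search call;
# the field names and values are invented for illustration.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "publications",
                "_id": "1",
                "_source": {"title": "Example title", "year": 2020},
            }
        ],
    }
}

print(build_search_body("title", "example"))
print(summarise_response(sample_response))
```

Keeping the formatting in a separate function makes it possible to check the output against a hand-written response before running the search against a live index.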
\item Run the script \texttt{delete-index-publications.sh}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run} to delete all of the publications.
\item Run the script \texttt{delete-publications.sh}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run} to delete all of the publications.
\item Run the Python script \texttt{add-publication.py}, in a similar manner as Listing~\ref{listing:start-elasticsearch-run}. Then list the publications to confirm that a publication has been added.
The contents of the \texttt{add-publication.py} script are given in Listing~\ref{listing:add-publication.py}. This Python script connects to the Elasticsearch server and attempts to create a new index, document type and document entry. The body contains the document definition.
The contents of the \texttt{add-publication.py} script are given in Listing~\ref{listing:add-publication.py}. This Python script connects to the Elasticsearch server and attempts to create a new index and add one document. The \texttt{body} contains the document definition.
@@ -105,10 +130,14 @@ The contents of the \texttt{add-publication.py} script are given in Listing~\ref
The contents of the \texttt{add-planets.py} script are given in Listing~\ref{listing:add-planets.py}. The script loads the \texttt{planets.json} file into memory. The JSON is then converted into a list of dictionaries, which is stored in the \texttt{json\_data} variable. A connection is made to the Elasticsearch server, and each planet is added to Elasticsearch in turn. The Moon is skipped, since it is not a planet. If a request to add a planet fails, the exception is caught at Line~30.
\item Try re-running the \texttt{add-planets.py} script and see what happens. How many document records are present before and after \texttt{add-planets.py} has been run? Use the example \texttt{get-author.sh} and modify the command to get one of the planet documents. Then check the version number.
\end{enumerate}
The \texttt{authors}, \texttt{planets} and \texttt{publications} indices can be deleted by running the \texttt{delete-authors.sh}, \texttt{delete-planets.sh} and \texttt{delete-publications.sh} scripts, respectively.
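The control flow described for \texttt{add-planets.py} (parse the JSON, skip the Moon, add each remaining record and catch failures) can be sketched as below. This is an illustration under assumptions, not the script itself: the \texttt{name} field, the sample records and the \texttt{index\_document} callable are hypothetical, and the callable stands in for the request made to the Elasticsearch server so that the sketch runs without one.

```python
import json


def add_planets(json_text, index_document):
    """Parse planet records and add each one, skipping the Moon.

    index_document is a hypothetical callable taking (doc_id, document);
    it stands in for the request that adds a document to Elasticsearch.
    Returns the number of documents added successfully.
    """
    json_data = json.loads(json_text)  # list of dictionaries
    added = 0
    for doc_id, planet in enumerate(json_data, start=1):
        if planet["name"] == "Moon":  # assumed field name
            continue  # the Moon is not a planet
        try:
            index_document(doc_id, planet)
            added += 1
        except Exception as error:
            # Report the failure and carry on with the next record
            print(f"Failed to add {planet['name']}: {error}")
    return added


# Invented sample data; the real planets.json may use other fields.
sample_json = '[{"name": "Earth"}, {"name": "Moon"}, {"name": "Mars"}]'

store = {}  # in-memory stand-in for the index
count = add_planets(sample_json, store.__setitem__)
print(count, sorted(store))
```

Passing the add operation in as a callable keeps the skipping and error handling logic separate from the connection, so it can be exercised with a plain dictionary.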