Commit 764e1c81 authored by Maiken
parents c31e9689 b62f954b
Prerequisites
=============
The tool works by converting JURA's archived records in XML format into JSON
and submitting them to a specified ES instance. It is therefore assumed that
the archiving option for JURA is turned on in arc.conf.
It is also assumed that ARC is installed on the machine. If it is not,
the tool will fail because it cannot find the files it needs.
Please note that the tool is tested only against the latest version of
ES and the latest 'elasticsearch-py' module. The current version of
the master tree is not guaranteed to work with older ES versions.
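The XML-to-JSON step described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual conversion code: the record layout, the namespace and the element names in SAMPLE are hypothetical, and the flattening rule (leaf elements only, repeated names collected into a list) is an assumption.

```python
import json
import xml.etree.ElementTree as ET


def strip_namespace(tag):
    """Drop a leading '{namespace}' from an ElementTree tag name."""
    return tag.split("}", 1)[-1]


def record_to_dict(xml_text):
    """Flatten the leaf elements of one XML record into a plain dict.

    Repeated leaf names are collected into a list so that no value is lost.
    """
    root = ET.fromstring(xml_text)
    doc = {}
    for element in root.iter():
        if len(element) > 0:
            continue  # only leaf elements carry values in this sketch
        text = (element.text or "").strip()
        if not text:
            continue
        key = strip_namespace(element.tag)
        if key in doc:
            if not isinstance(doc[key], list):
                doc[key] = [doc[key]]
            doc[key].append(text)
        else:
            doc[key] = text
    return doc


# Hypothetical record, used only to exercise the function.
SAMPLE = """<JobRecord xmlns="urn:example:usage">
  <JobId>job-1</JobId>
  <User>alice</User>
  <Endpoint>a.example.org</Endpoint>
  <Endpoint>b.example.org</Endpoint>
</JobRecord>"""

document = record_to_dict(SAMPLE)
print(json.dumps(document, sort_keys=True))
```

Submitting the resulting document to ES needs a live cluster and is therefore left out of the sketch.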
Installation
Requirements
============
1. Make sure the following Python modules are present on the endpoint (many are
part of the default Python installation):
Python 2.7 is recommended. Python 2.6 can be used if you manage to install
version 6.3.0 of the elasticsearch Python module with it.
The following Python modules are needed (many are part of the default
Python installation):
* xml
* json
* glob
......@@ -27,14 +34,32 @@ part of default python installation):
* traceback
* elasticsearch
* watchdog
2. In the source code directory, run 'python setup.py install'.
2a. If you need an ES host/port or index other than the hardcoded
defaults,
run 'python setup.py install --eshost <hostname> --esport <portnumber>
* platform
* subprocess (only if you use setup.py)
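A quick way to check the listed modules before installing is sketched below. It only covers the module names visible in the list above (the collapsed part of the list is deliberately not guessed at), and the helper name missing_modules is made up for this example.

```python
import importlib

# Modules named in the list above; the part of the list that is collapsed
# in this view is not guessed at.
REQUIRED_MODULES = ["xml", "json", "glob", "traceback", "elasticsearch",
                    "watchdog", "platform", "subprocess"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED_MODULES)
    if absent:
        print("Missing modules: " + ", ".join(absent))
    else:
        print("All listed modules are importable.")
```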
Installation
============
1. setup.py only supports RedHat 6 and 7 systems. If you have another OS,
see 'python jura_to_es.py --help' and launch the tool manually.
1a. Remember to rerun setup.py if any parameter (ES host, index, etc.)
changes in the future.
2. On RedHat 6.x:
2a. In the source code directory, run 'python setup.py install'.
2b. If you need an ES host/port or index other than the hardcoded
defaults, run
'python setup.py install --eshost <hostname> --esport <portnumber>
--esindex <indexname>'
3. Start the tool with '/etc/init.d/jura_to_es start'
4. If needed, enable the 'jura_to_es' service at boot with chkconfig
or systemctl.
2c. Start the tool with '/etc/init.d/jura_to_es start'
2d. If needed, enable the 'jura_to_es' service at boot with chkconfig.
3. On RedHat 7.x:
3a. In the source code directory, run
'python setup.py install --juradir <path_to_your_jura_archiving_dir>'.
3b. If you need an ES host/port or index other than the hardcoded
defaults, see 2b above.
3c. Start the tool with 'systemctl start jura_to_es'
3d. If needed, enable the 'jura_to_es' service at boot with
'systemctl enable jura_to_es'.
Logging
=======
......@@ -44,6 +69,7 @@ also installs a logrotate entry for this file.
Some technical stuff
====================
* It's recommended to apply the mapping to your index before you start
submitting data:
curl -XPUT http://your_es_host:your_es_port/your_es_index \
......@@ -51,6 +77,11 @@ data:
* The init script has the 'dmytrok_arc_test' index hardcoded. It should be
changed when/if we create new production indices for ARC data.
* See also 'python jura_to_es.py --help'.
* On RedHat 6.x, if launched manually in monitoring mode, the process forks
and the parent exits. On RedHat 7.x it does not fork and stays in the foreground;
otherwise the tool would not work under systemd when installed using setup.py.
Systemd daemonizes the commands it is given, and does not cope well with a tool
that daemonizes itself internally.
* The tool does not currently work in the most efficient way. It picks up a new
file from the JURA archive dir, converts it, and submits it to the ES endpoint.
Since ARC creates one record per accounting endpoint, for the same job -- it
......
......@@ -209,8 +209,18 @@ def handler(signum, frame):
 def monitor_dir(xml_dir, es, es_index, record_prefix, pidfile, logfile):
     logging.basicConfig(filename=logfile, level=logging.INFO, format='%(asctime)s [%(levelname)s] %(message)s')
-    # create daemon
-    daemonize(pidfile)
+    import platform
+    on_redhat_7 = False
+    OS_name = platform.platform().split("with-", 1)[1]
+    if (OS_name.startswith("centos") or OS_name.startswith("redhat")):
+        OS_version = OS_name.split("-")[1]
+        if OS_version.startswith("7"):
+            on_redhat_7 = True
+    if not on_redhat_7:
+        # create daemon
+        daemonize(pidfile)
     # add signal handler
     signal.signal(signal.SIGTERM, handler)
......
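The OS check in the hunk above decides whether the process forks. A standalone sketch of that decision follows. The function names and the sample platform strings are assumptions made for illustration. Unlike the committed code, which indexes the result of split("with-") directly and would raise IndexError on a platform string without that marker, this sketch returns None for such strings and therefore falls back to daemonizing.

```python
import re


def redhat_major_version(platform_string):
    """Return '6', '7', ... for CentOS/RedHat platform strings, else None.

    Expects strings shaped like the output of platform.platform() on the
    targeted systems, e.g. '...-with-centos-7.4.1708-Core'.
    """
    match = re.search(r"with-(?:centos|redhat)-(\d+)", platform_string)
    return match.group(1) if match else None


def should_daemonize(platform_string):
    """Fork into the background everywhere except on RedHat/CentOS 7."""
    return redhat_major_version(platform_string) != "7"


# Illustrative platform strings (assumed, not taken from real hosts).
EL7 = "Linux-3.10.0-693.el7.x86_64-x86_64-with-centos-7.4.1708-Core"
EL6 = "Linux-2.6.32-696.el6.x86_64-x86_64-with-redhat-6.9-Santiago"

print(should_daemonize(EL7), should_daemonize(EL6))
```

In the tool itself the argument would come from platform.platform(). Keeping the parsing in a pure function makes it possible to test the decision without running on each OS.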
[Unit]
Description=JURA to ES conversion tool
After=multi-user.target
[Service]
Type=idle
ExecStart=#CMDLAUNCHLINE#
[Install]
WantedBy=multi-user.target
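The #CMDLAUNCHLINE# placeholder in the unit template above is filled in by setup.py at install time. A minimal sketch of that substitution is given below. The helper name fill_template and the archive path in the example command are hypothetical. The sketch uses plain str.replace rather than the re.sub call in setup.py, so that backslashes in the command line are inserted literally.

```python
PLACEHOLDER = "#CMDLAUNCHLINE#"

UNIT_TEMPLATE = """[Unit]
Description=JURA to ES conversion tool
After=multi-user.target

[Service]
Type=idle
ExecStart=#CMDLAUNCHLINE#

[Install]
WantedBy=multi-user.target
"""


def fill_template(template, command_line):
    """Replace the placeholder literally, without regex escape processing."""
    if PLACEHOLDER not in template:
        raise ValueError("template has no %s placeholder" % PLACEHOLDER)
    return template.replace(PLACEHOLDER, command_line)


# Hypothetical launch line; the archive path is an assumption.
command = ("/usr/bin/python -c 'import jura_to_es; "
           "jura_to_es.main([\"-m\", \"-d\", \"/path/to/jura/archive\"])'")
unit = fill_template(UNIT_TEMPLATE, command)
print(unit)
```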
 from distutils.core import setup
-import sys, os, re
+import sys, os, re, platform
 es_host = None
 es_port = None
......@@ -20,31 +20,87 @@ if '--esindex' in sys.argv:
     sys.argv.pop(index) # Removes the '--esindex'
     es_index = sys.argv.pop(index) # Returns the element after the '--esindex'
-tool_command_line = "CMD=\"$CMD -c 'import jura_to_es; jura_to_es.main([\\\"-m\\\", \\\"-d\\\", \\\"$JURA_ARCHIVING_DIR\\\""
-if es_host:
-    tool_command_line += ", \\\"-h\\\", \\\"" + es_host + "\\\""
+OS_name = platform.platform().split("with-", 1)[1]
+if not(OS_name.startswith("centos") or OS_name.startswith("redhat")):
+    print "Unsupported OS. Please, read the help of the python script itself"
+    print "and launch the command directly. Exiting."
+    sys.exit(1)
-if es_port:
-    tool_command_line += ", \\\"-p\\\", \\\"" + es_port + "\\\""
+OS_version = OS_name.split("-")[1]
+if OS_version.startswith("6"):
+    install_version = "6"
+elif OS_version.startswith("7"):
+    install_version = "7"
+else:
+    print "Unsupported OS version. Please, read the help of the python script itself"
+    print "and launch the command directly. Exiting."
+    sys.exit(1)
-if es_index:
-    tool_command_line += ", \\\"-i\\\", \\\"" + es_index + "\\\""
+if install_version == "6":
+    tool_command_line = "CMD=\"$CMD -c 'import jura_to_es; jura_to_es.main([\\\"-m\\\", \\\"-d\\\", \\\"$JURA_ARCHIVING_DIR\\\""
-tool_command_line += "])'\""
+    if es_host:
+        tool_command_line += ", \\\"-h\\\", \\\"" + es_host + "\\\""
+    if es_port:
+        tool_command_line += ", \\\"-p\\\", \\\"" + es_port + "\\\""
+    if es_index:
+        tool_command_line += ", \\\"-i\\\", \\\"" + es_index + "\\\""
+    tool_command_line += "])'\""
+    init_template_name = 'jura_to_es.init.d'
+    init_file_name = 'jura_to_es'
+    init_location = '/etc/init.d/'
+elif install_version == "7":
+    if '--juradir' in sys.argv:
+        juradir = sys.argv.index('--juradir')
+        sys.argv.pop(juradir) # Removes the '--juradir'
+        jura_dir = sys.argv.pop(juradir) # Returns the element after the '--juradir'
+    else:
+        print "For this OS version you have to specify the location of JURA archiving dir."
+        print "Please, rerun setup.py with --juradir parameter specified."
+        print "Exiting."
+        sys.exit(1)
+    tool_command_line = "/usr/bin/python -c 'import jura_to_es; jura_to_es.main([\"-m\", \"-d\", \"" + jura_dir + "\""
+    if es_host:
+        tool_command_line += ", \"-h\", \"" + es_host + "\""
+    if es_port:
+        tool_command_line += ", \"-p\", \"" + es_port + "\""
+    if es_index:
+        tool_command_line += ", \"-i\", \"" + es_index + "\""
+    tool_command_line += "])'"
+    init_template_name = 'jura_to_es.service.unit'
+    init_file_name = 'jura_to_es.service'
+    init_location = '/lib/systemd/system/'
 cmdlaunchline = re.compile('#CMDLAUNCHLINE#', re.DOTALL)
-f = open('jura_to_es.init.d', 'r')
+f = open(init_template_name, 'r')
 init_script = f.read()
 f.close()
 completed_init_script = cmdlaunchline.sub(tool_command_line, init_script)
-f = open('jura_to_es', 'w')
+f = open(init_file_name, 'w')
 f.write(completed_init_script)
 f.close()
-os.chmod('jura_to_es', 0755)
+data_files = [('/etc/logrotate.d/', ['jura-to-es']), (init_location, [init_file_name])]
+if install_version == "6":
+    os.chmod(init_file_name, 0755)
+elif install_version == "7":
+    os.chmod(init_file_name, 0644)
 setup(name='jura_to_es',
       version = '1.0',
......@@ -54,5 +110,9 @@ setup(name='jura_to_es',
       url = 'https://neic.nordforsk.org/activities/nt1/',
       long_description = "This tool converts the archived job records, created by JURA, to JSON and submits them to ElasticSearch cluster.",
       py_modules = ['jura_to_es'],
-      data_files = [('/etc/init.d/', ['jura_to_es']), ('/etc/logrotate.d/', ['jura-to-es'])]
+      data_files = data_files
       )
+if install_version == "7":
+    from subprocess import call
+    call(["systemctl", "daemon-reload"])
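setup.py removes its own options ('--eshost', '--esport', '--esindex', '--juradir') from sys.argv before distutils sees them, repeating the same index/pop pattern for each flag. That pattern could be factored into one helper, sketched below. The name pop_option and the error handling for a flag without a value are assumptions and are not part of the commit; the host name in the example is made up.

```python
def pop_option(argv, flag, default=None):
    """Remove `flag` and the value after it from argv; return the value.

    Returns `default` when the flag is absent and raises ValueError when the
    flag is the last argument and therefore has no value.
    """
    if flag not in argv:
        return default
    position = argv.index(flag)
    if position + 1 >= len(argv):
        raise ValueError("%s requires a value" % flag)
    argv.pop(position)         # removes the flag itself
    return argv.pop(position)  # removes and returns its value


# Example argument list; the host name is hypothetical.
argv = ["setup.py", "install", "--eshost", "es.example.org", "--esport", "9200"]
es_host = pop_option(argv, "--eshost")
es_port = pop_option(argv, "--esport")
es_index = pop_option(argv, "--esindex")
print(argv, es_host, es_port, es_index)
```

After the calls only the arguments meant for distutils remain in the list, which is the state setup() expects.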