Dependencies
Component Versions
The table below lists the major components Kylo uses, along with the versions Kylo currently supports:
Category | Item | Version | Description |
---|---|---|---|
Persistence | MySQL | 5.x (tested with 5.1.73) | Stores both the ModeShape (JCR 2.0) metadata and the operational relational (Kylo Ops Manager) metadata |
Persistence | Postgres | 9.x | Not fully supported yet. The Kylo Operations Manager piece should work with Postgres; however, it has not been fully tested with Feed Manager. (See the persistence note below.) |
Persistence | ModeShape | 5.0 | JBoss ModeShape Java Content Repository (JCR 2.0) |
JMS | ActiveMQ | 5.x (tested with 5.13.3) | Sends messages between modules and sends provenance from NiFi to Kylo |
NiFi | NiFi | 1.0 (HDF 2.0) | Either HDF or open source NiFi works. |
Spark | Spark | 1.5.x, 1.6.x, 2.x | NiFi and Kylo have routines that leverage Spark. |
UI | Tomcat | 8.0.32 | Tomcat is the default engine for Spring Boot. If needed, Spring Boot allows you to change to a different server (e.g. Jetty), but this has not been tested. |
Java | Java | Java 8u92+ | The Kylo install sets up its own Java home, so it does not affect any other Java versions running on the machine. |
Search | Elasticsearch | 2.3.x | Indexes and searches Hive metadata, and indexes feed data when selected as part of creating a feed |
OS | Linux | Various | Tested with RHEL and CentOS 6.x and 7.x, and SUSE 11 |
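The version ranges in the table can be checked mechanically before an install. The following is a minimal sketch, not part of Kylo: the `SUPPORTED` mapping and the `is_supported` helper are hypothetical names, and the patterns are simply transcribed from the table above. Java, Tomcat, and the OS are left out because their entries are not plain major.minor ranges.

```python
import re

# Supported major.minor prefixes, transcribed from the table above.
# This mapping is illustrative; it is not read from any Kylo configuration.
SUPPORTED = {
    "mysql": r"5\.\d+",             # 5.x
    "postgres": r"9\.\d+",          # 9.x (not fully supported yet)
    "modeshape": r"5\.0",           # 5.0
    "activemq": r"5\.\d+",          # 5.x
    "nifi": r"1\.0",                # 1.0
    "spark": r"(1\.[56]|2\.\d+)",   # 1.5.x, 1.6.x, 2.x
    "elasticsearch": r"2\.3",       # 2.3.x
}


def is_supported(component: str, version: str) -> bool:
    """Return True if `version` falls in the range listed for `component`."""
    pattern = SUPPORTED.get(component.lower())
    if pattern is None:
        raise KeyError(f"unknown component: {component}")
    # The prefix must be followed by a dot or the end of the string,
    # so that "2.30" is not mistaken for "2.3.x".
    return re.match(rf"{pattern}(\.|$)", version) is not None
```

For example, `is_supported("MySQL", "5.1.73")` evaluates to `True`, while `is_supported("Elasticsearch", "2.4.0")` evaluates to `False`.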
Service Accounts
The new Linux service accounts that Kylo requires are listed below. Within enterprises, obtaining service accounts often requires approvals and long lead times. Kerberos principals are required where a service interacts with a Kerberized Hadoop cluster. These services are not typically deployed to control and data nodes. The NiFi, ActiveMQ, and Elasticsearch services and the Kylo metastore database (MySQL or Postgres) are I/O intensive.
Service | Purpose | Local Linux Users | Local Linux Groups | Keytab File | UPN | SPN |
---|---|---|---|---|---|---|
kylo-services | Kylo Coordinator | kylo | kylo, hdfs or supergroup | /etc/security/keytabs/kylo.headless.keytab | *kylo@EXAMPLE.COM* | |
kylo-ui | Provides Kylo feed and operations user interface | kylo | kylo, hdfs or supergroup | | | |
nifi | Orchestrate data flows | nifi | nifi, hdfs or supergroup | /etc/security/keytabs/nifi.headless.keytab | *nifi@EXAMPLE.COM* | |
activemq | Broker messages between components | activemq | activemq | | | |
elastic | Manages searchable index | elastic | elastic | | | |
mysql or postgres | Metastore for Kylo feed manager and operational metadata | mysql or postgres | mysql or postgres | | | |
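Once the accounts have been provisioned, their presence on a host can be verified with the Python standard library. The sketch below is an illustration under assumptions rather than a Kylo tool: `REQUIRED_USERS`, `REQUIRED_GROUPS`, and `missing` are hypothetical names, and the tuples are transcribed from the table above, with each tuple listing names that are acceptable alternatives (for example `hdfs` or `supergroup`). It only checks that the users and groups exist, not that each user belongs to the listed groups.

```python
import grp
import pwd

# Each tuple lists alternatives; any one existing name satisfies the entry.
REQUIRED_USERS = [
    ("kylo",), ("nifi",), ("activemq",), ("elastic",), ("mysql", "postgres"),
]
REQUIRED_GROUPS = [
    ("kylo",), ("nifi",), ("hdfs", "supergroup"),
    ("activemq",), ("elastic",), ("mysql", "postgres"),
]


def user_exists(name: str) -> bool:
    """Return True if a local or directory-backed user with this name resolves."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def group_exists(name: str) -> bool:
    """Return True if a group with this name resolves."""
    try:
        grp.getgrnam(name)
        return True
    except KeyError:
        return False


def missing(requirements, exists):
    """Return the entries for which none of the alternative names exists."""
    return [alts for alts in requirements if not any(exists(n) for n in alts)]


if __name__ == "__main__":
    print("missing users: ", missing(REQUIRED_USERS, user_exists))
    print("missing groups:", missing(REQUIRED_GROUPS, group_exists))
```

Passing the lookup function as an argument keeps `missing` independent of the host, so the same logic can be exercised against a stubbed account list.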
Persistence Usage
Kylo captures and stores three types of metadata:
- Setup/configuration metadata. This data is captured in the Kylo Feed Manager and describes how feeds, categories, templates, etc. are structured.
  - It is stored through the Java Content Repository (JCR 2.0) API, using ModeShape as the JCR implementation, and persisted to MySQL.
- Operational/transactional metadata. This data is captured by the system when it processes data, feeds, SLAs, etc.
  - It is stored in MySQL.
- Searchable data index. This data is captured by data flows in which the ‘Index’ field check box is selected.
  - It is stored in Elasticsearch.