
2 posts tagged with "engine"


· 3 min read
Casion

This article explains how to download the non-default engine plugin packages corresponding to each Linkis version.

To keep the release package small, the binary installation package released by Linkis only bundles the common engines: hive/spark/python/shell. Other useful engines have corresponding modules in the project code, such as flink/io_file/pipeline/sqoop (the exact set differs between versions). For convenience, these engines have been compiled from the release branch of each Linkis version (https://github.com/apache/linkis) and are provided below for you to download and use.

| Linkis version | Engines included | Engine material package download link |
|---|---|---|
| 1.5.0 | jdbc, pipeline, io_file, flink, openlookeng, sqoop, presto, elasticsearch, trino, impala | 1.5.0-engineconn-plugin.tar |
| 1.4.0 | jdbc, pipeline, io_file, flink, openlookeng, sqoop, presto, elasticsearch, trino, impala | 1.4.0-engineconn-plugin.tar |
| 1.3.2 | jdbc, pipeline, io_file, flink, openlookeng, sqoop, presto, elasticsearch, trino, seatunnel | 1.3.2-engineconn-plugin.tar |
| 1.3.1 | jdbc, pipeline, io_file, flink, openlookeng, sqoop, presto, elasticsearch, trino, seatunnel | 1.3.1-engineconn-plugin.tar |
| 1.3.0 | jdbc, pipeline, io_file, flink, openlookeng, sqoop, presto, elasticsearch | 1.3.0-engineconn-plugin.tar |
| 1.2.0 | jdbc, pipeline, flink, openlookeng, sqoop, presto, elasticsearch | 1.2.0-engineconn-plugin.tar |
| 1.1.3 | jdbc, pipeline, flink, openlookeng, sqoop | 1.1.3-engineconn-plugin.tar |
| 1.1.2 | jdbc, pipeline, flink, openlookeng, sqoop | 1.1.2-engineconn-plugin.tar |
| 1.1.1 | jdbc, pipeline, flink, openlookeng | 1.1.1-engineconn-plugin.tar |
| 1.1.0 | jdbc, pipeline, flink | 1.1.0-engineconn-plugin.tar |
| 1.0.3 | jdbc, pipeline, flink | 1.0.3-engineconn-plugin.tar |

Engine type

| Engine name | Supported component versions (default dependency) | Linkis version requirement | Included in release package by default | Description |
|---|---|---|---|---|
| Spark | Apache Spark 2.0.0~2.4.7, CDH >= 5.4.0 (default Apache Spark 2.4.3) | >=1.0.3 | Yes | Spark EngineConn, supports SQL, Scala, PySpark, and R code |
| Hive | Apache Hive >= 1.0.0, CDH >= 5.4.0 (default Apache Hive 2.3.3) | >=1.0.3 | Yes | Hive EngineConn, supports HiveQL code |
| Python | Python >= 2.6 (default Python 2*) | >=1.0.3 | Yes | Python EngineConn, supports Python code |
| Shell | Bash >= 2.0 | >=1.0.3 | Yes | Shell EngineConn, supports Bash shell code |
| JDBC | MySQL >= 5.0, Hive >= 1.2.1 (default Hive JDBC 2.3.4) | >=1.0.3 | No | JDBC EngineConn; already supports MySQL, Oracle, KingBase, PostgreSQL, SQL Server, DB2, Greenplum, DM, Doris, ClickHouse, TiDB, StarRocks, GaussDB, and OceanBase, and can be quickly extended to other engines that provide a JDBC driver package, such as SQLite |
| Flink | Flink >= 1.12.2 (default Apache Flink 1.12.2) | >=1.0.2 | No | Flink EngineConn, supports FlinkSQL code and starting a new Yarn application in Flink Jar mode |
| Pipeline | - | >=1.0.2 | No | Pipeline EngineConn, supports file import and export |
| openLooKeng | openLooKeng >= 1.5.0 (default openLooKeng 1.5.0) | >=1.1.1 | No | openLooKeng EngineConn, supports querying the openLooKeng data virtualization engine with SQL |
| Sqoop | Sqoop >= 1.4.6 (default Apache Sqoop 1.4.6) | >=1.1.2 | No | Sqoop EngineConn, supports the Sqoop data migration tool |
| Presto | Presto >= 0.180 | >=1.2.0 | No | Presto EngineConn, supports Presto SQL code |
| ElasticSearch | ElasticSearch >= 6.0 | >=1.2.0 | No | ElasticSearch EngineConn, supports SQL and DSL code |
| Trino | Trino >= 371 | >=1.3.1 | No | Trino EngineConn, supports Trino SQL code |
| Seatunnel | Seatunnel >= 2.1.2 | >=1.3.1 | No | Seatunnel EngineConn, supports Seatunnel SQL code |

Install engine guide

After downloading the engine material package, extract it:

tar -xvf 1.0.3-engineconn-plugin.tar
cd 1.0.3-engineconn-plugin

Copy the engine material you need into the Linkis engine plugin directory, then refresh the engine material.

For the detailed process, refer to Installing the EngineConnPlugin Engine.
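As a concrete sketch of the copy step (the `LINKIS_HOME` location and the engine chosen here are assumptions; match them to your actual deployment, and then refresh or restart the engine plugin service as described in the EngineConnPlugin docs). The snippet simulates the deployment layout in a temporary directory so the commands are reproducible:

```shell
# Illustrative only: LINKIS_HOME and the plugin path are assumptions;
# point them at your real Linkis deployment in practice.
LINKIS_HOME="$(mktemp -d)/linkis"
ENGINE_PLUGIN_DIR="$LINKIS_HOME/lib/linkis-engineconn-plugins"
mkdir -p "$ENGINE_PLUGIN_DIR"

# Simulate an unpacked material package that contains a 'sqoop' engine
mkdir -p 1.0.3-engineconn-plugin/sqoop/dist 1.0.3-engineconn-plugin/sqoop/plugin

# Copy only the engine you need into the Linkis engine plugin directory
cp -r 1.0.3-engineconn-plugin/sqoop "$ENGINE_PLUGIN_DIR/"

# The copied engine should now be listed in the plugin directory
ls "$ENGINE_PLUGIN_DIR"
```

After the copy, the new material only takes effect once the engine material is refreshed, per the linked installation guide.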

· 3 min read
Peacewong

Overview

openLooKeng is an "out of the box" engine that supports in-situ analysis of any data, anywhere, including geographically remote data sources. It provides a global view of all data through a SQL 2003 interface. openLooKeng features high availability, auto-scaling, built-in caching and indexing support, providing the reliability needed for enterprise workloads.

openLooKeng is used to support data exploration, ad hoc query, and batch processing, with near-real-time latency ranging from over 100 milliseconds to minutes, without moving data. openLooKeng also supports hierarchical deployment, enabling geographically remote openLooKeng clusters to participate in the same query. With its cross-region query plan optimization capabilities, queries involving remote data can achieve near-"local" performance.

Linkis implements an openLooKeng engine, giving Linkis data virtualization capability and support for submitting cross-source heterogeneous, cross-domain, and cross-DC query tasks. As a computing middleware, Linkis can connect to more underlying computing and storage components by using openLooKeng's connectors on top of the connectivity of Linkis' EngineConn.
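As an illustration of the submission path, once the openLooKeng EC is installed, a query can be sent through Linkis' shell client roughly as follows (the submit user, the engineType label/version, and the SQL are assumptions, and the command requires a running Linkis deployment; adjust everything to your environment):

```shell
sh ./bin/linkis-cli -submitUser hadoop \
    -engineType openlookeng-1.5.0 -codeType sql \
    -code 'select * from system.runtime.nodes'
```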

Development implementation

The openLooKeng EC implementation is an extension based on the EngineConn Plugin (ECP) of Linkis. Because the openLooKeng service supports multiple users querying through its client, the EC is implemented as a multi-user concurrent engine: tasks submitted by multiple users can run in one EC process at the same time, which greatly improves EC resource reuse and reduces waste. The class diagram is as follows:

(class diagram image missing)

Specifically, openLooKengEngineConnExecutor inherits from ConcurrentComputationExecutor, supporting multi-user, multi-task concurrency as well as connecting to multiple different openLooKeng clusters.

Architecture

Architecture diagram: (image)

The task flow diagram is as follows: (image)

Combining Linkis and openLooKeng provides the following capabilities:

    1. Based on the connection capability of the Linkis computing middleware layer, upper-layer application tools can quickly connect to openLooKeng, submit tasks, and obtain logs, progress, and results.
    2. Based on the public service capability of Linkis, custom variable substitution, UDF management, and similar features can be applied to openLooKeng SQL.
    3. Based on the context capability of Linkis, openLooKeng results can be passed to downstream ECs such as Spark and Hive for further querying.
    4. Based on the resource management and multi-tenancy capabilities of Linkis, tasks and tenants can be isolated when using openLooKeng resources.
    5. Based on openLooKeng's connector capability, upper-layer application tools can submit cross-source heterogeneous, cross-domain, and cross-DC queries and get second-level responses.

Follow-up plans

In the future, the two communities will continue to cooperate and plan to launch the following functions:

    1. Linkis supports openLooKeng on Yarn mode.
    2. Linkis completes resource management and control for openLooKeng, so tasks can be queued by Linkis and submitted only when resources are sufficient.
    3. Based on openLooKeng's mixed-computation capability, the Linkis Orchestrator will be optimized to support mixed computation across ECs in a subsequent plan.