QOSIL

Your Big Data Partner

DATA CATALOG, DATA WRANGLING AND WAREHOUSING


INGEST

Self-service data ingest with data cleansing, validation, and automatic profiling.
Organizations can expend significant engineering effort moving data into Hadoop yet still struggle to maintain governance and data quality. Nucleus dramatically simplifies ingest by shifting the task to data owners through a simple guided UI.
Nucleus can connect to most sources and infer schema from common data formats. Nucleus’s default ingest workflow moves data from source to Hive tables with advanced configuration options around field-level validation, data protection, data profiling, security, and overall governance.
Using Nucleus’s pipeline template mechanism, IT can extend Nucleus’s capabilities to connect to any source, any format, and load data into any target in a batch or streaming pattern.
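To make the default ingest pattern concrete, here is a minimal PySpark sketch of the same source-to-Hive flow: infer a schema from a delimited file, apply field-level validation, and land valid rows in a Hive table. It illustrates the pattern only, not Nucleus's API; the paths, table names, and validation rules are hypothetical.

```python
# Illustrative sketch of the ingest pattern described above (not Nucleus's API):
# read a delimited source, infer its schema, apply field-level validation,
# and land valid rows in a Hive table. Paths and names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("ingest-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Infer schema from a common format (CSV with a header row).
raw = (spark.read
       .option("header", True)
       .option("inferSchema", True)
       .csv("/landing/orders/2020-06-01/*.csv"))

# Field-level validation: required key present and amounts non-negative.
valid = raw.filter(F.col("order_id").isNotNull() & (F.col("amount") >= 0))
invalid = raw.subtract(valid)

# Route good rows to the target Hive table and quarantine the rest.
valid.write.mode("append").saveAsTable("sales.orders")
invalid.write.mode("append").saveAsTable("sales.orders_invalid")
```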

DESIGN

Design batch or streaming pipeline templates in Flow and register them with Nucleus to enable user self-service. IT designers can extend Nucleus’s feed capabilities around ingest, transformation, and export by developing new pipeline templates in Flow. Flow provides a visual canvas with over 180 data connectors and transforms for batch and stream-based processing. Nucleus and Flow together act as an “intelligent edge” able to orchestrate tasks between your cluster and data center.

Designers develop and test new pipelines in Flow and register templates with Nucleus, determining which properties users are allowed to configure when creating feeds. This embodies the write-once, use-many principle and enables data owners, rather than engineers, to create new feeds while IT retains control over the underlying dataflow patterns.
Nucleus adds a suite of Flow processors for Spark, Sqoop, Hive, and special purpose data lake primitives that provide additional capabilities.
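As a rough illustration of the write-once, use-many idea, the hypothetical template descriptor below separates the handful of properties a data owner may configure from the properties IT locks down. The field names are invented for this sketch and do not reflect Nucleus's actual registration format.

```python
# Hypothetical sketch of the write-once, use-many idea: a designer-built
# pipeline template exposes only a small set of user-configurable properties,
# while the rest of the dataflow stays under IT control. Field names are
# illustrative, not Nucleus's actual registration format.
import json

template = {
    "templateName": "database-to-hive-batch",
    "description": "Batch ingest from a JDBC source into Hive",
    # Properties a data owner may set when creating a feed from this template.
    "userConfigurableProperties": [
        {"name": "source.jdbc.table", "required": True},
        {"name": "target.hive.schema", "default": "landing"},
        {"name": "schedule.cron", "default": "0 0 * * *"},
    ],
    # Everything else (connections, error routing, retries) stays locked.
    "lockedProperties": ["source.jdbc.url", "error.queue", "retry.count"],
}

print(json.dumps(template, indent=2))
```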


PREPARE

Wrangle data with visual SQL and interactive transforms through a simple user interface. Preparing data is the first step in any analytics project. Using Nucleus’s transformation feature, IT can step aside and let power users such as data analysts take control of their own data preparation tasks. Nucleus leverages the latest capabilities of Apache Spark to deliver interactive data transformation. Organizations can retire their old ETL tools and save hundreds of thousands of dollars in license and maintenance fees.
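The sketch below shows the kind of Spark-backed preparation step the transform feature exposes interactively: standardize a field, derive a new column, and deduplicate, written here directly in PySpark for illustration. The table and column names are hypothetical.

```python
# A minimal sketch of a Spark-backed wrangling step: standardize a field,
# derive a new column, and drop duplicates. Names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

customers = spark.table("landing.customers")

prepared = (customers
            .withColumn("email", F.lower(F.trim(F.col("email"))))
            .withColumn("signup_year", F.year(F.col("signup_date")))
            .dropDuplicates(["customer_id"]))

prepared.write.mode("overwrite").saveAsTable("curated.customers")
```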

MONITOR

Monitor the health of feeds and services in the data lake. Track SLAs and troubleshoot performance. IT Operations are the caretakers of your production data lake. Nucleus departs from traditional monitoring tools to provide health indicators from a feed-centric perspective. This means Operations not only has visibility into service issues impacting availability, but can also track service levels associated with data arrival and data quality metrics. Using Nucleus, IT can give users confidence in the data maintained in the data lake.
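A feed-centric SLA check can be as simple as comparing when data actually arrived, and how much of it passed validation, against the targets agreed for that feed. The sketch below illustrates the idea with a hypothetical feed registry; it is not how Nucleus stores or evaluates SLAs.

```python
# Illustrative sketch of a feed-centric SLA check: did today's data for a feed
# arrive before its deadline, and did enough valid rows land? The feed
# registry and thresholds are hypothetical stand-ins for Nucleus metadata.
from datetime import datetime, time

feeds = [
    {"name": "sales.orders", "deadline": time(6, 0), "min_rows": 10_000},
]

def check_sla(feed, arrived_at, row_count):
    """Return a list of SLA violations for one feed run."""
    violations = []
    if arrived_at.time() > feed["deadline"]:
        violations.append(f"{feed['name']}: data arrived late at {arrived_at:%H:%M}")
    if row_count < feed["min_rows"]:
        violations.append(f"{feed['name']}: only {row_count} valid rows landed")
    return violations

# Example run: data landed at 06:45 with too few rows -> two violations.
print(check_sla(feeds[0], datetime(2020, 6, 1, 6, 45), 8_500))
```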


DISCOVER

Search and explore data and metadata, view lineage, and profile statistics. What’s the point of having a data lake if users can’t find data or trust what is there? Nucleus includes an integrated metadata repository and key capabilities for data exploration. Users can perform Google-like searches against data and metadata to discover entities of interest. Visual process lineage and provenance provide confidence in the origin of data. Automatic data profiling gives data scientists useful statistics and provides assurance of data quality.
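The profile statistics surfaced alongside each dataset are of the kind the PySpark sketch below computes: row counts, distinct counts, null counts, and value ranges. The table and column names are hypothetical, and Nucleus computes its profiles automatically at ingest rather than through hand-written jobs like this.

```python
# Illustrative only: the sort of per-column profile a data catalog surfaces.
# Table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
orders = spark.table("sales.orders")

profile = orders.agg(
    F.count(F.lit(1)).alias("row_count"),
    F.countDistinct("order_id").alias("distinct_order_ids"),
    F.sum(F.col("amount").isNull().cast("int")).alias("null_amounts"),
    F.min("order_date").alias("earliest_order"),
    F.max("order_date").alias("latest_order"),
)

profile.show(truncate=False)
```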


EXPLORE THE QCLOUD COMPONENTS


Copyright © 2020 Qosil, Ltd.
A subsidiary of The Freedom Nation
