by Michael Rink

Hitachi Vantara Updates Pentaho

Hitachi Vantara, a wholly owned subsidiary of Hitachi, Ltd. (TSE: 6501), today announced that Pentaho 8.2 will be generally available on December 6, 2018, just a few days away. Pentaho is the company’s data integration and analytics platform software. Version 8.2 brings tighter integration with Hitachi Content Platform (HCP), better support for third-party tools, unstructured data pipelines, and improved JSON support.

You can now access the Hitachi Content Platform (HCP) distributed storage system from Pentaho's Virtual File System browser. Within HCP, access control lists grant users privileges to perform various file operations, and namespaces are used for logical groupings, access, and object metadata (such as retention and shred settings). Pentaho 8.2 will be able to prepare, cleanse, and normalize data within HCP. Hitachi hopes that Pentaho can also be used to better manage cloud infrastructure costs by sending each cloud target only the information it needs. In support of this, 8.2 will add support for three third-party technologies:

  • AMQP support: Pentaho customers can now use this popular messaging protocol, which helps organizations read and publish streaming data from edge devices to the cloud for emerging IoT use cases.
  • Python Executor step: The Python Executor step lets you run Python (CPython) scripts inside your transformations. This new PDI step is useful for data scientists and engineers who want to leverage machine learning and deep learning methods, model management strategies, and integration with data science notebooks. With native support for pandas DataFrames and NumPy arrays, the Python Executor step can read data from various sources, modify and derive values from the data, then provide the output as a set of PDI fields. The step offers two ways to execute a script: running a script file from a local or hosted location, or manually embedding the script inside the step.
  • OpenJDK support: Pentaho now supports both Oracle JDK 8 and OpenJDK 8. This support extends to the Adaptive Execution Layer (AEL). When using AEL with Amazon EMR, you no longer need to install Oracle JDK 8 and can run on OpenJDK 8 instead.
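To give a rough idea of the kind of script the Python Executor step is meant to host, here is a minimal sketch of the "modify and derive values" pattern using a pandas DataFrame. It is not taken from Pentaho's documentation: the sample data, the column names, and the `derive_fields` function are all hypothetical, and the DataFrame is built by hand so the script runs on its own rather than receiving rows from PDI.

```python
import pandas as pd

# Hypothetical input. Inside PDI the incoming rows would be supplied by the
# step; here the DataFrame is built by hand so the sketch is self-contained.
readings = pd.DataFrame(
    {
        "sensor_id": ["a1", "a1", "b2", "b2"],
        "temp_c": [20.0, 22.0, 30.0, 34.0],
    }
)


def derive_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with derived columns added (illustrative only)."""
    out = df.copy()
    # Modify: convert Celsius to Fahrenheit.
    out["temp_f"] = out["temp_c"] * 9 / 5 + 32
    # Derive: per-sensor mean, broadcast back onto every row of that sensor.
    out["sensor_mean_c"] = out.groupby("sensor_id")["temp_c"].transform("mean")
    # Derive: flag readings that sit above their sensor's mean.
    out["above_mean"] = out["temp_c"] > out["sensor_mean_c"]
    return out


# In a real transformation, the columns of this result would be what gets
# mapped back to PDI output fields.
result = derive_fields(readings)
print(result)
```

Each column of `result` corresponds to what the article describes as a PDI output field. How variables are actually mapped between the step and the script is configured in the step itself and is not shown here.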

With Pentaho 8.2, users will be able to build data pipelines that include both structured and unstructured data sources, such as text, video, audio, images, social media, clickstreams, and log files. Hitachi expects this to help it better support banking customers, who can now address compliance requirements by correlating trading transaction data with email communications. Customers such as law enforcement agencies and medical researchers will be able to attach image and video records to their reports.


Pentaho 8.2 will be generally available on December 6, 2018.

Hitachi Pentaho
