Apache SystemML Roadmap

Planned for Future SystemML 1.0

  • Compression (additional operations)
  • Experimental Features:
    • Deep Learning
    • Single GPU support
    • Native BLAS support
    • Code Generation
  • Rigorous Performance and Scalability Testing (Bug Fixes)
  • Remove Deprecated APIs/Functions

Feature Candidates for Future Releases

  • Completion of Prior Experimental Features
  • Python DSL
  • New Algorithms -- e.g. Decomposition Algorithms
  • Common DSL Architecture
  • R Interfaces: R DSL and R Wrappers
  • Native Zeppelin Notebook Support
  • Sum Product Optimizations
  • Tree-based Data Structures
  • Global Dataflow Optimizations
  • Tooling

Current Release

  • SystemML 0.14.0-incubating (released in May, 2017) details
    • Runtime feature extensions (new libsvm-binary data converters, parfor spark buffer pool handling, parfor block partitioning of fixed size batches of rows or columns, native dataset support in parfor spark datapartition-execute)
    • Compiler feature extensions (improved parfor execution type selection, improved literal replacement for nrow/ncol, simplified instruction generation across back-ends, consolidated static/dynamic rewrite utilities)
  • Experimental Features
    • New Code Generation capabilities for automatic operator fusion (basic code generator, compiler integration, runtime integration, in-memory source code compilation, extended explain tool, support for right indexing and replace in cellwise and row aggregate templates, support for row, column, or no aggregation in rowwise template). Code generation provides significant performance gains through fewer read/write intermediates, fewer scans of inputs and intermediates, and better sparsity exploitation. To enable this feature, set the codegen.enabled property to true in the SystemML-config.xml file.
    • New instructions and operators for GPU support (relu_maxpooling, conv2d_bias_add, bias_multiply)
  • Removals
    • Removed support for Java 6 and Java 7
    • Removed parfor perftesttool and cost estimator
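
The codegen.enabled setting described under Experimental Features above can be illustrated with a small sketch. It assumes the properties in SystemML-config.xml are child elements of a single <root> element; the one-property config string and the helper name enable_codegen are hypothetical, and a real configuration file contains many more properties.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal config; the <root> wrapper is an assumption about the file layout.
CONFIG = "<root><codegen.enabled>false</codegen.enabled></root>"

def enable_codegen(xml_text):
    """Return the config text with codegen.enabled set to true (added if absent)."""
    root = ET.fromstring(xml_text)
    # Look for the property among the direct children of the root element.
    node = next((child for child in root if child.tag == "codegen.enabled"), None)
    if node is None:
        node = ET.SubElement(root, "codegen.enabled")
    node.text = "true"
    return ET.tostring(root, encoding="unicode")

updated = enable_codegen(CONFIG)
print(updated)
```

Editing the file by hand works just as well; the sketch only shows which element carries the setting and what value it should hold.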

Prior Releases

  • SystemML 0.13.0-incubating (released in March, 2017) details
    • Updated build for Spark 2.1.0
    • New simplification rewrites for stratstats
    • New fused operator tak+* in CP and Spark
    • New dmlFromResource capability in Python (equivalent to Scala)
    • Add input float support to MLContext
  • Documentation Enhancements
    • Deploy versioned documentation to main project website
    • Add Python MLContext example to engine dev guide
    • Add MLContext info functionality to docs
    • Update DML Language Reference for write description parameter
  • Deprecations, Removals
    • Deprecate old MLContext API
    • Deprecate parfor perftesttool
    • Deprecate SQLContext methods
    • Replace deprecated Accumulator with AccumulatorV2
    • Replace append with cbind for matrices
    • Migrate Vector and LabeledPoint classes from mllib to ml
  • Experimental Features / Algorithms
    • Compressed Linear Algebra v2 (new DDC encoding format, hardened sample-based estimators, debugging tools, new column grouping algorithm, additional operations)
  • SystemML 0.12.0-incubating (released in February, 2017) details
    • Support pip install of new python package
    • Allow NumPy arrays, Pandas DataFrames, and SciPy matrices as input to MLContext
    • Improve SystemML Python DSL for NumPy
    • Updated build for Spark 1.6.0
    • DML utility script to shuffle input dataset
  • Experimental Features / Algorithms
    • GPU Enhancements
  • SystemML 0.11.0-incubating (released in November, 2016) details
    • SystemML frames
    • New MLContext API
    • Transform functions based on SystemML frames
  • Experimental Features / Algorithms
    • New built-in functions for deep learning (convolution and pooling)
    • Deep learning library (DML-bodied functions)
    • Python DSL Integration
    • GPU Support
    • Compressed Linear Algebra
  • New Algorithms
    • Lasso
    • kNN
    • Lanczos
    • PPCA
  • Deep Learning Algorithms
    • CNN (LeNet)
    • RBM
  • SystemML 0.10.0-incubating (released in June, 2016) details
    • Different types of Spark Matrix Blocks: MCSR, CSR, COO
    • SystemML Frame support in JMLC/CP
    • Initial Deep Learning support
    • API/Scripts: parser error handling, SystemML configuration handling, include algorithms in SystemML jar, print matrix
    • New fused operator: wdivmm with variations
    • Performance Features: cache-conscious operations, more multithreaded operations, new simplification rewrites
    • New Algorithms: kNN
    • Documentation: javadocs, Jupyter/Zeppelin notebook examples
  • SystemML 0.9.0-incubating (released in January, 2016) details
    • Improvements to MLContext and MLPipeline wrappers
    • New converter utilities for RDDs and DataFrames
    • New Optimizations for Spark Backend, e.g. eager RDD caching and repartitioning, RDD checkpointing, on-demand creation of SparkContext
    • New Runtime Operators for mmult, multithreaded readers and operators
    • New Algorithms: ALS, Cubic Splines
    • Online Documentation