• Liberate Your Data

    RAW Data Fusion is a data management platform for data engineers, data scientists and data analysts to seamlessly query heterogeneous data sources in real time and transform them into new datasets for consumption by analytics tools, ML models and enterprise applications.

    Available on-premise and in the cloud
    Request a demo
  • RAW Labs joins the startup program of 
    HPE Switzerland to help companies accelerate digital transformation.

    Read More
  • No Extract-Transform-Load
    No Data Loading
    No Schema Creation
    No Index Creation
    No Database Tuning
    No Flattening of XMLs or JSONs
    No Weird SQL for Nested Data

  • RAW Labs CEO receives prestigious award - read more

Harnessing the data value chain

Connect and discover any type of data in real-time

Connect and query any data in its original format and in real time, without any preparation. RAW generates a source-specific adaptor for each data source, optimized for high-performance data retrieval. RAW automatically detects the format, schema and encoding, making them transparent to the user. RAW supports both simple and complex data, including CSV, Excel, Word, Amazon S3, machine logs, streams, JSON, HJSON, XML, Avro, HDFS, HDF5, Parquet, NoSQL and relational databases, columnar databases, multi-dimensional arrays, APIs and more.
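RAW performs this detection internally across many formats; as a minimal illustration of the idea (not RAW's actual implementation), the following stdlib Python sketch sniffs whether a text payload is JSON or delimited text and infers a rough schema:

```python
import csv
import io
import json

def detect_format(raw_text: str):
    """Guess the format of a text payload and infer a rough schema.

    Illustrative only -- real detection engines cover many more
    formats (XML, Avro, Parquet, ...) and richer type inference.
    """
    # Try JSON first: a well-formed JSON document parses cleanly.
    try:
        doc = json.loads(raw_text)
        schema = sorted(doc[0].keys()) if isinstance(doc, list) and doc else None
        return "json", schema
    except json.JSONDecodeError:
        pass
    # Fall back to delimited text: sniff the dialect, read the header row.
    dialect = csv.Sniffer().sniff(raw_text)
    header = next(csv.reader(io.StringIO(raw_text), dialect))
    return "csv", header
```

The same probe-and-fall-back strategy extends naturally to binary formats by checking magic bytes before attempting a parse.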

Transform data sources into value adding data sets

RAW's extended version of SQL supports all the functions needed to create workable datasets. Join, clean and transform data across different file formats, all in a single query. Apply data-cleansing algorithms to correct and improve the quality of your data, with no need to create tables or run heavy ETL processes. RAW supports very complex queries, such as arbitrarily nested queries, letting you combine otherwise incompatible data.
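In RAW this kind of cross-format join is expressed declaratively in its SQL dialect; as a hedged stand-in (the sample data and function are hypothetical), here is the equivalent join of a CSV source with a JSON source written in plain Python:

```python
import csv
import io
import json

# Hypothetical sample sources: customers as CSV, orders as JSON.
CUSTOMERS_CSV = "id,name\n1,Ada\n2,Grace\n"
ORDERS_JSON = ('[{"customer_id": 1, "total": 120.0},'
               ' {"customer_id": 1, "total": 30.0},'
               ' {"customer_id": 2, "total": 75.5}]')

def join_csv_json(customers_csv: str, orders_json: str):
    """Join a CSV source with a JSON source on customer id and
    aggregate order totals -- the kind of cross-format join RAW
    expresses in a single SQL-like query."""
    customers = {int(r["id"]): r["name"]
                 for r in csv.DictReader(io.StringIO(customers_csv))}
    totals = {}
    for order in json.loads(orders_json):
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["total"]
    return [{"name": customers[cid], "total": t}
            for cid, t in sorted(totals.items())]
```

A declarative engine would additionally push filters down to each source and parallelize the scans; the sketch only shows the logical result of the join.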

Enhance your datasets with ML & AI

Use your favourite data science tools to turn your datasets into Smart Datasets. As a data scientist you can quickly bring data together, or reuse queries created by your colleagues to run your experiments. Add new predictive data points directly from, for example, Python notebooks using scikit-learn. Save the enhanced data as a Smart Dataset and it becomes automatically available in the RAW Virtual Lake. If your model works well, save it and share it with your colleagues. You can even embed your model directly in a RAW query and run it at the edge.

Virtualize and share the data

Datasets are automatically virtualized in RAW, either in near real time or through high-performance caches. RAW's caching engine takes the structure and format of each dataset into account and optimizes caching accordingly: a simple dataset based on CSV files is cached differently from one based on multi-dimensional arrays. RAW updates its caches incrementally as datasets grow, without refreshing the entire cache, saving time and maintaining fast response times. Users can name and describe their datasets, providing a simple yet effective way to build a repository of datasets that can be shared across the organisation with the appropriate access rights.

Deploy and create business value

Any dataset in RAW can be exposed in multiple formats: as REST APIs, CSV files, Excel, SQLite, Python and many others. Deploy your datasets as services to increase the value of your enterprise applications such as BI tools, supply chain systems or marketing automation. Or use Smart Datasets to build new data-driven applications for your end users, such as predictive maintenance, fraud detection or diabetes prediction.
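To make the multi-format idea concrete, here is a hedged stdlib sketch (function and table names are hypothetical, not RAW's API) that serializes one in-memory dataset into three of the consumption formats mentioned above: CSV text, JSON text, and a SQLite table:

```python
import csv
import io
import json
import sqlite3

def export_dataset(rows, table="dataset"):
    """Serialize one dataset into several consumption formats --
    CSV text, JSON text and an in-memory SQLite table -- mirroring
    how a single dataset can be exposed to different clients."""
    cols = list(rows[0].keys())
    # CSV: write a header row plus one line per record.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=cols)
    writer.writeheader()
    writer.writerows(rows)
    # SQLite: create a table and bulk-insert the rows.
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(cols)})")
    conn.executemany(
        f"INSERT INTO {table} VALUES ({', '.join('?' for _ in cols)})",
        [tuple(r[c] for c in cols) for r in rows],
    )
    return buf.getvalue(), json.dumps(rows), conn
```

A REST deployment would simply pick the right serialization per request based on the client's `Accept` header.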
Request a Demo

Access all enterprise data

  • The power of one unified query language
  • Logically integrate all data sources
  • Enhance with ML & AI
  • Break down data silos
  • Enable data-driven applications



  • Plug & play: up and running in hours
  • Feels like SQL and easy to learn
  • No data preparation time, no duplication
  • Add new data sources on the fly
  • Automatic schema & format detection
  • No performance tuning required


Reduced TCO

  • Reduced operations costs
  • Avoid costly data duplication
  • No new data lake / data warehouse costs
  • No new ETL license and other tool cost
  • Reduced development costs
  • Reduced maintenance costs