Creating value out of data (without replicating it)
Connect to any type of data
Connect to any data in its original format and in real time. RAW code-generates a source-specific adaptor for each source, optimized for high-performance data reading and retrieval. RAW detects the format, schema and encoding automatically, making this transparent to the user. RAW supports both simple and complex data, including CSV, Excel, Word, Amazon S3, machine logs, streams, JSON, HJSON, XML, Avro, HDFS, HDF5, Parquet, NoSQL and relational databases, columnar databases, multi-dimensional arrays, APIs and more.
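To give a feel for what automatic format detection involves, here is a minimal sketch in Python using only the standard library. It is not RAW's implementation, which is not described here; `detect_format` is a hypothetical helper, and the candidate delimiters are an assumption made for the example.

```python
import csv
import json


def detect_format(sample: str) -> str:
    """Guess whether a text sample is JSON, XML or delimited text (hypothetical helper)."""
    stripped = sample.lstrip()
    # JSON documents start with an object or an array and must parse cleanly.
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    # A very rough XML check: the first non-blank character opens a tag.
    if stripped.startswith("<"):
        return "xml"
    # csv.Sniffer infers the delimiter from the sample and raises csv.Error on failure.
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
        return f"csv(delimiter={dialect.delimiter!r})"
    except csv.Error:
        return "unknown"
```

A production detector would also have to infer the schema and the character encoding, which this sketch leaves out.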
Transform it into workable data sets
RAW's extended version of SQL supports all functions necessary to create workable data sets. Join, clean and transform data in different file formats, all at the same time. Apply data cleansing algorithms to correct your data. No need to create tables or perform heavy ETL processes. RAW supports very complex queries, such as arbitrarily nested queries, allowing you to mix normally incompatible data.
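The idea of joining a flat file with nested data can be illustrated in plain Python. This is only a sketch of the concept, not RAW's SQL dialect; the sample data, the field names and `total_quantity_per_customer` are all made up for the example.

```python
import csv
import io
import json

# Hypothetical sample data: a flat CSV of customers and a nested JSON list of orders.
CUSTOMERS_CSV = "id,name\n1, Ann \n2,Bob\n"
ORDERS_JSON = (
    '[{"customer_id": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},'
    ' {"customer_id": 2, "items": [{"sku": "A", "qty": 5}]}]'
)


def total_quantity_per_customer(customers_csv: str, orders_json: str) -> dict[str, int]:
    # Clean while reading: strip stray whitespace and convert ids to integers.
    names = {
        int(row["id"]): row["name"].strip()
        for row in csv.DictReader(io.StringIO(customers_csv))
    }
    totals = {name: 0 for name in names.values()}
    # Flatten the nested "items" lists and join them to customers on the id.
    for order in json.loads(orders_json):
        name = names.get(order["customer_id"])
        if name is None:
            continue  # skip orders with no matching customer
        totals[name] += sum(item["qty"] for item in order["items"])
    return totals
```

In a query language with nested-data support, the same join, clean-up and aggregation would be expressed as a single declarative query rather than as hand-written loops.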
Enhance your datasets
Use your favourite data science tools to turn your datasets into Smart Datasets. As a data scientist you can quickly bring together data, or use the queries created by your colleagues to run your experiments. Add new predictive data points directly from, for example, Python notebooks using scikit-learn. Save the enhanced dataset as a Smart Dataset and make it automatically available in the RAW Virtual Lake. If your model works well, save it and share it with your colleagues. You can even embed your model directly in a RAW query and run it on the Edge.
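Adding a predictive data point to a dataset might look roughly like the sketch below. To keep it free of dependencies, a hand-written least-squares fit stands in for a scikit-learn model; the readings, the column names and both helper functions are hypothetical, and nothing here uses a RAW API.

```python
from statistics import mean

# Hypothetical dataset: machine readings with a known wear value.
history = [
    {"hours": 100, "wear": 1.0},
    {"hours": 200, "wear": 2.0},
    {"hours": 300, "wear": 3.0},
]


def fit_line(rows, x_key, y_key):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    xs = [r[x_key] for r in rows]
    ys = [r[y_key] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def enhance(rows, slope, intercept):
    """Return copies of the rows with a new 'predicted_wear' column."""
    return [{**r, "predicted_wear": slope * r["hours"] + intercept} for r in rows]


slope, intercept = fit_line(history, "hours", "wear")
enhanced = enhance([{"hours": 400}], slope, intercept)
```

With scikit-learn, `fit_line` would typically be replaced by fitting an estimator, and `enhance` by calling its `predict` method on the new rows.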
Virtualize and share the data
Datasets are automatically virtualized in RAW, either in near real time or through high-performance caches. RAW's caching engine takes into consideration the structure and format of the datasets and optimizes the caching accordingly. A simple dataset based on CSV files will be cached differently from a dataset based on multidimensional arrays. RAW updates its caches as datasets grow, without having to refresh the entire cache, saving time and keeping response times low. RAW allows users to name and describe their datasets, providing a simple yet effective way to build a repository of datasets that can then be shared across the organisation with the appropriate user access rights.
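Updating a cache as a dataset grows, without rebuilding it, can be sketched as follows. This is a toy illustration rather than RAW's caching engine: `IncrementalCache` is a hypothetical class, and it assumes an append-only source whose earlier lines never change.

```python
class IncrementalCache:
    """Caches parsed rows and only parses rows appended since the last refresh."""

    def __init__(self, parse):
        self._parse = parse      # function turning one raw line into a record
        self._rows = []          # cached, already-parsed records
        self.parsed_count = 0    # how many lines have been parsed in total

    def refresh(self, source_lines):
        # Assumes the source is append-only: earlier lines never change.
        new_lines = source_lines[len(self._rows):]
        for line in new_lines:
            self._rows.append(self._parse(line))
            self.parsed_count += 1
        return list(self._rows)
```

For example, after refreshing on two lines and then on the same source with a third line appended, only that third line is parsed on the second call, so `parsed_count` ends at 3 rather than 5.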
Deploy and create business value
Any dataset in RAW can be exposed in multiple formats: as REST APIs, CSV files, Excel, SQLite, Python and many other formats. Take your datasets and deploy them as a service to increase the value of your enterprise applications, such as BI tools, supply chain systems or marketing automation. Or use the Smart Datasets to create new data-driven applications, e.g. predictive maintenance, fraud detection or diabetes prediction.
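As a rough illustration of serving one dataset in several formats, the sketch below exports the same rows as CSV text, as a JSON payload of the kind a REST endpoint could return, and as an in-memory SQLite table. It uses only the Python standard library; the sample rows, the table layout and the three function names are assumptions for the example, not RAW's export interface.

```python
import csv
import io
import json
import sqlite3

# Hypothetical dataset with a fixed two-column layout.
dataset = [
    {"id": 1, "city": "Geneva"},
    {"id": 2, "city": "Lausanne"},
]


def to_csv(rows) -> str:
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows) -> str:
    # The kind of payload a REST endpoint could return for this dataset.
    return json.dumps(rows)


def to_sqlite(rows, table="dataset") -> sqlite3.Connection:
    # The column layout is hard-coded to match the sample rows above.
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (id INTEGER, city TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (:id, :city)", rows)
    conn.commit()
    return conn
```

Each function reads the same in-memory rows, so adding another output format means adding one more small function rather than duplicating the data.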
RAW Labs helped the Swiss Broadcasting Corporation provide a single virtual view of all key programmatic content stored in regional databases and systems without duplicating the data. SRG SSR was able to integrate, clean and transform data from nested XML, JSON and web services and deliver this as a service to an enterprise search engine in just 3 weeks.
RAW accesses data from most commonly used sources and file formats at source and in real time. No ETL needed.
Big & Complex Data
RAW queries relational tables, JSON, XML, array data, Word documents, Excel spreadsheets and machine logs, all seamlessly with a single SQL language.
Easy and Fast
RAW tunes itself based on your queries and your data, all autonomously. No preparation time. No need to manually create views, indexes or keep DBAs on call.