Today’s buyers of data infrastructure and analytics are assembling their technology stacks from technologies and services in three broad market categories:
Category 1: Data ETL. These providers move data from source systems (e.g. OLTP databases) to target systems (e.g. EDWs). Informatica and Talend primarily belong in this group.
Category 2: BI on prepped data. These are older providers that don’t move data themselves but benefit from data that has already been moved and prepped. They read directly from target systems such as EDWs. Historically, Tableau has been in this group, although the company is currently building an entire data stack (raw data access, prep, governance, discovery, analytics, collaboration).
Category 3: BI on raw data. These are newer providers that federate, optimize and push data queries down to source systems, where the queries run on raw data. Query results are often cached for performance, but the data primarily remains in the source system in its raw form. Dremio and AtScale are in this group.
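The "BI on raw data" pattern can be sketched in a few lines. This is a hypothetical illustration, not any vendor's actual implementation: the BI layer pushes a query down to the source system and caches the result, so the data itself never leaves the source in prepped form. `run_on_source` stands in for a real federation engine.

```python
from functools import lru_cache

# Stand-in for raw data living in an OLTP source system.
SOURCE_TABLE = [
    {"region": "EMEA", "amount": 120},
    {"region": "AMER", "amount": 300},
    {"region": "EMEA", "amount": 80},
]

def run_on_source(region: str) -> int:
    # In a real system this would be a query executed by the
    # source database itself (query pushdown), not in the BI layer.
    return sum(r["amount"] for r in SOURCE_TABLE if r["region"] == region)

@lru_cache(maxsize=128)  # cache query results for performance
def total_sales(region: str) -> int:
    return run_on_source(region)

print(total_sales("EMEA"))  # first call hits the source
print(total_sales("EMEA"))  # repeat call is served from the cache
```

The design choice mirrored here is that caching is an optimization layered on top of the source, not a copy of the data: invalidating the cache (`total_sales.cache_clear()`) falls back to the raw system of record.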
Over time, we can expect a fourth category of “Data Clouds” to emerge. These providers will seek to become a one-stop shop for data prep and analytics; Tableau and Informatica are already moving aggressively in that direction.
Rather than fitting squarely into any of these categories, Lore IO can be thought of as a hybrid that encompasses the best of each group. Lore IO moves raw data from source systems, but keeps the data in its raw form in a data lake landing area. Lore IO then provides Collaborative Transformations that enable teams to manage and use their data by jointly operating on its metadata (or data definitions).
At query time, Lore IO converts the metadata to optimized queries that run on the raw data. Lastly, Lore IO exports the prepped data for consumption by EDWs or other target systems.
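The metadata-driven approach described above can be sketched as follows. This is a hypothetical illustration in the spirit of Lore IO’s design, not its actual API: teams maintain shared data *definitions* as metadata, and at query time an engine expands those definitions into a query that runs against the raw data.

```python
# Shared, collaboratively maintained data definitions (metadata).
# Column names and expressions are illustrative only.
definitions = {
    "full_name": "concat(first_name, ' ', last_name)",
    "ltv_usd": "lifetime_value_cents / 100.0",
}

def compile_query(columns, source_table):
    # Expand each requested column from its shared definition,
    # falling back to the raw column name when no definition exists.
    exprs = [f"{definitions.get(c, c)} AS {c}" for c in columns]
    return f"SELECT {', '.join(exprs)} FROM {source_table}"

sql = compile_query(["customer_id", "full_name", "ltv_usd"], "raw.customers")
print(sql)
```

The key property this sketch captures is that the transformation logic lives in one editable metadata layer rather than in pipelines that physically rewrite the data; changing a definition changes every query generated from it.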
From this perspective, Lore IO can be integrated into a data stack that includes components from categories 1 and 2. Lore IO provides differentiated value by enabling teams to transform their data faster and at lower cost than they could using the other components of their stack.