Delta Sharing [Premium]

The Delta Sharing Integration is a Premium Integration that gives our partners access to their complete realm-level data through a Delta Sharing endpoint. Unlike many of our other no-code integrations, which are set up within individual user accounts, this premium integration gives you access to your realm dataset so that you can craft your own queries and set up and run your own reports and visualizations.

About Delta Sharing, Delta Lake, Delta​

  • Delta Sharing - an open protocol for secure data sharing using Delta Lake without any dependency on a particular computing platform.

  • Delta Lake - an open-source storage framework that enables building a lakehouse architecture on top of the Delta file format and cloud storage.

  • Delta - an open source data format based on Apache Parquet.

What software connects to Delta Sharing​

Delta Sharing doesn't rely on any particular computing platform. The shared data can be accessed by any client that supports the open Delta Sharing protocol, e.g. Pandas, Power BI, Apache Spark, or Rust. In short, any software with a built-in Delta Sharing connector will work. You are welcome to choose any software that supports Delta Sharing to connect to the data - more information can be found here.

💡 Important Note: myDevices doesn't develop the connectors; we're providing access to the data only. From time to time we may be able to offer examples to help you connect to the data (such as the Power BI Example). However, we cannot provide developer assistance with using the data afterwards: usage, crafting queries, creating reports, and similar work within your chosen tools is not something we can help with.

What is Delta Sharing reference server​

Note that you don't have to configure any servers to get access to the data. myDevices will provide you with the credentials file once the access request is approved. The reference server is the open-source counterpart of a managed solution for the Delta Sharing protocol. Its key role is to handle requests, authorization, and security. This part is fully controlled by myDevices and doesn't require any additional steps on your side.

How to get access​

myDevices generates a secure Delta Sharing endpoint and sends you a link to the credentials file. The credentials file, which contains the token, can be downloaded only once. If you need it again, you have to request a new token, and myDevices will automatically disable the previous one. When entering the credentials in your tool, use the values from the JSON file without the surrounding double quotes.
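
As a rough illustration of reading values out of the credentials file, the sketch below parses it with Python's standard library. It assumes the file follows the standard Delta Sharing profile format, whose fields are `shareCredentialsVersion`, `endpoint`, and `bearerToken`. The file name `config.share` and the helper names `load_profile` and `connector_values` are hypothetical, not something myDevices provides.

```python
import json
from pathlib import Path

# Fields defined by the open Delta Sharing profile file format.
REQUIRED_KEYS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def load_profile(path):
    """Read a Delta Sharing credentials (profile) file and check its required fields."""
    profile = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in profile]
    if missing:
        raise ValueError(f"credentials file is missing: {', '.join(missing)}")
    return profile


def connector_values(profile):
    """Return the plain strings to paste into a connector form.

    json.loads already strips the JSON double quotes, so these are the
    raw values. The dictionary keys here are hypothetical labels.
    """
    return {
        "server_url": profile["endpoint"],
        "bearer_token": profile["bearerToken"],
    }
```

For example, `connector_values(load_profile("config.share"))` would return the endpoint URL and token as unquoted strings, assuming the file was saved under that hypothetical name. Treat the file like a password: anyone holding the token can read the shared data.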

What to do if you lost the credentials file or it was exposed​

The Delta Sharing protocol has a rich API. You can set up an extraction pipeline that pulls updates on a schedule of your choice, treating the Delta Sharing endpoint like any regular API endpoint. This approach helps optimize performance, cost, and storage on your end. For example, you can request only the last 30 days using date fields, or only a specific location.
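
To make the "last 30 days" idea concrete, here is a minimal sketch that builds (but does not send) a table query request as described in the open Delta Sharing protocol: a `POST` to `/shares/{share}/schemas/{schema}/tables/{table}/query` with a `predicateHints` list in the JSON body. The share, schema, table, and date column names are hypothetical placeholders, since the actual names in your realm dataset are not given here. Note also that the protocol treats predicate hints as best effort, so the server may still return extra data and your pipeline should filter again after loading.

```python
import json
import urllib.request
from datetime import date, timedelta


def build_query_request(profile, share, schema, table, date_column, days=30, today=None):
    """Build a Delta Sharing table query limited to the last `days` days.

    `profile` is the parsed credentials file (endpoint + bearerToken).
    `date_column` is a hypothetical column name; use one from your own table.
    """
    today = today or date.today()
    since = today - timedelta(days=days)
    url = (
        f"{profile['endpoint'].rstrip('/')}"
        f"/shares/{share}/schemas/{schema}/tables/{table}/query"
    )
    # Best-effort filter: the server may ignore it or apply it only partially.
    body = {"predicateHints": [f"{date_column} >= '{since.isoformat()}'"]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {profile['bearerToken']}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(request)` returns newline-delimited JSON that lists the data files to download, rather than the rows themselves. In practice a ready-made connector handles that step for you.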

💡 Keep in mind that myDevices doesn't provide any support for the connectors. The Delta Sharing protocol is open source. You can find more details on its website, GitHub page, and Slack channel.

Can I create complex queries in the form of SQL requests to Delta Sharing​

Note that myDevices doesn't provide compute services. Delta Sharing protocol will only point you to the data stored in the cloud.

💡 myDevices provides access to your complete realm dataset, which can potentially be a very large amount of data. Pay careful attention to how you import specific data so that your queries stay manageable.

There are a few SQL-like things you can do to limit the size of the returned data, e.g. select particular columns or rows, but Delta Sharing is not an actual database. You have to import the data and use your own compute resources to manipulate and analyze it.
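
One way to keep an import small is sketched below, assuming you use the open-source `delta-sharing` Python connector (installed separately, and not supported by myDevices). Its `load_as_pandas` function accepts a `limit` argument for exploring a table without loading all of it. The share, schema, table, and column names you pass in are your own, and the helper names `table_url` and `load_preview` are hypothetical. Column selection in this sketch happens on your machine, after the rows are downloaded.

```python
def table_url(profile_path, share, schema, table):
    """Build the '<profile-file>#<share>.<schema>.<table>' address used by the connector."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_preview(profile_path, share, schema, table, columns, limit=1000):
    """Load at most `limit` rows into pandas and keep only the requested columns."""
    # Third-party package, imported lazily: pip install delta-sharing
    import delta_sharing

    frame = delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table),
        limit=limit,  # cap the number of rows loaded
    )
    # Client-side column selection; all computation runs on your own resources.
    return frame[list(columns)]
```

Starting with a small `limit` is a cheap way to inspect column names and data types before deciding what a full import should include.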

Delta Sharing Protocol

Can I access data in BI tools​

Yes, but only in tools that have a built-in Delta Sharing connector. For example, Microsoft provides a Delta Sharing connector for Power BI, but Tableau doesn't support it out of the box yet. Use the provided credentials to access the data through Delta Sharing. Some tools have limitations of their own, such as a maximum number of rows or a cap on allocated memory.

See also: Power BI Example