New Treasury Proposal - Substrate ETL
Project: SubstrateETL
Proposer: D5.ai
**Payment Address**: To Be Updated
Substrate ETL lets you convert blockchain data into convenient formats like CSVs and relational databases. This makes it easy for analysts and data scientists to explore on-chain activity, and ultimately share their findings with the community.
We aim to make it as simple as possible for data scientists to get on-chain data without having to worry about how to operate a blockchain node.
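As a rough illustration of the kind of conversion described above, here is a minimal sketch that flattens one decoded block into CSV rows, one row per extrinsic. The block layout, the field names, and the helper names (`extrinsic_rows`, `blocks_to_csv`) are assumptions made for this example only; they are not the SubstrateETL schema or API.

```python
import csv
import io

# Hypothetical decoded block. The field names are illustrative assumptions,
# not the actual SubstrateETL output schema.
SAMPLE_BLOCK = {
    "number": 100,
    "hash": "0xabc",
    "extrinsics": [
        {"index": 0, "module": "Timestamp", "call": "set", "signer": None},
        {"index": 1, "module": "Balances", "call": "transfer", "signer": "alice"},
    ],
}

FIELDS = ["block_number", "block_hash", "extrinsic_index", "module", "call", "signer"]


def extrinsic_rows(block):
    """Flatten one nested block into one flat row per extrinsic."""
    for ext in block["extrinsics"]:
        yield {
            "block_number": block["number"],
            "block_hash": block["hash"],
            "extrinsic_index": ext["index"],
            "module": ext["module"],
            "call": ext["call"],
            # Unsigned extrinsics have no signer; write an empty cell.
            "signer": ext["signer"] or "",
        }


def blocks_to_csv(blocks):
    """Serialize an iterable of blocks to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for block in blocks:
        writer.writerows(extrinsic_rows(block))
    return buf.getvalue()


if __name__ == "__main__":
    print(blocks_to_csv([SAMPLE_BLOCK]))
```

Because the output is flat rows, it can be opened in a spreadsheet or loaded into a relational table without the analyst ever running a node.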
This will be an open-source project, just like all the other blockchain-etl projects we have worked on.
For the complete proposal and a more detailed description, please see the document below:
https://docs.google.com/document/d/1h8pTcoFhQsJHRIIFXWTAuhvCJ_pIBmfLdOmTCf49KOk/edit?usp=sharing
Comments (6)
At first glance, this looks to be a much-needed tool for Substrate chains. Subscan has done a great job of exposing more information about the blockchain and economics, but we could use more open-source tools for data scientists to harvest and review on-chain data and activity. Your team is clearly experienced with building ETLs. How will the ongoing publishing of data to BigQuery work? Is the data fed through an API, or will the team be doing this ad hoc?
Hey Jack, that's a great question. The ongoing publishing of data to BigQuery is completely automated. The method currently outlined in the proposal uses Cloud Composer and provides daily updates. The detailed architecture is available here: https://cloud.google.com/blog/products/data-analytics/ethereum-bigquery-how-we-built-dataset. However, we are also open to building real-time ingestion capabilities. This would take slightly more time and cost a bit more: roughly an additional $50-100/month in operational costs, plus some extra hours of work to deliver the feature. If the council is more interested in this option, we can update our costs accordingly. The detailed architecture for real-time ingestion is available here: https://github.com/blockchain-etl/blockchain-etl-architecture
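To make the daily-update mode a bit more concrete, here is a small sketch of the bookkeeping a scheduled job might do: given the last day already exported, list the fully elapsed days that still need an export run. This is not the Cloud Composer setup from the proposal; the function name `pending_days` and the one-partition-per-day layout are assumptions for illustration.

```python
from datetime import date, timedelta


def pending_days(last_exported, today):
    """Return the fully elapsed days after last_exported, oldest first.

    `today` itself is excluded because its blocks are still being produced.
    """
    days = []
    day = last_exported + timedelta(days=1)
    while day < today:
        days.append(day)
        day += timedelta(days=1)
    return days


if __name__ == "__main__":
    # Hypothetical dates: the job last ran for March 1 and wakes up on March 4.
    for day in pending_days(date(2021, 3, 1), date(2021, 3, 4)):
        print(f"export partition {day.isoformat()}")
```

A job written this way catches up automatically after a missed run, since it exports every missing day rather than only yesterday.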