Glaciers’s Substack

Glaciers: Decoding at Scale

Glaciers is a tool designed for large-scale EVM decoding. It empowers developers to quickly and flexibly process all raw blockchain data.

Glaciers
Jan 31, 2025
Cross-posted by Glaciers’s Substack
"Glaciers technical guide."
- Yule

You can read a Twitter thread version of this post here.

Most EVM libraries—such as Alloy-rs, Ethers.js, and web3.py—offer basic decoding, enabling the extraction of event data from raw logs. Furthermore, indexing engines like TheGraph, HyperIndex, and Ponder build on these libraries, providing users with functions to extract decoded data. However, all these solutions typically focus on a limited set of contracts and ABIs, often requiring devs to handle decoding tasks manually.

Glaciers provides a different approach. It doesn’t require users to preemptively select the contracts they are interested in or provide any ABIs. Instead, it automatically decodes all available raw data regardless. This allows data consumers to explore different events from different contracts and work with any protocol simultaneously while the heavy lifting takes place behind the scenes.

Glaciers is especially useful for data providers, allowing them to decode all the raw data they already possess and transform their business into a “Decoded Service”. This shift enables them to deliver enriched blockchain datasets directly to users. Once a primitive “decoded logs” table exists, consumers can build insights and metrics for any protocol just by exploring and filtering it.

In our previous example, we demonstrated Glaciers’ capabilities on small, easily replicable datasets. But this only scratched the surface of what it can do. So, what happens if we want to decode every Ethereum log? With Glaciers, large-scale decoding is not only possible but also efficient. Let's explore how Glaciers effortlessly handles such a task.

The ABI Database

A comprehensive ABI database is essential for an optimal decoding experience. Fortunately, Sourcify provides an open-source contract verification database, making it an excellent source for ABIs.

This database is extensive:

  • It contains 366,646 unique ABIs from 4,863,129 contracts across multiple chains.

  • Ethereum constitutes a significant portion, with 147,775 unique ABIs from 684,932 contracts.

  • After processing these using Glaciers’ ABI reader functions, we extracted 54 million ABI items, of which 1.747 million were unique full signatures.

  • For Ethereum, this breaks down into 857,238 unique signatures, consisting of 129,429 event and 727,809 function ABI items.

It's useful to separate ABIs into events and functions for large-scale decoding. Since logs and traces are distinct datasets, we will need to process them separately anyway. Thus, the Ethereum ABI database we will use consists of 129,429 unique events from 555,771 distinct contracts (some contracts do not emit events).
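To make the events/functions split concrete, here is a minimal sketch in plain Python. It is not Glaciers' ABI reader: the helper names (`param_type`, `full_signature`, `split_abi_items`) and the exact "full signature" string layout are assumptions made for illustration. It only relies on the standard JSON ABI fields (`type`, `name`, `inputs`, `indexed`, `components`).

```python
import json

def param_type(param):
    """Return the Solidity type, expanding tuples into their component types."""
    type_name = param["type"]
    if type_name.startswith("tuple"):
        inner = ",".join(param_type(c) for c in param.get("components", []))
        suffix = type_name[len("tuple"):]  # keeps any array suffix such as []
        return "(" + inner + ")" + suffix
    return type_name

def full_signature(item):
    """Build a readable signature that keeps `indexed` flags and names.

    The layout of this string is an assumption, not Glaciers' own format.
    """
    parts = []
    for param in item.get("inputs", []):
        piece = param_type(param)
        if item["type"] == "event" and param.get("indexed"):
            piece += " indexed"
        if param.get("name"):
            piece += " " + param["name"]
        parts.append(piece)
    return item["type"] + " " + item["name"] + "(" + ", ".join(parts) + ")"

def split_abi_items(abis):
    """Collect unique event and function signatures from many contract ABIs."""
    events, functions = set(), set()
    for abi in abis:
        for item in abi:
            if item.get("type") == "event":
                events.add(full_signature(item))
            elif item.get("type") == "function":
                functions.add(full_signature(item))
    return events, functions

# A tiny ERC-20 style fragment, used here as sample input only.
ERC20_FRAGMENT = json.loads("""
[
  {"type": "event", "name": "Transfer", "inputs": [
    {"name": "from", "type": "address", "indexed": true},
    {"name": "to", "type": "address", "indexed": true},
    {"name": "value", "type": "uint256", "indexed": false}]},
  {"type": "function", "name": "transfer", "inputs": [
    {"name": "to", "type": "address"},
    {"name": "value", "type": "uint256"}]}
]
""")

# Two contracts sharing one ABI collapse into one unique signature per item.
events, functions = split_abi_items([ERC20_FRAGMENT, ERC20_FRAGMENT])
print(events)
print(functions)
```

Deduplicating by signature rather than by contract is what lets a few hundred thousand contracts shrink to a much smaller set of unique ABI items.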

If you plan to use Glaciers for any decoding, you can either directly download and use this ABI database here, or you can run the script available in the repository and modify it for your use cases.

Raw logs

Glaciers assumes you already have your raw logs indexed. For this example, we used logs from 60,000 blocks, spanning block numbers 21,000,000 to 21,060,000, which were produced between 2024-10-19 13:45:47 and 2024-10-27 22:37:11, a little over eight days of data.

These blocks contain 25.52 million logs, stored in six Parquet files, each covering 10,000 blocks and totaling 802MB. Although this is just a subset of Ethereum logs, it effectively demonstrates that decoding all logs is not only possible but also manageable.

Decoding Performance

The decoding process took 9 minutes and 46 seconds to complete using an Intel i7, 32 GB notebook, without any further optimization.

  • 96.24% of all logs were successfully decoded, amounting to 24.18 million logs.

  • 15.92 million logs had an exact address match, meaning the event emitter contract was present in the ABI database.

  • 8.26 million logs were algorithmically matched with an ABI item from another contract.

  • The final decoded output, including all additional columns from the ABI database, amounts to 2.41GB across six Parquet files. However, some of these columns are redundant. By removing [data, topic1-3, num_indexed_args] while retaining the essential decoded columns [event_values, event_keys, and event_json], it remains possible to identify decoding mismatches, and the total file size is reduced to 1.81GB.
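The difference between an exact address match and an algorithmic match can be illustrated with a small sketch. This post does not spell out the matching rules, so the logic below is an assumption and not Glaciers' implementation: it prefers an ABI item registered for the emitting address, and otherwise falls back to any item with the same `topic0` and the same number of indexed arguments. The function name `match_log`, the dictionary layout, and the placeholder topic and address strings are all hypothetical.

```python
from collections import Counter

def match_log(log, abi_items):
    """Return (abi_item, kind), where kind is 'exact', 'algorithmic' or 'unmatched'."""
    # Indexed arguments occupy topic1..topic3, so count the ones that are present.
    n_indexed = sum(1 for key in ("topic1", "topic2", "topic3") if log.get(key) is not None)
    candidates = [
        item for item in abi_items
        if item["topic0"] == log["topic0"] and item["num_indexed_args"] == n_indexed
    ]
    for item in candidates:
        if item["address"] == log["address"]:
            return item, "exact"           # the emitter itself is in the ABI database
    if candidates:
        return candidates[0], "algorithmic"  # borrowed from another contract
    return None, "unmatched"

# Placeholder values: "0xt" stands in for a real 32-byte topic0 hash.
# ERC-20 and ERC-721 Transfer events share topic0 but differ in indexed arguments.
ABI_ITEMS = [
    {"address": "0xtoken_a", "topic0": "0xt", "num_indexed_args": 2, "name": "Transfer (ERC-20 style)"},
    {"address": "0xnft_b", "topic0": "0xt", "num_indexed_args": 3, "name": "Transfer (ERC-721 style)"},
]
LOGS = [
    {"address": "0xtoken_a", "topic0": "0xt", "topic1": "0x1", "topic2": "0x2", "topic3": None},
    {"address": "0xunknown", "topic0": "0xt", "topic1": "0x1", "topic2": "0x2", "topic3": "0x3"},
    {"address": "0xunknown", "topic0": "0xother", "topic1": None, "topic2": None, "topic3": None},
]

kinds = Counter(match_log(log, ABI_ITEMS)[1] for log in LOGS)
print(kinds)
```

Counting indexed arguments matters because events with the same name and types can share a `topic0` while laying out their data differently, which is also why a column like num_indexed_args is useful when checking for decoding mismatches.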

What Do Decoded Logs Unlock?

Data richness

Suppose we want to display transactions in a UI, such as a Transaction Explorer. With decoded logs, we can not only show the transaction details but also provide a complete view of all events that occurred during its execution, enriching the user experience with deeper insights. This allows users to track token transfers, contract interactions, and other emitted events without relying solely on unreadable raw transaction data. Additionally, by linking events to known contracts and providing human-readable function names, we make blockchain data more accessible and interpretable for both developers and end users.

Exploration

Making data exploration easier is often underrated. Do the data scientists creating the most insightful dashboards fully understand the protocol they’re analyzing? Not always. In fact, they often learn the inner workings of the protocol by exploring the data itself—something that would be impossible without decoded logs.

Let’s look at an example. Suppose I want to explore Uniswap V3. A good starting point is investigating one of its pools:

We can immediately gather valuable insights. The results show that the pool emits multiple types of events, which gives us clues about their purpose:

  • Mint – A Liquidity Provider (LP) deposits liquidity into the pool.

  • Burn – A Liquidity Provider removes liquidity from the pool.

  • Swap – A user makes a trade using this pool.

  • Collect – A user collects fees from the pool.

  • Flash – A user takes a flash loan within the pool.
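The kind of exploration described above boils down to filtering the decoded logs by one contract address and counting rows per event name. Here is a minimal sketch in plain Python, assuming each decoded row is a dictionary with hypothetical `address` and `name` keys. The addresses and rows are made-up sample data, not results from the dataset.

```python
from collections import Counter

def event_counts(decoded_logs, address):
    """Count decoded events per event name for a single contract address."""
    wanted = address.lower()  # addresses may differ in checksum casing
    return Counter(
        row["name"] for row in decoded_logs if row["address"].lower() == wanted
    )

# Made-up sample rows; "0xPool" is a placeholder, not a real pool address.
SAMPLE_LOGS = [
    {"address": "0xPool", "name": "Swap"},
    {"address": "0xpool", "name": "Swap"},
    {"address": "0xpool", "name": "Mint"},
    {"address": "0xpool", "name": "Burn"},
    {"address": "0xother", "name": "Transfer"},
]

counts = event_counts(SAMPLE_LOGS, "0xPOOL")
for name, n in counts.most_common():
    print(name, n)
```

Because every log is already decoded, the same function works unchanged for any other contract address in the table.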

Looking at the event parameters, we see multiple amount values, likely the token amounts involved in each transaction. To confirm this, let’s investigate a specific transaction from the results above, for example, one that emitted a Burn event:

From this, we notice that the token amounts in the pool’s events align with Transfer events from other contracts. This tells us that token balances are changing hands:

  • The from (or src) field in both token transfers is the pool contract.

  • The to (or dst) field is the Liquidity Provider (LP) withdrawing their funds.

This all makes sense—when an LP removes liquidity from the pool, it should receive the corresponding tokens.

At no point did we need to manually provide ABIs for these tokens—the decoded data was immediately available for exploration, significantly improving the developer experience.
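The cross-check between a pool's Burn event and the token Transfer events in the same transaction can be sketched as follows. This is an illustration under assumptions: each decoded log is a dictionary with hypothetical `address`, `name` and `args` keys, the amounts are expected to match exactly, and the parameter name variants (`from`/`src`, `to`/`dst`, `value`/`wad`) cover only the tokens in this toy sample.

```python
def first_present(args, names):
    """Return the first argument found under any of the given names."""
    for name in names:
        if name in args:
            return args[name]
    return None

def transfers_matching_burn(tx_logs, pool):
    """Find Transfer events sent by the pool whose value equals a Burn amount."""
    burns = [l for l in tx_logs if l["address"] == pool and l["name"] == "Burn"]
    transfers = [l for l in tx_logs if l["name"] == "Transfer"]
    matches = []
    for burn in burns:
        amounts = {burn["args"]["amount0"], burn["args"]["amount1"]}
        for transfer in transfers:
            args = transfer["args"]
            sender = first_present(args, ("from", "src"))
            recipient = first_present(args, ("to", "dst"))
            value = first_present(args, ("value", "wad"))
            if sender == pool and value in amounts:
                matches.append((transfer["address"], recipient, value))
    return matches

# Made-up logs for one transaction; all addresses are placeholders.
POOL, LP = "0xpool", "0xlp"
TX_LOGS = [
    {"address": POOL, "name": "Burn",
     "args": {"owner": LP, "amount": 500, "amount0": 1000, "amount1": 2000}},
    {"address": "0xtoken0", "name": "Transfer",
     "args": {"from": POOL, "to": LP, "value": 1000}},
    {"address": "0xtoken1", "name": "Transfer",
     "args": {"src": POOL, "dst": LP, "wad": 2000}},
    {"address": "0xtoken0", "name": "Transfer",
     "args": {"from": "0xsomeone", "to": LP, "value": 1000}},
]

matches = transfers_matching_burn(TX_LOGS, POOL)
print(matches)
```

The last Transfer is ignored because it was not sent by the pool, even though its value happens to equal one of the Burn amounts.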

Analytics

After identifying and exploring the most relevant contracts, filtering them becomes straightforward, enabling the efficient generation of meaningful metrics and insights. By analyzing decoded events, we can track key protocol activity, such as liquidity movements, trading volume, user participation, and fee distributions. This allows for deeper analytics, such as measuring pool volumes:
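As a rough sketch of a volume metric, the snippet below sums the absolute `amount0` of Swap events per pool. It assumes that each decoded row carries an `event_json` string holding a JSON array of objects with `name` and `value` fields, which is a guess at the column's layout. The `address` and `name` keys and all sample rows are hypothetical too. Amounts stay in raw token units, with no decimals or price conversion applied.

```python
import json
from collections import defaultdict

def swap_volume_token0(decoded_logs):
    """Sum |amount0| of Swap events per pool address, in raw token0 units."""
    volume = defaultdict(int)
    for row in decoded_logs:
        if row["name"] != "Swap":
            continue
        args = {arg["name"]: arg["value"] for arg in json.loads(row["event_json"])}
        if "amount0" not in args:
            continue  # a Swap event from a protocol with different parameters
        # amount0 is signed: its sign gives the direction of the token flow.
        volume[row["address"]] += abs(int(args["amount0"]))
    return dict(volume)

def swap_row(address, amount0, amount1):
    """Build a made-up decoded Swap row for the example."""
    event_json = json.dumps([
        {"name": "amount0", "value": str(amount0)},
        {"name": "amount1", "value": str(amount1)},
    ])
    return {"address": address, "name": "Swap", "event_json": event_json}

SAMPLE_LOGS = [
    swap_row("0xpool_a", -1000, 3),
    swap_row("0xpool_a", 2500, -7),
    swap_row("0xpool_b", 40, -1),
    {"address": "0xpool_a", "name": "Mint", "event_json": "[]"},
    {"address": "0xpool_c", "name": "Swap",
     "event_json": json.dumps([{"name": "amount0In", "value": "5"}])},
]

volumes = swap_volume_token0(SAMPLE_LOGS)
print(volumes)
```

Values are parsed from strings because on-chain integers can exceed 64 bits, and Python's `int` handles them without loss.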

Conclusion

Glaciers revolutionizes blockchain data decoding by automating the process and providing deep insights across protocols without requiring pre-selected contracts or ABIs. Whether you're a data provider looking to enhance your offerings or a developer exploring vast blockchain datasets, Glaciers enables efficient, scalable, and seamless decoding that unlocks new possibilities for data analysis and exploration. Ready to dive in? Let Glaciers power your next blockchain project.

https://github.com/yulesa/glaciers
