Query Bitcoin Blockchain for Active Addresses in BigQuery

Vivien Chua
Dev Genius
Published in
4 min readDec 5, 2022

--

In this article, we will show how to find the active addresses count on the Bitcoin blockchain by running the query below in Google BigQuery. Active addresses refer to the number of unique addresses that have sent or received at least one transaction within a specified time period. Only addresses that were active in successful transactions are counted.

Active addresses are often used as a proxy for measuring the level of adoption and usage of the Bitcoin network. When the number of active addresses increases, it can indicate growing interest and activity within the Bitcoin ecosystem. It is important to note that a single user can have multiple addresses, and not all addresses that are active may represent individual users.

Before we begin, you may learn how to setup a project in Getting Started with BigQuery, and how to run a simple query on the bitcoin public dataset in Explore Cryptocurrency Public Datasets with BigQuery. If you prefer not to run the query, you may view the results on the Bitcoin Dashboard by Trigo Labs.

Bitcoin active addresses
Bitcoin Active Addresses

Step 1

Go to the Editor field on BigQuery.

Step 2

In the Editor field, copy the query below.

This query creates a Common Table Expression (CTE) active_addresses by extracting the date and addresses from two tables, namely bigquery-public-data.crypto_bitcoin.inputs and bigquery-public-data.crypto_bitcoin.outputs .

The two result sets are combined into a single result set using the UNION ALL clause. This creates a list of all unique addresses that have been used in transactions, regardless of whether the address was used as an input or output.

WITH active_addresses AS (
-- input addresses
SELECT DATE(block_timestamp) AS date,
array_to_string(inputs.addresses, ",") as address FROM `bigquery-public-data.crypto_bitcoin.inputs` as inputs
UNION ALL
-- output addresses
SELECT DATE(block_timestamp) AS date,
array_to_string(outputs.addresses, ",") as address FROM `bigquery-public-data.crypto_bitcoin.outputs` as outputs
),

Next, the CTE active_addresses_group_by_date selects data from the table active_addresses , and calculates the count of unique addresses that were active on each day. The results are grouped by date.

active_addresses_group_by_date as (
SELECT date, COUNT(DISTINCT address) as active_addresses_count
FROM active_addresses
WHERE date > "2009-01-01" AND date <= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY date
)

Finally, we use the SELECT statement to fetch date and active addresses count, and display these when we run the query.

SELECT date, active_addresses_count FROM active_addresses_group_by_date
ORDER BY date

The full query is shown below.

WITH active_addresses AS (
-- input addresses
SELECT DATE(block_timestamp) AS date,
array_to_string(inputs.addresses, ",") as address FROM `bigquery-public-data.crypto_bitcoin.inputs` as inputs
UNION ALL
-- output addresses
SELECT DATE(block_timestamp) AS date,
array_to_string(outputs.addresses, ",") as address FROM `bigquery-public-data.crypto_bitcoin.outputs` as outputs
),

active_addresses_group_by_date as (
SELECT date, COUNT(DISTINCT address) as active_addresses_count
FROM active_addresses
WHERE date > "2009-01-01" AND date <= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY date
)

SELECT date, active_addresses_count FROM active_addresses_group_by_date
ORDER BY date

Step 3

Click Run.

The query ran in BigQuery using 154.54 GB. The results are displayed in the Query results section. The table header row contains ‘date’ and ‘active_addresses_count’. The result shows the number of unique active addresses per day.

Step 4

You may export the results by clicking Save Results, and in the, dropdown menu, select an option to export to CSV, JSON or Google Sheets.

Interested in learning more about on-chain metrics? Check out the articles below.

Also read:

Thank you for reading!

If you liked the article and would like to see more, consider following me. I post regularly on topics related to on-chain analysis, machine learning and BigQuery. I try to keep my articles simple but precise, providing code, examples and simulations whenever possible.

--

--

I invest in companies. CIO and co-founder at Meadowfield Capital. Stanford PhD.