The "timestamp" join (ASOF) ⏱
The ASOF join is fantastic for joining two tables with mismatched timestamps, where you want to join on the closest timestamp in the right table to the timestamp in the left table.
Syntax
At the time of writing (2024-04-07), the syntax for the ASOF join changes depending on the database engine you're using. However, the general idea is the same across all engines: you're joining two tables based on the closest timestamp in the right table to the timestamp in the left table.
DuckDB
Note
The explanation below is adapted from the DuckDB documentation.
In DuckDB, the ASOF keyword is a join modifier just like NATURAL, which means that you can use it with any join type (left, inner, etc.).
The ASOF join requires at least one inequality condition on the ordering field. The inequality can be any inequality condition (>=, >, <=, <) on any data type, but the most common form is >= on a temporal type. Any other conditions must be equalities (or NOT DISTINCT). This means that the left/right order of the tables is significant.
SELECT *
FROM left_table
ASOF [LEFT | INNER | ...] JOIN right_table
ON left_table.id_column = right_table.id_column
AND left_table.timestamp_column >= right_table.timestamp_column
ASOF joins can also specify join conditions on matching column names with the USING syntax, but the last attribute in the list must be the inequality, which will be greater than or equal to (>=):
SELECT *
FROM left_table
ASOF [LEFT | INNER | ...] JOIN right_table
USING (id_column, timestamp_column)
Snowflake
Snowflake's syntax for the ASOF join is a little different. Rather than being a join modifier, ASOF is treated as a new join type, and Snowflake introduces the MATCH_CONDITION clause:
SELECT *
FROM left_table
ASOF JOIN right_table
MATCH_CONDITION (left_table.timestamp_column >= right_table.timestamp_column)
ON left_table.id_column = right_table.id_column
It comes with an added restriction: the MATCH_CONDITION clause must reference the left table's column first, then the right table's column. This isn't typical for join conditions, hence calling it out here.
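To make that concrete, here is the match condition written both ways; per the restriction above, only the first form is accepted, even though the two are logically equivalent:
-- Accepted: the left table's column is referenced first
MATCH_CONDITION (left_table.timestamp_column >= right_table.timestamp_column)
-- Rejected: the right table's column is referenced first
MATCH_CONDITION (right_table.timestamp_column <= left_table.timestamp_column)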
Availability
At the time of writing (2024-04-07), the ASOF join has the following availability:
- DuckDB: ✅ (>=0.6)
- SQLite: ❌
- PostgreSQL: ❌
- SQL Server: ❌
- Snowflake: ✅ (February 28, 2024)
Are you aware of any other databases that support the ASOF join?
Examples
Sample data
Suppose we have two tables, transactions and exchange_rates, with the following data:
transactions
| date | account | amount | currency |
|---|---|---|---|
| 2023-12-31 | A | 100.00 | GBP |
| 2024-01-01 | B | 200.00 | GBP |
| 2024-01-02 | C | 300.00 | GBP |
| 2024-01-04 | A | 400.00 | GBP |
| 2024-01-07 | B | 500.00 | GBP |
| 2024-01-07 | C | 600.00 | GBP |
| 2024-01-08 | A | 700.00 | GBP |
| 2024-01-10 | B | 800.00 | GBP |
| 2024-01-13 | C | 900.00 | GBP |
| 2024-01-13 | A | 1000.00 | GBP |
| 2024-01-17 | B | 1100.00 | GBP |
| 2024-01-17 | C | 1200.00 | GBP |
| 2024-01-18 | A | 1300.00 | GBP |
| 2024-01-20 | B | 1400.00 | GBP |
| 2024-01-23 | C | 1500.00 | GBP |
| 2024-01-23 | A | 1600.00 | GBP |
| 2024-01-27 | B | 1700.00 | GBP |
| 2024-01-27 | C | 1800.00 | GBP |
| 2024-01-28 | A | 1900.00 | GBP |
| 2024-01-30 | B | 2000.00 | GBP |
| 2024-02-02 | C | 2100.00 | GBP |
| 2024-02-02 | A | 2200.00 | GBP |
| 2024-02-06 | B | 2300.00 | GBP |
| 2024-02-06 | C | 2400.00 | GBP |
Expand for the object DDL
create or replace table transactions(
"date" date,
account varchar(8),
amount decimal(10, 2),
currency varchar(3),
constraint pk__transactions primary key ("date", account),
);
insert into transactions
values
('2023-12-31', 'A', 100, 'GBP'),
('2024-01-01', 'B', 200, 'GBP'),
('2024-01-02', 'C', 300, 'GBP'),
('2024-01-04', 'A', 400, 'GBP'),
('2024-01-07', 'B', 500, 'GBP'),
('2024-01-07', 'C', 600, 'GBP'),
('2024-01-08', 'A', 700, 'GBP'),
('2024-01-10', 'B', 800, 'GBP'),
('2024-01-13', 'C', 900, 'GBP'),
('2024-01-13', 'A', 1000, 'GBP'),
('2024-01-17', 'B', 1100, 'GBP'),
('2024-01-17', 'C', 1200, 'GBP'),
('2024-01-18', 'A', 1300, 'GBP'),
('2024-01-20', 'B', 1400, 'GBP'),
('2024-01-23', 'C', 1500, 'GBP'),
('2024-01-23', 'A', 1600, 'GBP'),
('2024-01-27', 'B', 1700, 'GBP'),
('2024-01-27', 'C', 1800, 'GBP'),
('2024-01-28', 'A', 1900, 'GBP'),
('2024-01-30', 'B', 2000, 'GBP'),
('2024-02-02', 'C', 2100, 'GBP'),
('2024-02-02', 'A', 2200, 'GBP'),
('2024-02-06', 'B', 2300, 'GBP'),
('2024-02-06', 'C', 2400, 'GBP'),
;
exchange_rates
| date | from_currency | to_currency | rate |
|---|---|---|---|
| 2024-01-01 | GBP | INR | 110.0000 |
| 2024-01-01 | GBP | JPY | 160.0000 |
| 2024-01-01 | GBP | USD | 1.3000 |
| 2024-02-01 | GBP | INR | 120.0000 |
| 2024-02-01 | GBP | JPY | 170.0000 |
| 2024-02-01 | GBP | USD | 1.4000 |
| 2024-03-01 | GBP | INR | 100.0000 |
| 2024-03-01 | GBP | JPY | 150.0000 |
| 2024-03-01 | GBP | USD | 1.2000 |
Expand for the object DDL
create or replace table exchange_rates(
"date" date,
from_currency varchar(3),
to_currency varchar(3),
rate decimal(10, 4),
constraint pk__exchange_rates primary key ("date", from_currency, to_currency),
);
insert into exchange_rates
values
('2024-01-01', 'GBP', 'INR', 110.0),
('2024-01-01', 'GBP', 'JPY', 160.0),
('2024-01-01', 'GBP', 'USD', 1.3),
('2024-02-01', 'GBP', 'INR', 120.0),
('2024-02-01', 'GBP', 'JPY', 170.0),
('2024-02-01', 'GBP', 'USD', 1.4),
('2024-03-01', 'GBP', 'INR', 100.0),
('2024-03-01', 'GBP', 'JPY', 150.0),
('2024-03-01', 'GBP', 'USD', 1.2),
;
Finding the USD amount for each transaction (DuckDB)
Solution
To find the USD amount for each transaction, we can use the ASOF join to find the closest exchange rate to the transaction date:
select
transactions.date,
transactions.account,
transactions.amount,
exchange_rates.rate,
transactions.amount * exchange_rates.rate as amount_usd,
from transactions
asof left join exchange_rates
on transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and transactions.date >= exchange_rates.date
order by
transactions.date,
transactions.amount
| date | account | amount | rate | amount_usd |
|---|---|---|---|---|
| 2023-12-31 | A | 100.00 | null | null |
| 2024-01-01 | B | 200.00 | 1.3000 | 260.000000 |
| 2024-01-02 | C | 300.00 | 1.3000 | 390.000000 |
| 2024-01-04 | A | 400.00 | 1.3000 | 520.000000 |
| 2024-01-07 | B | 500.00 | 1.3000 | 650.000000 |
| 2024-01-07 | C | 600.00 | 1.3000 | 780.000000 |
| 2024-01-08 | A | 700.00 | 1.3000 | 910.000000 |
| 2024-01-10 | B | 800.00 | 1.3000 | 1040.000000 |
| 2024-01-13 | C | 900.00 | 1.3000 | 1170.000000 |
| 2024-01-13 | A | 1000.00 | 1.3000 | 1300.000000 |
| 2024-01-17 | B | 1100.00 | 1.3000 | 1430.000000 |
| 2024-01-17 | C | 1200.00 | 1.3000 | 1560.000000 |
| 2024-01-18 | A | 1300.00 | 1.3000 | 1690.000000 |
| 2024-01-20 | B | 1400.00 | 1.3000 | 1820.000000 |
| 2024-01-23 | C | 1500.00 | 1.3000 | 1950.000000 |
| 2024-01-23 | A | 1600.00 | 1.3000 | 2080.000000 |
| 2024-01-27 | B | 1700.00 | 1.3000 | 2210.000000 |
| 2024-01-27 | C | 1800.00 | 1.3000 | 2340.000000 |
| 2024-01-28 | A | 1900.00 | 1.3000 | 2470.000000 |
| 2024-01-30 | B | 2000.00 | 1.3000 | 2600.000000 |
| 2024-02-02 | C | 2100.00 | 1.4000 | 2940.000000 |
| 2024-02-02 | A | 2200.00 | 1.4000 | 3080.000000 |
| 2024-02-06 | B | 2300.00 | 1.4000 | 3220.000000 |
| 2024-02-06 | C | 2400.00 | 1.4000 | 3360.000000 |
Note the NULL rate and amount_usd for the first transaction -- this is because our exchange_rates table has no rate on or before that transaction's date, and we specified ASOF LEFT JOIN so that rows without a match are still kept in the result set (just like a normal LEFT JOIN!).
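For comparison, a plain asof join (the inner variant) would drop that first transaction from the result set entirely. Here's a minimal sketch of the same query with only the join type changed:
select
transactions.date,
transactions.account,
transactions.amount,
exchange_rates.rate,
transactions.amount * exchange_rates.rate as amount_usd
from transactions
asof join exchange_rates /* inner: transactions with no earlier rate are dropped */
on transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and transactions.date >= exchange_rates.date
order by
transactions.date,
transactions.amount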
Exercise
Can you re-write the ASOF LEFT JOIN query above using the Snowflake syntax?
Expand for the Snowflake equivalent
select
transactions.date,
transactions.account,
transactions.amount,
exchange_rates.rate,
transactions.amount * exchange_rates.rate as amount_usd,
from transactions
asof join exchange_rates
match_condition (transactions.date >= exchange_rates.date)
on transactions.currency = exchange_rates.from_currency
/* The to_currency filter currently isn't allowed in the ASOF join conditions, so it's applied here; the coalesce keeps rows that found no matching rate */
where coalesce(exchange_rates.to_currency, 'USD') = 'USD'
order by
transactions.date,
transactions.amount
"Traditional" solutions
For comparison, here are a few solutions that illustrate how you might solve this problem without the ASOF join.
Using a lateral join
If your database supports lateral "joins" (like DuckDB), you can use a lateral join to find the closest exchange rate to the transaction date:
select
transactions.date,
transactions.account,
transactions.amount,
rates.rate,
transactions.amount * rates.rate as amount_usd,
from transactions,
lateral (
select rate
from exchange_rates
where 1=1
and transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and exchange_rates.date <= transactions.date
order by exchange_rates.date desc
limit 1
) as rates
order by
transactions.date,
transactions.amount
Note that this approach would drop the first transaction from the result set, because there is no exchange rate on or before that transaction's date in our exchange_rates table. We'll "fix" this in the next example.
Using a left lateral join
To keep the first transaction in the result set when using lateral, we can move the lateral subquery to a left join:
select
transactions.date,
transactions.account,
transactions.amount,
rates.rate,
transactions.amount * rates.rate as amount_usd,
from transactions
left join lateral (
select rate
from exchange_rates
where 1=1
and transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and exchange_rates.date <= transactions.date
order by exchange_rates.date desc
limit 1
) as rates on 1=1
order by
transactions.date,
transactions.amount
Using left join and qualify
If your database doesn't have ASOF or lateral joins, you can use a left join and the QUALIFY clause to find the closest exchange rate to the transaction date:
select
transactions.date,
transactions.account,
transactions.amount,
exchange_rates.rate,
transactions.amount * exchange_rates.rate as amount_usd,
from transactions
left join exchange_rates
on transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and exchange_rates.date <= transactions.date
qualify 1 = row_number() over (
partition by transactions.date, transactions.account
order by exchange_rates.date desc
)
order by
transactions.date,
transactions.amount
If you're using a database that doesn't support the QUALIFY clause, you can wrap the query in a subquery with the ROW_NUMBER() calculation saved to a column and filter on the column in the outer query.
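For example, here's a minimal sketch of that rewrite (the rate_recency alias and ranked subquery name are just illustrative):
select
ranked.date,
ranked.account,
ranked.amount,
ranked.rate,
ranked.amount_usd
from (
select
transactions.date,
transactions.account,
transactions.amount,
exchange_rates.rate,
transactions.amount * exchange_rates.rate as amount_usd,
row_number() over (
partition by transactions.date, transactions.account
order by exchange_rates.date desc
) as rate_recency
from transactions
left join exchange_rates
on transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and exchange_rates.date <= transactions.date
) as ranked
where ranked.rate_recency = 1 /* keep only the most recent rate on or before each transaction date */
order by
ranked.date,
ranked.amount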
Using a left join and correlated subquery
If your database doesn't have ASOF joins, lateral joins, or the QUALIFY clause (and you don't want to use ROW_NUMBER() in a subquery), you can use a correlated subquery:
select
transactions.date,
transactions.account,
transactions.amount,
exchange_rates.rate,
transactions.amount * exchange_rates.rate as amount_usd
from transactions
left join exchange_rates
on transactions.currency = exchange_rates.from_currency
and exchange_rates.to_currency = 'USD'
and exchange_rates.date <= transactions.date
where 0=1
or exchange_rates.date is null
or exchange_rates.date = (
select max("date")
from exchange_rates as rates_inner
where 1=1
and exchange_rates.from_currency = rates_inner.from_currency
and exchange_rates.to_currency = rates_inner.to_currency
and rates_inner.date <= transactions.date
)
order by
transactions.date,
transactions.amount
Note that, since this specific example only uses the rate column from the exchange_rates table, we could have evaluated the subquery in the SELECT clause instead of the FROM clause. However, this would not work if we needed more than one column from the exchange_rates table, hence sticking to approaches that cater for multiple columns.
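For the curious, here's a minimal sketch of that SELECT-clause variant, assuming your engine allows order by/limit inside a scalar subquery (DuckDB and PostgreSQL do):
select
transactions.date,
transactions.account,
transactions.amount,
(
/* most recent USD rate on or before the transaction date */
select rates_inner.rate
from exchange_rates as rates_inner
where 1=1
and rates_inner.from_currency = transactions.currency
and rates_inner.to_currency = 'USD'
and rates_inner.date <= transactions.date
order by rates_inner.date desc
limit 1
) as rate
from transactions
order by
transactions.date,
transactions.amount
Even adding amount_usd would mean repeating that subquery (or wrapping this query in another one), which illustrates the multiple-column limitation mentioned above.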
Wrap up
Like with most SQL problems, there are multiple ways to get the output that we want. The ASOF join is just a great way to solve this particular problem, and it significantly reduces the complexity of our SQL code!
Note that, also like with most SQL problems, the performance of these queries will depend on the size of the tables and the indexes available. If you're working with large tables, you will need to consider the performance implications of the ASOF approach compared to more "traditional" approaches.