aggTrades: retrieving data over a date range

Note: If both startTime and endTime are sent, limit should not be sent AND the distance between startTime and endTime must be less than 24 hours.

When you requested data for a range of dates, the API used to return it in 24-hour chunks, and later in 1-hour chunks.

Now, after the 2022-12-05 update, the data is returned only 1000 trades at a time!
That is catastrophically little.

Now, if I want to download 24 hours of data for a single trading pair, I have to wait almost a day))

And what if you want data for several coins, or for a whole month?

If limit not provided, regardless of used in combination or sent individually, the endpoint will use the default limit.

https://binance-docs.github.io/apidocs/spot/en/#change-log

  • Changes to GET /api/v3/aggTrades
    • New behavior: startTime and endTime can be used individually and the 1 hour limit has been removed.
      • If limit not provided, regardless of used in combination or sent individually, the endpoint will use the default limit.
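
To illustrate the new behavior: a request that sends only startTime, with no endTime and no limit, is now accepted and simply falls back to the default limit. A rough sketch with the requests package (the timestamp is just an example value):

# Rough sketch, assuming the requests package; the timestamp is an example value.
import requests

response = requests.get(
    'https://api.binance.com/api/v3/aggTrades',
    params={
        'symbol': 'BTCUSDT',
        'startTime': 1670198400000,  # 2022-12-05 00:00:00 UTC
        # No endTime and no limit: the endpoint falls back to its default limit.
    },
)
print(len(response.json()))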

The maximum limit is 1000 trades. This is very little…
Has nobody else run into this problem this week?!

Hi,

Have you tried downloading your desired data from https://data.binance.vision? You can collect individual days or an entire month of aggTrades, klines and trades. It is a more efficient way of grabbing large intervals of trading data.
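
For example, a daily archive can be grabbed with a few lines of Python. This is a rough sketch assuming the usual archive URL pattern (data/spot/daily/aggTrades/<SYMBOL>/<SYMBOL>-aggTrades-<DATE>.zip), so double-check the path for your symbol:

# Rough sketch: assumes the daily archive URL pattern on data.binance.vision.
import requests

symbol = 'BTCUSDT'
date = '2022-12-16'  # example date
url = ('https://data.binance.vision/data/spot/daily/aggTrades/'
       f'{symbol}/{symbol}-aggTrades-{date}.zip')

response = requests.get(url)
response.raise_for_status()
with open(f'{symbol}-aggTrades-{date}.zip', 'wb') as f:
    f.write(response.content)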


Thank you so much for the link.
Yes, in the current situation this is one of the solutions, thank you!

How do I download the current day's data, or the last 24 hours of data?

For that you would need to use the API and paginate through the results. The data.binance.vision site is generally updated daily with the previous day’s data.

jonte ignored the original question in his answer. Downloading the current day's data takes hours with the current limit.
Can you bring back the 1-hour download window that existed for 5 years?

jonte, since the limit was introduced in the last API update, downloading data takes hundreds of times longer - that is the problem…

If you're interested in current data, could WebSocket streams be a solution? You can subscribe to <symbol>@aggTrade streams and get live updates as soon as they happen. There is a limit on the number of subscriptions in a single connection (around 200), so you'll need multiple connections, and likely some backup connections too. But if you can handle the incoming traffic, that should be doable.
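
Something like this could work as a starting point. It's a rough sketch assuming the websockets package and the public combined-stream endpoint, with a couple of example symbols:

# Rough sketch: assumes the websockets package and the combined-stream endpoint.
import asyncio
import json

import websockets

STREAMS = ['btcusdt@aggTrade', 'ethusdt@aggTrade']  # example symbols
URL = 'wss://stream.binance.com:9443/stream?streams=' + '/'.join(STREAMS)


async def listen():
    async with websockets.connect(URL) as ws:
        async for message in ws:
            event = json.loads(message)
            # Combined streams wrap each payload as {"stream": ..., "data": {...}}.
            trade = event['data']
            print(trade['s'], trade['p'], trade['q'], trade['T'])


asyncio.run(listen())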

Why is that? I don't think it will take a day. You will need to make many requests for symbols that trade a lot, and you'll likely need to be careful with the rate limit, but it shouldn't take a day.

Here’s how to paginate through responses.

For example, say you want to get all aggtrades for BTCUSDT for today. 2022-12-17 starts at timestamp 1671235200000. You make the following request:

https://api.binance.com/api/v3/aggTrades?symbol=BTCUSDT&startTime=1671235200000&limit=1000

You get 1000 aggtrades in response, with the last one being:

  {
    "a": 2003207128,
    "p": "16624.89000000",
    "q": "0.02114000",
    "f": 2343384605,
    "l": 2343384605,
    "T": 1671235210240,
    "m": true,
    "M": true
  }

Then you use the aggtrade ID – "a" – with the fromId parameter for the next request, querying aggtrades starting with the next one:

https://api.binance.com/api/v3/aggTrades?symbol=BTCUSDT&fromId=2003207129&limit=1000

You get another 1000 trades, up to:

  {
    "a": 2003208128,
    "p": "16624.89000000",
    "q": "0.00647000",
    "f": 2343385766,
    "l": 2343385766,
    "T": 1671235228068,
    "m": true,
    "M": true
  }

Then repeat with

https://api.binance.com/api/v3/aggTrades?symbol=BTCUSDT&fromId=2003208129&limit=1000

and so on, until the timestamp "T" reaches the end of your range, or there are no more trades to fetch.
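
In code, that loop could look roughly like this. It's a sketch using the requests package, with error handling omitted and only crude rate limiting:

# Minimal pagination sketch, assuming the requests package.
import time

import requests

URL = 'https://api.binance.com/api/v3/aggTrades'


def fetch_day_aggtrades(symbol, start_ms, end_ms):
    params = {'symbol': symbol, 'startTime': start_ms, 'limit': 1000}
    trades = []
    while True:
        batch = requests.get(URL, params=params).json()
        if not batch:
            break
        for trade in batch:
            if trade['T'] > end_ms:
                return trades
            trades.append(trade)
        # Continue from the aggtrade right after the last one received.
        params = {'symbol': symbol, 'fromId': batch[-1]['a'] + 1, 'limit': 1000}
        time.sleep(0.05)  # crude rate limiting
    return trades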

Due to the cap on limit, you will need about 100 queries to page through a day of BTCUSDT trading. But that's not hours and days - more like 10 seconds if you use a 10 ms delay between requests. Though it does scale up to hours if you want to do this for all symbols.

  1. We know how to google, too.
  2. No, 100 queries are nowhere near enough to get a day's data now! You need thousands of queries! And that takes hours!

Before you tell tales, try downloading 24 hours of data with the new limits yourself!

@sanchesfree, indeed, my estimation for BTCUSDT was way off. It gets much more active during the day.

I had to make 4763 queries to fetch 4757678 aggtrades for yesterday's data. Also, I had to use a ~50–60 ms delay between queries to stay within the rate limit. The whole fetch took 7 minutes 50 seconds.

Well, not hours for this symbol, but it definitely will not scale if you need to track all symbols. Unless you spam the Binance API from multiple IP addresses :sweat_smile:

Here’s a script that I used:

Python source code
#!/usr/bin/env python3

import argparse
import datetime
import sys
import time
from binance.spot import Spot


def main():
    parser = argparse.ArgumentParser()

    parser.add_argument('--symbol', required=True, help='Symbol to fetch')
    parser.add_argument('--date', required=True, help='Date to fetch, ISO format: YYYY-MM-DD')

    args = parser.parse_args()
    date = datetime.date.fromisoformat(args.date)
    start_time = datetime.datetime(
        year=date.year,
        month=date.month,
        day=date.day,
        tzinfo=datetime.timezone.utc,
    )
    end_time = start_time + datetime.timedelta(days=1, milliseconds=-1)

    fetch_aggtrades(symbol=args.symbol,
                    start_time=int(start_time.timestamp() * 1000),
                    end_time=int(end_time.timestamp() * 1000))


def fetch_aggtrades(symbol, start_time, end_time):
    client = Spot()

    last_aggtrade_id = None
    last_timestamp = -1
    last_report = time.monotonic()
    total_aggtrades = 0
    total_queries = 0

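    # Page through aggtrades until we pass end_time or run out of trades.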
    while last_timestamp <= end_time:
        now = time.monotonic()
        if now - last_report > 1.0 and last_timestamp != -1:
            progress = 100.0 * (last_timestamp - start_time) / (end_time - start_time)
            print(f'complete: {progress:6.02f}%, {total_queries} queries, {total_aggtrades} aggtrades',
                  file=sys.stderr)
            last_report = now

        total_queries += 1

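        # The first request selects by time range; subsequent requests paginate
        # by aggtrade ID to step past the 1000-trade-per-request limit.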
        if last_aggtrade_id is None:
            aggtrades = client.agg_trades(symbol, startTime=start_time, endTime=end_time, limit=1000)
        else:
            aggtrades = client.agg_trades(symbol, fromId=last_aggtrade_id+1, limit=1000)

        if len(aggtrades) == 0:
            break

        total_aggtrades += len(aggtrades)

        for aggtrade in aggtrades:
            last_aggtrade_id = aggtrade['a']
            last_timestamp = aggtrade['T']
            if last_timestamp > end_time:
                break

            print_aggtrade_as_csv(aggtrade)

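        # ~60 ms between requests keeps us within the API rate limit.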
        time.sleep(0.06)

    print(f'complete: 100.00%, {total_queries} queries, {total_aggtrades} aggtrades',
          file=sys.stderr)


def print_aggtrade_as_csv(aggtrade):
    aggtrade_id = aggtrade['a']
    price = aggtrade['p']
    quantity = aggtrade['q']
    first_trade_id = aggtrade['f']
    last_trade_id = aggtrade['l']
    timestamp = aggtrade['T']
    buyer_is_maker = aggtrade['m']
    best_match = aggtrade['M']
    print(f'{aggtrade_id},{price},{quantity},{first_trade_id},{last_trade_id},{timestamp},{buyer_is_maker},{best_match}')


if __name__ == '__main__':
    main()

That’s the problem.

For example, a trading bot or charting service downloads data on all coins for the last couple of hours and then listens for live updates through WebSocket before it starts working.

It's that stage - downloading the data for the last few hours - that has basically become unworkable.

ilammy, that query takes 25 minutes for me (an ordinary PC/CPU/network, non-JP location) - and on repeated requests (another symbol or day) something happens: the response time becomes much slower, as if Binance no longer tolerates the spamming (even though there is no problem with the weight). So spamming the Binance API does not look like good advice.
