QuantRocket

Speed up data collection for far out futures


#1

I have a futures DB with a lot of contracts. Collecting data is usually pretty quick for most futures. However, when it gets near the end of collection (300 or so contracts) it becomes very slow. I think what is happening is that it’s requesting data for some contracts which are listed but have no data, so it retries about 10 times to get data. Is there a way to lower the number of retries to speed up data collection?

I know the system automatically skips futures that are very far out, but I guess these don’t fall into that category.


#2

The IB API sends error code 162 when there is no data, and QuantRocket won’t retry in that case. Retries occur when there is no response from the API, which is considered an unexpected error condition. Does it affect specific conids? Does it happen when you collect data for those conids in isolation? Can you provide steps to reproduce the issue?


#3

The DB has around 3,000 futures. When I start the collection it runs very quickly until it reaches ~300 completions, and then it slows down. To give you an idea, when the collection starts, it says it will take around 4.5 minutes to complete. The second estimated-runtime message says 8 hours 30 minutes remaining.

It gets stuck on some conids and has to retry many times to get the data. For example, with conid 328011116 I got this error:

[three screenshots of the error output]

The last retry also failed and then this error printed out.

[screenshot of the final error]

However, if I run just quantrocket history collect us-futures-1d -i 328011116 it takes about 10 seconds and then saves 3 total records.

In my last post I assumed it was due to futures that weren’t expiring soon, since those were the ones I saw in the console having many retries. I assumed they had no data, but if I collect just the conids which I see being retried, they all finish fairly quickly and data is downloaded and saved.

The only way I can reproduce the issue is to collect the entire DB and wait until some conids get stuck.

I’ve also noticed that sometimes if I stop IBG1 and start it again, collection speeds up again.


#4

That’s a tough one; I can’t reproduce this on a test universe of all Globex futures. Usually, if IB Gateway becomes unresponsive, it might be associated with either (1) excessive load in your deployment, such as too much memory usage, or (2) Saturdays, when the API is spottier. It might not be specific to the financial instruments. Canceling and restarting the collection might clear it up, and/or restarting IB Gateway. Sorry not to have a more definitive answer.


#5

Thanks for looking into this. I found a workaround to get the data much quicker. I download the master file and group conids by their TradingClass. Then I loop through the trading classes and download the history only for that subset of conids. This works much faster than collecting everything at once.
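For anyone who wants to try the same thing, here is a rough sketch of the grouping step in Python. It is not my exact script. The ConId column name is an assumption about the master file layout (TradingClass is the column I actually group on), the sample rows are made up, and collect is a hypothetical placeholder for whatever actually triggers the history collection for a list of conids.

```python
import csv
import io
from collections import defaultdict


def group_conids_by_trading_class(master_csv, conid_col="ConId", class_col="TradingClass"):
    """Map each trading class to the list of conids listed under it.

    master_csv is any open file-like object containing the master file as CSV.
    The default column names are assumptions about the master file layout.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(master_csv):
        groups[row[class_col]].append(row[conid_col])
    return dict(groups)


def collect_by_trading_class(groups, collect):
    """Call collect(conids) once per trading class, in sorted class order.

    collect is a hypothetical callable supplied by the caller.
    """
    for trading_class in sorted(groups):
        collect(groups[trading_class])


# Tiny in-memory stand-in for a downloaded master file (made-up rows).
sample = io.StringIO(
    "ConId,TradingClass\n"
    "1001,ES\n"
    "1002,ES\n"
    "2001,CL\n"
)
groups = group_conids_by_trading_class(sample)

# Record each call instead of hitting a real service.
calls = []
collect_by_trading_class(groups, calls.append)
```

Here calls ends up holding one list of conids per trading class, which is the shape each collection request would take.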

I’ve also tried just looping through all the conids and downloading 100 or so at a time and it works quite well. If I try to get around 500 conids at once then I run into the same issue where it gets stuck trying to download the same security 10 times and fails.
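The fixed-size batching can be sketched like this. It only builds the command lines, it does not run them. The database code us-futures-1d is the one from my earlier post; passing several conids after -i in one command is an assumption extrapolated from the single-conid command above, so check the CLI help before relying on it. The conid values are placeholders.

```python
import itertools


def batched(conids, size=100):
    """Yield successive lists of at most `size` conids."""
    it = iter(conids)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def build_collect_command(db_code, conids):
    """Build the argument list for one collection request.

    Assumes -i accepts several conids in a single invocation.
    """
    return ["quantrocket", "history", "collect", db_code, "-i", *map(str, conids)]


# Placeholder conids; in practice these would come from the master file.
conids = list(range(1, 251))
commands = [build_collect_command("us-futures-1d", batch) for batch in batched(conids)]
```

Each entry in commands could then be handed to something like subprocess.run, one batch at a time. The sketch leaves out how to pace the batches.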

If I group by TradingClass or download 100 conids at a time it doesn’t get stuck collecting the same security over and over again.