Dear Mike and Guy,
I met with the Cornell Lab yesterday specifically to discuss Stall Catchers data quality and data needs. Among other things, we discovered that an uploading glitch affected two recent data upload attempts, which explains why high-volume annotators like yourselves ("super catchers!") ran out of movies.
The new data is being uploaded, and we have an updated plan to avoid future data gaps. Unfortunately, this plan involves unavoidable uncertainties related to automating aspects of the data preparation pipeline. In particular, we cannot precisely estimate how long it will take to engineer this automation. While we wait for the automation to be completed, we will continue to upload approximately 2000 to 4000 new vessels per week, which is double the previous volume that resulted in gaps. We think this will be enough to prevent gaps until the automation is ready. However, if annotation volume suddenly increases or we see a surge of new participants, we may end up with another data drought. We are exploring other ways to avoid data gaps during this awkward interim period.
The good news is that approximately 240,000 vessels from 7 or 8 studies, including treatment studies, have already been imaged and are waiting to be pre-processed for Stall Catchers. So as soon as the automation is in place, we will have plenty of data to analyze, and lots of potentially interesting research results may emerge.
Thanks for your excellent communication and enduring patience as we work to further open the data spigot.
All the best,