Thank you for the insightful questions! Please see below...
The research requirements for the Cornell studies involve very high data standards: Specificity and Sensitivity both need to be greater than 95%. That's even more challenging than having an accuracy level that high, because both types of error have to be controlled independently rather than averaged together. We conducted a validation study at the beginning of this project, which revealed we need between 15 and 20 annotations per vessel to achieve that quality level. So, today, we collect 20 annotations per vessel.
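To make the distinction concrete, here's a minimal sketch (not our actual analysis code) of how sensitivity and specificity differ from accuracy. The class balance in the example is made up for illustration; the point is just that on imbalanced data a classifier can clear a 95% accuracy bar while badly failing the sensitivity bar:

```python
# Sketch (illustrative only): sensitivity and specificity from
# confusion-matrix counts, vs. plain accuracy.

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity

def accuracy(tp, fn, tn, fp):
    return (tp + tn) / (tp + fn + tn + fp)

# Hypothetical imbalanced sample: 900 negatives, 100 positives.
# Missing half the positives still looks fine by accuracy alone:
tp, fn, tn, fp = 50, 50, 900, 0
print(accuracy(tp, fn, tn, fp))                 # 0.95 -- passes a 95% accuracy bar
print(sensitivity_specificity(tp, fn, tn, fp))  # (0.5, 1.0) -- fails the sensitivity bar
```

That's why requiring *both* numbers above 95% forces us to collect more annotations than a plain accuracy target would.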
Yes - exactly. We are currently working on a method to assign weight to a user's response based on their sensitivity, which will mean we can collect fewer annotations per vessel. This method will also adjust for individual response bias. Once it has been validated, we can even apply it retroactively. This is currently our highest research priority in EyesOnALZ.
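For intuition, here's a rough sketch of what sensitivity-weighted voting could look like. This is a hypothetical illustration, not the method we're validating (the function, weights, and example numbers are all made up):

```python
# Hypothetical sketch of sensitivity-weighted vote aggregation
# (NOT the validated EyesOnALZ method): each user's vote counts
# in proportion to a per-user reliability weight.

def weighted_vote(votes, weights):
    """votes: list of 0/1 responses; weights: per-user reliability.

    Returns the weighted fraction of users voting 1.
    """
    return sum(w * v for w, v in zip(weights, votes)) / sum(weights)

# Example: three high-sensitivity users outvote four low-sensitivity ones.
votes   = [1, 1, 1, 0, 0, 0, 0]
weights = [0.9, 0.9, 0.9, 0.4, 0.4, 0.4, 0.4]
print(weighted_vote(votes, weights))  # > 0.5, so the majority-weight answer is 1
```

Because skilled users carry more weight, the crowd can reach the same confidence with fewer total votes, which is how this would let us lower the per-vessel annotation count.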
The sudden drop associated with getting dinged for a few incorrect (or disagreed-with) annotations is an unavoidable characteristic of our unique method, which ensures that all contributions are helpful, and that web bots or malicious actors cannot adversely influence the data. Another key aspect of this method is that it adjusts rapidly to intraday fluctuations in a user's sensitivity. For example, many people naturally experience an afternoon "slump". We also have Alzheimer's patients participating, who have moments of higher and lower lucidity, and this ensures they can contribute beneficially anytime they play.
Also, as we continue to curate and build our library of calibration vessels, we hope to reduce disagreements and increase learning opportunities.
Sensitivity is indeed windowed, but the "blue tube" (label credited to @evelynrsmith) already reflects the windowed value. The window size was chosen to balance between statistical power and responsiveness.
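In case it helps to picture it, a windowed sensitivity estimate can be sketched like this (the window size here is a made-up parameter, not the one we actually use; the real calculation differs in detail):

```python
from collections import deque

# Illustrative sliding-window sensitivity tracker. A small window
# reacts quickly (e.g. to an afternoon slump); a larger window is
# less noisy but slower to respond -- the trade-off mentioned above.

class WindowedSensitivity:
    def __init__(self, window=20):
        # Only the most recent `window` calibration results are kept.
        self.results = deque(maxlen=window)

    def record(self, correct: bool):
        self.results.append(1 if correct else 0)

    def value(self):
        """Fraction correct over the window, or None before any data."""
        return sum(self.results) / len(self.results) if self.results else None
```

With a short window, a few misses after a streak of correct answers pulls the value down quickly, which is the "sudden drop" behavior described earlier.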
I hope that's helpful and thanks again for asking!