I would actually disagree with you there. I think the problem is that these modellers are chasing precision rather than accuracy.
The entire point of statistical modelling is to use a sparse set of datapoints to create a generalized model. Go back to Stats 101 and the sample size problem: what is the generally accepted minimum number of samples? It is 25 or 30. With around 200 points you can typically reach significance at the 99% confidence level.
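To put rough numbers on that (a toy sketch with made-up data, just to illustrate the sample-size point, not anything from our datasets):

```python
# Toy illustration (made-up data): how the margin of error on a sample mean
# tightens as the sample size grows, using a t-based confidence interval.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

for n, conf in [(30, 0.95), (200, 0.99)]:
    sample = rng.normal(loc=100.0, scale=15.0, size=n)  # pretend measurements
    sem = stats.sem(sample)                              # standard error of the mean
    half_width = stats.t.ppf((1 + conf) / 2, df=n - 1) * sem
    print(f"n={n:4d}: {conf:.0%} CI = sample mean ± {half_width:.2f}")
```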
I have some coworkers from medical research - back then we used 50 or 100 samples to draw medical conclusions... and now portions of the bank claim they can't build a sufficiently accurate model due to lack of data when they have datasets in the thousands.
The trade-off is that you can use 500 data points to create a generalized model, but it will lack precision. Or you can build a model with 500,000,000 datapoints (we have them - do not believe anyone who says they lack data, unless it is risk-specific), but it will lose the ability to generalize over time.
Modellers try to take the best of both worlds with various methods to reduce overfitting (you can google the regularization methods), but IMHO there is only one true method, which is intuition - and this currently cannot be modelled, or at least I am not aware how.
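For what it's worth, here is a minimal sketch of the kind of regularization I mean (my own toy example, nothing like production models): ordinary least squares versus ridge regression on noisy data with many irrelevant features, so you can see the penalty give up a bit of in-sample fit for out-of-sample generalization.

```python
# Toy example (simulated data): OLS vs. ridge regression when most features
# are noise. The L2 penalty shrinks coefficients, typically improving
# out-of-sample fit at the cost of some in-sample fit.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 50))        # 500 points, 50 features
true_coef = np.zeros(50)
true_coef[:5] = 1.0                   # only 5 features actually matter
y = X @ true_coef + rng.normal(scale=2.0, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for name, model in [("OLS", LinearRegression()), ("Ridge(alpha=10)", Ridge(alpha=10.0))]:
    model.fit(X_tr, y_tr)
    print(name,
          "train R2:", round(model.score(X_tr, y_tr), 3),
          "test R2:", round(model.score(X_te, y_te), 3))
```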
I believe we are talking about different things. You are talking about statistical curve fitting. I am talking about DNNs that can model and generalize real-world information and deal with a high number of factors influencing corporate results going forward. Curve fitting is the reason why Wall Street analyst predictions are subpar, and also why most investors underperform: most of them expect the future to look like the past - which is what curve fitting is.
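To show what I mean by that last sentence, a toy sketch (numbers invented purely for illustration): fit a polynomial to a smooth history and extrapolate; the fit has no way of seeing a regime change coming.

```python
# Toy example (invented numbers): a curve fit extrapolates the past trend,
# so it cannot anticipate a change in regime.
import numpy as np

t_hist = np.arange(10)
y_hist = 1.05 ** t_hist                      # history: smooth 5% compounding
coeffs = np.polyfit(t_hist, y_hist, deg=3)   # classic curve fit on the past

t_future = np.arange(10, 15)
forecast = np.polyval(coeffs, t_future)           # assumes future looks like the past
actual = y_hist[-1] * 0.8 ** np.arange(1, 6)      # what actually happens: a decline

print("curve-fit forecast:", np.round(forecast, 2))
print("actual outcome:    ", np.round(actual, 2))
```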
People who outperform are:
1. People who have a higher-accuracy model (whether hand-built or ML/automatic).
2. People who make longer-term predictions than others.
If the future looks like the past, nobody can outperform simple curve fitting on 1. or 2. So people can outperform only if curve fitting is wrong. Determining that it is wrong can be based on real-world knowledge, second-order thinking, intuition, whatever. And these can be ML/DNNed if sufficient data were available. And sufficient data here is way larger than what's needed for curve fitting.
I am not sure what you are talking about when you say "you can build a model with 500,000,000 datapoints" - no, you cannot. There are not enough companies on Earth to have that many datapoints. You can do that for price data, but not for fundamental data like yearly sales/profits/etc. There's a reason people build DNNs based on data that's available daily, or even better every (nano/micro/milli)second. But that excludes most fundamental data.
* People can also outperform by choosing an area where competition is low and their models don't have to compete with competent curve fitters.
** People and algos can also outperform by exploiting (psychological/emotional/technical/etc.) drawbacks of other actors. I'm not talking about this now though, even though it's a fascinating area on its own.
Edit: For fun and clarity, I'll classify how I see some investors:
- Graham cigar butt investing: Mostly expecting the future to differ from the past.
- Growth investing: Mostly expecting the company to keep growing for longer than others expect.
- Buffett: higher-accuracy model and longer-term predictions than others.
- Writser: choose an area where competition is low and you don't have to compete with ...
All of the above (may) exploit the drawbacks of other actors:
- Graham cigar butt investing: exploit others giving up on an underperforming company.
- Growth investing: exploit others undervaluing a growth company even when the growth is known.
- Buffett: exploits the heck out of the irrationality of other actors.
- Writser: exploits the behavior of a limited set of actors in special situations.