The data mindset is good for some questions, but completely inadequate for others. But try arguing that with someone who insists on seeing the numbers.
The promise is that enough data will give you insight. Retain data indefinitely, maybe waterboard it a little, and it will spill all its secrets.
There’s a little bit of a con going on here. On the data side, they tell you to collect all the data you can, because they have magic algorithms to help you make sense of it.
On the algorithms side, where I live, they tell us not to worry too much about our models, because they have magical data. We can train on it without caring how the process works.
The data collectors put their faith in the algorithms, and the programmers put their faith in the data.
At no point in this process is there any understanding, or wisdom. There’s not even domain knowledge. Data science is the universal answer, no matter the question.
— From Maciej Cegłowski’s talk at the Strata+Hadoop 2015 conference in NYC.