Have you reached the limitations of data science?


Source: https://www.turing.ac.uk/events/ai-and-data-science-age-covid-19

The Alan Turing Institute organised an informative conference on the application of AI and data science in the age of Covid-19. It was great to start with the trustworthiness of communication through graphical representation and textual content. The discussion of how research funding and publication have changed to meet the needs of policymakers was enlightening. The discussion panels gave the whole picture of Covid-19: the provision of health care at national and international level, and the techniques researchers have applied.

The highlights of the conference were undoubtedly Professor Lord Robert Winston, who reminded us that history had already addressed the difficulties we are currently experiencing, and Neil Lawrence, who reminded us that communication is key to multi-disciplinary collaboration. It is so easy to become engrossed in our little bubble that we focus too much on our mathematics, modeling, and the techniques we have developed.

Where has data science succeeded with Covid-19?

It is worth noting that data science is a broad term, which can be summarised by this equation:

Data + model + computation = possible predictions

Statisticians would agree with this definition. Computer scientists, however, would say that data and manipulations lead to some outcomes. Machine learning specialists would side with the statisticians, since machine learning and AI apply many advanced statistical methodologies. Data science and data analytics have also been used interchangeably over time. The main point is that we need data, we need computations, and we need some mathematics.
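The equation can be made concrete with a very small sketch. The daily counts below are invented purely for illustration, and the straight-line model is an assumption chosen for simplicity; it is not a method presented at the conference.

```python
# Data: hypothetical daily counts, invented purely for illustration.
days = [0, 1, 2, 3, 4]
counts = [10.0, 12.0, 14.0, 16.0, 18.0]


def fit_line(xs, ys):
    """Computation: ordinary least squares for the model y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Possible prediction: evaluate the fitted model at a new point."""
    return intercept + slope * x


intercept, slope = fit_line(days, counts)
next_day = predict(intercept, slope, 5)
```

The three ingredients are visible separately: the lists are the data, the straight line is the model, and the least-squares arithmetic is the computation. The value of `next_day` is only a possible prediction, and it is only as good as the assumed model.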

Covid-19 has relied on data science to bring inequalities in society to our front pages, but data science has also generated some visual representations for political ends. Some graphs do not say whether a logarithmic scale has been used, for example, so readers have to guess and debate. Data science has successfully assisted decision making and discussions between experts. However, the predictions have informed rather than solved some problems. The results have relied on a community of data scientists working together, voluntarily perhaps, to bring us the tools we use worldwide. So the humanity in all of us is rising to the challenge well.

Where has data science been limited?

It was striking that the limits of data science, and of machine learning applied within it, have yet to be highlighted. For example, data science could predict the number of hospital beds needed in the pandemic's next wave. However, how health care provision should adapt to that situation has yet to be suggested.

The issue lies in the nature of the problem. Such solutions rely on optimization: the problem is expressed as constraints and a value to minimize. So we have moved away from prediction to optimization. Many of these problems are NP-hard and belong to computational complexity theory. Deterministic solvers often find a solution by exhaustive search. Non-deterministic solvers rely on probabilistic methods to navigate the solution search space. Generative hyper-heuristics generate general non-deterministic solvers that can be applied, and coded again, for many instances of an optimization problem.
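The contrast between the two families of solvers can be sketched on a toy problem. Everything below is a hypothetical illustration: the demand figures, the limits, and the function names are assumptions, and plain random sampling stands in for the far more sophisticated probabilistic methods used in practice.

```python
import itertools
import random

# Hypothetical instance: predicted demand and maximum extra beds per hospital.
demand = [5, 3, 4]
max_extra = [4, 4, 4]
total_beds = 8


def unmet(allocation):
    """Value to minimize: total predicted demand left without a bed."""
    return sum(max(d - a, 0) for d, a in zip(demand, allocation))


def feasible(allocation):
    """Constraints: per-hospital limits and the overall bed budget."""
    within_limits = all(0 <= a <= m for a, m in zip(allocation, max_extra))
    return within_limits and sum(allocation) <= total_beds


def exhaustive_search():
    """Deterministic solver: enumerate every candidate allocation."""
    ranges = [range(m + 1) for m in max_extra]
    candidates = (c for c in itertools.product(*ranges) if feasible(c))
    return min(candidates, key=unmet)


def random_search(trials, seed=0):
    """Non-deterministic solver: sample the search space at random."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        candidate = tuple(rng.randint(0, m) for m in max_extra)
        if feasible(candidate) and (best is None or unmet(candidate) < unmet(best)):
            best = candidate
    return best


best_exact = exhaustive_search()
best_sampled = random_search(trials=200)
```

With three hospitals the exhaustive solver only inspects 125 candidates, but that number grows exponentially with the number of hospitals. The sampled solver never beats the exhaustive one; it trades the guarantee of optimality for a running time that we choose.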

Data science also requires a specific understanding of domain knowledge. Otherwise, the outcome of the computations may be incorrect or may highlight something unexpected. The latter should be happening all the time. Nonetheless, only experts in a field should interpret these results and state their level of certainty. Otherwise, incorrect assumptions may lead to inappropriate decisions.

Finally, data access and the privacy of our data can limit data science. This aspect was discussed briefly, along with its effect on society. Some countries have traded privacy for freedom; the latter has prevented national lockdowns. These observations are valid, but they should not reduce privacy preservation and the protection of our data to a two-dimensional trade-off. Technologies exist to bring the computations to the data. DataSHIELD brings a lightweight solution to existing systems; its privacy-preserving computations prevent the inferential reconstruction of datasets. MedCo and other cloud-based technologies rely on outsourcing obfuscated data with strong privacy protection. Data leaves the health providers using advanced encryption methodologies. However, individuals could still be identified with sufficient knowledge.
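The idea of bringing the computations to the data can be illustrated with a minimal sketch. This is not the DataSHIELD or MedCo API: the function names, the threshold, and the lengths of stay are all hypothetical, and real systems add many further safeguards.

```python
MIN_GROUP_SIZE = 5  # hypothetical disclosure threshold, not a DataSHIELD setting


def local_summary(values):
    """Runs at the data holder: return aggregates only, never the rows."""
    if len(values) < MIN_GROUP_SIZE:
        return None  # refuse: a tiny group could expose individuals
    return {"count": len(values), "total": sum(values)}


def pooled_mean(summaries):
    """Runs at the analyst: combine the aggregates that sites agreed to release."""
    released = [s for s in summaries if s is not None]
    count = sum(s["count"] for s in released)
    if count == 0:
        return None
    return sum(s["total"] for s in released) / count


site_a = [4, 6, 5, 7, 3, 5]  # invented lengths of stay, in days
site_b = [8, 6, 7, 9, 10]
site_c = [2, 30]             # too few records to release

mean_stay = pooled_mean([local_summary(s) for s in (site_a, site_b, site_c)])
```

The analyst obtains a pooled mean without ever seeing an individual record, and the smallest site contributes nothing because its summary could reveal too much. Even so, repeated or carefully chosen queries can still leak information, which is the kind of residual risk mentioned above.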

To overcome the problems of data access and privacy, some experts were adamant that synthetic data should be generated to advance research. It was a shame the conference could not discuss the issues with such data and their trustworthiness. Synthetic data may help in training some machine learning algorithms and identifying some patterns. Nonetheless, there is not yet a general approach to generating synthetic data, and no clear discussion of the errors in accuracy compared with the original data. Finally, bias towards minorities can continue to be present in synthetic datasets too. This tutorial gives more information about these issues.
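A deliberately naive sketch shows how both concerns, accuracy against the original data and bias towards minorities, can be measured. The records are invented, and the single-Gaussian generator is an assumption made for brevity; it is not a technique proposed at the conference.

```python
import random
import statistics

rng = random.Random(42)

# Invented "original" records: (group, measurement). Group B is a minority.
original = [("A", rng.gauss(50, 10)) for _ in range(950)]
original += [("B", rng.gauss(65, 10)) for _ in range(50)]


def fit(records):
    """Fit one Gaussian to all measurements and record the group proportions."""
    values = [v for _, v in records]
    share_b = sum(1 for g, _ in records if g == "B") / len(records)
    return statistics.mean(values), statistics.stdev(values), share_b


def generate(mean, stdev, share_b, n, rng):
    """Sample synthetic records from the fitted summary."""
    return [("B" if rng.random() < share_b else "A", rng.gauss(mean, stdev))
            for _ in range(n)]


def group_mean(records, group):
    """Mean measurement within one group."""
    return statistics.mean(v for g, v in records if g == group)


mean, stdev, share_b = fit(original)
synthetic = generate(mean, stdev, share_b, 1000, rng)

# Error introduced by the generator, measured against the original data.
overall_error = abs(statistics.mean(v for _, v in synthetic) - mean)
group_b_error = abs(group_mean(synthetic, "B") - group_mean(original, "B"))
```

The synthetic data reproduce the overall mean closely, so `overall_error` is small, yet `group_b_error` is large because the generator has erased what made the minority group different. Reporting both numbers alongside a synthetic dataset is one simple way to make its trustworthiness explicit.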

To conclude

The AI and Data Science in the Age of Covid-19 conference was informative and brought back our reliance on technologies and mathematics to understand and solve the crisis. It has highlighted issues related to data capture, privacy, and the limitations of blindly using algorithms and computing libraries. More importantly, it has brought out the critical point that human interaction and collaboration are both essential to advance science, advise governments, and communicate effectively with the public. The future now lies in skepticism and in questioning findings produced by algorithms we may not fully understand.
