Below is a copy of my submission to the House of Lords Select Committee on Artificial Intelligence. Intentionally absent are more sensitive aspects of AI related to areas such as compromising law enforcement, and at-risk individuals.
Submission to the House of Lords Select Committee on Artificial Intelligence
Dr Jerry Fishenden FIET CITP FBCS FRSA, Visiting Professor University of Surrey. This submission is made in a personal capacity. 25th August 2017.
The pace of technological change
1. What is the current state of artificial intelligence and what factors have contributed to this? How is it likely to develop over the next 5, 10 and 20 years? What factors, technical or societal, will accelerate or hinder this development?
1.1 In domains related primarily to perception and cognition – such as facial and speech recognition, or detection of unusual patterns of trading activity indicative of fraud – “AI” techniques have proven the ability to learn effectively and to bring significant benefits (as Amazon’s Alexa is demonstrating, along with other areas such as autonomous vehicles). Many of these advances have been assisted by improved processing power alongside the focus within well-defined domains – evidenced in declining error rates within those domains – rather than earlier efforts which attempted to tackle “learning” across a much broader stage.
1.2 We are still nowhere near the “Turing test” type of “AI” that the public expect – i.e. a system that can think, create and generalise as well as a human being. Generalised self-learning systems remain confined to the realms of science fiction: we still await unsupervised learning systems able to generalise successfully and convincingly across broad domains.
1.3 Unfortunately “AI” has become an often largely meaningless label applied to software asserted to be adaptive and “self-learning” to the task at hand rather than being pre-coded. All software is written by humans and contains all the biases, mistakes, errors, conceits and failings of its creators, either by accident or intent. It will also be impacted by the nature of the training data utilised, which may be incomplete, partial or biased in some way. The boundary between “AI” and other software is fluid and ill-defined. The Committee should therefore consider not only those programming techniques labelled “AI” (however that might be arbitrarily defined), but software in general where the ethics of the impact of that software need to be clearly understood.
2. Is the current level of excitement which surrounds artificial intelligence warranted?
2.1 Partially, in some specific domains (as per para 1.1), although “AI” has experienced regular periods of hype ever since 1955, when John McCarthy coined the phrase. The field also has a tendency to self-promotion and overstatement – such as the grandly named “neural networks”, which come nowhere close to the true neural composition and functioning of the human brain. Some successful data-related statistical work labelled “AI” may well draw upon techniques such as Bayesian inference and heuristic analysis. So-called “AI” techniques are often applied to pointless activities – such as trying to reverse engineer consumers’ behaviour to guess what they may be interested in, in order to serve up adverts that irritate them. A simpler and less computationally expensive method would be to ask consumers what they are interested in. Other applications demonstrate more toxic outcomes, such as the so-called “surge pricing” of taxis during terrorist events in London and Sydney.
Impact on society
3. How can the general public best be prepared for more widespread use of artificial intelligence? In this question, you may wish to address issues such as the impact on everyday life, jobs, education and retraining needs, which skills will be most in demand, and the potential need for more significant social policy changes. You may also wish to address issues such as the impact on democracy, cyber security, privacy, and data ownership.
3.1 We lack a good level of understanding of computers and software in society in general, of which the “AI” issue is only a subset. “AI” systems are already in widespread use at companies such as Apple, Microsoft, Google and Amazon, in video games, and at airports, where facial recognition supports security and helps in understanding and improving flow management. Most such systems complement and assist humans rather than replace them. They help improve productivity and also let humans focus on their strengths – creativity, interactivity, etc. – leaving lower value, high scale repetitive volume work to computers which can handle large data sets and analytics more efficiently.
3.2 We need a concerted effort to improve the public understanding of technology in general, through a well-resourced “public awareness and understanding of technology” campaign with a good reach into all sections of society. We also urgently need to have more expertise embedded within Whitehall and Westminster, and on company boards. There is a pressing need for far better technical advice inside government – one option is to consider creating the role of “Chief Technical Advisors” to help advise Parliament, politicians, Ministers, Permanent Secretaries and government departments. The current Chief Scientific Advisor community has few technologists or technological expertise amongst its membership despite technology’s critical role in the modernisation of government and our public services – leaving departments unable to understand or leverage technology advances, or to recognise their economic and societal impacts early enough to develop relevant policy or regulatory responses.
3.3 Please see paragraph 7.1 for wider considerations concerning security, privacy and data ownership.
4. Who in society is gaining the most from the development and use of artificial intelligence and data? Who is gaining the least? How can potential disparities be mitigated?
4.1 Most significant financial benefits at present appear to be accruing to the companies deploying claimed “AI” software. However, facial recognition has brought benefits in fields such as security, including airport security, albeit with associated controversy around its use, including bias, discrimination and false positives. Other applications include anti-fraud systems in financial services and an increasing role in healthcare, assisting with analysing and identifying issues of clinical interest in complex medical data.
4.2 Many of these uses appear to be running well ahead of any associated ethical, regulatory and legislative regime, something likely to undermine public trust in such systems or, worse, create profound negative impacts within specific communities, with consequential negative reactions towards government and certain business sectors. Organisations most able to benefit from “AI” are those that own the bulk of the data in a particular sector, industry, or space. “AI” systems are primarily limited by the training data available to the individual system. This may well be only a subset of the overall data – as for example with health data, where the NHS has access to only a subset (that of sick patients) and not the wider set of health-related data held outside NHS systems (such as in wearable devices, gym equipment, smartphones, etc.)
4.3 A notable negative aspect is that “AI” can further entrench and narrow people into their own echo chamber. Online recommendations for example that “People like you also bought products like this” risk narrowing people’s experience. Alternative techniques that ensure people are exposed to wider choices and options – “People like you never read articles like this” for example – and early work on things such as the Syzygy Surfer need to be explored. In addition, we need to see a far better understanding of these issues to ensure appropriate public trust in these systems, assured where appropriate through regulatory, contractual and legal means.
5. Should efforts be made to improve the public’s understanding of, and engagement with, artificial intelligence? If so, how?
5.1 Yes. There needs to be a comprehensive, transparent programme for better informing and engaging the public in the discussion of “AI” and software in general. This needs to begin soon, rather than letting misapprehension, misunderstandings and falsehoods take root that will be difficult to displace – as they have been with other complex topics such as GM foods and fracking. There needs to be a better understanding that the current state of “AI” remains relatively primitive and most successful in narrow, specific domains. It is highly desirable for all “AI” research, learnings and developments to be openly published for wider review and understanding, moving away from an environment where academic publication happens behind paywalled obscure journals, and where commercial companies make absurd marketing claims unsupported by meaningful peer-reviewed evidence or proof in the public domain. As per para 3.2, government also needs far better technical capabilities, such as could be provided by experienced technical advisors embedded within policy making.
6. What are the key sectors that stand to benefit from the development and use of artificial intelligence? Which sectors do not? In this question, you may also wish to address why some sectors stand to benefit over others, and what barriers there are for any sector looking to use artificial intelligence.
6.1 Historically industries and jobs that have been supplanted by automation have been blue collar jobs (such as those replaced by robotics). Machine learning and “AI” applications will however begin to supplant white collar, professional workers. Specific domains of professional work that are primarily the application of research and facts, and identification and analysis of patterns, are some of the areas most likely to be impacted quickly. Less obvious, but equally likely and perhaps more critical, is the automation of component tasks – for example, material selection decisions in architecture, that might previously have been made by a civil engineer. While this will disrupt many existing jobs, as with other technological changes it may not necessarily decrease the number of workers employed, but will enable them to focus on increasingly value-added roles rather than the lower level work that is becoming better suited to computers.
7. How can the data-based monopolies of some large corporations, and the ‘winner-takes-all’ economies associated with them, be addressed? How can data be managed and safeguarded to ensure it contributes to the public good and a well-functioning economy?
7.1 Government should be leading by example in the secure, consensus-based use of data and the establishment of general principles to be applied to the ethical use of software (including “AI”). Issues where governments can help establish principles and standards include:
7.1.1 user consent: engaging and educating users to ensure their consensual participation and understanding including of the data they are revealing, what is done with that data and how they may (or may not) be able to provide or revoke consent
7.1.2 legal context: consideration of the legal context and how far e.g. the Digital Economy Act (2017), Data Protection Act and GDPR apply to machine learning, internet of things etc. or how they may need to be updated to keep pace with changing technology
7.1.3 economic: the impact that “AI” and other software are likely to have at both micro- and macro-economic levels in the UK, including on the potential future configuration of UK public services as the IoT and embedded health sensors etc. become more ubiquitous
7.1.4 access and control: establishing a trust framework across these many systems and humans’ relationships with them, one that spans anonymisation, pseudonymisation and strong identity proofing
7.1.5 data quality: data needs to be of sufficient accuracy and veracity to ensure that resulting decisions are coherent. This is a complex area – consider for example just one field, health, where the quality of many patient health records is unknown. Before building analytics and machine learning on top of such unknown data quality, users should be provided with access to their data to ensure their records are accurate. Many environments need to have precursor mechanisms in place to assess and improve data quality – including assessing data sets for inherent bias. Software-enabled or supported decisions are likely to amplify the bias of poor or inaccurate data and lead to inappropriate or potentially damaging outcomes. Consider commercial organisations such as Facebook building large international biometrics databases and related tracking systems based on users tagging faces in photos: this assumes that people are accurately tagging and not intentionally or accidentally misidentifying data. There is an inadequate focus on data quality, and the pyramids of assumption, analysis and decisions being built on what may actually be worthless or badly distorted data
7.1.6 data de-identification and anonymity: known problems already exist with anonymising personal data successfully and this has become an increasingly significant and complex issue. De-identification is not the same as anonymisation. More research is needed in this area to look at how far e.g. attribute confirmation / exchange or techniques such as differential privacy might be more viable (or more appropriate in specific contexts) than providing access to raw data, whilst still enabling beneficial applications of machine learning
7.1.7 data access: ensuring appropriate control mechanisms for data (public and private / personal) accessed by such systems including appropriate protections (security / privacy / audit / accountability / protective monitoring etc.) are in place
7.1.8 data veracity / integrity: how do we know that data being used by such systems can be trusted? How do we know all data have been released from the systems when attempting to regulate or ensure they are compliant with e.g. laws of non-discrimination?
7.1.9 metadata: improving the understanding of the role this will play and how much use it is likely to be in reality (as opposed to academic theory) e.g. see Cory Doctorow’s 2001 thoughts on metadata’s true value
7.1.10 code jurisdiction: whilst some code may run within the UK (in particular systems, devices, or sensors) much will be operating in the cloud, or in private data centres or interacting with other systems scattered across the planet. There is a need to clarify how UK and non-UK systems will operate particularly in terms of whether they meet standards required (e.g. not exhibiting biased, illegal or discriminatory behaviour, or being compromised by hostile actors)
7.1.11 resilience: as many goods and services become ever more reliant upon this new generation of interconnected systems, the potential resilience to failure (accidental or malicious) will become an issue. Research is required into the potential interactions and vulnerabilities and risks of emergent systems of systems. It is also likely that all such systems will (a) need to be readily isolated from their environments should they behave in an undesirable way or be compromised by hackers, malware, etc. (b) be remotely patchable, requiring secure mechanisms to do this since, as with SCADA (Supervisory Control And Data Acquisition) systems, remote management facilities themselves present a potential vector for security compromise
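To make the differential privacy option mentioned in paragraph 7.1.6 more concrete, the following is a minimal sketch of one well-known technique, the Laplace mechanism, applied to a simple counting query. It is an illustration only and not a description of any system referred to in this submission: the function names, the record layout and the choice of epsilon are all assumptions made for the example. The idea is that an analyst receives a noisy aggregate answer rather than access to the raw records.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables, each with
    mean `scale`, follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Hypothetical helper for illustration. A counting query changes by at
    most 1 when one person's record is added or removed (sensitivity 1),
    so Laplace noise with scale 1/epsilon is added to the true count.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

A caller might, for example, ask how many records in a (hypothetical) list of patient dictionaries have an "age" of 50 or over by passing a predicate such as lambda r: r["age"] >= 50. Each answer is perturbed, so repeated queries consume privacy budget; a real deployment would need to track that budget, which this sketch does not attempt.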
8. What are the ethical implications of the development and use of artificial intelligence? How can any negative implications be resolved? In this question, you may wish to address issues such as privacy, consent, safety, diversity and the impact on democracy.
8.1 If the right approach is not taken, the downside of this emergent generation of systems is that they will be discriminatory, wrong, biased, unaccountable, manipulative, and create significant security, privacy, legal and trust issues. However, if well applied the upside is that they will help support better policy-making, health care, education and transport etc., through responsive and more efficient systems. These are ethical issues that apply to all software and should not be limited to so-called “AI” software alone. However, government appears ill-equipped to develop appropriate ethical frameworks – see for example issues with its data science ethical framework. Note also the further detail provided in paragraph 7.1 above.
9. In what situations is a relative lack of transparency in artificial intelligence systems (so-called ‘black boxing’) acceptable? When should it not be permissible?
9.1 Consistent standards of security, privacy and software engineering, together with transparency about the decisions such systems are making, are required. Systems and the decisions they make or enable must be able to demonstrate when challenged that they behave in unbiased, non-discriminatory and non-invasive ways and are making applicable, acceptable and legal determinations. The data that they rely upon and on which they have constructed their models needs to be trusted, accurate and verifiable. Any exceptions to this need to be identified quickly and early so that appropriate remedial and corrective action can be taken.
9.2 The most viable option is probably to assume a “black box” approach and therefore adopt a model that requires certain data to be made openly available by systems to enable analyses of observable external behaviours, including longitudinal analysis over time. This could involve making sufficient data available via open interfaces (APIs) so that the external characteristics of systems and services can be inspected / analysed and held to account. Consideration needs to be given as to how open such interfaces and data would need to be: genuinely open (to all) or open to specialists? This will likely vary by subject domain. There is also the issue of how to ensure data is being fully released, and how to assure the integrity of that released data (i.e. that it has not been modified in some way to game the system and make it appear to be unbiased when in practice it is not). After all, data released from a system might not be the same as the data held within the system. Appropriate issues of liability and insurance should be considerations here to help encourage the right behaviours.
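As a rough illustration of the kind of external analysis described in paragraph 9.2, the sketch below compares the rate of favourable decisions across groups using only the decision records a system has released, without any access to its internals. The record format (a group label paired with a true/false outcome), the function names and the use of a simple lowest-to-highest rate ratio are assumptions made for this example; it is one possible check among many, and a low ratio is a prompt for further investigation rather than proof of discrimination.

```python
from collections import defaultdict


def outcome_rates(decisions):
    """Compute the share of favourable outcomes for each group.

    `decisions` is an iterable of (group, favourable) pairs, where
    `favourable` is True or False. This layout is hypothetical.
    """
    totals = defaultdict(int)
    favourable_counts = defaultdict(int)
    for group, favourable in decisions:
        totals[group] += 1
        if favourable:
            favourable_counts[group] += 1
    return {group: favourable_counts[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Return the lowest group rate divided by the highest.

    A value of 1.0 means every group received favourable outcomes at the
    same rate; values well below 1.0 suggest a disparity worth examining.
    """
    if not rates:
        raise ValueError("no decision records supplied")
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received any favourable outcome
    return min(rates.values()) / highest
```

Running the same calculation on successive releases of data would give the longitudinal view mentioned above. It does nothing, however, to address the separate question of whether the released records faithfully reflect the data held within the system.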
9.3 There is also the issue of where boundaries are drawn – technical, legal, accountability etc. – in what will often be a complex ecosystem of interacting components using both “AI” and non-“AI” software and hence likely to exhibit sometimes unpredictable emergent behaviour. As the European Commission’s working document on the internet of things points out, any such interdependency “gives rise to a number of questions, such as:
- Who is responsible for guaranteeing the safety of a product?
- Who is responsible for ensuring safety on an ongoing basis?
- How should liabilities be allocated in the event that the technology behaves in an unsafe way, causing damage?” [p.22]
The role of the Government
10. What role should the Government take in the development and use of artificial intelligence in the United Kingdom? Should artificial intelligence be regulated? If so, how?
10.1 It would be a mistake to try to isolate “artificial intelligence” or “machine learning” or any other name given to self-learning software from any other software-based processes. Agreeing on what is “AI” as opposed to other software based techniques will prove a frustrating activity and distract from the core issue – which is how to ensure that software performs as expected. Current regulators appear incapable of performing this function and ill-equipped to regulate and hold to account the software and its creators that increasingly run almost every business (consider for example the Volkswagen emissions issue and problems with the operation of Uber’s software).
10.2 Society needs to be assured that software is not discriminatory, or rigging the system or otherwise failing to operate in a trustworthy way – whether someone decides to label such software “AI” is immaterial (particularly given so much of it is “black boxed”). It matters little to a citizen whether an inappropriate, wrong or discriminatory decision is made by one type of software system or another: what is needed is trust in decision-making / assisting software, particularly those operating in domains such as health. Along with the public awareness of technology and its social, political and economic implications outlined in paragraph 5, the role of education also needs to be considered – not just the narrow focus on coding and computer science, but on the creative and social sciences to ensure not just jobs for future generations, but also to equip them better emotionally to understand, navigate and deal with the future.
10.3 The recent Digital Economy Act (2017) is notable for entirely missing the need to include devices, sensors etc. within its definitions (it assumes existing administrative information systems, citizens and officials), missing most of what digital government and digital society are rapidly becoming. What is needed are highly precise bills / regulations / codes of practice that ensure compliance: technically agnostic law is often inadequate, which is why we have RIPA, the IP Bill etc. which incorporate tech within them. A similar approach is needed for trust in software. To do this we need genuinely expert groups, working in the open (see e.g. https://www.gov.uk/design-principles#tenth), both to get the best possible outcome and to build public trust in what is being developed / proposed.
10.4 The underlying issue is the behaviour of digital devices / systems and digital machine ecosystems, not just their learning characteristics (which are a subset of the problem space). So the policy issue to be addressed is a broader “trust in machine behaviour”. Such machines will include devices and sensors around us in the growing internet of things (IoT), including software running in hardware and firmware.
10.5 Requiring minimal standards for software engineering / quality could be one potential approach (e.g. ISO/IEC 9126, application of CISQ, and inputs from the National Cyber Security Centre, NCSC). Consideration is required as to whether there are some minimal trustworthy computing requirements that could be developed / used / stipulated, particularly for use in more sensitive domains (health especially).
Learning from others
11. What lessons can be learnt from other countries or international organisations (e.g. the European Union, the World Economic Forum) in their policy approach to artificial intelligence?
11.1 Some good work has been done, but has often led to the arbitrary distinction of “AI” from other software techniques – when in fact the same principles of trust and transparency are required regardless of the nature of the software utilised. This is particularly true given the very grey lines around what is “true AI” versus “AI washing” etc. If only “AI” software is regulated, some industries, companies, suppliers etc. may decide to stop labelling their systems “AI” to avoid such regulation – another disadvantage of such an arbitrary distinction.
11.2 Some relevant work to be considered includes:
- US Federal Government Automated Vehicles Policy September 2016
- Royal Society machine learning dinner (28 July 2016) and their ongoing work on machine learning
- European Commission Staff Working Document: Advancing the Internet of Things in Europe