Machine (deep) learning is already widely deployed in back-end enterprise systems to analyse complex data, e.g. web performance, retinal scans and stock markets. But with speech, voice, text and facial recognition (like Siri), and gadgets that automatically collect and analyse user data (smartphones, smart fridges), the AI genie is out of the bottle and in our hands.

 

Artificial intelligence and ‘machine learning’ will be used in areas such as radiology, dermatology and pathology to ‘improve clinical care’ – and to get through more time-consuming work than health professionals can.

NHS England chief executive Simon Stevens

 

A previous post looked at how chess-playing computers are at the forefront of development. Though designed to play strategy games, the champion AlphaZero is more versatile and considered a “broad” AI, capable of both supervised and unsupervised reinforcement learning. Beyond games, its maker DeepMind has applied similar techniques to modelling complex proteins with AlphaFold. Machine learning and big data are quickly becoming widespread, and though today’s AIs have around the same number of neural connections as an earthworm, they’re still capable of doing things that couldn’t be done before –
e.g. planning the daily routes for Amazon’s 158k delivery drivers, predicting attendance, identifying target demographics, weather forecasting, market and retail analysis, facial recognition, natural language processing, virtual assistants, advertising, genomics…

When processes are automated, existing constraints need no longer apply, but new challenges and risks come into play that need to be researched and specified, because how a deep learning solution works can be opaque. For example, in 2016 a fatal accident involving a Tesla car in auto mode was attributed to the car misidentifying a truck pulling out as a road sign, because it had been trained on data that only showed trucks from behind. The codebase needs to accord not only with the view of the organisation and developer, but to work in the messy and often unpredictable real world.

Some challenges –

  • The problem’s too hard or too broad – even a simple sentence with double negatives can cause confusion
  • Training data may be imbalanced, leading to biased outputs – a smart system that assessed the CVs of job applicants inherited the human biases in the training data
  • Bad, insufficient or messy data leads to flawed, incomplete or useless solutions – outputs can deviate from those intended
  • The problem is not what was thought
  • Shortcuts (hacks) taken by AI can be hard to debug – e.g. https://youtu.be/TRzBk_KuIaM?t=164
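The imbalanced-data problem above is easy to check for before any training happens. A minimal sketch in Python (the labels below are invented for illustration of a CV-screening model):

```python
from collections import Counter

# Hypothetical training labels for a CV-screening model:
# 1 = "invited to interview", 0 = "rejected"
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

counts = Counter(labels)
total = len(labels)
for label, n in sorted(counts.items()):
    print(f"class {label}: {n}/{total} ({n / total:.0%})")

# A heavily skewed distribution (here 90% vs 10%) warns that a naive
# model can score well by simply predicting the majority class.
majority_share = max(counts.values()) / total
print("imbalanced!" if majority_share > 0.8 else "roughly balanced")
```

Checking the distribution doesn’t remove inherited bias, but it flags when accuracy alone will be a misleading measure of the model.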

 

How AI can get it wrong

 

Connecting smart technology with end users and the real world?

Full automation is often inappropriate for systems – consider your experience of automated telephone lines. But robotic process automation (RPA) can facilitate human work, e.g. ordering and finding patterns in data for a human decision maker. In any case, the development and application of AI in the real world is facilitated by ongoing feedback; a big design up front approach risks complexity and neglecting usefulness and usability.

Below are some research methodologies currently used for informing the development, training and testing of automated and intelligent decision support systems (IDSS).

Formative and summative user research
(aka generative and evaluative)

    • Formative explores and defines requirements and context
    • Summative evaluates the efficacy and efficiency of a design or build
    • Formative research typically gives way to summative as development progresses and there’s something to test

 

Formative research

Methods and outputs

  • How do people view automation and smart technologies?

e.g. automated systems might be less accessible, trusted or tolerated.
Method –
Qualitative research: ethnographic and attitudinal studies, user and stakeholder interviews
Contextual enquiry: observing interactions
Quantitative research: surveys

Output  –
Insight reports, anecdotal evidence

 

  • What might the benefits and consequences be of introducing smart technologies for both the provider and users?

Method –
Formative interviews with stakeholders and users, technology capability, impact assessment, cost benefit analysis

Output  –
Experience map across service touch-points

 

  • What might be the challenges of introducing smart technologies for both the provider and users?

Method –
Review of existing technologies, systems and data
Formative interviews with stakeholders and users

Output  –
Swim lane service map showing touch points and handovers

 

  • Will the system be accessible to people with disabilities or communication requirements?

Method –
Specialist accessibility audit and gap analysis
Ensuring research demographic spans the Gov.uk digital inclusion scale

Output  –
Accessibility report, ideas for alternative user journeys

 

  • Is a controlled natural language (CNL) required and if so, what is its vocabulary?

Method –
Analysis of service call logs: Bag-of-words (BoW)
Interviews and observing interactions

Output  –
Lexicon, or a more formal controlled vocabulary e.g. Regular expressions (Regex)
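The bag-of-words step above reduces to tokenising and counting; the call-log lines here are invented to show the shape of it, along with a regular expression tightening one lexicon entry:

```python
import re
from collections import Counter

# Hypothetical extracts from service call logs
logs = [
    "I want to renew my passport",
    "How do I renew a driving licence",
    "Passport renewal fee query",
]

# Tokenise to lowercase words and count occurrences (bag-of-words)
tokens = []
for line in logs:
    tokens += re.findall(r"[a-z]+", line.lower())
word_counts = Counter(tokens)

# The most frequent content words suggest a starting lexicon for a CNL
print(word_counts.most_common(5))

# A controlled vocabulary can then be tightened into regular expressions,
# e.g. one pattern matching "renew", "renewal" and "renewing"
renew_pattern = re.compile(r"\brenew(al|ing)?\b")
print([bool(renew_pattern.search(line.lower())) for line in logs])
```

In practice stop words (“I”, “to”, “my”) would be filtered out before the counts inform the lexicon.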

 

  • What are the conditions and choices for the user, and the available paths in the system?

e.g. most simply the menu and submenu options for a website’s navigation – decision trees
Method –
Card sorting

Output –
Scripts, questions and phrases, tree diagram of navigational hierarchy, journey maps
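The tree diagram of a navigational hierarchy can be captured directly as a nested structure; the menu below is invented for illustration:

```python
# Hypothetical site navigation expressed as a nested dict (a decision tree):
# each key is a menu option, each value the submenu beneath it.
menu = {
    "Benefits": {
        "Universal Credit": {},
        "Child Benefit": {},
    },
    "Driving": {
        "Renew a licence": {},
        "Vehicle tax": {},
    },
}

def paths(tree, prefix=()):
    """Yield every path from the root of the menu tree to a leaf."""
    for label, subtree in tree.items():
        here = prefix + (label,)
        if subtree:
            yield from paths(subtree, here)
        else:
            yield here

# Enumerating the paths gives every route a user can take through the menus
for p in paths(menu):
    print(" > ".join(p))
```

Enumerating the paths this way is also a quick completeness check: every user goal surfaced in card sorting should correspond to at least one path.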

 

  • What will be helpful feedback for both the system and the user (positive and negative)?

e.g. “From how you have described your enquiry the HMRC tax helpline might be more relevant.”

Method –
Interviewing and observing frontline operations

Output –
Scripts for system response, model questions and responses
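A script for system responses of the kind quoted above can start as simple keyword routing; the keywords and wording here are invented examples, not a real service’s rules:

```python
# Minimal keyword-routing sketch: map phrases found in an enquiry
# to a scripted response. Keywords and responses are hypothetical.
ROUTES = {
    ("tax", "hmrc", "self assessment"):
        "From how you have described your enquiry the HMRC tax helpline "
        "might be more relevant.",
    ("passport",):
        "It sounds like the passport service might help with this.",
}
FALLBACK = "Could you tell me a little more about your enquiry?"

def respond(enquiry: str) -> str:
    """Return the first scripted response whose keywords match."""
    text = enquiry.lower()
    for keywords, response in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return response
    return FALLBACK

print(respond("I have a question about my tax return"))
```

The fallback line matters as much as the routes: negative feedback (“tell me more”) keeps the user moving when no rule fires.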

 

  • More generally, what are the use cases, devices, connectivity, user stories, flows and navigation?

Method –
Numerous

Output –
Use cases and user stories suitable for a product backlog, process (user) flow diagram

Tree diagrams can map systems and show user options and paths

 

Summative research – testing

Methods and outputs

  • Evaluating the “smoothness” of stepping through the process across service touch points and in realistic timeframes

Method –
Usability testing, diary studies, service logs and anecdotal evidence

Output –
Usability report, completion times and success rates
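Completion times and success rates reduce to simple summary statistics over the test sessions; the session data below is invented to show the calculation:

```python
from statistics import mean, median

# Hypothetical usability test sessions: (task completed?, seconds taken)
sessions = [(True, 95), (True, 120), (False, 300), (True, 80), (False, 240)]

# Success rate: share of sessions where the task was completed
success_rate = sum(1 for ok, _ in sessions if ok) / len(sessions)

# Time on task is usually reported for completed sessions only
completed_times = [t for ok, t in sessions if ok]

print(f"success rate: {success_rate:.0%}")
print(f"mean time to complete: {mean(completed_times):.0f}s")
print(f"median time to complete: {median(completed_times):.0f}s")
```

The median is worth reporting alongside the mean, since one slow session can drag the mean well away from the typical experience.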

 

  • Evaluating the transitions (handovers) between different contexts/modalities

e.g. a shopping list compiled on a phone, an order placed on a PC, the “shop” calculated and receipt issued at point of sale, stock systems updated, and the delivery signed for on a PDA.

Method –
Usability testing, Amazon’s Mechanical Turk (MTurk, also used to evaluate use cases), analysis of drop off rates at boundaries

Output –
Drop off and bounce rates
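Drop-off at each handover can be read straight from the user counts reaching each stage of the journey; the funnel below is invented for illustration:

```python
# Hypothetical user counts reaching each stage of a cross-device journey
funnel = [
    ("list compiled on phone", 1000),
    ("order placed on PC", 640),
    ("order paid at point of sale", 580),
    ("delivery signed for", 560),
]

# Drop-off rate at each handover = share of users lost between stages
for (stage_a, n_a), (stage_b, n_b) in zip(funnel, funnel[1:]):
    drop = 1 - n_b / n_a
    print(f"{stage_a} -> {stage_b}: {drop:.0%} drop-off")
```

A disproportionately large drop at one boundary (here phone → PC) points at the handover itself, rather than the steps either side of it, as the thing to test next.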

 

  • Evaluating the efficacy and efficiency of automated processes for users

    Method –
    Usability testing, preprocessing training data, rigorous testing for biases in outputs, interviews, questionnaires, observing frontline operations
    analysis of server logs, bounce rates, completion rates, anecdotal reports

    Output –
    Service metrics

 

 

“Life, the Universe, and Everything. There is an answer. But I’ll have to think about it.”

“The Answer to the Great Question… Of Life, the Universe and Everything… Is… Forty-two,” said Deep Thought, with infinite majesty and calm.

The Hitchhiker’s Guide to the Galaxy
