My PhD, 10 years on

This month (November 5th to be exact) marked the 10th anniversary of my PhD viva.


I did my PhD at LSE with Dr James Backhouse, and investigated the profiling of undesirable customers (or customer screening). I looked at how organisations define who is a desirable customer, and who isn’t one; and the process that they follow to sort through the prospective and existing customers in order to weed out the ‘undesirables’.


Now, you may be thinking: Why would a business want to avoid certain customers? Surely, the more the merrier?!


Well, there are good reasons. First, resources are limited. For instance, there are only so many customers a restaurant can serve at any time. Second, some customers, while profitable in themselves, may spoil the value of your offer for others. For instance, arguably, sales of luxury bags go down when reality show stars start using them.


As these examples suggest, there are various forces at play when deciding which customers to demote or avoid. And, at times, these forces contradict each other – e.g. short-term vs long-term gains. Moreover, these decisions can be very subjective, and they often reflect long-held, but unsubstantiated or outdated, views and preconceptions about who and what is good vs. bad. Hence, studying how a firm classifies its customers can be very revealing about its priorities and processes, and shine a light on biases.


The specific industry that I looked at, for my PhD, was banking; and the specific type of undesirable customer was ‘money launderers’. If you think that this is a very niche application, you are wrong. Every single bank in the UK (and in most Western economies, for that matter) regularly sifts through its clients’ transactions to try and detect possible money laundering. Banks have to, or they risk heavy fines and possible imprisonment, as well as having their reputation trashed in the media. So, it affects you and me, on a daily basis.

In addition, it is actually quite difficult to define what money laundering is, in practice. Sure, the formal definition is that a money launderer is someone who is using money generated from criminal activity, with the intent of disguising its illegal source. Since 2001, the definition of money laundering has been extended to include the processing of funds to finance criminal activity – e.g., terrorism – regardless of whether those funds had a legal origin or an illegal one. But crimes are very diverse in nature, scale and methods used (e.g., ranging from the handyman who wants to be paid in cash to evade tax, to multinational networks trafficking people, arms and drugs). Plus, the methods used are not only secretive but ever changing to avoid detection.

For this combination of reasons, money laundering (and terrorism financing) is an area of profiling that is ripe for subjectivity, assumptions and speculation; and which impacts everyone who uses a bank.

Big Bang Data

To study this topic, I looked at how one particular bank tried to detect money laundering among its customers. I looked at the customer-facing staff, back-office staff, and the algorithms used. In addition to interviews and observations, I also analysed training manuals, policy documents, formulae, and other materials.


I found that the policy document offered one definition of money laundering, which was very broad and inclusive. However, when people across the organisation tried to apply that definition to their day-to-day activities, from selling a mortgage to tuning the algorithms’ formulae, they focused on very specific instances of the behaviour – for instance, a particular type of business, or a particular type of transaction. The algorithms took this focus even further, by looking not just at specific types of transaction, but even at very concrete amounts or timings between transactions.


My data collection also included discussing a particular (fictitious) pattern of transactions with various types of employees, including those developing the algorithms. All employees had had the same money laundering training, and their work was governed by the same policies. So, I wanted to see if they would reach similar conclusions about whether that pattern of transactions indicated money laundering or not.


They didn’t.


Their answers varied from ‘extremely likely to be suspicious’ to ‘very unlikely to be suspicious’.


In these discussions, I noticed how different members of staff would pick up on different details of the transaction pattern that I had presented – e.g., some would comment on the name and presumed nationality of the client; others on whether the client was using cash or transfers; yet others would comment on the timing of the transactions.


Regarding who would pick up on what detail, there were some commonalities in terms of, for instance, the job function that they had. For example, customer-facing staff might focus on the cash element, whereas back-office staff were slightly more likely than their customer-facing colleagues to comment on the timing of the transactions. In turn, the algorithm developers were more likely than the others to comment on very specific markers such as the professional occupation or the length of the relationship with the bank (i.e., how long they had been a client of the bank).


However, for other factors, the reasons why they were picked up on, or not, were very circumstantial. For instance, employees in one branch in a deprived area of the city were more likely to say that the client was engaged in money laundering than their colleagues in a branch located in the business district. The personal circumstances of the interviewee also influenced their tendency to think that the client was engaging in criminal behaviour, or otherwise. And the programmers commented that, even though there was a routine that would flag a type of transaction mentioned in the case, this particular client would not have been picked up by the algorithm because the specific amounts mentioned were outside of the thresholds used in the formulae.
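
To illustrate the programmers' point, here is a minimal sketch in Python of a threshold-based rule. It is not the bank's actual routine: the function name, the amount band, the time window and the sample transactions are all made up for illustration. It simply shows how a rule can target the right type of transaction and still miss a client whose amounts fall outside the band.

```python
from datetime import date, timedelta

# Hypothetical thresholds, chosen only for illustration.
MIN_AMOUNT = 5_000          # lower bound of the amount band
MAX_AMOUNT = 9_999          # upper bound of the amount band
WINDOW = timedelta(days=7)  # deposits must cluster within this window
MIN_HITS = 3                # number of matching deposits needed to flag


def is_flagged(transactions):
    """Return True if at least MIN_HITS in-band cash deposits fall within one WINDOW.

    `transactions` is a list of (date, kind, amount) tuples.
    """
    # Keep only the dates of cash deposits whose amount is inside the band.
    hits = sorted(
        day for day, kind, amount in transactions
        if kind == "cash_deposit" and MIN_AMOUNT <= amount <= MAX_AMOUNT
    )
    # Slide over consecutive groups of MIN_HITS dates and check their span.
    for i in range(len(hits) - MIN_HITS + 1):
        if hits[i + MIN_HITS - 1] - hits[i] <= WINDOW:
            return True
    return False


inside_band = [
    (date(2024, 3, 1), "cash_deposit", 9_500),
    (date(2024, 3, 3), "cash_deposit", 9_000),
    (date(2024, 3, 5), "cash_deposit", 8_700),
]
# Same type of transaction and same timing, but smaller amounts.
outside_band = [(day, kind, 4_900) for day, kind, _ in inside_band]

print(is_flagged(inside_band))   # True
print(is_flagged(outside_band))  # False
```

The two patterns are identical in everything except the amounts, yet only the first one is flagged. Whoever sets the thresholds is, in effect, deciding which version of the behaviour counts as suspicious.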


Not only would some data be picked up by some staff while being ignored by others, but also, in all cases, the interviewees ‘filled in the gaps’ when certain data were not there. They made assumptions and jumped to conclusions – e.g., they tried to guess the supposed nationality of the client based on the surname used. Some even mentioned details that were not in the data provided. For instance, they would say that it was suspicious for the client to have so much money in their account given that the wife was unemployed, whereas the wife’s professional occupation – or lack thereof – had not been mentioned anywhere in the data (the case only said that the client was married).


Alas, ten years on, algorithms are everywhere. They determine whether you get an upgrade, how long you wait in the queue at the call centre, which products you are recommended, which results you get when you use a search engine, what adverts you get served, which news you see, and even which of your friends’ many social media updates you get to see first.


So, next time you get a table at that popular restaurant – or not – you know: it had less to do with you, and more to do with a combination of organisational, individual and technical factors at play in that particular scenario.


