Algorithm development is, in part, the art and science of building the equivalent of human pattern recognition into a software program. That applies to everything from understanding the meaning – and intent – behind e-commerce search queries to spotting suspicious user behavior, particularly when that behavior is an attempt to game the system.
Data scientists have been deploying algorithm-based pattern detection for decades to identify potential fraud across a range of disciplines. Typically this relies on behavioral analytics that examine network traffic in real time to pinpoint abnormalities that could indicate fraudulent activity.
With an aggregate of billions of dollars in retail transactions at stake every year, fraud detection has become a key feature for sales-enabling platforms such as eBay, Amazon, Etsy and the like – where independent sellers and buyers are supposed to play by the rules, but don’t always behave.
The heart of the scam is a coded, sometimes overt but more often subtle, communication between buyer and seller to hijack the actual sale off the platform – and in the process deprive the platform operator of its share of the transaction. A human observer would likely catch on quickly to a platform scam as simple as a seller telling the buyer, “Here’s my phone number. Call me directly and I can give you a better deal.”
Even a rudimentary fraud detection algorithm, built with Natural Language Processing, would likely catch that one. But the digital wink-wink, nod-nod exchanges that characterize 2020 scammers are far better disguised, and require algorithms with a similar ability to carefully parse exchanges, looking for communications worth investigating.
They’re designed to flag trigger words and phrases pointing to suspicious exchanges involving phone numbers, email addresses, mailing addresses and the like, all based on context. There is, after all, a difference between a seller asking for a mailing address in order to ship a product through the platform, and a buyer asking for a seller’s address so the parties can meet to facilitate a side deal.
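A minimal sketch of that context dependence might score the same address request differently depending on the surrounding language. The cue lists, function names, and threshold below are hypothetical illustrations, not the platform’s actual rules:

```python
import re

# Hypothetical cue lists (illustrative only): words suggesting legitimate
# fulfillment vs. words suggesting an off-platform side deal.
SHIPPING_CUES = {"ship", "shipping", "deliver", "delivery", "tracking"}
OFF_PLATFORM_CUES = {"meet", "in person", "cash", "directly",
                     "call me", "text me", "better deal"}

def tokenize(text: str) -> set:
    """Lowercase the message and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def flag_address_request(message: str) -> bool:
    """Flag an address request only when off-platform cues outweigh shipping cues."""
    words = tokenize(message)
    if "address" not in words:
        return False
    off_score = sum(cue in message.lower() for cue in OFF_PLATFORM_CUES)
    ship_score = len(words & SHIPPING_CUES)
    return off_score > ship_score
```

A request like “What’s your address so we can meet and pay cash directly?” trips the off-platform cues, while “Please confirm your mailing address so I can ship the item” does not. Production systems would replace hand-written cue lists with trained NLP classifiers, but the underlying idea – the same phrase scored differently by context – is the same.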
Sometimes it’s just the seller perpetrating the scam, by running duplicate pages for the same product. The aim? To manipulate buyers through duplicate listings that violate the terms of the e-commerce platform.
Teaching a machine to recognize the intent behind otherwise innocuous-seeming requests or listings requires, first, an ability to spot suspicious trigger words and phrases, and then to correctly parse intent, entity, and context in a single sentence. Over time it has become something of a “whack-a-mole” exercise as fraudulent actors do their best to fool the fraud detection algorithm – even with something as basic as a phone number.
One example: camouflaging phone numbers. “We were flagging any digits exchanged (between buyer and seller),” remembers Arnab Chowdhury, iMerit Senior Manager Projects, who supervised the project. “Once sellers (looking to game the system) realized that the platform was monitoring for this, they began to evolve.” Step one: writing out the numerals as words. When the platform caught on, they resorted to writing the numbers as a single block (sevennineonethreeonetwonineninenineseven). And when the algorithm caught on to that, they moved on to numbers written as the shifted keyboard symbol above each digit key.
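One way to undo those disguises – written-out numerals, concatenated number words, and shift-key symbols – is to normalize the text back to digits before scanning for a phone-number-length run. The sketch below is illustrative, not the platform’s actual detector:

```python
import re

# Map spelled-out digits back to numerals.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
# US-keyboard symbols that sit above each digit key (typed with Shift).
SHIFT_SYMBOLS = {"!": "1", "@": "2", "#": "3", "$": "4", "%": "5",
                 "^": "6", "&": "7", "*": "8", "(": "9", ")": "0"}

# Longest-first alternation so "seven" matches before shorter words.
WORD_RE = re.compile("|".join(sorted(DIGIT_WORDS, key=len, reverse=True)))

def normalize(text: str) -> str:
    """Rewrite digit words and shift symbols as digits; leave other text alone."""
    text = WORD_RE.sub(lambda m: DIGIT_WORDS[m.group()], text.lower())
    return "".join(SHIFT_SYMBOLS.get(ch, ch) for ch in text)

def contains_phone_number(text: str) -> bool:
    """After normalizing, look for a run of 10+ digits (ignoring separators)."""
    compact = re.sub(r"[\s\-.]", "", normalize(text))
    return re.search(r"\d{10,}", compact) is not None
```

With this normalization, the article’s single-block example (sevennineonethreeonetwonineninenineseven) and its shift-symbol equivalent both collapse to the same ten-digit string, while an innocent phrase like “ships in two to three days” does not. In practice each new disguise forces another normalization rule, which is exactly the whack-a-mole dynamic described above.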
It used to be easier when gaming the system always began with, “Psst, buddy – I got a deal for ya.”