Data cleansing market awash with choice

10th January 2017 • Features

by James Lawson, consulting editor

Marketers have never been better served when it comes to data cleansing, with a wider array of quick, easy and inexpensive solutions available than ever before.

Cyril Law | solutions director, Callcredit Marketing Solutions
Graham Clark | sales director, helpIT Systems
Mark Dobson | client services director, Software Bureau
Phil Good | managing director, Hopewiser
Steven Adams | marketing executive, Intermedia Global

Why keep your campaign lists and customer databases clean? Reducing campaign costs, enhancing analysis accuracy and boosting customer response are just three of the reasons. Handily, constant industry enhancements to software and services are making it ever quicker, easier and cheaper to do.

Email is a good example to start with. It’s always been possible to verify formatting and syntax but this falls well short of confirming the email is live. Today, full server level email validation is becoming the industry standard.

It’s available as an online bureau service from suppliers like Intermedia Global and as a direct web service from helpIT Systems. The latter has integrated it with its current range of cleansing tools and services.

This technique goes well beyond a simple ping by sending a simulated “ghost” message to make sure the mailbox really exists for that user/address combination and that the mail will be accepted. It differentiates between soft bounces (where the email is valid and the message reached the recipient’s mail server) and hard bounces (where the email address is invalid or doesn’t exist).
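As a rough illustration of the difference between a syntax check and a server-level result, here is a minimal Python sketch. It is not how Intermedia Global or helpIT Systems implement their services, which the article does not describe. The function name `classify` and the `smtp_code` parameter are hypothetical, and the sketch assumes the verification service has already obtained a reply code from the recipient's mail server. The only outside fact it relies on is the standard SMTP convention that 2xx replies mean accepted, 4xx replies are temporary failures and 5xx replies are permanent ones.

```python
import re

# Deliberately loose syntax check: one "@", no spaces, a dot in the domain.
SYNTAX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify(address, smtp_code=None):
    """Classify an address from its syntax and an optional SMTP reply code.

    smtp_code is assumed to be the reply a verification service received
    from the recipient's mail server (hypothetical input for this sketch).
    """
    if not SYNTAX.match(address):
        return "invalid_syntax"      # fails before any server is contacted
    if smtp_code is None:
        return "unverified"          # well formed, but not confirmed live
    if 200 <= smtp_code < 300:
        return "accepted"            # server says it will take the mail
    if 400 <= smtp_code < 500:
        return "soft_bounce"         # temporary failure, worth retrying
    if 500 <= smtp_code < 600:
        return "hard_bounce"         # permanent failure, remove the address
    return "unverified"


print(classify("jane@example.com"))        # syntax only
print(classify("jane@example.com", 550))   # server-level result
```

The point of the sketch is the middle case: an address can pass the syntax check and still be dead, which only a server-level response reveals.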

“Email is often overlooked and it’s surprisingly cheap to check this way,” says Steven Adams, Marketing Executive at Intermedia Global. “It’s so easy and efficient that there’s really no excuse for not doing it. It also protects your sender reputation – you could be blacklisted for spamming if you repeatedly send to invalid addresses.”

Adams gives the example of one client’s house file where fewer than half the records were live or produced a soft bounce. “That probably hadn’t been cleaned for years and there had been a lot of money wasted on broadcasting fees,” he says.

Where name, address and other identity information is available, additional enhancement services allow new emails (and new names in the case of B2B job roles) to be appended to a cleansed email file to replace out of date contacts. “With the GDPR coming in, you need to stay on top of your cleaning more than ever,” comments Adams.

Via web services, emails can also be instantly verified on capture as can most other types of customer-related data these days. Suppliers like GBG and Data8 offer a host of services on demand from address checking to bank account validation.

By operating in single look-up mode, many in-house batch cleansing tools can also offer this facility. With the right reference data, that could allow immediate checking against mortality and goneaway files.

HelpIT’s matchIT Web package is another established contender here, checking that new customers’ records don’t already exist within the customer database. With widespread adoption in the USA in particular, it’s used globally by the likes of 3M.

“It’s a locally held web service that sits on top of the data stores,” says Graham Clark, Sales Director at helpIT Systems. “We recently won two contracts with large system integrators to enable duplicate prevention alongside rapid addressing.”
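Duplicate prevention at the point of capture can be pictured with a short Python sketch. This is an assumption-laden illustration, not matchIT Web's method: commercial tools use far more sophisticated fuzzy matching, whereas this example only compares a normalised name and postcode. The names `match_key`, `CustomerIndex` and `add_if_new` are hypothetical.

```python
import re


def match_key(name, postcode):
    """Build a crude match key: letters of the name plus a spaceless postcode."""
    name_key = re.sub(r"[^a-z]", "", name.lower())
    postcode_key = re.sub(r"\s+", "", postcode.upper())
    return (name_key, postcode_key)


class CustomerIndex:
    """In-memory stand-in for a customer database (illustrative only)."""

    def __init__(self):
        self._seen = {}

    def add_if_new(self, customer_id, name, postcode):
        """Store the record if it is new; otherwise return the existing ID."""
        key = match_key(name, postcode)
        existing = self._seen.get(key)
        if existing is not None:
            return existing          # duplicate: caller should reuse this ID
        self._seen[key] = customer_id
        return None                  # no match found, record was added


index = CustomerIndex()
index.add_if_new(1, "Jane O'Neil", "LS1 4AP")
print(index.add_if_new(2, "jane oneil", "ls14ap"))  # matches customer 1
```

Running the check before the record is written is what makes this prevention rather than after-the-fact deduplication.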

On-demand cleansing delivered via DaaS and SaaS continues to be the main theme in addressing and verification services today. Callcredit’s Define API, released just under a year ago, gives direct real-time access to its vast Define consumer database to power data cleansing and appending. With everything from lifestage to geodems codes available (including the new individual-level Cameo) across 46 million records, enhancement applications are a strong focus.

“You can do it in real time, record by record,” says Cyril Law, Solutions Director at Callcredit Marketing Solutions. “Businesses can plug their website into it to verify and enhance new customer data and then immediately use it for targeting or to email them.”

Like the other web services discussed above, the Define API can also be employed for verification and enhancement at the point of capture. It can replace the company’s improvemydata online services and conventional offline bureau services, or be used in tandem with them.

In a separate, more recent project, Callcredit is building a new suite of templated SCV databases. “We’re launching them in the new year,” says Law. “With the option of hosting it in the cloud, this approach will make it much quicker and cheaper to build and deploy an SCV database.”

The new platform will run on the MS Azure cloud and employ Microsoft’s Data Lake analytics technology. As is increasingly common in marketing analytics, this replaces traditional relational structures with distributed data stores that don’t require transformation or schema definition and are able to scale almost infinitely.

“You can throw various types of data in there very quickly and then create the view you need on top of that,” explains Law. Using the proven Hadoop ecosystem, Data Lake can support a variety of analytical packages and other applications.

Batch processing and matching software for in-house address formatting, PAF validation, deduping and suppression are also seeing active development. With a broad spread of functionality, The Software Bureau’s Cygnus suite gets constant attention to make sure all its aspects are bang up to date.

“Sortation, suppression and trace processing are the primary areas that are keeping us busy at the moment,” says the company’s Client Services Director Mark Dobson. “With sortations, there’s thousands and thousands of pounds of discount involved so you have to get it exactly right.”

The next release of Cygnus will include the full Citipost sortation, meaning the package will now have all the DSA providers available within it.

“It has been quite a long process but this will make our customers’ lives a lot easier as they won’t need any third party tools to handle any of their mailing work,” Dobson says. “At least until another provider releases a new sortation product onto the market, that is.”

Experian’s Absolute Movers and Contacts are the latest reference files to be added to Cygnus’s ES module. By encrypting a huge range of reference files – over 500 million records – ES gives users instant pay-per-click access to all the leading suppression data on the market, with Acxiom’s Purity the only current exception due to lack of customer demand.

“The way our system works means you can do a lot of trial and error processing to get the best results before you incur any hit charges,” explains Dobson. “Around 80% of our clients are using ES now, it has grown significantly in the last couple of years. With automatic updates, the ease of maintenance is a big attraction – doing it yourself would be a full time job.”

That ability to fine-tune processing and reduce waste is behind the company’s current Lean DM campaign, inspired by Toyota’s famous production philosophy. This highlights the savings that accurate, expert processing can produce.

“That means precise PAF validation and suppression screening but also identifying addresses that aren’t quite right,” says Dobson. “Is the name presented correctly and is the address deliverable? We’re not in a world where two per cent response is acceptable any more so cutting out any wasted mail packs is vital.”

HelpIT’s matchIT SQL and matchIT Hub also stand out in the domain of high volume, high speed in-house data processing. As the name suggests, the former runs directly on top of SQL databases while the latter is a platform-independent solution that employs in-memory processing for super-fast matching.

“We have one client running Hub on a combination of Linux, AWS and Ubuntu,” says Clark. “It lets them present their customer database as a web portal so their staff can use Hub for intelligent searches.”

Latest upgrades to matchIT Hub include a new REST-based API to make access easier. Rather than running locally, matchIT SQL can now make use of Microsoft’s Azure cloud platform.

Clark also notes that clients are increasingly using more sophisticated tools like Qlikview and MS Power BI to report on and visualise problem records. HelpIT has recently made the latter available to clients to analyse processing results within its products.

“You need these visualisation tools to trace problems and improve the process,” he says. “You can view the charts and then drill down to inspect the bad data.”

Addressing mainstays Hopewiser provide another great example of persistent innovation, presently putting in a huge amount of work on a wide range of projects. Maximising quality and speed in their bureau service is a big current focus, with one goal to make all the major goneaway, reconnect and mortality files available online as web service look-ups.

“That way, you will be able to link through immediately from our software or use the feeds as a bureau service,” says Phil Good, Managing Director of Hopewiser. “Part of the reason is the GDPR. We want clients to be able to run cleansing in-house without having to send their files to us and we’ll be looking to provide a version of our bureau software to allow them to do that.”

Another project involves revamping the user interfaces in the company’s software tools to make them as simple as possible. “Simplicity and clarity are the big drivers,” Good says. “My question always is, do we really need that button or control, and can we take it out?”

Effort is also going into new video tutorials to guide clients so there is no need to call for support. Licence management for addressing is another line of attack, with the ambition to put all the various licences involved in deploying software into a single document.

Hopewiser’s developers are also making web-based rapid addressing much easier to build into websites. Simply drop in the pre-formatted code and off you go.

“Our customers don’t want lines of code, they just want it to work,” states Good. “Modern technology makes it very easy, we use a wizard to take developers through it step by step to build it into their website. Our aim is to reduce set-up to almost zero.”

Another Hopewiser bureau development is a thorough appraisal of deceased files, checking the counts that all files produce in an area against the numbers recorded by ONS and HMRC data.

“We want to know, do the figures add up?” says Good. “We’re running them against each other and taking out the dupes.”

One of Good’s pet projects is to take the company’s Documailer variable print templating software used by mailing houses and evolve it into a self-service web-based tool. This would be an online campaign builder that could output email, web pages and other formats like PDF as well as printed mail, with the ability to archive everything at low cost.

“You just enter the names and content,” explains Good. “It’s more of a long-term project.”

Wrapping up the development update, he notes that cost and quality are still in the wrong order when it comes to selecting cleansing software and services.

“Quality is our priority but I think that cost will remain the driver for most clients,” he says. “When that issue comes up, I usually ask if they have compared the results from different suppliers and attempted to put a cost on any missed or wrongly matched data.”
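The kind of comparison Good describes can be sketched as simple arithmetic. Every figure below is hypothetical and chosen only to show the shape of the calculation; none comes from Hopewiser or any other supplier. The function name `total_cost_pence` and its parameters are likewise assumptions. Amounts are held in whole pence to keep the sums exact.

```python
def total_cost_pence(fee, missed, wrong, pack_cost, lost_value):
    """Supplier fee plus the cost of its matching errors, in pence.

    missed: records that should have been suppressed but were mailed
    wrong:  good records wrongly suppressed (a lost opportunity)
    """
    return fee + missed * pack_cost + wrong * lost_value


# Hypothetical inputs: a 60p mail pack and 500p of value per lost contact.
PACK_COST = 60
LOST_VALUE = 500

cheap = total_cost_pence(50_000, missed=2_000, wrong=100,
                         pack_cost=PACK_COST, lost_value=LOST_VALUE)
careful = total_cost_pence(80_000, missed=500, wrong=50,
                           pack_cost=PACK_COST, lost_value=LOST_VALUE)

print(f"Lower-fee supplier:  £{cheap / 100:,.2f}")
print(f"Higher-fee supplier: £{careful / 100:,.2f}")
```

With these invented numbers the supplier with the higher fee works out cheaper overall once wasted packs and lost contacts are counted, which is the trade-off the question is designed to expose.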

If only more end users took the time to do just that. However, data quality and cleansing are notoriously unattractive to marketers. So it should be a comfort to them, and to all other customer data users, to know that there are plenty of specialists out there who can provide expert support – and who continue to move the game on.
