As veterans in the localization space, we have seen the use of Machine Translation steadily increase over the years as the quality of MT has improved. The world of Big Data has clearly helped drive this increase in quality since MT engines are based on statistical algorithms — the more data you feed the engine, the better the results. This is one reason you see “free” website translation tools today: they want your data so they can further test and train their engine.

One company found that its employees were sending more than 1 GB of data per month to free machine translation websites. Some of that information was proprietary and extremely sensitive.

Did you ever ask yourself what exactly happens to that data? In many cases the content becomes the property of the engine provider, allowing them to use the information as they see fit.

You probably have a data security breach thanks to free MT tools

When content is entered into a free web-based machine translation tool, it is being delivered outside of your firewall and into the possession of another company. This probably isn’t a big deal when I’m trying to translate my Aunt Sophia’s family meatball recipe from Italian, or when I translate the assembly instructions for my kid’s train set from a foreign-language manual I found online.

But what about sensitive or regulated information? What can the MT provider learn from the quantity and types of translation requests submitted by your people? What about staff reviews for in-country employees? What about the contracts from overseas suppliers? What about medical test results from non-English labs? And what is done with the information that ties an IP address or individual to the content being translated, information that the free MT provider can easily determine and store?

Your employees may be entering sensitive information into free web-based machine translation tools. Now you have a problem.

Here are 4 tips to help you scope and correct the security problems caused by free Machine Translation tools:

Tip 1: Check the ULA of “free” online machine translation tools

Do free web-based translation tools retain your content? Do they use it to tune their translation engines? Do they retain metadata to measure performance? Do your search terms and phrases become the content used to auto-populate searches and translations for other far-flung users outside of your organization?

You should carefully check the User License Agreement (ULA) of these tools to see what the risk might be — and be sure to check back frequently because the terms of service are subject to change at any time.

Tip 2: Ask IT to monitor the use of free online machine translation tools

You might be surprised! One of our clients asked IT to specifically monitor this web traffic and found that over 1 GB of content per month was being entered into free translation tools, some of it highly confidential. It is relatively trivial for IT to track traffic to the IP addresses of these websites. IT could even take stronger measures, such as displaying a browser warning about security and intellectual property protection whenever an employee accesses these sites, or blocking the sites altogether.
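As a rough illustration of how IT might quantify this traffic, here is a minimal Python sketch that totals the bytes each user uploads to a watch list of MT sites. Everything specific in it is an assumption for illustration: the hostnames are placeholders, and the log layout (a CSV export with `user`, `host` and `bytes_sent` columns) is hypothetical, so it would need adapting to whatever your proxy or firewall actually records.

```python
import csv
import io
from collections import defaultdict

# Hypothetical watch list: replace with the hostnames of the free MT sites
# your organization wants to monitor.
MT_HOSTS = {"translate.example.com", "freemt.example.net"}


def is_mt_host(host: str) -> bool:
    """Return True if host is a watched MT site or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == h or host.endswith("." + h) for h in MT_HOSTS)


def bytes_sent_to_mt(log_file) -> dict:
    """Sum the bytes uploaded to watched hosts, per user.

    Assumes a CSV proxy log whose header row includes the columns
    'user', 'host' and 'bytes_sent' (an assumed layout, not a standard).
    """
    totals = defaultdict(int)
    for row in csv.DictReader(log_file):
        if is_mt_host(row["host"]):
            totals[row["user"]] += int(row["bytes_sent"])
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample rows standing in for a real log export.
    sample = io.StringIO(
        "timestamp,user,host,bytes_sent\n"
        "2024-01-05T09:12:00,alice,translate.example.com,20480\n"
        "2024-01-05T09:13:10,bob,intranet.example.org,512\n"
        "2024-01-05T09:15:42,alice,api.freemt.example.net,4096\n"
    )
    for user, total in sorted(bytes_sent_to_mt(sample).items()):
        print(f"{user}: {total} bytes")
```

Matching on hostnames rather than raw IP addresses is a design choice made here for readability; a report like this only shows how much is leaving the network, not what the content was.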

Tip 3: Deploy a private machine translation portal for your employees

The number one cause of this problem is that most companies simply do not provide a better alternative for their employees. We have helped our customers deploy secure private machine translation engines that work with your security protocols to provide the same functionality without putting your sensitive corporate data in a compromising position.

In fact, our clients report that private machine translation portals deliver superior translation quality because they use MT engines tuned especially for each client’s industry, leveraging their linguistic assets like translation memory and terminology databases.

Tip 4: Consider whether you want on-premise or SaaS deployment

We often get questions about the security of MT services hosted in the cloud. Security is an important consideration, so ask your MT provider for details about their deployment options.

SDL Government offers secure, on-premise Machine Translation in your environment, a deployment option mandated by many of our government customers as well as other customers who handle confidential or regulated information. So, if the data that you translate, or your association with that data, is proprietary or not meant to be held by a third-party vendor, looking at on-premise MT is critical, and SDL Government can help. Over the past two years, we have seen a significant increase in our Defense and Intelligence customers transitioning from ‘open’ web-based translation solutions residing outside the organization’s firewalls to on-premise translation solutions. The result: our customers create their data, keep their data, and completely control the overall translation process.

With the proper precautions, you can use machine translation for your sensitive data. You just need to make sure that you are using a fully secure solution so you aren’t exposing that data to the world.

Protect your important data with on-premise Machine Translation.

Your team will benefit from improved security, performance and quality.
