Google Translate Recipe - Part 2
In Part 1 of this series, we discussed at a high level what a recipe is and how Google Cloud services can be consumed as recipes in a solution. To recap, a recipe is a small program that runs as a serverless function inside Trillo Workbench. A recipe for a Google service integrates with that service and provides an easy, customizable interface for using it.
In this article, we will discuss an end-to-end solution built using a recipe for the Google Translate Service.
In a couple of hours, a user can build a workflow to process and translate any number of files (thousands, millions, or more).
The solution discussed here has the following requirements, which are typical of several use cases. We will revisit some of them in greater detail in future blog posts.
- Users upload tens of thousands of files or more into a Google Cloud Storage bucket using the “gcloud” command or Trillo Workbench (UI or built-in SFTP server).
- These files contain text in one language. The text needs to be translated into another language using the Google Translation service.
- The result of the translation is stored in the database along with the original text and full file name.
- The system should be able to process several hundred files concurrently (or 10,000 concurrently with minor configuration changes).
- The system should provide a secure API to retrieve a list of files and, for each file, the translated text.
- Users should be able to monitor the progress of each file. All failures should be recorded. The system should retry failed files 3 times. Users should be able to view all files that failed 3 times.
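The retry and failure-tracking requirement can be illustrated with a minimal Python sketch. It is not Trillo Workbench code: `FileStatus`, `process_with_retries`, and the `handler` callback are hypothetical names, and the sketch assumes that "retry 3 times" means at most three attempts in total per file.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FileStatus:
    """Progress record for one file (hypothetical structure)."""
    file_name: str
    attempts: int = 0
    state: str = "pending"  # becomes "done" or "failed"
    errors: List[str] = field(default_factory=list)


def process_with_retries(file_name: str,
                         handler: Callable[[str], None],
                         max_attempts: int = 3) -> FileStatus:
    """Run handler on a file, recording every failure, up to max_attempts."""
    status = FileStatus(file_name)
    while status.attempts < max_attempts:
        status.attempts += 1
        try:
            handler(file_name)
            status.state = "done"
            return status
        except Exception as exc:
            # Record the failure so users can inspect it later, then retry.
            status.errors.append(f"attempt {status.attempts}: {exc}")
    # Every attempt failed: this is the kind of file users would want listed.
    status.state = "failed"
    return status
```

A record whose `state` is `"failed"` carries one error message per attempt, which is the information a failure view would need.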
Building the Solution using Trillo Workbench
- To upload files, we will use the Trillo File Manager UI, which is part of Trillo Workbench.
- An end-user uploads files using Trillo Workbench UI.
- An admin user can attach a task to a folder. The task runs on each file uploaded to that folder.
- The admin user creates the task. A task in Trillo Workbench consists of a custom recipe (a program snippet).
- The custom recipe reads the content of the file and invokes the out-of-the-box Google Translate recipe to translate the text. It then writes the output to a database table. The custom recipe also logs important events, such as the file size or any failure.
- Trillo Workbench provides a UI to create the table to store the result of translation.
- Trillo Workbench automatically publishes a paginated API to list entries in the table and details of a record.
- Trillo Workbench can issue credentials to access APIs securely.
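The read, translate, and store flow of the custom recipe can be sketched in plain Python. This is only an illustration under stated assumptions, not the Trillo Workbench recipe API: an SQLite table stands in for the Workbench table, the `translate` callable stands in for the out-of-the-box Google Translate recipe, and the names `create_table`, `translate_file`, and `translations` are hypothetical.

```python
import logging
import sqlite3
from pathlib import Path
from typing import Callable

log = logging.getLogger("translate_recipe")


def create_table(conn: sqlite3.Connection) -> None:
    """Create the result table: full file name, source text, translated text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS translations ("
        "file_name TEXT PRIMARY KEY, source_text TEXT, translated_text TEXT)"
    )


def translate_file(path: Path,
                   translate: Callable[[str], str],
                   conn: sqlite3.Connection) -> None:
    """Read one file, translate its text, and store both versions."""
    text = path.read_text(encoding="utf-8")
    log.info("read %s (%d bytes)", path, path.stat().st_size)
    try:
        # Assumption: `translate` wraps whatever translation service is used.
        translated = translate(text)
    except Exception:
        log.exception("translation failed for %s", path)
        raise  # let the caller decide whether to retry
    conn.execute(
        "INSERT OR REPLACE INTO translations VALUES (?, ?, ?)",
        (str(path), text, translated),
    )
    conn.commit()
```

Passing the translation step in as a callable keeps the sketch independent of any particular service, which mirrors the idea that the recipe being invoked can be swapped for another one.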
The above steps provide a scalable translation service in a few hours or a day's work. The beauty is that each step is customizable, including the table columns. For example, you can use another API that is automatically provided for the table to store corrections made by end users to the translated text.
You can easily extend the functionality, for example by computing the accuracy of the translation using another recipe. Or, you can use the translated and corrected results to train a model.
If you replace the Google Translate recipe with one for NLP entity extraction, you can see how a similar workflow would work for entity extraction.
Recap of Developer User Tasks
In order to build the solution, a developer user carries out the following tasks.
- Create a table to store the source text, the translated text, and the file name.
- Write a custom recipe that reads a file and translates it using an existing recipe.
- Save the source and the translated text in the table. Also, log important events.
- Create a task, associate the recipe with it, and set the task concurrency to 100 and the number of retries to 3.
- Create a folder, attach the task to it.
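The task settings in the list above (concurrency of 100, 3 retries) can be pictured with a small Python sketch. Trillo Workbench handles this scheduling itself, so this is only an analogy: `run_task` and its parameters are hypothetical, a thread pool stands in for the task runner, and the retry count is assumed to mean at most three attempts per file.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_task(files: Iterable[str],
             recipe: Callable[[str], None],
             concurrency: int = 100,
             max_attempts: int = 3) -> Dict[str, str]:
    """Apply recipe to each file with bounded concurrency and retries.

    Returns a mapping of file name to "done" or "failed: <last error>".
    """

    def attempt(name: str) -> str:
        last_error = "failed: not attempted"
        for _ in range(max_attempts):
            try:
                recipe(name)
                return "done"
            except Exception as exc:
                last_error = f"failed: {exc}"
        return last_error

    names = list(files)
    # At most `concurrency` files are processed at the same time.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return dict(zip(names, pool.map(attempt, names)))
```

For example, `run_task(file_names, my_recipe)` would process up to 100 files at a time, where `my_recipe` is whatever function handles a single file.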
Screenshots of the Workflow
The following screenshots provide a visual overview of the workflow discussed above.