Amazon Web Services AI monthly highlights: November-December 2020

Time: 2021-08-22

We are happy to welcome 2021. It is time to act on all your goals, big and small. What do you plan to learn this year? Which new technologies do you intend to apply in your projects? As you stride forward, remember to review the practical material published earlier, which may also give you some inspiration.

In November and December 2020, as usual, we shared a large number of technical articles on Amazon AI, machine learning, deep learning and more. Let’s review them now.

Machine learning

Many Amazon Web Services customers have begun using Amazon Lex bots to improve the self-service conversation experience in Amazon Connect, both on the phone and across many other channels. With Amazon Lex, callers can quickly get answers to their questions without involving a human agent. This also places higher demands on service availability and raises a new question: which architecture pattern should we use to improve bot availability? One answer is to deploy Amazon Lex bots in multiple Regions, using a cross-Region approach to improve service availability. For the detailed method, see: You have seen plenty of customer service bots. Can you build one that stays highly available?
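
As a rough illustration of the cross-Region idea, the sketch below tries a list of Regions in order and falls back to the next one when a call fails. It is only a minimal client-side sketch and does not reproduce the architecture of the referenced article. The `send` callable, the `fake_send` stub and the Region names are hypothetical stand-ins; no real Amazon Lex API is called.

```python
def ask_bot(text, regions, send):
    """Try each Region in order and return (region, reply) from the first that answers."""
    errors = {}
    for region in regions:
        try:
            return region, send(region, text)
        except ConnectionError as exc:
            # Remember the failure and move on to the next Region.
            errors[region] = str(exc)
    raise RuntimeError(f"all regions failed: {errors}")


def fake_send(region, text):
    """Hypothetical stand-in for a bot call; simulates an outage in one Region."""
    if region == "us-east-1":
        raise ConnectionError("simulated outage")
    return f"[{region}] reply to: {text}"


served_by, reply = ask_bot(
    "What is my balance?", ["us-east-1", "us-west-2"], fake_send
)
print(served_by, reply)
```

In this sketch the caller decides the failover order; a production setup would more likely rely on health checks to choose the active Region before the call is made.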

When making predictions with machine learning, one of the most troublesome problems is inaccurate results caused by missing data. If your forecasting system uses Amazon Forecast, this becomes much simpler: the service’s missing-value filling feature handles the gaps for you. This article uses the notebook example in the Forecast GitHub repo to demonstrate missing-value filling for related time series and target time series (TTS) datasets. Please read: How do you make machine learning predictions when values are missing? Amazon Forecast will fill them in for you
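
To make the idea of missing-value filling concrete, here is a minimal sketch in plain Python that replaces gaps in a series with either zero or a summary statistic of the observed values. It is not the Amazon Forecast API. The function name `fill_missing`, the method names and the sample data are assumptions chosen for illustration.

```python
from statistics import mean, median


def fill_missing(series, method="zero"):
    """Return a copy of series with None entries replaced by a fill value."""
    observed = [v for v in series if v is not None]
    if method == "zero":
        fill = 0.0
    elif method in ("mean", "median"):
        if not observed:
            raise ValueError("cannot compute a statistic without observed values")
        fill = mean(observed) if method == "mean" else median(observed)
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if v is None else v for v in series]


# Hypothetical daily demand with two missing days.
demand = [10.0, None, 14.0, None, 12.0]
filled_zero = fill_missing(demand, "zero")
filled_mean = fill_missing(demand, "mean")
print(filled_zero)
print(filled_mean)
```

Which fill value is appropriate depends on what a gap means: zero suits "no sales that day", while a statistic suits "the reading was lost".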

Amazon SageMaker provides a fully managed solution for building, training, and deploying machine learning (ML) models. In this article, we demonstrate how to use Amazon SageMaker Processing jobs to execute Jupyter notebooks in conjunction with the open source project Papermill. Combining Amazon SageMaker with Amazon CloudWatch, AWS Lambda and the rest of the Amazon Web Services stack gives us the modular backbone needed to scale jobs on demand and on a schedule. Welcome to: Why should the “tool person” wait all night? SageMaker helps you schedule your notebooks flexibly

Fraudulent users and malicious accounts can cost enterprises billions of dollars in lost revenue every year. Although many enterprises use rule-based filters to block malicious activity in their systems, such filters are often fragile and cannot capture every malicious behavior. This article describes how to use Amazon SageMaker and the Deep Graph Library (DGL) to train a graph neural network (GNN) model that detects malicious users or fraudulent transactions. See: When fraudsters “don’t play by the rules”, this is how we deal with them

Generally speaking, an ML model needs continuous improvement after it is developed in order to deliver satisfactory results. In many scenarios, such as e-commerce applications and other real-world environments, offline evaluation alone is not enough to guarantee model quality. We need to run A/B tests on the model in production and make model update decisions based on the results. With Amazon SageMaker, you can run multiple production model variants on one endpoint and easily perform A/B tests on ML models. These production variants can correspond to ML models trained with different training datasets, algorithms and ML frameworks, and can be paired with different instance types to form a variety of test combinations. Please refer to: A/B testing, common in software development, also works for ML models
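
The traffic split behind an A/B test can be pictured with a small simulation: each request is routed to one variant at random, in proportion to the variant’s weight. This is only a sketch of the idea in plain Python, since SageMaker performs the routing on the endpoint itself. The variant names `model-a` and `model-b`, the weights and the helper `route_requests` are hypothetical.

```python
import random
from collections import Counter


def route_requests(variant_weights, n_requests, seed=0):
    """Simulate weighted routing and return how many requests each variant got."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    names = list(variant_weights)
    weights = [variant_weights[name] for name in names]
    picks = rng.choices(names, weights=weights, k=n_requests)
    return Counter(picks)


# Send roughly 90% of traffic to the current model and 10% to the challenger.
counts = route_requests({"model-a": 0.9, "model-b": 0.1}, 10_000)
print(counts)
```

Starting the challenger at a small share limits the impact of a bad model while still collecting enough production traffic to compare the two.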

Amazon SageMaker comes with the R kernel preinstalled in all Regions worldwide. The feature works out of the box, and the reticulate library is preinstalled as well. This library provides an R interface to the Amazon SageMaker Python SDK, allowing you to call Python modules directly from R scripts. This article describes how to create a custom R environment (kernel) on top of Amazon SageMaker’s built-in R kernel and how to persist the environment between sessions. It also describes how to install new packages in the R environment, how to save the new environment to Amazon Simple Storage Service (Amazon S3), and how to use it to create new Amazon SageMaker instances through Amazon SageMaker lifecycle configurations. See: How many tricks do you know for working with the R environment on SageMaker?

Deep learning

Anyone who has tried it knows that deep learning, and model training in particular, calls for a powerful computer. The CPU matters less, but the more GPUs and the faster they are, the better. Yet even if money is no object and you have the most high-end AI GPU accelerators, training a complex model still takes a long time, while bosses and customers are waiting for results. How can you speed up training? Bring in more computers and run distributed training. The effect is great! How exactly? See: Bring in more machines. Today we try multi-GPU distributed training for deep learning

GPUs can significantly accelerate deep learning training and may shorten the training cycle from weeks to hours. However, many issues must be considered to get the full performance out of GPU resources. This article focuses on general-purpose techniques that improve I/O in order to optimize GPU performance when training on Amazon SageMaker. These techniques are broadly applicable and place no requirements on the infrastructure or on the deep learning framework itself. By optimizing I/O processing routines, overall GPU training performance can improve by up to 10x. Please read: How can deep learning training be faster? Have you tried I/O optimization for GPU performance?
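
One common I/O technique is prefetching: a background thread loads the next batches while the training loop is busy with the current one, so the GPU is not left waiting on disk or network reads. The sketch below shows the pattern with the standard library only. It is an assumption that prefetching is among the techniques the referenced article covers, and `load_batches` is a hypothetical stand-in for a slow data source.

```python
import queue
import threading


def prefetch(batches, buffer_size=2):
    """Yield items from batches while a background thread reads ahead."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel that marks the end of the stream

    def producer():
        for batch in batches:
            buffer.put(batch)  # blocks when the buffer is full
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


def load_batches(n):
    """Hypothetical stand-in for slow reads from disk or network."""
    for i in range(n):
        yield [i, i + 1]


# The "training step" here is just a sum; batches arrive in their original order.
consumed = [sum(batch) for batch in prefetch(load_batches(4))]
print(consumed)
```

The bounded queue keeps memory use predictable: the loader can run at most `buffer_size` batches ahead of the consumer.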

Artificial intelligence

Sometimes it is hard to find the right words to describe what we are looking for. As the saying goes, “a picture is worth a thousand words”. Showing a real example or image usually communicates far better than a pure text description, and this is especially true when using a search engine to find content. Similar capabilities are available in many applications, but have you ever wondered how they are implemented? How can you quickly build a visual image search application from scratch, including a full-stack web application that serves the visual search results? Please read: “Visual search” seems magical, but it is not that hard to try yourself

A convolutional neural network (CNN) operates like a black box. If we cannot understand the reasoning behind its predictions, we are likely to run into problems in use. In addition, once a model is deployed, the data used for inference may follow a completely different distribution from the data used for training. This phenomenon is often called data drift, and it can lead to prediction errors. In such cases, understanding and explaining the causes of the errors is the only way out of the fog. This article deploys a model for traffic sign classification and sets up Amazon SageMaker Model Monitor to automatically detect unexpected model behavior, such as consistently low prediction scores or over-prediction of certain image classes. See: SageMaker Model Monitor and Debugger: powerful assistants for understanding the “black box” of convolutional neural networks
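
Over-prediction of a class can be spotted by comparing how often each label is predicted in production against a baseline. The sketch below is a simplified plain-Python illustration of that check and not how SageMaker Model Monitor is implemented. The label names, the sample data and the 0.15 tolerance are assumptions.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of predictions that fall into each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def overpredicted(baseline_labels, live_labels, tolerance=0.15):
    """List classes whose live share exceeds the baseline share by more than tolerance."""
    base = class_shares(baseline_labels)
    live = class_shares(live_labels)
    return sorted(
        label
        for label, share in live.items()
        if share - base.get(label, 0.0) > tolerance
    )


# Hypothetical predictions for a traffic sign classifier.
baseline = ["stop"] * 40 + ["yield"] * 30 + ["speed_limit"] * 30
live = ["stop"] * 75 + ["yield"] * 15 + ["speed_limit"] * 10
flagged = overpredicted(baseline, live)
print(flagged)
```

A flagged class does not prove the model is wrong, since the real-world mix of signs may have changed, but it tells you where to start investigating.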

When data scientists tackle a problem with supervised learning, they usually need to assemble a high-quality labeled dataset before modeling. This is hard work. Fortunately, Amazon SageMaker Ground Truth makes it easy to obtain the datasets needed for a variety of tasks, such as text classification and object detection. Ground Truth can also help you build custom datasets for user-defined tasks and label any content in them. For details, see: Angular x Ground Truth: dataset labeling has never been so simple

For data scientists, the R language is a sharp tool that solves many data analysis challenges easily and intuitively. How powerful does it become when combined with SageMaker’s machine learning capabilities? Many Amazon Web Services customers have already begun to adopt R, the popular open source software for statistical computing and graphics, for big data analytics and data science. In this article, we learn how to use R to train and deploy a machine learning model and retrieve its predictions on an Amazon SageMaker notebook instance. Welcome to: SageMaker supports R programming. Data scientists, cheer!

Organizations in every industry need to process large volumes of paper documents, most of which are invoices. In the past, it was difficult to extract useful information from scanned documents containing tables, forms, paragraphs and check boxes. Although many organizations have tackled information extraction through manual work, custom code or optical character recognition (OCR), these approaches still depend on well-tuned form extraction and custom workflow templates. In addition, after extracting text or other content from documents, organizations also want to give end users deeper insights from receipts or invoices. That requires building a complex natural language processing (NLP) model, and training such a model takes a lot of training data and computing resources. Building and training machine learning models is often expensive and time-consuming. This article shows how to use Amazon AI services to automate text data processing and insight discovery. With Amazon AI services, we can set up an automated serverless solution that meets these requirements. See: Finance staff buried in invoices at year end should give this a try

Amazon Web Services provides a machine translation service called Amazon Translate, which supports two-way translation between major languages around the world. In addition to pasting text in one language into Amazon Translate and translating it into a specified language, the recently updated Amazon Translate also supports translating Office Open XML documents in docx, pptx and xlsx formats. This article shows how to translate documents in the AWS Management Console. See: What do the documents from overseas customers say? If you don’t understand, ask, and Amazon Translate will tell you the answer

Case studies

Cisco is a large enterprise, and many of its business units apply machine learning (ML) and artificial intelligence (AI) technologies. The Cisco AI team, which reports directly to the CTO, is responsible for promoting open source AI/ML best practices across all business units in the company. It is also a major contributor to the Kubeflow open source project and to MLPerf/MLCommons. The team hopes to develop machine learning artifacts and best practices for both Cisco business units and customers, and to share these solutions as reference architectures. So how did they do it? Welcome to: Cisco’s hybrid machine learning workflow with SageMaker and Kubeflow

In machine learning, the generative adversarial network (GAN) has become the most popular algorithm for building deepfakes. The underlying technology of deepfake algorithms is the same as the methods that bring us realistic animation in movies and console games. Unfortunately, malicious actors use these otherwise innocent algorithms to blur the line between what is real and what is fake. In essence, a deepfake video uses artificial intelligence to manipulate audio and video so that people appear to do or say things they never actually did or said. In October 2019, Amazon Web Services, Facebook, Microsoft and the Partnership on AI jointly organized the first Deepfake Detection Challenge. How did they use Amazon Web Services to run this event successfully? Please read: Amazon EC2 powers the Deepfake Detection Challenge

In the face of fierce competition, almost every retailer is trying to give customers a customized, personalized experience. But recommending the right products to the right customers, at the right time, through the right channels and in the right way is not easy to do well! StockX, a start-up from Detroit, built its business around sneakers and streetwear. The company aims to reinvent the existing e-commerce sales model through its distinctive two-way bid and ask marketplace. To attract customers with a more personalized experience, and after extensive comparison and evaluation, StockX decided to build its own personalized recommendation engine on Amazon Personalize. How did they achieve all this? Welcome to: How to improve overall customer engagement by 50% with personalized recommendations
