Author: Hu Qi
Abstract: Chinese New Year is approaching, and I heard that Huawei Cloud AI will launch intelligent couplets to welcome the Year of the Ox. Let's wait and see! As a senior engineer, building a couplet model entirely by myself is out of the question, so I collected a number of earlier practice cases. Today I want to share the "AI couplet" project contributed by Rua New Year cake of the whale community, implemented here with ModelArts “My notebook”.
1、 Environment preparation
Before preparing the environment, a few words first: ModelArts is a one-stop AI development platform for developers, and it performs well on the three elements of artificial intelligence. For data, it provides large-scale data preprocessing and semi-automatic annotation. For algorithms, in addition to code that developers write themselves, it offers a large number of preset algorithms and subscription algorithms to choose from. For computing power, the development environment currently provides free compute through the ready-to-use “My notebook”. “My notebook” is my favorite feature at the moment. If you have tried「Run in ModelArts」in the MindSpore tutorials, you will have noticed that the link there opens the ModelArts “My notebook” module. For a concrete walkthrough, see my earlier article on experiencing MindSpore's layer IR, MindIR, online in 5 minutes.
Conventional development requires installing many environments and tools first; AI development on ModelArts is simpler by comparison. In theory, any device that can access the Internet, such as a tablet, is enough. We then only need to register a Huawei Cloud account and complete real-name authentication. Of course, the preparation can involve more than that: for example, if OBS is needed, you also have to generate an access key and complete the ModelArts global configuration. For details, refer to the ModelArts lab: https://gitee.com/ModelArts/M…
The free “My notebook” is in the development tools card at the bottom of the ModelArts overview page. Click “Experience now” to open a JupyterLab instance with a default CPU environment. The “Switching specification” bar on the right lets you switch the environment or the specification. Note that:
After switching resources, all notebooks and terminals under the instance are affected: variables from cells already executed in the notebook are lost, terminals need to be reopened, and manually installed packages no longer take effect and must be installed again.
At present, the CPU and GPU environments support eight notebook environments, including Conda-python3, PyTorch-1.0.0 and TensorFlow-1.13.1. When using the GPU, note that:
1. Free specifications are intended for trial use and stop automatically after 1 hour;
2. Free computing power does not cover the cost of Object Storage Service (OBS) storage resources.
If you want to use the MindSpore framework, follow「Run in ModelArts」in the official MindSpore tutorials to jump to a JupyterLab instance with the MindSpore framework.
2、 Introduction to seq2seq
Seq2seq is a general-purpose encoder-decoder framework for TensorFlow that Google open-sourced in 2017. It can be used for machine translation, text summarization, conversational modeling, image captioning, and more.
Paper: https://arxiv.org/abs/1703.03906
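To make the encoder-decoder idea concrete for couplets, here is a minimal sketch of how one couplet could be turned into a training example. It assumes character-level tokens and teacher forcing; the `START`/`END` markers, the function name, and the sample couplet are illustrative assumptions, not taken from the original project.

```python
# Hypothetical special tokens; the original project may use different markers.
START, END = "<s>", "</s>"


def make_training_triple(upper, lower):
    """Turn one couplet into (encoder input, decoder input, decoder target)."""
    encoder_input = list(upper)            # the encoder reads the upper line
    decoder_input = [START] + list(lower)  # the decoder is fed the previous gold character
    decoder_target = list(lower) + [END]   # and is trained to predict the next one
    return encoder_input, decoder_input, decoder_target


# Illustrative couplet, split into characters.
enc, dec_in, dec_out = make_training_triple("春回大地", "福满人间")
```

The decoder input is simply the target shifted right by one position, which is what lets the model learn "given the upper line and what I have written so far, what comes next".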
3、 Practice by copying
Create a new notebook file in the TensorFlow 1.13.1 environment and start “coding”, which here mostly means copying.
- Dataset download
Although the couplet dataset is relatively old, it contains 700,000 entries, which should be enough to implement a simple couplet-matching model.
- Dependency installation and imports
- Data processing
- Model definition
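The data-processing step in the list above can be sketched roughly as follows. This is not the notebook's actual code: it assumes character-level tokenization, and the `PAD`/`UNK` tokens, the function names, and the sample lines are hypothetical.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def build_vocab(lines, min_count=1):
    """Map each character to an integer id, most frequent characters first."""
    counts = Counter(ch for line in lines for ch in line)
    vocab = {PAD: 0, UNK: 1}
    for ch, n in counts.most_common():
        if n >= min_count:
            vocab[ch] = len(vocab)
    return vocab


def encode(line, vocab, max_len):
    """Convert a line to ids, truncating or padding to exactly max_len."""
    ids = [vocab.get(ch, vocab[UNK]) for ch in line[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Illustrative data standing in for the real dataset files.
uppers = ["春回大地", "风调雨顺"]
lowers = ["福满人间", "国泰民安"]
vocab = build_vocab(uppers + lowers)
batch = [encode(line, vocab, max_len=6) for line in uppers]
```

Padding every line to a fixed length lets a batch of couplets be stacked into one rectangular array, which is what a TensorFlow model expects as input.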
The rest of the code is not posted here. I recommend referring directly to the original source code or visiting https://github.com/hu-qi/modelarts-couplet . I chose 200 epochs, and the training process is as follows:
The figure shows that the lower line produced by the evaluation function keeps changing as training proceeds. After training, we have “refined” a simple but usable AI couplet model, and then tested it:
Not bad: the result is neat and reads smoothly!
Of course, the practice was not all smooth sailing. If a renewal prompt appears during training, be sure to click it manually; otherwise you will have to restart the notebook. Renewing is the right call: I am not sure how many times it can be renewed, but renewing does not interrupt training.
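For readers curious how such a test might generate a lower line, here is a minimal greedy-decoding sketch. It relies on the fact that both lines of a couplet have the same length; the `step_fn` interface and the toy scorer are hypothetical stand-ins for the trained model, not the project's real inference code.

```python
def greedy_decode(upper, step_fn, start="<s>"):
    """Generate a lower line with the same length as `upper`.

    `step_fn(upper, generated)` is a hypothetical stand-in for the trained
    model: it returns a dict mapping candidate characters to scores.
    """
    generated = [start]
    for _ in range(len(upper)):
        scores = step_fn(upper, generated)
        # Greedy choice: keep only the highest-scoring character.
        generated.append(max(scores, key=scores.get))
    return "".join(generated[1:])


# Toy scorer that looks up a fixed answer per upper-line character.
TOY_TABLE = {"春": "福", "回": "满", "大": "人", "地": "间"}


def toy_step(upper, generated):
    position = len(generated) - 1  # index of the character to produce next
    best = TOY_TABLE[upper[position]]
    return {ch: (1.0 if ch == best else 0.0) for ch in TOY_TABLE.values()}


print(greedy_decode("春回大地", toy_step))
```

A real model would replace `toy_step` with a forward pass of the decoder; beam search is a common alternative to the greedy choice when output quality matters more than speed.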
The dataset and notebook for this practice have been uploaded to GitHub: https://github.com/hu-qi/modelarts-couplet . The couplet data is also shared in the ModelArts AI Gallery as the couplet dataset of 700,000 couplets. Feel free to take a look!