Why wait up all night? SageMaker helps you flexibly schedule your notebooks

Date: 2021-09-26

At 5:00 p.m. on Friday, just before the end of the workday, you finally finish a complex and laborious feature engineering strategy. The work started on a t3.medium notebook in Amazon SageMaker Studio. Now you want to plug that strategy into a large instance, scale it horizontally across the rest of the dataset, and head off to start the weekend happy hour.

However… although you can resize the notebook instance directly, the job stops as soon as you shut down the machine. What should you do? Sit in front of the computer while the job runs? Why not schedule the job directly from the notebook instead?

Amazon SageMaker provides a fully managed solution for building, training, and deploying machine learning (ML) models. In this post, we demonstrate how to use Amazon SageMaker Processing jobs together with the open-source project Papermill to execute Jupyter notebooks. The combination of Amazon SageMaker with Amazon CloudWatch, AWS Lambda, and the rest of the Amazon Web Services (AWS) stack provides the modular backbone needed to scale notebook jobs, such as feature engineering, both on demand and on a schedule. We are glad to provide a DIY toolbox that simplifies the whole process: it sets up permissions with AWS CloudFormation, launches jobs with Lambda, and uses Amazon Elastic Container Registry (Amazon ECR) to create a custom execution environment. The toolbox also includes a library and CLI (command line interface) to initiate notebook execution from any AWS client, plus a Jupyter plugin for a seamless user experience.
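
Papermill is the piece that actually executes the notebook: it injects parameters into the notebook's designated parameters cell, runs every cell, and saves a fully rendered copy. As a minimal local sketch of that mechanism (the toolbox performs the equivalent step inside the SageMaker Processing container; the file names here are placeholders):

import papermill as pm

# Inject parameters, execute all cells, and write an executed copy of
# the notebook with its outputs preserved.
pm.execute_notebook(
    "mynotebook.ipynb",         # input notebook (placeholder name)
    "mynotebook-output.ipynb",  # executed copy written by Papermill
    parameters={"p": 0.75},     # overrides values in the "parameters" cell
)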

As of this writing, you can write code in a Jupyter notebook and run it on an ephemeral Amazon SageMaker instance, either immediately or on a schedule. With the tools described in this post, you can do so from any of the following places: a shell prompt, JupyterLab in Amazon SageMaker, another JupyterLab environment of your own, or automatically from a program you write. We also provide sample code that uses AWS CloudFormation to simplify the heavy lifting of setup, along with simple tools to run and monitor the whole system.

For more details on executing notebooks, see the GitHub repo. All sample source code is available in aws-samples on GitHub. Next, we describe how to use scheduled notebook execution.

When to use this solution

This toolkit is especially well suited to nightly report jobs. For example, we may want to analyze all the training jobs our data science team ran that day, run a cost/benefit analysis, and generate a report on the business value the models will bring once deployed to production. Use cases like this are a perfect fit for a scheduled notebook: all of the graphs, tables, and charts are generated by the code, just as if we had stepped through the notebook ourselves, except now they are produced automatically and the results are persisted to Amazon Simple Storage Service (Amazon S3). We can start each day with the freshest notebook, executed overnight, and push the analysis forward from there.

Or, think about scaling up feature engineering. You've already written the for loop that performs all of your pandas transformations; all you need now is the time and compute to run it over all 20 GB of data. No problem: hand the notebook to the toolbox, run a job, and close your laptop, and everything gets done. Whether or not you're actively using Jupyter, the code keeps running on the scheduled instance.

Perhaps your data science team still trains models on local notebooks or Amazon SageMaker notebook instances, and hasn't yet adopted SageMaker's ephemeral instances for training. With this toolkit, you can easily take advantage of more powerful compute just for the duration of model training. You can launch a p3.xlarge instance for an hour of training while running your Studio environment all day on an affordable t3.medium, and connect these resources to the Experiments SDK with a few lines of code. Although running Amazon SageMaker notebooks and Amazon SageMaker Studio on P3 instances is fully supported, making a habit of using the largest instances only for short stretches is an important cost-saving practice.

You may also have an S3 bucket full of objects and need to run a complete notebook over each one. The objects might be dated call records from a call center or the tweet streams of particular users on a social network. Either way, you can easily write a for loop over those objects with this toolbox, as sketched below: it schedules a job for each file, runs each job on its own dedicated instance, and stores the completed notebooks in Amazon S3. The objects could even be model artifacts loaded from your favorite training environment; package your inference code into a notebook, and the toolkit makes it easy to deploy.
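
As a sketch of that fan-out pattern, assuming the RunNotebook Lambda function described later in this post is already deployed (the bucket, prefix, and the notebook's target parameter are all hypothetical):

import json
import boto3

s3 = boto3.client("s3")
aws_lambda = boto3.client("lambda")

bucket = "mybucket"  # hypothetical bucket of input objects
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket=bucket, Prefix="calls/"):
    for obj in page.get("Contents", []):
        # One notebook run per object; each run gets its own Processing
        # instance and stores its completed notebook in Amazon S3.
        aws_lambda.invoke(
            FunctionName="RunNotebook",
            InvocationType="Event",  # asynchronous: fire and forget
            Payload=json.dumps({
                "input_path": "s3://mybucket/analyze.ipynb",
                "parameters": {"target": f"s3://{bucket}/{obj['Key']}"},
            }),
        )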

Finally, customers tell us that reporting on the ongoing performance of their models is an important asset for stakeholders. With this toolbox, we can implement a repeatable solution that analyzes feature importance, generates ROC curves, and evaluates how the model behaves on the edge cases that matter most to the final product. We can also build a model analyzer that every data scientist on the team can easily access, trigger it after each training run completes, and close the loop by sending the analysis to stakeholders.

Three ways to execute scheduled notebooks in SageMaker

To execute notebooks in Amazon SageMaker, we use a Lambda function that configures and runs an Amazon SageMaker Processing job. The function can be invoked directly by the user, or added as the target of an Amazon EventBridge rule to run on a schedule or in response to an event. The notebooks to run are stored as Amazon S3 objects, so they execute normally even when we aren't online. The following figure shows this architecture.

Next, we outline three different ways to install and use this capability, so you can handle notebooks and scheduling however you need to.

Using the AWS API or the CLI directly

We can use the AWS API directly to execute and schedule notebooks. To simplify the process, we provide a CloudFormation template that configures the required Lambda function, along with the AWS Identity and Access Management (IAM) role and policies needed to run the notebooks. We also provide scripts for building and customizing the Docker container image that the Amazon SageMaker Processing job uses when running a notebook.
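
For example, creating the stack with boto3 might look like the following (a sketch: the template file name and stack name are placeholders; see the repo for the actual template and instructions):

import boto3

cfn = boto3.client("cloudformation")

# Local copy of the repo's template, which creates the RunNotebook Lambda
# function plus the IAM role and policies the notebooks run under.
with open("run-notebook.yaml") as f:  # placeholder file name
    template = f.read()

cfn.create_stack(
    StackName="sagemaker-run-notebook",
    TemplateBody=template,
    Capabilities=["CAPABILITY_NAMED_IAM"],  # required: the stack creates IAM roles
)
# Block until the stack finishes creating.
cfn.get_waiter("stack_create_complete").wait(StackName="sagemaker-run-notebook")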

After instantiating the CloudFormation template and creating the container image, you can run a notebook with the following code:

$ aws lambda invoke --function-name RunNotebook \
    --payload '{"input_path": "s3://mybucket/mynotebook.ipynb", "parameters": {"p": 0.75}}' \
    result.json

To create a schedule, enter the following code, replacing the Region and account number in the ARNs and pointing input_path at your own S3 bucket.

$ aws events put-rule --name "RunNotebook-test" --schedule-expression "cron(15 1 * * ? *)"
$ aws lambda add-permission --statement-id EB-RunNotebook-test \
    --action lambda:InvokeFunction \
    --function-name RunNotebook \
    --principal events.amazonaws.com \
    --source-arn arn:aws:events:us-east-1:123456789:rule/RunNotebook-test
$ aws events put-targets --rule RunNotebook-test \
    --targets '[{"Id": "Default", "Arn": "arn:aws:lambda:us-east-1:123456789:function:RunNotebook", "Input": "{ \"input_path\": \"s3://mybucket/mynotebook.ipynb\", \"parameters\": {\"p\": 0.75}}"}]'

With this approach, you move the notebook to Amazon S3, monitor the Amazon SageMaker Processing job, and retrieve the output notebook from Amazon S3 yourself.
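
Those surrounding steps are plain S3 and SageMaker calls. A sketch with boto3 (the job name and output key are hypothetical; the Lambda invocation reports the real Processing job name, and the job's output configuration gives the real S3 path):

import boto3

s3 = boto3.client("s3")
sm = boto3.client("sagemaker")

# Stage the input notebook where the Lambda function expects it.
s3.upload_file("mynotebook.ipynb", "mybucket", "mynotebook.ipynb")

# Poll the Processing job that the Lambda invocation started.
job = sm.describe_processing_job(
    ProcessingJobName="papermill-mynotebook-2021-09-26-00-00-00"  # hypothetical
)
print(job["ProcessingJobStatus"])  # InProgress / Completed / Failed

# Once the job completes, pull down the executed notebook.
s3.download_file("mybucket", "output/mynotebook-output.ipynb",  # hypothetical key
                 "mynotebook-output.ipynb")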

For an experienced AWS user who wants to build a solution without additional dependencies, this is a great option. You can even modify the Lambda function or the Papermill execution container to meet more specific requirements.

For more details on scheduling notebooks with the AWS API, see the complete configuration instructions in the GitHub repo.

Simplify the whole process with the convenience toolkit

To make scheduling notebooks even simpler (especially if you aren't an AWS expert), we also developed a convenience toolkit that wraps the AWS tooling in a CLI and a Python library, providing a more natural interface for running and scheduling notebooks. The toolkit lets you build a custom execution environment with AWS CodeBuild instead of running Docker locally, and it manages the Amazon S3 interactions and job monitoring for you.

After setup is complete, execute a notebook with the following code:

$ run-notebook run mynotebook.ipynb -p p=0.5 -p n=200

Use the following code to schedule the notebook:

$ run-notebook schedule --at "cron(15 1 * * ? *)" --name nightly weather.ipynb -p "name=Boston, MA"

The toolkit also includes tools for monitoring runs and viewing current schedules. See the following code:

$ run-notebook list-runs
Date                 Rule                  Notebook              Parameters          Status     Job
2020-06-15 15:31:40                        fraud-analysis.ipynb  name=Tom            Completed  papermill-fraud-analysis-2020-06-15-22-31-39
2020-06-15 01:00:08  DailyForecastSeattle  DailyForecast.ipynb   place=Seattle, WA   Completed  papermill-DailyForecast-2020-06-15-08-00-08
2020-06-15 01:00:03  DailyForecastNewYork  DailyForecast.ipynb   place=New York, NY  Completed  papermill-DailyForecast-2020-06-15-08-00-02
2020-06-12 22:34:06                        powers.ipynb          p=0.5               Completed  papermill-powers-2020-06-13-05-34-05
                                                                 n=20
$
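
Because every run is an Amazon SageMaker Processing job whose name carries the papermill- prefix shown above, runs can also be monitored with plain boto3. A minimal sketch:

import boto3

sm = boto3.client("sagemaker")

# Notebook runs are Processing jobs named with a "papermill-" prefix.
resp = sm.list_processing_jobs(NameContains="papermill-", MaxResults=10)
for job in resp["ProcessingJobSummaries"]:
    print(job["ProcessingJobName"], job["ProcessingJobStatus"])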

For more details about the convenience toolkit, see the GitHub repo.

Executing notebooks directly from the JupyterLab GUI

If you prefer an interactive experience, the convenience toolkit also provides a JupyterLab extension that you can use in local JupyterLab, Amazon SageMaker Studio, or Amazon SageMaker notebook instances.

After you set up the Jupyter extension, Amazon SageMaker Studio users see a new notebook execution sidebar (the rocket ship icon). The sidebar lets you execute or schedule the notebook you're currently viewing. You can use the default settings to create a notebook runner container, or build your own. Enter the ARN of the container for these jobs and the execution role the instance requires, and you're ready for the next step.

After you choose Run now, the Lambda function picks up the notebook and runs it as an Amazon SageMaker Processing job. You can then choose Runs to view the job's status, as shown in the following screenshot.

After the job completes, the finished notebook is stored in Amazon S3. Note that this means previous runs persist, so you can easily go back to them.

Finally, choose View output to import the notebook. A notebook is never copied to your local directory until you import it. This design is handy when you want to see what happened without accumulating lots of extra notebooks.

For more instructions on setting up the JupyterLab extension and using the GUI to run and monitor notebooks, see the GitHub repo.

Summary

This post discussed how to combine the modular capabilities of Amazon SageMaker and the AWS stack to give data scientists and machine learning engineers a seamless experience running notebooks on ephemeral instances. We also released an open-source toolkit to simplify the process further, consisting of a CLI, a convenience package, and a JupyterLab extension.

Building on that, we walked through several use cases, from running nightly reports to scaling up feature engineering to analyzing models against the latest datasets, and shared several ways to run the toolkit. To get started, try the quick start tutorial on GitHub, and see the GitHub repo for more examples.