This article is from Heart of the Machine, written by Adrien Treuille and compiled by Heart of the Machine.
How hard is it for a machine learning developer to build an app? In fact, you only need to know Python; a tool can handle the rest. Adrien Treuille, co-founder of Streamlit, recently wrote an article introducing Streamlit, a free and open-source app framework built specifically for machine learning engineers. The tool updates your app in real time as you write Python code. Streamlit currently has more than 3,400 GitHub stars, and the article's popularity on Medium has passed 9,000.
A real-time neural network inference and semantic search engine, written in 300 lines of Python code.
In my experience, every nontrivial machine learning project ends up stitched together with internal tools that are error-prone and hard to maintain. These tools are usually written in Jupyter notebooks and Flask apps, which are difficult to deploy, require reasoning about client-server architecture, and do not integrate well with machine learning components such as TensorFlow GPU sessions.
I first saw this kind of tool at Carnegie Mellon University, and then at Berkeley, Google X, and Zoox. These tools started out as small Jupyter notebooks: a sensor calibration tool, a simulation comparison app, a LIDAR alignment app, a scenario replay tool, and so on.
As a tool becomes more important, project managers get involved, and processes and requirements multiply. These solo projects turn into scripts and gradually grow into sprawling "maintenance nightmares".
The machine learning engineer's (ad hoc) app-building process.
When a tool becomes critical, we call in a tools team. They write fluent Vue and React, and their laptops are covered with stickers for declarative frameworks. Their design process looks like this:
The tools team's clean-slate app-building process.
This is amazing! But all of these tools need new features, often every week. Meanwhile, the tools team may be supporting more than ten projects at once, and they will say, "We will update your tool in two months."
We want machine learning engineers to be able to build good apps without a tools team. These internal tools should arise naturally as a by-product of the machine learning workflow. Writing such tools should feel like training a neural network or running an ad hoc analysis in Jupyter! At the same time, we want to keep the flexibility of a powerful app framework. We want to create tools that engineers can be proud of.
We want the app-building process to look like this:
The Streamlit app-building process.
Together with engineers from Uber, Twitter, Stitch Fix, Dropbox, and others, we created Streamlit, a free and open-source app framework for machine learning engineers. With each prototype, Streamlit's core principles became simpler and purer.
The core principles of Streamlit are as follows:
- Embrace Python
A Streamlit app is just a script that runs from top to bottom, with no hidden state. You can factor your code with function calls. If you can write a Python script, you can write a Streamlit app. For example, you can write to the screen as follows:
    import streamlit as st
    st.write('Hello, world!')
- Treat a widget as a variable
There are no callbacks in Streamlit! Every interaction simply reruns the script from top to bottom. This approach keeps the code very clean:
    import streamlit as st
    x = st.slider('x')
    st.write(x, 'squared is', x * x)
An interactive Streamlit app written in three lines of code.
- Reuse data and computation
What if you need to download a large amount of data or perform a complex computation? The key is to reuse information safely across runs. Streamlit introduces a cache primitive that behaves like a persistent, immutable-by-default data store, so that Streamlit apps can reuse information easily and safely. For example, the following code downloads data from the Udacity self-driving car project (https://github.com/udacity/se…) only once, which yields a simple, fast app:
Using st.cache to persist data across Streamlit runs. For instructions on running the code, see: https://gist.github.com/treui…
Output of running the st.cache example above.
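The caching idea can be sketched without Streamlit at all. The following is a minimal stand-in, not Streamlit's own code: it uses Python's standard `functools.lru_cache` in place of `st.cache`, and the names `load_labels` and `run_script`, the URL, and the rows are all hypothetical placeholders. Each call to `run_script` plays the role of one rerun of the app.

```python
import functools
from typing import List, Tuple

Row = Tuple[int, str]

download_count = 0  # how many times the expensive step really ran


@functools.lru_cache(maxsize=None)
def load_labels(url: str, nrows: int) -> Tuple[Row, ...]:
    """Hypothetical expensive download; made-up rows stand in for real data."""
    global download_count
    download_count += 1
    labels = ("car", "truck")
    # Return a tuple so callers cannot mutate the cached value.
    return tuple((i, labels[i % 2]) for i in range(nrows))


def run_script(desired_label: str) -> List[Row]:
    """One top-to-bottom run of the toy app for a given widget value."""
    data = load_labels("https://example.invalid/labels.csv", 6)
    return [row for row in data if row[1] == desired_label]


cars = run_script("car")      # first run: the data is "downloaded"
trucks = run_script("truck")  # second run: the data comes from the cache
```

Both runs filter the same cached rows, and `download_count` stays at 1. Returning an immutable tuple mirrors the "immutable by default" property described above; how `st.cache` itself detects changes is not shown here.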
In short, Streamlit works like this:
- The entire script is rerun from scratch on every user interaction.
- Streamlit assigns each variable its latest value based on the state of the widgets.
- Caching lets Streamlit skip redundant data fetches and computation.
As shown in the figure below:
A user event triggers Streamlit to rerun the script from scratch. Only the cache persists across runs.
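To make the rerun model concrete, here is a toy runtime in plain Python. It is a sketch under stated assumptions, not how Streamlit is implemented: `ToyApp`, `rerun`, and `user_sets` are hypothetical names, and `slider` and `write` only loosely mirror the calls shown earlier.

```python
from typing import Any, Callable, Dict, List


class ToyApp:
    """Hypothetical stand-in for a Streamlit-like runtime (not the real API)."""

    def __init__(self, script: Callable[["ToyApp"], None]) -> None:
        self.script = script
        self.widget_state: Dict[str, Any] = {}  # survives between runs
        self.output: List[str] = []             # rebuilt on every run
        self.run_count = 0

    def slider(self, label: str, default: int = 0) -> int:
        # A widget call just returns the widget's current value.
        return self.widget_state.get(label, default)

    def write(self, *parts: Any) -> None:
        self.output.append(" ".join(str(p) for p in parts))

    def rerun(self) -> None:
        # Discard the old output and run the whole script top to bottom.
        self.output = []
        self.run_count += 1
        self.script(self)

    def user_sets(self, label: str, value: Any) -> None:
        # A user interaction updates widget state, then triggers a full rerun.
        self.widget_state[label] = value
        self.rerun()


def script(app: ToyApp) -> None:
    # No callbacks: the widget value is an ordinary local variable.
    x = app.slider("x")
    app.write(x, "squared is", x * x)


app = ToyApp(script)
app.rerun()                       # initial page load
first_output = list(app.output)
app.user_sets("x", 7)             # the user drags the slider to 7
second_output = list(app.output)
```

The script never registers a handler for the slider. The new value simply shows up in `x` on the next run, which is why the app code stays as flat as an ordinary script.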
If you are interested, you can try it immediately! Just run the following line:
A web browser will open automatically and navigate to your local Streamlit app. If no browser window appears, just click the link.
These ideas are simple but effective, and using Streamlit does not prevent you from creating rich, useful apps. While working at Zoox and Google X, I watched self-driving car projects balloon into gigabytes of visual data. That data needed to be searched and understood, which included running models on the images and comparing their performance. Every self-driving car project I have seen has tools that the whole team uses for this.
Building such a tool in Streamlit is very simple. The following Streamlit demo performs semantic search across the entire Udacity self-driving car photo dataset, visualizes the human-annotated ground-truth labels, and runs a complete neural network (YOLO) inside the app in real time.
This 300-line Streamlit demo combines semantic visual search with interactive neural network inference.
The entire app is only 300 lines of Python, most of which is machine learning code. In fact, there are only 23 Streamlit calls in the whole app. You can try it yourself:
As we worked with machine learning teams on their projects, we came to realize that these simple ideas bring a number of important benefits:
Streamlit apps are pure Python files, so you can use your favorite editor and debugger.
When building apps with Streamlit, I like to use the VSCode editor (left) and Chrome (right).
Pure Python code works seamlessly with Git and other source control software, including commits, pull requests, issues, and comments. Because Streamlit's underlying language is Python, you get these collaboration tools for free.
A Streamlit app is a Python script, so you can easily version-control it with Git.
Streamlit provides an immediate-mode live coding environment. Just click "Always rerun" when Streamlit detects a change to the source file.
Click "Always rerun" to enable live coding.
Caching simplifies setting up computation pipelines. Chaining cached functions automatically creates an efficient computation pipeline! You can try the following code:
A simple computation pipeline in Streamlit. To run the above code, see the instructions: https://gist.github.com/treui…
Basically, the pipeline runs from loading metadata to creating a summary (load_metadata → create_summary). Each time the script runs, Streamlit recomputes only the subset of the pipeline that is needed.
To keep the app responsive, Streamlit recomputes only the parts necessary to update the UI.
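The effect of chaining cached functions can be illustrated with a small stand-in. This sketch again assumes `functools.lru_cache` as a substitute for `st.cache`; the stage names `load_metadata` and `create_summary` follow the pipeline described above, while the rows, the `source` argument, and the counting logic are made up for illustration.

```python
import functools
from typing import Dict, Tuple

Metadata = Tuple[Tuple[str, str], ...]

# Counts how often each stage body actually executes.
calls: Dict[str, int] = {"load_metadata": 0, "create_summary": 0}


@functools.lru_cache(maxsize=None)
def load_metadata(source: str) -> Metadata:
    """Hypothetical first stage: pretend to load (frame, label) rows."""
    calls["load_metadata"] += 1
    return (("frame1", "car"), ("frame2", "truck"), ("frame3", "car"))


@functools.lru_cache(maxsize=None)
def create_summary(metadata: Metadata, label: str) -> int:
    """Hypothetical second stage: count the frames carrying a given label."""
    calls["create_summary"] += 1
    return sum(1 for _, row_label in metadata if row_label == label)


def run_script(label: str) -> int:
    """One full rerun: both stages are called, but cached ones are skipped."""
    metadata = load_metadata("labels.csv")
    return create_summary(metadata, label)


first = run_script("car")     # both stages compute
second = run_script("truck")  # only create_summary recomputes
third = run_script("car")     # everything is served from the cache
```

After three runs, `load_metadata` has executed once and `create_summary` twice. Feeding one cached function's output into another is what turns a plain script into a pipeline in which only the affected stages are recomputed.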
Streamlit is built for GPUs. It gives direct access to machine-level primitives such as TensorFlow and PyTorch, and complements these libraries. For example, in the following demo, Streamlit's cache stores the entire NVIDIA PGGAN. This lets the app perform near-instantaneous inference whenever the user updates the slider on the left.
This Streamlit app uses TL-GAN to demonstrate NVIDIA's PGGAN.
Streamlit is a free and open-source library, not a proprietary web app. You can deploy Streamlit apps on your own infrastructure without contacting us in advance. You can even run Streamlit locally on a laptop without an Internet connection. In addition, existing projects can adopt Streamlit incrementally.
Several ways to use Streamlit incrementally.
That is just the tip of the iceberg for Streamlit. Apps stay as readable as scripts even as they grow complex. Explaining how that works would involve the architecture's inner workings and features, which this article does not cover.
Diagram of the Streamlit components.
We are very happy to share Streamlit with the community and hope it helps you easily turn Python scripts into beautiful, practical machine learning apps.
Link to the original text:
 J. Redmon and A. Farhadi, YOLOv3: An Incremental Improvement (2018), arXiv.
 T. Karras, T. Aila, S. Laine, and J. Lehtinen, Progressive Growing of GANs for Improved Quality, Stability, and Variation (2018), ICLR.
 S. Guan, Controlled image synthesis and editing using a novel TL-GAN model (2018), Insight Data Science Blog.