• Target detection API


    By Ivan Ralašić. Compiled by: VK. Source: Analytics Vidhya. The TensorFlow Object Detection API (TF OD API) just keeps getting better. Google recently released a new version of the TF OD API that supports TensorFlow 2.x, a huge improvement we have been waiting for! Brief introduction: Recent improvements in object detection (OD) are […]

  • Detailed analysis of checkpoint mechanism, the cornerstone of Flink reliability


    Checkpoint introduction: The checkpoint mechanism is the cornerstone of Flink’s reliability. When an operator fails for some reason (such as an abnormal exit), it ensures that the Flink cluster can restore the whole application flow graph to a state from before the failure, so as to ensure the consistency of the application […]

  • It’s time to upgrade your Parquet: IOException: totalValueCount == 0


    Abstract: When using Spark SQL to perform an ETL task, an error is reported when reading a table: “IOException: totalValueCount == 0”, but no exception occurs when writing the table. This article is shared from the Huawei Cloud community, 《It’s time to upgrade your Parquet: IOException: totalValueCount == 0》, author: wzhfy. 1. Problem description: When using Spark […]

  • Big data development: persistence and caching of Spark RDDs


    1. RDD caching mechanism: cache, persist. One reason Spark is very fast is that RDDs support caching. Once a dataset has been cached successfully, subsequent operations that use it read it directly from the cache. Although cached data can also be lost, because of the dependencies between RDDs, if the cache data […]

  • Flume1.7 + Kafka + streaming integrated development steps


    1. Installation. Download address: apache-flume-1.6.0. After downloading, upload and unzip it in the /opt/ebohailife/ directory: [[email protected] ~]$ tar -zxvf apache-flume-1.7.0-bin.tar.gz Check whether the installation succeeded: /opt/ebohailife/flume/apache-flume-1.6.0-bin/bin/flume-ng version If the following information is printed, the installation succeeded: [[email protected] […]

  • Technology sharing | Multi-threaded parallel replay on the slave: MTS (1)


    Author: Gao Peng (Eight Monsters). This section covers the distribution call process; please refer to the link below: https://www.jianshu.com/p/870… 1. Overview: Unlike replay with a single SQL thread, MTS uses multiple worker threads, and the original SQL thread becomes the coordinator thread. The SQL coordinator thread also takes on the checkpoint work. We know […]

  • Flink’s three tools for internal exactly-once: state, state backend, and checkpoint


    Flink is a distributed stream processing engine, and one characteristic of stream processing is that it runs 24/7. So how do we keep a Flink job running continuously? Internally, Flink stores application state in local memory or in an embedded KV database (RocksDB). Because of its distributed architecture, Flink needs to […]

  • Alliance+: seven tips your mini game needs to achieve rapid growth!


    After three years or so of development, the TikTok market has reached a relatively mature stage. On the one hand, this shows in the continuous improvement of existing teams’ R&D and traffic operation capabilities. On the other hand, the emergence of new platform channels brings the continuous opening and activation of […]

  • Spark Streaming learning notes


    Characteristics: Spark Streaming enables stream processing of real-time data, with good scalability, high throughput, and fault tolerance. Spark Streaming supports ingesting data from a variety of sources, such as Kafka, Flume, Twitter, ZeroMQ, Kinesis, and TCP sockets. It also provides high-level APIs to express complex processing algorithms, such as map, […]

  • YOLOv5 simple tutorial


    This library represents Ultralytics’ open-source research into future object detection methods. It incorporates best practices from the previous YOLO library, https://github.com/ultralytics/yolov3, learned from training thousands of models on custom datasets. All code and models are under active development and are subject to modification or deletion without prior notice. Use at your own risk. GPU speed measurement: […]

  • How to deploy PyTorch Lightning models to production


    A complete guide to serving PyTorch Lightning models at scale. Across machine learning, one major trend is the proliferation of projects that apply software engineering principles to machine learning. For example, Cortex reproduces the experience of deploying serverless functions, but for inference pipelines. Similarly, DVC implements modern version […]

  • Spark: checkpoint


    Previously, I recorded Flink’s checkpoint mechanism while learning Flink. Today, I will mainly record Spark’s checkpoint mechanism and walk through the overall process with the source code. Checkpoint: As a fault-tolerance mechanism, checkpointing is used in many scenarios. For example, in the previously recorded Flink, the checkpoint mechanism […]