Tag:hadoop

  • [Zhao Yuqiang] The new book “Big Data Principles and Practice” is on the market!

    Time:2022-11-27

    After nearly a year of waiting, the new book “Big Data Principles and Practice” is on the market! Sneak peek! […]

  • Oozie5.2.1 + Hadoop3 compilation

    Time:2022-11-26

    Compile Oozie 5.2.1 against Hadoop 3. System requirements: Java JDK 1.8+, Maven 3.0.1+, Hadoop 3.0.0+. Compile summary: git clone https://github.com/apache/oozie.git # If building against Hadoop 3, the profile hadoop-3 must be activated. The following properties should be specified when building the distribution: -DgenerateDocs : force generation of Oozie documentation; -DskipTests : skip tests; -Dvc.revision= : Specifies […]

  • (4) Demonstration of Flink CEP SQL greedy quantifiers

    Time:2022-11-25

    This article builds on the previous one, (3) Flink CEP SQL relaxed contiguity code demonstration, where we used the greedy quantifier + (match 1 or more rows). Here we demonstrate the effect of the various greedy quantifiers: (1) Use the greedy quantifier * (match 0 or more rows) public static […]

  • Client cannot authenticate XXX:[TOKEN, KERBEROS]

    Time:2022-11-25

    Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate xxx:[TOKEN, KERBEROS] Security authentication failed. The possible causes are as follows: 1. Check whether the Kerberos address is reachable: // KDC's IP System.setProperty("java.security.krb5.kdc", "192.168.1.1"); // realm System.setProperty("java.security.krb5.realm", "XXX"); 2. The configuration file was not read successfully: UserGroupInformation.loginUserFromKeytab(user name, path to keytab file); It is recommended […]

  • Can’t get Kerberos realm

    Time:2022-11-24

    When connecting to HDFS on a Hadoop cluster, keytab authentication is required, but an error is reported: Can’t get Kerberos realm. Solution: add two lines to the configuration code: System.setProperty("java.security.krb5.realm", "XXX.COM"); System.setProperty("java.security.krb5.kdc", "XXX.COM"); XXX.COM is taken from the krb5.conf configuration file.

  • Graphical Big Data | Detailed Explanation of Distributed Platform Hadoop and Map-reduce

    Time:2022-11-23

    author: Han [email protected] address: http://www.showmeai.tech/tutorials/84 Address of this article: http://www.showmeai.tech/article-detail/168 Disclaimer: All rights reserved; please contact the platform and the author before reprinting, and indicate the source. 1. Hadoop quick start 1) Introduction to Hadoop: Hadoop is an open source distributed computing platform under the Apache Software Foundation, which provides users with a distributed infrastructure with transparent details of the underlying […]

  • Graphical big data | Practical case – Hadoop system construction and environment configuration

    Time:2022-11-22

    author: Han [email protected] address: http://www.showmeai.tech/tutorials/84 Address of this article: http://www.showmeai.tech/article-detail/169 Disclaimer: All rights reserved; please contact the platform and the author before reprinting, and indicate the source. 1. Introduction: In this tutorial, ShowMeAI explains the installation and environment configuration of Hadoop in detail. For the basics of Hadoop and MapReduce, you can review ShowMeAI's Detailed explanation […]

  • CDH6 offline installation

    Time:2022-11-21

    1. Environment preparation 1. Introduction to CM: Cloudera Manager is a tool for automated cluster installation, centralized management, cluster monitoring, and alerting. It shortens cluster installation from a few days to a few hours and reduces the number of operations and maintenance staff from dozens to a few. Greatly improve […]

  • Hadoop operation and maintenance toolbox: HDFS heterogeneous storage

    Time:2022-11-20

    Heterogeneous storage mainly addresses placing different data on different types of disks to get the best performance. The storage types and storage policies of Hadoop are: 1. Check which storage policies are currently available: [[email protected] hadoop-3.1.3]$ hdfs storagepolicies -listPolicies 2. Set the specified storage policy for the specified path (data storage directory) […]

  • How to access the web UI of services such as HDFS/YARN/HIVESERVER2 after enabling Kerberos in CDH/CDP

    Time:2022-11-20

    In big data platforms such as CDH/CDP, how do you access the web UI of services such as HDFS/YARN/HIVESERVER2 after Kerberos security is enabled? Let’s take a look. Problem: In big data platforms such as CDH/CDP, […]

  • (6) Flink CEP SQL: simulating a risk-control warning for account logins within a short period of time

    Time:2022-11-19

    In this article, we simulate a real risk identification scenario: possible account hacking on the XX platform. Technical implementation plan: (1) Send the login log of the xxx platform user to Kafka (a socket is used for the code demonstration in this article); (2) define the risk control identification rules in the Flink CEP […]

  • Hadoop operation and maintenance toolbox: HDFS cluster expansion

    Time:2022-11-19

    1. Add a whitelist. Whitelist: host IP addresses in the whitelist may be used to store data. In enterprise deployments, a whitelist is configured to prevent malicious access by attackers. The steps to configure the whitelist are as follows: 1) Create whitelist and blacklist files in the /opt/module/hadoop-3.1.3/etc/hadoop directory of the NameNode […]
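The two Kerberos entries above (“Client cannot authenticate XXX:[TOKEN, KERBEROS]” and “Can’t get Kerberos realm”) both set JVM system properties before logging in. A minimal sketch of that step, using only the JDK, might look like the following. The class name, method name, and the values EXAMPLE.COM and kdc.example.com are hypothetical placeholders, not taken from the posts; the real values would come from your own krb5.conf. The JDK expects java.security.krb5.realm and java.security.krb5.kdc to be set together, so the helper sets both.

```java
public class Krb5SystemProps {
    // Hypothetical placeholder values; replace with the realm and KDC host
    // listed in your own krb5.conf.
    static final String REALM = "EXAMPLE.COM";
    static final String KDC = "kdc.example.com";

    /** Sets the realm and KDC together, since the JDK expects both or neither. */
    public static void configure(String realm, String kdc) {
        System.setProperty("java.security.krb5.realm", realm);
        System.setProperty("java.security.krb5.kdc", kdc);
    }

    public static void main(String[] args) {
        configure(REALM, KDC);
        // Read the values back to confirm they were applied before any login call.
        System.out.println("realm = " + System.getProperty("java.security.krb5.realm"));
        System.out.println("kdc   = " + System.getProperty("java.security.krb5.kdc"));
    }
}
```

These properties need to be set before the first Kerberos login in the process, such as the UserGroupInformation.loginUserFromKeytab call quoted in the excerpt. That call requires the Hadoop client libraries, so it is left out to keep the sketch self-contained.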