Analysis of Hive modeling


Modeling analysis

  • The discussion is based on the visitor system case

Purpose of modeling analysis

  • Determine which layers, tables, and fields the data warehouse as a whole needs
  • ODS layer: source data layer

    • Ingests the source data and keeps the same granularity as the source
  • DWD: detail layer

    • Tasks:
    • 1. Cleaning

      • Incomplete data
      • Expired or invalid data
    • 2. Conversion

      • create_time -> mm/dd/yyyy
      • or a timestamp
    • 3. Dimension degeneration: where appropriate, fold dimension attributes into the detail table to reduce table joins
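
The three DWD tasks above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the visitor system actually implements it: the row layout, the `expired` flag, the `area_id` field, and the `dim_area` lookup are all hypothetical names assumed for the example. In Hive the same steps would normally be written as a query from the ODS table into the DWD table.

```python
from datetime import datetime

# Hypothetical ODS rows; every field name here is an assumption for illustration.
ods_rows = [
    {"id": 1, "create_time": "2021-03-05 10:15:00", "area_id": 10, "expired": False},
    {"id": 2, "create_time": None, "area_id": 10, "expired": False},  # incomplete
    {"id": 3, "create_time": "2021-03-06 08:00:00", "area_id": 20, "expired": True},  # expired
]

# Hypothetical dimension table: area_id -> area name.
dim_area = {10: "Beijing", 20: "Shanghai"}


def to_dwd(rows, areas):
    """Clean, convert, and denormalize ODS rows into DWD rows."""
    out = []
    for row in rows:
        # 1. Cleaning: skip incomplete or expired records.
        if row["create_time"] is None or row["expired"]:
            continue
        # 2. Conversion: parse create_time and emit an mm/dd/yyyy date string.
        ts = datetime.strptime(row["create_time"], "%Y-%m-%d %H:%M:%S")
        out.append({
            "id": row["id"],
            "create_date": ts.strftime("%m/%d/%Y"),
            # 3. Dimension degeneration: copy the area name into the row
            #    so later layers do not need to join the dimension table.
            "area_name": areas.get(row["area_id"], "unknown"),
        })
    return out


dwd_rows = to_dwd(ods_rows, dim_area)
```

Only row 1 survives the cleaning step; it comes out with `create_date` set to `03/05/2021` and `area_name` set to `Beijing`.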
  • DWM: middle layer

    • For example, aggregate the records by day first, so that monthly aggregates can later be built from the daily results more conveniently
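
As a rough sketch of why the daily pre-aggregation helps, the example below counts visits per day and then derives monthly counts from the daily results alone, without rereading the detail rows. The record layout and the function names are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# Hypothetical DWD visit records: (visit date as yyyy-mm-dd, visitor id).
dwd_visits = [
    ("2021-03-05", "a"),
    ("2021-03-05", "b"),
    ("2021-03-06", "a"),
    ("2021-04-01", "c"),
]


def daily_counts(visits):
    """DWM-style step: number of visits per day."""
    counts = defaultdict(int)
    for day, _visitor in visits:
        counts[day] += 1
    return dict(counts)


def monthly_counts(daily):
    """DWS-style step: roll daily counts up to months (yyyy-mm)."""
    counts = defaultdict(int)
    for day, n in daily.items():
        counts[day[:7]] += n
    return dict(counts)


dwm_daily = daily_counts(dwd_visits)
dws_monthly = monthly_counts(dwm_daily)
```

This roll-up works because visit counts are additive. A distinct-visitor count is not: visitor `a` appears on two days in March, so summing daily distinct counts would count that visitor twice.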
  • DWS: business layer

    • Fine-grained aggregate statistics
  • App: application layer

    • Further analyzes the fine-grained statistical results
    • This layer can be omitted
  • DIM layer: dimension tables