The XML sitemap problem in Spartacus

Time:2022-5-26

In SAP Commerce, you can use a cronjob to generate XML sitemaps. This cronjob collects all the pages of the current site and builds media files containing the URL of each page; a separate file is created for each page type. Later, when you use the old Accelerator storefront (the yacceleratorstorefront extension, configured with a web module) and access the /yacceleratorstorefront/sitemap.xml URL, the sitemap index is displayed, which contains references to all the partial sitemaps.

Site maps allow site administrators to notify search engines of the pages on their site that are available for indexing. Accelerator supports different page types (such as product pages and category pages) and site maps in different languages and currencies.

In its simplest form, a sitemap is an XML file that lists the URLs of a site together with metadata about each URL, so that search engines can index the site more intelligently. Examples of metadata include when the URL was last updated, how often it changes, and how important it is relative to the other URLs of the site.
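As a minimal sketch of that file format, the following TypeScript builds a urlset document with the three optional metadata fields defined by the sitemaps.org protocol (lastmod, changefreq, priority). The names SitemapEntry, escapeXml and buildUrlSet are made up for this illustration; they are not part of SAP Commerce or Spartacus.

```typescript
// One entry in a sitemap, following the sitemaps.org 0.9 protocol.
interface SitemapEntry {
  loc: string; // absolute URL of the page
  lastmod?: string; // W3C datetime, for example "2022-05-26"
  changefreq?: "always" | "hourly" | "daily" | "weekly" | "monthly" | "yearly" | "never";
  priority?: number; // 0.0 to 1.0, relative to other URLs of the same site
}

// Escape the characters that must not appear verbatim in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Serialize the entries into a <urlset> document.
function buildUrlSet(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lines = [`    <loc>${escapeXml(entry.loc)}</loc>`];
    if (entry.lastmod !== undefined) {
      lines.push(`    <lastmod>${escapeXml(entry.lastmod)}</lastmod>`);
    }
    if (entry.changefreq !== undefined) {
      lines.push(`    <changefreq>${entry.changefreq}</changefreq>`);
    }
    if (entry.priority !== undefined) {
      lines.push(`    <priority>${entry.priority.toFixed(1)}</priority>`);
    }
    return `  <url>\n${lines.join("\n")}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```

Only loc is mandatory in the protocol, which is why the other fields are optional here and are omitted from the output when not set.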

In Accelerator, the sitemap is published at the following URL:

http://electronics.local:9001…

Here is an example of a sitemap index:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>http://electronics.local:9001/medias/Homepage-ja-JPY-3422021852412885281.xml?context=bWFzdGVyfHJvb3R8MzQwfHRleHQveG1sfGgyNi9oNTUvODc5NzA3NjQyMjY4Ni54bWx8ODhkMDBhODYyMGU5OGY4YTRlMGVjNTE1MmVkMTgxOWYxNDBkOTU0MjU0MjRlZmZhODA5ZWNkY2Q2YzJlZmFhYg</loc>
    </sitemap>
</sitemapindex>
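For comparison, an index like this can be produced from a list of partial sitemap URLs with very little code. The sketch below is only an illustration in TypeScript; it is not how the Accelerator cronjob is implemented, and the function name and the shortened media file names are hypothetical.

```typescript
// Build a sitemap index that points at the partial sitemap files.
function buildSitemapIndex(sitemapUrls: string[]): string {
  // Escape characters that must not appear verbatim in XML text.
  const esc = (s: string): string =>
    s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

  const items = sitemapUrls.map(
    (url) => `    <sitemap>\n        <loc>${esc(url)}</loc>\n    </sitemap>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    "</sitemapindex>",
  ].join("\n");
}

// Hypothetical partial sitemaps, one per page type, language and currency.
const sitemapIndex = buildSitemapIndex([
  "http://electronics.local:9001/medias/Homepage-ja-JPY.xml",
  "http://electronics.local:9001/medias/Product-ja-JPY.xml",
]);
console.log(sitemapIndex);
```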

The code responsible for generating the sitemap in Accelerator is in SiteMapController.java.

Customers often need a mechanism to automatically discover all Spartacus pages (URLs), either to generate sitemaps or to pre-render the pages with SSR / SSG. This task cuts across components and modules, and requires combining many pieces of information and mechanisms:

  • Collect all categories defined in Backoffice
  • Collect all products defined in Backoffice
  • Collect all static Angular routes defined in Spartacus / the client application
  • Use the Spartacus routing configuration to shape the URLs of specific PDP, PLP and content pages
  • and more
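The URL-shaping step in the list above can be sketched as follows in TypeScript. The route patterns, the catalog data and the shapeUrl helper are all assumptions made for the illustration; a real project would read the patterns from its own Spartacus routing configuration and fetch the products and categories from the backend.

```typescript
// Hypothetical route patterns keyed by page type (illustrative values only).
const routePatterns = {
  product: "product/:productCode/:name",
  category: "category/:categoryCode",
};

// Replace each ":param" segment of a pattern with its URL-encoded value.
function shapeUrl(
  baseUrl: string,
  pattern: string,
  params: { [name: string]: string }
): string {
  const path = pattern
    .split("/")
    .map((segment) => {
      if (segment.charAt(0) !== ":") return segment; // static segment
      const name = segment.slice(1);
      const value = params[name];
      if (value === undefined) {
        throw new Error(`Missing route parameter "${name}"`);
      }
      return encodeURIComponent(value);
    })
    .join("/");
  // Avoid a double slash when the base URL already ends with one.
  return `${baseUrl.replace(/\/+$/, "")}/${path}`;
}

// Hypothetical catalog data standing in for what the backend would return.
const products = [{ productCode: "300938", name: "Photosmart E317 Digital Camera" }];
const categories = [{ categoryCode: "575" }];

const discoveredUrls: string[] = [
  ...products.map((p) => shapeUrl("http://electronics.local:9001", routePatterns.product, p)),
  ...categories.map((c) => shapeUrl("http://electronics.local:9001", routePatterns.category, c)),
];
console.log(discoveredUrls);
```

Keeping the patterns in one place means that a change to the routing configuration only has to be mirrored in a single object, rather than in every place that builds a URL.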

For some pages, you may want to generate canonical URL alternatives. (Note: Spartacus has a canonical URL feature, which places the link in the <head> of the current document; maybe it could be adapted in some way for sitemap generation / URL discovery…)

For product lists, collect all the facet combinations or search queries in the URLs you want indexed.
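Enumerating facet combinations is essentially a Cartesian product. The TypeScript sketch below picks at most one value per facet and turns each combination into a query parameter. The facet names and values are hypothetical, and the ":relevance:facet:value" query format is an assumption about how the storefront encodes facets, so check it against your own PLP URLs before relying on it.

```typescript
type FacetSelection = [string, string]; // [facet key, facet value]

// Hypothetical facets chosen for indexing; keys and values are illustrative.
const indexedFacets: { [facet: string]: string[] } = {
  brand: ["Sony", "Canon"],
  availableInStores: ["Tokyo"],
};

// Build every combination that picks at most one value per facet.
function facetCombinations(facets: { [facet: string]: string[] }): FacetSelection[][] {
  let combos: FacetSelection[][] = [[]];
  for (const key of Object.keys(facets)) {
    const extended: FacetSelection[][] = [];
    for (const combo of combos) {
      extended.push(combo); // this facet is not selected
      for (const value of facets[key]) {
        const pair: FacetSelection = [key, value];
        extended.push([...combo, pair]);
      }
    }
    combos = extended;
  }
  return combos;
}

// Turn each combination into a product list URL (assumed query format).
function facetUrls(categoryUrl: string, facets: { [facet: string]: string[] }): string[] {
  return facetCombinations(facets).map((combo) => {
    if (combo.length === 0) return categoryUrl; // no facet selected
    const query = ":relevance" + combo.map(([k, v]) => `:${k}:${v}`).join("");
    return `${categoryUrl}?query=${encodeURIComponent(query)}`;
  });
}

const plpUrls = facetUrls("http://electronics.local:9001/category/575", indexedFacets);
console.log(plpUrls.length);
```

The number of URLs grows multiplicatively with every facet added, so in practice the list of facets worth indexing usually has to be kept short.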

For content pages, you need to know all the Spartacus CMS-driven child routes. For example, the backend has a single content page with the page label /store-finder, but in Spartacus the CMS-driven child routes on top of that single content page are: /store-finder (parent), /store-finder/view-all, /store-finder/country/:country, /store-finder/country/:country/region/:region. For specific features such as the store finder, you also need to collect all possible and valid combinations of the dynamic URL parameters (such as :country and :region).
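Expanding the store finder routes into concrete paths could look roughly like this in TypeScript. The country and region codes are invented sample data, and expandStoreFinderPaths is a hypothetical helper; in a real setup the valid combinations would come from the store data in the backend.

```typescript
// Store finder routes as listed above, without the leading slash.
const storeFinderRoutes = [
  "store-finder",
  "store-finder/view-all",
  "store-finder/country/:country",
  "store-finder/country/:country/region/:region",
];

// Hypothetical data: countries that have stores, with their regions (if any).
const regionsByCountry: { [country: string]: string[] } = {
  JP: ["JP-13", "JP-27"],
  DE: [],
};

// Replace :country and :region with every valid combination of values.
function expandStoreFinderPaths(
  routes: string[],
  regions: { [country: string]: string[] }
): string[] {
  const paths: string[] = [];
  for (const route of routes) {
    const needsCountry = route.indexOf(":country") !== -1;
    const needsRegion = route.indexOf(":region") !== -1;
    if (!needsCountry) {
      paths.push(route); // static route, nothing to expand
      continue;
    }
    for (const country of Object.keys(regions)) {
      const withCountry = route.replace(":country", encodeURIComponent(country));
      if (!needsRegion) {
        paths.push(withCountry);
        continue;
      }
      // Only regions that belong to this country are valid combinations.
      for (const region of regions[country]) {
        paths.push(withCountry.replace(":region", encodeURIComponent(region)));
      }
    }
  }
  return paths;
}

const storeFinderPaths = expandStoreFinderPaths(storeFinderRoutes, regionsByCountry);
console.log(storeFinderPaths);
```

Nesting the region loop inside the country loop is what keeps invalid pairs, such as a Japanese region under Germany, out of the result.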

Due to the complexity (various mechanisms are involved) and the scale (a large number of products, categories, and so on), the process of discovering all Spartacus pages needs to be automated. To keep the sitemap up to date, customers should run the process regularly, at a frequency that matches how often content managers add new pages, products, categories and facets.