New wine in old bottles of strlen

Time:2020-10-11

Preface – overview of strlen

I happened to read through glibc's strlen.c, and I could not get it out of my head for a long time.

Programming is no joke: some of it is hard, and some of it is forced on us. Yet strlen still feels like something from school days, as fresh and attractive as ever~

/* Copyright (C) 1991-2020 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Written by Torbjorn Granlund ([email protected]),
   with help from Dan Sahlin ([email protected]);
   commentary by Jim Blandy ([email protected]).

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <https://www.gnu.org/licenses/>.  */

#include <string.h>
#include <stdlib.h>

#undef strlen

#ifndef STRLEN
# define STRLEN strlen
#endif

/* Return the length of the null-terminated string STR.  Scan for
   the null terminator quickly by testing four bytes at a time.  */
size_t
STRLEN (const char *str)
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */
  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & ~longword & himagic) != 0)
        {
          /* Which of the bytes was the zero?  If none of them were, it was
             a misfire; continue the search.  */

          const char *cp = (const char *) (longword_ptr - 1);

          if (cp[0] == 0)
            return cp - str;
          if (cp[1] == 0)
            return cp - str + 1;
          if (cp[2] == 0)
            return cp - str + 2;
          if (cp[3] == 0)
            return cp - str + 3;
          if (sizeof (longword) > 4)
            {
              if (cp[4] == 0)
                return cp - str + 4;
              if (cp[5] == 0)
                return cp - str + 5;
              if (cp[6] == 0)
                return cp - str + 6;
              if (cp[7] == 0)
                return cp - str + 7;
            }
        }
    }
}
libc_hidden_builtin_def (strlen)

 

Text – thinking and analysis

1. How many bytes is unsigned long int? 4 bytes or 8 bytes?

  unsigned long int longword, himagic, lomagic;

 

The size of long depends on the platform. On most Linux systems, sizeof (long) = 4 on x86 and sizeof (long) = 8 on x64.

On Windows, sizeof (long) = 4 on both x86 and x64 (as of May 28, 2020). The C standard only guarantees sizeof (long) >= sizeof (int).

The exact number of bytes is left to the implementation.
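To see which choice a given platform made, a small check like the following can help. This is only a sketch: print_integer_sizes is a hypothetical helper name, and the printed numbers depend on the compiler and ABI, so no particular output is claimed.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Compile-time checks of what the C standard promises about long.  */
_Static_assert (LONG_MAX >= INT_MAX, "long covers the range of int");
_Static_assert (LONG_MAX >= 2147483647L, "long has at least 32 value bits");

/* Hypothetical helper: print the sizes chosen by the current platform.  */
void
print_integer_sizes (void)
{
  printf ("sizeof (int)           = %zu\n", sizeof (int));
  printf ("sizeof (long)          = %zu\n", sizeof (long));
  printf ("sizeof (unsigned long) = %zu\n", sizeof (unsigned long));
  printf ("sizeof (void *)        = %zu\n", sizeof (void *));

  /* The relation quoted in the text, checked at run time.  */
  assert (sizeof (long) >= sizeof (int));
}
```

Calling print_integer_sizes () from main on 64-bit Linux and on 64-bit Windows should show the difference in sizeof (long) described above.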

 

2. ((unsigned long int) char_ptr & (sizeof (longword) - 1)): alignment?

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

 

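The purpose of this opening loop is to advance char_ptr one byte at a time until it is aligned on a sizeof (long) boundary.

This reflects the alignment requirements of most hardware as well as performance considerations (performance is the main factor). An aligned longword also never straddles a page boundary, so reading the whole longword that contains the terminator cannot touch an unmapped page.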
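The mask trick in the loop condition can be looked at in isolation. The sketch below uses a hypothetical helper, bytes_until_aligned, and assumes that sizeof (unsigned long) is a power of two (true on common platforms); it is not part of glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes must be stepped over before P
   lies on a sizeof (unsigned long) boundary.  Assumes that size is a
   power of two, so that (size - 1) works as a bit mask.  */
size_t
bytes_until_aligned (const void *p)
{
  uintptr_t addr = (uintptr_t) p;
  uintptr_t mask = sizeof (unsigned long) - 1;

  /* (addr & mask) is the offset of P inside its longword; it is zero
     exactly when P is already aligned.  */
  return (size_t) ((sizeof (unsigned long) - (addr & mask)) & mask);
}
```

For an aligned pointer the result is 0; for a pointer one byte past an aligned address it is sizeof (unsigned long) - 1, which is the largest number of iterations the opening loop of strlen can run.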

 

3. himagic = 0x80808080L; lomagic = 0x01010101L; what on earth?

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */
  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & ~longword & himagic) != 0)
    {

 

3.1 (((longword - lomagic) & ~longword & himagic) != 0)? What is this?

Maybe this is art. Whoever thought of this idea is a genius; it is so clever, ha ha. We will explain it in two small points.

Here I use a simple approach to lead you through the problem. The code above revolves around sizeof (unsigned long) being 4 bytes or 8 bytes.

Let us simplify: we work through the 1-byte case first and then reason by analogy

to understand the principle behind this formula (ˍ ˍ) ~

/**
 * himagic      : 1000 0000
 * lomagic      : 0000 0001
 * longword     : XXXX XXXX
 */
unsigned long himagic = 0x80L;
unsigned long lomagic = 0x01L;

unsigned long longword;

Now let us analyze the following formula in detail:

((longword - lomagic) & ~longword & himagic)

(& himagic) = (& 1000 0000) means that only the highest bit of the result matters.

We discuss longword in three cases:

longword     : 1XXX XXXX  128 <= x <= 255
longword     : 0XXX XXXX  0 < x < 128
longword     : 0000 0000  x = 0

Case 1: longword = 1XXX XXXX

Then ~longword = 0YYY YYYY, so ~longword & himagic = 0000 0000 and the whole expression is 0; there is no need to go further.

Case 2: longword = 0XXX XXXX and nonzero, that is, at least 1

Then (longword - lomagic) = 0ZZZ ZZZZ, which is >= 0 and < 127, because lomagic = 1.

So (longword - lomagic) & himagic = 0ZZZ ZZZZ & 1000 0000 = 0, and again the whole expression is 0.

Case 3: longword = 0000 0000

Then ~longword & himagic = 1111 1111 & 1000 0000 = 1000 0000.

Next look at (longword - lomagic) = (0000 0000 - 0000 0001). Unsigned subtraction is carried out as two's-complement addition:

0000 0000 - 0000 0001 = 0000 0000 + (~0000 0001 + 1) = 0000 0000 + 1111 1111 = 1111 1111

Therefore the final result is 1111 1111 & 1000 0000 = 1000 0000 > 0.

So the formula is nonzero exactly when the value is 0. For 2, 4 and 8 bytes the idea is the same, applied to every byte at once. With several bytes, a borrow out of a zero byte can spill into the byte above it, so the mask tells us reliably whether some byte is zero but not which one; that is why the code then checks the bytes one by one.
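The three cases above can also be checked by brute force. The following is a minimal sketch with hypothetical helper names (byte_is_zero, word_has_zero_byte, check_all_bytes); it uses fixed-width types from stdint.h instead of unsigned long so that the byte width is explicit.

```c
#include <assert.h>
#include <stdint.h>

/* One-byte version of the test; every intermediate result is cut back
   to 8 bits, as in the hand analysis.  */
int
byte_is_zero (uint8_t x)
{
  const uint8_t himagic = 0x80;
  const uint8_t lomagic = 0x01;
  return (uint8_t) ((uint8_t) (x - lomagic) & (uint8_t) ~x & himagic) != 0;
}

/* Four-byte version: nonzero exactly when some byte of W is zero.  */
int
word_has_zero_byte (uint32_t w)
{
  return ((w - UINT32_C (0x01010101)) & ~w & UINT32_C (0x80808080)) != 0;
}

/* Compare the formula with a plain (x == 0) for all 256 byte values.
   Returns 1 if they always agree, 0 otherwise.  */
int
check_all_bytes (void)
{
  for (unsigned int x = 0; x <= 0xFF; ++x)
    if (byte_is_zero ((uint8_t) x) != (x == 0))
      return 0;
  return 1;
}
```

For example, word_has_zero_byte (0x41424344) is 0 because no byte is zero, while word_has_zero_byte (0x41004344) is 1 because the second-highest byte is zero.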

 

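3.2 (sizeof (longword) > 4)? (sizeof (longword) > 8)? Why not handle this with macros?

Macros let multiple platforms share the same source code, but not the same binary. glibc is a very general project, so portability

probably weighs heavily here. Note also that sizeof is a compile-time constant, so a compiler can usually drop the branch that does not apply, and the plain if costs nothing at run time.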
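The same sizeof-based branch can be tried on its own. The sketch below uses a hypothetical helper, widen_magic, that repeats a 32-bit pattern to fill an unsigned long; it mirrors the two-step shift from the glibc code but is not taken from it.

```c
#include <assert.h>

/* Hypothetical helper: repeat a 32-bit magic pattern so that it fills
   an unsigned long of 4 or 8 bytes.  The test on sizeof is an ordinary
   if; since sizeof is a constant, the compiler is free to remove the
   branch that does not apply.  */
unsigned long
widen_magic (unsigned long magic32)
{
  unsigned long magic = magic32;

  if (sizeof (unsigned long) > 4)
    /* Shift in two steps so that no single shift count reaches the
       width of a 32-bit long.  */
    magic = ((magic << 16) << 16) | magic;

  /* Wider types are not handled, as in the original code.  */
  assert (sizeof (unsigned long) <= 8);
  return magic;
}
```

With an 8-byte long, widen_magic (0x01010101UL) gives 0x0101010101010101; with a 4-byte long it returns the argument unchanged.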

 

4. libc_hidden_builtin_def (strlen)~

To understand this, we need to bring in some outside information (the result differs with the compilation parameters; here only one branch is extracted)

// file : glibc-2.31/include/libc-symbols.h

libc_hidden_builtin_def (strlen)

#define libc_hidden_builtin_def(name) libc_hidden_def (name)

# define libc_hidden_def(name) hidden_def (name)

/* Define ALIASNAME as a strong alias for NAME.  */
# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
# define _strong_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((alias (#name))) \
    __attribute_copy__ (name);

/* For assembly, we need to do the opposite of what we do in C:
   in assembly gcc __REDIRECT stuff is not in place, so functions
   are defined by its normal name and we need to create the
   __GI_* alias to it, in C __REDIRECT causes the function definition
   to use __GI_* name and we need to add alias to the real name.
   There is no reason to use hidden_weak over hidden_def in assembly,
   but we provide it for consistency with the C usage.
   hidden_proto doesn't make sense for assembly but the equivalent
   is to call via the HIDDEN_JUMPTARGET macro instead of JUMPTARGET.  */
#  define hidden_def(name)    strong_alias (name, __GI_##name)

/* Undefine (also defined in libc-symbols.h).  */
#undef __attribute_copy__
#if __GNUC_PREREQ (9, 0)
/* Copies attributes from the declaration or type referenced by
   the argument.  */
# define __attribute_copy__(arg) __attribute__ ((__copy__ (arg)))
#else
# define __attribute_copy__(arg)
#endif

 

Expanding step by step with the macro definitions above:

libc_hidden_builtin_def (strlen)
|

hidden_def (strlen)
|

strong_alias (strlen, __GI_strlen)
|

_strong_alias (strlen, __GI_strlen)
|

extern __typeof (strlen) __GI_strlen __attribute__ ((alias ("strlen"))) __attribute_copy__ (strlen);
|
extern __typeof (strlen) __GI_strlen __attribute__ ((alias ("strlen"))) __attribute__ ((__copy__ (strlen)));

 

This relies on GNU C extension syntax:

  __typeof (arg): yields the declared type of arg
  __attribute__ ((__copy__ (arg))): copies the attributes of arg; available in GCC 9 and above
  alias_name __attribute__ ((alias (name))): declares alias_name as an alias for name
 
Summary: libc_hidden_builtin_def (strlen) defines a second symbol, __GI_strlen, as an alias for the strlen symbol.
(For more background, see the comment on strong_alias.)
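The aliasing mechanism can be tried outside glibc. The sketch below assumes GCC or Clang on an ELF platform such as Linux, since the alias attribute is not available everywhere. All names in it (my_strlen, my_hidden_def, my_strong_alias, my_strong_alias_1, my_GI_my_strlen, check_alias) are hypothetical stand-ins for the glibc ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for strlen: a plain byte-by-byte loop.  */
size_t
my_strlen (const char *s)
{
  const char *p = s;
  while (*p != '\0')
    ++p;
  return (size_t) (p - s);
}

/* Simplified copies of the macro chain shown above.  */
#define my_strong_alias(name, aliasname) my_strong_alias_1 (name, aliasname)
#define my_strong_alias_1(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
#define my_hidden_def(name) my_strong_alias (name, my_GI_##name)

/* Expands to:
   extern __typeof (my_strlen) my_GI_my_strlen
     __attribute__ ((alias ("my_strlen")));  */
my_hidden_def (my_strlen)

/* Both names refer to the same function body.  */
void
check_alias (void)
{
  assert (my_GI_my_strlen ("abc") == my_strlen ("abc"));
}
```

After this, my_GI_my_strlen ("hello") and my_strlen ("hello") call the same code, just as __GI_strlen and strlen do inside glibc.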
 
 
There are many production implementations of strlen; we chose the generic glibc version to think through and analyze. If you are interested, look into more of them.
Take it easy~ happiness matters most~ work hard, Rui Chenggang~ ha ha ha

 

Postscript – outlook and life

Mistakes are inevitable; corrections and discussion are welcome~