Rust practice – using socket networking API (I)

Time:2021-9-23

Although the standard library already wraps basic APIs such as TcpListener and TcpStream, as Rust enthusiasts we can still look at what lies underneath. This article assumes that you have some knowledge of Rust and the Linux operating system.

On Linux, Rust links against the system libc and some other libraries by default, which means you can call functions in libc directly. For example, you can use gethostname to get the “hostname” of your computer:

use std::os::raw::c_char;
use std::ffi::CStr;

extern {
    pub fn gethostname(name: *mut c_char, len: usize) -> i32;
}

fn main() {
    let len = 255;
    let mut buf = Vec::<u8>::with_capacity(len);
    let ptr = buf.as_mut_ptr() as *mut c_char;

    unsafe {
        gethostname(ptr, len);
        println!("{:?}", CStr::from_ptr(ptr));
    }
}

Let me explain the code above.

extern introduces an “external block”, which declares symbols from external, non-Rust libraries. To use functions from outside Rust, such as those in libc, we declare them in an extern block and can then call them like local functions; the compiler takes care of the calling convention for us. Very convenient, isn’t it? However, calling an external function is unsafe: the compiler cannot verify the declaration or the behavior of the foreign code, so the call must be placed in an unsafe block.
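As a further illustration of an external block, here is a minimal sketch that declares libc's strlen and hides the unsafe call behind a safe function. The wrapper name c_strlen is hypothetical and not part of the original article; the sketch assumes the program is linked against libc, as it is by default on Linux.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Declare the C function. The "C" ABI is written out explicitly here.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical safe wrapper: the length of `s` as measured by C's strlen.
/// Returns None if `s` contains an interior NUL byte, because such a string
/// cannot be represented as a C string.
fn c_strlen(s: &str) -> Option<usize> {
    let c_string = CString::new(s).ok()?;
    // Safety: c_string is a valid NUL-terminated buffer that outlives the call.
    Some(unsafe { strlen(c_string.as_ptr()) })
}
```

Calling c_strlen("hello") goes through the external block; the caller never has to write unsafe, because the wrapper upholds the conditions that strlen needs.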

If the external function is variadic, it can be declared as follows:

extern {
    fn foo(x: i32, ...);
}

However, functions defined in Rust do not support variadic parameters at present.
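To show a variadic declaration in use, the following sketch calls libc's snprintf from Rust. The helper name format_pair is hypothetical, and the choice of snprintf is an assumption made for illustration; the original article only shows the declaration syntax.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

extern "C" {
    // The trailing `...` marks the function as variadic.
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: formats two integers as "a-b" using C's snprintf.
fn format_pair(a: i32, b: i32) -> String {
    // 32 bytes is enough for two 32-bit integers, a dash and the NUL terminator.
    let mut buf = [0 as c_char; 32];
    let fmt = b"%d-%d\0";
    unsafe {
        // Safety: buf is writable for buf.len() bytes, fmt is NUL-terminated,
        // and the two variadic arguments match the two %d conversions.
        snprintf(
            buf.as_mut_ptr(),
            buf.len(),
            fmt.as_ptr() as *const c_char,
            a as c_int,
            b as c_int,
        );
        CStr::from_ptr(buf.as_ptr()).to_string_lossy().into_owned()
    }
}
```

The compiler cannot check that the variadic arguments agree with the format string, which is one more reason the call has to be unsafe.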

Strictly speaking, this should be written extern "C" { .. }; because "C" is the default ABI, we can omit it. There are other ABI strings as well, but they are not used here, so for now you can look them up in the Rust documentation.

Let’s talk about types. The prototype of the “gethostname” function in the C header file is:

int gethostname(char *name, size_t len);

On 64-bit Linux, C's int corresponds to Rust's i32 and size_t corresponds to Rust's usize, but char in C and char in Rust are completely different: in C, char is always i8 or u8, whereas in Rust, char is a Unicode scalar value. You can also check the standard library documentation. As for pointers, a raw pointer in Rust is almost the same as a pointer in C: Rust's *mut corresponds to an ordinary C pointer, and *const corresponds to a C pointer to const. We therefore map the types one by one; the parameter names do not need to match.

pub fn gethostname(name: *mut i8, len: usize) -> i32;

Later, however, we use CStr::from_ptr() to convert the C string into a Rust string type. The definition of this function is:

pub unsafe fn from_ptr<'a>(ptr: *const c_char) -> &'a CStr

To make it “look better”, I wrote c_char; however, on this platform c_char is just an alias for i8, so writing i8 works too.

type c_char = i8;

You can see this in the standard library documentation for std::os::raw::c_char.

However, if you care about cross-platform support, you may need to change i32 to std::os::raw::c_int, because C's int does not correspond to Rust's i32 on every platform. That said, even if the types do not match one to one, the code can still work to some extent, as long as no value goes out of range. Like this:
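The difference between the two char types is easy to observe through their sizes. The sketch below is only an illustration; the function names are hypothetical, and the stated sizes are what mainstream platforms such as x86-64 Linux report.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int};

/// Returns the sizes in bytes of Rust's char, C's char and C's int.
fn char_and_int_sizes() -> (usize, usize, usize) {
    (size_of::<char>(), size_of::<c_char>(), size_of::<c_int>())
}

/// Converts one C char to a byte. The cast works whether c_char is an
/// alias for i8 or for u8 on the current target.
fn c_char_to_byte(c: c_char) -> u8 {
    c as u8
}
```

A Rust char takes 4 bytes because it holds a full Unicode scalar value, while a C char is a single byte. Going through c_char instead of writing i8 keeps the code compiling on targets where C's char is unsigned.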

use std::os::raw::c_char;
use std::ffi::CStr;

extern {
    pub fn gethostname(name: *mut c_char, len: u16) -> u16;
}

fn main() {
    let len = 255;
    let mut buf = Vec::<u8>::with_capacity(len);
    let ptr = buf.as_mut_ptr() as *mut c_char;

    unsafe {
        gethostname(ptr, len as u16);
        println!("{:?}", CStr::from_ptr(ptr));
    }
}

I mapped both size_t and int to u16. This code compiles and prints your hostname correctly, but that is only because of how arguments happen to be passed on this platform; a declaration that does not match the C prototype is not guaranteed to work. I suggest matching the types one to one to avoid unnecessary trouble. Of course, changing *mut c_char to *mut i32 also works; it is just a pointer anyway. You can try:

use std::os::raw::c_char;
use std::ffi::CStr;

extern {
    pub fn gethostname(name: *mut i32, len: u16) -> u16;
}

fn main() {
    let len = 255;
    let mut buf = Vec::<u8>::with_capacity(len);
    let ptr = buf.as_mut_ptr() as *mut i32;

    unsafe {
        gethostname(ptr, len as u16);
        println!("{:?}", CStr::from_ptr(ptr as *const i8));
    }
}

You can also change Vec::<u8> to Vec::<i32> and look at the result.

int gethostname(char *name, size_t len) takes a char array and its length, in other words a receive buffer and the maximum length of that buffer. I created a Vec<u8> with a capacity of 255 and converted its mutable pointer to a raw pointer. You can also create a u8 array of length 255; that works just as well:

    let len = 255;
    let mut buf = [0u8; 255];
    let ptr = buf.as_mut_ptr() as *mut i32;

    unsafe {
        gethostname(ptr, len as u16);
        println!("{:?}", CStr::from_ptr(ptr as *const i8));
    }

Why is this possible? Because the elements of a Rust slice or Vec are laid out in memory the same way as a C array. (Note that the relationship between a slice and an array in Rust is like the relationship between &str and str.) We can see the definitions of Vec and slice in the source code:

pub struct Vec<T> {
    buf: RawVec<T>,
    len: usize,
}

pub struct RawVec<T, A: Alloc = Global> {
    ptr: Unique<T>,
    cap: usize,
    a: A,
}

pub struct Unique<T: ?Sized> {
    pointer: *const T,
    _marker: PhantomData<T>,
}

struct FatPtr<T> {
    data: *const T,
    len: usize,
}

Vec is a structure with two fields, buf and len. len is the length of the Vec, and buf is another structure, RawVec, which has three fields. The third field, a, is the allocator, whose type is constrained by a trait; the default, Global, occupies no memory. cap is the capacity of the Vec, and ptr is yet another structure, Unique, whose pointer field is a raw pointer; _marker is a marker for the compiler and occupies no memory either. I will not discuss it for now; you can read the source file. A slice has a simpler structure: just a raw pointer and a length.

Although RawVec and Unique are not visible outside the standard library, we can still use certain “means” to get at the internal values: define a structure with the same memory layout as Vec and “force” a conversion.

#[derive(Debug)]
struct MyVec<T> {
    ptr: *mut T,
    cap: usize,
    len: usize
}

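I defined a structure called MyVec that ignores the two fields in Vec that occupy no memory. The two types have the same size, 24 bytes on 64-bit platforms (ptr occupies 8 bytes, and the other two fields 8 bytes each). Note that Rust does not guarantee the field order of Vec, so the matching layout is an observation about this compiler version rather than a promise. You can try: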

#[derive(Debug)]
struct MyVec<T> {
    ptr: *mut T,
    cap: usize,
    len: usize
}

println!("{:?}", std::mem::size_of::<Vec<u8>>());
println!("{:?}", std::mem::size_of::<MyVec<u8>>());

First I create a Vec<u8> and take a raw pointer to it, *const Vec<u8>; then I cast *const Vec<u8> to *const MyVec<u8> and dereference it to get a MyVec<u8>. However, dereferencing a raw pointer is unsafe, so be careful! You can also read the pointer documentation in the standard library.

fn main() {
    let vec = Vec::<u8>::with_capacity(255);

    println!("vec ptr: {:?}", vec.as_ptr());

    #[derive(Debug)]
    struct MyVec<T> {
        ptr: *mut T,
        cap: usize,
        len: usize
    }

    let ptr: *const Vec<u8> = &vec;

    let my_vec_ptr: *const MyVec<u8> = ptr as _;

    unsafe {
        println!("{:?}", *my_vec_ptr);
    }
}

After compiling and running, you should see output like the following:

vec ptr: 0x557933de6b40
MyVec { ptr: 0x557933de6b40, cap: 255, len: 0 }

As you can see, calling vec.as_ptr() returns the raw pointer stored inside the Vec.

About std::mem::size_of: two types of equal size can also be converted with std::mem::transmute. This is almost equivalent to the indirect conversion through raw pointers above, but it adds a check: if the size_of the two types differs, the code does not compile. This function is unsafe.
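The same three values can also be obtained through the public API, without relying on the private layout of Vec. The sketch below is an alternative to the MyVec cast, not the article's method; the function name inspect_and_rebuild is hypothetical.

```rust
use std::mem::ManuallyDrop;

/// Takes a Vec apart into pointer, length and capacity, then rebuilds it.
/// Returns the pointer as an address so that it can be compared and printed.
fn inspect_and_rebuild(v: Vec<u8>) -> (usize, usize, usize, Vec<u8>) {
    // ManuallyDrop suppresses the destructor, so the buffer is not freed here.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // Safety: ptr, len and cap come from a live Vec<u8> whose destructor was
    // suppressed, and they are used exactly once to rebuild it.
    let rebuilt = unsafe { Vec::from_raw_parts(ptr, len, cap) };
    (ptr as usize, len, cap, rebuilt)
}
```

Vec::from_raw_parts is the documented inverse of reading as_mut_ptr, len and capacity, so this version keeps working even if the field order inside Vec changes.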
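A small sketch of transmute between two types of equal size follows. The function name bytes_to_u32 is hypothetical; in real code u32::from_ne_bytes does the same job without unsafe.

```rust
use std::mem;

/// Reinterprets four bytes as a u32 in native byte order.
fn bytes_to_u32(bytes: [u8; 4]) -> u32 {
    // Safety: [u8; 4] and u32 have the same size, and every bit pattern
    // is a valid u32.
    unsafe { mem::transmute::<[u8; 4], u32>(bytes) }
}

// A size mismatch is rejected at compile time. For example, the following
// line does not compile because [u8; 4] and u64 differ in size:
// let x: u64 = unsafe { mem::transmute::<[u8; 4], u64>([0; 4]) };
```

The size check is the only thing transmute verifies; whether the bits make sense as the target type remains the caller's responsibility.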

You can keep experimenting, for example by converting Vec<u8> to a usize array of length 3 (or shorter, or longer), like this:

fn main() {
    let vec = Vec::<u8>::with_capacity(255);

    println!("vec ptr: {:?}", vec.as_ptr());

    let ptr: *const Vec<u8> = &vec;

    unsafe {
        let aaa_ptr: *const [usize; 2] = ptr as _;
        println!("{:?}", (*aaa_ptr)[0] as *const u8);
    }
}

However, because of how Vec tracks its length and capacity, the following code has a problem:

fn main() {
    let len = 255;
    let mut buf = Vec::<u8>::with_capacity(len);
    let ptr = buf.as_mut_ptr() as *mut c_char;

    unsafe {
        gethostname(ptr, len);
        println!("{:?}", CStr::from_ptr(ptr));
    }

    println!("{:?}", buf);
}

Although the correct hostname is obtained, if you print buf afterwards you will find that buf is empty. This problem is left for you to explore.

As you can see, Rust has become “unsafe”, which inadvertently brings up another topic, “Meet Safe and Unsafe”. However, let us return to the subject and leave that topic for later.

Speaking of the socket API, it mainly includes functions related to TCP, UDP and SCTP, I/O multiplexing functions and advanced I/O functions. Most of these functions are not exposed by the Rust standard library. If the standard library cannot meet your needs, you can call the functions in libc directly. In fact, the networking part of the standard library is basically a wrapper around the related functions in libc.

Start with TCP. TCP socket programming mainly involves the functions socket, connect, bind, listen, accept, close, getsockname and getpeername. Let’s take a look at the definitions of these functions:
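One possible line of exploration, offered as a sketch rather than the author's answer: gethostname writes through the raw pointer into the spare capacity, but nothing tells the Vec that its length changed, so it still reports zero elements. The helper name hostname_bytes is hypothetical, and the return value is typed as c_int instead of i32.

```rust
use std::os::raw::{c_char, c_int};
use std::ptr;

extern "C" {
    fn gethostname(name: *mut c_char, len: usize) -> c_int;
}

/// Hypothetical helper: returns the hostname bytes, or None if the call fails.
fn hostname_bytes() -> Option<Vec<u8>> {
    let cap = 256;
    let mut buf = Vec::<u8>::with_capacity(cap);
    unsafe {
        // Zero the spare capacity so that every byte read below is initialized.
        ptr::write_bytes(buf.as_mut_ptr(), 0, cap);
        if gethostname(buf.as_mut_ptr() as *mut c_char, cap) != 0 {
            return None;
        }
        // C wrote through the raw pointer; the Vec still reports a length of 0.
        debug_assert!(buf.is_empty());
        // Count the bytes before the first NUL (or the whole buffer if the
        // name was truncated and no NUL is present).
        let mut n = 0;
        while n < cap && *buf.as_ptr().add(n) != 0 {
            n += 1;
        }
        // Tell the Vec how many initialized bytes it now holds.
        buf.set_len(n);
    }
    Some(buf)
}
```

After set_len, printing the Vec shows the hostname bytes instead of an empty list.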

// The socket function specifies the desired communication protocol type and returns a socket descriptor
int socket(int family, int type, int protocol); // Returns a non-negative descriptor on success, -1 on error
// family is the protocol family used by the socket. For TCP it is usually `AF_INET` or `AF_INET6`, meaning IPv4 and IPv6
// type is the type of socket to create. TCP is a byte-stream socket, so it is set to `SOCK_STREAM` here; other values are
// `SOCK_DGRAM` for UDP and `SOCK_SEQPACKET` for SCTP
// protocol is the protocol ID. It can be set to 0 to let the system choose the default. Possible values are `IPPROTO_TCP`, `IPPROTO_UDP`, `IPPROTO_SCTP`

// The connect function is used by the client to connect to a TCP server
int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen); // Returns 0 on success, -1 on error
// sockfd is the socket descriptor returned by the socket function. The second and third parameters are a pointer to the socket address structure and the size of that structure

// The bind function assigns a local protocol address to a socket
int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen); // Returns 0 on success, -1 on error
// The second and third parameters are a pointer to the protocol-specific address structure and the size of that structure

// The listen function converts an unconnected socket into a passive socket, telling the kernel to accept connection requests directed to it
int listen(int sockfd, int backlog); // Returns 0 on success, -1 on error
// The second parameter specifies the maximum number of connections the kernel should queue for this socket

// The accept function is called by a TCP server to return the next completed connection from the head of the completed-connection queue
int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen); // Returns a non-negative descriptor on success, -1 on error
// The second and third parameters return the protocol address of the client and the size of that address

// close closes the socket and terminates the TCP connection
int close(int sockfd); // Returns 0 on success, -1 on error

// The getsockname and getpeername functions return the local and the peer protocol address associated with a socket
int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen); // Returns 0 on success, -1 on error
int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen); // Returns 0 on success, -1 on error

There is also a pair of common functions, read and write, used to read and write data. In addition, there are three pairs of advanced I/O functions, recv/send, readv/writev and recvmsg/sendmsg, which can be added when needed.

ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);

In addition to functions, there are several constants and the sockaddr structure. The constants need to be defined on the Rust side; only the required ones are defined here:

const AF_INET: i32 = 2;
const AF_INET6: i32 = 10;
const SOCK_STREAM: i32 = 1;
const IPPROTO_TCP: i32 = 6;

Besides sockaddr, there are several related structures. Their definitions in C are:

struct sockaddr
{
    unsigned short int sa_family;    // Address family
    char               sa_data[14];  // Contains the destination address and port information of the socket
};

struct sockaddr_in
{
    sa_family_t       sin_family;
    uint16_t          sin_port;
    struct in_addr    sin_addr;
    char              sin_zero[8];
};

struct in_addr
{
    in_addr_t  s_addr;
};

struct sockaddr_in6
{
    sa_family_t       sin6_family;
    in_port_t         sin6_port;
    uint32_t          sin6_flowinfo;
    struct in6_addr   sin6_addr;
    uint32_t          sin6_scope_id;
};

struct in6_addr
{
    uint8_t           s6_addr[16];
};

struct sockaddr_storage {
    sa_family_t       ss_family;     // address family

    // all this is padding, implementation specific, ignore it:
    char              __ss_pad1[_SS_PAD1SIZE];
    int64_t           __ss_align;
    char              __ss_pad2[_SS_PAD2SIZE];
};

Then, you need to define structures with the same layout in Rust:

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr {
    pub sa_family: u16,
    pub sa_data: [c_char; 14],
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr_in {
    pub sin_family: u16,
    pub sin_port: u16,
    pub sin_addr: in_addr,
    pub sin_zero: [u8; 8],
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct in_addr {
    pub s_addr: u32,
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr_in6 {
    pub sin6_family: u16,
    pub sin6_port: u16,
    pub sin6_flowinfo: u32,
    pub sin6_addr: in6_addr,
    pub sin6_scope_id: u32,
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct in6_addr {
    pub s6_addr: [u8; 16],
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr_storage {
    pub ss_family: u16,
    _unused: [u8; 126]
}

You need to add the #[repr(C)] attribute before each structure to ensure that its memory layout is consistent with C, because Rust's default struct layout rules may differ from C's. #[derive(Debug, Clone, Copy)] is not required. As for the last structure, sockaddr_storage, I am quite confused about it too; I do not know how to define it properly in Rust, but I know it takes 128 bytes, so I defined a u8 array of length 126 to pad it to 128 bytes.
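The sizes of these structures can be checked with std::mem::size_of. The sketch below repeats the definitions in shortened form and reports their sizes; the function name layout_report is hypothetical, and the numbers assume the Linux definitions used in this article.

```rust
use std::mem::size_of;
use std::os::raw::c_char;

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr { sa_family: u16, sa_data: [c_char; 14] }

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct in_addr { s_addr: u32 }

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr_in { sin_family: u16, sin_port: u16, sin_addr: in_addr, sin_zero: [u8; 8] }

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct in6_addr { s6_addr: [u8; 16] }

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr_in6 {
    sin6_family: u16,
    sin6_port: u16,
    sin6_flowinfo: u32,
    sin6_addr: in6_addr,
    sin6_scope_id: u32,
}

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr_storage { ss_family: u16, _unused: [u8; 126] }

/// Returns each structure's name together with its size in bytes.
fn layout_report() -> [(&'static str, usize); 4] {
    [
        ("sockaddr", size_of::<sockaddr>()),
        ("sockaddr_in", size_of::<sockaddr_in>()),
        ("sockaddr_in6", size_of::<sockaddr_in6>()),
        ("sockaddr_storage", size_of::<sockaddr_storage>()),
    ]
}
```

With these definitions the report gives 16, 16, 28 and 128 bytes. Comparing the numbers with sizeof in a small C program is a quick way to catch a mistyped field.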

Next, continue to define those functions:

extern {
    pub fn socket(family: i32, ty: i32, protocol: i32) -> i32;
    pub fn connect(sockfd: i32, servaddr: *const sockaddr, addrlen: u32) -> i32;
    pub fn bind(sockfd: i32, myaddr: *const sockaddr, addrlen: u32) -> i32;
    pub fn listen(sockfd: i32, backlog: i32) -> i32;
    pub fn accept(sockfd: i32, cliaddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    pub fn close(sockfd: i32) -> i32;
    pub fn getsockname(sockfd: i32, localaddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    pub fn getpeername(sockfd: i32, peeraddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    pub fn read(fd: i32, buf: *mut std::ffi::c_void, count: usize) -> isize;
    pub fn write(fd: i32, buf: *const std::ffi::c_void, count: usize) -> isize;
}

As for the buf parameter of read and write, whose C type is void *, you can use a pointer to std::ffi::c_void, or *mut u8 / *const u8, like the following:

pub fn read(fd: i32, buf: *mut u8, count: usize) -> isize;
pub fn write(fd: i32, buf: *const u8, count: usize) -> isize;

Or, since void * is a generic pointer type, you can also pass a pointer to some other type. You can try it later, but it may be a little dangerous.
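As an experiment with passing a pointer of another type, the following sketch sends a u32 through a pipe by handing write and read a pointer to the integer itself. The use of pipe is an assumption made to keep the example self-contained (the article does not use it), and send_u32_through_pipe is a hypothetical name.

```rust
use std::ffi::c_void;
use std::mem;
use std::os::raw::c_int;

extern "C" {
    // pipe is not part of the article; it only provides a pair of descriptors.
    fn pipe(fds: *mut c_int) -> c_int;
    fn read(fd: c_int, buf: *mut c_void, count: usize) -> isize;
    fn write(fd: c_int, buf: *const c_void, count: usize) -> isize;
    fn close(fd: c_int) -> c_int;
}

/// Writes `value` into a pipe and reads it back, returning None on failure.
fn send_u32_through_pipe(value: u32) -> Option<u32> {
    let mut fds = [0 as c_int; 2];
    let mut out: u32 = 0;
    let size = mem::size_of::<u32>();
    unsafe {
        if pipe(fds.as_mut_ptr()) != 0 {
            return None;
        }
        // A pointer to u32 is passed where C expects `const void *`.
        let written = write(fds[1], &value as *const u32 as *const c_void, size);
        close(fds[1]);
        let n = if written == size as isize {
            read(fds[0], &mut out as *mut u32 as *mut c_void, size)
        } else {
            -1
        };
        close(fds[0]);
        if n == size as isize { Some(out) } else { None }
    }
}
```

The danger mentioned above is visible in the count argument: nothing checks that it matches the size of the pointed-to value, so passing a larger count would read or write past the integer.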

Look at the current code:

use std::os::raw::c_char;
use std::ffi::c_void;

pub const AF_INET: i32 = 2;
pub const AF_INET6: i32 = 10;
pub const SOCK_STREAM: i32 = 1;
pub const IPPROTO_TCP: i32 = 6;

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr {
    pub sa_family: u16,
    pub sa_data: [c_char; 14],
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr_in {
    pub sin_family: u16,
    pub sin_port: u16,
    pub sin_addr: in_addr,
    pub sin_zero: [u8; 8],
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct in_addr {
    pub s_addr: u32,
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct sockaddr_in6 {
    pub sin6_family: u16,
    pub sin6_port: u16,
    pub sin6_flowinfo: u32,
    pub sin6_addr: in6_addr,
    pub sin6_scope_id: u32,
}

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct in6_addr {
    pub s6_addr: [u8; 16],
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct sockaddr_storage {
    pub ss_family: u16,
    _unused: [u8; 126]
}

extern {
    pub fn socket(family: i32, ty: i32, protocol: i32) -> i32;
    pub fn connect(sockfd: i32, servaddr: *const sockaddr, addrlen: u32) -> i32;
    pub fn bind(sockfd: i32, myaddr: *const sockaddr, addrlen: u32) -> i32;
    pub fn listen(sockfd: i32, backlog: i32) -> i32;
    pub fn accept(sockfd: i32, cliaddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    pub fn close(sockfd: i32) -> i32;
    pub fn getsockname(sockfd: i32, localaddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    pub fn getpeername(sockfd: i32, peeraddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    pub fn read(fd: i32, buf: *mut std::ffi::c_void, count: usize) -> isize;
    pub fn write(fd: i32, buf: *const std::ffi::c_void, count: usize) -> isize;
}
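Before the full program, a smaller sketch may help to see the declarations at work: it creates a socket, binds it to port 0 on 127.0.0.1 so that the kernel picks a free port, and reads the chosen port back with getsockname. The function name bind_ephemeral_port is hypothetical, and the structure layouts assume Linux, as in the rest of this article.

```rust
use std::io::Error;
use std::mem;

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr { sa_family: u16, sa_data: [u8; 14] }

#[repr(C)]
#[allow(non_camel_case_types)]
struct in_addr { s_addr: u32 }

#[repr(C)]
#[allow(non_camel_case_types)]
struct sockaddr_in { sin_family: u16, sin_port: u16, sin_addr: in_addr, sin_zero: [u8; 8] }

const AF_INET: i32 = 2;
const SOCK_STREAM: i32 = 1;

extern "C" {
    fn socket(family: i32, ty: i32, protocol: i32) -> i32;
    fn bind(sockfd: i32, myaddr: *const sockaddr, addrlen: u32) -> i32;
    fn getsockname(sockfd: i32, localaddr: *mut sockaddr, addrlen: *mut u32) -> i32;
    fn close(sockfd: i32) -> i32;
}

/// Binds a TCP socket to 127.0.0.1:0 and returns the port chosen by the kernel.
fn bind_ephemeral_port() -> Result<u16, Error> {
    unsafe {
        let fd = socket(AF_INET, SOCK_STREAM, 0);
        if fd < 0 {
            return Err(Error::last_os_error());
        }
        let mut addr = sockaddr_in {
            sin_family: AF_INET as u16,
            sin_port: 0, // port 0 asks the kernel to pick a free port
            sin_addr: in_addr { s_addr: u32::from_ne_bytes([127, 0, 0, 1]) },
            sin_zero: [0; 8],
        };
        let mut len = mem::size_of::<sockaddr_in>() as u32;
        let ok = bind(fd, &addr as *const sockaddr_in as *const sockaddr, len) == 0
            && getsockname(fd, &mut addr as *mut sockaddr_in as *mut sockaddr, &mut len) == 0;
        // Capture the OS error before close can overwrite it.
        let result = if ok {
            Ok(u16::from_be(addr.sin_port))
        } else {
            Err(Error::last_os_error())
        };
        close(fd);
        result
    }
}
```

The port comes back in network byte order, which is why it is passed through u16::from_be before being returned.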

Then, we can write a simple server and client program: the server listens to an address, the client connects to the server, then sends “Hello, server!” to the server, the server responds with “Hi, client!”, and the client disconnects after receiving it.

fn main() {
    use std::io::Error;
    use std::mem;
    use std::thread;
    use std::time::Duration;

    thread::spawn(|| {

        // server
        unsafe {
            let socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
            if socket < 0 {
                panic!("last OS error: {:?}", Error::last_os_error());
            }

            let servaddr = sockaddr_in {
                sin_family: AF_INET as u16,
                sin_port: 8080u16.to_be(),
                sin_addr: in_addr {
                    s_addr: u32::from_be_bytes([127, 0, 0, 1]).to_be()
                },
                sin_zero: mem::zeroed()
            };

            let result = bind(socket, &servaddr as *const sockaddr_in as *const sockaddr, mem::size_of_val(&servaddr) as u32);
            if result < 0 {
                println!("last OS error: {:?}", Error::last_os_error());
                close(socket);
                return; // do not listen on a socket that has been closed
            }

            listen(socket, 128);

            loop {
                let mut cliaddr: sockaddr_storage = mem::zeroed();
                let mut len = mem::size_of_val(&cliaddr) as u32;

                let client_socket = accept(socket, &mut cliaddr as *mut sockaddr_storage as *mut sockaddr, &mut len);
                if client_socket < 0 {
                    println!("last OS error: {:?}", Error::last_os_error());
                    break;
                }

                thread::spawn(move || {
                    loop {
                        let mut buf = [0u8; 64];
                        let n = read(client_socket, &mut buf as *mut _ as *mut c_void, buf.len());
                        if n <= 0 {
                            break;
                        }

                        println!("{:?}", String::from_utf8_lossy(&buf[0..n as usize]));

                        let msg = b"Hi, client!";
                        let n = write(client_socket, msg as *const _ as *const c_void, msg.len());
                        if n <= 0 {
                            break;
                        }
                    }

                    close(client_socket);
                });
            }

            close(socket);
        }

    });

    thread::sleep(Duration::from_secs(1));

    // client
    unsafe {
        let socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        if socket < 0 {
            panic!("last OS error: {:?}", Error::last_os_error());
        }

        let servaddr = sockaddr_in {
            sin_family: AF_INET as u16,
            sin_port: 8080u16.to_be(),
            sin_addr: in_addr {
                s_addr: u32::from_be_bytes([127, 0, 0, 1]).to_be()
            },
            sin_zero: mem::zeroed()
        };

        let result = connect(socket, &servaddr as *const sockaddr_in as *const sockaddr, mem::size_of_val(&servaddr) as u32);
        if result < 0 {
            println!("last OS error: {:?}", Error::last_os_error());
            close(socket);
            return; // the connection failed and the socket is closed
        }

        let msg = b"Hello, server!";
        let n = write(socket, msg as *const _ as *const c_void, msg.len());
        if n <= 0 {
            println!("last OS error: {:?}", Error::last_os_error());
            close(socket);
            return; // the write failed and the socket is closed
        }

        let mut buf = [0u8; 64];
        let n = read(socket, &mut buf as *mut _ as *mut c_void, buf.len());
        if n <= 0 {
            println!("last OS error: {:?}", Error::last_os_error());
            close(socket);
            return; // n is not a valid length, so do not slice the buffer
        }

        println!("{:?}", String::from_utf8_lossy(&buf[0..n as usize]));

        close(socket);
    }
}

Calling an external function is unsafe. For simplicity and convenience, I have temporarily put the code into one large unsafe {} block; later we will wrap these calls in a safe API. To make testing easier, I run the server in a thread, wait for 1 second, and then let the client establish a connection.

The std::io::Error::last_os_error function retrieves the error reported by the kernel after a call fails.

Before calling bind and connect, we first create a sockaddr_in structure. The port (sin_port) and IP address (s_addr) must be big-endian, so I call the to_be() method of u16 and u32 to convert them to network byte order. u32::from_be_bytes converts [127u8, 0u8, 0u8, 1u8] to a u32 integer. Because the bytes as written are already in big-endian order, the resulting value is stored little-endian on a little-endian machine, which is why to_be() is called afterwards; on such a machine you can also write u32::from_le_bytes([127, 0, 0, 1]) directly. Then std::mem::zeroed is used to create a [0u8; 8] array; you can also write [0u8; 8] directly, since the two are equivalent here. Next, we cast &sockaddr_in to *const sockaddr_in and then to *const sockaddr; if you understood the “gethostname” example at the beginning, this should be easy to follow. It can also be abbreviated as &servaddr as *const _ as *const _, letting the compiler infer the types.

When calling accept, a similar cast is applied to a mutable sockaddr_storage. Why sockaddr_storage instead of sockaddr_in or sockaddr_in6? Because sockaddr_storage is a generic structure large enough to hold the address structure of any socket, including sockaddr_in and sockaddr_in6, so if we bind the socket to an IPv6 address, the code here does not need to change. I again use std::mem::zeroed to initialize sockaddr_storage, since I am unsure about its structure. This function is unsafe, so be careful when using it: an all-zero value is not valid for every type. You can also keep experimenting with this function:
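The repeated "check for a negative result, then print last_os_error" pattern can be folded into a small helper. The sketch below is one possible shape for it, not something the article defines; the names cvt and close_invalid_fd are hypothetical.

```rust
use std::io;
use std::os::raw::c_int;

extern "C" {
    fn close(fd: c_int) -> c_int;
}

/// Hypothetical helper: maps the C convention "-1 means error" to io::Result.
/// It must be called right after the failing function, before errno changes.
fn cvt(ret: c_int) -> io::Result<c_int> {
    if ret == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(ret)
    }
}

/// Closing an invalid descriptor always fails, which exercises the error path.
fn close_invalid_fd() -> io::Result<c_int> {
    cvt(unsafe { close(-1) })
}
```

With such a helper, a call site can use the ? operator instead of a hand-written if block, and the error carries the OS error code reported by the kernel.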
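The byte-order conversions can be checked without opening a socket. In the sketch below the function names are hypothetical; to_ne_bytes returns the bytes exactly as they sit in memory, which is what ends up on the wire.

```rust
/// The s_addr value for 127.0.0.1, built the same way as in the program above.
fn loopback_s_addr() -> u32 {
    u32::from_be_bytes([127, 0, 0, 1]).to_be()
}

/// Converts a port number from host byte order to network byte order.
fn port_network_order(port: u16) -> u16 {
    port.to_be()
}

/// Views a value's in-memory bytes, which are the bytes the kernel will see.
fn memory_bytes(value: u32) -> [u8; 4] {
    value.to_ne_bytes()
}
```

Whatever the endianness of the machine, memory_bytes(loopback_s_addr()) is [127, 0, 0, 1]. A portable way to write the same value in one step is u32::from_ne_bytes([127, 0, 0, 1]); the from_le_bytes shortcut is only equivalent on little-endian machines.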

let mut a: Vec<u8> = unsafe { std::mem::zeroed() };
a.push(123);
println!("{:?}", a);

Type conversions are also required when calling read and write.

Often the types turn out to be not so “strong” after all. OK, that’s all for this section.
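Returning to sockaddr_storage: once accept has filled it in, the ss_family field tells which concrete structure the bytes contain. The sketch below decodes the IPv4 case; the function names peer_v4 and storage_from_v4 are hypothetical, and the layouts assume the Linux definitions used in this article.

```rust
use std::mem;
use std::net::Ipv4Addr;
use std::ptr;

const AF_INET: u16 = 2;

#[repr(C)]
#[allow(non_camel_case_types)]
struct in_addr { s_addr: u32 }

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr_in { sin_family: u16, sin_port: u16, sin_addr: in_addr, sin_zero: [u8; 8] }

#[repr(C)]
#[allow(non_camel_case_types, dead_code)]
struct sockaddr_storage { ss_family: u16, _unused: [u8; 126] }

/// Extracts the IPv4 address and port, or None for any other address family.
fn peer_v4(storage: &sockaddr_storage) -> Option<(Ipv4Addr, u16)> {
    if storage.ss_family != AF_INET {
        return None;
    }
    // Safety: the family tag says the bytes hold a sockaddr_in, which fits in
    // the storage. This storage type is only 2-byte aligned, so the read must
    // not assume the 4-byte alignment of sockaddr_in.
    let sin: sockaddr_in = unsafe {
        ptr::read_unaligned(storage as *const sockaddr_storage as *const sockaddr_in)
    };
    // s_addr and sin_port are stored in network byte order.
    let ip = Ipv4Addr::from(sin.sin_addr.s_addr.to_ne_bytes());
    Some((ip, u16::from_be(sin.sin_port)))
}

/// Builds a storage value holding the given IPv4 address, for demonstration.
fn storage_from_v4(ip: [u8; 4], port: u16) -> sockaddr_storage {
    let sin = sockaddr_in {
        sin_family: AF_INET,
        sin_port: port.to_be(),
        sin_addr: in_addr { s_addr: u32::from_ne_bytes(ip) },
        sin_zero: [0; 8],
    };
    // All-zero bytes are a valid value for this plain-data structure.
    let mut storage: sockaddr_storage = unsafe { mem::zeroed() };
    unsafe {
        ptr::write_unaligned(&mut storage as *mut sockaddr_storage as *mut sockaddr_in, sin);
    }
    storage
}
```

In the server loop above, peer_v4(&cliaddr) could be called after accept succeeds to print who connected; an IPv6 branch would cast to sockaddr_in6 in the same way.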