High latency on XS3800 switch on 10G SFP+ ports (+0.200ms)

nafetsreuab Posts: 5
edited August 2022 in Switch
Hi,

We observe high latency (avg. 0.200 ms) between 10G SFP+ ports on an XS3800 switch. These ports carry our cluster communication, and the added latency causes significant delay. The cables are 2 m long.

--- 172.16.1.6 ping statistics ---
664 packets transmitted, 664 received, 0% packet loss, time 678893ms
rtt min/avg/max/mdev = 0.105/0.201/0.405/0.049 ms
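
To compare several runs, the summary line above can be turned into numbers. This is a minimal sketch that assumes the Linux `ping` summary format shown here; the function name `parse_rtt_summary` is only illustrative.

```python
import re

# Summary line copied from the ping output above.
SUMMARY = "rtt min/avg/max/mdev = 0.105/0.201/0.405/0.049 ms"


def parse_rtt_summary(line):
    """Return min/avg/max/mdev (in ms) from a Linux ping summary line."""
    match = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", line)
    if match is None:
        raise ValueError("not a ping rtt summary line: %r" % line)
    keys = ("min", "avg", "max", "mdev")
    return dict(zip(keys, (float(value) for value in match.groups())))


stats = parse_rtt_summary(SUMMARY)
print(stats)
```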


A direct connection between our cluster nodes shows an average latency of 0.050 ms.
Switching through our Cisco Nexus switch shows an average of 0.060 ms.
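
Taking the direct connection as a baseline, the round-trip time added by each switch can be worked out from the three averages quoted in this post (0.050 ms direct, 0.060 ms via Cisco, 0.201 ms via the XS3800). The sketch below only does that subtraction; the dictionary keys and the function name are illustrative.

```python
# Average round-trip times in ms, taken from the figures quoted in this post.
AVG_RTT_MS = {
    "direct": 0.050,
    "cisco": 0.060,
    "zyxel_xs3800": 0.201,
}


def added_rtt_ms(path, baseline="direct"):
    """Round-trip time added by a path compared with the baseline, in ms."""
    return AVG_RTT_MS[path] - AVG_RTT_MS[baseline]


for path in ("cisco", "zyxel_xs3800"):
    print(f"{path}: +{added_rtt_ms(path):.3f} ms round trip")
```

A ping round trip crosses the switch twice (request and reply), so if both directions behave the same, the added delay per traversal is roughly half of each figure.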

Overall system load on the Zyxel switch is between 10% and 15%.
Total throughput was around 1 Gbit/s at the time of the tests.

The test is a simple ping between two Linux nodes across the switch.


Something is very slow with the Zyxel switch. Any ideas?

All Replies

  • Sakura_T Posts: 101  Ally Member
    What brand/model of transceiver are you using?
  • On the server side, we use:

        Vendor name : Intel Corp
        Vendor OUI  : 00:1b:21
        Vendor PN   : FTLX8571D3BCV-IT
        Vendor rev  : A


  • Zyxel小編 Lucious Posts: 278  Zyxel Employee
    Hi @nafetsreuab

    In the current Zyxel design, ICMP processing has only moderate priority in the CPU queue, which may lead to higher ping latency.
    In general use, this should not have a major impact on the service.

    Zyxel_Lucious
  • It does have a major impact on the service. Our cluster depends on low latency.
    Our second switch, a Cisco Nexus, shows much lower latency.

    If you lower the processing priority for ICMP traffic, I will probably not be the last one asking, since ICMP ping is the default tool for testing.

    Other tests via UDP show the same high latency on our end.
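
A UDP round-trip test of this kind can be sketched as follows. This is not the tool used in the thread, only a minimal illustration: the function names are hypothetical, and the demo runs the echo server on loopback so it works on a single machine. For a real comparison, the echo server would run on one node and `measure_udp_rtt` would be pointed at that node's address from the other.

```python
import socket
import statistics
import threading
import time


def run_echo_server(sock):
    """Echo every datagram back to its sender until the socket fails."""
    while True:
        try:
            data, addr = sock.recvfrom(2048)
        except OSError:
            return  # socket was closed
        sock.sendto(data, addr)


def measure_udp_rtt(host, port, count=100, timeout=1.0):
    """Send `count` small datagrams and return the round-trip times in ms."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            start = time.perf_counter()
            client.sendto(payload, (host, port))
            try:
                reply, _ = client.recvfrom(2048)
            except socket.timeout:
                continue  # no reply in time: treat as lost
            if reply == payload:  # ignore stale replies from earlier probes
                rtts.append((time.perf_counter() - start) * 1000.0)
    return rtts


# Demo on loopback (an assumption made so the sketch is self-contained).
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
threading.Thread(target=run_echo_server, args=(server,), daemon=True).start()

rtts = measure_udp_rtt("127.0.0.1", server.getsockname()[1], count=50)
if rtts:
    print(f"received {len(rtts)}/50, "
          f"avg {statistics.mean(rtts):.3f} ms, max {max(rtts):.3f} ms")
```

The absolute numbers from a script like this include interpreter and kernel overhead on both hosts, so only the difference between two paths (direct versus through a switch) is meaningful.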
  • Zyxel小編 Lucious Posts: 278  Zyxel Employee
    @nafetsreuab

    May we know which application or software you used for the ping test?

    Zyxel_Lucious
  • A simple ping on the Linux console.
  • Zyxel小編 Lucious Posts: 278  Zyxel Employee
    edited November 2020
    @nafetsreuab

    Here are our local test results.

    The difference does not seem as large as yours.
    Were your servers or the switch carrying other traffic during the test?
    Would you mind testing again under a lower load?

    We look forward to your reply. Thanks.

    Zyxel Lucious
  • Zyxel小編 Lucious Posts: 278  Zyxel Employee
    edited November 2020
    @nafetsreuab

    In addition, may we know more about your network application or scenario involving the Linux cluster, the Cisco Nexus, and our XS3800 stacking switches? Is it similar to a data center?
    And which model is the Cisco Nexus switch?

    Zyxel_Lucious
  • We are just comparing a simple Linux ping, and the latency is bad with the Zyxel switch. That is all that matters for now. Let's keep it simple. We compared the latency with a WS-C3850-48XS.
  • Zyxel_Adam Posts: 332  Zyxel Employee
    Dear customer, 

    There are two aspects to this case. 

    The first is the latency gap between the XS3800 enterprise switch and the Cisco Nexus data center switch.

    Switching performance depends largely on the ASIC design, which differs between product grades and is also tied to cost (e.g., 40G links for switch port speed).

    The data center vertical is a distinct segment that requires very low latency, while enterprise 10G switches mostly serve as aggregation switches in SMB/SME networks, where latency is less of a concern.

     

    The second is the end-to-end switch performance when connected to servers in the customer's application.

    HQ tried to simulate the scenario, and through several experiments we found that the brand of 10G transceiver used in the system is a key factor in the measured latency. We tried three different brands of 10G transceivers, and the latency differed among them.

     

    We apologize that we can offer only limited improvements for this latency concern.


    Zyxel Support Team



    Adam