Saturday, March 6, 2010

Design software for multi-core processors

If you recently purchased a computer, you certainly noticed that AMD and Intel are no longer increasing CPU clock speed; it is no longer practical. The new trend is to add parallel processing cores on the same microchip.
 
Last month, I replaced my home PC with an Intel i7 machine. It’s a quad-core with Hyper-Threading technology, so Windows 7 sees it as eight logical processors. It’s a very powerful workstation for only $1000 USD.

Imagine how many cameras a high-end workstation with two quad-core Intel i7 processors (16 logical processors) can decode at the same time; there is no reason to complain about the expensive decoding cost of H.264. Unfortunately, having 8 or 16 cores doesn’t mean that all software automatically runs 8-16x faster.

It would be 8-16x faster if the CPU clock speed were 8-16x higher. Designing software that takes advantage of 1-2 cores is different from designing software that scales linearly from 8 to 64 processing cores.
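
One standard way to reason about this gap is Amdahl's law (not named in this post, but a well-known model): if only a fraction p of the work can run in parallel, the best possible speedup on n cores is 1 / ((1 - p) + p / n). Here is a minimal sketch in Python. The 90% parallel fraction is purely an illustrative assumption, not a measurement of any real VMS.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup when only part of the work runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload: 90% of the work parallelizes, 10% stays serial.
    for cores in (1, 2, 8, 16, 64):
        print(f"{cores:2d} cores -> {amdahl_speedup(0.9, cores):.2f}x")
```

Under that assumption, 8 cores give roughly 4.7x and 16 cores roughly 6.4x. The serial 10% dominates as cores are added, which is why linear scaling has to be designed for.
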
Before Windows Server 2003 R2, all TCP/IP communications in Windows were managed on the same logical processor; it was impossible for a network-intensive application to use the entire bandwidth available with 1 and 10 gigabit/sec switches. The logical processor handling the network driver hit 100% before all the others. In Windows Server 2003 R2, Microsoft improved the network stack (Receive-Side Scaling) to spread the load across multiple processors at the same time. Therefore the Omnicast Archiver supports more cameras on Windows Server 2003 R2 or Windows Server 2008 than on the original Windows Server 2003.
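
The idea behind Receive-Side Scaling can be sketched roughly like this: each connection's addresses and ports are hashed, and the hash picks the processor that handles that connection, so one flow always stays on one processor while different flows spread out. This is a simplified illustration, not the Windows implementation. Real RSS uses a Toeplitz hash and an indirection table; this sketch substitutes `zlib.crc32`, and the function name, addresses and ports are hypothetical.

```python
import ipaddress
import struct
import zlib
from collections import Counter


def pick_cpu(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
             cpu_count: int) -> int:
    """Map a TCP/IP flow to a CPU index (CRC32 stands in for the real RSS hash)."""
    key = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + struct.pack("!HH", src_port, dst_port))
    return zlib.crc32(key) % cpu_count


if __name__ == "__main__":
    # Hypothetical setup: 64 cameras streaming to one archiver with 8 logical CPUs.
    load = Counter(
        pick_cpu(f"10.0.0.{cam}", 554, "10.0.1.10", 50000 + cam, cpu_count=8)
        for cam in range(1, 65)
    )
    for cpu in sorted(load):
        print(f"CPU {cpu}: {load[cpu]} camera streams")
```

With `cpu_count=1` every stream lands on CPU 0, which is the situation described above for the older network stack.
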

Windows 7 is definitely better than Vista. I've been using it for a while now on my Tablet PC and it's a lot faster. Microsoft rewrote some portions of the Windows 7 kernel to use multi-core processors more efficiently; I was not able to find out which improvements they made, but I clearly saw a difference.

With H.264 cameras, it's important that VMS software leverage multi-core computers efficiently. The next article gives methods to evaluate whether VMS software will scale on a multi-core system.

Next article

5 comments:

  1. "With H.264 cameras, it's important that VMS software leverage multi-core computers efficiently"

    why??

    "Before Windows Server 2003 R2, all TCP/IP communications in Windows were managed on the same logical processor"

    but there is chip solutions even on XP pro
    I installed 22 IP cameras on a Dell OptiPlex 740 and it works fine. If you install 3 network cards (each with a different IRQ), the OS handles each device on a different thread. Try it, it works :)

    Hai

  2. Hi Hai,
    If I understand your suggestion correctly, you must configure 3 different IP addresses on 3 different subnets/VLANs.

    Then you must split your cameras evenly across the 3 subnets/VLANs.

    It's easier to run Windows 2003 R2 or 2008.

    Jo

  3. To answer your first question,

    Why: "With H.264 cameras, it's important that VMS software leverage multi-core computers efficiently"

    It's important because H.264 cameras require more CPU to decode than MPEG-4.

    If your software doesn't evenly spread the load, you will quickly hit a limit.

    Jo

  4. Hi Jonathan,

    Sorry, my English is not very good... :)

    "Before Windows Server 2003 R2, all TCP/IP communications in Windows were managed on the same logical processor"

    Are you sure? The I/O Completion Port model (async I/O calls) supports multi-processor architectures and runs on Windows 2000. So can't TCP/IP communications be managed on multiple processors?


    DevilFish

  5. Hi DevilFish,
    I/O completion ports are an efficient way to implement multi-threaded applications, but the limitation I am referring to is in the Windows kernel (TCP/IP stack) and impacts any application that receives a lot of network traffic.

    It was solved in Server 2003 R2; the improvement is called "Receive-Side Scaling".

    You can find more information in a white paper written by Microsoft:

    http://www.microsoft.com/whdc/device/network/scale.mspx

    "Thus in Windows Server™ 2003 and prior operating systems, the network protocol stack’s receive processing (and in some cases transmit processing) was effectively limited to the amount of computation a single CPU could provide"

    Jo
