Squid’s technology gives you the freedom to migrate demanding hardwired System-on-Chip (SoC) image/video processing algorithms to a programmable, processor-based platform. Over the past several decades, with each new generation of more powerful embedded processors, system architects have continued to move applications into software.
Our technologies remove several of the bottlenecks that affect traditional programmable processors operating in a serial manner, especially in video applications: computational inefficiency, high clock rates, and the significant hardware spent controlling computing resources. Control hardware is reduced drastically by a statically scheduled dataflow processor architecture in which all control decisions are made during software development. Squid’s processors exploit maximum parallelism to reduce clock rates through the innovative Vivid block, which consists of multiple video-application-specific functional units with balanced control and dataflow performance, and our patent-pending dataflow processor architecture minimizes control hardware. A high degree of computational efficiency is further achieved through task-specific processor cores and an innovative memory architecture. For example, Squid processors execute 4–8 times more instructions in parallel than conventional Very Long Instruction Word (VLIW) processors, and 10–30 times more SIMD operations than most processors. In addition, an innovative interconnect between the ALUs of the dataflow processor resolves input/output bottlenecks.
High Level of Computational Efficiency
The computational efficiency of different hardware architectures spans several orders of magnitude. In state-of-the-art technologies, a high level of programmability is usually associated with low computational efficiency, while high computational efficiency comes at the cost of inflexibility. Consequently, current System-on-Chip (SoC) designs commonly include programmable parts solely for relatively low-performance, control-oriented tasks, whereas the bulk of the computation is executed on dedicated hardwired logic.
To obtain the high performance, low area cost and low power dissipation until now attainable only with hardwired accelerators, Squid’s application-specific programmable processors combine:
- minimal control overhead: moving virtually all processor control to an innovative data flow scheduler
- extreme levels of different styles of parallelism: data level (SIMD), instruction level (VLIW), and task level (multi-core)
- application domain tuning: combining general-purpose RISC operations with computationally efficient custom operations, such as a bit-stream processor and a motion-estimation engine.
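The three styles of parallelism listed above can be sketched conceptually. This is illustrative Python, not Squid's hardware; the register names and the toy "bundle" format are invented for the example:

```python
# Conceptual sketch (not Squid's implementation) of two parallelism styles.

# Data-level parallelism (SIMD): one operation applied to every lane of a
# vector. A SIMD unit does all lanes in one cycle; here we model the lanes.
def simd_add(a, b):
    return [x + y for x, y in zip(a, b)]

# Instruction-level parallelism (VLIW): one "long instruction" bundles
# several independent operations that issue in the same cycle.
def vliw_issue(bundle, state):
    # Each slot reads the *old* register state, so the operations are
    # independent and could execute in parallel, as on a real VLIW machine.
    old = dict(state)
    for dest, op, src1, src2 in bundle:
        state[dest] = op(old[src1], old[src2])
    return state

state = {"r0": [1, 2, 3, 4], "r1": [10, 20, 30, 40], "r2": 5, "r3": 7}
bundle = [
    ("r4", simd_add, "r0", "r1"),           # SIMD lane-wise add in slot 0
    ("r5", lambda a, b: a * b, "r2", "r3"), # scalar multiply in slot 1
]
vliw_issue(bundle, state)
print(state["r4"])  # [11, 22, 33, 44]
print(state["r5"])  # 35
```

Task-level parallelism (multi-core) is the third axis: several such engines run independent tasks side by side.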
Using Squid processors, SoCs can be made programmable for application domains which, for reasons of performance, power dissipation and cost, used to be the preserve of hardwired solutions.
Squid processors can sustain more than 90% of theoretical throughput in practical applications such as H.264 decoding, which brings lower clock rates and lower power for the application.
VIVID (Video Processor Template Block)
VIVID (the basic building block of Squid’s solutions) allows Squid to address a broad range of solutions in which the right balance between performance, power dissipation, area cost and flexibility can be struck for a target application and customer constraints.
To illustrate how this balance is achieved, and how the almost paradoxical combination of high flexibility with high performance, low power dissipation and low area cost is possible, one can approach the problem from two angles:
- Making a statically scheduled dataflow processor with a simple instruction set, while minimizing the loss in flexibility: the bottlenecks of dynamically scheduled dataflow processors are removed by scheduling resources statically at compile time. Multiple ALUs connected through an efficient register-file interconnect, with data conflicts resolved at compile time, bring the processor closer to the characteristics that make hardwired logic computationally efficient. Moving control complexity from the hardware into the dataflow scheduler reduces timing bottlenecks and limits the area and power spent on hardware that does not directly contribute to computation.
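A minimal sketch of the compile-time scheduling idea, assuming a small dependency graph and a fixed number of ALU slots. The list-scheduling heuristic and all names here are illustrative, not Squid's actual scheduler:

```python
# Hypothetical sketch: operations of a dataflow graph are assigned to
# cycles and ALU slots before execution ("at compile time"), so no runtime
# control hardware is needed to sequence them.

def static_schedule(ops, deps, num_alus):
    """ops: op names in program order; deps: {op: set of predecessor ops};
    returns {op: (cycle, alu)}, computed entirely ahead of execution."""
    done_cycle = {}   # cycle in which each scheduled op's result is ready
    schedule = {}
    remaining = set(ops)
    cycle = 0
    while remaining:
        # An op is ready once all its predecessors finished in an earlier
        # cycle (so its operands are available on the interconnect).
        ready = [o for o in ops if o in remaining and
                 all(p in done_cycle and done_cycle[p] < cycle
                     for p in deps.get(o, ()))]
        for alu, op in enumerate(ready[:num_alus]):  # bind to ALU slots
            schedule[op] = (cycle, alu)
            done_cycle[op] = cycle
            remaining.discard(op)
        cycle += 1
    return schedule

# Tiny dataflow graph: d depends on a and b; e depends on c and d.
deps = {"d": {"a", "b"}, "e": {"c", "d"}}
sched = static_schedule(["a", "b", "c", "d", "e"], deps, num_alus=2)
print(sched)
# {'a': (0, 0), 'b': (0, 1), 'c': (1, 0), 'd': (1, 1), 'e': (2, 0)}
```

Because the schedule is fixed before execution, the hardware needs no dependency-tracking or arbitration logic at runtime, which is exactly the control overhead the text describes eliminating.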
- Making hardwired application-specific blocks more flexible, while minimizing the loss in efficiency: application-specific blocks such as the bit-stream processor and the motion-estimation accelerator deliver hardwired efficiency, while the motion-estimation accelerator can be programmed to meet the needs of any motion-estimation algorithm. Similarly, the bit-stream processor combines a generic RISC instruction set with application-specific custom instructions to achieve both flexibility and efficiency. Moving the control complexity of the logic into the compiler and application software increases flexibility while maintaining high performance, low power and low area cost.
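To illustrate the kind of operation such an accelerator computes, here is a sum-of-absolute-differences (SAD) full search in plain Python. This is a conceptual sketch only: a hardwired engine evaluates many candidates per cycle, and the search strategy itself is what remains programmable. The frame patterns and function names are invented for the example:

```python
# Illustrative sketch (not the actual engine): SAD is the core operation a
# motion-estimation accelerator computes for each candidate displacement.

def sad(cur, ref, bx, by, dx, dy, size=4):
    """SAD between the size x size block at (bx, by) in the current frame
    and the block displaced by (dx, dy) in the reference frame."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def best_motion_vector(cur, ref, bx, by, search=1, size=4):
    # Full search over a small window: the software chooses the candidates,
    # the accelerator would evaluate the SADs in parallel.
    candidates = [(dx, dy) for dy in range(-search, search + 1)
                           for dx in range(-search, search + 1)]
    return min(candidates, key=lambda mv: sad(cur, ref, bx, by, *mv, size))

# Tiny example: the reference frame is the current frame shifted right by 1,
# so the best motion vector for any interior block is (1, 0).
cur = [[(3 * x + 5 * y) % 17 for x in range(8)] for y in range(8)]
ref = [[(3 * (x - 1) + 5 * y) % 17 for x in range(8)] for y in range(8)]
print(best_motion_vector(cur, ref, bx=2, by=2))  # (1, 0)
```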
Squid's technology uses both techniques, increasing the overall computational efficiency of the IC.
Vivid, the basic building block, is suitable for any type of video application, as it combines all the functional units such applications require. To achieve high-definition video processing at lower clock rates, multiple Vivid engines can be connected through a shared-memory architecture, depending on the application requirements and customer constraints. A two-level memory architecture in the multiprocessor configuration brings flexibility in interconnect and communication across Vivid processors. The diagram below shows an example of how multiple Vivid units are connected.
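The two-level memory idea can be sketched as follows. This is a hypothetical Python model; the class names, the stripe partitioning, and the stand-in filter are illustrative only, not Squid's design:

```python
# Hypothetical model: each engine processes its own slice of the frame out
# of a private (level-1) buffer, and engines exchange data only through a
# shared (level-2) memory, which decouples them from one another.

class SharedMemory:
    """Level-2 memory visible to all engines."""
    def __init__(self):
        self.data = {}
    def write(self, key, value):
        self.data[key] = value
    def read(self, key):
        return self.data[key]

class VividEngine:
    def __init__(self, engine_id, shared):
        self.engine_id = engine_id
        self.shared = shared
        self.local = []                    # private level-1 buffer

    def process_stripe(self, stripe):
        # Work happens in local memory, close to the ALUs; only the final
        # result is published to the shared level-2 memory.
        self.local = [pixel + 1 for pixel in stripe]   # stand-in filter
        self.shared.write(("stripe", self.engine_id), self.local)

frame = list(range(16))                    # 1-D stand-in for a frame
shared = SharedMemory()
engines = [VividEngine(i, shared) for i in range(4)]
for i, eng in enumerate(engines):          # each engine gets a 4-pixel stripe
    eng.process_stripe(frame[i * 4:(i + 1) * 4])

# Reassemble the processed frame from shared memory.
result = sum((shared.read(("stripe", i)) for i in range(4)), [])
print(result)  # [1, 2, 3, ..., 16]
```

Because engines communicate only through the shared level, the number of Vivid units and their interconnect can be varied per application without changing the per-engine code.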
News

- April 07, 2015: Real-time 4K60 UHD HEVC Encode at Half the Bit-Rate of H.264
- Sept. 11, 2014: Squid Demonstrates Industry-Leading HEVC Solutions at IBC
- Aug. 05, 2014: Squid Systems Video Codec Hardware IP Available for Licensing
- Apr. 4, 2014: Squid Systems Announces Real-time HEVC Encoder
- Feb. 25, 2014: Squid Systems Provides HEVC Decoder for Mobile Devices