Built for long-context tasks and edge deployments, Granite 4.0 combines Mamba’s linear scaling with transformer precision, ...
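The snippet doesn't detail how Granite 4.0 interleaves the two layer types, so here is a minimal PyTorch sketch of the general hybrid idea: a stack that is mostly linear-time recurrent blocks with occasional attention blocks. `ToySSMBlock` is a toy stand-in for a Mamba layer, not the real selective-SSM kernel, and all names and layer counts here are hypothetical.

```python
import torch
import torch.nn as nn

class ToySSMBlock(nn.Module):
    """Gated elementwise recurrence: O(seq_len) time, O(1) state per channel."""
    def __init__(self, d_model: int):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        # Per-channel decay controls how fast the running state forgets.
        self.log_decay = nn.Parameter(torch.zeros(d_model))

    def forward(self, x):  # x: (batch, seq, d_model)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        decay = torch.sigmoid(self.log_decay)            # (d_model,)
        state = torch.zeros_like(u[:, 0])
        outs = []
        for t in range(u.shape[1]):                      # linear scan over time
            state = decay * state + (1 - decay) * u[:, t]
            outs.append(state)
        h = torch.stack(outs, dim=1) * torch.sigmoid(gate)
        return x + self.out_proj(h)                      # residual connection

class AttentionBlock(nn.Module):
    """Standard multi-head self-attention: quadratic in sequence length."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out

def build_hybrid_stack(d_model=256, n_layers=12, attn_every=4):
    # Most layers are linear-time; every attn_every-th layer is attention.
    return nn.Sequential(*[
        AttentionBlock(d_model) if (i + 1) % attn_every == 0 else ToySSMBlock(d_model)
        for i in range(n_layers)
    ])

model = build_hybrid_stack()
x = torch.randn(2, 128, 256)   # (batch, seq, d_model)
print(model(x).shape)          # torch.Size([2, 128, 256])
```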
This Reinhausen and Factor This webinar offers strategic insights into OLTC retrofit decision frameworks being implemented ...
The most advanced Granite 4 model, Granite-4.0-H-Small, has 32 billion parameters and uses a mixture-of-experts design ...
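As a rough illustration of why a mixture-of-experts model can hold many parameters while activating only a few per token, here is a minimal top-k routing sketch in PyTorch. The expert count, sizes, and the `TopKMoE` name are made-up assumptions, not Granite's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        logits = self.router(x)                        # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)     # route each token to k experts
        weights = F.softmax(weights, dim=-1)           # normalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens whose slot-th choice is e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

tokens = torch.randn(16, 512)
print(TopKMoE()(tokens).shape)   # torch.Size([16, 512])
```

Each token runs through only k of the n_experts feed-forward networks, which is how total parameter count and per-token compute come apart in MoE models.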
The Qwen family from Alibaba remains a dense, decoder-only Transformer architecture, with no Mamba or SSM layers in its mainline models. However, experimental offshoots like Vamba-Qwen2-VL-7B show ...
You should always plug complex (and expensive) electronics, such as televisions, computers, and home audio systems, into a ...
According to the company, Liquid Nanos deliver performance that rivals far larger models on specialized, agentic workflows ...
This week we wrote about Trump’s $100k H-1B fee that could upend Indian tech dreams, strain US companies, and shake a decades ...
This FAQ explains how attention mechanisms work at their core, how they are used in automatic speech recognition systems, ...
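For reference, the core mechanism such FAQs describe is scaled dot-product attention: each query scores every key, the scores become softmax weights, and the output is a weighted average of the values. A minimal NumPy sketch, assuming a single head and no batching:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Q: (n_q, d), K: (n_k, d), V: (n_k, d_v) -> (n_q, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])    # similarity of each query to each key
    weights = softmax(scores, axis=-1)         # each query's mix over key positions
    return weights @ V                         # weighted average of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 16)) for _ in range(3))
print(attention(Q, K, V).shape)   # (5, 16)
```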
A majority (68%) of small businesses have integrated AI into their daily operations, with 74% of them reporting an increase in productivity. Generative AI simplifies content creation while agentic AI ...
DeepSeek-V3.2-Exp builds on the company's previous V3.1-Terminus model but incorporates DeepSeek Sparse Attention. According ...
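The snippet doesn't say how DeepSeek Sparse Attention selects which tokens to attend to, so the sketch below shows a generic top-k sparse-attention pattern rather than DSA itself: each query keeps only its k highest-scoring keys, so the softmax and value mix touch k positions instead of the full sequence. The function name and shapes are assumptions for illustration.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k=4):
    """Q: (n_q, d), K/V: (n_k, d) -> (n_q, d); each query attends to k keys."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])             # (n_q, n_k) full scores
    top = np.argpartition(scores, -k, axis=-1)[:, -k:]  # top-k key indices per query
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, top, 0.0, axis=-1)          # 0 where kept, -inf elsewhere
    s = scores + mask
    s -= s.max(axis=-1, keepdims=True)                  # stable softmax over kept keys
    w = np.exp(s)
    w /= w.sum(axis=-1, keepdims=True)                  # masked keys get weight 0
    return w @ V

rng = np.random.default_rng(1)
Q, K, V = rng.normal(size=(3, 10, 16))   # unpacks into three (10, 16) arrays
print(topk_sparse_attention(Q, K, V).shape)   # (10, 16)
```

Note that this toy version still computes the full score matrix before masking; production sparse-attention kernels avoid that quadratic step, which is where the speedup actually comes from.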
Small can be powerful. In discussions of AI engines, large language models (LLMs) often dominate the conversation due to ...