The artificial intelligence sector, long characterized by its insatiable appetite for massive data centers and increasingly expensive GPU clusters, is reaching a critical inflection point. As the financial foundations of the AI supply chain show signs of strain, a new paradigm is emerging: the move away from the cloud and toward the "edge." At the forefront of this shift is Multiverse Computing, a Spanish deep-tech firm that is leveraging quantum-inspired mathematics to shrink the world’s most powerful AI models into forms that can run on a common smartphone.
The impetus for this shift is not merely technological, but deeply economic. Recent data indicates a troubling trend in the private credit markets, with company defaults climbing toward 9.2%—the highest level in years. This volatility has sent shockwaves through the venture capital ecosystem. Lux Capital recently issued a stern advisory to its portfolio companies, urging them to secure written, legally binding commitments for compute capacity. The warning is clear: in an era of financial instability, a "handshake agreement" with a cloud provider or a compute-rich partner is no longer a viable business strategy.
This atmosphere of counterparty risk has reignited interest in local execution. If a company can run its intelligence on its own hardware—be it a laptop, a mobile device, or an industrial sensor—it eliminates the risks associated with external infrastructure. It is within this niche of "AI independence" that Multiverse Computing is positioning its latest suite of tools, transitioning from a high-profile research lab into a mainstream enterprise provider.
The Science of Shrinking Intelligence
Multiverse Computing’s core value proposition lies in its proprietary compression technology, branded as CompactifAI. While traditional AI compression often relies on "pruning" (deleting less important neurons) or "quantization" (reducing the precision of numerical values), Multiverse utilizes quantum-inspired tensor networks. This approach allows the company to identify and preserve the essential mathematical relationships within a model while discarding the redundant parameters that bloat its size.
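Multiverse has not published CompactifAI's internals, but the simplest instance of the tensor-network idea is a truncated low-rank factorization: keep only the strongest singular directions of a weight matrix and discard the rest. A minimal sketch (illustrative, not the company's actual method):

```python
import numpy as np

def low_rank_compress(weight, rank):
    """Approximate a weight matrix by a product of two thin factors.

    A truncated SVD keeps only the `rank` strongest singular directions,
    preserving the dominant mathematical relationships while discarding
    redundant parameters. Tensor networks generalize this factorization
    to higher-dimensional arrays.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (m, rank)
    b = vt[:rank, :]             # shape (rank, n)
    return a, b

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024))
a, b = low_rank_compress(w, rank=128)

original = w.size            # 1,048,576 parameters
compressed = a.size + b.size #   262,144 parameters
print(f"compression: {1 - compressed / original:.0%}")  # prints "compression: 75%"
```

The reconstruction `a @ b` is the best rank-128 approximation of `w`; in a real network, the trade-off between `rank` and accuracy is what determines how far a layer can be shrunk.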
The results are significant. The company has successfully compressed models from the industry’s heavyweights, including Meta’s Llama series, OpenAI’s open-weight releases, and the high-performing models from Mistral and DeepSeek. By reducing a model’s footprint by 50% to 90% while maintaining high levels of accuracy, Multiverse is addressing the primary bottleneck of modern AI: the "memory wall."
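The memory wall is easy to quantify: a model's raw footprint is roughly its parameter count times the bytes stored per parameter. A back-of-the-envelope calculation with illustrative figures (not Multiverse's own measurements):

```python
def weights_gb(params_billions, bytes_per_param=2.0):
    """Raw weight footprint in GB: parameter count x bytes per parameter.

    fp16 weights take 2 bytes each; compression shrinks the total.
    """
    return params_billions * bytes_per_param

full = weights_gb(8)            # an 8B-parameter model: 16 GB in fp16
compressed = full * (1 - 0.90)  # at a 90% reduction: ~1.6 GB
print(f"{full:.1f} GB -> {compressed:.1f} GB")  # prints "16.0 GB -> 1.6 GB"
```

At full size, even a mid-tier open model overwhelms a phone's RAM; after a 90% reduction, the same model fits comfortably on a modern handset.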
To demonstrate this capability to the public, the startup recently launched the CompactifAI app. On the surface, it functions similarly to well-known interfaces like ChatGPT or Mistral’s Le Chat. However, the architectural difference is profound. The app features "Gilda," a model specifically optimized for local, offline execution. When a user asks a question, the processing occurs on the device’s own silicon, ensuring that data never leaves the handset and that the system remains functional even without an internet connection.
Navigating the Edge: The "Ash Nazg" System
Transitioning to on-device AI is not without its hurdles. Hardware limitations remain a formidable barrier; local execution requires a significant amount of RAM and storage, resources that are often lacking in older or mid-range mobile devices. To manage this, Multiverse developed a sophisticated routing layer named "Ash Nazg"—a nod to J.R.R. Tolkien’s "One Ring" inscription.
The Ash Nazg system acts as an intelligent traffic controller. It assesses the hardware capabilities of the user’s device in real-time. If the device possesses the necessary resources, the query is handled locally by Gilda, preserving privacy and saving on bandwidth. If the hardware is insufficient, the system automatically routes the request to a cloud-based version of the model via an API.
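Multiverse has not disclosed Ash Nazg's internals, so the names and thresholds below are assumptions, but the routing decision described above might be sketched like this:

```python
from dataclasses import dataclass

@dataclass
class Device:
    free_ram_gb: float
    free_storage_gb: float

# Hypothetical thresholds; the real Ash Nazg criteria are not public.
MIN_RAM_GB = 4.0
MIN_STORAGE_GB = 2.0

def route(device: Device) -> str:
    """Return 'local' if the device can host the compressed model, else 'cloud'."""
    if device.free_ram_gb >= MIN_RAM_GB and device.free_storage_gb >= MIN_STORAGE_GB:
        return "local"  # run Gilda on-device: private and offline-capable
    return "cloud"      # fall back to the hosted model via API

print(route(Device(free_ram_gb=6.0, free_storage_gb=8.0)))  # prints "local"
print(route(Device(free_ram_gb=2.0, free_storage_gb=8.0)))  # prints "cloud"
```

The essential design point is that the check runs per-device at request time, so the same app degrades gracefully from a flagship phone to an older handset.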
While this hybrid approach mirrors the strategy used by Apple Intelligence—which toggles between on-device processing and "Private Cloud Compute"—it highlights the ongoing tension between privacy and performance. When the system is forced to the cloud, the "privacy edge" of local AI is temporarily surrendered. For Multiverse, however, the app is less a mass-market consumer product and more a "living laboratory" and a showcase for their true target audience: the enterprise.

From Showcase to Infrastructure: The API Portal
The launch of Multiverse’s self-serve API portal marks the company’s transition into a direct infrastructure provider. By allowing developers to bypass third-party marketplaces like AWS or Google Cloud, Multiverse is offering enterprises a more direct line to efficient intelligence.
The API portal provides direct access to compressed models like HyperNova 60B 2602. Built upon the foundation of the publicly available gpt-oss-120b, HyperNova represents a milestone in the "less is more" philosophy. Multiverse claims that HyperNova delivers faster response times and lower operational costs than the original model, despite being half the size.
This efficiency is particularly critical for "agentic workflows." Unlike simple chatbots, AI agents are designed to perform multi-step, autonomous tasks—such as writing and debugging code or managing complex supply chain logistics. These workflows are computationally intensive and sensitive to latency. By using a compressed model that can process tokens faster and more cheaply, businesses can scale their agentic operations without the exponential cost curve typically associated with large language models (LLMs).
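The economics are simple to model: an agent that loops through many steps multiplies its per-token cost, so halving the cost per token halves the entire workflow's bill. A sketch with hypothetical rates (not Multiverse's or anyone's actual pricing):

```python
def workflow_cost(steps, tokens_per_step, usd_per_million_tokens):
    """Total cost of a multi-step agentic run: tokens consumed x rate."""
    total_tokens = steps * tokens_per_step
    return total_tokens * usd_per_million_tokens / 1_000_000

# A 50-step agent consuming 4,000 tokens per step, at two illustrative rates.
full = workflow_cost(steps=50, tokens_per_step=4_000, usd_per_million_tokens=2.00)
half = workflow_cost(steps=50, tokens_per_step=4_000, usd_per_million_tokens=1.00)
print(f"${full:.2f} vs ${half:.2f} per run")  # prints "$0.40 vs $0.20 per run"
```

Because agents run continuously rather than answering one-off queries, this linear saving compounds across thousands of runs, which is why per-token efficiency matters far more for agentic workloads than for chatbots.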
The Competitive Landscape: Small is the New Big
Multiverse is not alone in its pursuit of efficiency. The broader AI industry is currently undergoing a "small model" revolution. Mistral AI, the French champion of the European AI scene, recently released Mistral Small 4, a model optimized for reasoning and coding that rivals much larger competitors. Mistral also introduced "Forge," a platform that allows companies to customize and fine-tune small models to their specific needs, choosing the exact trade-offs between speed, cost, and intelligence.
However, Multiverse’s quantum-inspired approach offers a different mathematical path than the standard optimization techniques used by Mistral or Meta. This specialized expertise has allowed the Spanish startup to secure a high-profile client list that includes the Bank of Canada, Bosch, and Iberdrola. These organizations operate in sectors—finance, manufacturing, and energy—where data privacy is a legal mandate and where "resilience" means the ability to operate in environments with zero connectivity.
Industrial Use Cases and the Future of Sovereign AI
The true potential of compressed AI lies in settings where the cloud simply cannot reach. In the defense and aerospace sectors, the ability to embed high-level reasoning into drones or satellites is a game-changer. A satellite with on-board AI can process image data locally, identifying points of interest and transmitting only the relevant information back to Earth, saving precious bandwidth and time. Similarly, in industrial IoT, sensors equipped with compressed AI can perform real-time predictive maintenance on factory floors without needing to send sensitive proprietary data to a third-party server.
Beyond the technical advantages, there is a growing movement toward "Sovereign AI." Governments and large corporations are increasingly wary of relying on a handful of Silicon Valley giants for their cognitive infrastructure. By utilizing compression technology to run models on sovereign hardware, these entities can regain control over their technological destiny.
Financial Trajectory and the Path Ahead
Multiverse Computing’s rise reflects the maturing of the European tech ecosystem. After securing a $215 million Series B round last year, the company is reportedly in the process of raising an additional €500 million. If successful, this would propel the company’s valuation beyond €1.5 billion, cementing its status as a "unicorn" and a pillar of Spain’s burgeoning deep-tech sector.
The road ahead will require balancing quantum-inspired research with the practical, often messy requirements of enterprise software deployment. While the CompactifAI app currently sees modest download numbers, its role as a proof-of-concept is vital. It demonstrates that the "compute crisis" is not an unsolvable problem, but a catalyst for innovation.
As the industry moves away from the "bigger is always better" mentality, the focus is shifting toward "fit-for-purpose" intelligence. In this new world, the winners will not necessarily be those with the largest data centers, but those who can extract the most wisdom from the smallest number of bits. Multiverse Computing is betting that the future of AI isn’t just in the stars or the cloud—it’s in our pockets and on our desks, running silently, privately, and efficiently.
