Xinghan Multimodal Models

What are Multimodal Models?

Unlike unimodal models, which are limited to a single data type (e.g. text-only or image-only), multimodal models are advanced AI systems that process and deeply integrate multiple heterogeneous data types, such as text, images, and video, at the same time.

What can Multimodal Models do?

The Dahua Xinghan M-series large model achieves efficient alignment and collaborative understanding between images and natural language. These multimodal capabilities power diverse applications such as WizSeek (text-to-image search) and Text-Defined Alarms.

WizSeek

What is WizSeek?

Powered by the Xinghan Multimodal Models, WizSeek revolutionizes video investigation through natural language search. Simply describe your target (a person, vehicle, animal, item, and so on) and WizSeek instantly retrieves matching footage from recorded video archives. By replacing manual review with intelligent, high-accuracy search, it delivers faster, more intuitive results.
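How WizSeek works internally is not described here. As a rough, general illustration of natural-language search over footage, the sketch below ranks stored clips by cosine similarity between a query embedding and per-clip embeddings. The clip names, the hand-made vectors, and the `search` function are all hypothetical; a real system would obtain its embeddings from a multimodal model rather than hard-coding them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, index, top_k=2):
    """Return the top_k (clip_id, score) pairs, best match first."""
    scored = [(clip_id, cosine(query_vec, vec)) for clip_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings standing in for the output of a multimodal model.
INDEX = {
    "clip_red_car": [0.9, 0.1, 0.0],
    "clip_dog": [0.0, 0.2, 0.9],
    "clip_person_bag": [0.1, 0.9, 0.1],
}

# Pretend embedding of the text query "red vehicle" (an assumption).
QUERY_RED_VEHICLE = [1.0, 0.0, 0.1]

results = search(QUERY_RED_VEHICLE, INDEX)
for clip_id, score in results:
    print(f"{clip_id}: {score:.3f}")
```

Because text and images share one embedding space in this kind of design, the same ranking step serves any description without a per-category detector.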

Key Benefits

Search Widely: Covers 400+ categories, from people, vehicles, and animals to signs, plants, and beyond.

Search Accurately: Powered by Dahua Xinghan AI models, with significantly improved search and translation accuracy.

Search Instantly: Find targets in seconds. The NVR processes up to 16 targets per second.

Search Friendly: One-click access, fuzzy search, and voice, text, and image search on DMSS. 30+ languages supported.

Text-Defined Alarms

What are Text-Defined Alarms?

Text-Defined Alarms let users define custom alert rules through text descriptions. Because new algorithms are developed from prompt text, the feature significantly lowers the development barrier and replaces the traditional, complex customization process, which required training CNN models on thousands of annotated data samples and then deploying them. Users can instantly create custom alerts using simple text rules, without coding or complicated procedures.
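To make the idea of a text rule concrete, here is a minimal sketch under stated assumptions, not the product's actual mechanism. A rule is modeled as a prompt plus a threshold, and an alarm fires for every frame whose score against the prompt meets that threshold. `TextRule`, `evaluate`, and the toy `keyword_overlap` scorer are hypothetical names, and frames are represented by short captions only to keep the example runnable; a real deployment would score images with a multimodal model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TextRule:
    name: str         # label shown with the alarm
    prompt: str       # plain-text description of what to detect
    threshold: float  # minimum score that triggers the alarm


def evaluate(rule: TextRule, frames: List[str],
             score_fn: Callable[[str, str], float]) -> List[str]:
    """Return the frames whose score against the rule's prompt meets the threshold."""
    return [f for f in frames if score_fn(rule.prompt, f) >= rule.threshold]


def keyword_overlap(prompt: str, frame_caption: str) -> float:
    """Toy scorer (assumption): fraction of prompt words present in the caption."""
    words = prompt.lower().split()
    caption = set(frame_caption.lower().split())
    return sum(w in caption for w in words) / len(words) if words else 0.0


rule = TextRule(name="no_helmet", prompt="person without helmet", threshold=0.9)

# Captions stand in for video frames in this sketch.
frames = [
    "person without helmet near gate",
    "person with helmet",
    "empty yard",
]

alarms = evaluate(rule, frames, keyword_overlap)
print(alarms)
```

Changing the alert in this model means editing the prompt or threshold, not retraining anything, which is the contrast with the CNN workflow described above.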

Key Benefits

Zero Technical Barriers: Generate custom algorithms with just words, no coding needed.

Low-Cost Operation: Slash expensive data collection and model training costs.

Multi-Scenario Adaptability: Adapt to diverse scenarios with simple text inputs.

Real-Time Alarm Notification: Receive real-time alarm information on Local/Web, DMSS, and DSS.

How to optimize Text-Defined Alarms

Use the Self-Learning Algorithm to perform on-device training and optimization on the same IVSS, enabling algorithms to grow smarter and more accurate with every use.