Qwen3-VL-30B-A3B-Instruct-abliterated

Qwen3-VL-30B-A3B-Instruct-abliterated is an abliterated (v1.0) variant of Qwen3-VL-30B-A3B-Instruct, designed for abliterated reasoning and captioning. It uses the Qwen3-VL-MoE (Mixture of Experts) architecture to produce descriptive, context-rich, reasoning-oriented multimodal outputs. It handles complex, sensitive, and nuanced visual content while keeping its interpretations coherent, and it can be adapted to multiple languages.

Key Highlights

  • Abliterated / Uncensored Captioning and Reasoning: Fine-tuned to bypass standard content filters while preserving factual accuracy, descriptive depth, and logical reasoning.

  • High-Fidelity Reasoning and Visual Understanding: Generates detailed captions and structured reasoning for diverse visual categories, including artistic, technical, abstract, and low-context images.

  • Mixture of Experts (MoE) Efficiency: Built on Qwen3-VL-MoE, which dynamically routes computation through specialized experts for precision and scalability.

  • Aspect-Ratio Robustness: Performs consistently across wide, tall, square, panoramic, and irregular visual formats.

  • Variational Detail Control: Supports both concise summaries and highly detailed reasoning narratives, depending on prompt configuration.

  • Multilingual Output Capability: Defaults to English but can be adapted for multilingual use through prompt engineering.
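
The expert routing mentioned in the MoE highlight can be illustrated with a toy top-k router. This is only a sketch of the general technique, not the actual Qwen3-VL-MoE router: the four scalar "experts", the logits, and k=2 are all assumptions chosen for illustration.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    # Softmax over all experts (subtract the max for numerical stability).
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Toy experts (hypothetical): each is just a scalar function here.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]

weights = top_k_route(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
print(weights, output)
```

Only the selected experts are evaluated for a given input, which is why an MoE model can hold many parameters while using a fraction of them per token.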


Base Model Signatures:

This model has been re-sharded and optimized for the latest Transformers version from the base model: https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-30B-A3B-Instruct-abliterated.


Quick Start with Transformers

from transformers import Qwen3VLMoeForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

model = Qwen3VLMoeForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VL-30B-A3B-Instruct-abliterated-v1",
    torch_dtype="auto",
    device_map="auto"
)

processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-30B-A3B-Instruct-abliterated-v1")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Provide a detailed caption and reasoning for this image."},
        ],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)

inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)  # follow the placement chosen by device_map="auto"

generated_ids = model.generate(**inputs, max_new_tokens=128)

generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)

print(output_text)
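
The detail level and output language are controlled through the prompt, so the `messages` list from the Quick Start can be generated by a small helper. The sketch below is one possible way to do that; `build_messages`, its `detail` and `language` parameters, and the prompt wordings are hypothetical and not part of the model or of Transformers.

```python
def build_messages(image, detail="detailed", language="English"):
    """Build a single-turn chat message list in the Quick Start format."""
    # Hypothetical prompt templates; tune the wording for your use case.
    templates = {
        "concise": "Describe this image in one or two sentences.",
        "detailed": "Provide a detailed caption and reasoning for this image.",
    }
    if detail not in templates:
        raise ValueError(f"unknown detail level: {detail!r}")
    prompt = templates[detail]
    # The model defaults to English, so only add an instruction otherwise.
    if language != "English":
        prompt += f" Respond in {language}."
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": prompt},
            ],
        }
    ]

messages = build_messages(
    "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
    detail="concise",
    language="French",
)
```

The returned list can be passed to `processor.apply_chat_template` exactly as in the Quick Start. For the detailed setting, `max_new_tokens=128` may cut the answer short, so a larger value is probably needed.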

Intended Use

This model is suited for:

  • Generating detailed, uncensored captions and reasoning for complex or creative visual datasets.
  • Research in multimodal reasoning, safety evaluation, and content moderation studies.
  • Enabling descriptive captioning and analytical reasoning for datasets excluded from mainstream models.
  • Creative applications such as narrative generation, artistic interpretation, and visual storytelling.
  • Advanced reasoning over diverse visual structures and aspect ratios.

Limitations

  • May produce explicit, sensitive, or offensive content depending on input and prompt.
  • Not recommended for deployment in production systems that require strict moderation or filtering.
  • Style, tone, and reasoning detail can vary based on prompt phrasing.
  • May show variable performance on synthetic, abstract, or highly stylized visual inputs.

Model size: 31B params (Safetensors, tensor type BF16)