Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Jupyter Notebook 9,328 645 Updated Mar 27, 2025

[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.

2,313 144 Updated Mar 25, 2025

A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you ask.

54,607 13,576 Updated Jan 1, 2025

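ChatGPT's explosive popularity marked a key step toward AGI. This project aims to collect open-source alternatives to ChatGPT, including large text models and large multimodal models, as a convenience for everyone.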

2,030 201 Updated Aug 14, 2023

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,706 434 Updated Aug 7, 2024

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

613 41 Updated Mar 27, 2025

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Python 734 59 Updated Feb 1, 2024

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

Python 15,395 2,291 Updated Mar 13, 2025

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 7,736 778 Updated Aug 12, 2024

✨✨Latest Advances on Multimodal Large Language Models

14,509 935 Updated Mar 28, 2025

LaTeX template for USTC thesis

TeX 1,803 423 Updated Mar 28, 2025

On-device AI across mobile, embedded and edge for PyTorch

C++ 2,656 497 Updated Mar 29, 2025

VisionLLM Series

Python 1,036 44 Updated Feb 27, 2025

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,969 270 Updated Jun 4, 2024

This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.

JavaScript 122,167 16,391 Updated Mar 18, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 22,023 2,416 Updated Aug 12, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,992 2,612 Updated Mar 4, 2025

TensorFlow code and pre-trained models for BERT

Python 38,917 9,669 Updated Jul 23, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 142,130 28,458 Updated Mar 29, 2025

README and scripts for the Cityscapes Dataset

Python 2,230 608 Updated Oct 22, 2024

A topic-centric list of HQ open datasets.

62,624 10,080 Updated Nov 13, 2024

Paragraph-by-paragraph close readings of classic and new deep learning papers.

29,628 2,606 Updated Mar 22, 2025

A clip-pytorch model that can be trained on your own dataset.

Python 219 28 Updated Apr 5, 2023

Collection of Remote Sensing Vision-Language Models

132 4 Updated May 13, 2024

cvpr2024/cvpr2023/cvpr2022/cvpr2021/cvpr2020/cvpr2019/cvpr2018/cvpr2017 collection of papers, code, explanations, and livestreams, compiled by the 极市 team.

12,486 2,292 Updated Apr 25, 2024

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 33,642 4,872 Updated Feb 23, 2025

Document Dewarping with Control Points

Python 169 34 Updated Oct 7, 2022

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)

Python 12,831 2,067 Updated Aug 7, 2024