Stars
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Arena-Hard-Auto: An automatic LLM benchmark.
SuperCLUE: A Comprehensive Benchmark for Chinese General-Purpose Foundation Models
Virtual whiteboard for sketching hand-drawn like diagrams
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
An incremental parsing system for programming tools
Distribute and run LLMs with a single file.
Generative AI extensions for onnxruntime
A high-throughput and memory-efficient inference and serving engine for LLMs
Open-Sora: Democratizing Efficient Video Production for All
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
Interact with your documents using the power of GPT, 100% privately, no data leaks
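text2vec, text to vector. A text vector representation tool that converts text into vector matrices; it implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.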
High-speed Large Language Model Serving for Local Deployment
Example models using DeepSpeed
Semantic Evaluation for Text-to-SQL with Distilled Test Suites
A collection of GPT system prompts and various prompt injection/leaking knowledge.
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Crawl a site to generate knowledge files to create your own custom GPT from a URL
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Fast and memory-efficient exact attention